For the given examples which ones can be called alignments:
AGTTTTTT
AGGTTTTT
True
CCGTTTTTT
-AGGTTTTT
True
CCCGTTTTTTGC
-CGGTTTTT
False. In an alignment both strings must have the same length
AG--TTTTTT
AG-GTTTTTT
False. Gaps cannot be aligned with gaps
The Needleman-Wunsch algorithm enables the calculation of the optimal pairwise sequence alignment with linear gap cost. Given the following two sequences S1, S2 and the given cost function complete the tasks A-D.
\[\begin{align} S1 &= TACGCGC\\ S2 &= TCCGA \end{align}\]
\[ w(x,y)= \begin{cases} + 1 & if\ x = `-`\ \vee\ y=`-`\\ - 1 & if\ x = y \\ 0 & else\\ \end{cases} \]
Di,j | - | T | C | C | G | A |
---|---|---|---|---|---|---|
- |
|
|
|
|
|
|
T |
|
|
|
|
|
|
A |
|
|
|
|
|
|
C |
|
|
|
|
|
|
G |
|
|
|
|
|
|
C |
|
|
|
|
|
|
G |
|
|
|
|
|
|
C |
|
|
|
|
|
|
Complete the provided table with the correct initialization step.
\[\begin{align} D_{0,0} &= 0\\ \forall i \leq |S1|: D_{i,0} &= \sum_{k=1}^{i} w(S1_i, -) \\ \forall j \leq |S2|: D_{0,j} &= \sum_{k=1}^{j} w(-, S2_j) \end{align}\]
Di,j | - | T | C | C | G | A |
---|---|---|---|---|---|---|
- | 0 | 1 | 2 | 3 | 4 | 5 |
T | 1 |
|
|
|
|
|
A | 2 |
|
|
|
|
|
C | 3 |
|
|
|
|
|
G | 4 |
|
|
|
|
|
C | 5 |
|
|
|
|
|
G | 6 |
|
|
|
|
|
C | 7 |
|
|
|
|
|
Using dynamic programming technique fill in all values in the matrix.
\[ \forall\ i,j >0\ : D_{i,j}=min \begin{cases} D_{i-1,j-1} + w(S1_i, S2_j)\\ D_{i,j-1} + w(-, S2_j) \\ D_{i-1,j} + w(S1_i, -) \end{cases} \]
Di,j | - | T | C | C | G | A |
---|---|---|---|---|---|---|
- | 0 | 1 | 2 | 3 | 4 | 5 |
T | 1 | -1 | 0 | 1 | 2 | 3 |
A | 2 | 0 | -1 | 0 | 1 | 1 |
C | 3 | 1 | -1 | -2 | -1 | 0 |
G | 4 | 2 | 0 | -1 | -3 | -2 |
C | 5 | 3 | 1 | -1 | -2 | -3 |
G | 6 | 4 | 2 | 0 | -2 | -2 |
C | 7 | 5 | 3 | 1 | -1 | -2 |
Using the matrix from 1B) find the optimal alignment of the given sequences.
TACGCGC
| | ||:
T-C-CGA
Find an optimal alignment of the given sequences, while assuming that the first G character in each sequence has to be matched/aligned.
You can split the alignment at the G and treat it as two separate alignments
TACGCGC TACGCGC TACGCGC
|:||| |:|| : |:|| :
TCCGA-- TCCG-A- TCCG--A
Which statements about Needleman-Wunsch and the Hirschberg recursion are True and which are False.
Programming assignments are available via Github Classroom and contain automatic tests.
We recommend doing these assignments since they will help you to further understand this topic.
Access the Github Classroom link: Programming Assignment: Sheet 03.
Below you find an interactive tool that allows you to create dynamic programming matrices to practice for the exam. You can fill in the table and check which values are correct via the Check Matrix button. Correctly filled cells will turn green.
For now this tool only supports the maximization approach.
Warning
This tool is in beta and might contain bugs.