Consider the following sequences \(S1\) and \(S2\), a similarity scoring via \(s(x,y)\) and a gap cost function \(g(k)\).
\[\begin{align} S1 &= TCCGA\\ S2 &= TACGCAGA\\ s(x, y) &= \begin{cases} +1 &if\ x=y\\ 0 &else \end{cases}\\ g(k) &= -k \end{align}\]
Compute the local alignment matrix \(S_{i,j}\) for the given sequence.
Si,j | - | T | A | C | G | C | A | G | A |
---|---|---|---|---|---|---|---|---|---|
- | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
T | 0 |
|
|
|
|
|
|
|
|
C | 0 |
|
|
|
|
|
|
|
|
C | 0 |
|
|
|
|
|
|
|
|
G | 0 |
|
|
|
|
|
|
|
|
A | 0 |
|
|
|
|
|
|
|
|
\[\begin{align} H_{0,0} &= 0\\ H_{i,0} &= 0\\ H_{0,j} &= 0\\ H_{i,j} &= max \begin{cases} H_{i-1, j-1} + s(a_i, b_j)\\ H_{i-1,j} + s(a_i, -)\\ H_{i, j-1} + s(-, b_j)\\ 0 \end{cases} \end{align}\]
Si,j | - | T | A | C | G | C | A | G | A |
---|---|---|---|---|---|---|---|---|---|
- | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
T | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
C | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 2 | 1 | 1 | 1 | 0 | 0 |
G | 0 | 0 | 0 | 1 | 3 | 2 | 1 | 2 | 1 |
A | 0 | 0 | 1 | 0 | 2 | 3 | 3 | 2 | 3 |
Give all optimal local alignments and the according score.
score: 3
TCCG TCCGA TCCG-A CCGA
|:|| |:||: |:|| | |:||
TACG TACGC TACGCA CAGA
Which of these statements are correct?
You want to extend the Smith-Waterman algorithm for local alignment to more general gap scoring functions \(g(k)\) (where \(k\) denotes the gap length).
The following recursions were created analogously to the Waterman-Smith-Beyer algorithm. Which of these (if any) represents a variant of the Smith-Waterman algorithm that allows for an arbitrary gap scoring function?
\[ D_{0,0} = 0,\ D_{i,0}=0,\ D_{0,j}=0 \]
\[\begin{align} D_{i,j} &\stackrel{(i)}{=} max \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ max_{1 \leq k \leq j} D_{i,j-k} + g(1)\\ max_{1 \leq k \leq i} D_{i-k,j} + g(1)\\ 0 \end{cases} &D_{i,j} &\stackrel{(ii)}{=} max \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ max_{1 \leq k \leq j} D_{i,j-k} + g(k)\\ max_{1 \leq k \leq i} D_{i-k,j} + g(k)\\ 0 \end{cases}\\ \\ D_{i,j} &\stackrel{(iii)}{=} min \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ max_{1 \leq k \leq j} D_{i,j-k} + g(k)\\ max_{1 \leq k \leq i} D_{i-k,j} + g(k)\\ 0 \end{cases} &D_{i,j} &\stackrel{(iv)}{=} min \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ max_{1 \leq k \leq j} D_{i,j-k} + g(1)\\ max_{1 \leq k \leq i} D_{i-k,j} + g(1)\\ 0 \end{cases}\\ \\ D_{i,j} &\stackrel{(v)}{=} max \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ max_{1 \leq k \leq j} D_{i,j-k} + g(1)\\ max_{1 \leq k \leq i} D_{i-k,j} + g(1)\\ -1 \end{cases} \end{align}\]
Only Formula \(ii\) allows for an arbitrary gap scoring.
The following recursions were created analogously to the Gotoh algorithm. Which of these (if any) represents a variant of the Smith-Waterman algorithm that allows for an affine gap scoring function?
\[ D_{0,0} = 0,\ D_{i,0}=0,\ D_{0,j}=0 \] \[\begin{align} D_{i,j} &\stackrel{(i)}{=} max \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ P_{i,j}\\ Q_{i,j}\\ 0 \end{cases} \begin{aligned} P_{i,j} &= max \begin{cases} D_{i-1,j} + g(1)\\ P_{i-1, j} + \beta \end{cases}\\ Q_{i,j} &= max \begin{cases} D_{i,j-1} + g(1)\\ Q_{i, j-1} + \beta \end{cases}\\ \end{aligned}\\ \\ D_{i,j} &\stackrel{(ii)}{=} max \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ P_{i,j}\\ Q_{i,j}\\ 0 \end{cases} \begin{aligned} P_{i,j} &= max \begin{cases} D_{i-1,j} + g(k)\\ P_{i-1, j} + \beta \end{cases}\\ Q_{i,j} &= max \begin{cases} D_{i,j-1} + g(k)\\ Q_{i, j-1} + \beta \end{cases}\\ \end{aligned}\\ \\ D_{i,j} &\stackrel{(iii)}{=} min \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ P_{i,j}\\ Q_{i,j}\\ 0 \end{cases} \begin{aligned} P_{i,j} &= min \begin{cases} D_{i-1,j} + g(k)\\ P_{i-1, j} + \beta \end{cases}\\ Q_{i,j} &= min \begin{cases} D_{i,j-1} + g(k)\\ Q_{i, j-1} + \beta \end{cases}\\ \end{aligned}\\ \\ D_{i,j} &\stackrel{(iv)}{=} min \begin{cases} D_{i-1, j-1} + s(a_i, b_j)\\ P_{i,j}\\ Q_{i,j}\\ 0 \end{cases} \begin{aligned} P_{i,j} &= min \begin{cases} D_{i-1,j} + g(1)\\ P_{i-1, j} + \beta \end{cases}\\ Q_{i,j} &= min \begin{cases} D_{i,j-1} + g(1)\\ Q_{i, j-1} + \beta \end{cases}\\ \end{aligned} \end{align}\]
Only Formula \(i\) allows for an affine gap scoring.
Programming assignments are available via Github Classroom and contain automatic tests.
We recommend doing these assignments since they will help you to further understand this topic.
Access the Github Classroom link: Programming Assignment: Sheet 05.
Below you find an interactive tool that allows you to create dynamic programming matrices to practice for the exam. You can fill in the table and check which values are correct via the Check Matrix button. Correctly filled cells will turn green.
Once you filled in the cells, you can trigger the traceback mode toggle. This will disable the input fields. However, you can now select cells that are part of a traceback. While in traceback mode you can click the Check Traceback & Alignment button. If you completely marked a correct traceback it will get a green border. Else marked cells will get a red border.
It will further check whether the alignment in the alignment section below matches your traceback (only if your traceback is correct). If this is the case, the alignment will turn green else it will turn red.
Warning
This tool is in beta and might contain bugs.