Exercise 1

Extend the given Zuker recursions towards the folding of an alignment of K sequences similar to RNAalifold.

For the matrix \(W\),

Init: \(W_{ij} = 0\), with \(i+m\geq j\)

Recursion for entries \(W_{ij}\), with \(i+m < j\): \[ W_{ij} = \min \begin{cases} W_{i{j-1}}\\ W_{i+1{j}}\\ V_{ij}\\ \end{cases} \]

For the matrix \(V\):

Init: \(V_{ij} = \infty\), with \(i+m\geq j\)

Recursion for entries \(V_{ij}\), with \(i+m < j\): \[ V_{ij} = \min \begin{cases} eH(i,j)\\ V_{i+1{j-1}} + eS(i,j)\\ min_{\substack{i<i^{'}<j^{'}<j,\\i^{'}-i+j-j^{'}>2}}V_{i^{'}{j^{'}}}+eL(i,j,i^{'},j^{'})\\ \end{cases} \]

Hide
Solution
  • adding conservation score \(\gamma\), weight \(\beta\) and sum that loops over all k sequences of the alignment.

\[ V_{ij} = \beta \gamma (ij) + \min \begin{cases} \sum_{1 \leq l \leq K} eH(i,j, S_{l})\\ V_{i+1{j-1}} + \sum_{1 \leq l \leq K} eS(i,j, S_{l})\\ min_{\substack{i<i^{'}<j^{'}<j,\\i^{'}-i+j-j^{'}>2}}V_{i^{'}{j^{'}}}+ \sum_{1 \leq l \leq K} eL(i,j,i^{'},j^{'}, S_{l})\\ \end{cases} \]

Exercise 2

Consider the following alignment that is used as input for RNAalifold. No minimal loop length is given.

AAGUUUCG
AAGCUUCG
AAC-AUG-

2.1

Compute the following conservation scores.

\[ \begin{align*} \gamma(1,6) &= \\ \gamma(2,5) &= \\ \gamma(3,7) &= \\ \gamma(4,8) &= \\ \gamma(7,8) &= \end{align*} \]

Hide
Formula

\[ \begin{align*} h(x,y)= \begin{cases} 1 & x \neq y\\ \\ 0 & x=y\\ \end{cases} \end{align*} \]

\[ \begin{align*} \gamma(i,j)= & \;-\hspace{-6pt}\sum_{1\leq\ell<\ell'\leq K} \begin{cases} h(a_{\ell i},a_{\ell' i}) + h(a_{\ell j},a_{\ell' j}) & \text{$a_{\ell i} - a_{\ell j}$, $a_{\ell' i} - a_{\ell' j}$ compl.}\\ 0 & \text{otherwise}, \end{cases}\\ & + \delta \sum_{1\leq\ell\leq K} \begin{cases} 0 & \text{$a_{\ell i} - a_{\ell j}$ complementary}\\ 0.25 & \text{$a_{\ell i}, a_{\ell j}$ are both gaps}\\ \\ 1 & \text{otherwise}, \end{cases} \end{align*} \]

Solution

\[ \begin{align*} \gamma(1,6)&= -[0+0+0] + \delta(0+0+0) = 0\\ \gamma(2,5)&= -[0+0+0] + \delta(0+0+1) = \delta\\ \gamma(3,7)&= -[0+2+2] + \delta(0+0+0) = -4\\ \gamma(4,8)&= -[1+0+0] + \delta(0+0+0.25) = -1+0.25\delta\\ \gamma(7,8)&= -[0+0+0] + \delta(0+0+1) = \delta \end{align*} \]

2.2

Why is \(\gamma(3,7) < \gamma(1,6)\)?

Hide
Solution

\(\gamma(3,7)\) is smaller because of the covariation term. The base pair GC has been mutated into the base pair CG. This is a change in the sequence but not in the secondary structure.

2.3

What is the intuition of the covariation and penalty terms in the conservation scores?

Hide
Solution

The intuition for the covariation term is to favour structures that have been maintained although the sequence has been mutated. The penalty term favours base pairs since it penalizes gaps and unpaired bases.