RNA secondary structures can be represented using a graph notation, where nodes represent nucleotides and edges encode the molecule’s backbone or intramolecular base pairs between nucleotides. Below, an RNA molecule graph is depicted that encodes base pairs in blue.
Decide for the following properties whether they are correct or wrong given the RNA secondary structure graph.
Position 23 represents the 5’-end
wrong; We always encode in the 5’ (=1) to 3’ (=n) direction.
the graph is invalid
wrong
contains invalid base pairs
wrong; all base pairs depicted in the graph are valid
contains a pseudoknot
wrong; There are no pseudoknot structures in this graph.
non-crossing
correct; There are no pseudoknot structures in this graph.
nested
correct
contains base pair (5,12)
wrong; Position 5 is unpaired.
contains base pair (4,13)
correct
base pair (1,10) would be crossing
correct
obeys a minimal loop length of 4
wrong; The hairpin loops in this graph enclose only 3 unpaired bases, so the structure obeys a minimal loop length of 3, not 4.
encoded by ((...)).(((((...)).))).
wrong
encoded by .(((.((...))))).((...))
correct
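The statements above can be checked mechanically. The following is a minimal sketch, assuming the graph corresponds to the 23-position dot-bracket string from the last item with each loop written as three dots; the function names are illustrative and not part of the exercise.

```python
def parse_dot_bracket(structure):
    """Return the 1-based base pairs (i, j), i < j, of a nested dot-bracket string."""
    stack, pairs = [], []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def would_cross(candidate, pairs):
    """True if the candidate pair (i, j) crosses any existing pair (k, l)."""
    i, j = candidate
    return any(i < k < j < l or k < i < l < j for k, l in pairs)


def min_loop_length(pairs):
    """Fewest positions enclosed by any pair; attained by an innermost (hairpin) pair."""
    return min(j - i - 1 for i, j in pairs)


# Assumed encoding of the depicted graph (23 positions, 5'-end at position 1).
structure = ".(((.((...))))).((...))"
pairs = parse_dot_bracket(structure)

print((4, 13) in pairs)             # True
print((5, 12) in pairs)             # False, position 5 is unpaired
print(would_cross((1, 10), pairs))  # True, e.g. it crosses (2, 15)
print(min_loop_length(pairs))       # 3
```

Because a plain dot-bracket string with a single bracket type can only describe nested structures, a successful parse already implies that the structure is non-crossing.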
You are given the following dot-bracket string: (((...)))...((((...))..))
Draw graph representations of all nested structures that can be encoded by the dot-bracket string. Assume a minimal loop length of 3.
This is the only possible nested structure based on the dot-bracket string given.
Draw a graph representation of one possible crossing structure that can be encoded by the dot-bracket string. Assume a minimal loop length of 3.
Dot-bracket string: (((...)))...[(((...])..))
There are multiple other possible crossing structures, e.g. (((...)))...[((<...])..>)
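Crossing structures need more than one bracket type, since each type on its own must stay balanced. The sketch below is one possible way to parse such an extended dot-bracket string and list the crossing base pairs. It assumes each loop is written as three dots and that the bracket alphabet is limited to the four types shown; the function names are illustrative.

```python
from itertools import combinations

# Assumed bracket alphabet; each bracket type is matched independently.
BRACKETS = {"(": ")", "[": "]", "<": ">", "{": "}"}


def parse_extended(structure):
    """Return the 1-based base pairs of an extended dot-bracket string."""
    opener_of = {close: open_ for open_, close in BRACKETS.items()}
    stacks = {open_: [] for open_ in BRACKETS}
    pairs = []
    for pos, char in enumerate(structure, start=1):
        if char in BRACKETS:
            stacks[char].append(pos)
        elif char in opener_of:
            stack = stacks[opener_of[char]]
            if not stack:
                raise ValueError(f"unmatched {char!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def crossing_pairs(pairs):
    """All combinations of base pairs (i, j), (k, l) with i < k < j < l."""
    return [(p, q) for p, q in combinations(sorted(pairs), 2)
            if p[0] < q[0] < p[1] < q[1]]


nested = parse_extended("(((...)))...((((...))..))")
crossing = parse_extended("(((...)))...[(((...])..))")

print(crossing_pairs(nested))         # []
print(len(crossing_pairs(crossing)))  # 3, all involving the pair (13, 20)
```

In the crossing example, the square-bracket pair (13, 20) crosses each of the three round-bracket pairs that open after it and close after position 20.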
Consider the following partially filled Nussinov matrix \(N\), computed with a minimal loop length \(l = 0\), i.e. neighboring nucleotides are allowed to pair.
What are the values of the green and red entries?
green: \(2\)
red: \(1\)
How many tracebacks exist for the red entry using the original recursion by Nussinov?
There exist two tracebacks for the red entry; pairing with either G.
Consider any matrix \(N\) filled by Nussinov’s algorithm for an RNA sequence \(S\) of length \(n\). Discuss which of the following statements are correct or wrong.
The entry \(N_{1,n}\) of the Nussinov matrix encodes …
… the optimal structure.
wrong; It only stores the number of base pairs of an optimal structure, not the structure itself.
… the minimum free energy (mfe) structure.
wrong; No energy minimization is done.
… the maximal number of base pairs for sequence \(S_1\)..\(S_n\).
correct
… the traceback end.
wrong; It encodes the traceback start.
… the maximal number of base pairs for any structure.
correct
… information for a unique structure.
wrong; Typically there is more than one optimal structure with maximal number of base pairs.
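To make the role of \(N_{1,n}\) concrete, here is a minimal sketch of a matrix fill and a single traceback. It uses a common unambiguous variant of the recursion (position \(j\) is either unpaired or paired with some \(k\)), not necessarily the original formulation by Nussinov, and it assumes Watson-Crick and GU pairs count as complementary. The function names and the example sequence are illustrative.

```python
# Assumption: Watson-Crick and GU wobble pairs are considered complementary.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_fill(seq, min_loop=0):
    """Fill N (1-based) where N[i][j] is the max number of pairs in seq[i..j]."""
    n = len(seq)
    # Extra row and column so that N[i][i-1] and N[j+1][j] exist and are 0.
    N = [[0] * (n + 2) for _ in range(n + 2)]
    for span in range(1, n):
        for i in range(1, n - span + 1):
            j = i + span
            best = N[i][j - 1]                    # case: j is unpaired
            for k in range(i, j - min_loop):      # case: k pairs with j
                if (seq[k - 1], seq[j - 1]) in PAIRS:
                    best = max(best, N[i][k - 1] + N[k + 1][j - 1] + 1)
            N[i][j] = best
    return N


def traceback(seq, N, min_loop=0):
    """Recover ONE optimal structure; the traceback starts at N[1][n]."""
    n = len(seq)
    structure = ["."] * n
    todo = [(1, n)]
    while todo:
        i, j = todo.pop()
        if i >= j:
            continue
        if N[i][j] == N[i][j - 1]:
            todo.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if ((seq[k - 1], seq[j - 1]) in PAIRS
                    and N[i][j] == N[i][k - 1] + N[k + 1][j - 1] + 1):
                structure[k - 1], structure[j - 1] = "(", ")"
                todo.extend([(i, k - 1), (k + 1, j - 1)])
                break  # stop at the first optimal case; others may exist
    return "".join(structure)


seq = "GGGAAACCC"
N = nussinov_fill(seq, min_loop=3)
print(N[1][len(seq)])                # 3
print(traceback(seq, N, min_loop=3)) # (((...)))
```

The entry \(N_{1,n}\) is just a number. The structure only emerges from the traceback, and the `break` marks the point where alternative optimal structures would branch off.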
Provide all recursion and initialization details for the following recursion depictions. Note: your recursions must also enforce a minimal loop length \(l\).
\[\begin{align*} N_{i,i} = N_{i,i-1} &= 0 \text{ (no initialization needed for $B$ and $D$)} \\ \forall\, 1\leq i<j\leq n:\quad N_{i,j} &= \max\left\{ B_{i,j}, D_{i,j} \right\} \\ B_{i,j} &= \begin{cases} N_{i+1,j-1}+1 & \text{if } i+l<j \wedge S_i,S_j \text{ complementary}\\ 0 & \text{else} \end{cases} \\ D_{i,j} &= \max_{i\leq k<j} \left\{ B_{i,k} + N_{k+1,j} \right\} \text{ (only valid for $i<j$)} \end{align*}\]
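The recursions above translate almost line by line into code. The following is a minimal sketch, assuming Watson-Crick and GU pairs count as complementary (the exercise does not fix the pairing rule); the function name and test sequences are illustrative.

```python
# Assumption: Watson-Crick and GU wobble pairs are considered complementary.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fill_n_b_d(seq, min_loop):
    """Fill N, B and D (1-based indices) following the recursions above."""
    n = len(seq)
    # All tables start at 0, which covers N[i][i] = N[i][i-1] = 0
    # as well as the 'else' case of B.
    N = [[0] * (n + 2) for _ in range(n + 2)]
    B = [[0] * (n + 2) for _ in range(n + 2)]
    D = [[0] * (n + 2) for _ in range(n + 2)]
    for span in range(1, n):               # fill by increasing j - i
        for i in range(1, n - span + 1):
            j = i + span
            # B: i and j pair, enclosing at least min_loop positions.
            if i + min_loop < j and (seq[i - 1], seq[j - 1]) in PAIRS:
                B[i][j] = N[i + 1][j - 1] + 1
            # D: split into a part closed at k (or nothing) and the rest.
            D[i][j] = max(B[i][k] + N[k + 1][j] for k in range(i, j))
            N[i][j] = max(B[i][j], D[i][j])
    return N, B, D


N, B, D = fill_n_b_d("GGGAAACCC", min_loop=3)
print(N[1][9])  # 3
print(B[3][7])  # 1, the innermost G-C pair closing a loop of 3
```

Note that the case "position \(i\) is unpaired" needs no separate term: for \(k = i\) the entry \(B_{i,i}\) is 0, so \(D_{i,j}\) already covers \(N_{i+1,j}\).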