We want to calculate the \(PAM_1\) matrix based on the following two pairwise alignments of DNA sequences: a aligned with b, and c aligned with d.
a = AAGTACTTT
b = AAATTCCTA

c = AGGTAACACGTTTAGTCA
d = AGTTTACCGGGTTAATCA
Tip: To solve a) and b), create a combined alignment consisting of two combined sequences a' and b' (built from the two initial alignments and their symmetric counterparts, i.e. the same alignments with the rows swapped).
a' = a + c + b + d
b' = b + d + a + c
The order of the blocks does not matter as long as it is the same in both rows, so that aligned columns stay paired; the counts used below do not depend on position.
Unless otherwise stated round all results to 4 decimal places.
Calculate the nucleotide frequencies \(r_x\).
\[ r_x = \frac{\#\{x \in (a')\}}{\lvert a' \rvert} \]
\[\begin{align} r_A &= 0.3333 \\ r_C &= 0.1667 \\ r_T &= 0.3333 \\ r_G &= 0.1667 \end{align}\]
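The counting step can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original sheet; the names `top`, `bottom` and `r` are chosen here and are not prescribed by the exercise.

```python
from collections import Counter

# Sequences from the exercise: a is aligned with b, c with d.
a, b = "AAGTACTTT", "AAATTCCTA"
c, d = "AGGTAACACGTTTAGTCA", "AGTTTACCGGGTTAATCA"

# Combined rows a' and b': every alignment appears once in each orientation.
top = a + c + b + d
bottom = b + d + a + c

# r_x = (number of occurrences of x in a') / |a'|
occurrences = Counter(top)
r = {x: round(occurrences[x] / len(top), 4) for x in "ACTG"}

print(len(top))  # 54
print(r)         # {'A': 0.3333, 'C': 0.1667, 'T': 0.3333, 'G': 0.1667}
```

Because a' and b' contain the same four sequences, counting in `bottom` instead of `top` gives identical frequencies.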
Calculate the symmetric mutation matrix \(E(x,y)\).
\[\begin{align} e_{xy} = \frac{1}{\lvert a' \rvert} \, \# \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \begin{pmatrix} a' \\ b' \end{pmatrix} \right\} \end{align}\]
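Raw (non-normalized) column counts; dividing each count by \(|a'| = 54\) gives \(e_{xy}\). The matrix is symmetric, so only the upper triangle is shown.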
nt. | A | C | T | G |
---|---|---|---|---|
A | 12 | 1 | 3 | 2 |
C | - | 6 | 1 | 1 |
T | - | - | 12 | 2 |
G | - | - | - | 4 |

Normalized values \(e_{xy}\) (counts divided by \(|a'| = 54\)):

nt. | A | C | T | G |
---|---|---|---|---|
A | 0.2222 | 0.0185 | 0.0556 | 0.0370 |
C | - | 0.1111 | 0.0185 | 0.0185 |
T | - | - | 0.2222 | 0.0370 |
G | - | - | - | 0.0741 |
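The column counts can be reproduced by zipping the two combined rows and counting the resulting pairs. The following is a sketch under the same assumptions as the tip above; the names `pair_counts` and `E` are illustrative.

```python
from collections import Counter

a, b = "AAGTACTTT", "AAATTCCTA"
c, d = "AGGTAACACGTTTAGTCA", "AGTTTACCGGGTTAATCA"
top = a + c + b + d      # a'
bottom = b + d + a + c   # b'

# Count every aligned column (x on top, y below).
pair_counts = Counter(zip(top, bottom))
n = len(top)

# e_xy = count / |a'|, rounded to 4 decimal places as required by the sheet.
E = {(x, y): round(pair_counts[(x, y)] / n, 4) for x in "ACTG" for y in "ACTG"}

for x in "ACTG":
    print(x, [pair_counts[(x, y)] for y in "ACTG"])
# A [12, 1, 3, 2]
# C [1, 6, 1, 1]
# T [3, 1, 12, 2]
# G [2, 1, 2, 4]
```

Including each alignment in both orientations is what makes the matrix symmetric: a column (A over T) in one orientation becomes (T over A) in the other.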
Calculate the non-normalized PAM matrix \(S\) as \(10 \log_{10}(\text{odds})\), using the previously determined \(r\) values and the \(E\) matrix. Round to integers.
\[\begin{align} S_{xy} = 10 \log_{10} \left( \frac{e_{xy}}{r_x r_y} \right) \end{align}\]
nt. | A | C | T | G |
---|---|---|---|---|
A | 3 | -5 | -3 | -2 |
C | - | 6 | -5 | -2 |
T | - | - | 3 | -2 |
G | - | - | - | 4 |
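A short sketch of the log-odds step follows. It works from the raw counts in the table above and from the nucleotide counts 18, 9, 18, 9 (the exact values behind \(r = 0.3333, 0.1667, 0.3333, 0.1667\)); the helper name `log_odds` is an assumption made for this example.

```python
import math

n = 54  # |a'|
# Upper triangle of the raw count table; the matrix is symmetric.
pair_counts = {"AA": 12, "AC": 1, "AT": 3, "AG": 2, "CC": 6,
               "CT": 1, "CG": 1, "TT": 12, "TG": 2, "GG": 4}
nt_counts = {"A": 18, "C": 9, "T": 18, "G": 9}

def log_odds(x, y):
    """S_xy = 10 * log10(e_xy / (r_x * r_y)), rounded to an integer."""
    key = x + y if x + y in pair_counts else y + x
    e_xy = pair_counts[key] / n
    expected = (nt_counts[x] / n) * (nt_counts[y] / n)
    return round(10 * math.log10(e_xy / expected))

for x in "ACTG":
    print(x, [log_odds(x, y) for y in "ACTG"])
# A [3, -5, -3, -2]
# C [-5, 6, -5, -2]
# T [-3, -5, 3, -2]
# G [-2, -2, -2, 4]
```

A positive score means the pair occurs more often in the alignment than expected by chance from the base frequencies alone; a negative score means it occurs less often.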
Given the new sequences \(a = ACC\) and \(b = ATT\) (not the sequences a and b from above), compute the optimal Needleman-Wunsch alignments using each of the following two scoring functions, where \(S_{xy}\) is the entry of the matrix \(S\) computed above:
\[ s^{g}(x, y) = \begin{cases} 5 & \text{if } x = y \\ -2 & \text{if } x \neq y \text{ or gapped} \end{cases} \]
\[ s^{PAM}(x, y) = \begin{cases} -2 & \text{if } x \text{ or } y \text{ gapped} \\ S_{xy} & \text{otherwise (match/mismatch)} \end{cases} \]
\(S^{g}_{i,j}\) | - | A | T | T |
---|---|---|---|---|
- | 0 | -2 | -4 | -6 |
A | -2 | 5 | 3 | 1 |
C | -4 | 3 | 3 | 1 |
C | -6 | 1 | 1 | 1 |
\(S^{PAM}_{i,j}\) | - | A | T | T |
---|---|---|---|---|
- | 0 | -2 | -4 | -6 |
A | -2 | 3 | 1 | -1 |
C | -4 | 1 | -1 | -3 |
C | -6 | -1 | -3 | -5 |
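Optimal alignment with the general scoring \(s^{g}\) (score 1, unique):
ACC
ATT
Optimal alignment with the PAM scoring \(s^{PAM}\) (score -5). This is one of six co-optimal alignments: each pairs the two leading A's and places every C and T opposite a gap, since a C/T mismatch (-5) costs more than two gaps (-4).
ACC--
A--TT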
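Both dynamic programming tables can be reproduced with a compact Needleman-Wunsch fill. The sketch below is illustrative only: the function names `nw_table` and `count_optimal` are chosen here, a linear gap cost of -2 is assumed as in the two scoring functions, and the integer scores are copied from the \(S\) table.

```python
def nw_table(a, b, score, gap=-2):
    """Fill the Needleman-Wunsch table (rows follow a, columns follow b)."""
    T = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        T[i][0] = i * gap
    for j in range(1, len(b) + 1):
        T[0][j] = j * gap
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            T[i][j] = max(
                T[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # match / mismatch
                T[i - 1][j] + gap,                            # a[i-1] opposite a gap
                T[i][j - 1] + gap,                            # b[j-1] opposite a gap
            )
    return T


def count_optimal(a, b, score, gap=-2):
    """Count traceback paths, i.e. the number of co-optimal alignments."""
    T = nw_table(a, b, score, gap)
    ways = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    ways[0][0] = 1
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i and j and T[i][j] == T[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
                ways[i][j] += ways[i - 1][j - 1]
            if i and T[i][j] == T[i - 1][j] + gap:
                ways[i][j] += ways[i - 1][j]
            if j and T[i][j] == T[i][j - 1] + gap:
                ways[i][j] += ways[i][j - 1]
    return ways[-1][-1]


# Integer log-odds scores from the S table (upper triangle, symmetric).
S = {"AA": 3, "AC": -5, "AT": -3, "AG": -2, "CC": 6,
     "CT": -5, "CG": -2, "TT": 3, "TG": -2, "GG": 4}

def s_general(x, y):
    return 5 if x == y else -2

def s_pam(x, y):
    return S.get(x + y, S.get(y + x))

table_general = nw_table("ACC", "ATT", s_general)
table_pam = nw_table("ACC", "ATT", s_pam)
print(table_general[-1][-1], count_optimal("ACC", "ATT", s_general))  # 1 1
print(table_pam[-1][-1], count_optimal("ACC", "ATT", s_pam))          # -5 6
```

The bottom-right cell holds the optimal score. Counting traceback paths instead of enumerating them keeps the example short while still showing how many alignments share that score.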
Calculate the normalization factor \(\gamma\) based on \(E\), chosen so that 1% of all positions are expected to mutate.
\[ 0.01 = \gamma \sum_{x \neq y} e_{xy} = \gamma \left(1 - \sum_{x} e_{xx}\right) \]
\[ \gamma = 0.027 \]
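Plugging the diagonal of the raw count table into this equation, the arithmetic behind the value is:

\[ \sum_{x} e_{xx} = \frac{12 + 6 + 12 + 4}{54} = \frac{34}{54} \approx 0.6296 \qquad \Rightarrow \qquad \gamma = \frac{0.01}{1 - \frac{34}{54}} = \frac{0.01 \cdot 54}{20} = 0.027 \]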
Calculate the mutation rate matrix \(P\).
\[ p_{xy} = \frac{e_{xy}}{r_x} \]
nt. | A | C | T | G |
---|---|---|---|---|
A | 0.6667 | 0.0555 | 0.1668 | 0.1110 |
C | 0.1110 | 0.6665 | 0.1110 | 0.1110 |
T | 0.1668 | 0.0555 | 0.6667 | 0.1110 |
G | 0.2220 | 0.1110 | 0.2220 | 0.4445 |
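As a worked check on two entries, using the exact fractions behind the rounded values (\(e_{AC} = e_{CA} = 1/54\), \(r_A = 18/54\), \(r_C = 9/54\)):

\[ p_{AC} = \frac{e_{AC}}{r_A} = \frac{1/54}{18/54} = \frac{1}{18} \approx 0.0556, \qquad p_{CA} = \frac{e_{CA}}{r_C} = \frac{1/54}{9/54} = \frac{1}{9} \approx 0.1111 \]

The table entries above were obtained from the already rounded \(e_{xy}\) and \(r_x\) (for example \(0.0185 / 0.3333 \approx 0.0555\)), which is why some of them differ from the exact fractions in the fourth decimal place. The two entries also show that \(P\) is not symmetric even though \(E\) is, because each row is divided by its own \(r_x\).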
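Calculate the normalized mutation rate matrix \(P'\) using \(P\) and the normalization factor \(\gamma\). The off-diagonal entries are scaled by \(\gamma\); each diagonal entry is then set so that its row sums to 1.
\[\begin{align*} p'_{xy} &= \gamma p_{xy} \quad \text{for } x \neq y \\ p'_{xx} &= 1 - \sum_{y \neq x} p'_{xy} \end{align*}\]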
nt. | A | C | T | G |
---|---|---|---|---|
A | 0.9910 | 0.0015 | 0.0045 | 0.0030 |
C | 0.0030 | 0.9910 | 0.0030 | 0.0030 |
T | 0.0045 | 0.0015 | 0.9910 | 0.0030 |
G | 0.0060 | 0.0030 | 0.0060 | 0.9850 |
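The normalization can be checked with exact fractions, which avoids the rounding drift of intermediate values. This is a sketch starting from the raw count table; the names `upper`, `count` and `P1` are illustrative, and \(r_x\) is taken as the row sum of \(E\), which holds here because a' and b' are built symmetrically.

```python
from fractions import Fraction

NT = "ACTG"
n = 54  # |a'|
# Upper triangle of the raw count table; the matrix is symmetric.
upper = {"AA": 12, "AC": 1, "AT": 3, "AG": 2, "CC": 6,
         "CT": 1, "CG": 1, "TT": 12, "TG": 2, "GG": 4}

def count(x, y):
    return upper.get(x + y, upper.get(y + x))

# e_xy as exact fractions; r_x equals the row sum of E for this symmetric setup.
e = {x: {y: Fraction(count(x, y), n) for y in NT} for x in NT}
r = {x: sum(e[x].values()) for x in NT}

# gamma scales the total off-diagonal mass of E to 1%.
gamma = Fraction(1, 100) / (1 - sum(e[x][x] for x in NT))

# Off-diagonal: gamma * e_xy / r_x; diagonal: remainder so each row sums to 1.
P1 = {x: {y: gamma * e[x][y] / r[x] for y in NT if y != x} for x in NT}
for x in NT:
    P1[x][x] = 1 - sum(P1[x].values())

for x in NT:
    print(x, [round(float(P1[x][y]), 4) for y in NT])

# Expected fraction of mutated positions: sum_x r_x * (1 - p'_xx)
print(float(sum(r[x] * (1 - P1[x][x]) for x in NT)))  # 0.01
```

The final line confirms the purpose of \(\gamma\): weighted by the base frequencies, exactly 1% of positions change under \(P'\).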
Determine \(PAM_1\) from the normalized mutation rate matrix \(P'\) as \(10 \log_{10}(\text{odds})\), rounded to integers.
\[ (PAM_1)_{xy} = 10 \log_{10} \left(\frac{p'_{xy}}{r_y}\right) \]
nt. | A | C | T | G |
---|---|---|---|---|
A | 5 | -20 | -19 | -17 |
C | - | 8 | -20 | -17 |
T | - | - | 5 | -17 |
G | - | - | - | 8 |
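Two entries worked out explicitly from the rounded values of \(P'\) and \(r\):

\[ 10 \log_{10} \left( \frac{p'_{AA}}{r_A} \right) = 10 \log_{10} \left( \frac{0.9910}{0.3333} \right) \approx 4.73 \rightarrow 5, \qquad 10 \log_{10} \left( \frac{p'_{AC}}{r_C} \right) = 10 \log_{10} \left( \frac{0.0015}{0.1667} \right) \approx -20.46 \rightarrow -20 \]

Only the upper triangle needs to be listed because, for \(x \neq y\), the odds ratio is symmetric:

\[ \frac{p'_{xy}}{r_y} = \frac{\gamma \, e_{xy}}{r_x r_y} = \frac{p'_{yx}}{r_x} \]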
Determine \(PAM_2\), rounded to integers.
\[ (PAM_m)_{xy} = 10 \log_{10} \left(\frac{p^{(m)}_{xy}}{r_y}\right) \text{ with } p^{(m)}_{xy} \text{ the entries of } (P')^m \]
\[ (P')^2 = \begin{bmatrix} 0.9910 & 0.0015 & 0.0045 & 0.003 \\ 0.0030 & 0.9910 & 0.0030 & 0.003 \\ 0.0045 & 0.0015 & 0.9910 & 0.003 \\ 0.0060 & 0.0030 & 0.0060 & 0.985 \\ \end{bmatrix} \times \begin{bmatrix} 0.9910 & 0.0015 & 0.0045 & 0.003 \\ 0.0030 & 0.9910 & 0.0030 & 0.003 \\ 0.0045 & 0.0015 & 0.9910 & 0.003 \\ 0.0060 & 0.0030 & 0.0060 & 0.985 \\ \end{bmatrix} = \begin{bmatrix} 0.98212375 & 0.00298875 & 0.0089415 & 0.005946 \\ 0.0059775 & 0.982099 & 0.0059775 & 0.005946 \\ 0.0089415 & 0.00298875 & 0.98212375 & 0.005946 \\ 0.011892 & 0.005946 & 0.011892 & 0.97027 \\ \end{bmatrix} \]
nt. | A | C | T | G |
---|---|---|---|---|
A | 5 | -17 | -16 | -14 |
C | - | 8 | -17 | -14 |
T | - | - | 5 | -14 |
G | - | - | - | 8 |
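The same computation generalizes to any \(m\) by raising \(P'\) to the \(m\)-th power before taking log-odds. The sketch below uses plain nested lists to stay dependency-free; `matmul` and `pam_scores` are names chosen for this example, and \(r\) is entered as the exact fractions behind 0.3333 and 0.1667.

```python
import math

# Order of rows and columns: A, C, T, G.
r = [1 / 3, 1 / 6, 1 / 3, 1 / 6]
P1 = [
    [0.9910, 0.0015, 0.0045, 0.0030],
    [0.0030, 0.9910, 0.0030, 0.0030],
    [0.0045, 0.0015, 0.9910, 0.0030],
    [0.0060, 0.0030, 0.0060, 0.9850],
]

def matmul(A, B):
    """Plain matrix product of two square nested lists."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def pam_scores(m):
    """Integer log-odds matrix 10 * log10((P'^m)_xy / r_y) for m >= 1."""
    Pm = P1
    for _ in range(m - 1):
        Pm = matmul(Pm, P1)
    return [[round(10 * math.log10(Pm[i][j] / r[j])) for j in range(4)]
            for i in range(4)]

for row in pam_scores(2):
    print(row)
# [5, -17, -16, -14]
# [-17, 8, -17, -14]
# [-16, -17, 5, -14]
# [-14, -14, -14, 8]
```

Calling `pam_scores(1)` reproduces the \(PAM_1\) table. The off-diagonal scores rise by about 3 from \(PAM_1\) to \(PAM_2\) because the off-diagonal probabilities roughly double and \(10 \log_{10} 2 \approx 3\).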
Programming assignments are available via GitHub Classroom and contain automatic tests.
We recommend doing these assignments, since they will help you understand this topic further.
Access the GitHub Classroom link: Programming Assignment: Sheet 08.