LePMA HELP (Learning Progressive Multiple Alignment)
--Concepts--
FASTA Format
The FASTA format is one of the most widely used formats
for representing DNA sequences, which makes it easier to
process them in a uniform way.
In this format, each sequence is entered by writing, on
one line, the name of the sequence preceded by the symbol '>',
and, on the next line, the DNA sequence as a string of nucleotides.
This pattern is repeated for every sequence to be entered.
For example:
>s1
atgact
>s2
cct-a
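The format described above can be read with a few lines of Python.
This is only a minimal sketch, not LePMA's own reader: the function
name parse_fasta is hypothetical, and the sketch assumes that any
lines following a name line belong to that sequence.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs.

    Hypothetical helper for illustration; not part of LePMA.
    """
    records = []
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new name line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            # Sequence data; joined in case it spans several lines.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


example = ">s1\natgact\n>s2\ncct-a\n"
print(parse_fasta(example))  # [('s1', 'atgact'), ('s2', 'cct-a')]
```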
Guide Tree
The guide tree is a hierarchical tree,
similar to a phylogenetic tree, that graphically shows the
order of the successive alignments, as determined by the
similarity scores between the sequences or clusters.
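As an illustration of how a guide tree encodes the successive
alignments, the sketch below stores a tree as nested tuples
(left, right, score) and lists the alignments with children before
parents. The representation, the function name alignment_order and
the scores 0.9 and 0.5 are all assumptions made for this example;
they are not taken from LePMA.

```python
def alignment_order(tree):
    """List the alignments implied by a guide tree, children first.

    A leaf is a sequence name (str); an inner node is a tuple
    (left, right, score). This layout is an assumption for the sketch.
    """
    steps = []

    def visit(node):
        if isinstance(node, str):
            return node
        left, right, score = node
        left_label = visit(left)
        right_label = visit(right)
        steps.append((left_label, right_label, score))
        return "(" + left_label + "," + right_label + ")"

    visit(tree)
    return steps


# Hypothetical tree: s1 and s2 are aligned first, then s3 joins them.
tree = (("s1", "s2", 0.9), "s3", 0.5)
for left, right, score in alignment_order(tree):
    print("align", left, "with", right, "(similarity", str(score) + ")")
```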
Similarity Matrix
The Similarity Matrix is a symmetric matrix
that shows the similarities (or distances) between every
pair of sequences or clusters at each step of the multiple
alignment process.
In the first step, the scores in this
matrix are obtained by comparing all the sequences in pairs; from the second
step onward, an approximation formula is applied to calculate the
similarities involving the newly aligned clusters.
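The two phases described above can be sketched as follows. This help
text does not state the pairwise score or the approximation formula,
so both are assumptions here: difflib.SequenceMatcher.ratio() stands
in for the pairwise score, and a size-weighted average stands in for
the approximation formula. The function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pair_similarity(a, b):
    # Stand-in score in [0, 1]; the score LePMA really uses is not given here.
    return SequenceMatcher(None, a, b).ratio()


def initial_matrix(seqs):
    """First step: compare all sequences in pairs.

    seqs maps name -> sequence. The symmetric matrix is stored as a
    dict keyed by frozenset({x, y}), so each pair appears once.
    """
    return {
        frozenset((x, y)): pair_similarity(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }


def merge_clusters(matrix, a, b, sizes):
    """Later steps: replace clusters a and b by the new cluster (a, b).

    The similarity of the new cluster to every other cluster c is
    approximated by a size-weighted average (an assumed formula).
    Returns the new cluster, the updated matrix and the updated sizes.
    """
    new = (a, b)
    others = {c for pair in matrix for c in pair} - {a, b}
    updated = {p: s for p, s in matrix.items() if a not in p and b not in p}
    for c in others:
        s_a = matrix[frozenset((a, c))]
        s_b = matrix[frozenset((b, c))]
        total = sizes[a] + sizes[b]
        updated[frozenset((new, c))] = (sizes[a] * s_a + sizes[b] * s_b) / total
    new_sizes = {k: v for k, v in sizes.items() if k not in (a, b)}
    new_sizes[new] = sizes[a] + sizes[b]
    return new, updated, new_sizes


seqs = {"s1": "atgact", "s2": "atgacc", "s3": "cct-a"}
matrix = initial_matrix(seqs)
sizes = {name: 1 for name in seqs}
# Merge the most similar pair, as a progressive alignment would.
a, b = max(matrix, key=matrix.get)
cluster, matrix, sizes = merge_clusters(matrix, a, b, sizes)
print(cluster, matrix)
```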
Pair-Wise Scoring Matrix
The Pair-wise Scoring Matrix shows a
character-by-character comparison between two sequences, from which
the similarity score between them can be read.
In this program, this matrix can be displayed
by clicking any value in the first-step Similarity Matrix.
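One common way to build such a character-by-character matrix is the
Needleman-Wunsch dynamic-programming table for global alignment.
Whether LePMA uses exactly this method is an assumption, and the
scoring values (match +1, mismatch -1, gap -1) and the function name
scoring_matrix are hypothetical. In this sketch the bottom-right cell
holds the overall score of the two sequences.

```python
def scoring_matrix(a, b, match=1, mismatch=-1, gap=-1):
    """Fill a Needleman-Wunsch table for sequences a and b.

    Cell [i][j] is the best score for aligning a[:i] with b[:j].
    Scoring values are illustrative defaults, not LePMA's.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    # First column and row: a prefix aligned against gaps only.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                table[i - 1][j] + gap,       # gap in b
                table[i][j - 1] + gap,       # gap in a
            )
    return table


table = scoring_matrix("at", "a")
for row in table:
    print(row)
print("score:", table[-1][-1])  # score: 0
```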