cocoatree.msa.compute_seq_similarity
- cocoatree.msa.compute_seq_similarity(sequences, subst_matrix='BLOSUM62', gap_penalty=-4, n_jobs=1, verbose_parallel=0)
Computes a similarity matrix using a precalculated substitution matrix.
The similarity score for a pair of sequences is obtained as the sum of the substitution scores at each position of the sequence pair.
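The per-position sum described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the `TOY_SCORES` table and the `pair_score` helper are hypothetical names introduced here, the score values are illustrative only, and the rule that any position containing a gap contributes `gap_penalty` is an assumption.

```python
# Illustrative substitution scores for three residues (hypothetical toy table,
# not loaded from a real substitution matrix).
TOY_SCORES = {
    ("A", "A"): 4, ("A", "K"): -1, ("K", "K"): 5,
    ("A", "L"): -1, ("K", "L"): -2, ("L", "L"): 4,
}


def pair_score(seq_a, seq_b, scores, gap_penalty=-4):
    """Sum per-position substitution scores for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            # Assumption: a gap in either sequence contributes the gap penalty.
            total += gap_penalty
        else:
            # The toy table stores each unordered pair once, so try both orders.
            total += scores[(a, b)] if (a, b) in scores else scores[(b, a)]
    return total


# Score one pair with the default gap penalty, then with a milder one.
default_score = pair_score("AKL-", "AKAL", TOY_SCORES)
mild_score = pair_score("AKL-", "AKAL", TOY_SCORES, gap_penalty=-1)
print(default_score, mild_score)
```

Under these assumptions, making the gap penalty less negative raises the score of every pair that has gapped positions and leaves gap-free pairs unchanged.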
Parameters
- sequences : list of str
List of the Nseq sequences of the MSA.
- subst_matrix : str, default='BLOSUM62'
Name of the substitution matrix. Call Bio.Align.substitution_matrices.load() with no argument to obtain the list of available substitution matrices.
- gap_penalty : int, default=-4
Penalty score for gaps. Adjust this parameter to reflect your biological assumptions (e.g., -1 for a mild penalty, -10 for a harsh one).
- n_jobs : int, default=1 (no parallelization)
Maximum number of concurrently running jobs (-1 uses all available cores).
- verbose_parallel : int, default=0
Verbosity level for parallelization (see the joblib documentation).
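To pick a value for subst_matrix, the matrix names bundled with Biopython can be listed before calling the function. The sketch below assumes Biopython is installed and only uses `Bio.Align.substitution_matrices.load`. The variable names are illustrative.

```python
from Bio.Align import substitution_matrices

# Called without arguments, load() returns the names of the bundled matrices.
available = substitution_matrices.load()
print(available)

# Called with a name, load() returns the matrix, indexable by residue pair.
blosum62 = substitution_matrices.load("BLOSUM62")
print(blosum62["A", "A"])
```

Any name in the printed list should be usable as the subst_matrix argument, on the assumption that the function passes the name straight to Biopython.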
Returns
- similarity_matrix : np.ndarray
A (Nseq, Nseq) array of similarity scores.
Examples using cocoatree.msa.compute_seq_similarity
Plot a similarity heatmap of a XCoR along the phylogenetic tree