cocoatree.msa.compute_seq_similarity

cocoatree.msa.compute_seq_similarity(sequences, subst_matrix='BLOSUM62', gap_penalty=-4, n_jobs=1, verbose_parallel=0)

Computes a similarity matrix using a precalculated substitution matrix.

The similarity score for a pair of sequences is obtained as the sum of the substitution scores at each position of the sequence pair.
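The position-wise sum described above can be sketched in plain Python. This is an illustration, not the library's implementation: the toy score table, the helper names (`pair_score`, `similarity`, `similarity_matrix`), and the choice to charge `gap_penalty` at every position where either sequence has a gap are all assumptions made for the example.

```python
# Toy substitution scores for illustration only (not real BLOSUM62 values).
TOY_SCORES = {
    ("A", "A"): 4, ("C", "C"): 9, ("G", "G"): 6,
    ("A", "C"): 0, ("A", "G"): 0, ("C", "G"): -3,
}


def pair_score(a, b, scores, gap_penalty=-4):
    """Score one aligned position.

    Assumption: a gap ('-') in either sequence costs gap_penalty.
    """
    if a == "-" or b == "-":
        return gap_penalty
    # The table stores each unordered pair once, so try both orders.
    return scores[(a, b)] if (a, b) in scores else scores[(b, a)]


def similarity(seq1, seq2, scores, gap_penalty=-4):
    """Sum the per-position scores of two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(a, b, scores, gap_penalty) for a, b in zip(seq1, seq2))


def similarity_matrix(sequences, scores, gap_penalty=-4):
    """Build the symmetric Nseq x Nseq matrix of pairwise scores."""
    n = len(sequences)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = similarity(sequences[i], sequences[j], scores, gap_penalty)
            matrix[i][j] = matrix[j][i] = score
    return matrix


msa = ["ACG", "A-G", "CCG"]
matrix = similarity_matrix(msa, TOY_SCORES)
for row in matrix:
    print(row)
```

With these toy values, the pair "ACG" / "CCG" scores 0 + 9 + 6 = 15, and the pair "ACG" / "A-G" scores 4 - 4 + 6 = 6 because the middle position is charged the gap penalty.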

Parameters

sequences : list of str
    List of the Nseq sequences of the MSA.
subst_matrix : str, default='BLOSUM62'
    Name of the substitution matrix. Call Bio.Align.substitution_matrices.load() with no argument to obtain the list of available substitution matrices.
gap_penalty : int, default=-4
    Penalty score for gaps. Adjust this parameter to reflect biological assumptions (e.g., -1 for a mild penalty, -10 for a harsh one).
n_jobs : int, default=1 (no parallelization)
    Maximum number of concurrently running jobs (-1 uses all available cores).
verbose_parallel : int, default=0
    Verbosity level for parallelization (see the joblib documentation).
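To see which names are valid for the substitution matrix parameter, the Biopython loader can be queried directly. The short sketch below assumes Biopython is installed. It only inspects the matrices themselves and does not call cocoatree.

```python
from Bio.Align import substitution_matrices

# With no argument, load() returns the names of the bundled matrices.
available = substitution_matrices.load()
print(available)

# With a name, load() returns the matrix, indexable by a pair of residues.
blosum62 = substitution_matrices.load("BLOSUM62")
print(blosum62["A", "A"])  # score for an alanine/alanine match
print(blosum62["A", "W"])  # score for an alanine/tryptophan mismatch
```

Any name printed in the first list should be usable as the matrix name, provided the function passes it to the same loader.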