cocoatree.msa.compute_seq_weights
cocoatree.msa.compute_seq_weights(sequences, threshold=0.8, verbose_every=0, n_jobs=1, verbose_parallel=5)
Compute sequence weights.
Each sequence s is given a weight w_s = 1/N_s, where N_s is the number of sequences whose identity to s is above the specified threshold.
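The weighting rule can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names `fraction_identity` and `sketch_seq_weights` are hypothetical, and the sketch assumes that the sequences are aligned to equal length, that identity is the fraction of matching positions (gaps compared like any other character), and that each sequence counts itself in N_s.

```python
def fraction_identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def sketch_seq_weights(sequences, threshold=0.8):
    """Return one weight 1/N_s per sequence (illustrative sketch only)."""
    weights = []
    for i, s in enumerate(sequences):
        # Assumption: s always counts itself, so N_s starts at 1.
        n_s = 1
        for j, t in enumerate(sequences):
            if i != j and fraction_identity(s, t) > threshold:
                n_s += 1
        weights.append(1.0 / n_s)
    return weights


# Toy alignment: the first two sequences differ at one position out of ten
# (identity 0.9), the third shares no position with either of them.
msa = ["ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY"]
weights = sketch_seq_weights(msa)
print(weights)
```

With the default threshold of 0.8, the two near-identical sequences share their weight, while the unrelated sequence keeps a weight of 1. The double loop is quadratic in the number of sequences, which is the kind of cost the `n_jobs` parameter is meant to spread over several jobs.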
Parameters
sequences : list of sequences
threshold : float, optional, default: 0.8
sequence identity, given as a fraction between 0 and 1, above which sequences are considered identical
verbose_every : int, default: 0
if > 0, print a progress message every verbose_every sequences
n_jobs : int, default: 1 (no parallelization)
the maximum number of concurrently running jobs (see joblib doc)
verbose_parallel : int, default: 5
verbosity level for parallelization (see joblib doc)
Returns
weights : np.array of shape (nseq,)
weight of each sequence
m_eff : float
number of effective sequences
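How the two return values relate can be sketched as follows. This page does not define m_eff, so the sketch relies on an assumption: that the number of effective sequences is the sum of the sequence weights, as is conventional in coevolution analyses. The weight values are made up for illustration.

```python
# Hypothetical weights for a four-sequence alignment: two sequences are
# close to each other (weight 0.5 each) and two are unique (weight 1.0).
weights = [0.5, 0.5, 1.0, 1.0]

# Assumption (not stated on this page): the effective number of sequences
# is the sum of the per-sequence weights.
m_eff = sum(weights)
print(m_eff)
```

Under this assumption, m_eff equals the number of sequences when no two sequences exceed the identity threshold, and shrinks as the alignment becomes more redundant.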