cocoatree.msa.filter_ref_seq

cocoatree.msa.filter_ref_seq(sequences, sequences_id, delta=0.2, refseq_id=None, verbose=False)

Filter the alignment based on identity with a reference sequence

Remove sequences r with Sr < delta, where Sr is the fractional identity between r and a specified reference sequence.

Arguments

sequences : list of sequences in the MSA

sequences_id : list of sequence identifiers in the MSA

delta : identity threshold (default 0.2)

refseq_id : identifier of the reference sequence; if ‘None’, a reference sequence is computed (default ‘None’)

Returns

filt_seqs : filtered list of sequences

filt_seqs_id : corresponding list of sequence identifiers
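
Example

The filtering rule described above can be illustrated with a minimal, standalone sketch. This is not cocoatree's implementation: the helper names `fractional_identity` and `filter_by_reference` are hypothetical, the reference sequence is passed in directly rather than looked up by identifier or computed, and gap characters are treated like any other character, which may differ from what the library does.

```python
def fractional_identity(seq, ref):
    """Fraction of aligned positions where seq and ref carry the same character.

    Assumption: both sequences come from the same alignment, so they have
    equal length; gaps are compared like any other character.
    """
    if len(seq) != len(ref):
        raise ValueError("sequences must have the same aligned length")
    if not ref:
        return 0.0
    matches = sum(a == b for a, b in zip(seq, ref))
    return matches / len(ref)


def filter_by_reference(sequences, sequences_id, ref, delta=0.2):
    """Keep sequences whose identity to ref is at least delta.

    Sequences r with Sr < delta are removed, mirroring the rule stated above.
    """
    filt_seqs, filt_seqs_id = [], []
    for seq, seq_id in zip(sequences, sequences_id):
        if fractional_identity(seq, ref) >= delta:
            filt_seqs.append(seq)
            filt_seqs_id.append(seq_id)
    return filt_seqs, filt_seqs_id


# Toy alignment (made-up data): the first sequence serves as the reference.
sequences = ["ACDEF", "ACDEY", "WWWWW", "ACWWW"]
sequences_id = ["ref", "close", "far", "partial"]

kept, kept_id = filter_by_reference(
    sequences, sequences_id, ref=sequences[0], delta=0.5
)
print(kept_id)
```

With a threshold of 0.5, "far" (identity 0.0) and "partial" (identity 0.4) fall below delta and are dropped, while "ref" and "close" are kept.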