cocoatree.statistics.compute_all_frequencies
- cocoatree.statistics.compute_all_frequencies(sequences, seq_weights=None, freq_regul=0.03)
Compute single-site, background, and pairwise joint amino acid frequencies for a list of sequences.
Parameters
sequences : list of sequences
- seq_weights : {None, np.ndarray (n_seq)}
If None, the sequence weights are re-computed.
freq_regul : regularization parameter (default=__freq_regularization_ref, shown as 0.03 in the signature)
Returns
- aa_freqs : np.ndarray (nseq, 21)
A (nseq, 21) ndarray containing the amino acid frequencies at each position.
- bkgd_freqs : np.ndarray (21, )
A (21,) np.array containing the background amino acid frequencies. Each entry is the mean frequency of amino acid a in all proteins in the NCBI non-redundant database (see Rivoire et al., https://dx.plos.org/10.1371/journal.pcbi.1004817).
- aa_joint_freqs : np.ndarray (nseq, nseq, 21, 21)
An ndarray containing the pairwise joint frequencies of amino acids for each pair of positions in the list of provided sequences.
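To make the returned arrays more concrete, the following is a minimal NumPy sketch of how weighted single-site and pairwise joint frequencies could be computed. It is not cocoatree's implementation: the function name `sketch_frequencies`, the 21-symbol alphabet ordering, the uniform default weights, and the pseudocount form of the regularization are all assumptions made for illustration. In this sketch the first axis of each output indexes alignment positions, and the background frequencies are left out because they come from an external reference.

```python
import numpy as np

# Assumed alphabet: gap plus the 20 standard amino acids (ordering is hypothetical).
ALPHABET = "-ACDEFGHIKLMNPQRSTVWY"


def sketch_frequencies(sequences, seq_weights=None, freq_regul=0.03):
    """Illustrative sketch only; not the cocoatree implementation."""
    index = {aa: i for i, aa in enumerate(ALPHABET)}
    # Encode each residue as an integer; unknown symbols are treated as gaps.
    encoded = np.array([[index.get(aa, 0) for aa in seq] for seq in sequences])
    n_seq, _ = encoded.shape

    # Assumption: uniform weights when none are supplied.
    if seq_weights is None:
        seq_weights = np.ones(n_seq)
    weights = np.asarray(seq_weights, dtype=float)
    n_eff = weights.sum()

    # One-hot encoding with shape (n_seq, n_pos, 21).
    q = len(ALPHABET)
    one_hot = np.eye(q)[encoded]

    # Weighted frequency of each amino acid at each position.
    aa_freqs = np.einsum("s,spa->pa", weights, one_hot) / n_eff
    # Weighted joint frequency of amino acids (a, b) at positions (p, q).
    aa_joint_freqs = np.einsum("s,spa,sqb->pqab", weights, one_hot, one_hot) / n_eff

    # Assumed regularization: mix with a uniform pseudocount distribution.
    aa_freqs = (1 - freq_regul) * aa_freqs + freq_regul / q
    aa_joint_freqs = (1 - freq_regul) * aa_joint_freqs + freq_regul / q**2
    return aa_freqs, aa_joint_freqs


# Toy alignment: 3 sequences, 4 positions.
toy_sequences = ["ACD-", "ACDE", "AKDE"]
toy_freqs, toy_joint = sketch_frequencies(toy_sequences)
print(toy_freqs.shape)   # (4, 21)
print(toy_joint.shape)   # (4, 4, 21, 21)
```

With this choice of pseudocount, each row of the single-site array and each 21 x 21 block of the joint array sums to 1, and summing a joint block over its last axis recovers the regularized single-site frequencies.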