I am writing a script that analyzes codon usage in sequences using the codon-bias package.
I am trying to use the class codonbias.scores.FrequencyOfOptimalCodons, but when I call it in my code:
FOC = cb.scores.FrequencyOfOptimalCodons (ref_seq=sequence_list, genetic_code=11)
where sequence_list is a list of str objects containing ORFs and genetic_code is set to 11 (the bacterial, archaeal and plant plastid code), I get the following from my shell:
Traceback (most recent call last):
File "I:\R&D\Product Research Group\Metabolic&Regulatory modeling\Codon usage\Python scripts\CodonBias CAI analyzer.py", line 117, in <module>
codon_df, total_orfs, analyzed_orfs = analyze_codon_usage(orfs_file, trna_file)
File "I:\R&D\Product Research Group\Metabolic&Regulatory modeling\Codon usage\Python scripts\CodonBias CAI analyzer.py", line 71, in analyze_codon_usage
FOC = cb.scores.FrequencyOfOptimalCodons (ref_seq=sequence_list, genetic_code=11)
File "C:\Users\shlomog\AppData\Local\Programs\Python\Python312\Lib\site-packages\codonbias\scores.py", line 199, in __init__
self.weights = self.weights.droplevel('aa')
File "C:\Users\shlomog\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\generic.py", line 943, in droplevel
new_labels = labels.droplevel(level)
File "C:\Users\shlomog\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexes\base.py", line 2155, in droplevel
levnums = sorted(self._get_level_number(lev) for lev in level)[::-1]
File "C:\Users\shlomog\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexes\base.py", line 2155, in <genexpr>
levnums = sorted(self._get_level_number(lev) for lev in level)[::-1]
File "C:\Users\shlomog\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexes\multi.py", line 1660, in _get_level_number
raise ValueError(
ValueError: The name aa occurs multiple times, use a level number
Any idea what I'm doing wrong? I had the program print out the sequences in sequence_list before the call, and they appear in order.
The offending line is in the __init__ of FrequencyOfOptimalCodons (scores.py, line 199 in the traceback above).
Asked Mar 3 at 15:45 by Shlomo Goren; edited Mar 4 at 22:44 by Vasilis G.
1 Answer
To figure out what is going on inside the code, I modified FrequencyOfOptimalCodons in scores.py to look like this:
class FrequencyOfOptimalCodons(ScalarScore, VectorScore):
    """
    Frequency of Optimal Codons (FOP, Ikemura, J Mol Biol, 1981).

    This model determines the optimal codons for each amino acid based
    on their frequency in the given set of reference sequences
    `ref_seq`. Multiple codons may be selected as optimal based on
    `thresh`. The score for a sequence is the fraction of codons in
    the sequence deemed optimal. The returned vector for a sequence is
    a binary array where optimal positions contain 1 and non-optimal
    ones contain 0.

    Parameters
    ----------
    ref_seq : iterable of str
        A set of reference DNA sequences for codon usage statistics.
    thresh : float, optional
        Minimal ratio between the frequency of a codon and the most
        frequent one in order to be set as optimal, by default 0.95
    genetic_code : int, optional
        NCBI genetic code ID, by default 1
    ignore_stop : bool, optional
        Whether STOP codons will be discarded from the analysis, by
        default True
    pseudocount : int, optional
        Pseudocount correction for normalized codon frequencies. this is
        effective when `ref_seq` contains few short sequences. by default 1
    """

    def __init__(self, ref_seq, thresh=0.95, genetic_code=1,
                 ignore_stop=True, pseudocount=1):
        self.thresh = thresh
        self.counter = CodonCounter(genetic_code=genetic_code,
                                    ignore_stop=ignore_stop)
        self.pseudocount = pseudocount
        print('self.counter: ', self.counter , type(self.counter),'\n\n')
        for i in dir(self.counter):
            print(i)
        print('self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount) : ',
              self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount) ,
              type(self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount)),'\n\n')
        print("self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount).groupby('aa') : ",
              self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount).groupby('aa') ,
              type(self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount).groupby('aa')),"\n\n")
        self.weights = self.counter.count(ref_seq)\
            .get_aa_table(normed=True, pseudocount=pseudocount).groupby('aa').transform(lambda x: x / x.max())
            #.groupby('aa').apply(lambda x: x / x.max())
        print('self.weights ####### : ', self.weights , type(self.weights),'\n\n')
        #self.weights = self.counter.count(ref_seq)\
        #    .get_aa_table(normed=True, pseudocount=pseudocount)\
        #    .groupby('aa').apply(lambda x: x / x.max())
        #print('self.weights : ', self.weights , type(self.weights),'\n\n')
        self.weights[self.weights >= self.thresh] = 1  # optimal
        print('self.weights : ', self.weights , type(self.weights),'\n\n')
        self.weights[self.weights < self.thresh] = 0  # non-optimal
        print('self.weights : ', self.weights , type(self.weights),'\n\n')
        print(self.weights.to_string())
        #self.weights = self.weights.drop_duplicates()
        #print("self.weights.drop_duplicates() : ", self.weights , type(self.weights),'\n\n')
        #print(self.weights.to_string())
        self.weights = self.weights.droplevel('aa')
        print("self.weights.droplevel('aa') : ", self.weights , type(self.weights),'\n\n')
        print(self.weights.to_string())
        print('self.weights.values : \n', self.weights.values)
        print('self.weights.keys() : \n', self.weights.keys())

    def _calc_score(self, seq):
        #counts = self.counter.count(seq).counts
        #print('\nself.weights : \n', self.weights.to_string())
        #print('\ncounts : \n', counts)
        #return mean(self.weights, counts)
        print('(i[1] for i in self._calc_vector(seq))', [i[1] for i in self._calc_vector(seq)])
        print('len(seq)/3 ', len(seq)/3)
        return sum(i[1] for i in self._calc_vector(seq))/(len(seq)/3)

    def _calc_vector(self, seq):
        print('self._get_codon_vector(seq) : \n' , self._get_codon_vector(seq))
        #return self.weights.reindex(self._get_codon_vector(seq)).values
        return [(i , self.weights.get(key = i.upper())) for i in self._get_codon_vector(seq)]
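The change that matters in the class above is the switch from `.groupby('aa').apply(...)` to `.groupby('aa').transform(...)`. My understanding, which I have not confirmed against the package source, is that recent pandas versions (2.x) prepend the group key to the index of an `apply` result, so the index ends up with two levels named `aa` and `droplevel('aa')` becomes ambiguous. Here is a minimal sketch on a toy Series; the values are copied from the alanine and tyrosine rows of the output, and the names `freqs` and `scale` are made up for illustration:

```python
import pandas as pd

# Toy stand-in for the (aa, codon)-indexed frequency table.
# This is NOT the table the library builds, only a small look-alike.
index = pd.MultiIndex.from_tuples(
    [("A", "GCA"), ("A", "GCC"), ("A", "GCG"), ("A", "GCT"),
     ("Y", "TAC"), ("Y", "TAT")],
    names=["aa", "codon"],
)
freqs = pd.Series([0.071429, 0.357143, 0.5, 0.071429, 0.166667, 0.833333],
                  index=index, name="count")


def scale(group):
    """Divide each codon frequency by the largest one in its amino acid group."""
    return group / group.max()


# transform() returns a result aligned to the original index,
# so there is exactly one 'aa' level.
via_transform = freqs.groupby("aa").transform(scale)

# apply() with group_keys=False also leaves the original index untouched.
via_apply = freqs.groupby("aa", group_keys=False).apply(scale)

# apply() with default settings: print the level names to see what
# your pandas version produces.
default_apply = freqs.groupby("aa").apply(scale)
print(list(default_apply.index.names))
try:
    default_apply.droplevel("aa")
except ValueError as err:
    # Only reached when the 'aa' level name is duplicated.
    print("droplevel failed:", err)

print(via_transform.droplevel("aa"))
```

If the printed level names contain `aa` twice on your installation, that would explain the `ValueError` in the question, and either `transform` or `group_keys=False` should avoid it.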
Then run this test_code.py:
import codonbias as cb
sequence_list = ["atgccgaaaagcttttatgatgccgtgggcggcgcgaaaacctttgatgcgattgtgagc",
"cgcttttatgcgcaggtggcggaagatgaagtgctgcgccgcgtgtatccggaagatgat",
"ctggcgggcgcggaagaacgcctgcgcatgtttctggaacagtattggggcggcccgcgc",
"aagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaag",
"atgatgatggccgccgcc"]
FOC = cb.scores.FrequencyOfOptimalCodons(ref_seq=sequence_list, genetic_code=11)
print('\n\nFOC : \n', FOC)
print('score: \n',FOC._calc_score(sequence_list[0]))
print('calc : \n', FOC._calc_vector(sequence_list[0]))
print('calc : \n', FOC._calc_vector(sequence_list[1]))
print('calc : \n', FOC._calc_vector(sequence_list[2]))
print('calc : \n', FOC._calc_vector(sequence_list[3]))
print('score: \n',FOC._calc_score(sequence_list[3]))
print('calc : \n', FOC._calc_vector(sequence_list[4]))
print('score: \n',FOC._calc_score(sequence_list[4]))
The output is rather long and needs to be studied:
self.counter: <codonbias.stats.CodonCounter object at 0x7f9b64e1d0d0> <class 'codonbias.stats.CodonCounter'>
__class__
__delattr__
__dict__
__dir__
__doc__
__eq__
__format__
__ge__
__getattribute__
__gt__
__hash__
__init__
__init_subclass__
__le__
__lt__
__module__
__ne__
__new__
__reduce__
__reduce_ex__
__repr__
__setattr__
__sizeof__
__str__
__subclasshook__
__weakref__
_count
_count_single
_format_counts
_init_table
concat_index
count
genetic_code
get_aa_table
get_codon_table
ignore_stop
k_mer
sum_seqs
self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount) : aa codon
A GCA 0.071429
GCC 0.357143
GCG 0.500000
GCT 0.071429
C TGC 0.500000
...
V GTG 0.666667
GTT 0.111111
W TGG 1.000000
Y TAC 0.166667
TAT 0.833333
Name: count, Length: 61, dtype: float64 <class 'pandas.core.series.Series'>
self.counter.count(ref_seq).get_aa_table(normed=True, pseudocount=pseudocount).groupby('aa') : <pandas.core.groupby.generic.SeriesGroupBy object at 0x7f9b64db78e0> <class 'pandas.core.groupby.generic.SeriesGroupBy'>
self.weights ####### : aa codon
A GCA 0.142857
GCC 0.714286
GCG 1.000000
GCT 0.142857
C TGC 1.000000
...
V GTG 1.000000
GTT 0.166667
W TGG 1.000000
Y TAC 0.200000
TAT 1.000000
Name: count, Length: 61, dtype: float64 <class 'pandas.core.series.Series'>
self.weights : aa codon
A GCA 0.142857
GCC 0.714286
GCG 1.000000
GCT 0.142857
C TGC 1.000000
...
V GTG 1.000000
GTT 0.166667
W TGG 1.000000
Y TAC 0.200000
TAT 1.000000
Name: count, Length: 61, dtype: float64 <class 'pandas.core.series.Series'>
self.weights : aa codon
A GCA 0.0
GCC 0.0
GCG 1.0
GCT 0.0
C TGC 1.0
...
V GTG 1.0
GTT 0.0
W TGG 1.0
Y TAC 0.0
TAT 1.0
Name: count, Length: 61, dtype: float64 <class 'pandas.core.series.Series'>
aa codon
A GCA 0.0
GCC 0.0
GCG 1.0
GCT 0.0
C TGC 1.0
TGT 1.0
D GAC 0.0
GAT 1.0
E GAA 1.0
GAG 0.0
F TTC 0.0
TTT 1.0
G GGA 0.0
GGC 1.0
GGG 0.0
GGT 0.0
H CAC 1.0
CAT 1.0
I ATA 0.0
ATC 0.0
ATT 1.0
K AAA 0.0
AAG 1.0
L CTA 0.0
CTC 0.0
CTG 1.0
CTT 0.0
TTA 0.0
TTG 0.0
M ATG 1.0
N AAC 1.0
AAT 1.0
P CCA 0.0
CCC 0.0
CCG 1.0
CCT 0.0
Q CAA 0.0
CAG 1.0
R AGA 0.0
AGG 0.0
CGA 0.0
CGC 1.0
CGG 0.0
CGT 0.0
S AGC 1.0
AGT 0.0
TCA 0.0
TCC 0.0
TCG 0.0
TCT 0.0
T ACA 0.0
ACC 1.0
ACG 0.0
ACT 0.0
V GTA 0.0
GTC 0.0
GTG 1.0
GTT 0.0
W TGG 1.0
Y TAC 0.0
TAT 1.0
self.weights.droplevel('aa') : codon
GCA 0.0
GCC 0.0
GCG 1.0
GCT 0.0
TGC 1.0
...
GTG 1.0
GTT 0.0
TGG 1.0
TAC 0.0
TAT 1.0
Name: count, Length: 61, dtype: float64 <class 'pandas.core.series.Series'>
codon
GCA 0.0
GCC 0.0
GCG 1.0
GCT 0.0
TGC 1.0
TGT 1.0
GAC 0.0
GAT 1.0
GAA 1.0
GAG 0.0
TTC 0.0
TTT 1.0
GGA 0.0
GGC 1.0
GGG 0.0
GGT 0.0
CAC 1.0
CAT 1.0
ATA 0.0
ATC 0.0
ATT 1.0
AAA 0.0
AAG 1.0
CTA 0.0
CTC 0.0
CTG 1.0
CTT 0.0
TTA 0.0
TTG 0.0
ATG 1.0
AAC 1.0
AAT 1.0
CCA 0.0
CCC 0.0
CCG 1.0
CCT 0.0
CAA 0.0
CAG 1.0
AGA 0.0
AGG 0.0
CGA 0.0
CGC 1.0
CGG 0.0
CGT 0.0
AGC 1.0
AGT 0.0
TCA 0.0
TCC 0.0
TCG 0.0
TCT 0.0
ACA 0.0
ACC 1.0
ACG 0.0
ACT 0.0
GTA 0.0
GTC 0.0
GTG 1.0
GTT 0.0
TGG 1.0
TAC 0.0
TAT 1.0
self.weights.values :
[0. 0. 1. 0. 1. 1. 0. 1. 1. 0. 0. 1. 0. 1. 0. 0. 1. 1. 0. 0. 1. 0. 1. 0.
0. 1. 0. 0. 0. 1. 1. 1. 0. 0. 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0.
0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0. 1.]
self.weights.keys() :
Index(['GCA', 'GCC', 'GCG', 'GCT', 'TGC', 'TGT', 'GAC', 'GAT', 'GAA', 'GAG',
'TTC', 'TTT', 'GGA', 'GGC', 'GGG', 'GGT', 'CAC', 'CAT', 'ATA', 'ATC',
'ATT', 'AAA', 'AAG', 'CTA', 'CTC', 'CTG', 'CTT', 'TTA', 'TTG', 'ATG',
'AAC', 'AAT', 'CCA', 'CCC', 'CCG', 'CCT', 'CAA', 'CAG', 'AGA', 'AGG',
'CGA', 'CGC', 'CGG', 'CGT', 'AGC', 'AGT', 'TCA', 'TCC', 'TCG', 'TCT',
'ACA', 'ACC', 'ACG', 'ACT', 'GTA', 'GTC', 'GTG', 'GTT', 'TGG', 'TAC',
'TAT'],
dtype='object', name='codon')
FOC :
<codonbias.scores.FrequencyOfOptimalCodons object at 0x7f46da53ebe0>
self._get_codon_vector( atgccgaaaagcttttatgatgccgtgggcggcgcgaaaacctttgatgcgattgtgagc ) :
['atg', 'ccg', 'aaa', 'agc', 'ttt', 'tat', 'gat', 'gcc', 'gtg', 'ggc', 'ggc', 'gcg', 'aaa', 'acc', 'ttt', 'gat', 'gcg', 'att', 'gtg', 'agc']
(i[1] for i in self._calc_vector( atgccgaaaagcttttatgatgccgtgggcggcgcgaaaacctttgatgcgattgtgagc )) [1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
len(seq)/3 20.0
self._get_codon_vector( atgccgaaaagcttttatgatgccgtgggcggcgcgaaaacctttgatgcgattgtgagc ) :
['atg', 'ccg', 'aaa', 'agc', 'ttt', 'tat', 'gat', 'gcc', 'gtg', 'ggc', 'ggc', 'gcg', 'aaa', 'acc', 'ttt', 'gat', 'gcg', 'att', 'gtg', 'agc']
score:
0.85
self._get_codon_vector( atgccgaaaagcttttatgatgccgtgggcggcgcgaaaacctttgatgcgattgtgagc ) :
['atg', 'ccg', 'aaa', 'agc', 'ttt', 'tat', 'gat', 'gcc', 'gtg', 'ggc', 'ggc', 'gcg', 'aaa', 'acc', 'ttt', 'gat', 'gcg', 'att', 'gtg', 'agc']
calc :
[('atg', 1.0), ('ccg', 1.0), ('aaa', 0.0), ('agc', 1.0), ('ttt', 1.0), ('tat', 1.0), ('gat', 1.0), ('gcc', 0.0), ('gtg', 1.0), ('ggc', 1.0), ('ggc', 1.0), ('gcg', 1.0), ('aaa', 0.0), ('acc', 1.0), ('ttt', 1.0), ('gat', 1.0), ('gcg', 1.0), ('att', 1.0), ('gtg', 1.0), ('agc', 1.0)]
self._get_codon_vector( cgcttttatgcgcaggtggcggaagatgaagtgctgcgccgcgtgtatccggaagatgat ) :
['cgc', 'ttt', 'tat', 'gcg', 'cag', 'gtg', 'gcg', 'gaa', 'gat', 'gaa', 'gtg', 'ctg', 'cgc', 'cgc', 'gtg', 'tat', 'ccg', 'gaa', 'gat', 'gat']
calc :
[('cgc', 1.0), ('ttt', 1.0), ('tat', 1.0), ('gcg', 1.0), ('cag', 1.0), ('gtg', 1.0), ('gcg', 1.0), ('gaa', 1.0), ('gat', 1.0), ('gaa', 1.0), ('gtg', 1.0), ('ctg', 1.0), ('cgc', 1.0), ('cgc', 1.0), ('gtg', 1.0), ('tat', 1.0), ('ccg', 1.0), ('gaa', 1.0), ('gat', 1.0), ('gat', 1.0)]
self._get_codon_vector( ctggcgggcgcggaagaacgcctgcgcatgtttctggaacagtattggggcggcccgcgc ) :
['ctg', 'gcg', 'ggc', 'gcg', 'gaa', 'gaa', 'cgc', 'ctg', 'cgc', 'atg', 'ttt', 'ctg', 'gaa', 'cag', 'tat', 'tgg', 'ggc', 'ggc', 'ccg', 'cgc']
calc :
[('ctg', 1.0), ('gcg', 1.0), ('ggc', 1.0), ('gcg', 1.0), ('gaa', 1.0), ('gaa', 1.0), ('cgc', 1.0), ('ctg', 1.0), ('cgc', 1.0), ('atg', 1.0), ('ttt', 1.0), ('ctg', 1.0), ('gaa', 1.0), ('cag', 1.0), ('tat', 1.0), ('tgg', 1.0), ('ggc', 1.0), ('ggc', 1.0), ('ccg', 1.0), ('cgc', 1.0)]
self._get_codon_vector( aagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaag ) :
['aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag']
calc :
[('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0), ('aag', 1.0)]
self._get_codon_vector( aagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaag ) :
['aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag']
(i[1] for i in self._calc_vector( aagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaag )) [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
len(seq)/3 20.0
self._get_codon_vector( aagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaagaag ) :
['aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag', 'aag']
score:
1.0
self._get_codon_vector( atgatgatggccgccgcc ) :
['atg', 'atg', 'atg', 'gcc', 'gcc', 'gcc']
calc :
[('atg', 1.0), ('atg', 1.0), ('atg', 1.0), ('gcc', 0.0), ('gcc', 0.0), ('gcc', 0.0)]
self._get_codon_vector( atgatgatggccgccgcc ) :
['atg', 'atg', 'atg', 'gcc', 'gcc', 'gcc']
(i[1] for i in self._calc_vector( atgatgatggccgccgcc )) [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
len(seq)/3 6.0
self._get_codon_vector( atgatgatggccgccgcc ) :
['atg', 'atg', 'atg', 'gcc', 'gcc', 'gcc']
score:
0.5
I am not sure I got it right, but as far as I can tell the output now matches what the docs state.
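To check the output by hand, here is a small standalone sketch of the step the docstring describes: divide each codon frequency by the largest one for the same amino acid, then mark everything at or above `thresh` as optimal. It does not use the library; `optimal_flags` is a name I made up, the alanine values are copied from the printed table, and the TGT value is inferred (TGC is printed as 0.5 and the two cysteine frequencies sum to 1).

```python
# Normalized codon frequencies per amino acid, taken from the printed table.
ala_freqs = {"GCA": 0.071429, "GCC": 0.357143, "GCG": 0.500000, "GCT": 0.071429}
cys_freqs = {"TGC": 0.5, "TGT": 0.5}  # TGT inferred as 1 - 0.5
thresh = 0.95  # default value from the docstring


def optimal_flags(freqs, thresh):
    """Return 1.0 for codons whose ratio to the most frequent codon is >= thresh, else 0.0."""
    top = max(freqs.values())
    return {codon: 1.0 if f / top >= thresh else 0.0 for codon, f in freqs.items()}


ala_flags = optimal_flags(ala_freqs, thresh)
cys_flags = optimal_flags(cys_freqs, thresh)
print(ala_flags)  # {'GCA': 0.0, 'GCC': 0.0, 'GCG': 1.0, 'GCT': 0.0}
print(cys_flags)  # {'TGC': 1.0, 'TGT': 1.0}
```

This agrees with the weights in the output: only GCG is optimal for alanine, while the tie between TGC and TGT makes both cysteine codons optimal.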
As per the comments under your question:
The error is in a pandas DataFrame, so this issue is likely due to the way the codon-bias package constructs or processes its internal DataFrame from the reference sequences you provide. Check the content of your sequence_list. You could test with a very simple one and see if the error still occurs. – Lewis, commented Mar 5 at 6:23
There is something wrong in the class definition, meaning it doesn't work; let the developer/maintainer know about this bug. Also, the score for a sequence is the fraction of codons in the sequence deemed optimal, and the returned vector for a sequence is a binary array where optimal positions contain 1 and non-optimal ones contain 0. These have to be called on FOC, like FOC._calc_score(seq) and FOC._calc_vector(seq).
BE WARNED:
calculating the score of a sequence whose length is not a multiple of three (one with 1 or 2 extra nucleotides at its 3' end) will throw an error like:
line 202, in _calc_score
return sum(i[1] for i in self._calc_vector(seq))/(len(seq)/3)
TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
To avoid it, you can change def _calc_score to:
`def _calc_score(self, seq):
    return sum(i[1] for i in self._calc_vector(seq) if isinstance(i[1], float))/(len(seq)/3)
`
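An alternative is to trim the sequence to whole codons before scoring, so the denominator counts only the codons that were actually scored. This is a standalone sketch that does not touch the library: `split_codons`, `fop_score` and the small `weights` dict are hypothetical names, and the weights are a subset of the 0/1 table printed above.

```python
def split_codons(seq):
    """Split a DNA string into upper-case triplets, dropping 1-2 trailing nucleotides."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fop_score(seq, weights):
    """Fraction of scored codons that are optimal; codons missing from `weights` are skipped."""
    codons = [c for c in split_codons(seq) if c in weights]
    if not codons:
        return float("nan")
    return sum(weights[c] for c in codons) / len(codons)


# Hypothetical subset of the binary weights shown in the output.
weights = {"ATG": 1.0, "GCC": 0.0, "GCG": 1.0}

score_exact = fop_score("atgatgatggccgccgcc", weights)    # 6 whole codons
score_extra = fop_score("atgatgatggccgccgccga", weights)  # 2 trailing nucleotides
print(score_exact)  # 0.5
print(score_extra)  # 0.5
```

Note that this differs slightly from the one-line change above, which still divides by len(seq)/3 and therefore gives a somewhat lower score when the sequence has leftover nucleotides.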
I strongly suggest opening an issue on the project's GitHub page:
https://github.com/alondmnt/codon-bias/issues