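I've tried this double for loop, and it doesn't work. (See below.)
Basically, I have a list of constructs and a list of primers. The primers are associated with the constructs by "construct number" and "part number". (Each construct is made up of multiple parts.) For each part there is a "forward" and a "reverse" primer. For those interested in molecular biology, I'm basically writing a script to help me with PCR.
What I want to do: I want to search the primers list for the primers that should be associated with a construct part, and join them into a master list. For example, if I have a list containing EMP792 (fw) and EMP793 (re) (they are on different rows), and they are associated with part #2 of construct #1 in my construct list, I want to be able to search "primers_list" for the corresponding fw and re primers. If a construct part has no associated primers in the list, I want to skip that construct for now.
The strategy I used: I wrote a nested for loop. For each construct in the construct list, I want it to search the primers list for the fw and re primers. I know this is inefficient, but as a beginner it was the only approach I could think of. I added some conditions that check whether primers exist for a construct by comparing the construct number and part number associated with each primer.
The problem I'm facing: for each construct in the list, the loop does not search the entire primers_list. It seems to automatically skip all the primers that were compared earlier and only compares the next primer that has not been compared yet. This causes problems during processing. If you run the code with the associated data sets (which I have also pasted below the code), you will find that constructs that should print their associated primers end up with none, and I've had a real headache trying to figure out where it goes wrong (haha...)!
Any help would be greatly appreciated!
Code: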
with open('constructs-to-make-shortened2.csv', 'rU') as constructs:
    construct_list = csv.DictReader(constructs)
    with open('primers-with-notes-names.csv', 'rU') as primers:
        primers_list = csv.DictReader(primers)
        #make list of constructs for checking later on#
##        construct_numbers_list = []
##        for row in primers_list:
##            construct_numbers_list.append(row['construct number'])
##
##        print(construct_numbers_list)
        for construct in construct_list:
##            print('Currently at construct number ' + construct['Construct'])
##            print('Construct counter at ' + str(construct_counter))
##            print('Part number counter is at ' + str(part_number))
            master_row = {}
            master_row['construct'] = construct['Construct']
            master_row['strategy'] = construct['Strategy']
            master_row['construct name'] = construct['Construct Name']
            master_row['sequence'] = construct['Sequence']
            master_row['source'] = construct['Source']
            master_row['content'] = construct['Content']
            print('We are at construct number ' + str(construct['Construct']))
            print('Construct counter is at ' + str(construct_counter))
            is_next_construct = (int(construct['Construct']) > construct_counter)
            print('Are we at the next construct?')
            print(is_next_construct)
            if is_next_construct:
                part_number = 1
                construct_counter = int(construct['Construct'])
                print('Part number is now ' + str(part_number))
            for primer in primers_list:
                print(primer)
##                print('Is primer ' + str(primer['name']) + ' associated with the construct?')
                is_associated_with_construct = bool(primer['construct number'] == construct['Construct'] and str(primer['part number']) == str(part_number))
##                print(is_associated_with_construct)
                if(is_associated_with_construct == False):
                    break
                is_forward = bool(primer['construct number'] == construct['Construct'] and str(primer['part number']) == str(part_number) and primer['direction'] == 'fw primer')
                print('Primer ' + str(primer['name']) + ' is a forward primer?')
                print(is_forward)
                is_reverse = bool(primer['construct number'] == construct['Construct'] and str(primer['part number']) == str(part_number) and primer['direction'] == 're primer')
                print('Primer ' + str(primer['name']) + ' is a reverse primer?')
                print(is_reverse)
                if is_forward:
                    master_row['primer1'] = primer['name']
                    master_row['primer1 sequence'] = primer['primer sequence']
                    master_row['primer1 description'] = primer['notes']
                    master_row['primer1 length'] = primer['length']
##                    print(master_row)
                    continue
                elif is_reverse:
                    master_row['primer2'] = primer['name']
                    master_row['primer2 sequence'] = primer['primer sequence']
                    master_row['primer2 description'] = primer['notes']
                    master_row['primer2 length'] = primer['length']
##                    print(master_row)
                    part_number += 1
                    print('Part number now = ' + str(part_number) + '\n')
                    master_list.append(master_row)
                    break
Data subset (constructs) (exact sequences removed to stay within the SO character limit):
{'Sequence': '', 'Construct': '12', 'Strategy': 'Gibson', 'Content': 'Amp resistance marker', 'Source': 'pEM096', 'Construct Name': 'T7 RNAP core on BAC ori only with AmpR'}
{'Sequence': '', 'Construct': '12', 'Strategy': 'Gibson', 'Content': 'BAC origin and T7 RNAP core', 'Source': 'THSS301', 'Construct Name': 'T7 RNAP core on BAC ori only with AmpR'}
{'Sequence': '', 'Construct': '13', 'Strategy': 'Cut Gibson', 'Content': 'lycopene pathway (crtE.B.I.dxs.idi)', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid'}
{'Sequence': '', 'Construct': '13', 'Strategy': 'Cut Gibson', 'Content': 'vioABE pathway and pSC101 ori and CmR; digest with EcoRI and XbaI', 'Source': 'KT-587', 'Construct Name': 'Combined vio and lyc plasmid'}
{'Sequence': '', 'Construct': '14', 'Strategy': 'Cut Gibson', 'Content': 'lycopene pathway (crtE.B.I.dxs.idi)', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid, with lyc in reverse direction'}
{'Sequence': '', 'Construct': '14', 'Strategy': 'Cut Gibson', 'Content': 'vioABE pathway and pSC101 ori and CmR; digest with EcoRI and XbaI', 'Source': 'KT-587', 'Construct Name': 'Combined vio and lyc plasmid, with lyc in reverse direction'}
{'Sequence': '', 'Construct': '15', 'Strategy': 'Gibson', 'Content': 'vioABE pathway with random nucleotide spacers', 'Source': 'KT-587', 'Construct Name': 'Combined vio and lyc plasmid made by high GC polymerase'}
{'Sequence': '', 'Construct': '15', 'Strategy': 'Gibson', 'Content': 'lycopene pathway (crtE.B.I.dxs.idi)', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid made by high GC polymerase'}
{'Sequence': '', 'Construct': '15', 'Strategy': 'Gibson', 'Content': 'pSC101 origin of replication and CmR resistance marker', 'Source': 'KT-537', 'Construct Name': 'Combined vio and lyc plasmid made by high GC polymerase'}
{'Sequence': '', 'Construct': '16', 'Strategy': 'Gibson', 'Content': 'P(tac)-SynZip18-T7 fragment', 'Source': 'THSS303', 'Construct Name': 'P(tac)-T7 fragment controller'}
{'Sequence': '', 'Construct': '16', 'Strategy': 'Gibson', 'Content': 'IncW backbone and TpR resistance and lacIq', 'Source': 'pEM103', 'Construct Name': 'P(tac)-T7 fragment controller'}
{'Sequence': '', 'Construct': '17', 'Strategy': 'Gibson', 'Content': 'P(tac)-SynZip18-T3 fragment', 'Source': 'THSS304', 'Construct Name': 'P(tac)-T3 fragment controller'}
{'Sequence': '', 'Construct': '17', 'Strategy': 'Gibson', 'Content': 'IncW backbone and TpR resistance and lacIq', 'Source': 'pEM103', 'Construct Name': 'P(tac)-T3 fragment controller'}
Data subset (primers):
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP790', 'primer sequence': 'gtttgtcggtgaactaattCttattaccaatgcttaatcagggaggcacctatctcagcg', 'notes': 'Fw Gibson primer on pEM096 to extract Amp resistance marker', 'length': '60', 'construct number': '12'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP787', 'primer sequence': 'gatgaggatcgtttcgcatgctaaatacattcaaatatctatccgctcatgagacaataa', 'notes': 'Re Gibson primer on pEM096 to extract Amp resistance marker', 'length': '60', 'construct number': '12'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP788', 'primer sequence': 'agatatttgaatgtatttagcatgcgaaacgatcctcatcctgtctcttgatcagatctt', 'notes': 'Fw Gibson primer on THSS301 to extract BAC and R6K origins and T7 RNAP core', 'length': '60', 'construct number': '12'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP791', 'primer sequence': 'tgattaagcattggtaataaGaattagttcaccgacaaacaacagataaaacgaaaggcc', 'notes': 'Re Gibson primer on THSS301 to extract BAC origin and T7 RNAP core', 'length': '60', 'construct number': '12'}
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP792', 'primer sequence': 'aaggaatattcagcaatttgGTTGGGGATAGCGCTAGCTATAATAactaTCACTATAGGG', 'notes': 'Fw Gibson primer on KT-587 to extract vioABE pathway with random nucleotide spacers', 'length': '60', 'construct number': '15'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP793', 'primer sequence': 'gggcctttcttcggcacgggGTTGTAGCAGGCGTCTTTGTCAAAAAACCCCTCAAGACCC', 'notes': 'Re Gibson primer on KT-587 to extract vioABE pathway with random nucleotide spacers', 'length': '60', 'construct number': '15'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP794', 'primer sequence': 'ACAAAGACGCCTGCTACAACcccgtgccgaagaaaggcccacccgtgaaggtgagccagt', 'notes': 'Fw Gibson primer on KT-537 to extract lycopene pathway (crtE.B.I.dxs.idi)', 'length': '60', 'construct number': '15'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP795', 'primer sequence': 'gaggtcattactggatctaTcccgtgccgaagaaaggcccacccgtgaaggtgagccagt', 'notes': 'Re Gibson primer on KT-537 to extract lycopene pathway (crtE.B.I.dxs.idi)', 'length': '60', 'construct number': '15'}
{'part number': '3', 'direction': 'fw primer', 'name': 'EMP796', 'primer sequence': 'gggcctttcttcggcacgggAtagatccagtaatgacctcagaactccatctggatttgt', 'notes': 'Fw Gibson primer on KT-537 to extract pSC101 origin of replication and CmR resistance marker', 'length': '60', 'construct number': '15'}
{'part number': '3', 'direction': 're primer', 'name': 'EMP797', 'primer sequence': 'TAGCTAGCGCTATCCCCAACcaaattgctgaatattccttttcttagacgtcaggtggca', 'notes': 'Re Gibson primer on KT-537 to extract pSC101 origin of replication and CmR resistance marker', 'length': '60', 'construct number': '15'}
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP798', 'primer sequence': 'aaatattctgaaatgagctgttgacaattaatcatcggctcgtataatgtgtggaattgt', 'notes': 'Fw Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '16'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP799', 'primer sequence': 'attaccgcctttgagtgagccccaatgataaccccaagggaagttttagtcaaaagcctc', 'notes': 'Re Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '16'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP800', 'primer sequence': 'cccttggggttatcattggggctcactcaaaggcggtaatcagataaaaaaaatccttag', 'notes': 'Fw Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '16'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP801', 'primer sequence': 'agccgatgattaattgtcaacagctcatttcagaatatttgccagaaccgttatgatgtc', 'notes': 'Re Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '16'}
{'part number': '1', 'direction': 'fw primer', 'name': 'EMP798', 'primer sequence': 'aaatattctgaaatgagctgttgacaattaatcatcggctcgtataatgtgtggaattgt', 'notes': 'Fw Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '17'}
{'part number': '1', 'direction': 're primer', 'name': 'EMP799', 'primer sequence': 'attaccgcctttgagtgagccccaatgataaccccaagggaagttttagtcaaaagcctc', 'notes': 'Re Gibson primer on THSS303 to extract P(tac)-SynZip18-T7 fragment', 'length': '60', 'construct number': '17'}
{'part number': '2', 'direction': 'fw primer', 'name': 'EMP800', 'primer sequence': 'cccttggggttatcattggggctcactcaaaggcggtaatcagataaaaaaaatccttag', 'notes': 'Fw Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '17'}
{'part number': '2', 'direction': 're primer', 'name': 'EMP801', 'primer sequence': 'agccgatgattaattgtcaacagctcatttcagaatatttgccagaaccgttatgatgtc', 'notes': 'Re Gibson primer on pEM103 to extract IncW backbone and TpR resistance and lacIq', 'length': '60', 'construct number': '17'}
Posted on 2013-01-27 03:42:49
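The problem is that you are iterating over a csv.DictReader object, which is not a list but an iterator.
The difference between the two is that with an iterator you cannot "go back to the beginning". On every pass of the outer loop, the inner iteration over primers_list resumes from wherever it last stopped instead of starting over.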
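The one-pass behaviour can be seen in isolation with a minimal sketch. The two-row CSV text below is a made-up stand-in for the primers file, and io.StringIO is used only so the snippet runs without any file on disk:

```python
import csv
import io

# Hypothetical two-row CSV standing in for the primers file.
data = io.StringIO("name,direction\nEMP790,fw primer\nEMP787,re primer\n")
reader = csv.DictReader(data)

first_pass = [row["name"] for row in reader]
second_pass = [row["name"] for row in reader]  # the reader is already exhausted

print(first_pass)   # ['EMP790', 'EMP787']
print(second_pass)  # []
```

The second loop yields nothing because the reader has already consumed the whole input, which matches what the inner loop in the question does after its first run.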
If you want to be able to iterate over all the items more than once, and you have enough memory, store them in a list:
primers_list = list(csv.DictReader(primers))
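As a rough sketch of how that line fits into a nested loop, the snippet below uses small in-memory CSV texts that are invented for illustration (only the column names "Construct", "name" and "construct number" come from the question), and the matches dictionary is a hypothetical helper, not part of the original script:

```python
import csv
import io

# Invented miniature inputs; the real script reads these from CSV files.
constructs_csv = io.StringIO("Construct\n12\n15\n")
primers_csv = io.StringIO(
    "name,construct number\nEMP790,12\nEMP787,12\nEMP792,15\n"
)

# Read all primer rows into memory once.
primers_list = list(csv.DictReader(primers_csv))

matches = {}
for construct in csv.DictReader(constructs_csv):
    number = construct["Construct"]
    # A list can be scanned from its first element on every outer iteration.
    matches[number] = [
        p["name"] for p in primers_list if p["construct number"] == number
    ]

print(matches)  # {'12': ['EMP790', 'EMP787'], '15': ['EMP792']}
```

The outer reader can stay an iterator because it is only traversed once; only the sequence that is scanned repeatedly needs to be a list.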
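If you would rather keep memory usage low, you can create the DictReader object from scratch on each pass through the loop. This adds some (probably small) execution-time overhead, and you should take care to close the file each time by moving the with statement into the loop.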
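A minimal sketch of this variant, assuming Python 3: a temporary file with invented contents plays the role of the primers CSV, and the hard-coded list of construct numbers stands in for the outer loop over constructs.

```python
import csv
import os
import tempfile

# Hypothetical primers file written to a temporary location for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("name,construct number\nEMP790,12\nEMP792,15\n")
    primers_path = tmp.name

found = []
for construct_number in ["12", "15"]:
    # Reopen the file on every outer iteration; the with block closes it again.
    with open(primers_path, newline="") as primers:
        for primer in csv.DictReader(primers):
            if primer["construct number"] == construct_number:
                found.append((construct_number, primer["name"]))

os.remove(primers_path)
print(found)  # [('12', 'EMP790'), ('15', 'EMP792')]
```

Only one primer row is held in memory at a time, at the cost of rereading the file once per construct.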
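Another approach is to call primers.seek(0) at the end of the loop body so that the next iteration starts reading from the beginning of the file, but I'm not sure whether that is a good idea.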
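One reason for caution, sketched below with invented in-memory data: a DictReader that has already read its header keeps those field names, so after rewinding it treats the header line as an ordinary data row. Building a fresh DictReader after the seek avoids that.

```python
import csv
import io

# Invented stand-in for the primers file.
primers = io.StringIO("name,construct number\nEMP790,12\nEMP792,15\n")
reader = csv.DictReader(primers)

first_pass = [row["name"] for row in reader]

primers.seek(0)  # rewind the underlying file object, keep the same reader
second_pass = [row["name"] for row in reader]  # header line comes back as a row

primers.seek(0)  # rewind again, this time with a new reader
third_pass = [row["name"] for row in csv.DictReader(primers)]

print(first_pass)   # ['EMP790', 'EMP792']
print(second_pass)  # ['name', 'EMP790', 'EMP792']
print(third_pass)   # ['EMP790', 'EMP792']
```

So if seek(0) is used with the existing reader, the loop would need to skip that extra header row on every pass after the first.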
https://stackoverflow.com/questions/14540767