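I have a very large text file and need to search its lines: if a line contains a certain value, I want to pull out that line and store it in a list.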
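When I tried a for loop, it checked character by character instead of line by line. I would also rather not use a loop at all, because the file is very large. Does anyone know how to search a text file for a value and then extract the whole line that contains it?
S='*LOCATION**** ** MATERIAL PROGRAM+ SH MUD LOGGING CABIN UML111 + ORS MUD CABIN & WM CABIN **G Energy: 4"3/4 Drilling Jar QN:475-0029**86*JTS JTS:4"1/2 TUBING:13,5#P110N;VAM;+344;jtsTBG;2";7/8;6.4#;N80;N.VAM;PUP;JTS:13'';3/8:68#;N80:BTC;JTS:7'';JTS:7'';32:32;JTS TUBING::2"3/8:+1;P110:N.VAM;+770;JTS TUBING:2"3/8:+1;X-OVER:9"5/8:47:P110:N,9"VAM, PIN, X, BTC, BOX**, BAKER, CORING, EQUIPMENT, PERSONNEL CODE: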
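# Imports this snippet relies on (assumed: the pdfminer / pdfminer.six package).
import io
import re
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage

def convert_pdf_to_txt(path):
    # Extract the text of every page of the PDF at `path` into one string.
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    fp = open(path, 'rb')
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""
    maxpages = 0
    caching = True
    pagenos = set()
    for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages,
                                  password=password,
                                  caching=caching,
                                  check_extractable=True):
        interpreter.process_page(page)
    text = retstr.getvalue()
    fp.close()
    device.close()
    retstr.close()
    return text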
# Raw strings keep the backslashes in the Windows path and in the regex literal.
path = r"C:\DDR reports\Smith General server\DDR Algeria\DDR\07.July\02.07.2019\BELN-1-Daily Drilling Report-Report Number51-(07-02-2019).pdf"
r = convert_pdf_to_txt(path)
regex = re.compile(r'[1-9]*\s[a-zA-Z]*\sJar', re.IGNORECASE)
list_jar = list()
i = 0
for line in r.split('\n'):
    # search_v = re.findall(pattern, r)
    x = re.findall(regex, line)

Posted on 2019-09-30 00:29:40
I think you need a for loop like this:

result = []
for x in open('file.txt'):
    if 'value' in x:
        result.append(x)
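To make the line-by-line idea concrete, here is a rough sketch that reuses the regex from the question. The helper names `matching_lines` and `matching_lines_in_file` and the three-line `sample` string are made up for illustration and are not from the original post. The character-by-character behaviour described in the question is what you get when you iterate over a string directly; `splitlines()` or iterating over an open file handle yields whole lines instead. Iterating over a file handle reads one line at a time, so even a very large file is not loaded into memory at once.

```python
import re

# Pattern taken from the question, written as a raw string.
JAR_PATTERN = re.compile(r'[1-9]*\s[a-zA-Z]*\sJar', re.IGNORECASE)


def matching_lines(text, pattern=JAR_PATTERN):
    """Return every line of `text` that contains a match for `pattern`."""
    # Iterating over a str yields single characters;
    # splitlines() yields whole lines.
    return [line for line in text.splitlines() if pattern.search(line)]


def matching_lines_in_file(path, pattern=JAR_PATTERN, encoding='utf-8'):
    """Same idea for a file on disk, read lazily one line at a time."""
    with open(path, encoding=encoding) as handle:
        return [line.rstrip('\n') for line in handle if pattern.search(line)]


# Hypothetical sample assembled from fragments of the question's text.
sample = (
    'MUD LOGGING CABIN UML111\n'
    'G Energy: 4"3/4 Drilling Jar QN:475-0029\n'
    '86 JTS TUBING'
)
print(matching_lines(sample))
```

For the PDF case in the question, `matching_lines(convert_pdf_to_txt(path))` would presumably give the list of lines that mention a jar, assuming the extracted text keeps each item on its own line.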
https://stackoverflow.com/questions/58156998