Initially, my code was:
# encoding=utf-8
from bs4 import BeautifulSoup
import urllib
import re
import os
url = []
urlbase = "https://quizlet.com/subject/四级乱序/page/"
for i in range(0,2):
    url.append(urlbase + str(i+1))
    indexname = str(url[i])[-1] + ".html"
    urllib.urlretrieve(url[i], indexname)
    print indexname + " downloaded"
    f = open(indexname,"r")
    soup = BeautifulSoup(f, "html.parser")
    linkclass = soup.find_all("a", attrs={"class":"SetPreview-link","href":re.compile(r"unit(\s\w+)?")})
    for link in linkclass:
        flink = link.get("href")
        print flink
The result was a list of links, and it worked fine, but when I tried to write them to a file with the following code:
# encoding=utf-8
from bs4 import BeautifulSoup
import urllib
import re
import os
url = []
urlbase = "https://quizlet.com/subject/四级乱序/page/"
flinkfile = open("links.txt",'wb')
for i in range(0,2):
    url.append(urlbase + str(i+1))
    indexname = str(url[i])[-1] + ".html"
    urllib.urlretrieve(url[i], indexname)
    print indexname + " downloaded"
    f = open(indexname,"r")
    soup = BeautifulSoup(f, "html.parser")
    linkclass = soup.find_all("a", attrs={"class":"SetPreview-link", "href":re.compile(r"unit(\s\w+)?")})
    for link in linkclass:
        flink = link.get("href")
        flinkfile.writelines(flink)
    flinkfile.close()
The result is a txt file with only one line: https://quizlet.com/146113318/unit31-flash-cards/
Why does this happen?
Posted on 2016-12-21 06:17:07
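The problem is that the file is closed inside the for i in range(0,2) loop. Once a file is closed it cannot be written to again, so any later write fails. If you move the close out of the loop, you should get more lines (assuming there are more links to write):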
# encoding=utf-8
from bs4 import BeautifulSoup
import urllib
import re
import os
url = []
urlbase = "https://quizlet.com/subject/四级乱序/page/"
flinkfile = open("links.txt",'wb')
for i in range(0,2):
    url.append(urlbase + str(i+1))
    indexname = str(url[i])[-1] + ".html"
    urllib.urlretrieve(url[i], indexname)
    print indexname + " downloaded"
    f = open(indexname,"r")
    soup = BeautifulSoup(f, "html.parser")
    linkclass = soup.find_all("a", attrs={"class":"SetPreview-link", "href":re.compile(r"unit(\s\w+)?")})
    for link in linkclass:
        flink = link.get("href")
        flinkfile.writelines(flink)
# close file outside loop
flinkfile.close()
To make sure the file is closed even if an error occurs, use a with statement:
with open(...) as flinkfile:
    for i in range(0,2):
        ...
More information here: http://effbot.org/zone/python-with-statement.htm
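As a rough sketch of the with approach, the snippet below isolates just the file-writing step. The helper name write_links and the example.com URLs are made up for illustration and do not come from the question, and the code is written for Python 3 rather than the Python 2 used above. It also appends a newline after each link, on the assumption that one link per line is wanted: neither write nor writelines adds line separators on its own, so without this all links would end up on a single line.

```python
import os
import tempfile


def write_links(links, path):
    """Write each link on its own line (hypothetical helper, not from the question)."""
    with open(path, "w") as flinkfile:
        for flink in links:
            # write() adds no line separator, so append one explicitly
            flinkfile.write(flink + "\n")
    # The with block has already closed the file by the time we get here
    return flinkfile.closed


# Made-up links standing in for the scraped href values
sample_links = [
    "https://example.com/unit1-flash-cards/",
    "https://example.com/unit2-flash-cards/",
]

out_path = os.path.join(tempfile.mkdtemp(), "links.txt")
closed = write_links(sample_links, out_path)

# Read the file back to check that every link landed on its own line
with open(out_path) as f:
    lines = f.read().splitlines()

print(closed)
print(len(lines))
```

Because the file is opened once, outside any loop over pages or links, there is no close call to misplace: the file stays open for every write and is closed exactly once when the with block ends.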
https://stackoverflow.com/questions/41256083