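I'm building a fairly simple Beautiful Soup/requests web scraper, but when I run it against a job site, the error
AttributeError: 'NoneType' object has no attribute 'find_all'
comes up. Here is my code: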
import requests
from bs4 import BeautifulSoup
URL = "https://uk.indeed.com/jobs?q&l=Norwich%2C%20Norfolk&vjk=139a4549fe3cc48b"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find(id="ResultsContainer")
job_elements = results.find_all("div", class_="resultContent")
python_jobs = results.find_all("h2", string="Python")
for job_element in job_elements:
    title_element = job_element.find("h2", class_="jobTitle")
    company_element = job_element.find("span", class_="companyName")
    location_element = job_element.find("div", class_="companyLocation")
    print(title_element)
    print(company_element)
    print(location_element)
    print()
Does anyone know what the problem is?
Posted on 2022-03-18 10:45:23
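Check the selector you use for results: the id attribute should be resultsBody. With the wrong selector, soup.find() returns None, so the line that uses results fails because None has no attribute find_all: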
results = soup.find(id="resultsBody")
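To illustrate why the original line failed, here is a minimal sketch on an inline HTML snippet. The markup is made up for illustration and is not Indeed's actual page, and find_jobs is a hypothetical helper. It shows that find() returns None when nothing matches, and one way to guard against that before calling find_all():

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<table id="resultsBody">
  <tr><td class="resultContent"><h2 class="jobTitle">Python Developer</h2></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

missing = soup.find(id="ResultsContainer")  # no element has this id, so this is None
found = soup.find(id="resultsBody")         # matches the <table>


def find_jobs(container):
    """Return the job cells, or an empty list if the container was not found."""
    if container is None:
        return []
    return container.find_all("td", class_="resultContent")


print(missing is None)
print(len(find_jobs(missing)), len(find_jobs(found)))
```

Checking for None first turns the AttributeError into an empty result, which is easier to diagnose when a site changes its markup.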
Also, the job_elements are td elements, not div:
job_elements = results.find_all("td", class_="resultContent")
You can also use CSS selectors to chain the selectors:
job_elements = soup.select('#resultsBody td.resultContent')
To get only those that contain Python:
job_elements = soup.select('#resultsBody td.resultContent:has(h2:-soup-contains("Python"))')
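As a rough sketch of the difference between the two filters, the snippet below runs both on a small, invented HTML fragment (the job titles are assumptions, not data from the site). find_all(..., string="Python") only matches an h2 whose text is exactly "Python", while the :-soup-contains() selector matches a substring. It assumes a Beautiful Soup install whose bundled Soup Sieve supports :-soup-contains:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = """
<table id="resultsBody">
  <tr><td class="resultContent"><h2 class="jobTitle">Senior Python Developer</h2></td></tr>
  <tr><td class="resultContent"><h2 class="jobTitle">Java Engineer</h2></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Exact match on the tag's string: no title is exactly "Python".
exact = soup.find_all("h2", string="Python")

# Substring match: cells whose <h2> contains "Python" somewhere in its text.
partial = soup.select('#resultsBody td.resultContent:has(h2:-soup-contains("Python"))')
titles = [cell.h2.get_text(strip=True) for cell in partial]

print(len(exact))
print(titles)
```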
Example
import requests
from bs4 import BeautifulSoup
URL = "https://uk.indeed.com/jobs?q&l=Norwich%2C%20Norfolk&vjk=139a4549fe3cc48b"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find(id="resultsBody")
job_elements = results.find_all("td", class_="resultContent")
python_jobs = results.find_all("h2", string="Python")
for job_element in job_elements:
    title_element = job_element.find("h2", class_="jobTitle")
    company_element = job_element.find("span", class_="companyName")
    location_element = job_element.find("div", class_="companyLocation")
    print(title_element)
    print(company_element)
    print(location_element)
    print()
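The loop above prints whole tags. If only the text is wanted, one possible variation is sketched below on invented sample markup (the company and job names are placeholders, and text_or_none is a hypothetical helper). It guards each lookup, since find() returns None for a card that lacks one of the elements:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the second card deliberately has no company name.
html = """
<table id="resultsBody">
  <tr><td class="resultContent">
    <h2 class="jobTitle">Python Developer</h2>
    <span class="companyName">Example Ltd</span>
    <div class="companyLocation">Norwich</div>
  </td></tr>
  <tr><td class="resultContent">
    <h2 class="jobTitle">Data Analyst</h2>
    <div class="companyLocation">Norfolk</div>
  </td></tr>
</table>
"""


def text_or_none(element):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return element.get_text(strip=True) if element is not None else None


soup = BeautifulSoup(html, "html.parser")
jobs = []
for cell in soup.select("#resultsBody td.resultContent"):
    jobs.append({
        "title": text_or_none(cell.find("h2", class_="jobTitle")),
        "company": text_or_none(cell.find("span", class_="companyName")),
        "location": text_or_none(cell.find("div", class_="companyLocation")),
    })

for job in jobs:
    print(job)
```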
https://stackoverflow.com/questions/71531356