I am working on a web scraper that downloads images for a search keyword. The code works fine until it has to download the image from the extracted URL.
from bs4 import BeautifulSoup
import requests
import os
import urllib
search = raw_input("search for images: ")
params = {"q": search}
r = requests.get("http://www.bing.com/images/search", params=params)
dir_name = search.replace(" ", "_").lower()
if not os.path.isdir(dir_name):
    os.makedirs(dir_name)
soup = BeautifulSoup(r.text, "html.parser")
links = soup.findAll("a", {"class": "thumb"})
for items in links:
    img_obj = requests.get(items.attrs["href"])
    print "Getting: ", items.attrs["href"]
    title = items.attrs["href"].split("/")[-1]
    urllib.urlretrieve(items.attrs["href"], "./scraped_images/")
Output:
search for images: cats
Getting: http://c1.staticflickr.com/3/2755/4353908962_2a0003aebf.jpg
Traceback (most recent call last):
  File "C:/Users/qazii/PycharmProjects/WebScraping/exm.py", line 21, in <module>
    urllib.urlretrieve(items.attrs["href"], "./scraped_images/")
  File "E:\anaconda\envs\WebScraping\lib\urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "E:\anaconda\envs\WebScraping\lib\urllib.py", line 249, in retrieve
    tfp = open(filename, 'wb')
IOError: [Errno 13] Permission denied: './scraped_images/'
Posted on 2019-08-20 11:11:46
You are trying to save the image to a "file" named ./scraped_images/. Because that path is a directory rather than a file, you get a permission error (you cannot open a directory with write permissions). Instead, save to a specific filename:
urllib.urlretrieve(items.attrs["href"], os.path.join("./scraped_images", title))
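For reference, a minimal sketch of the corrected loop, assuming the rest of the question's script is unchanged (Python 2, so raw_input and the print statement still apply). Note that the destination directory must already exist before urlretrieve can open a file inside it; since the script already creates dir_name with os.makedirs, joining the filename onto dir_name keeps everything consistent:

for items in links:
    url = items.attrs["href"]
    title = url.split("/")[-1]  # use the last URL segment as the filename
    print "Getting: ", url
    # Pass a full file path, not a bare directory, so open(path, 'wb') succeeds
    urllib.urlretrieve(url, os.path.join(dir_name, title))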
https://stackoverflow.com/questions/57580043