
You should use a library for this, and there are several out there, but to answer your question by changing the code you showed us...

Your problem is that you are trying to find images, but images don't use the <code>&lt;a ...&gt;</code> tag. They use the <code>&lt;img ...&gt;</code> tag. Here is an example:

<pre><code>&lt;img src="smiley.gif" alt="Smiley face" height="42" width="42"&gt;
</code></pre>

What you should do is change your <code>start = page.find('&lt;a img=')</code> line to <code>start = page.find('&lt;img ')</code> like so:

<pre><code>import urllib2

def getImage(url):
 page = urllib2.urlopen(url)
 page = page.read() #Gives HTML to parse

 start = page.find('&lt;img ')
 end = page.find('&gt;', start)

 img = page[start:end+1]
 return img
</code></pre>
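As a minimal sketch of the same idea extended to every image on the page, the loop below keeps calling <code>find</code> from where the previous tag ended. It is written for Python 3 and works on an HTML string rather than a live URL; the function name <code>find_img_tags</code> and the sample markup are assumptions made for illustration. It also shows why the original search came back empty: <code>find</code> returns -1 when nothing matches, and the resulting slice contains nothing.

```python
def find_img_tags(page):
    """Return every '<img ...>' tag found in an HTML string."""
    tags = []
    pos = 0
    while True:
        start = page.find('<img ', pos)
        if start == -1:      # str.find returns -1 when nothing matches
            break
        end = page.find('>', start)
        if end == -1:        # unterminated tag: stop instead of slicing garbage
            break
        tags.append(page[start:end + 1])
        pos = end + 1        # continue searching after this tag
    return tags


# Hypothetical sample markup, used only for illustration.
sample = '<p><a href="/x">link</a><img src="a.png" alt="A"> text <img src="b.gif"></p>'
print(find_img_tags(sample))

# '<a img=' never matches, so start is -1 and the slice below is empty.
missing = sample.find('<a img=')
print(missing, repr(sample[missing:sample.find('>', missing)]))
```

This plain string search is still fragile: it would miss tags written as <code>&lt;IMG</code> or with a line break right after the tag name, which is one reason the other answers recommend a real parser.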


Consider using <a href="http://pypi.python.org/pypi/BeautifulSoup" rel="nofollow">BeautifulSoup</a> to parse your HTML:

<pre><code>from BeautifulSoup import BeautifulSoup
import urllib
url = 'http://www.google.com'
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)
for img in soup.findAll('img'):
 print img['src']
</code></pre>
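If installing a third-party package is not an option, a similar result can be sketched with the standard library's <code>html.parser</code> module. This is a swapped-in technique, not BeautifulSoup, and it assumes Python 3; the class name <code>ImgSrcCollector</code> and the helper <code>image_sources</code> are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)


def image_sources(html):
    """Return the src values of all <img> tags in an HTML string."""
    parser = ImgSrcCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


# Hypothetical sample markup, used only for illustration.
print(image_sources('<div><img src="a.png"><IMG SRC="b.gif"/><img alt="no src"></div>'))
```

Images without a <code>src</code> attribute are skipped rather than raising an error, which differs from indexing <code>img['src']</code> directly.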


Article on screen scraping with Ruby: 
 <a href="http://www.igvita.com/2007/02/04/ruby-screen-scraper-in-60-seconds/" rel="nofollow">http://www.igvita.com/2007/02/04/ruby-screen-scraper-in-60-seconds/</a>
It's not about scraping images, but it's a good article and may help.


Extracting the image information this way is not a good idea. There are several better options, depending on your knowledge and your motivation to learn something new:

<ul>
<li><a href="http://scrapy.org/" rel="nofollow noreferrer">http://scrapy.org/</a> is a very good framework for extracting data from web pages. As it looks like you're a beginner, it might be a bit overkill.</li>
<li>Learn regular expressions to extract the information: <a href="http://docs.python.org/library/re.html" rel="nofollow noreferrer">http://docs.python.org/library/re.html</a> and <a href="https://stackoverflow.com/questions/4736/learning-regular-expressions">Learning Regular Expressions</a></li>
<li>Use <a href="http://www.crummy.com/software/BeautifulSoup/" rel="nofollow noreferrer">http://www.crummy.com/software/BeautifulSoup/</a> to parse data from the result of <code>page.read()</code>.</li>
</ul>
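For the regular-expression option, a rough sketch might look like the following. It assumes Python 3 and quoted <code>src</code> values; the pattern and the function name <code>img_srcs</code> are illustrative only, and a regex like this is easy to fool with unusual markup, so treat it as a learning exercise rather than a robust scraper.

```python
import re

# Matches an <img ...> tag and captures the quoted value of its src attribute.
# Illustrative pattern only: it ignores unquoted values and other edge cases.
IMG_SRC = re.compile(r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def img_srcs(html):
    """Return the src value of each <img> tag found in an HTML string."""
    return IMG_SRC.findall(html)


# Hypothetical sample markup, used only for illustration.
print(img_srcs('<img alt="x" src="a.png"><a href="b.html">b</a><IMG SRC=\'c.jpg\'>'))
```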


Some instructions that might be of help:

<ol>
<li>Use Google Chrome. Hover over the image and right-click. Select "Inspect element". That will open a panel where you can see the HTML around the image.</li>
<li>Use Beautiful Soup to parse the html:

<pre><code>import urllib2
from BeautifulSoup import BeautifulSoup

# url is the address of the page you want to scrape
request = urllib2.Request(url)
response = urllib2.urlopen(request)
html = response.read()
soup = BeautifulSoup(html)
imgs = soup.findAll("img")
items = []
for img in imgs:
 print img['src'] #print the image location
 items.append(img['src']) #store the locations for downloading later
</code></pre></li>
</ol>
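The stored locations are often relative paths, so they usually need to be resolved against the page address before they can be downloaded. A minimal sketch of that step, assuming Python 3 and using <code>urllib.parse.urljoin</code> from the standard library; the function name <code>absolute_image_urls</code> and the example addresses are hypothetical.

```python
from urllib.parse import urljoin


def absolute_image_urls(page_url, srcs):
    """Resolve each src value against the URL of the page it came from."""
    return [urljoin(page_url, src) for src in srcs]


# Hypothetical page address and src values, used only for illustration.
base = 'http://example.com/gallery/index.html'
items = ['smiley.gif', '/static/logo.png', 'http://cdn.example.com/a.jpg']

for image_url in absolute_image_urls(base, items):
    print(image_url)
```

Values that are already absolute pass through unchanged, so the same call can be applied to every entry in the list.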

My code only returns an empty string, and I have no idea why.

<pre><code>import urllib2

def getImage(url):
 page = urllib2.urlopen(url)
 page = page.read() #Gives HTML to parse

 start = page.find('&lt;a img=')
 end = page.find('&gt;', start)

 img = page[start:end]

 return img
</code></pre>

It would only return the first image it finds, so it's not a very good image scraper; that said, my primary goal right now is just to be able to find an image. I'm unable to.

Image scraping program in Python not functioning as intended
