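Below is the div markup I extracted. I need to get the output shown underneath from it, but the usual extraction approaches do not work.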
<div class="container-inhalt">
<div class="container-hauptinfo s16">
<a title="Ki-dong Kim" id="0" href="/ki-do190">Ki-Kim</a> </div>
<div class="container-zusatzinfo-small">
<b>Age:</b> 48 Years
<img src="https://tny/87.png?lm=1520611569" title="Korea, South" alt="Ka, Sh" class="flaggenrahmen" /> <br />
<b>Appointed:</b> Apr 23, 2019 <br />
<b>Contract expires:</b> - <br />
<b>Success rate as coach:</b> 1,63 PPM </div>
<div class="container-zusatzinfo">
</div>
</div>
Expected output: 1,63 PPM
Posted on 2020-05-09 18:55:15
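If you plan to keep doing web scraping, learning XPath and the XPath functions is a solid investment, because it is almost always possible to describe exactly how to target a specific node. Scrapy also lets you run a regular expression over the selected nodes for the "last mile":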
def parse(self, response):
    # Locate the <b> label, then take the text node that directly follows it.
    parts = response.xpath(
        '//b[contains(text(), "Success rate as coach:")]'
        '/following-sibling::text()[1]'
    ).re(r'\s*(\S+)\s*')
    # parts == ['1,63', 'PPM']
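The same "find the label, then read what follows it" idea can be tried outside a spider. The sketch below is not part of the original answer: it uses only the standard library's `xml.etree.ElementTree`, whose XPath subset has no `following-sibling` axis, so it reads the element's `tail` text instead. It assumes the fragment is well-formed XML, which happens to hold for the snippet in the question (the `<img>` and `<br />` tags are self-closed) but usually does not for real pages, where Scrapy or another HTML parser is the safer choice. `SNIPPET` and `text_after_label` are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the block from the question (the <img> tag is left out).
SNIPPET = """<div class="container-zusatzinfo-small">
<b>Age:</b> 48 Years
<br />
<b>Appointed:</b> Apr 23, 2019 <br />
<b>Contract expires:</b> - <br />
<b>Success rate as coach:</b> 1,63 PPM </div>"""


def text_after_label(markup, label):
    """Return the text that directly follows the <b> element holding `label`."""
    root = ET.fromstring(markup)
    for b in root.iter("b"):
        if (b.text or "").strip() == label:
            # `tail` is the text between this element's end tag and the next tag.
            return (b.tail or "").strip()
    return None


success_rate = text_after_label(SNIPPET, "Success rate as coach:")
print(success_rate)  # 1,63 PPM
```

Matching the label exactly, rather than with a substring test, keeps the lookup from picking up a different `<b>` whose text merely overlaps with the one you want.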
https://stackoverflow.com/questions/61700750