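In Python, several libraries and methods can extract the content between tags. Three common ones are shown below: regular expressions (the re module), BeautifulSoup, and lxml.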
import re
html = "<p>This is a paragraph.</p><p>This is another paragraph.</p>"
paragraphs = re.findall(r"<p>(.*?)</p>", html)
print(paragraphs)
Output: ['This is a paragraph.', 'This is another paragraph.']
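The pattern above only matches a bare <p> tag whose content sits on a single line. As a minimal sketch, assuming the input may also carry attributes on the tag or line breaks inside the paragraph, the pattern can be loosened as follows. The sample string is made up for illustration.

```python
import re

# Hypothetical sample input: one <p> with an attribute, one <p> spanning two lines.
html = '<p class="intro">First paragraph.</p>\n<p>Second\nparagraph.</p>'

# <p\b[^>]*> matches the opening tag together with any attributes.
# re.DOTALL lets "." match newlines; re.IGNORECASE also accepts <P>.
pattern = re.compile(r"<p\b[^>]*>(.*?)</p>", re.DOTALL | re.IGNORECASE)

paragraphs = pattern.findall(html)
print(paragraphs)  # ['First paragraph.', 'Second\nparagraph.']
```

This still treats HTML as plain text, so nested <p>-like markup, comments, or malformed tags can defeat it.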
from bs4 import BeautifulSoup
html = "<p>This is a paragraph.</p><p>This is another paragraph.</p>"
soup = BeautifulSoup(html, 'html.parser')
paragraphs = soup.find_all('p')
for p in paragraphs:
    print(p.text)
Output (one paragraph per line):
This is a paragraph.
This is another paragraph.
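When a paragraph contains nested tags, or only some paragraphs are wanted, the same find_all call still applies. The following is a small sketch assuming the bs4 package is installed; the sample markup and the "note" class name are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample input with a nested <b> tag and a class attribute.
html = "<p>This is <b>bold</b> text.</p><p class='note'>A note.</p>"
soup = BeautifulSoup(html, "html.parser")

# get_text() joins the text of the tag and all of its descendants.
texts = [p.get_text() for p in soup.find_all("p")]
print(texts)  # ['This is bold text.', 'A note.']

# class_ filters by CSS class (the trailing underscore avoids the Python keyword).
notes = [p.get_text() for p in soup.find_all("p", class_="note")]
print(notes)  # ['A note.']
```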
from lxml import etree
html = "<p>This is a paragraph.</p><p>This is another paragraph.</p>"
tree = etree.HTML(html)
paragraphs = tree.xpath('//p/text()')
print(paragraphs)
Output: ['This is a paragraph.', 'This is another paragraph.']
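Note that //p/text() returns only the text nodes that are direct children of each <p>, so text inside nested tags is left out. A possible way to get the full text of each paragraph, sketched here with made-up sample markup and assuming lxml is installed, is to evaluate the XPath string() function on each element.

```python
from lxml import etree

# Hypothetical sample input with a nested <b> tag.
html = "<p>This is <b>bold</b> text.</p><p>Plain text.</p>"
tree = etree.HTML(html)

# Direct text nodes only: the word inside <b> is skipped.
direct = tree.xpath("//p/text()")
print(direct)  # ['This is ', ' text.', 'Plain text.']

# string() concatenates all descendant text of each <p> element.
full = [p.xpath("string()") for p in tree.xpath("//p")]
print(full)  # ['This is bold text.', 'Plain text.']
```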
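All of these methods can extract the content between tags. Regular expressions need no third-party package but are only dependable for simple, predictable markup, while BeautifulSoup and lxml parse the document and cope better with attributes, nesting, and malformed HTML. Which one to choose depends on personal preference and the needs of the project.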