How can I extract the text from this lxml-parsed HTML, starting at <div class="ember-view" id="ember760">? I tried the code below, but the text is not captured.
The code I tried:
# soup is a BeautifulSoup object
exp = soup.find('header', {'class': 'pv-profile-section__card-header'})
exp
The HTML:
<div class="pv-recommendation-entity__highlights">
<blockquote class="pv-recommendation-entity__text relative">
<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">I know Abc from Data Analysis training sessions with abc,</span>
<span class="lt-line-clamp__line">Abc
is an enthusiastic candidature in training sessions. He is an</span>
<span class="lt-line-clamp__line">extremely capable and dedicated entry-level Data Science Analyst.</span>
<span class="lt-line-clamp__line">He is enhancing Analytics skills by his enthusiasm for learning new</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
things, and has learnt new tools like R, SPSS, and Pytho<span class="lt-line-clamp__ellipsis">...
<a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">See more</a>
</span></span>
<!-- --><span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#" role="button">See more</a></span></div>
</blockquote>
</div>
</li>
</ul>
<!-- --></div>
</div></div>
Expected output:
I know Abc from Data Analysis training sessions with abc,
is an enthusiastic candidature in training sessions. He is an
extremely capable and dedicated entry-level Data Science Analyst.
He is enhancing Analytics skills by his enthusiasm for learning new
things, and has learnt new tools like R, SPSS, and Pytho
Posted on 2020-10-01 10:56:50
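from bs4 import BeautifulSoup

# html is assumed to hold the markup shown in the question
soup = BeautifulSoup(html, 'lxml')
# Select only the line spans that are direct children of the div
lines = soup.select('div.ember-view > span.lt-line-clamp__line')
# Take the first direct text node of each span (string= replaces the deprecated text=)
text = ''.join([line.find(string=True, recursive=False) for line in lines])
print(text)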
This gives the text:
I know Abc from Data Analysis training sessions with abc,Abc
is an enthusiastic candidature in training sessions. He is anextremely capable and dedicated entry-level Data Science Analyst.He is enhancing Analytics skills by his enthusiasm for learning new
things, and has learnt new tools like R, SPSS, and Pytho
The "See more" links are ignored.
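Because the snippets are joined with an empty string, the lines run together in the output above ("He is anextremely"). A minimal sketch of one way to get one line per span, assuming the same class names as in the question: join with a newline and collapse the whitespace inside each snippet. The shortened `html` sample and the names `first_texts` and `text` are made up for illustration, and `html.parser` is used only so the sketch needs nothing beyond bs4 (`'lxml'` should behave the same here).

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical sample with the same structure as the question's markup
html = '''<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">first line,</span>
<span class="lt-line-clamp__line">second
line</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
last line<span class="lt-line-clamp__ellipsis">...
<a class="lt-line-clamp__more" href="#">See more</a>
</span></span>
<span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#">See more</a></span></div>'''

soup = BeautifulSoup(html, 'html.parser')

# Only the line spans that are direct children of the div
lines = soup.select('div.ember-view > span.lt-line-clamp__line')

# First direct text node of each span; nested "See more" markup is skipped
first_texts = [line.find(string=True, recursive=False) for line in lines]

# Collapse internal whitespace in each snippet, then put one snippet per line
text = '\n'.join(' '.join(t.split()) for t in first_texts if t)
print(text)
```

Collapsing whitespace also merges a span whose text wraps across two source lines into a single output line; drop the `' '.join(t.split())` step if the original line breaks should be kept.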
Posted on 2020-10-01 10:54:35
You can use the CSS selector div#ember760 to select <div class="ember-view" id="ember760">, then call the .get_text() method:
from bs4 import BeautifulSoup
txt = '''
<div class="pv-recommendation-entity__highlights">
<blockquote class="pv-recommendation-entity__text relative">
<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">I know Abc from Data Analysis training sessions with abc,</span>
<span class="lt-line-clamp__line">Abc
is an enthusiastic candidature in training sessions. He is an</span>
<span class="lt-line-clamp__line">extremely capable and dedicated entry-level Data Science Analyst.</span>
<span class="lt-line-clamp__line">He is enhancing Analytics skills by his enthusiasm for learning new</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
things, and has learnt new tools like R, SPSS, and Pytho<span class="lt-line-clamp__ellipsis">...
<a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">See more</a>
</span></span>
<!-- --><span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#" role="button">See more</a></span></div>
</blockquote>
</div>
</li>
</ul>
<!-- --></div>
</div></div>'''
soup = BeautifulSoup(txt, 'lxml')
print(soup.select_one('div#ember760').get_text(strip=True, separator='\n'))
Prints:
I know Abc from Data Analysis training sessions with abc,
Abc
is an enthusiastic candidature in training sessions. He is an
extremely capable and dedicated entry-level Data Science Analyst.
He is enhancing Analytics skills by his enthusiasm for learning new
things, and has learnt new tools like R, SPSS, and Pytho
...
See more
...
See more
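The output above still contains the "..." and "See more" entries, which the expected output in the question leaves out. A minimal sketch of one possible way to drop them, assuming every unwanted piece sits inside a span with the class `lt-line-clamp__ellipsis`: remove those spans with `decompose()` before calling `.get_text()`. The shortened `html` sample and the names `div` and `result` are illustrative, and `html.parser` is used only so the sketch needs nothing beyond bs4.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical sample with the same structure as the question's markup
html = '''<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">first line,</span>
<span class="lt-line-clamp__line">second
line</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
last line<span class="lt-line-clamp__ellipsis">...
<a class="lt-line-clamp__more" href="#">See more</a>
</span></span>
<span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#">See more</a></span></div>'''

soup = BeautifulSoup(html, 'html.parser')
div = soup.select_one('div#ember760')

# Remove every ellipsis span (and the "See more" link inside it) from the tree
for tag in div.select('span.lt-line-clamp__ellipsis'):
    tag.decompose()

# Remaining text nodes, each stripped and separated by a newline
result = div.get_text(strip=True, separator='\n')
print(result)
```

Note that `decompose()` modifies the parsed tree in place, so re-parse the markup if the removed elements are needed later.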
https://stackoverflow.com/questions/64153485