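To read data from an HTML file with Python and write it to a CSV file, follow these steps. First import the required libraries, then open and parse the HTML file: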
import csv
from bs4 import BeautifulSoup

with open('input.html', 'r') as html_file:
    soup = BeautifulSoup(html_file, 'html.parser')
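Use the find_all method to find all the table rows and extract the data you need. Then create the CSV file and write the rows to it:
with open('output.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['Header1', 'Header2', 'Header3'])  # write the CSV header row
    for row in data_rows:
        writer.writerow(row)  # write one data row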
In the code above, data_rows is a list holding the extracted data, where each element represents one row.
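To make the expected shape of data_rows concrete, here is a minimal sketch that uses only the standard library. The sample values are hypothetical, and an in-memory io.StringIO buffer stands in for output.csv so the result can be inspected directly.

```python
import csv
import io

# Hypothetical sample data: one inner list per table row
data_rows = [
    ['Alice', '30', 'Paris'],
    ['Smith, John', '41', 'Berlin'],  # a value that itself contains a comma
]

buffer = io.StringIO(newline='')  # in-memory stand-in for output.csv
writer = csv.writer(buffer)
writer.writerow(['Header1', 'Header2', 'Header3'])  # header row
writer.writerows(data_rows)  # one CSV line per inner list

csv_text = buffer.getvalue()
print(csv_text)
```

With the default csv.writer settings, a value that contains a comma is wrapped in double quotes, so the cell text does not need to be escaped by hand.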
Complete code example:
import csv
from bs4 import BeautifulSoup

# Read and parse the HTML file
with open('input.html', 'r') as html_file:
    soup = BeautifulSoup(html_file, 'html.parser')

# Extract the text of every <td> cell, one list per table row
data_rows = []
table = soup.find('table')  # assumes the data is in a table
for row in table.find_all('tr'):
    data = [cell.get_text(strip=True) for cell in row.find_all('td')]
    if data:  # skip rows with no <td> cells, such as a <th> header row
        data_rows.append(data)

# Write the header row and the data rows to the CSV file
with open('output.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['Header1', 'Header2', 'Header3'])
    for row in data_rows:
        writer.writerow(row)
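This way, Python reads the data from the HTML file and writes it to the CSV file. Note that the file paths in the code above need to be changed to match your own setup.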
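One way to avoid editing hard-coded paths is to wrap the same steps in a function that takes the paths as arguments. The sketch below is one possible arrangement, not part of the original answer: the name html_table_to_csv is hypothetical, it assumes the files are UTF-8 encoded, and it takes the header row from the table's own th cells instead of the placeholder Header1, Header2, Header3.

```python
import csv
from bs4 import BeautifulSoup


def html_table_to_csv(html_path, csv_path):
    """Copy the first <table> in html_path to csv_path and return the row count.

    Hypothetical helper for illustration only.
    """
    # Assumption: the HTML file is UTF-8 encoded
    with open(html_path, 'r', encoding='utf-8') as html_file:
        soup = BeautifulSoup(html_file, 'html.parser')

    table = soup.find('table')
    if table is None:
        raise ValueError(f'no <table> found in {html_path}')

    rows = []
    for tr in table.find_all('tr'):
        # Collect header (<th>) and data (<td>) cells in document order
        cells = [cell.get_text(strip=True) for cell in tr.find_all(['th', 'td'])]
        if cells:  # ignore rows that contain no cells
            rows.append(cells)

    with open(csv_path, 'w', newline='', encoding='utf-8') as csv_file:
        csv.writer(csv_file).writerows(rows)
    return len(rows)
```

A call such as html_table_to_csv('input.html', 'output.csv') would then reproduce the script above for a table that carries its own header row.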