
Scraping web page tables with Python: extracting a table and saving it as CSV

Date: 2024-04-28 13:22:19

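You can use the third-party Python libraries BeautifulSoup and pandas to extract a table from a web page and save it as a CSV file. First, install both libraries, along with requests, which the example uses to download the page. Run the following commands in a terminal or command prompt:

```
pip install requests
pip install beautifulsoup4
pip install pandas
```

Next, use requests to fetch the page's HTML and BeautifulSoup to parse it. Here is an example:

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.example.com/table.html'

# Download the page; fail early on network or HTTP errors.
response = requests.get(url, timeout=10)
response.raise_for_status()

# Parse the HTML and locate the first <table> element.
soup = BeautifulSoup(response.text, 'html.parser')
table = soup.find('table')
if table is None:
    raise ValueError('No <table> element found on the page')

# Collect the stripped text of every cell, one list per row.
data = []
for row in table.find_all('tr'):
    cells = row.find_all(['th', 'td'])  # header cells and data cells
    if cells:  # skip rows that contain no cells
        data.append([cell.text.strip() for cell in cells])

# Write the rows to CSV without pandas' index or column labels.
df = pd.DataFrame(data)
df.to_csv('table.csv', index=False, header=False)
```

In this example, we first fetch the page's HTML with requests and parse it with BeautifulSoup. We then use the `find` method to locate the table element and the `find_all` method to get its rows and cells. A loop iterates over each row and appends the text of its cells to a list. Matching `<th>` as well as `<td>` keeps a header row in the output; a search for `<td>` alone would turn such a row into an empty line in the CSV. Finally, we create a pandas DataFrame from the list and save it as a CSV file with the `to_csv` method.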
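When a table has a header row made of `<th>` cells, it can be convenient to use that row as the DataFrame's column names instead of writing it as ordinary data. The following is a minimal sketch of that idea, not a definitive implementation. The helper name `table_to_dataframe` and the inline `SAMPLE_HTML` page are hypothetical, introduced here only so the example runs without network access.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample page, inlined so the sketch runs offline.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Alice</td><td> 90 </td></tr>
  <tr><td>Bob</td><td>85</td></tr>
</table>
"""


def table_to_dataframe(html: str) -> pd.DataFrame:
    """Parse the first <table> in `html`, using a <th>-only row as column names."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> element found")

    header = None
    records = []
    for row in table.find_all("tr"):
        th_cells = row.find_all("th")
        td_cells = row.find_all("td")
        if th_cells and not td_cells and header is None:
            # First row consisting only of <th> cells becomes the header.
            header = [cell.get_text(strip=True) for cell in th_cells]
        elif td_cells:
            # Data row: keep the stripped text of its <td> cells.
            records.append([cell.get_text(strip=True) for cell in td_cells])
    return pd.DataFrame(records, columns=header)


df = table_to_dataframe(SAMPLE_HTML)
# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To save a file instead, pass a path such as `df.to_csv('table.csv', index=False)`; leaving `header` at its default writes the column names as the first line. This sketch assumes every data row has the same number of cells as the header and does not handle `rowspan` or `colspan`.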
