Scrape the web page https://tjj.hunan.gov.cn/hntj/tjfx/tjgb/pcgbv/202105/t20210519_19079329.html and store the content under the table tag in the string bg (Python code)

Date: 2023-09-02 17:07:27 Views: 561

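Sure, I can help with this. You can use the Python libraries requests and BeautifulSoup to fetch and parse the page. Here is some sample code:

```python
import requests
from bs4 import BeautifulSoup

url = "https://tjj.hunan.gov.cn/hntj/tjfx/tjgb/pcgbv/202105/t20210519_19079329.html"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error
# Passing bytes lets BeautifulSoup detect the page encoding itself
soup = BeautifulSoup(response.content, "html.parser")

table = soup.find("table")  # first <table> tag on the page
if table is None:
    raise ValueError("no <table> tag found in the page")

lines = []
for row in table.find_all("tr"):  # every row of the table
    cells = row.find_all(["td", "th"])  # data cells and header cells
    lines.append("\t".join(cell.get_text().strip() for cell in cells))
bg = "\n".join(lines)  # cells separated by tabs, rows by newlines

print(bg)  # print the string holding the table content
```

Running this code stores the content under the first table tag of https://tjj.hunan.gov.cn/hntj/tjfx/tjgb/pcgbv/202105/t20210519_19079329.html in the string bg, assuming the page contains a table tag in the HTML the server returns.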
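To check the row and column logic without hitting the site, the same extraction can be tried on an inline HTML snippet. The sketch below is an alternative that swaps BeautifulSoup for the standard library's `html.parser`, so it runs with no third-party packages. The names `TableTextParser` and `table_to_string` and the sample table are made up for illustration, and nested tables are not handled specially.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the cell text of the first <table> as a list of rows."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many <table> tags are currently open
        self.done = False     # True once the first table has closed
        self.in_cell = False  # True while inside a <td> or <th>
        self.cell = []        # text fragments of the current cell
        self.row = []         # cells of the current row
        self.rows = []        # finished rows

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "table":
            self.depth += 1
        elif self.depth and tag == "tr":
            self.row = []
        elif self.depth and tag in ("td", "th"):
            self.in_cell = True
            self.cell = []

    def handle_endtag(self, tag):
        if self.done or not self.depth:
            return
        if tag in ("td", "th") and self.in_cell:
            self.row.append("".join(self.cell).strip())
            self.in_cell = False
        elif tag == "tr":
            self.rows.append(self.row)
            self.row = []
        elif tag == "table":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.in_cell and not self.done:
            self.cell.append(data)


def table_to_string(html):
    """Return the first table's text: tabs between cells, newlines between rows."""
    parser = TableTextParser()
    parser.feed(html)
    return "\n".join("\t".join(row) for row in parser.rows)


# Hypothetical sample table, not taken from the real page
sample = (
    "<table><tr><th>Region</th><th>Population</th></tr>"
    "<tr><td> Region A </td><td>100</td></tr></table>"
)
bg = table_to_string(sample)
print(bg)
```

On a page with no table tag, `table_to_string` returns an empty string rather than raising, which may or may not be the behaviour you want in a real scraper.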
