
Scraping web page data with Python

Date: 2023-09-22 09:10:28

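You can scrape web page data in Python with the requests and Beautiful Soup libraries. First, use requests to send a GET request and fetch the page content:

```python
import requests

url = 'http://www.example.com'
# A timeout keeps the call from hanging indefinitely on an unresponsive server
response = requests.get(url, timeout=10)
response.raise_for_status()  # raise an exception on 4xx/5xx responses
```

Then parse the page content with Beautiful Soup:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.text, 'html.parser')
```

Next, use the methods Beautiful Soup provides to pull data out of the page. For example, to get every link on the page:

```python
for link in soup.find_all('a'):
    print(link.get('href'))  # None when the <a> tag has no href attribute
```

You can also pass a compiled regular expression to find_all to select tags by name:

```python
import re

# Matches every tag whose name starts with "h": html, head, h1 through h6, hr, ...
for tag in soup.find_all(re.compile('^h')):
    print(tag.name)
```

This is only a simple example. In practice, you need to tailor the code to the structure of the target page and to your own requirements.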
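As one illustration of tailoring the code to a page, the sketch below wraps the link extraction in a function and resolves relative links against a base URL. The function name `extract_links` and the `SAMPLE_HTML` string are hypothetical and chosen only for this example; it parses an inline HTML string so that it runs without network access.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute at all
    for a in soup.find_all("a", href=True):
        # urljoin turns relative paths such as "/about" into absolute URLs
        links.append(urljoin(base_url, a["href"]))
    return links


# Hypothetical sample page used in place of a real HTTP response
SAMPLE_HTML = """
<html><body>
  <h1>Demo</h1>
  <a href="/about">About</a>
  <a href="https://www.example.com/contact">Contact</a>
  <a name="no-href">Anchor without href</a>
</body></html>
"""

links = extract_links(SAMPLE_HTML, "http://www.example.com")
print(links)
```

With a live page, you would pass `response.text` and the request URL instead of the sample string.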
