
A Collection of Python Web Crawler Code

Date: 2023-10-30 18:07:27 Views: 79

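Below are some commonly used Python web crawler code examples:

1. Fetching a web page

```python
import requests

url = 'http://www.example.com'
response = requests.get(url)
content = response.text  # response body decoded as text
```

2. Parsing an HTML page

```python
from bs4 import BeautifulSoup

# `content` is the HTML text fetched in example 1
soup = BeautifulSoup(content, 'html.parser')
# Select elements by tag name
links = soup.find_all('a')
# Select elements by class name
elems = soup.find_all(class_='my-class')
```

3. Downloading an image

```python
import requests

url = 'http://www.example.com/image.jpg'
response = requests.get(url)
# Write the raw bytes to disk in binary mode
with open('image.jpg', 'wb') as f:
    f.write(response.content)
```

4. Fetching JSON data

```python
import requests

url = 'http://www.example.com/data.json'
response = requests.get(url)
data = response.json()  # parse the JSON body into Python objects
```

5. Automating a browser with Selenium

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('http://www.example.com')
# find_element_by_id was removed in Selenium 4; use find_element with By.ID
elem = driver.find_element(By.ID, 'my-id')
elem.click()
driver.quit()  # close the browser when done
```

These examples are only the tip of the iceberg of Python web crawling; a real implementation has to be adjusted and refined for the task at hand. Also make sure to follow each site's crawler rules and avoid requesting the same site too frequently, or your IP may be banned.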
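The closing advice about crawler rules and request frequency can be sketched with the standard library alone. The example below is a minimal sketch, not a complete politeness policy: the `Throttle` class, the `build_robot_parser` helper, the `my-crawler` user agent, and the sample robots.txt content are all assumptions introduced for illustration. In a real crawler the robots.txt text would be downloaded from the target site first.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robot_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class Throttle:
    """Hypothetical helper: enforce a minimum delay between requests."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Sleep if the last request was too recent; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last = time.monotonic()
        return slept


# Hypothetical robots.txt content, for illustration only
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

parser = build_robot_parser(SAMPLE_ROBOTS)
# 'my-crawler' is an assumed user-agent name
allowed = parser.can_fetch('my-crawler', 'http://www.example.com/index.html')
blocked = parser.can_fetch('my-crawler', 'http://www.example.com/private/data.html')
```

Before each `requests.get(url)` call, a crawler could check `parser.can_fetch(...)` and call `throttle.wait()` on a shared `Throttle` instance. What interval is appropriate depends on the site and is not specified in the original text.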
