A Collection of Python Web Crawler Code
Date: 2023-07-22 13:12:34
Sure! Here are some common Python web crawler code examples:
1. Use the requests library to send an HTTP request and fetch the page content:
```python
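import requests

url = "http://example.com"
# Set a timeout so the call cannot hang indefinitely
response = requests.get(url, timeout=10)
# Raise an exception on 4xx/5xx responses instead of printing an error page
response.raise_for_status()
content = response.text
print(content)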
```
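In practice a crawler usually wraps the request in a small helper that handles failures. The following is a minimal sketch, not a definitive implementation: the function name `fetch_html`, the `User-Agent` value, and the 10-second default timeout are all assumptions chosen for illustration.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the decoded body of `url`, or None if the request fails."""
    # Reusing a Session keeps connections open across several requests.
    session = session or requests.Session()
    # Hypothetical User-Agent string; replace it with one that identifies your crawler.
    headers = {"User-Agent": "example-crawler/0.1"}
    try:
        response = session.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx status codes into exceptions
    except requests.RequestException as exc:
        # RequestException is the base class for timeouts, connection and HTTP errors.
        print(f"Request to {url} failed: {exc}")
        return None
    return response.text
```

Calling `fetch_html("http://example.com")` would return the page text on success and `None` on any network or HTTP error, so the caller can decide whether to retry or skip the URL.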
2. Use the BeautifulSoup library to parse an HTML page:
```python
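from bs4 import BeautifulSoup

html = "<html><body><h1>Hello, World!</h1></body></html>"
# Parse with Python's built-in HTML parser
soup = BeautifulSoup(html, "html.parser")
# find() returns the first matching tag, or None if there is no match
title = soup.find("h1").text
print(title)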
```
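A common next step after parsing is collecting the links on a page. The sketch below assumes a hypothetical helper `extract_links` and a made-up HTML snippet; it uses `find_all` to select every `<a>` tag that has an `href` and `urllib.parse.urljoin` to resolve relative URLs against the page address.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return (text, absolute_url) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute
    for anchor in soup.find_all("a", href=True):
        text = anchor.get_text(strip=True)
        links.append((text, urljoin(base_url, anchor["href"])))
    return links


# Made-up sample page used only for illustration
sample_html = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.org/docs">Docs</a>
  <a>No link here</a>
</body></html>
"""
links = extract_links(sample_html, "http://example.com")
print(links)
```

The relative link `/about` is resolved against the base URL, while the absolute link is kept unchanged and the anchor without an `href` is ignored.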
3. Use regular expressions to extract specific information from a page:
```python
import re

text = "Hello, my email is example@example.com"
# [A-Za-z] for the top-level domain; a "|" inside a character class would match a literal pipe
pattern = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
emails = re.findall(pattern, text)
print(emails)
```
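When the same pattern is applied to many pages, it may help to compile it once and use named groups to pull out the parts of each match. This is a rough sketch under that assumption: the names `EMAIL_PATTERN` and `extract_emails` and the sample sentence are invented for illustration, and the pattern is a simplification that does not cover every valid address.

```python
import re

# Simplified pattern for illustration; it does not cover every valid address.
EMAIL_PATTERN = re.compile(
    r"\b(?P<user>[A-Za-z0-9._%+-]+)@(?P<domain>[A-Za-z0-9.-]+\.[A-Za-z]{2,})\b"
)


def extract_emails(text):
    """Return (user, domain) tuples for each address found in `text`."""
    return [
        (match.group("user"), match.group("domain"))
        for match in EMAIL_PATTERN.finditer(text)
    ]


# Made-up sample text
sample = "Contact alice@example.com or bob.smith@mail.example.org for details."
print(extract_emails(sample))
```

For structured HTML, a parser such as BeautifulSoup is generally more robust than a regular expression; regular expressions fit best for flat text patterns like the one above.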
4. Use the Selenium library to simulate browser behavior:
```python
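from selenium import webdriver

# Requires a local Chrome installation
driver = webdriver.Chrome()
try:
    driver.get("http://example.com")
    # page_source holds the DOM as rendered by the browser
    content = driver.page_source
    print(content)
finally:
    # Close the browser even if loading the page fails
    driver.quit()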
```
These examples cover only a small part of what a crawler can do; you can study and build further according to your specific needs.