Python web crawler code
Time: 2024-01-21 17:13:46 Views: 84
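Below is a simple Python crawler example. It fetches one page and then extracts the page title in several ways: a regular expression, Beautiful Soup, XPath (lxml), Selenium, and a Scrapy spider.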
```python
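import requests

# Send an HTTP request and get the page content
url = "https://www.example.com"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
html_content = response.text

# Extract information with a regular expression
import re

pattern = r"<title>(.*?)</title>"
result = re.findall(pattern, html_content)
print(result)

# Parse the HTML with Beautiful Soup
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, "html.parser")
# soup.title is None when the page has no <title> element
title = soup.title.string if soup.title else None
print(title)

# Extract information with XPath
from lxml import etree

tree = etree.HTML(html_content)
result = tree.xpath("//title/text()")
print(result)

# Drive a real browser with Selenium
from selenium import webdriver

driver = webdriver.Chrome()
try:
    driver.get(url)
    title = driver.title
    print(title)
finally:
    driver.quit()  # always close the browser session

# Crawl with the Scrapy framework
# (defining the class does not start a crawl; run it with e.g. `scrapy runspider`)
import scrapy


class MySpider(scrapy.Spider):
    name = "example"
    start_urls = [url]

    def parse(self, response):
        title = response.css("title::text").get()
        # Yield an item so Scrapy can collect and export it
        yield {"title": title}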
```
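The extraction step can also be tried offline, without a network connection or third-party packages. The following is a minimal sketch under stated assumptions: `SAMPLE_HTML` is a made-up page standing in for `response.text`, the helper names `title_by_regex` and `title_by_parser` are hypothetical, and the standard library's `html.parser.HTMLParser` is used in place of Beautiful Soup or lxml.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page, standing in for response.text
SAMPLE_HTML = (
    "<html><head><title>Example Domain</title></head>"
    "<body><p>Hello</p></body></html>"
)


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def title_by_regex(html):
    """Return the title text found by a regular expression, or None."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def title_by_parser(html):
    """Return the title text found by the stdlib HTML parser, or None."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip() if parser.title is not None else None


print(title_by_regex(SAMPLE_HTML))
print(title_by_parser(SAMPLE_HTML))
```

Both helpers return `None` when the page has no `<title>` element, which avoids the `AttributeError` that an unguarded `soup.title.string` would raise in the same situation. A regular expression is fine for a quick check like this, but a real parser is usually the safer choice for anything beyond a single well-known tag.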