
Python web crawler code

Date: 2024-01-21 12:13:46

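Below is a simple Python crawler example. It fetches one page and then reads its title in five different ways: a regular expression, Beautiful Soup, XPath, Selenium, and Scrapy.

```python
import requests

# Send an HTTP request and fetch the page content
url = "https://www.example.com"
response = requests.get(url, timeout=10)  # timeout so a stalled server cannot hang the script
response.raise_for_status()               # stop early on 4xx/5xx responses
html_content = response.text

# Extract information with a regular expression
import re

pattern = r"<title>(.*?)</title>"
result = re.findall(pattern, html_content)
print(result)

# Parse the HTML with Beautiful Soup
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, "html.parser")
title = soup.title.string
print(title)

# Extract information with XPath
from lxml import etree

tree = etree.HTML(html_content)
result = tree.xpath("//title/text()")
print(result)

# Drive a real browser with Selenium
from selenium import webdriver

driver = webdriver.Chrome()
driver.get(url)
title = driver.title
print(title)
driver.quit()  # close the browser when done

# Crawl with the Scrapy framework.
# Defining the class does not start a crawl; run it with `scrapy runspider <file>.py`.
import scrapy


class MySpider(scrapy.Spider):
    name = "example"
    start_urls = [url]

    def parse(self, response):
        title = response.css("title::text").get()
        print(title)
```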
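The regular-expression step above only matches a lowercase `<title>` tag with no attributes whose content sits on a single line. The following is a minimal sketch of a more tolerant variant using only the standard library. The helper name `extract_title` and the inline `sample_html` string are assumptions made for illustration and are not part of the original example; the sample string stands in for `response.text` so the snippet runs without network access.

```python
import re
from typing import Optional

# Hypothetical helper for illustration; not part of the original example.
# IGNORECASE accepts <TITLE>, DOTALL lets the title span several lines,
# and \b[^>]* tolerates attributes on the opening tag.
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html: str) -> Optional[str]:
    """Return the text of the first <title> element, or None if absent."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    # Collapse runs of whitespace so a multi-line title becomes one line.
    return " ".join(match.group(1).split())


# Assumed sample input standing in for a downloaded page.
sample_html = """<html><head>
<TITLE lang="en">
  Example   Domain
</TITLE></head><body></body></html>"""

print(extract_title(sample_html))                            # Example Domain
print(extract_title("<html><body>no title</body></html>"))   # None
```

For anything beyond a single well-known tag, an HTML parser such as the Beautiful Soup or lxml steps shown above is usually the safer choice, since regular expressions do not understand nesting or comments.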
