My crawler only scrapes one page: how do I scrape multiple pages?

Time: 2024-04-19 22:27:02 Views: 116

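To scrape multiple pages, you first need to work out how the URL changes from one page to the next, then request each page by modifying the URL accordingly. The example below shows how to scrape several pages of Bank of China foreign exchange rates:

```python
import requests
from bs4 import BeautifulSoup

base_url = 'http://www.boc.cn/sourcedb/whpj/'

# Range of pages to crawl
start_page = 1
end_page = 5

for page in range(start_page, end_page + 1):
    # Build the URL of the current page
    url = base_url + f'index_{page}.html'

    # Request the page; fail early on timeouts and HTTP errors
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    html = response.content

    # Parse the page with BeautifulSoup
    soup = BeautifulSoup(html, 'html.parser')

    # Locate the exchange-rate table; skip the page if it is absent
    table = soup.find('table', class_='publish')
    if table is None:
        continue
    rows = table.find_all('tr')

    # Walk the rows and extract the fields of interest
    for row in rows:
        cols = row.find_all('td')
        if len(cols) >= 7:
            currency = cols[0].text.strip()   # currency
            buy_rate = cols[2].text.strip()   # spot buying rate
            sell_rate = cols[6].text.strip()  # cash selling rate
            print(currency, buy_rate, sell_rate)
```

In this example, the loop runs from `start_page` to `end_page` and builds the URL of each page in turn. For every page it sends an HTTP request, parses the response with BeautifulSoup, extracts the required fields, and prints the exchange-rate data.

Adjust `start_page` and `end_page` to your needs, and build each URL according to the pattern the Bank of China site actually uses. Check that pattern in a browser first: on many sites the first page is a plain `index.html`, and numbered suffixes only start from the second page. Also confirm the table selector and the column indices against the live page, since they determine which rate each variable actually holds.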
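When the number of pages is not known in advance, one option is to keep requesting pages until one is missing instead of hard-coding an end page. The following is a minimal sketch, using only the standard library, that separates URL construction from fetching. The names `page_url` and `crawl_pages`, the `fetch` callback, and the rule that page 0 maps to `index.html` while later pages map to `index_<n>.html` are all assumptions made for illustration, not confirmed behavior of the Bank of China site.

```python
from typing import Callable, Iterator, Optional, Tuple


def page_url(base_url: str, page: int) -> str:
    """Build the URL for a zero-based page number.

    Assumption (hypothetical pattern): page 0 is index.html and
    page n > 0 is index_<n>.html. Adjust this to the real site.
    """
    if page < 0:
        raise ValueError("page must be non-negative")
    name = "index.html" if page == 0 else f"index_{page}.html"
    return base_url.rstrip("/") + "/" + name


def crawl_pages(
    fetch: Callable[[str], Optional[str]],
    base_url: str,
    max_pages: int,
) -> Iterator[Tuple[int, str]]:
    """Yield (page, html) until fetch returns None or max_pages is reached."""
    for page in range(max_pages):
        html = fetch(page_url(base_url, page))
        if html is None:  # treat a missing page as the end of the listing
            break
        yield page, html
```

In real use, `fetch` could wrap `requests.get` and return `None` for a 404 response, and a short `time.sleep` between requests would keep the load on the server low. Passing the fetcher in as a callback also makes the paging logic easy to try out against canned HTML without touching the network.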
