Scraping Xiachufang Recipes with Python

Date: 2023-07-12 17:00:33 Views: 105

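To scrape recipes from Xiachufang (下厨房), you can use the Python libraries requests and Beautiful Soup. The steps are:

1. Use requests to send a request to Xiachufang's recipe search page and get the HTML of the page.
2. Use Beautiful Soup to parse the HTML and get the list of recipes.
3. Iterate over the recipe list and get the link to each recipe's detail page.
4. Use requests to send a request to each detail-page link and get its HTML.
5. Use Beautiful Soup to parse that HTML and extract the recipe details, such as the ingredients and the cooking steps.

Here is a simple example of scraping Xiachufang recipes with Python:

```python
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = 'https://www.xiachufang.com'
# Assumption: the site may reject the default requests User-Agent,
# so a browser-like one is sent with every request.
HEADERS = {'User-Agent': 'Mozilla/5.0'}


def text_of(tag):
    """Return the stripped text of a tag, or '' if the tag was not found."""
    return tag.text.strip() if tag else ''


# Send a request and get the HTML of the search page
response = requests.get(f'{BASE_URL}/search/', params={'keyword': '红烧肉'},
                        headers=HEADERS, timeout=10)
response.raise_for_status()

# Parse the HTML and get the recipe list
soup = BeautifulSoup(response.text, 'html.parser')
recipe_list = soup.find_all('div', class_='recipe')

for recipe in recipe_list:
    # Get the link to the recipe's detail page; skip entries without one
    link = recipe.find('a', href=True)
    if link is None:
        continue
    # The href may be relative, so resolve it against the site root
    recipe_link = urljoin(BASE_URL, link['href'])

    # Send a request and parse the HTML of the detail page
    recipe_response = requests.get(recipe_link, headers=HEADERS, timeout=10)
    recipe_soup = BeautifulSoup(recipe_response.text, 'html.parser')

    # Extract and print the recipe details
    print(text_of(recipe_soup.find('h1', class_='page-title')))
    print(text_of(recipe_soup.find('div', class_='ings')))
    print(text_of(recipe_soup.find('div', class_='steps')))

    # Pause between requests so the site is not hit too frequently
    time.sleep(1)
```

Note that when scraping a website you should follow its crawler policy (robots.txt) and avoid sending requests too frequently, so that you do not affect the site.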
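The closing advice about following the site's crawler policy and limiting request frequency could be sketched roughly as follows, using the standard library's `urllib.robotparser`. This is only an illustration under stated assumptions: `PoliteGate`, `build_robot_parser`, and `SAMPLE_ROBOTS_TXT` are hypothetical names introduced here, and the sample rules are made up for the demo; they are not Xiachufang's actual robots.txt.

```python
import time
import urllib.robotparser

# Hypothetical rules for demonstration only; NOT the real robots.txt of any site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_robot_parser(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteGate:
    """Check URLs against robots.txt rules and enforce a minimum delay between requests."""

    def __init__(self, parser, user_agent='*', min_delay=1.0):
        self.parser = parser
        self.user_agent = user_agent
        self.min_delay = min_delay
        self._last_request = None  # monotonic timestamp of the previous request

    def allowed(self, url):
        """Return True if the parsed rules permit fetching this URL."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait(self):
        """Sleep just long enough that calls are at least min_delay seconds apart."""
        now = time.monotonic()
        if self._last_request is not None:
            remaining = self.min_delay - (now - self._last_request)
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()


if __name__ == '__main__':
    gate = PoliteGate(build_robot_parser(SAMPLE_ROBOTS_TXT), min_delay=0.5)
    for url in ('https://www.xiachufang.com/search/?keyword=test',
                'https://www.xiachufang.com/private/page'):
        if gate.allowed(url):
            gate.wait()  # call this right before requests.get(url)
            print('would fetch', url)
        else:
            print('skipping', url)
```

Against the live site, the parser would instead be loaded with `parser.set_url('https://www.xiachufang.com/robots.txt')` followed by `parser.read()`, and `gate.allowed(...)` and `gate.wait()` would be called before each `requests.get` in the scraping loop.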
