
Crawler code for the 40-day Lanzhou weather forecast

Date: 2024-05-13 07:12:35 Views: 110

Python crawler example: scraping the 2345 weather forecast


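Over the winter break I learned a bit about Python crawlers and scraped the weather data I needed in the simplest way possible. Yes, you heard right, the simplest way: the code does not even wrap anything in a function. URL: http://tianqi.2345.com/wea_history/53892.htm. Right-clicking in Firefox and viewing the page source showed no weather data, so I inferred that the page loads its data as JSON. Right-click -> Inspect Element -> Network -> JS located the request, which I then downloaded with a Python crawler and saved as JSON. The code is as follows: #-*- coding:utf-8 -*- import urllib2 import json months = [1,2,3,4,5,6,7,8,9,10,11,12] years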
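The excerpt above is cut off and uses Python 2's `urllib2`. As a rough Python 3 sketch of the same idea (loop over years and months, download each response, save it as JSON), something like the following might work. `URL_TEMPLATE`, `build_urls`, and `download_json` are hypothetical names introduced here, the template is a placeholder rather than the site's real endpoint, and the sketch assumes the response body is strict UTF-8 JSON, which the excerpt does not confirm.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the request URL seen in the
# browser's Network panel. This is NOT the real endpoint.
URL_TEMPLATE = "http://example.com/history/{year}{month:02d}.json"


def build_urls(years, months, template=URL_TEMPLATE):
    """Return one URL per (year, month) pair, years outermost."""
    return [template.format(year=y, month=m) for y in years for m in months]


def download_json(url, path, timeout=10):
    """Fetch url, parse the body as JSON, and save it to path."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Assumption: the body is UTF-8 encoded, strict JSON.
        data = json.loads(resp.read().decode("utf-8"))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)
    return data


# Build the list of URLs only; no request is sent until download_json is called.
urls = build_urls(years=[2015, 2016], months=range(1, 13))
```

Keeping URL construction separate from the download makes it possible to inspect the generated URLs before sending any request.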

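A crawler is a program that automatically fetches web pages and extracts the needed data by parsing their HTML. The following Python example fetches the Lanzhou forecast from weather.com.cn. Note that the URL used here appears to be the site's 7-day forecast page, so the loop prints one line per day listed on that page rather than forty days.

```python
import requests
from bs4 import BeautifulSoup

url = 'http://www.weather.com.cn/weather/101160101.shtml'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

response = requests.get(url, headers=headers)
# Pass the raw bytes so BeautifulSoup can detect the page encoding itself.
soup = BeautifulSoup(response.content, 'html.parser')
weather_list = soup.find('ul', class_='t clearfix').find_all('li')

for weather in weather_list:
    date = weather.find('h1').string
    # find_all returns a list; the order (condition, temperature, wind)
    # is an assumption about the page layout.
    wea_p, tem_p, win_p = weather.find_all('p')[:3]
    weather_day = wea_p.get_text(strip=True)
    high = tem_p.find('span')  # assumed optional: may be missing for some entries
    low = tem_p.find('i').get_text(strip=True)
    temperature = f'{high.get_text(strip=True)}/{low}' if high else low
    wind_direction = win_p.find('span')['title']
    wind_level = win_p.find('i').get_text(strip=True)
    print(date, weather_day, temperature, wind_direction, wind_level)
```

The code sends an HTTP request with the requests library, parses the returned HTML with BeautifulSoup, and extracts the weather data. Here `url` is the page to fetch, `headers` holds the request headers, `soup` is the parsed page object, and `weather_list` is the list of per-day forecast entries. Because `find_all('p')` returns a list rather than a single tag, each paragraph has to be accessed individually; calling `.string` or `.find` on the list itself raises an `AttributeError`. Inside the loop, the date, weather condition, temperature, wind direction, and wind level are extracted for each entry and printed.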
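To check the extraction logic without hitting the network, the parsing step can be pulled into a function and run against a small HTML fragment. The sketch below is one way to do that. `parse_forecast` and `SAMPLE_HTML` are hypothetical names, and the sample markup is made up to mimic the structure the code above assumes (an `h1` followed by three `p` tags per `li`); the real page may differ.

```python
from bs4 import BeautifulSoup


def parse_forecast(html):
    """Return one dict per <li> in the <ul class="t clearfix"> forecast list."""
    soup = BeautifulSoup(html, "html.parser")
    forecast = soup.find("ul", class_="t clearfix")
    if forecast is None:
        return []
    rows = []
    for li in forecast.find_all("li"):
        heading = li.find("h1")
        paragraphs = li.find_all("p")
        # Skip entries that do not match the assumed layout.
        if heading is None or len(paragraphs) < 3:
            continue
        wea_p, tem_p, win_p = paragraphs[:3]
        high = tem_p.find("span")
        low = tem_p.find("i")
        wind = win_p.find("span")
        level = win_p.find("i")
        rows.append({
            "date": heading.get_text(strip=True),
            "weather": wea_p.get_text(strip=True),
            "high": high.get_text(strip=True) if high else None,
            "low": low.get_text(strip=True) if low else None,
            "wind_direction": wind.get("title") if wind else None,
            "wind_level": level.get_text(strip=True) if level else None,
        })
    return rows


# Made-up fragment for illustration; not copied from the real site.
SAMPLE_HTML = """
<ul class="t clearfix">
  <li>
    <h1>13日（今天）</h1>
    <p class="wea">晴</p>
    <p class="tem"><span>25</span>/<i>10℃</i></p>
    <p class="win"><em><span title="西北风"></span></em><i>3-4级</i></p>
  </li>
  <li>
    <h1>14日（明天）</h1>
    <p class="wea">多云</p>
    <p class="tem"><i>12℃</i></p>
    <p class="win"><em><span title="东风"></span></em><i>&lt;3级</i></p>
  </li>
</ul>
"""

rows = parse_forecast(SAMPLE_HTML)
for row in rows:
    print(row)
```

Returning dictionaries with `None` for missing fields, instead of printing inside the loop, keeps the parser usable when an entry lacks a high temperature or a wind tag.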
