Python code for scraping news from Weibo, Toutiao, Zhihu, NetEase, and Tencent
Time: 2023-11-29 14:45:38 Views: 224
The following code examples use Python to scrape news from Weibo, Toutiao, Zhihu, NetEase, and Tencent:
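1. Scraping Weibo news:
```python
import requests

# Mobile Weibo container API; the uid value below is a placeholder, replace it with a real user ID
url = 'https://m.weibo.cn/api/container/getIndex?type=uid&value=1234567890'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = response.json()
# Process the returned data
# ...
```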
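The `# ...` placeholder in the Weibo example leaves the parsing step open. A minimal sketch of that step follows, assuming the decoded response holds a list of cards under `data["data"]["cards"]` and that cards wrapping a post carry an `mblog` object with `id`, `created_at`, and `text` fields. These field names are assumptions about an undocumented endpoint, and `extract_weibo_posts` is a hypothetical helper, not part of any library.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # crude HTML tag stripper for post text


def extract_weibo_posts(data):
    """Pull id, created_at, and plain text out of a decoded response.

    Assumed layout (not confirmed by the source): data["data"]["cards"] is a
    list, and cards that wrap a post carry an "mblog" dict.
    """
    cards = (data.get("data") or {}).get("cards") or []
    posts = []
    for card in cards:
        mblog = card.get("mblog")
        if not mblog:  # skip cards that do not wrap a post
            continue
        posts.append({
            "id": mblog.get("id"),
            "created_at": mblog.get("created_at"),
            "text": TAG_RE.sub("", mblog.get("text", "")),
        })
    return posts
```

Calling `extract_weibo_posts(data)` on the dictionary returned by `response.json()` yields an empty list rather than raising when the assumed keys are missing, which makes a changed response layout easy to spot.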
2. Scraping Toutiao news:
```python
import requests

url = 'https://www.toutiao.com/api/pc/feed/'
params = {
    'category': 'news_hot',
    'utm_source': 'toutiao',
    'widen': 1,
    'max_behot_time': 0,
    'max_behot_time_tmp': 0,
    'tadrequire': 'true',
    # 'as' and 'cp' are request-signature values; the hard-coded ones here are examples and may be rejected
    'as': 'A1F5B8C9C9F5B8C',
    'cp': '5D8E7D9C9F5B8C1'
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
response = requests.get(url, params=params, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = response.json()
# Process the returned data
# ...
```
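The `max_behot_time` parameters in the Toutiao request look like a paging cursor. A rough sketch of how the next page's parameters might be derived follows, assuming the response exposes the cursor as `data["next"]["max_behot_time"]` and lists articles under `data["data"]` with a `title` field. Both the layout and the helper names `next_page_params` and `extract_titles` are assumptions for illustration.

```python
def next_page_params(params, data):
    """Return a copy of params pointing at the next page, or None.

    Assumes the cursor lives at data["next"]["max_behot_time"]; the original
    params dict is left unchanged.
    """
    cursor = (data.get("next") or {}).get("max_behot_time")
    if not cursor:
        return None
    updated = dict(params)
    updated["max_behot_time"] = cursor
    updated["max_behot_time_tmp"] = cursor
    return updated


def extract_titles(data):
    """Collect non-empty titles from the assumed data["data"] item list."""
    return [
        item["title"]
        for item in (data.get("data") or [])
        if item.get("title")
    ]
```

A crawl loop could call `next_page_params(params, data)` after each request and stop when it returns `None`.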
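3. Scraping Zhihu news:
```python
import requests

# Zhihu hot-list endpoint; limit controls how many entries are returned
url = 'https://www.zhihu.com/api/v3/feed/topstory/hot-lists/total?limit=10&desktop=true'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = response.json()
# Process the returned data
# ...
```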
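For the Zhihu hot list, the processing step could turn the response into a ranked list of questions. The sketch below assumes each entry in `data["data"]` has a `target` object with `id` and `title`, plus a `detail_text` field describing its popularity. These field names are assumptions, and `extract_hot_questions` is a hypothetical helper.

```python
def extract_hot_questions(data):
    """Build ranked rows from the assumed hot-list layout.

    Entries without a target title are skipped; rank follows list order.
    """
    rows = []
    for rank, item in enumerate(data.get("data") or [], start=1):
        target = item.get("target") or {}
        title = target.get("title")
        if not title:
            continue
        qid = target.get("id")
        rows.append({
            "rank": rank,
            "title": title,
            "heat": item.get("detail_text", ""),
            # Question page URL, built only when an id is present
            "url": f"https://www.zhihu.com/question/{qid}" if qid is not None else None,
        })
    return rows
```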
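4. Scraping NetEase news:
```python
import requests

# Headline feed; T1348647853363 is the channel ID and 0-20 is the item range
url = 'https://c.m.163.com/nc/article/headline/T1348647853363/0-20.html'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = response.json()
# Process the returned data
# ...
```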
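The NetEase URL embeds a channel ID (`T1348647853363`), and the response appears to be keyed by that same ID. A small sketch under that assumption follows: it reads the ID from the URL path and pulls `title`, `digest`, and `url` from each article. The response layout and the helpers `channel_id_from_url` and `extract_articles` are assumptions, not a documented API.

```python
from urllib.parse import urlparse


def channel_id_from_url(url):
    """Return the path segment that follows 'headline' in the feed URL."""
    parts = urlparse(url).path.strip("/").split("/")
    return parts[parts.index("headline") + 1]


def extract_articles(data, channel_id):
    """List articles from the assumed {channel_id: [article, ...]} layout."""
    return [
        {
            "title": article.get("title"),
            "digest": article.get("digest"),
            "url": article.get("url"),
        }
        for article in (data.get(channel_id) or [])
        if article.get("title")
    ]
```

With the request from the example above, `extract_articles(data, channel_id_from_url(url))` would keep the channel ID in one place instead of repeating it in the parsing code.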
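5. Scraping Tencent news:
```python
import requests

# Recommendation feed; num controls how many items are returned
url = 'https://pacaio.match.qq.com/irs/rcd?cid=108&ext=&token=&num=10'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
data = response.json()
# Process the returned data
# ...
```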
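Once the Tencent response is decoded, one way to process it is to keep a few fields per item and serialize them as CSV. The sketch below assumes items sit under `data["data"]` with `title`, `vurl` (or `url`), and `publish_time` fields; those names are assumptions about the response, and `extract_tencent_items` and `to_csv` are hypothetical helpers built on the standard `csv` module.

```python
import csv
import io


def extract_tencent_items(data):
    """Keep title, link, and publish time from the assumed data["data"] list."""
    items = []
    for entry in data.get("data") or []:
        title = entry.get("title")
        link = entry.get("vurl") or entry.get("url")
        if title and link:  # drop entries missing either field
            items.append({
                "title": title,
                "url": link,
                "publish_time": entry.get("publish_time", ""),
            })
    return items


def to_csv(rows, fieldnames=("title", "url", "publish_time")):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The same `to_csv` helper works for rows from any of the five sources as long as they share the listed field names.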