Scraping the ShanghaiRanking (软科) University Rankings into Excel
Date: 2023-12-17 10:27:19
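Based on the provided references, including reference [1], a Python scraper can fetch the ShanghaiRanking university ranking data and save it as an Excel file. The steps are as follows:
1. Import the required libraries: requests, BeautifulSoup, and pandas.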
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
```
2. Send an HTTP request and fetch the page content.
```python
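url = 'http://www.zuihaodaxue.com/zuihaodaxuepaiming2021.html'
# A browser-like User-Agent; some sites reject the default one sent by requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses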
```
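Requests like this can fail transiently, so it may help to wrap the call in a small retry loop. The following is a minimal sketch and not part of the original answer: `fetch_with_retry` and its parameters are hypothetical names, and the `fetch` argument stands in for whatever actually performs the request.

```python
import time


def fetch_with_retry(url, fetch, retries=3, delay=1.0):
    """Call fetch(url) up to `retries` times and return the first successful result.

    `fetch` is any callable that returns the page body or raises on failure.
    This is a hypothetical helper, not something provided by requests.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # deliberately broad: this is only a sketch
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"all {retries} attempts failed for {url}") from last_error
```

With requests, it could be called as `fetch_with_retry(url, lambda u: requests.get(u, headers=headers, timeout=10).content)`.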
3. Parse the HTML and extract the data.
```python
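from io import StringIO  # newer pandas versions deprecate passing a raw HTML string

soup = BeautifulSoup(response.content, 'html.parser')
# Take the first table whose class attribute matches the ranking table
table = soup.find_all('table', {'class': 'table table-striped table-bordered table-hover'})[0]
df = pd.read_html(StringIO(str(table)))[0]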
```
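To make the extraction step concrete without touching the network, here is a rough offline sketch. It deliberately swaps in the standard library's `html.parser` instead of BeautifulSoup and pandas, purely to show what "turning a table into rows" amounts to. The `TableRows` class is a hypothetical name, and the sample markup and its values are made up; the real page layout may differ.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up sample markup standing in for the downloaded page
sample_html = """
<table class="table table-striped table-bordered table-hover">
  <tr><th>Rank</th><th>University</th><th>Score</th></tr>
  <tr><td>1</td><td>University A</td><td>95.5</td></tr>
  <tr><td>2</td><td>University B</td><td>90.1</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)
header, *body = parser.rows  # first row is the header, the rest is data
print(header)
print(body)
```

In practice `pd.read_html` does this work and also infers column types, which is why the answer relies on it rather than a hand-written parser.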
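4. Save the data as an Excel file. Writing `.xlsx` requires an Excel engine such as openpyxl to be installed alongside pandas.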
```python
df.to_excel('university_ranking.xlsx', index=False)
```
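If no Excel engine is available, CSV is a possible fallback that Excel can still open. The sketch below is an assumption-laden illustration rather than part of the original answer: `rows_to_csv_bytes` is a hypothetical helper, and the column names and rows are placeholders. It writes a UTF-8 byte order mark, which Excel typically needs in order to display Chinese text in a CSV correctly.

```python
import csv
import io


def rows_to_csv_bytes(header, rows):
    """Serialise rows to CSV bytes with a UTF-8 BOM so Excel detects the encoding."""
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    # "utf-8-sig" prepends the byte order mark to the encoded output
    return buffer.getvalue().encode("utf-8-sig")


# Placeholder data; real rows would come from the scraped table
data = rows_to_csv_bytes(["排名", "学校名称"], [[1, "University A"], [2, "University B"]])
```

With a DataFrame, the equivalent call is `df.to_csv('university_ranking.csv', index=False, encoding='utf-8-sig')`.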
The complete code is as follows:
```python
from io import StringIO

import requests
from bs4 import BeautifulSoup
import pandas as pd

# 1. Fetch the ranking page
url = 'http://www.zuihaodaxue.com/zuihaodaxuepaiming2021.html'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses

# 2. Locate the ranking table and load it into a DataFrame
soup = BeautifulSoup(response.content, 'html.parser')
table = soup.find_all('table', {'class': 'table table-striped table-bordered table-hover'})[0]
df = pd.read_html(StringIO(str(table)))[0]

# 3. Write the result to Excel (requires an engine such as openpyxl)
df.to_excel('university_ranking.xlsx', index=False)
```