
A Python crawler example for fetching the Academic Ranking of World Universities

Date: 2023-09-12 21:10:22

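Sure. Here is an example crawler, written in Python, that fetches the Academic Ranking of World Universities:

```python
import requests
from bs4 import BeautifulSoup

# Request headers that mimic a browser visit
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

# Request URL
url = 'https://www.shanghairanking.cn/rankings/arwu/2021'

# Send the GET request and stop early on an HTTP error status
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
response.encoding = response.apparent_encoding  # avoid garbled Chinese text

# Parse the HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Locate the ranking table
table = soup.find('table', {'class': 'rk'})
if table is None or table.tbody is None:
    raise SystemExit('Ranking table not found; the page layout may have changed.')

# Print the header row
print('Rank\tInstitution\tCountry/Region')

# Walk the table and print the data in each row
for tr in table.tbody.find_all('tr'):
    tds = tr.find_all('td')
    if len(tds) < 3:
        continue  # skip rows that lack the expected cells
    rank = tds[0].text.strip()     # rank
    name = tds[1].text.strip()     # institution name
    country = tds[2].text.strip()  # country/region
    print(f'{rank}\t{name}\t{country}')
```

The code uses the requests library to send a GET request and BeautifulSoup to parse the HTML. It locates the ranking table, iterates over its rows, and prints each institution's rank, name, and country/region. Note that the request headers include a User-Agent that mimics a browser, which lowers the chance of being blocked by the site's anti-crawler measures. The `rk` class name depends on the page markup: if the table is not present in the returned HTML, `soup.find` returns `None`, so the script checks for that and exits with a message instead of raising an `AttributeError`.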
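The row-extraction step can be tried without network access. The following is a minimal sketch of it, assuming a table shaped like the one the answer expects. It uses the standard library's `html.parser` in place of BeautifulSoup so that it runs with no third-party packages. The names `TableRowParser`, `parse_rows`, and `SAMPLE_HTML` are hypothetical, and the sample rows are made-up placeholders, not real ranking data.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the current row, None outside <tr>
        self._cell = None  # text pieces of the current cell, None outside <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a <td>; nested tags such as <a> pass through
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the pieces and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return the table body rows of `html` as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Placeholder markup (an assumption, not copied from the real site)
SAMPLE_HTML = """
<table class="rk">
  <thead><tr><th>Rank</th><th>Institution</th><th>Country/Region</th></tr></thead>
  <tbody>
    <tr><td> 1 </td><td><a href="#">Example University A</a></td><td>Country X</td></tr>
    <tr><td> 2 </td><td><a href="#">Example University B</a></td><td>Country Y</td></tr>
  </tbody>
</table>
"""

rows = parse_rows(SAMPLE_HTML)
for rank, name, country in rows:
    print(f"{rank}\t{name}\t{country}")
```

Keeping the parsing in a function that takes an HTML string makes it possible to check the logic against a saved copy of the page before pointing it at the live site.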