Use web scraping to collect the player data from https://nba.hupu.com/stats/players and store it in an Excel file


Time: 2023-12-09 22:02:31 Views: 183

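Sure. This is a fairly common scraping task, and I can offer an approach along with some code. Before starting, make sure you have Python installed together with the third-party libraries requests, beautifulsoup4, and pandas (pandas also needs openpyxl to write .xlsx files).

First, we send a request to fetch the page content:

```python
import requests

url = 'https://nba.hupu.com/stats/players'
# Send a browser-like User-Agent so the request looks like a normal page visit
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
content = response.content.decode('utf-8')
```

Next, we parse the page with beautifulsoup4 and extract the player data:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(content, 'html.parser')
table = soup.find('table', {'class': 'players_table'})
rows = table.find_all('tr')

data = []
for row in rows:
    cols = row.find_all('td')
    if len(cols) == 0:
        continue  # skip rows that contain no <td> cells
    # Collect the stripped text of every cell in this row
    player = [col.text.strip() for col in cols]
    data.append(player)
```

Finally, we use pandas to save the data to an Excel file:

```python
import pandas as pd

# The number of column names must match the number of cells in each row
df = pd.DataFrame(data, columns=['排名', '球员', '球队', '场次', '首发', '时间', '投篮', '三分', '罚球', '篮板', '助攻', '抢断', '盖帽', '失误', '犯规', '得分'])
df.to_excel('players.xlsx', index=False)
```

After running the complete code, an Excel file named players.xlsx containing the player data is created in the current directory. Hope this code helps!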
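One fragile spot in the code above is the hard-coded list of 16 column names: `pd.DataFrame` raises a `ValueError` if the scraped rows have a different number of cells. A minimal sketch of one way around this follows, assuming the first extracted row of `data` holds the column titles (an assumption about the page layout, not something the answer confirms). The helper name `split_header` and the sample rows are hypothetical and only for illustration.

```python
def split_header(rows):
    """Split scraped rows into (columns, records).

    Assumption: the first non-empty row holds the column titles.
    Rows whose cell count differs from the header are skipped.
    """
    rows = [r for r in rows if r]  # drop empty rows
    if not rows:
        return [], []
    columns = rows[0]
    records = [r for r in rows[1:] if len(r) == len(columns)]
    return columns, records


if __name__ == "__main__":
    # Made-up sample rows, not real data from the site
    sample = [
        ["rank", "player", "team", "points"],
        ["1", "Player A", "Team X", "30.1"],
        ["2", "Player B", "Team Y"],  # malformed: one cell short
        ["3", "Player C", "Team Z", "28.4"],
    ]
    columns, records = split_header(sample)
    print(columns)
    print(records)
```

With this in place, the last step could become `columns, records = split_header(data)` followed by `pd.DataFrame(records, columns=columns)`, so the column count always matches the rows that are kept.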
