Scrape the historical net asset values (NAV) of every fund in the medical-services sector on Tiantian Fund (fund.eastmoney.com) and save them as CSV files
Sure. This needs Python's scraping and data-processing libraries. Below is one possible implementation:
1. Import the required libraries (pandas.read_html, used in step 6, additionally needs lxml or html5lib installed):
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
```
2. Set the request headers and the target URL:
```python
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'http://fund.eastmoney.com'
```
3. Send the request to fetch the page, then parse it with BeautifulSoup:
```python
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
```
4. Find the URL of the medical-services sector page:
```python
med_url = ''
for a in soup.find_all('a'):
    # Match the first link whose markup mentions the medical-services sector (医疗服务)
    if '医疗服务' in str(a):
        href = a.get('href', '')
        # Prepend the site root only when the link is relative
        med_url = href if href.startswith('http') else url + href
        break
```
5. Open the medical-services sector page and collect the NAV-page URLs of all of its funds:
```python
response = requests.get(med_url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
fund_urls = []
for a in soup.find_all('a'):
    # Keep links that point to fund NAV pages, skipping duplicates
    if '基金净值' in str(a) and 'html' in str(a):
        href = a.get('href', '')
        fund_url = href if href.startswith('http') else url + href
        if fund_url not in fund_urls:
            fund_urls.append(fund_url)
```
6. Open each fund's NAV page, extract the historical NAV table, and save it as a CSV file:
```python
for fund_url in fund_urls:
    response = requests.get(fund_url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Fund name shown in the page title area
    fund_name = soup.find('div', {'class': 'fundDetail-tit'}).text.strip()
    # Historical NAV table (class name taken from the page markup)
    table = soup.find('table', {'class': 'w782 comm tzxq'})
    df = pd.read_html(str(table))[0]
    # Save one CSV per fund; utf-8-sig keeps Chinese text readable in Excel
    df.to_csv(fund_name + '.csv', index=False, encoding='utf-8-sig')
```
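One caveat: on many of these pages the NAV table is filled in by JavaScript, so the static HTML that requests receives may not contain it. In that case a commonly used fallback is the fund's F10 history endpoint. The URL, the parameters (type=lsjz, code, page, per) and the response layout in the sketch below are assumptions based on how the page is typically observed to load its data; verify them in the browser's network panel before relying on them:
```python
import re

import pandas as pd
import requests


def fetch_nav_history(fund_code, page=1, per=49):
    """Fetch one page of historical NAV rows for a fund code such as '110023'.

    The endpoint, its parameters and the response layout are assumptions;
    confirm them in the browser's developer tools before relying on this.
    """
    api = 'http://fund.eastmoney.com/f10/F10DataApi.aspx'
    params = {'type': 'lsjz', 'code': fund_code, 'page': page, 'per': per}
    resp = requests.get(api, params=params, timeout=10)
    # The body is a small JS snippet whose content field holds an HTML table
    match = re.search(r'content:"(.*?)",records', resp.text, re.S)
    if not match:
        return pd.DataFrame()
    return pd.read_html(match.group(1))[0]
```
Each returned DataFrame can then be concatenated across pages and written out with to_csv exactly as in step 6.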
Putting the snippets together, the complete code looks roughly as follows (the CSS class names and the link-matching heuristics are the same assumptions made in the steps above and may need adjusting if the site's markup changes):
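```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'http://fund.eastmoney.com'

# Locate the medical-services sector page from the front page
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
med_url = ''
for a in soup.find_all('a'):
    if '医疗服务' in str(a):
        href = a.get('href', '')
        med_url = href if href.startswith('http') else url + href
        break
if not med_url:
    raise SystemExit('Medical-services sector link not found; the page layout may have changed.')

# Collect the NAV-page URLs of all funds in the sector
response = requests.get(med_url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
fund_urls = []
for a in soup.find_all('a'):
    if '基金净值' in str(a) and 'html' in str(a):
        href = a.get('href', '')
        fund_url = href if href.startswith('http') else url + href
        if fund_url not in fund_urls:
            fund_urls.append(fund_url)

# Save each fund's historical NAV table as a CSV file
for fund_url in fund_urls:
    response = requests.get(fund_url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    fund_name = soup.find('div', {'class': 'fundDetail-tit'}).text.strip()
    table = soup.find('table', {'class': 'w782 comm tzxq'})
    df = pd.read_html(str(table))[0]
    df.to_csv(fund_name + '.csv', index=False, encoding='utf-8-sig')
```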