Python crawler for core postgraduate entrance exam (kaoyan) vocabulary
Time: 2023-12-01 10:42:46 Views: 32
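The following steps use a Python crawler to scrape the core vocabulary for the postgraduate entrance exam (kaoyan):
1. Import the required libraries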
```python
import requests
from bs4 import BeautifulSoup
import json
```
2. Scrape the vocabulary
```python
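# Vocabulary list page; adjust the URL and CSS selector if the site layout has changed
url = 'https://www.kuakao.com/kaoyan/cihui/'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.text, 'html.parser')
# Each word is the text of a link inside the .wordList list
words = soup.select('.wordList li a')
word_list = [word.text for word in words]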
```
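The scraped link texts may carry stray whitespace or repeat the same word, and each repeat would trigger another lookup in the next step. Below is a minimal sketch of a cleanup pass, assuming `word_list` is a plain list of strings; `clean_words` is a hypothetical helper name and the sample words are made up for illustration.

```python
def clean_words(raw_words):
    """Strip whitespace, drop empty entries, and remove duplicates while keeping order."""
    seen = set()
    cleaned = []
    for raw in raw_words:
        word = raw.strip()
        if word and word not in seen:
            seen.add(word)
            cleaned.append(word)
    return cleaned


# Illustrative input only, not real scraped data
print(clean_words([' abandon ', 'ability', '', 'abandon']))
# prints ['abandon', 'ability']
```

Calling `word_list = clean_words(word_list)` before step 3 would keep the number of lookups to one per distinct word.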
3. Scrape the synonyms, antonyms, and example sentences for each word
```python
word_dict = {}
for word in word_list:
    # Look up each word on iciba; passing params lets requests URL-encode the word
    response = requests.get('https://www.iciba.com/word', params={'w': word}, timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')
    # Synonym and antonym entries
    near_antonyms_list = [na.text for na in soup.select('.nearAntonyms .content')]
    # Example sentences
    example_sentences_list = [es.text for es in soup.select('.exampleLists .group_pos .group_pos')]
    word_dict[word] = {
        'near_antonyms': near_antonyms_list,
        'example_sentences': example_sentences_list,
    }
```
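The loop in step 3 sends one request per word, so a single transient network error would abort the whole run. One possible way to soften that is a small retry wrapper. This is a sketch under stated assumptions, not part of the original answer: `with_retries` is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```python
import time


def with_retries(func, attempts=3, delay=1.0):
    """Call func() up to `attempts` times, sleeping `delay` seconds between failed tries."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # in real use, narrow this to requests.RequestException
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)  # also spaces out requests to the site
    raise last_error
```

Inside the loop, the lookup could then be written as `response = with_retries(lambda: requests.get('https://www.iciba.com/word', params={'w': word}, timeout=10))`.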
4. Save the result to a JSON file
```python
with open('word_dict.json', 'w', encoding='utf-8') as f:
    # ensure_ascii=False keeps non-ASCII characters readable in the file
    json.dump(word_dict, f, ensure_ascii=False)
```
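To check the saved file, it can be read back and scanned for words whose lookup returned nothing, which usually means a selector no longer matches the page. A minimal sketch, assuming the file layout written in step 4; `load_word_dict` and `summarize` are hypothetical helper names.

```python
import json


def load_word_dict(path='word_dict.json'):
    """Read the JSON file written in step 4 back into a dict."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


def summarize(word_dict):
    """Return the total word count and the words that have no example sentences."""
    missing = [w for w, info in word_dict.items() if not info.get('example_sentences')]
    return len(word_dict), missing
```

For example, `total, missing = summarize(load_word_dict())` gives the number of saved words and the list of words worth re-checking by hand.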