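Simulated login to 古诗文网 (gushiwen.cn): after logging in with your own username and password, scrape the titles and authors of the poems in your personal favorites list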

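This can be done by simulating a login to 古诗文网 and then scraping the personal favorites list. The steps are:

1. Import the required libraries: requests, BeautifulSoup, re, time, and selenium.
2. Use selenium to log in to 古诗文网 and collect the cookies set after login.
3. Send a request with requests, carrying those cookies, to fetch the HTML of the favorites list page.
4. Parse the HTML with BeautifulSoup and collect the link to each poem in the list.
5. Request each poem's detail page with requests.
6. Parse each detail page with a regular expression or BeautifulSoup to extract the poem's title and author.

The code below implements these steps:

```python
import re
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

# Login credentials (replace with your own)
username = "your_username"
password = "your_password"

# Log in with selenium and collect the cookies
login_url = "https://so.gushiwen.cn/user/login.aspx?from=http://so.gushiwen.cn/user/collect.aspx"
driver = webdriver.Chrome()
driver.get(login_url)
# find_element_by_id was removed in Selenium 4.3; use find_element(By.ID, ...)
driver.find_element(By.ID, "email").send_keys(username)
driver.find_element(By.ID, "pwd").send_keys(password)
driver.find_element(By.ID, "denglu").click()
# Wait for the login to finish. If the page asks for a verification code,
# type it into the browser window and submit again during this pause.
time.sleep(5)
cookies_dict = {cookie["name"]: cookie["value"] for cookie in driver.get_cookies()}
driver.quit()

# Fetch the favorites list page with the login cookies
collection_url = "https://so.gushiwen.cn/user/collect.aspx"
response = requests.get(collection_url, cookies=cookies_dict)
html = response.text

# Parse the list page and collect the link to each poem
soup = BeautifulSoup(html, "html.parser")
poem_links = []
for poem in soup.find_all("div", class_="sons"):
    anchor = poem.find("a", href=True)
    if anchor is not None:  # skip blocks that contain no link
        # urljoin handles both relative and absolute hrefs
        poem_links.append(urljoin("https://so.gushiwen.cn", anchor["href"]))

# Fetch each poem page and extract the title and author
for link in poem_links:
    response = requests.get(link, cookies=cookies_dict)
    soup = BeautifulSoup(response.text, "html.parser")
    cont = soup.find("div", class_="cont")
    title = cont.find("h1").text.strip()
    author = cont.find("p", class_="source").text.strip()
    # Collapse whitespace, keep the first token, and drop its last character;
    # adjust this line if the page lays out the author differently
    author = re.sub(r"\s+", " ", author).split(" ")[0][:-1]
    print(title, author)
```

This prints the title and author of each poem in the personal favorites list.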
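Two steps in the script above are easy to get wrong and can be checked without a browser or network access: turning the selenium cookie list into something requests can send, and cleaning the author string. The following is a minimal sketch using only the standard library. The helper names (`cookies_to_dict`, `cookie_header`, `parse_author`), the sample cookie names, and the two assumed author formats ("dynasty：author" and "author〔dynasty〕") are illustrative assumptions, not confirmed against the live site.

```python
import re


def cookies_to_dict(selenium_cookies):
    """Map a list of cookie dicts (the shape returned by
    driver.get_cookies()) to a plain {name: value} dict."""
    return {c["name"]: c["value"] for c in selenium_cookies}


def cookie_header(cookies):
    """Build the value of a Cookie request header from a {name: value} dict."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def parse_author(source_text):
    """Extract the author name from the text of the source line.

    Assumed formats (hypothetical, adjust to the real page):
      "author〔dynasty〕"  e.g. "李白〔唐代〕"
      "dynasty：author"   e.g. "唐代：王之涣"
    Any other text is returned with whitespace removed.
    """
    text = re.sub(r"\s+", "", source_text)
    match = re.match(r"(.+?)〔.*〕$", text)
    if match:
        return match.group(1)
    parts = re.split(r"[：:]", text, maxsplit=1)
    if len(parts) == 2:
        return parts[1]
    return text


# Hypothetical sample data, shaped like driver.get_cookies() output
sample_cookies = [
    {"name": "session_id", "value": "abc123", "domain": ".example.com"},
    {"name": "user", "value": "demo", "domain": ".example.com"},
]

cookies = cookies_to_dict(sample_cookies)
print(cookies)
print(cookie_header(cookies))
print(parse_author("李白 〔唐代〕"))
print(parse_author("唐代：王之涣"))
```

In the full script, the dict from `cookies_to_dict` can be passed as the `cookies=` argument of `requests.get`, and `parse_author` could replace the single-line string slicing if the page turns out to use one of the assumed formats.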
