Can this crawler be implemented without Selenium?

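Yes. The main goal in this task is to extract the post text and the repost text from the page, which can be done with Python's requests and BeautifulSoup libraries, provided that content is present in the HTML the server returns (requests does not execute JavaScript). The steps are: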

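1. Send a GET request with requests to fetch the page's HTML.
2. Parse the HTML with BeautifulSoup and extract the post text and the repost text.
3. Store the post text and the repost text in a MySQL database.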

Below is example code for a Weibo crawler based on requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup
import pymysql
import datetime


# Build the request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

# Request the page (query parameters are joined with a plain '&', not the HTML entity '&amp;')
url = 'https://weibo.com/n/%E4%B8%AD%E5%9B%BD%E9%A3%9F%E5%93%81%E5%8D%AB%E7%94%9F?from=feed&loc=at&nick=%E4%B8%AD%E5%9B%BD%E9%A3%9F%E5%93%81%E5%8D%AB%E7%94%9F&order=hot'
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()

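# Parse the page
soup = BeautifulSoup(response.text, 'html.parser')

# Extract the post text; stop early if the expected element is missing
first_post = soup.select_one('.WB_feed_detail .WB_text.W_f14')
if first_post is None:
    raise SystemExit('No post text found; the page may require login or JavaScript rendering.')
weibo_content = first_post.get_text(strip=True)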

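# Extract the reposts
repost_items = soup.select('.list_ul .list_li')
reposts = []
for item in repost_items:
    nickname_node = item.select_one('.WB_text.W_f14')
    content_node = item.select_one('.WB_text.W_f14 + .comment_txt')
    # Skip list items that do not match the expected markup
    if nickname_node is None or content_node is None:
        continue
    reposts.append({'nickname': nickname_node.get_text(strip=True), 'content': content_node.get_text(strip=True)})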

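# Store the post and its reposts in a MySQL database
# 'hostname', the credentials, and the database name are placeholders; 3306 is MySQL's default port
db = pymysql.connect(host='hostname', port=3306, user='root', password='root', database='weibo', charset='utf8mb4')
cursor = db.cursor()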

# Store the post; a parameterized query lets the driver escape quotes in the text
now = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
sql = "INSERT INTO weibo_user (content, create_time) VALUES (%s, %s)"
cursor.execute(sql, (weibo_content, now))
db.commit()

# Store the reposts, linked to the post through its auto-generated id
weibo_id = cursor.lastrowid
sql = "INSERT INTO weibo_repost (weibo_id, nickname, content) VALUES (%s, %s, %s)"
for repost in reposts:
    cursor.execute(sql, (weibo_id, repost['nickname'], repost['content']))
db.commit()

db.close()
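
The inserts above assume that the weibo_user and weibo_repost tables already exist. A minimal sketch of creating them might look like the following. The column types, the id columns, and the create_tables helper are assumptions made for illustration; the original answer does not specify a schema. The id column is declared AUTO_INCREMENT because the example relies on cursor.lastrowid to link reposts to their post.

```python
# Hypothetical schema: column names come from the INSERT statements,
# but every type and constraint below is an assumption.
TABLE_DDL = [
    """
    CREATE TABLE IF NOT EXISTS weibo_user (
        id INT AUTO_INCREMENT PRIMARY KEY,
        content TEXT NOT NULL,
        create_time DATETIME NOT NULL
    ) DEFAULT CHARSET=utf8mb4
    """,
    """
    CREATE TABLE IF NOT EXISTS weibo_repost (
        id INT AUTO_INCREMENT PRIMARY KEY,
        weibo_id INT NOT NULL,
        nickname VARCHAR(255) NOT NULL,
        content TEXT NOT NULL,
        FOREIGN KEY (weibo_id) REFERENCES weibo_user (id)
    ) DEFAULT CHARSET=utf8mb4
    """,
]


def create_tables(cursor):
    """Run each CREATE TABLE statement on an open database cursor."""
    for ddl in TABLE_DDL:
        cursor.execute(ddl)
```

To use it, call create_tables(cursor) once, right after the cursor is created and before the first INSERT, then call db.commit().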

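Note that if the page content is only visible after logging in, you need to log in first. Also, avoid requesting the same site too frequently while crawling, otherwise the site may ban your IP.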
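
Both points can be sketched in a few lines. The example below is only one possible approach, under stated assumptions: it reuses a cookie string copied from a logged-in browser session instead of performing a real login, and it enforces a fixed minimum interval between requests. The names Throttle, make_session, and fetch_all are hypothetical helpers, and the cookie value and interval are placeholders.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to respect min_interval; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


def make_session(cookie, user_agent):
    """Build a requests.Session that sends the given Cookie header on every request."""
    import requests  # third-party; imported here so Throttle works without it

    session = requests.Session()
    session.headers.update({'User-Agent': user_agent, 'Cookie': cookie})
    return session


def fetch_all(session, urls, throttle, timeout=10):
    """Fetch each URL in turn, pausing between requests; return the HTML bodies."""
    pages = []
    for url in urls:
        throttle.wait()
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
        pages.append(response.text)
    return pages
```

A call might look like fetch_all(make_session('your-cookie-string', headers['User-Agent']), [url], Throttle(3.0)), where the cookie string and the 3-second interval are placeholders to adjust. Whether a copied cookie is enough to view a given page depends on the site.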
