```
import requests
from bs4 import BeautifulSoup
from pprint import pprint

for page in range(5):
    url = "https://movie.douban.com/subject/30391186/comments?start={}&limit=20&sort=new_score&status=P".format(page*20)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
    }
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")
    comments = []
    for comment in soup.select(".comment-item"):
        content = comment.select_one(".comment-content").get_text().strip()
        author = comment.select_one(".comment-info a").get_text().strip()
        likes = comment.select_one(".votes").get_text().strip()
        time = comment.select_one(".comment-time").get_text().strip()
        location = comment.select_one(".comment-location").contents[-1].strip()
        comments.append({
            "content": content,
            "author": author,
            "likes": likes,
            "time": time,
            "location": location
        })
    print("Data from page {}:".format(page+1))
    pprint(comments)
```
I want to store this data in a database, with host='localhost', user='root', password='root', database='pa'.
Posted: 2024-04-03 18:35:34
If you want to store the data in a MySQL database, you can follow these steps:
1. Import the MySQLdb module (on Python 3 this is provided by the mysqlclient package)
```
import MySQLdb
```
2. Create the database connection and open a cursor
```
conn = MySQLdb.connect(host='localhost', user='root', password='root', database='pa', charset='utf8')
cursor = conn.cursor()
```
3. Create the data table
```
cursor.execute('''
CREATE TABLE IF NOT EXISTS comments (
id INT(11) NOT NULL AUTO_INCREMENT,
content TEXT,
author VARCHAR(255),
likes INT(11),
time VARCHAR(255),
location VARCHAR(255),
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
''')
```
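As a hedged aside, the same DB-API 2.0 create-table pattern can be exercised without a running MySQL server by using the standard-library sqlite3 module; this is only an illustration of the cursor/execute flow, and note the SQLite dialect differences (no ENGINE/CHARSET clause, AUTOINCREMENT instead of AUTO_INCREMENT, generic TEXT/INTEGER types):

```
import sqlite3

# In-memory database: nothing touches disk, no server needed.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute('''
    CREATE TABLE IF NOT EXISTS comments (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        content TEXT,
        author TEXT,
        likes INTEGER,
        time TEXT,
        location TEXT
    )
''')
# Confirm the table exists by querying SQLite's catalog.
tables = [row[0] for row in cursor.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name='comments'")]
print(tables)  # ['comments']
conn.close()
```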
4. Insert the data
```
for comment in comments:
cursor.execute('''
INSERT INTO comments (content, author, likes, time, location)
VALUES (%s, %s, %s, %s, %s)
''', (comment['content'], comment['author'], comment['likes'], comment['time'], comment['location']))
```
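If many rows are inserted per page, DB-API cursors also support executemany(), which sends the whole batch in one call. A minimal sketch with stdlib sqlite3 (which uses "?" placeholders where MySQLdb uses "%s"; the two sample rows stand in for the scraped data):

```
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE comments (content TEXT, author TEXT, likes INTEGER, "
    "time TEXT, location TEXT)")

# Sample rows standing in for the scraped comments list.
comments = [
    {"content": "不错", "author": "a", "likes": "12",
     "time": "2024-04-01", "location": "北京"},
    {"content": "一般", "author": "b", "likes": "3",
     "time": "2024-04-02", "location": "上海"},
]
rows = [(c["content"], c["author"], c["likes"], c["time"], c["location"])
        for c in comments]
# One round trip for the whole batch instead of one execute() per row.
cursor.executemany(
    "INSERT INTO comments (content, author, likes, time, location) "
    "VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()
count = cursor.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
print(count)  # 2
conn.close()
```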
5. Commit the changes and close the connection
```
conn.commit()
cursor.close()
conn.close()
```
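A more defensive variant of steps 2-5 commits on success, rolls back on error, and always releases the connection. Sketched here with stdlib sqlite3 so it runs standalone; with MySQLdb the try/except/finally structure is identical:

```
import sqlite3

conn = sqlite3.connect(":memory:")
try:
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE comments (content TEXT, author TEXT)")
    cursor.execute("INSERT INTO comments (content, author) VALUES (?, ?)",
                   ("很好", "c"))
    conn.commit()  # only commit if every statement succeeded
    inserted = cursor.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
except Exception:
    conn.rollback()  # undo the partial work, then re-raise
    raise
finally:
    conn.close()  # always release the connection
```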
With that, your data will be stored in the MySQL database named "pa". Note that you need MySQL installed locally, and you must connect with the correct host, username, password, and database name.