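1. Crawl data from welan.com. 2. The collected content includes the book cover, title, ISBN, author, price, publication date, publisher, etc. 3. The data crawling, storage, and display steps are all required. 4. Crawling, storage, and display are controlled from a Flask web application. 5. The data is stored in MongoDB. Please write the code.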
Time: 2024-02-01 15:04:28 Views: 82
Sure. Based on your requirements, the following code is one possible implementation:
1. Crawl the data and store it in MongoDB
```python
import re

import requests
from bs4 import BeautifulSoup
from pymongo import MongoClient

# Connect to MongoDB (defaults to localhost:27017)
client = MongoClient()
db = client['welan']
collection = db['books']

# Fetch the page
url = 'http://www.welan.com/'
response = requests.get(url, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# NOTE: the tag and class names used below are assumptions about the page
# markup; check them against the actual HTML of the site.
book_list = soup.find_all('div', class_='booklist')


def field(book, css_class):
    """Return the text after the 'label:' prefix, or None if the tag is missing."""
    tag = book.find('span', class_=css_class)
    if tag is None:
        return None
    # Accept both the ASCII colon and the full-width colon
    return re.split('[::]', tag.text.strip(), maxsplit=1)[-1].strip()


# Store the data
for book in book_list:
    img = book.find('img')
    title = book.find('h3')
    data = {
        'img_url': img.get('src') if img else None,
        'book_name': title.text.strip() if title else None,
        'isbn': field(book, 'isbn'),
        'author': field(book, 'author'),
        'price': field(book, 'price'),
        'pub_date': field(book, 'pub-date'),
        'publisher': field(book, 'publisher'),
    }
    collection.insert_one(data)
```
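The crawler above stores every value as the raw string scraped from the page. A possible normalization step before inserting is sketched below. It assumes prices look something like `¥45.80` and that image paths may be relative to the site root; `parse_price` and `normalize_book` are hypothetical helper names, not part of the original answer.

```python
import re
from urllib.parse import urljoin

BASE_URL = 'http://www.welan.com/'  # site root used to resolve relative paths


def parse_price(text):
    """Return the first decimal number found in the text as a float, or None."""
    match = re.search(r'\d+(?:\.\d+)?', text or '')
    return float(match.group()) if match else None


def normalize_book(raw, base_url=BASE_URL):
    """Return a copy of a scraped record with a numeric price and an absolute image URL."""
    book = dict(raw)
    book['price'] = parse_price(raw.get('price'))
    src = raw.get('img_url')
    book['img_url'] = urljoin(base_url, src) if src else None
    return book
```

Calling `collection.insert_one(normalize_book(data))` instead of inserting `data` directly would make the price sortable as a number and keep the cover image usable in an `<img>` tag regardless of which page renders it.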
2. Display the data
```python
from flask import Flask, render_template
from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient()
db = client['welan']
collection = db['books']

app = Flask(__name__)


@app.route('/')
def index():
    # Load every stored book and pass the list to the template
    books = list(collection.find())
    return render_template('index.html', books=books)


if __name__ == '__main__':
    app.run()
```
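Requirement 4 asks for crawling, storage, and display to be controlled from the Flask application, whereas the crawler above runs as a separate script. One way to bring the steps under Flask is sketched below. The factory name `create_app`, the `/crawl` and `/books.json` routes, and the injected `crawl_func` are all assumptions made for illustration; `collection` can be the pymongo collection used above or any object exposing the same `update_one` and `find` methods.

```python
from flask import Flask, jsonify


def create_app(collection, crawl_func):
    """Build a Flask app whose routes trigger the crawl and expose the stored books.

    `crawl_func` is expected to return a list of book dicts that each carry
    an 'isbn' key; it is injected so the app can run without the live site.
    """
    app = Flask(__name__)

    @app.route('/crawl', methods=['POST'])
    def crawl():
        books = crawl_func()
        for book in books:
            # Upsert on ISBN so repeated crawls do not create duplicates
            collection.update_one(
                {'isbn': book['isbn']}, {'$set': book}, upsert=True
            )
        return jsonify({'stored': len(books)})

    @app.route('/books.json')
    def books_json():
        # Exclude MongoDB's ObjectId, which is not JSON serializable
        docs = list(collection.find({}, {'_id': 0}))
        return jsonify(docs)

    return app
```

With a real database this could be started as `create_app(MongoClient()['welan']['books'], crawl_books).run()`, where `crawl_books` is a hypothetical function wrapping the scraping code. Using `update_one` with `upsert=True` rather than `insert_one` is a design choice that keeps the collection free of duplicate records when the crawl is triggered more than once.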
3. Front-end template (saved as `templates/index.html`, where Flask's `render_template` looks for it)
```html
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Welan.com Books</title>
</head>
<body>
    <table>
        <thead>
            <tr>
                <th>Cover</th>
                <th>Title</th>
                <th>ISBN</th>
                <th>Author</th>
                <th>Price</th>
                <th>Publication Date</th>
                <th>Publisher</th>
            </tr>
        </thead>
        <tbody>
            {% for book in books %}
            <tr>
                <td><img src="{{ book.img_url }}" width="100"></td>
                <td>{{ book.book_name }}</td>
                <td>{{ book.isbn }}</td>
                <td>{{ book.author }}</td>
                <td>{{ book.price }}</td>
                <td>{{ book.pub_date }}</td>
                <td>{{ book.publisher }}</td>
            </tr>
            {% endfor %}
        </tbody>
    </table>
</body>
</html>
```
The above is a simple implementation that crawls book information from welan.com, stores it in MongoDB, and then displays it through Flask.