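Web scraping (content of your choice), saving the processed data to MySQL, visualizing the results with pyecharts, connecting the front end and back end with Flask, a home page built with Bootstrap, a login page, and pyecharts embedded in the page templates: what are the steps and code for this practice project?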
Posted: 2024-06-08 14:07:26
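Project steps:
1. Install the required libraries: requests, beautifulsoup4, pymysql, pyecharts, flask, flask-bootstrap, flask-wtf.
2. Design the table structure, then create the database and the table.
3. Write the crawler, which fetches data from the target page and saves it to the database.
4. Write a data processing function that extracts, cleans, and aggregates the data and returns the result.
5. Write the Flask application: the login, home, and data visualization pages and their view functions.
6. Embed the pyecharts chart in the template.
7. Start the Flask application and check the result.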
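Step 2 comes without code in this answer, so here is a minimal sketch of what it might look like. The database name `test`, the table `table_name`, and the columns `col1` to `col3` are the placeholders used elsewhere in this answer. The `id` column, the column types, and the `create_schema` helper are assumptions added for illustration.

```python
# Hypothetical schema for the placeholder names used in this answer
# (database `test`, table `table_name`, columns col1..col3).
CREATE_DATABASE_SQL = (
    "CREATE DATABASE IF NOT EXISTS test DEFAULT CHARACTER SET utf8mb4"
)

CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS table_name (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    col1 VARCHAR(255) NOT NULL,
    col2 VARCHAR(255),
    col3 VARCHAR(255)
) DEFAULT CHARSET = utf8mb4
"""


def create_schema(conn):
    """Run the DDL on an open DB-API connection, e.g. one from pymysql.connect()."""
    cursor = conn.cursor()
    try:
        cursor.execute(CREATE_DATABASE_SQL)
        cursor.execute("USE test")
        cursor.execute(CREATE_TABLE_SQL)
        conn.commit()
    finally:
        cursor.close()
```

Something like `create_schema(pymysql.connect(host='localhost', user='root', password='password'))` would run it once before the crawler is started. The column types should be adjusted to whatever the scraped data actually contains.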
The code is as follows:
1. Crawler program
```python
import requests
from bs4 import BeautifulSoup
import pymysql


# Fetch the page
def get_html(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    r = requests.get(url, headers=headers, timeout=10)
    r.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
    return r.content.decode('utf-8')


# Parse the page
def parse_html(html):
    soup = BeautifulSoup(html, 'html.parser')
    # Extract the data; each item must be a 3-tuple matching (col1, col2, col3)
    data = []
    # ...
    return data


# Save the data to the database
def save_data(data):
    conn = pymysql.connect(host='localhost', port=3306, user='root',
                           password='password', database='test', charset='utf8mb4')
    try:
        with conn.cursor() as cursor:
            sql = "INSERT INTO table_name (col1, col2, col3) VALUES (%s, %s, %s)"
            cursor.executemany(sql, data)  # one call for all rows
        conn.commit()
    finally:
        conn.close()  # close the connection even if the insert fails


if __name__ == '__main__':
    url = 'http://www.example.com'
    html = get_html(url)
    data = parse_html(html)
    save_data(data)
```
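The body of `parse_html` is left open above because it depends entirely on the target page. As an illustration only, the sketch below parses a made-up three-column HTML table. `SAMPLE_HTML`, the `items` table id, and the column meanings are all hypothetical, and a real page will need different selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; a real target page will have different markup.
SAMPLE_HTML = """
<table id="items">
  <tr><th>Name</th><th>Category</th><th>Price</th></tr>
  <tr><td>Widget</td><td>Tools</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>Toys</td><td>19.50</td></tr>
</table>
"""


def parse_html(html):
    """Return one (col1, col2, col3) tuple per data row of the sample table."""
    soup = BeautifulSoup(html, 'html.parser')
    data = []
    table = soup.find('table', id='items')
    if table is None:  # page layout changed or the table is missing
        return data
    for tr in table.find_all('tr'):
        cells = [td.get_text(strip=True) for td in tr.find_all('td')]
        if len(cells) == 3:  # skips the header row, which only has <th> cells
            data.append(tuple(cells))
    return data


rows = parse_html(SAMPLE_HTML)
```

Each tuple has the three values that the `INSERT` statement in `save_data` expects, so the result can be passed to it unchanged.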
2. Data processing function
```python
import pymysql


# Data processing function
def process_data():
    conn = pymysql.connect(host='localhost', port=3306, user='root',
                           password='password', database='test', charset='utf8mb4')
    try:
        with conn.cursor() as cursor:
            # Query the data: number of rows per distinct col1 value
            sql = "SELECT col1, COUNT(*) FROM table_name GROUP BY col1"
            cursor.execute(sql)
            results = cursor.fetchall()
    finally:
        conn.close()
    # Process the data into a list of (label, count) pairs
    data = []
    for row in results:
        data.append((row[0], row[1]))
    return data
```
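The query above does the counting in SQL and does no cleaning. If the scraped values need tidying first, the cleaning and counting could instead be done in Python. The sketch below is one possible approach and is not part of the original answer. `summarize` and `sample_rows` are hypothetical names, and the cleaning rule (strip whitespace, drop empty values) is an assumption.

```python
from collections import Counter


def summarize(rows):
    """Clean the first column of each row and count how often each value occurs."""
    cleaned = []
    for row in rows:
        value = (row[0] or '').strip()  # treat None as empty, trim whitespace
        if value:  # drop empty or missing values
            cleaned.append(value)
    # most_common() returns (value, count) pairs sorted by descending count
    return Counter(cleaned).most_common()


sample_rows = [(' Tools', 'a', 1), ('Toys', 'b', 2), ('Tools ', 'c', 3),
               (None, 'd', 4), ('', 'e', 5)]
summary = summarize(sample_rows)
print(summary)  # [('Tools', 2), ('Toys', 1)]
```

The result has the same shape as the return value of `process_data`, a list of `(label, count)` pairs, so it could feed the chart in the same way.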
3. Flask application
```python
from flask import Flask, render_template, redirect, url_for, flash, session
from flask_bootstrap import Bootstrap
from flask_wtf import FlaskForm
from wtforms import StringField, PasswordField, SubmitField
from wtforms.validators import DataRequired
from pyecharts.charts import Bar
from pyecharts import options as opts
from pyecharts.globals import ThemeType

# Assumption: the data processing function above is saved as data_processing.py
from data_processing import process_data

app = Flask(__name__)
app.config['SECRET_KEY'] = 'secret'  # use a long random value outside local testing
Bootstrap(app)


# Login form
class LoginForm(FlaskForm):
    username = StringField('Username', validators=[DataRequired()])
    password = PasswordField('Password', validators=[DataRequired()])  # masks the input
    submit = SubmitField('Login')


# Home page
@app.route('/')
def index():
    return render_template('index.html')


# Login
@app.route('/login', methods=['GET', 'POST'])
def login():
    form = LoginForm()
    if form.validate_on_submit():
        # Check the username and password (hard-coded demo credentials)
        if form.username.data == 'admin' and form.password.data == 'password':
            session['logged_in'] = True  # remember the login for later requests
            return redirect(url_for('dashboard'))
        else:
            flash('Invalid username or password.')
    return render_template('login.html', form=form)


# Data visualization
@app.route('/dashboard')
def dashboard():
    if not session.get('logged_in'):  # only logged-in users may see the chart
        return redirect(url_for('login'))
    data = process_data()
    # Draw the chart
    x_data = [item[0] for item in data]
    y_data = [item[1] for item in data]
    c = (
        Bar(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))
        .add_xaxis(x_data)
        .add_yaxis('Count', y_data)
        .set_global_opts(title_opts=opts.TitleOpts(title='Data Visualization'))
    )
    # render_embed() returns the chart as an HTML string for the template
    return render_template('dashboard.html', chart=c.render_embed())


if __name__ == '__main__':
    app.run(debug=True)
```
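The login view compares the submitted password with a hard-coded plain-text string, which is only suitable for a demo. A common alternative is to store a salted hash and verify against it. The sketch below uses only the standard library (PBKDF2 via `hashlib`). The `USERS` dictionary, the helper names, and the iteration count are assumptions for illustration, not part of the original answer.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Hypothetical in-memory user store standing in for a real users table.
USERS = {'admin': hash_password('password')}


def check_login(username, password):
    """Return True only if the user exists and the password matches."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, digest = record
    return verify_password(password, salt, digest)
```

In the `login` view, the condition could then become `if check_login(form.username.data, form.password.data):`. Werkzeug, which is installed with Flask, also provides `generate_password_hash` and `check_password_hash` in `werkzeug.security` for the same purpose.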
4. Template files
index.html
```html
{% extends 'base.html' %}
{% block title %}Home{% endblock %}
{% block content %}
<div class="container">
  <h1>Welcome to My Website</h1>
  <p>Please login to access the dashboard.</p>
  <a href="{{ url_for('login') }}" class="btn btn-primary">Login</a>
</div>
{% endblock %}
```
login.html
```html
{% extends 'base.html' %}
{% block title %}Login{% endblock %}
{% block content %}
<div class="container">
  <h1>Login</h1>
  <form method="POST" action="{{ url_for('login') }}">
    {{ form.hidden_tag() }}
    <div class="form-group">
      {{ form.username.label }}{{ form.username(class='form-control', placeholder='Enter username') }}
    </div>
    <div class="form-group">
      {{ form.password.label }}{{ form.password(class='form-control', placeholder='Enter password') }}
    </div>
    <button type="submit" class="btn btn-primary">Login</button>
  </form>
</div>
{% endblock %}
```
dashboard.html
```html
{% extends 'base.html' %}
{% block title %}Dashboard{% endblock %}
{% block content %}
<div class="container">
  <h1>Data Visualization</h1>
  {# `chart` is the HTML string returned by pyecharts' render_embed(). In pyecharts 1.x and later it already includes the chart container, the ECharts script tag and the init code, so no manual require.config or setOption call is needed. #}
  {{ chart | safe }}
</div>
{% endblock %}
```
base.html
```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>{% block title %}{% endblock %} - My Website</title>
  <link rel="stylesheet" href="{{ url_for('static', filename='css/bootstrap.min.css') }}">
  {% block head %}{% endblock %}
</head>
<body>
  <nav class="navbar navbar-expand-lg navbar-light bg-light">
    <div class="container">
      <a class="navbar-brand" href="{{ url_for('index') }}">My Website</a>
    </div>
  </nav>
  {% with messages = get_flashed_messages() %}
  {% if messages %}
  <div class="container">
    <br>
    {% for message in messages %}
    <div class="alert alert-danger">{{ message }}</div>
    {% endfor %}
  </div>
  {% endif %}
  {% endwith %}
  {% block content %}{% endblock %}
  <script src="{{ url_for('static', filename='js/jquery.min.js') }}"></script>
  <script src="{{ url_for('static', filename='js/bootstrap.min.js') }}"></script>
  {% block script %}{% endblock %}
</body>
</html>
```