Under Python 3.9, write an example program that uses tkinter to design the system interface, crawls data (for example titles and links) from a user-supplied website, stores the crawled content in a database, reads the data back from the database for processing, and plots it with matplotlib.
Sure, I can help you with that.
First, make sure the required libraries are available. `tkinter` and `sqlite3` ship with the standard CPython 3.9 distribution, so only `beautifulsoup4` and `matplotlib` need to be installed with pip (on some Linux distributions tkinter is packaged separately, e.g. as `python3-tk`). You can install the third-party libraries from a terminal (or command prompt) with:
```
pip install beautifulsoup4 matplotlib
```
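If you want a quick sanity check that the interpreter and libraries are in place before running the example, a minimal sketch (not part of the example itself) is:
```python
import sys
print(sys.version)   # should report 3.9.x

# each import raises ImportError if the corresponding library is missing
import tkinter
import sqlite3
import bs4
import matplotlib
print("all required libraries are available")
```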
Next, we can start writing the code. First, import the required libraries:
```python
import tkinter as tk
import urllib.request
from bs4 import BeautifulSoup
import sqlite3
import matplotlib.pyplot as plt
```
Then we can create a GUI window for entering the website to crawl:
```python
root = tk.Tk()
root.geometry("400x200")

url_label = tk.Label(root, text="Enter the URL of the website to crawl:")
url_label.pack()

url_entry = tk.Entry(root)
url_entry.pack()

def crawl():
    pass  # the crawling code goes here; it is filled in step by step below

crawl_button = tk.Button(root, text="Start crawling", command=crawl)
crawl_button.pack()

root.mainloop()
```
In the `crawl()` function, we use `urllib.request` to download the page's HTML source and `beautifulsoup4` to parse it and extract the titles and links:
```python
def crawl():
    url = url_entry.get()
    html = urllib.request.urlopen(url).read().decode('utf-8')
    soup = BeautifulSoup(html, 'html.parser')
    title_links = []
    # keep only anchors that carry both a title attribute and an href
    for link in soup.find_all('a'):
        title = link.get('title')
        href = link.get('href')
        if title and href:
            title_links.append((title, href))
    # store the data in the database here (added in the next step)
```
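The snippet above assumes the download always succeeds. In practice, some sites reject urllib's default User-Agent or simply time out, so you may want to wrap the fetch in a small helper with error handling. A minimal sketch, where the header value, the 10-second timeout and the `fetch_html` name are illustrative choices rather than part of the original answer:
```python
import urllib.request
import urllib.error
from tkinter import messagebox

def fetch_html(url):
    # send a browser-like User-Agent and bound the wait with a timeout
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.read().decode('utf-8', errors='replace')
    except (urllib.error.URLError, ValueError) as exc:
        # report the problem in the GUI instead of crashing the button callback
        messagebox.showerror("Crawl failed", f"Could not fetch {url}: {exc}")
        return None
```
`crawl()` could then call `html = fetch_html(url)` and simply return when it gets back `None`.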
Next, we use the `sqlite3` module to create a database and store the crawled data in it:
```python
def crawl():
    url = url_entry.get()
    html = urllib.request.urlopen(url).read().decode('utf-8')
    soup = BeautifulSoup(html, 'html.parser')
    title_links = []
    for link in soup.find_all('a'):
        title = link.get('title')
        href = link.get('href')
        if title and href:
            title_links.append((title, href))
    # store the crawled data in the database
    conn = sqlite3.connect('data.db')
    c = conn.cursor()
    c.execute('''CREATE TABLE IF NOT EXISTS websites
                 (title text, link text)''')
    for title, link in title_links:
        c.execute("INSERT INTO websites VALUES (?, ?)", (title, link))
    conn.commit()
    conn.close()
    # read the data back from the database and process it (added in the next step)
```
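As a side note, the insert step can be written a little more compactly: an sqlite3 connection works as a context manager that commits the surrounding transaction on success (and rolls back on error), and `executemany()` inserts the whole list in one call. A minimal sketch of the same storage step, with `store_links` being a name introduced here for illustration:
```python
import sqlite3

def store_links(title_links, db_path='data.db'):
    # the connection context manager commits on success and rolls back on error;
    # it does not close the connection, so we still close it explicitly
    conn = sqlite3.connect(db_path)
    with conn:
        conn.execute('''CREATE TABLE IF NOT EXISTS websites
                        (title text, link text)''')
        conn.executemany("INSERT INTO websites VALUES (?, ?)", title_links)
    conn.close()
```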
Finally, we use `matplotlib` to draw a bar chart of the crawled data:
```python
def crawl():
    url = url_entry.get()
    html = urllib.request.urlopen(url).read().decode('utf-8')
    soup = BeautifulSoup(html, 'html.parser')
    title_links = []
    for link in soup.find_all('a'):
        title = link.get('title')
        href = link.get('href')
        if title and href:
            title_links.append((title, href))
    # store the crawled data in the database
    conn = sqlite3.connect('data.db')
    c = conn.cursor()
    c.execute('''CREATE TABLE IF NOT EXISTS websites
                 (title text, link text)''')
    for title, link in title_links:
        c.execute("INSERT INTO websites VALUES (?, ?)", (title, link))
    conn.commit()
    conn.close()
    # read the data back from the database and process it
    conn = sqlite3.connect('data.db')
    c = conn.cursor()
    c.execute("SELECT title, COUNT(*) FROM websites GROUP BY title")
    results = c.fetchall()
    labels = [result[0] for result in results]
    counts = [result[1] for result in results]
    conn.close()
    # draw the bar chart
    plt.bar(labels, counts)
    plt.xticks(rotation=90)
    plt.show()
```
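`plt.show()` opens the chart in a separate matplotlib window. If you would rather embed the chart inside the tkinter interface itself, matplotlib ships a Tk backend for exactly that. A minimal sketch, where the figure size, the packing options and the `show_chart_in_window` name are arbitrary illustrative choices:
```python
import tkinter as tk
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg

def show_chart_in_window(parent, labels, counts):
    # build the figure with the object-oriented API instead of pyplot
    fig = Figure(figsize=(6, 4))
    ax = fig.add_subplot(111)
    ax.bar(labels, counts)
    ax.tick_params(axis='x', labelrotation=90)
    # attach the figure to a Tk widget and place it in the window
    canvas = FigureCanvasTkAgg(fig, master=parent)
    canvas.draw()
    canvas.get_tk_widget().pack(fill=tk.BOTH, expand=True)
```
Inside `crawl()` you would then call `show_chart_in_window(root, labels, counts)` instead of the three `plt` lines.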
The complete code looks like this:
```python
import tkinter as tk
import urllib.request
from bs4 import BeautifulSoup
import sqlite3
import matplotlib.pyplot as plt

root = tk.Tk()
root.geometry("400x200")

url_label = tk.Label(root, text="Enter the URL of the website to crawl:")
url_label.pack()

url_entry = tk.Entry(root)
url_entry.pack()

def crawl():
    url = url_entry.get()
    html = urllib.request.urlopen(url).read().decode('utf-8')
    soup = BeautifulSoup(html, 'html.parser')
    title_links = []
    for link in soup.find_all('a'):
        title = link.get('title')
        href = link.get('href')
        if title and href:
            title_links.append((title, href))

    # store the crawled data in the database
    conn = sqlite3.connect('data.db')
    c = conn.cursor()
    c.execute('''CREATE TABLE IF NOT EXISTS websites
                 (title text, link text)''')
    for title, link in title_links:
        c.execute("INSERT INTO websites VALUES (?, ?)", (title, link))
    conn.commit()
    conn.close()

    # read the data back from the database and process it
    conn = sqlite3.connect('data.db')
    c = conn.cursor()
    c.execute("SELECT title, COUNT(*) FROM websites GROUP BY title")
    results = c.fetchall()
    labels = [result[0] for result in results]
    counts = [result[1] for result in results]
    conn.close()

    # draw the bar chart
    plt.bar(labels, counts)
    plt.xticks(rotation=90)
    plt.show()

crawl_button = tk.Button(root, text="Start crawling", command=crawl)
crawl_button.pack()

root.mainloop()
```
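One caveat: `urlopen` blocks, so the window freezes until the page has finished downloading. A common workaround is to run the download and database work in a background thread and hand the plotting back to the Tk main loop with `root.after()`. A rough sketch, assuming it is inserted just before the `root.mainloop()` call in the full program above (so `root`, `url_entry`, `crawl_button`, `urllib`, `BeautifulSoup`, `sqlite3` and `plt` are already in scope); `start_crawl`, `background_work` and `plot_results` are names introduced here for illustration:
```python
import threading

def start_crawl():
    url = url_entry.get()
    # run the blocking download + database work off the Tk main thread
    threading.Thread(target=background_work, args=(url,), daemon=True).start()

def background_work(url):
    html = urllib.request.urlopen(url).read().decode('utf-8')
    soup = BeautifulSoup(html, 'html.parser')
    title_links = [(a.get('title'), a.get('href'))
                   for a in soup.find_all('a')
                   if a.get('title') and a.get('href')]
    conn = sqlite3.connect('data.db')
    with conn:
        conn.execute('CREATE TABLE IF NOT EXISTS websites (title text, link text)')
        conn.executemany('INSERT INTO websites VALUES (?, ?)', title_links)
    conn.close()
    # tkinter and pyplot are not thread-safe, so schedule plotting on the main loop
    root.after(0, plot_results)

def plot_results():
    conn = sqlite3.connect('data.db')
    rows = conn.execute("SELECT title, COUNT(*) FROM websites GROUP BY title").fetchall()
    conn.close()
    plt.bar([row[0] for row in rows], [row[1] for row in rows])
    plt.xticks(rotation=90)
    plt.show()

crawl_button.config(command=start_crawl)  # point the existing button at the threaded version
```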
I hope this example helps you better understand how to use Python and these libraries to build a GUI, crawl website data, store it in a database, and plot it as a chart.