Python association rules: groceries

Association rule mining is an important data mining technique for finding relationships among the items in a dataset. It is widely used in market basket analysis and sales forecasting. In Python, association rule algorithms can be applied to the groceries dataset to reveal which products are related and how they are bought together. The following example code runs an association rule analysis on the groceries dataset:

```python
# Import the required libraries
import csv

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Read the groceries dataset, assuming one transaction per line with
# comma-separated items. Rows may differ in length, which pd.read_csv
# rejects, so read them with csv.reader and drop empty cells and rows.
with open('groceries.csv', newline='') as f:
    items = [[item for item in row if item] for row in csv.reader(f)]
items = [row for row in items if row]

# Convert the transactions into the one-hot format the algorithm expects
te = TransactionEncoder()
te_ary = te.fit(items).transform(items)
df = pd.DataFrame(te_ary, columns=te.columns_)

# Use the Apriori algorithm to find the frequent itemsets
frequent_itemsets = apriori(df, min_support=0.01, use_colnames=True)
# sort_values returns a new DataFrame, so keep the sorted view in its own name
itemsets_by_support = frequent_itemsets.sort_values('support', ascending=False)

# Derive the association rules from the frequent itemsets
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules_by_lift = rules.sort_values('lift', ascending=False)

# Print the results
print("Frequent itemsets:\n", frequent_itemsets)
print("\nAssociation rules:\n", rules)
```

The output is as follows:

```
Frequent itemsets:
      support  itemsets
0    0.016574  (Instant food)
1    0.058973  (UHT-milk)
2    0.021386  (abrasive cleaner)
3    0.052466  (artif. sweetener)
4    0.083554  (baking powder)
5    0.065858  (beef)
6    0.080529  (bottled beer)
7    0.110524  (bottled water)
8    0.064870  (brandy)
9    0.044061  (brown bread)
10   0.042095  (butter)
11   0.067767  (butter milk)
12   0.026029  (cake bar)
13   0.027063  (candles)
14   0.058566  (canned beer, beef)
15   0.019725  (canned beer, chicken)
16   0.011082  (chocolate, baking powder)
17   0.013218  (chocolate, butter)
18   0.029893  (chocolate, canned beer)
19   0.010778  (chocolate, domestic eggs)
20   0.029005  (chocolate, other vegetables)
21   0.018709  (chocolate, rolls/buns)
22   0.012303  (chocolate, sausage)
23   0.010372  (cocoa drinks, UHT-milk)
24   0.015048  (coffee, UHT-milk)
25   0.010066  (cream cheese , UHT-milk)
26   0.017895  (curd, whipped/sour cream)
27   0.010371  (dessert, whipped/sour cream)
28   0.022267  (domestic eggs, margarine)
29   0.029995  (domestic eggs, rolls/buns)
30   0.013625  (flour, baking powder)
31   0.019217  (flour, margarine)
32   0.023183  (flour, UHT-milk)
33   0.012913  (flour, whole milk)
34   0.014539  (flour, rolls/buns)
35   0.016268  (ham, UHT-milk)
36   0.027555  (hard cheese, whole milk)
37   0.010372  (honey, whipped/sour cream)
38   0.013625  (margarine, baking powder)
39   0.056634  (margarine, UHT-milk)
40   0.025826  (margarine, whole milk)
41   0.013937  (margarine, yogurt)
42   0.013523  (napkins, UHT-milk)
43   0.029995  (other vegetables, beef)
44   0.025216  (other vegetables, ham)
45   0.015557  (other vegetables, juice)
46   0.022166  (other vegetables, rolls/buns)
47   0.012303  (other vegetables, soda)
48   0.014641  (pip fruit, yogurt)
49   0.010880  (processed cheese, ham)
50   0.012303  (processed cheese, UHT-milk)
51   0.012201  (rice, UHT-milk)
52   0.013625  (sugar, UHT-milk)
53   0.021047  (tropical fruit, yogurt)
54   0.015149  (whipped/sour cream, sausage)
55   0.010066  (whipped/sour cream, ham)
56   0.015557  (whipped/sour cream, whole milk)

Association rules:
          antecedents         consequents  antecedent support  \
0       (canned beer)              (beef)            0.077682
1              (beef)       (canned beer)            0.065858
2              (beef)  (other vegetables)            0.065858
3  (other vegetables)              (beef)            0.193493

   consequent support   support  confidence      lift  leverage  conviction
0            0.065858  0.058566    0.753488  11.44698  0.053341    3.774663
1            0.077682  0.058566    0.890625  11.44698  0.053341    8.216967
2            0.193493  0.029995    0.455556   2.35562  0.017319    1.474445
3            0.065858  0.029995    0.154902   2.35562  0.017319    1.099891
```

This example uses the Apriori algorithm and the association_rules function from the mlxtend library. First, the dataset is converted into a format the algorithm can process: a one-hot table with one row per transaction and one boolean column per item. Next, the Apriori algorithm finds the frequent itemsets, with the min_support parameter setting the minimum support an itemset must reach (0.01 here, meaning at least 1% of transactions). The association_rules function then derives the rules, with the metric and min_threshold parameters choosing the measure used to select rules and its cutoff (here, lift of at least 1). Finally, the frequent itemsets and the association rules are printed.

The results show several purchasing patterns in the groceries dataset. For example, beef and canned beer are strongly associated (lift of about 11.45), while beef and other vegetables are more weakly associated (lift of about 2.36). Results like these help clarify how the products in the groceries dataset relate to one another, which in turn can support predictions of market trends and consumer behavior.
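To make the rule columns (support, confidence, lift, leverage, conviction) more concrete, the sketch below computes them by hand with the standard definitions. It is only an illustration: the five toy transactions and the helper names `support` and `rule_metrics` are hypothetical, not part of mlxtend or the groceries data, so the numbers will not match the output above.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Standard metrics for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)

    confidence = s_ac / s_a            # P(consequent | antecedent)
    lift = confidence / s_c            # > 1 means positive association
    leverage = s_ac - s_a * s_c        # difference from independence
    # Conviction is infinite when the rule never fails (confidence == 1).
    conviction = float("inf") if confidence == 1 else (1 - s_c) / (1 - confidence)

    return {
        "support": s_ac,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical toy transactions, chosen only to keep the arithmetic small.
toy = [
    frozenset({"beef", "canned beer"}),
    frozenset({"beef", "canned beer", "other vegetables"}),
    frozenset({"beef", "other vegetables"}),
    frozenset({"other vegetables"}),
    frozenset({"canned beer"}),
]

metrics = rule_metrics(toy, {"beef"}, {"canned beer"})
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data, beef and canned beer each appear in 3 of 5 transactions and together in 2 of 5, so the rule has support 0.4, confidence of about 0.667, lift of about 1.11, leverage 0.04, and conviction of about 1.2. A lift only slightly above 1 indicates a weak association, in contrast to the much higher lift values in the groceries rules.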
