```python
from apriori import *

# Write code to extract association rules
dataset = load_data('data/apriori.txt')


# Define generate_big_rules to derive the association rules
def generate_big_rules(L, support_data, min_conf):
    big_rule_list = []
    sub_set_list = []
    for i in range(0, len(L)):
        for freq_set in L[i]:
            # freq_set: ('B4'), ('B4', 'C4', 'H4')
            for sub_set in sub_set_list:
                # ******** BEGIN ********
                if sub_set.issubset(freq_set):
                    # compute the confidence
                    # antecedent, consequent, support, confidence
                    pass  # exercise blank, to be filled in
                # ********* END *********
            sub_set_list.append(freq_set)
    return big_rule_list


def task():
    L, support_data = generate_L(dataset, k=4, min_support=0.06)
    # Find association rules from the frequent itemsets, with the confidence threshold set to 0.75
    big_rules_list = generate_big_rules(L, support_data, min_conf=0.75)
    return big_rules_list
```

Time: 2023-07-19 20:55:35 Views: 124

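This code extracts association rules. It works as follows:

1. `load_data` from apriori.py reads the discretized dataset from a file.
2. `generate_big_rules` derives the association rules. Its parameters are the frequent itemsets `L`, the support lookup `support_data`, and the minimum confidence `min_conf`.
3. Inside the function, two nested loops walk through every frequent itemset `freq_set` in `L`. A third loop walks through `sub_set_list`, which holds the frequent itemsets visited so far (not only subsets of `freq_set`), and checks whether each one is a subset of `freq_set`. If it is, the confidence of the rule `freq_set - sub_set -> sub_set` is computed, and the rule is saved when it reaches `min_conf`.
4. The function returns the rule list `big_rule_list`.
5. `task` calls `generate_L` to obtain the frequent itemsets `L` and `support_data`, then calls `generate_big_rules` and returns the resulting `big_rules_list`.

Note that `generate_L` is not defined in the snippet. It has to come from somewhere else, presumably the apriori module pulled in by `from apriori import *`.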
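One possible way to fill in the BEGIN/END blank is sketched below. It is not the exercise's official answer: the tuple layout `(antecedent, consequent, support, confidence)` is an assumption based on the comment in the skeleton, and `L` and `support_data` are hand-built from a small made-up dataset (`[[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]` with minimum support 0.5) so that the block runs without the apriori module.

```python
def generate_big_rules(L, support_data, min_conf):
    """Return rules as (antecedent, consequent, support, confidence) tuples."""
    big_rule_list = []
    sub_set_list = []  # frequent itemsets visited so far
    for i in range(len(L)):
        for freq_set in L[i]:
            for sub_set in sub_set_list:
                if sub_set.issubset(freq_set):
                    antecedent = freq_set - sub_set
                    if not antecedent:
                        continue  # guard against duplicate itemsets in L
                    # conf(X -> Y) = support(X u Y) / support(X)
                    conf = support_data[freq_set] / support_data[antecedent]
                    rule = (antecedent, sub_set, support_data[freq_set], conf)
                    if conf >= min_conf and rule not in big_rule_list:
                        big_rule_list.append(rule)
            sub_set_list.append(freq_set)
    return big_rule_list


# Hand-built inputs (assumed example data, not the exercise's dataset).
fs = frozenset
L = [
    [fs({1}), fs({2}), fs({3}), fs({5})],
    [fs({1, 3}), fs({2, 3}), fs({2, 5}), fs({3, 5})],
    [fs({2, 3, 5})],
]
support_data = {
    fs({1}): 0.5, fs({2}): 0.75, fs({3}): 0.75, fs({5}): 0.75,
    fs({1, 3}): 0.5, fs({2, 3}): 0.5, fs({2, 5}): 0.75, fs({3, 5}): 0.5,
    fs({2, 3, 5}): 0.5,
}

rules = generate_big_rules(L, support_data, min_conf=0.75)
for antecedent, consequent, support, conf in rules:
    print(set(antecedent), '->', set(consequent), 'support:', support, 'conf:', conf)
```

The antecedent lookup in `support_data` relies on the Apriori property: every subset of a frequent itemset is itself frequent, so its support has already been recorded.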

```python
def create_C1(dataset):
    C1 = []
    for transaction in dataset:
        for item in transaction:
            if [item] not in C1:
                C1.append([item])
    C1.sort()
    return list(map(frozenset, C1))

# print(len(create_C1(dataset)))


def scan_D(D, Ck, min_support):
    ssCnt = {}
    for tid in D:
        for can in Ck:
            if can.issubset(tid):
                if can not in ssCnt:
                    ssCnt[can] = 1
                else:
                    ssCnt[can] += 1
    num_items = float(len(D))
    retList = []
    support_data = {}
    for key in ssCnt:
        support = ssCnt[key] / num_items
        if support >= min_support:
            retList.insert(0, key)
        support_data[key] = support
    return retList, support_data


def apriori_gen(Lk, k):
    retList = []
    len_Lk = len(Lk)
    for i in range(len_Lk):
        for j in range(i + 1, len_Lk):
            # Sort before slicing so the (k-2)-item prefixes are comparable
            L1 = sorted(Lk[i])[:k - 2]
            L2 = sorted(Lk[j])[:k - 2]
            if L1 == L2:
                retList.append(Lk[i] | Lk[j])
    return retList


def apriori(dataset, min_support):
    C1 = create_C1(dataset)
    D = list(map(set, dataset))
    L1, support_data = scan_D(D, C1, min_support)
    L = [L1]
    k = 2
    while len(L[k - 2]) > 0:
        Ck = apriori_gen(L[k - 2], k)
        Lk, support_k = scan_D(D, Ck, min_support)
        support_data.update(support_k)
        L.append(Lk)
        k += 1
    return L, support_data
```

If I want to add confidence to this code, how do I do it?

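Adding confidence to this code amounts to implementing association rule generation on top of Apriori. It takes two steps:

1. Generate candidate rules from the frequent itemsets `L`: for each frequent itemset, use its non-empty proper subsets as candidate consequents.
2. Compute the confidence of each candidate rule and keep the rules that meet the minimum confidence.

The modified code:

```python
def generate_rules(L, support_data, min_conf):
    rules_list = []
    # L[0] holds 1-itemsets, which cannot form rules, so start at index 1
    for i in range(1, len(L)):
        for freq_set in L[i]:
            H1 = [frozenset([item]) for item in freq_set]
            if i > 1:
                rules_from_conseq(freq_set, H1, support_data, rules_list, min_conf)
            else:
                calc_conf(freq_set, H1, support_data, rules_list, min_conf)
    return rules_list


def calc_conf(freq_set, H, support_data, brl, min_conf):
    pruned_h = []
    for conseq in H:
        # conf(A -> B) = support(A u B) / support(A), with A = freq_set - conseq
        conf = support_data[freq_set] / support_data[freq_set - conseq]
        if conf >= min_conf:
            print(freq_set - conseq, '-->', conseq, 'conf:', conf)
            brl.append((freq_set - conseq, conseq, conf))
            pruned_h.append(conseq)
    return pruned_h


def rules_from_conseq(freq_set, H, support_data, brl, min_conf):
    m = len(H[0])
    if len(freq_set) > m:
        # Test the current consequents first, so that single-item
        # consequents of larger itemsets are not skipped
        Hmp1 = calc_conf(freq_set, H, support_data, brl, min_conf)
        if len(Hmp1) > 1:
            # Merge the surviving consequents into candidates of size m + 1
            Hmp1 = apriori_gen(Hmp1, m + 1)
            rules_from_conseq(freq_set, Hmp1, support_data, brl, min_conf)


def apriori(dataset, min_support, min_conf):
    C1 = create_C1(dataset)
    D = list(map(set, dataset))
    L1, support_data = scan_D(D, C1, min_support)
    L = [L1]
    k = 2
    while len(L[k - 2]) > 0:
        Ck = apriori_gen(L[k - 2], k)
        Lk, support_k = scan_D(D, Ck, min_support)
        support_data.update(support_k)
        L.append(Lk)
        k += 1
    rules_list = generate_rules(L, support_data, min_conf)
    return L, support_data, rules_list
```

Here `generate_rules` produces the association rules, `calc_conf` computes the confidence of each rule and keeps those that meet the threshold, `rules_from_conseq` builds progressively larger candidate consequents from a frequent itemset, and `apriori` runs the Apriori algorithm followed by rule generation and returns the frequent itemsets, the support data, and the rule list. When calling `apriori`, pass both the minimum support and the minimum confidence thresholds.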
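The quantity behind the confidence check is conf(A -> B) = support(A ∪ B) / support(A). A minimal sketch that computes it straight from raw transactions can help spot-check individual rules. The helper names `support` and `confidence` and the sample dataset are illustrative assumptions, not part of the code above.

```python
def support(dataset, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for transaction in dataset if itemset <= set(transaction))
    return hits / len(dataset)


def confidence(dataset, antecedent, consequent):
    """conf(A -> B) = support(A u B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(dataset, both) / support(dataset, antecedent)


# Assumed sample data for illustration
dataset = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]

print('conf(3 -> 5) =', confidence(dataset, {3}, {5}))
print('conf(2 -> 5) =', confidence(dataset, {2}, {5}))
```

For rule 3 -> 5, items 3 and 5 occur together in two of the four transactions and item 3 occurs in three, so the confidence is 2/3. This brute-force version rescans the dataset for every rule, which is why the Apriori-based code reuses the supports stored in `support_data` instead.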

Given the following transaction records:

| Transaction ID | Items |
| --- | --- |
| #1 | apple, banana, coca-cola, doughnut |
| #2 | banana, coca-cola |
| #3 | banana, doughnut |
| #4 | apple, coca-cola |
| #5 | apple, banana, doughnut |
| #6 | apple, banana, coca-cola |

1. Build the FP-tree using a minimum support min_sup = 2. Show how the tree evolves for each transaction.
2. Use the FP-Growth algorithm to discover frequent itemsets from the FP-tree.
3. Use the Apriori algorithm on the same dataset and verify that it generates the same set of frequent itemsets with min_sup = 2.
4. Suppose that {Apple, Banana, Doughnut} is a frequent itemset. Derive all its association rules with min_confidence = 70%.

Building the FP-tree:

First count each item (treating "coco-cola" in transaction #2 as a typo for coca-cola): banana 5, apple 4, coca-cola 4, doughnut 3. All four meet min_sup = 2. Sort the items of every transaction by descending count, breaking the apple/coca-cola tie alphabetically: banana, apple, coca-cola, doughnut.

Transaction ID #1: banana, apple, coca-cola, doughnut
```
root
  banana:1
    apple:1
      coca-cola:1
        doughnut:1
```
Transaction ID #2: banana, coca-cola
```
root
  banana:2
    apple:1
      coca-cola:1
        doughnut:1
    coca-cola:1
```
Transaction ID #3: banana, doughnut
```
root
  banana:3
    apple:1
      coca-cola:1
        doughnut:1
    coca-cola:1
    doughnut:1
```
Transaction ID #4: apple, coca-cola
```
root
  banana:3
    apple:1
      coca-cola:1
        doughnut:1
    coca-cola:1
    doughnut:1
  apple:1
    coca-cola:1
```
Transaction ID #5: banana, apple, doughnut
```
root
  banana:4
    apple:2
      coca-cola:1
        doughnut:1
      doughnut:1
    coca-cola:1
    doughnut:1
  apple:1
    coca-cola:1
```
Transaction ID #6: banana, apple, coca-cola
```
root
  banana:5
    apple:3
      coca-cola:2
        doughnut:1
      doughnut:1
    coca-cola:1
    doughnut:1
  apple:1
    coca-cola:1
```

Using the FP-Growth algorithm to discover frequent itemsets:

Mine the items from least to most frequent, building a conditional pattern base for each.

- doughnut: the prefix paths are {banana, apple, coca-cola}:1, {banana, apple}:1, {banana}:1. Within this base banana has count 3, apple 2, and coca-cola 1, so coca-cola is dropped. Frequent itemsets: {doughnut} (3), {banana, doughnut} (3), {apple, doughnut} (2), {apple, banana, doughnut} (2).
- coca-cola: the prefix paths are {banana, apple}:2, {banana}:1, {apple}:1. Banana and apple both have count 3. Frequent itemsets: {coca-cola} (4), {banana, coca-cola} (3), {apple, coca-cola} (3), {apple, banana, coca-cola} (2).
- apple: the only non-empty prefix path is {banana}:3. Frequent itemsets: {apple} (4), {apple, banana} (3).
- banana: it has no prefix. Frequent itemset: {banana} (5).

All frequent itemsets with minimum support of 2 are:

- {banana} (5)
- {apple} (4)
- {coca-cola} (4)
- {doughnut} (3)
- {apple, banana} (3)
- {apple, coca-cola} (3)
- {banana, coca-cola} (3)
- {banana, doughnut} (3)
- {apple, doughnut} (2)
- {apple, banana, coca-cola} (2)
- {apple, banana, doughnut} (2)

Using the Apriori algorithm to verify the frequent itemsets with minimum support of 2:

1-itemsets:

- {apple} (4)
- {banana} (5)
- {coca-cola} (4)
- {doughnut} (3)

2-itemsets:

- {apple, banana} (3)
- {apple, coca-cola} (3)
- {apple, doughnut} (2)
- {banana, coca-cola} (3)
- {banana, doughnut} (3)
- {coca-cola, doughnut} (1), below min_sup and discarded

3-itemsets: joining the frequent 2-itemsets gives four candidates. {apple, coca-cola, doughnut} and {banana, coca-cola, doughnut} are pruned without counting because they contain the infrequent {coca-cola, doughnut}. The remaining two are counted:

- {apple, banana, coca-cola} (2)
- {apple, banana, doughnut} (2)

The only 4-itemset candidate, {apple, banana, coca-cola, doughnut}, also contains {coca-cola, doughnut} and is pruned.

Apriori therefore yields the same 11 frequent itemsets as FP-Growth.

Deriving all association rules with 70% minimum confidence for the frequent itemset {Apple, Banana, Doughnut}:

The itemset has support count 2. Each non-empty proper subset can serve as an antecedent, and the confidence of a rule is the support count of the whole itemset divided by the support count of the antecedent:

- {Apple, Banana} -> {Doughnut} (2/3 = 67%)
- {Apple, Doughnut} -> {Banana} (2/2 = 100%)
- {Banana, Doughnut} -> {Apple} (2/3 = 67%)
- {Apple} -> {Banana, Doughnut} (2/4 = 50%)
- {Banana} -> {Apple, Doughnut} (2/5 = 40%)
- {Doughnut} -> {Apple, Banana} (2/3 = 67%)

Only one rule reaches the 70% minimum confidence:

- {Apple, Doughnut} -> {Banana} (100%)
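Hand-derived support counts and confidences for this transaction table can be checked by brute force. The sketch below enumerates every itemset with `itertools.combinations`, which is feasible only because there are four distinct items. It assumes "coco-cola" in transaction #2 is a typo for coca-cola, and the variable and function names are illustrative.

```python
from itertools import combinations

# Transactions #1 to #6 ("coco-cola" in #2 is assumed to mean coca-cola)
transactions = [
    {"apple", "banana", "coca-cola", "doughnut"},
    {"banana", "coca-cola"},
    {"banana", "doughnut"},
    {"apple", "coca-cola"},
    {"apple", "banana", "doughnut"},
    {"apple", "banana", "coca-cola"},
]
MIN_SUP = 2      # absolute support count
MIN_CONF = 0.7


def count(itemset):
    """Number of transactions containing every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


# Enumerate all itemsets and keep those meeting the minimum support
items = sorted(set().union(*transactions))
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        c = count(frozenset(combo))
        if c >= MIN_SUP:
            frequent[frozenset(combo)] = c

# Rules from one itemset: every non-empty proper subset as antecedent
target = frozenset({"apple", "banana", "doughnut"})
rules = []
for size in range(1, len(target)):
    for combo in combinations(sorted(target), size):
        antecedent = frozenset(combo)
        conf = count(target) / count(antecedent)
        if conf >= MIN_CONF:
            rules.append((antecedent, target - antecedent, conf))

for itemset, c in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), c)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), '->', sorted(consequent), f'{conf:.0%}')
```

Exhaustive enumeration grows exponentially with the number of items, so it is only a verification aid for small examples, not a replacement for Apriori or FP-Growth.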
