Implementing the Apriori algorithm in Python
Date: 2024-02-20 13:57:11
Sure, here is Python code that implements the Apriori algorithm:
```
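# Import the required library
from itertools import combinations


# Define a generator function that implements the Apriori algorithm
def apriori(transactions, support_threshold):
    """Yield one dict per level k, mapping each frequent k-itemset to its count.

    support_threshold is an absolute count (minimum number of transactions).
    """
    # Store each transaction as a frozenset so an item is counted once per transaction
    transactions = [frozenset(t) for t in transactions]

    # Count occurrences of each single item, keyed as a 1-itemset
    itemsets = {}
    for transaction in transactions:
        for item in transaction:
            key = frozenset([item])
            itemsets[key] = itemsets.get(key, 0) + 1

    # Keep the items whose count is at least support_threshold
    frequent_itemsets = {s: c for s, c in itemsets.items() if c >= support_threshold}

    # Iteratively compute larger frequent itemsets
    k = 1
    while frequent_itemsets:
        yield frequent_itemsets
        # Generate candidate (k+1)-itemsets by joining pairs of frequent k-itemsets
        candidate_itemsets = {}
        for a, b in combinations(frequent_itemsets, 2):
            candidate = a | b
            # Keep the union only if it has k+1 items and every one of its
            # k-subsets is frequent (the Apriori pruning step)
            if len(candidate) == k + 1 and all(
                frozenset(s) in frequent_itemsets for s in combinations(candidate, k)
            ):
                candidate_itemsets[candidate] = 0
        # Count occurrences of each candidate
        for transaction in transactions:
            for candidate in candidate_itemsets:
                if candidate <= transaction:
                    candidate_itemsets[candidate] += 1
        # Keep the candidates whose count is at least support_threshold
        frequent_itemsets = {s: c for s, c in candidate_itemsets.items() if c >= support_threshold}
        k += 1


# Test
transactions = [
    ['apple', 'banana', 'orange'],
    ['banana', 'orange'],
    ['apple', 'pear'],
    ['banana', 'pear'],
    ['apple', 'banana', 'pear']
]
support_threshold = 3
for frequent_itemset in apriori(transactions, support_threshold):
    # Show each itemset as a sorted tuple for readable output
    print({tuple(sorted(s)): c for s, c in frequent_itemset.items()})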
```
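This code uses the combinations function from Python's built-in itertools module to generate candidate itemsets. The function lazily yields every combination of a given length; for example, list(combinations(['a', 'b', 'c'], 2)) returns [('a', 'b'), ('a', 'c'), ('b', 'c')]. Here it is used to pair up the frequent itemsets of size k found in the previous iteration, and the union of each pair is kept as a candidate only when it has size k+1.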
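The candidate-generation step described above can be sketched in isolation. This is a minimal illustration, not a definitive implementation: the helper name `generate_candidates` and the sample itemsets are hypothetical, and the sketch assumes itemsets are stored as frozensets so that a union does not depend on item order. It also applies the standard Apriori pruning rule, which discards a candidate if any of its k-subsets is not frequent.

```python
from itertools import combinations


def generate_candidates(frequent_itemsets, k):
    """Join pairs of frequent k-itemsets into candidate (k+1)-itemsets.

    Hypothetical helper for illustration only.
    """
    frequent = {frozenset(s) for s in frequent_itemsets}
    candidates = set()
    for a, b in combinations(frequent, 2):
        union = a | b
        # The two itemsets must differ in exactly one item to give k+1 items
        if len(union) != k + 1:
            continue
        # Pruning: every k-subset of a candidate must itself be frequent
        if all(frozenset(sub) in frequent for sub in combinations(union, k)):
            candidates.add(union)
    return candidates


# combinations is lazy, so wrap it in list() to see the tuples
pairs = list(combinations(['a', 'b', 'c'], 2))
print(pairs)

# Hypothetical frequent 2-itemsets
frequent_2 = [('a', 'b'), ('a', 'c'), ('b', 'c'), ('b', 'd')]
print(generate_candidates(frequent_2, 2))
```

In this sample, {'a', 'b', 'd'} and {'b', 'c', 'd'} are pruned because ('a', 'd') and ('c', 'd') are not among the frequent 2-itemsets, so only {'a', 'b', 'c'} survives as a candidate.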