Implementing the Apriori algorithm in Python
Date: 2023-06-18 14:05:41 Views: 99
Here is example code that implements the Apriori algorithm in Python:
```
from itertools import combinations


def generate_candidates(itemsets, k):
    # Join step: union pairs of frequent (k-1)-itemsets into candidate k-itemsets
    candidates = set()
    for itemset1 in itemsets:
        for itemset2 in itemsets:
            union = itemset1 | itemset2
            if len(union) == k:
                candidates.add(union)
    return candidates


def prune(itemsets, candidates, k):
    # Prune step: keep a candidate k-itemset only if all of its
    # (k-1)-subsets are frequent
    itemsets = set(itemsets)
    pruned_candidates = set()
    for candidate in candidates:
        subsets = combinations(candidate, k - 1)
        if all(frozenset(subset) in itemsets for subset in subsets):
            pruned_candidates.add(candidate)
    return pruned_candidates


def apriori(transactions, min_support=0.5):
    # Apriori algorithm: returns {k: {itemset: support}}
    n = len(transactions)
    # Level-1 candidates: every distinct item as a 1-itemset
    candidates = {frozenset([item]) for transaction in transactions for item in transaction}
    k = 1
    frequent_itemsets = {}
    while candidates:
        # Count how many transactions contain each candidate
        item_counts = {candidate: 0 for candidate in candidates}
        for transaction in transactions:
            for candidate in candidates:
                if candidate.issubset(transaction):
                    item_counts[candidate] += 1
        # Keep only the candidates that reach the minimum support
        frequent = {itemset: count / n for itemset, count in item_counts.items() if count / n >= min_support}
        if not frequent:
            break
        frequent_itemsets[k] = frequent
        # Build and prune the candidates for the next level
        candidates = prune(frequent, generate_candidates(frequent, k + 1), k + 1)
        k += 1
    return frequent_itemsets
```
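The input is a list of transactions, where each transaction is a set of items. The function apriori returns a dictionary whose keys are the itemset size k and whose values map each frequent itemset of that size to its support.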
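To make the support values in that dictionary concrete, here is a minimal sketch of how the support of a single itemset can be computed. The helper name `support` and the `demo` data are hypothetical and used only for illustration; the sketch assumes each transaction is a Python set.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`.

    `support` is a hypothetical helper for illustration only.
    """
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for transaction in transactions if itemset.issubset(transaction))
    return hits / len(transactions)


# Made-up data: 'a' appears in 2 of 3 transactions, {'a', 'b'} in 1 of 3
demo = [{'a', 'b'}, {'a'}, {'b', 'c'}]
print(support({'a'}, demo))
print(support({'a', 'b'}, demo))
```

An itemset counts as frequent when this value is at least min_support.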
For example, given the following list of transactions:
```
transactions = [
    {'apple', 'banana', 'orange'},
    {'banana', 'pear'},
    {'pear', 'orange'},
    {'apple', 'banana', 'pear'},
    {'apple', 'pear'}
]
```
Call the apriori function with a minimum support of 0.5:
```
frequent_itemsets = apriori(transactions, min_support=0.5)
```
It returns the following frequent itemsets (the order of the entries may vary):
```
{
    1: {
        frozenset({'banana'}): 0.6,
        frozenset({'pear'}): 0.8,
        frozenset({'apple'}): 0.6
    }
}
```
This means banana, pear, and apple are frequent items, with supports of 0.6, 0.8, and 0.6 respectively. orange has a support of 0.4, which is below the 0.5 threshold, so it is not frequent. The pairs (banana, pear), (banana, apple), and (pear, apple) each have a support of only 0.4, so no 2-itemset is frequent and the result has no entry for k = 2.
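The supports for this example can be cross-checked by brute force. The following is a minimal sketch, not part of the Apriori code: it enumerates every 1-itemset and 2-itemset directly and compares each support against the 0.5 threshold. The helper name `brute_force_supports` is hypothetical.

```python
from itertools import combinations

transactions = [
    {'apple', 'banana', 'orange'},
    {'banana', 'pear'},
    {'pear', 'orange'},
    {'apple', 'banana', 'pear'},
    {'apple', 'pear'},
]


def brute_force_supports(transactions, size):
    """Support of every itemset of the given size (hypothetical helper)."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    return {
        frozenset(combo): sum(1 for t in transactions if set(combo) <= t) / n
        for combo in combinations(items, size)
    }


for size in (1, 2):
    for itemset, value in brute_force_supports(transactions, size).items():
        status = 'frequent' if value >= 0.5 else 'not frequent'
        print(sorted(itemset), value, status)
```

Enumerating every combination is only practical for tiny inputs like this one; Apriori avoids that cost by extending only itemsets that are already frequent.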