Apriori algorithm: a Python implementation
Date: 2023-10-04 17:05:53
Below is a Python implementation of the Apriori algorithm:
```python
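# Import the required library
from itertools import combinations


# Generate the frequent itemsets, level by level
def generate_frequent_itemsets(transactions, min_support):
    # Convert each transaction to a set once, so subset tests are cheap
    # and a repeated item inside one transaction is counted only once
    transactions = [set(transaction) for transaction in transactions]

    item_counts = {}  # number of transactions containing each single item
    for transaction in transactions:
        for item in transaction:
            key = frozenset([item])
            item_counts[key] = item_counts.get(key, 0) + 1

    # Drop the items that do not meet the minimum support
    itemsets = {itemset for itemset, count in item_counts.items() if count >= min_support}
    frequent_itemsets = [itemsets]
    k = 2
    while True:
        # Generate candidate k-itemsets by joining pairs of frequent (k-1)-itemsets
        candidate_itemsets = {a | b for a, b in combinations(itemsets, 2) if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset
        candidate_itemsets = {
            candidate for candidate in candidate_itemsets
            if all(frozenset(subset) in itemsets for subset in combinations(candidate, k - 1))
        }
        # Count the support of each candidate itemset
        item_counts = {}
        for transaction in transactions:
            for candidate in candidate_itemsets:
                if candidate <= transaction:
                    item_counts[candidate] = item_counts.get(candidate, 0) + 1
        # Drop the itemsets that do not meet the minimum support
        frequent_items = {itemset for itemset, count in item_counts.items() if count >= min_support}
        # Stop once no itemset of size k meets the minimum support
        if not frequent_items:
            break
        itemsets = frequent_items
        frequent_itemsets.append(itemsets)
        k += 1
    return frequent_itemsets

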
# Test code
transactions = [
['apple', 'banana', 'orange'],
['banana', 'orange'],
['apple', 'banana'],
['apple', 'orange'],
['apple', 'banana', 'orange', 'pear'],
['apple', 'banana', 'pear'],
['pear']
]
frequent_itemsets = generate_frequent_itemsets(transactions, 3)
print(frequent_itemsets)
```
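Here `transactions` is a list of transactions, each of which is a list of items. `min_support` is the minimum support, given as an absolute count rather than a fraction: an itemset is considered frequent only if it appears in at least that many transactions. The function returns a list containing all the frequent itemsets, grouped by itemset size.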
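
To make the role of `min_support` concrete, here is a minimal sketch that counts the support of a single itemset by hand and converts a fractional threshold into an absolute count. The helper names `support_count` and `to_absolute_support` are hypothetical and are not part of the implementation above; the small tolerance used when rounding is also an assumption, added only to guard against floating-point noise.

```python
import math


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset` (hypothetical helper)."""
    target = set(itemset)
    return sum(1 for transaction in transactions if target <= set(transaction))


def to_absolute_support(fraction, transactions):
    """Convert a relative support in [0, 1] to a transaction count, rounded up (hypothetical helper)."""
    # Subtract a tiny tolerance so that, e.g., 3.0000000000000004 is not rounded up to 4
    return math.ceil(fraction * len(transactions) - 1e-9)


transactions = [
    ['apple', 'banana', 'orange'],
    ['banana', 'orange'],
    ['apple', 'banana'],
    ['apple', 'orange'],
    ['apple', 'banana', 'orange', 'pear'],
    ['apple', 'banana', 'pear'],
    ['pear'],
]

# {'apple', 'banana'} appears in 4 of the 7 transactions
print(support_count(['apple', 'banana'], transactions))  # 4
# A 40% threshold over 7 transactions corresponds to a count of 3
print(to_absolute_support(0.4, transactions))  # 3
```

With a threshold of 3, the pair `apple`/`banana` (support 4) counts as frequent, whereas a pair that appears in only two transactions, such as `apple`/`pear`, does not.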