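Given the following transaction set (you may also design your own transaction dataset):

| TID | Items |
|-----|-------|
| 001 | a, c, d, f, n |
| 002 | b, c, d, e, f, i, k |
| 003 | d, e, f, g, m |
| 004 | b, f, p, s |
| 005 | c, d, f, s |
| 006 | a, b, c, e, h, o |

(2) Implement the frequent itemset generation step of the Apriori algorithm in Python, and use the dataset above to output the frequent itemsets.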
Posted: 2023-09-13 08:10:18
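The following Python code implements the frequent itemset generation step of the Apriori algorithm and prints the frequent itemsets for the dataset above: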
```python
from itertools import combinations


# Generate candidate k-itemsets from the frequent (k-1)-itemsets
def create_candidates(prev_frequent, k):
    prev = sorted(prev_frequent)  # each itemset is a sorted tuple
    prev_set = set(prev)
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            # Join step: merge two itemsets that share their first k-2 items
            if prev[i][:k - 2] == prev[j][:k - 2]:
                candidate = tuple(sorted(set(prev[i]) | set(prev[j])))
                # Prune step: every (k-1)-subset must itself be frequent
                if all(sub in prev_set for sub in combinations(candidate, k - 1)):
                    candidates.append(candidate)
    return candidates


# Compute support and keep the candidates that reach min_support
def calculate_support(data_set, candidates, min_support):
    support = {}
    for candidate in candidates:
        # Count the transactions that contain every item of the candidate
        count = sum(1 for transaction in data_set if set(candidate).issubset(transaction))
        if count / len(data_set) >= min_support:
            support[candidate] = count / len(data_set)
    return support


# Generate all frequent itemsets, one level (itemset size) at a time
def apriori(data_set, min_support):
    # Convert the transactions to a list of sets
    data_set = [set(transaction) for transaction in data_set]
    # Initial candidates: every distinct item as a 1-itemset (no duplicates)
    items = sorted({item for transaction in data_set for item in transaction})
    candidates = [(item,) for item in items]
    frequent_sets = {}
    k = 1
    # Iterate until no candidates or no frequent itemsets remain
    while candidates:
        support = calculate_support(data_set, candidates, min_support)
        if not support:
            break
        # Record the frequent k-itemsets with their support
        frequent_sets[k] = support
        # Build the candidate (k+1)-itemsets
        candidates = create_candidates(support.keys(), k + 1)
        k += 1
    return frequent_sets


# Test
data_set = [['a','c','d','f','n'], ['b','c','d','e','f','i','k'], ['d','e','f','g','m'], ['b','f','p','s'], ['c','d','f','s'], ['a','b','c','e','h','o']]
min_support = 0.5
frequent_sets = apriori(data_set, min_support)
print("Frequent itemsets:")
for k in frequent_sets:
    print("k=", k, ":")
    for itemset in frequent_sets[k]:
        print(list(itemset), frequent_sets[k][itemset])
```
With a minimum support of 0.5 (an itemset must appear in at least 3 of the 6 transactions), the output is:
```
Frequent itemsets:
k= 1 :
['b'] 0.5
['c'] 0.6666666666666666
['d'] 0.6666666666666666
['e'] 0.5
['f'] 0.8333333333333334
k= 2 :
['c', 'd'] 0.5
['c', 'f'] 0.5
['d', 'f'] 0.6666666666666666
k= 3 :
['c', 'd', 'f'] 0.5
```
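One way to cross-check an Apriori result on a dataset this small is to enumerate every possible itemset directly and count its support. The sketch below does that for the six transactions from the question; the helper name `brute_force_frequent` is hypothetical, introduced only for this illustration, and the approach is exponential in the number of items, so it is only practical for toy data.

```python
from itertools import combinations

# The same six transactions as in the question
transactions = [
    {'a', 'c', 'd', 'f', 'n'},
    {'b', 'c', 'd', 'e', 'f', 'i', 'k'},
    {'d', 'e', 'f', 'g', 'm'},
    {'b', 'f', 'p', 's'},
    {'c', 'd', 'f', 's'},
    {'a', 'b', 'c', 'e', 'h', 'o'},
]


def brute_force_frequent(transactions, min_support):
    """Hypothetical helper: test every itemset of every size against min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for k in range(1, len(items) + 1):
        level = {}
        for candidate in combinations(items, k):
            # Support = fraction of transactions containing the whole candidate
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                level[candidate] = count / n
        if not level:
            # If no k-itemset is frequent, no larger itemset can be frequent either
            break
        result[k] = level
    return result


reference = brute_force_frequent(transactions, 0.5)
for k, level in reference.items():
    print(k, sorted(level))
# 1 [('b',), ('c',), ('d',), ('e',), ('f',)]
# 2 [('c', 'd'), ('c', 'f'), ('d', 'f')]
# 3 [('c', 'd', 'f')]
```

Items such as `a` and `s` appear in only 2 of the 6 transactions (support about 0.33), so they fall below the 0.5 threshold and cannot be part of any frequent itemset.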