Apriori Algorithm Code
Time: 2023-07-03 09:31:26
Below is a Python implementation of the Apriori algorithm:
```python
# Load a small sample dataset: each inner list is one transaction.
def load_dataset():
    return [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]


# Build the candidate 1-itemsets C1 (one frozenset per distinct item).
def create_c1(dataset):
    c1 = []
    for transaction in dataset:
        for item in transaction:
            if [item] not in c1:
                c1.append([item])
    c1.sort()
    # frozenset is hashable, so itemsets can be used as dictionary keys.
    return list(map(frozenset, c1))


# Count the transactions that contain the given itemset.
def support_count(itemset, dataset):
    count = 0
    for transaction in dataset:
        if itemset.issubset(transaction):
            count += 1
    return count


# Filter the candidates in c down to the frequent itemsets
# and record the support of each frequent itemset.
def create_lk(dataset, c, min_support):
    item_count = {}
    for transaction in dataset:
        for itemset in c:
            if itemset.issubset(transaction):
                item_count[itemset] = item_count.get(itemset, 0) + 1
    n = len(dataset)
    l = []
    support_data = {}
    for itemset in item_count:
        # Support is the fraction of transactions containing the itemset.
        support = item_count[itemset] / n
        if support >= min_support:
            l.append(itemset)
            support_data[itemset] = support
    return l, support_data


# Join frequent (k-1)-itemsets that share their first k-2 items
# (in sorted order) into candidate k-itemsets.
def merge(lk, k):
    merged = []
    n = len(lk)
    for i in range(n):
        for j in range(i + 1, n):
            # Sort before slicing: a frozenset has no guaranteed iteration order.
            l1 = sorted(lk[i])[:k - 2]
            l2 = sorted(lk[j])[:k - 2]
            if l1 == l2:
                merged.append(lk[i] | lk[j])
    return merged


# Run Apriori: grow itemsets level by level until no frequent itemset remains.
def apriori(dataset, min_support):
    c1 = create_c1(dataset)
    l1, support_data = create_lk(dataset, c1, min_support)
    l = [l1]
    k = 2
    # l[k - 2] holds the frequent (k-1)-itemsets; stop once that level is empty.
    while len(l[k - 2]) > 0:
        ck = merge(l[k - 2], k)
        lk, support_k = create_lk(dataset, ck, min_support)
        support_data.update(support_k)
        l.append(lk)
        k += 1
    return l, support_data
```
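Before running Apriori, load the dataset (for example with load_dataset) and pass it to apriori together with a minimum support threshold between 0 and 1. The apriori function first calls create_c1 to build the initial candidate set C1, which contains one itemset per distinct item. create_lk then keeps only the candidates whose support, that is, the fraction of transactions containing the itemset, is at least min_support, and records the support of each frequent itemset. Next, merge joins the frequent (k-1)-itemsets into candidate k-itemsets, and the loop repeats until no more frequent itemsets can be generated. The result is the list l, which holds one list of frequent itemsets per itemset size (the final entry is empty, since that is what ends the loop), and the dictionary support_data, which maps each frequent itemset to its support.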
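The steps described above can be condensed into a short, self-contained sketch that runs on the same sample dataset. It is not the listing above: the function name frequent_itemsets is hypothetical, the join is a simpler union-based one, and it adds an explicit prune step (dropping any candidate that has an infrequent (k-1)-subset), which the merge function above does not perform.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support.

    Hypothetical helper for illustration; assumes a non-empty transaction list.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    result = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets with exactly k items.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Keep only the candidates that meet the support threshold.
        current = {c for c in candidates if support(c) >= min_support}
        result.update({c: support(c) for c in current})
        k += 1
    return result


dataset = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]
result = frequent_itemsets(dataset, min_support=0.5)
# Print itemsets ordered by size, then by content.
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), result[itemset])
```

With min_support=0.5 on this dataset, the sketch finds nine frequent itemsets; the largest is {2, 3, 5}, which appears in two of the four transactions and so has support 0.5. Item 4 appears in only one transaction and is dropped at the first level.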