Python Apriori algorithm example
Date: 2023-10-20 17:18:22
Below is a simple Python implementation of the Apriori algorithm for finding frequent itemsets:
```python
def load_dataset():
    """Return a small sample dataset; each inner list is one transaction."""
    return [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]


def create_candidates(dataset):
    """Build the candidate 1-itemsets as a sorted list of frozensets."""
    items = sorted({item for transaction in dataset for item in transaction})
    return [frozenset([item]) for item in items]


def scan_dataset(dataset, candidates, min_support):
    """Count each candidate's support and keep those meeting min_support."""
    support_counts = {}
    for transaction in dataset:
        for candidate in candidates:
            if candidate.issubset(transaction):
                support_counts[candidate] = support_counts.get(candidate, 0) + 1
    num_transactions = len(dataset)
    frequent_items = []
    support_data = {}
    for itemset, count in support_counts.items():
        support = count / num_transactions
        if support >= min_support:
            frequent_items.append(itemset)
        # Record the support of every counted candidate, frequent or not.
        support_data[itemset] = support
    return frequent_items, support_data


def join_sets(itemsets, length):
    """Join (k-1)-itemsets pairwise, keeping unions of exactly `length` items."""
    joined = {i | j for i in itemsets for j in itemsets if len(i | j) == length}
    # Sort so that candidates are scanned in a reproducible order.
    return sorted(joined, key=sorted)


def apriori(dataset, min_support=0.5):
    """Return the frequent itemsets per level and the computed support values."""
    candidates = create_candidates(dataset)
    frequent_items, support_data = scan_dataset(dataset, candidates, min_support)
    frequent_itemsets = [frequent_items]
    k = 2
    # Stop once the previous level produced no frequent itemsets.
    while len(frequent_itemsets[k - 2]) > 0:
        candidate_sets = join_sets(frequent_itemsets[k - 2], k)
        frequent_items_k, support_data_k = scan_dataset(dataset, candidate_sets, min_support)
        support_data.update(support_data_k)
        frequent_itemsets.append(frequent_items_k)
        k += 1
    return frequent_itemsets, support_data


dataset = load_dataset()
frequent_itemsets, support_data = apriori(dataset, min_support=0.5)
print("Frequent Itemsets:\n", frequent_itemsets)
print("Support Data:\n", support_data)
```
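The output is as follows. Itemset order within a level depends on the order in which candidates are scanned, the final empty list is the level at which no frequent itemsets were found, and the support data also includes candidates that fell below the threshold:
```
Frequent Itemsets:
 [[frozenset({1}), frozenset({3}), frozenset({2}), frozenset({5})], [frozenset({1, 3}), frozenset({2, 3}), frozenset({2, 5}), frozenset({3, 5})], [frozenset({2, 3, 5})], []]
Support Data:
 {frozenset({1}): 0.5, frozenset({3}): 0.75, frozenset({4}): 0.25, frozenset({2}): 0.75, frozenset({5}): 0.75, frozenset({1, 3}): 0.5, frozenset({2, 3}): 0.5, frozenset({2, 5}): 0.75, frozenset({3, 5}): 0.5, frozenset({1, 2}): 0.25, frozenset({1, 5}): 0.25, frozenset({2, 3, 5}): 0.5, frozenset({1, 2, 3}): 0.25, frozenset({1, 3, 5}): 0.25}
```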
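The support values can be cross-checked by brute force. The following is a minimal sketch, not part of the original example: it enumerates every non-empty itemset of the same four transactions and keeps those whose support is at least 0.5. The names `support` and `brute_force_frequent` are hypothetical helpers introduced only for this check.

```python
from itertools import combinations

# Same four transactions as load_dataset() in the example.
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
min_support = 0.5  # same threshold as the example


def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if set(itemset) <= t)
    return hits / len(transactions)


def brute_force_frequent():
    """Enumerate every non-empty itemset and keep the frequent ones."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(combo)
            if s >= min_support:
                frequent[combo] = s
    return frequent


for itemset, s in brute_force_frequent().items():
    print(itemset, s)
```

This enumeration is exponential in the number of distinct items, so it is only practical for tiny datasets like this one. Apriori avoids that cost by building each level only from the frequent itemsets of the previous level.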
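In the implementation, `load_dataset()` loads the dataset, `create_candidates()` builds the candidate 1-itemsets, `scan_dataset()` computes the support of each candidate itemset, `join_sets()` combines (k-1)-itemsets into k-itemsets, and `apriori()` runs the Apriori algorithm. In this example the dataset is a list in which each element represents one transaction, and the algorithm outputs the frequent itemsets at each level together with their support.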
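`join_sets()` keeps every pairwise union of the right size and leaves the filtering to the next dataset scan. Textbook Apriori usually adds a pruning step before that scan: a k-itemset can only be frequent if all of its (k-1)-subsets are frequent. The sketch below illustrates that step under the assumption that it would be applied to the output of `join_sets()`; `prune_candidates` is a hypothetical helper, not a function from the example.

```python
from itertools import combinations


def prune_candidates(candidates, prev_frequent):
    """Drop candidates that have at least one infrequent (k-1)-subset.

    Hypothetical helper for illustration; not part of the example code.
    """
    prev = set(prev_frequent)
    kept = []
    for candidate in candidates:
        k = len(candidate)
        subsets = (frozenset(c) for c in combinations(candidate, k - 1))
        if all(s in prev for s in subsets):
            kept.append(candidate)
    return kept


# Frequent 2-itemsets of the sample dataset at min_support=0.5.
frequent_2 = [frozenset(p) for p in [(1, 3), (2, 3), (2, 5), (3, 5)]]
# 3-item candidates obtained by joining those 2-itemsets pairwise.
candidates_3 = [frozenset(c) for c in [(1, 2, 3), (1, 3, 5), (2, 3, 5)]]

survivors = prune_candidates(candidates_3, frequent_2)
print(survivors)
```

Only {2, 3, 5} survives: {1, 2, 3} contains the infrequent pair {1, 2}, and {1, 3, 5} contains the infrequent pair {1, 5}. Pruning does not change the final result, but it reduces the number of candidates that have to be counted against the dataset.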