Write Python code that implements the Apriori algorithm.
Posted: 2024-01-18 11:51:18
The following is sample Python code that implements the Apriori algorithm:
```python
def load_data_set():
    """Return a small sample data set: one list of items per transaction."""
    data_set = [['apple', 'beer', 'rice', 'chicken'],
                ['apple', 'beer', 'rice'],
                ['apple', 'beer'],
                ['apple', 'banana', 'chicken'],
                ['apple', 'banana']]
    return data_set


def create_c1(data_set):
    """Build C1, the candidate 1-itemsets, as a sorted list of frozensets."""
    c1 = []
    for transaction in data_set:
        for item in transaction:
            if [item] not in c1:
                c1.append([item])
    c1.sort()
    # frozenset is hashable, so itemsets can be used as dictionary keys
    return list(map(frozenset, c1))


def scan_data_set(data_set, candidate_set, min_support):
    """Count each candidate's support and keep those meeting min_support."""
    support_count = {}
    for transaction in data_set:
        for candidate in candidate_set:
            if candidate.issubset(transaction):
                support_count[candidate] = support_count.get(candidate, 0) + 1
    num_items = float(len(data_set))
    frequent_set = []
    support_data = {}
    for key in support_count:
        support = support_count[key] / num_items
        if support >= min_support:
            frequent_set.append(key)
        # Record the support of every counted candidate, frequent or not
        support_data[key] = support
    return frequent_set, support_data


def apriori_gen(frequent_set, k):
    """Join frequent (k-1)-itemsets that share their first k-2 items."""
    new_frequent_set = []
    len_frequent_set = len(frequent_set)
    for i in range(len_frequent_set):
        for j in range(i + 1, len_frequent_set):
            # Sort before slicing: frozenset iteration order is arbitrary,
            # so slicing an unsorted list would compare unrelated prefixes
            l1 = sorted(frequent_set[i])[:k - 2]
            l2 = sorted(frequent_set[j])[:k - 2]
            if l1 == l2:
                new_frequent_set.append(frequent_set[i] | frequent_set[j])
    return new_frequent_set


def apriori(data_set, min_support=0.5):
    """Run Apriori; return the frequent itemsets per size and all supports."""
    candidate_set = create_c1(data_set)
    frequent_set_1, support_data = scan_data_set(data_set, candidate_set, min_support)
    frequent_set = [frequent_set_1]
    k = 2
    # frequent_set[k - 2] holds the frequent (k-1)-itemsets; stop when it is empty
    while len(frequent_set[k - 2]) > 0:
        candidate_set = apriori_gen(frequent_set[k - 2], k)
        frequent_set_k, support_k = scan_data_set(data_set, candidate_set, min_support)
        support_data.update(support_k)
        frequent_set.append(frequent_set_k)
        k += 1
    # The last element of frequent_set is the empty list that ended the loop
    return frequent_set, support_data


data_set = load_data_set()
frequent_set, support_data = apriori(data_set, min_support=0.5)
print(frequent_set)
print(support_data)
```
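In the code above, `load_data_set` loads the data set, `create_c1` builds the candidate 1-itemsets `C1`, `scan_data_set` scans the data set and returns the frequent itemsets together with a support dictionary, `apriori_gen` generates the candidate `k`-itemsets from the frequent `(k-1)`-itemsets, and `apriori` is the main body of the Apriori algorithm.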
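The candidate generation described here only joins itemsets; it does not apply the pruning step that Apriori is usually described with, where a candidate is discarded if any of its `(k-1)`-item subsets is not frequent. The following is a minimal sketch of such a step. The helper names `has_infrequent_subset` and `prune_candidates`, and the itemsets in the demo, are hypothetical and are not part of the original answer.

```python
from itertools import combinations


def has_infrequent_subset(candidate, prev_frequent):
    """Return True if any (k-1)-item subset of `candidate` is not frequent."""
    k = len(candidate)
    return any(frozenset(subset) not in prev_frequent
               for subset in combinations(candidate, k - 1))


def prune_candidates(candidates, prev_frequent):
    """Keep candidates whose (k-1)-subsets are all frequent; drop duplicates."""
    prev = set(prev_frequent)
    kept = []
    for candidate in candidates:
        if candidate not in kept and not has_infrequent_subset(candidate, prev):
            kept.append(candidate)
    return kept


# Hypothetical frequent 2-itemsets and candidate 3-itemsets, for illustration only
l2 = [frozenset(pair) for pair in (('a', 'b'), ('a', 'c'), ('b', 'c'), ('b', 'd'))]
c3 = [frozenset('abc'), frozenset('bcd')]

# {'b', 'c', 'd'} is dropped because its subset {'c', 'd'} is not in l2
pruned = prune_candidates(c3, l2)
print(pruned)
```

Pruning does not change the final result, because `scan_data_set` would reject such candidates anyway; it only reduces the number of candidates that have to be counted against the data set.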
Calling `apriori` runs the whole algorithm, after which the frequent itemsets and their support values are printed to the console.
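Printing the raw structures gives nested lists of `frozenset` objects, which can be hard to read. As a rough sketch, the support dictionary could be formatted one itemset per line. The function name `format_frequent` is hypothetical, and the dictionary below is hand-written for illustration: its values are the supports of those itemsets in the five sample transactions (for example, `beer` appears in 3 of 5, so 0.6).

```python
def format_frequent(support_data, min_support):
    """Return one 'items: support' line per itemset meeting min_support."""
    rows = [(sorted(itemset), support)
            for itemset, support in support_data.items()
            if support >= min_support]
    # Order by itemset size first, then alphabetically by items
    rows.sort(key=lambda row: (len(row[0]), row[0]))
    return ['{}: {:.2f}'.format(', '.join(items), support)
            for items, support in rows]


# Hand-written stand-in for the dictionary a run on the sample data would produce
support_data = {
    frozenset(['apple']): 1.0,
    frozenset(['beer']): 0.6,
    frozenset(['apple', 'beer']): 0.6,
    frozenset(['rice']): 0.4,
}

lines = format_frequent(support_data, min_support=0.5)
for line in lines:
    print(line)
# apple: 1.00
# beer: 0.60
# apple, beer: 0.60
```

Filtering by `min_support` here matters because the support dictionary may also contain itemsets that were counted but turned out not to be frequent, such as `rice` in this example.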