FP-Growth算法的python实现
时间: 2024-05-14 11:12:09 浏览: 164
FP-Growth算法是一种高效的频繁项集挖掘算法,可以用于数据挖掘、关联规则挖掘等领域。下面是FP-Growth算法的Python实现:
```python
class FPTree:
def __init__(self, item, count, parent):
self.item = item
self.count = count
self.parent = parent
self.children = {}
self.next = None
def increment(self, count):
self.count += count
def display(self, indent=1):
print(' ' * indent, self.item, ' ', self.count)
for child in self.children.values():
child.display(indent + 1)
def build_fptree(transactions, min_support):
counts = {}
for transaction in transactions:
for item in transaction:
if item in counts:
counts[item] += 1
else:
counts[item] = 1
items = [item for item in counts.keys() if counts[item] >= min_support]
items.sort(key=lambda item: counts[item], reverse=True)
root = FPTree(None, 0, None)
for transaction in transactions:
transaction = [item for item in transaction if item in items]
transaction.sort(key=lambda item: counts[item], reverse=True)
current = root
for item in transaction:
if item in current.children:
child = current.children[item]
child.increment(1)
else:
child = FPTree(item, 1, current)
current.children[item] = child
current = child
return root, items, counts
def find_prefix_path(item, node):
paths = []
while node is not None:
path = []
while node.item is not None and node.item != item:
path.append(node.item)
node = node.parent
if node is not None:
path.append(node.item)
paths.append(path[::-1])
node = node.parent
return paths
def mine_frequent_itemsets(fptree, items, min_support, prefix=[]):
frequent_itemsets = []
for item in items:
support = sum([1 for path in find_prefix_path(item, fptree) if len(path) >= len(prefix) + 1])
if support >= min_support:
frequent_itemset = prefix + [item]
frequent_itemsets.append((frequent_itemset, support))
conditional_tree, conditional_items, conditional_counts = build_fptree(
[path for path in find_prefix_path(item, fptree) if len(path) >= len(prefix) + 1], min_support)
if conditional_tree is not None:
frequent_itemsets.extend(mine_frequent_itemsets(conditional_tree, conditional_items,
min_support, frequent_itemset))
return frequent_itemsets
```
其中,FPTree类表示FP树,包含了项、计数、父节点、子节点和链表指针等信息。build_fptree函数用于构建FP树,输入参数为事务列表和最小支持度,输出结果为FP树、频繁项和项计数。find_prefix_path函数用于寻找所有包含指定项的前缀路径,输入参数为项和节点,输出结果为前缀路径列表。mine_frequent_itemsets函数用于挖掘频繁项集,输入参数为FP树、频繁项、最小支持度和前缀,输出结果为频繁项集列表。
阅读全文