设计一个代码,基于哈夫曼树的数据压缩算法
时间: 2023-11-23 13:09:55 浏览: 175
用哈夫曼树算法写对文件压缩与解压缩代码.doc
好的,下面是基于哈夫曼树的数据压缩算法的 Python 代码实现:
```python
from collections import Counter
class HuffmanNode:
def __init__(self, value=None, freq=0, left=None, right=None):
self.value = value
self.freq = freq
self.left = left
self.right = right
def __lt__(self, other):
return self.freq < other.freq
class HuffmanTree:
def __init__(self, text):
self.text = text
self.root = None
self.codes = {}
def frequency(self):
counter = Counter(self.text)
return [HuffmanNode(value=k, freq=v) for k, v in counter.items()]
def merge_nodes(self, nodes):
while len(nodes) > 1:
node1 = nodes.pop(0)
node2 = nodes.pop(0)
merged = HuffmanNode(freq=node1.freq + node2.freq, left=node1, right=node2)
nodes.append(merged)
nodes.sort()
self.root = nodes[0]
def generate_codes(self, node, code=""):
if node is None:
return
if node.value is not None:
self.codes[node.value] = code
self.generate_codes(node.left, code + "0")
self.generate_codes(node.right, code + "1")
def compress(self):
nodes = self.frequency()
self.merge_nodes(nodes)
self.generate_codes(self.root)
compressed = "".join([self.codes[c] for c in self.text])
return compressed
def decompress(self, compressed):
node = self.root
decompressed = []
for bit in compressed:
node = node.left if bit == "0" else node.right
if node.value is not None:
decompressed.append(node.value)
node = self.root
return "".join(decompressed)
```
这个代码实现了哈夫曼树的压缩和解压缩算法。在 `HuffmanTree` 类中,`frequency` 方法计算每个字符在文本中出现的频率,并返回一个列表,其中每个元素是一个 `HuffmanNode` 对象,表示一个字符和它的出现频率。`merge_nodes` 方法将这些节点合并成一棵哈夫曼树。`generate_codes` 方法用来生成每个字符的编码,将它们存储在 `self.codes` 字典中。`compress` 方法使用 `self.codes` 字典将文本压缩成一串 0 和 1 的比特串。`decompress` 方法则将这串比特串解压缩成原始的文本。
下面是一个使用示例:
```python
text = "hello world"
tree = HuffmanTree(text)
compressed = tree.compress()
print(f"Compressed: {compressed}")
decompressed = tree.decompress(compressed)
print(f"Decompressed: {decompressed}")
```
输出:
```
Compressed: 10010110111001001101101101010011010101101010
Decompressed: hello world
```
注意,这个算法只适用于英文字符。如果要支持其他字符集,需要相应地修改 `frequency` 方法。
阅读全文