赫夫曼码python
时间: 2024-09-22 07:05:32 浏览: 63
赫夫曼码,又称为最优二叉树编码或霍夫曼编码,是一种用于数据压缩的无损编码方法。它是通过对频率较高的字符分配较短的编码,频率较低的字符分配较长编码的方式来实现高效的存储。在Python中,可以利用字典(dict)和堆(heapq)来构建赫夫曼树并生成对应的编码。
以下是创建一个简单的赫夫曼编码过程的一个例子:
```python
import heapq
def create_frequency_table(text):
freq = {}
for char in text:
if char in freq:
freq[char] += 1
else:
freq[char] = 1
return freq
def build_huffman_tree(freq):
heap = [[weight, [char, ""]] for char, weight in freq.items()]
heapq.heapify(heap)
while len(heap) > 1:
left = heapq.heappop(heap)
right = heapq.heappop(heap)
combined_freq = left[0] + right[0]
new_node = [combined_freq, []]
new_node[1].append(left[1])
new_node[1].append(right[1])
heapq.heappush(heap, new_node)
return heapq.heappop(heap)[1]
def huffman_code(tree):
codes = {}
generate_codes(tree, "", codes)
return codes
def generate_codes(node, current_code, codes):
if isinstance(node, str):
codes[node] = current_code
else:
generate_codes(node[0], current_code + "0", codes)
generate_codes(node[1], current_code + "1", codes)
# 使用示例
text = "hello world"
freq_table = create_frequency_table(text)
huff_tree = build_huffman_tree(freq_table)
huffman_codes = huffman_code(huff_tree)
print("原始文本:", text)
print("Huffman 编码:", huffman_codes)
```
在这个例子中,我们首先计算每个字符的频率,然后使用堆数据结构生成赫夫曼树,并从根节点开始递归地生成每个字符的编码。
阅读全文