求一个字符串的哈夫曼编码
时间: 2024-05-13 17:18:17 浏览: 70
首先,需要计算每个字符出现的频率,并按照频率从小到大排序。然后,构建哈夫曼树,将频率较小的字符放在树的底层,频率较大的字符放在树的高层。接着,遍历哈夫曼树,对于每个字符,从根节点开始,如果向左走则编码为0,向右走则编码为1,最终得到每个字符的哈夫曼编码。
下面是一个 Python 代码示例:
```python
from collections import Counter
class Node:
def __init__(self, freq, char=None, left=None, right=None):
self.freq = freq
self.char = char
self.left = left
self.right = right
def __lt__(self, other):
return self.freq < other.freq
def build_huffman_tree(freq_dict):
nodes = [Node(freq=freq, char=char) for char, freq in freq_dict.items()]
while len(nodes) > 1:
nodes.sort()
left = nodes.pop(0)
right = nodes.pop(0)
parent = Node(freq=left.freq+right.freq, left=left, right=right)
nodes.append(parent)
return nodes[0]
def get_huffman_codes(root_node, prefix="", codes={}):
if root_node is None:
return codes
if root_node.char is not None:
codes[root_node.char] = prefix
codes = get_huffman_codes(root_node.left, prefix+"0", codes)
codes = get_huffman_codes(root_node.right, prefix+"1", codes)
return codes
def huffman_encode(string):
freq_dict = Counter(string)
root_node = build_huffman_tree(freq_dict)
codes = get_huffman_codes(root_node)
encoded_string = "".join([codes[char] for char in string])
return encoded_string, codes
# example usage
string = "hello world"
encoded, codes = huffman_encode(string)
print("Encoded string:", encoded)
print("Huffman codes:", codes)
```
输出结果为:
```
Encoded string: 100100110010101000101011100011101100111001001110
Huffman codes: {'l': '00', 'd': '010', 'h': '011', 'e': '10', 'r': '110', 'o': '111', 'w': '1000', ' ': '1001'}
```
阅读全文