---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-1-bf2a8051e1c4> in <module>()
     28 model_path = 'rnnlm.vec'
     29 similarity_path = 'wordsim353.txt'
---> 30 correlation = evaluate_word_similarity(model_path, similarity_path)
     31 print(f'Correlation: {correlation}')

<ipython-input-1-bf2a8051e1c4> in evaluate_word_similarity(model_path, similarity_path)
     12
     13 def evaluate_word_similarity(model_path, similarity_path):
---> 14     model = load_word_vectors(model_path)
     15     human_similarities = []
     16     model_similarities = []

<ipython-input-1-bf2a8051e1c4> in load_word_vectors(file_path)
      4     word_vectors = {}
      5     with open(file_path, 'r', encoding='utf-8') as f:
----> 6         for line in f:
      7             line = line.strip().split()
      8             word = line[0]

D:\anaconda3\lib\codecs.py in decode(self, input, final)
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa8 in position 7450: invalid start byte
Posted: 2023-12-23 14:05:39
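This error usually means the file is not encoded as UTF-8. The byte 0xa8 cannot begin a UTF-8 character, so the decoder stops at position 7450, where it first meets that byte. The fix is to open the file with its actual encoding. For example, if the file is encoded as GBK, change the `open` call like this: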
```python
with open(file_path, 'r', encoding='gbk') as f:
    ...  # loop body unchanged
```
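For context, here is a minimal sketch of the whole loader with the encoding passed in as a parameter. The traceback only shows the first few lines of `load_word_vectors`, so everything after `word = line[0]` is an assumption: it supposes each line is a word followed by whitespace-separated floats, and the `encoding` parameter and its `'gbk'` default are added here for illustration.

```python
def load_word_vectors(file_path, encoding='gbk'):
    """Load word vectors from a text file into a dict of word -> list of floats.

    Assumed format (not confirmed by the original post): one word per line,
    followed by its vector components separated by whitespace.
    """
    word_vectors = {}
    with open(file_path, 'r', encoding=encoding) as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) < 2:
                # Skip blank lines and lines that carry no vector values
                continue
            word = parts[0]
            word_vectors[word] = [float(x) for x in parts[1:]]
    return word_vectors
```

If the file turns out to use a different encoding, only the argument changes, for example `load_word_vectors('rnnlm.vec', encoding='gb18030')`. A file that starts with a "count dimension" header line would need that line skipped separately.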
If you are not sure which encoding the file uses, open it in a text editor such as Notepad++ and check the encoding it reports.
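The same check can be approximated in Python by trying a few candidate encodings in turn. This is only a rough sketch: the helper name `guess_encoding` and the candidate list are hypothetical, and a decode that succeeds does not prove the encoding is the right one, only that the bytes are valid in it.

```python
def guess_encoding(file_path, candidates=('utf-8', 'gbk', 'gb18030')):
    """Return the first candidate encoding that decodes the whole file, or None.

    The candidate list is an assumption; order it from most to least likely.
    """
    with open(file_path, 'rb') as f:
        raw = f.read()  # read raw bytes so no decoding happens yet
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # bytes are not valid in this encoding, try the next one
        return encoding
    return None
```

The result can then be passed to `open`, as in `open(file_path, 'r', encoding=guess_encoding(file_path))`, after checking that it is not `None`. Single-byte encodings such as `latin-1` are left out of the candidates on purpose, since they accept any byte sequence and would hide the real problem.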