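Use Python to implement the following: 1. Count the English words that appear in document 1 and document 2, output the words that appear in document 1 but not in document 2, and compute the proportion of document 2's words that also appear in document 1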

Date: 2024-04-30 09:23:41

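Assuming document 1 and document 2 are "doc1.txt" and "doc2.txt", the following code implements the task:

```python
import string

# Read document 1 and document 2 (UTF-8 encoding is assumed here)
with open("doc1.txt", "r", encoding="utf-8") as f1, \
        open("doc2.txt", "r", encoding="utf-8") as f2:
    doc1 = f1.read()
    doc2 = f2.read()

# Remove punctuation and digits from both documents
translator = str.maketrans("", "", string.punctuation + string.digits)
doc1 = doc1.translate(translator)
doc2 = doc2.translate(translator)

# Convert to lowercase and split on whitespace into word lists
words1 = doc1.lower().split()
words2 = doc2.lower().split()

# Distinct words in each document
vocab1 = set(words1)
vocab2 = set(words2)

# Words that appear in document 1 but not in document 2
unique_words = vocab1 - vocab2

# Share of document 2's distinct words that also appear in document 1
# (guarded so an empty document 2 does not raise ZeroDivisionError)
common_words = vocab1 & vocab2
ratio = len(common_words) / len(vocab2) if vocab2 else 0.0

# Print the results
print("Words that appear in document 1 but not in document 2:")
print(unique_words)
print("Share of document 2's words that also appear in document 1: {:.2f}%".format(ratio * 100))
```

Explanation:

1. A `with` statement opens document 1 and document 2 and reads their text.
2. `str.maketrans()` and `translate()` strip punctuation and digits so they do not interfere with the word count. Note that this also deletes apostrophes and hyphens, so "don't" becomes "dont".
3. The text of each document is converted to lowercase and split into a list of words.
4. `set()` and the `-` operator give the words that appear in document 1 but not in document 2.
5. `set()` and the `&` operator give the words common to both documents. The ratio divides the number of common words by the number of distinct words in document 2, so that numerator and denominator both count distinct words. The question's wording is ambiguous; here the ratio is read as the share of document 2's vocabulary that is also found in document 1.
6. Finally, the results are printed.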
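The same steps can also be packaged as functions that work on strings, which makes them easy to check without creating files. The following is only a minimal sketch under stated assumptions: an "English word" is taken to be a run of ASCII letters, the ratio is read as the share of document 2's distinct words that also occur in document 1, and the helper names `tokenize` and `compare_words` are hypothetical, not part of the original answer.

```python
import re


def tokenize(text):
    """Return the lowercase runs of ASCII letters found in text.

    Assumption: digits, punctuation, and non-ASCII characters act as
    separators, so "don't" yields "don" and "t".
    """
    return re.findall(r"[a-z]+", text.lower())


def compare_words(text1, text2):
    """Return (words only in text1, share of text2's distinct words found in text1)."""
    vocab1 = set(tokenize(text1))
    vocab2 = set(tokenize(text2))
    only_in_1 = vocab1 - vocab2
    # Guard against an empty second document to avoid ZeroDivisionError
    ratio = len(vocab1 & vocab2) / len(vocab2) if vocab2 else 0.0
    return only_in_1, ratio


only_in_1, ratio = compare_words("The cat sat on the mat.", "A cat and a dog.")
print(sorted(only_in_1))              # ['mat', 'on', 'sat', 'the']
print("{:.2f}%".format(ratio * 100))  # 25.00%
```

To use it on files, read each file into a string first and pass the two strings to `compare_words`. If the ratio should instead be weighted by how often each word occurs, `collections.Counter` over the token lists would be the natural starting point.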