with open("C:\\Users\\26212\\Desktop\\file.txt", 'w', encoding='utf-8') as f: Why does this raise "expected an indented block"?

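This error usually means that an indented block is missing. In Python, indentation marks where a code block begins and ends, so a statement that ends with a colon, such as `with`, must be followed by at least one indented statement. If the indentation is missing or wrong, Python raises an IndentationError. The first line of your code is fine on its own; the problem is that the line after it is either missing or not indented. Indent the body under the `with` statement, for example:

```
with open("C:\\Users\\26212\\Desktop\\file.txt", 'w', encoding='utf-8') as f:
    # Write the code you want to run here
    pass
```

In the code above, the body under the `with` line is indented, which avoids the IndentationError. A comment alone does not count as a body, so keep `pass` as a placeholder until you add a real statement.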
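To see the rule in isolation, here is a minimal sketch that checks whether a snippet compiles, without running it or touching any file. The helper name `has_indentation_error` and the file name `file.txt` are made up for this illustration.

```python
def has_indentation_error(source: str) -> bool:
    """Return True if compiling `source` raises IndentationError."""
    try:
        # compile() only parses the code; it never opens the file
        compile(source, "<example>", "exec")
    except IndentationError:
        return True
    return False


# The line after the `with` header is not indented, so the header has no body
broken = (
    "with open('file.txt', 'w', encoding='utf-8') as f:\n"
    "print('done')\n"
)

# The same code with the body indented under the header
fixed = (
    "with open('file.txt', 'w', encoding='utf-8') as f:\n"
    "    f.write('done')\n"
)

print(has_indentation_error(broken))  # True
print(has_indentation_error(fixed))   # False
```

The only difference between the two strings is the four spaces in front of the second line.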

    import os
    import pdfplumber

    # Directory for the saved text files
    txts_dir = "./Users/hh/Desktop/python/fileshunan"
    for path in os.listdir(files_dir):
        if os.path.isfile(os.path.join(files_dir, path)):
            file = os.path.join(files_dir, path)
            print(file)
            if file.endswith(".pdf"):
                with pdfplumber.open(file) as pdf:
                    txt_file = open(txts_dir + path[0:-4] + ".txt", mode='w', encoding='utf-8')
                    for page in pdf.pages:
                        txt_file.write(page.extract_text())
            elif file.endswith(".html"):
                soup = BeautifulSoup(open(file, encoding='utf-8'))
                with open(txts_dir + path[0:-5] + ".txt", mode='w', encoding='utf-8') as txt_file:
                    txt_file.write(soup.get_text())
    # Save the files as text

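From the code you posted, it looks like you want to convert the PDF and HTML files in a folder to text files, using the `pdfplumber` library for the PDFs and BeautifulSoup for the HTML. First, make sure all of the required libraries are imported. Your code uses `BeautifulSoup` without importing it:

```python
import os
import pdfplumber
from bs4 import BeautifulSoup
```

Then you can convert the files with the following code:

```python
# Source directory. The question never defines files_dir, so this value is
# only a placeholder: replace it with your own folder.
files_dir = "/path/to/source/files/"
# Directory for the saved text files. The trailing slash matters because
# the output path is built by string concatenation.
txts_dir = "/Users/hh/Desktop/python/fileshunan/"

for path in os.listdir(files_dir):
    file = os.path.join(files_dir, path)
    if not os.path.isfile(file):
        continue
    print(file)
    if file.endswith(".pdf"):
        with pdfplumber.open(file) as pdf, \
                open(txts_dir + path[0:-4] + ".txt", mode='w', encoding='utf-8') as txt_file:
            for page in pdf.pages:
                # extract_text() may return None for a page without text in some versions
                txt_file.write(page.extract_text() or "")
    elif file.endswith(".html"):
        with open(file, encoding='utf-8') as html_file:
            # The 'lxml' parser requires the lxml package to be installed
            soup = BeautifulSoup(html_file, 'lxml')
        with open(txts_dir + path[0:-5] + ".txt", mode='w', encoding='utf-8') as txt_file:
            txt_file.write(soup.get_text())
```

The code first defines `files_dir`, the folder to read from, and `txts_dir`, the folder where the text files are saved. Note that `txts_dir` now starts with `/Users` instead of `./Users` and ends with a slash; without the trailing slash, the output file name would be glued onto the folder name. `os.listdir()` then iterates over the entries in the source folder, `os.path.join()` builds each full path, and `os.path.isfile()` skips anything that is not a file.

If the file name ends with `.pdf`, `pdfplumber.open()` opens the PDF, the loop visits each page, and `page.extract_text()` returns the page text, which is written to a new text file named after the original. Opening the output file in the `with` statement ensures it is closed even if extraction fails. If the file name ends with `.html`, BeautifulSoup parses the HTML, and `soup.get_text()` returns the plain text, which is written to a new text file in the same way.

This code is only a reference, so adjust `files_dir` and `txts_dir` to your actual paths and needs. If you have any further questions, feel free to ask.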
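One detail worth isolating is how the output path is built. Concatenating `txts_dir + path[0:-4]` only works when `txts_dir` ends with a slash and the extension has exactly the sliced length. A possible alternative, sketched here with a hypothetical helper `txt_path_for` and made-up file names, lets `os.path` handle both cases:

```python
import os


def txt_path_for(txts_dir: str, filename: str) -> str:
    """Build the .txt output path for a source file name."""
    # splitext() strips the extension whatever its length (.pdf, .html, ...)
    stem, _ext = os.path.splitext(filename)
    # join() inserts a separator only when txts_dir does not already end with one
    return os.path.join(txts_dir, stem + ".txt")


# Works with or without a trailing slash on the directory
print(txt_path_for("/Users/hh/Desktop/python/fileshunan", "report.pdf"))
print(txt_path_for("/Users/hh/Desktop/python/fileshunan/", "page.html"))
```

With a helper like this, the `path[0:-4]` and `path[0:-5]` slices are no longer needed, and a missing trailing slash cannot silently change the output location.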

    import jieba
    import wordcloud
    import imageio

    mask = imageio.imread('C:/Users/Febird/Desktop/1.png')  # set a background shape for the word cloud
    with open('C:/Users/Febird/Desktop/傲世.txt', encoding='utf-8') as f:
        t = f.read()  # open the file to build the word cloud from
    ls = jieba.lcut(t)
    txt = " ".join(ls)  # extract the words from the text
    stopwords_file = open('C:/Users/Febird/Desktop/stopWord.txt', 'r', encoding='utf-8')
    stopwords = [(words.strip()) for words in stopwords_file.readlines()]
    w = wordcloud.WordCloud(width=4000, height=4000, stopwords=stopwords, font_path="msyh.ttc", colormap='hot', background_color='pink', mask=mask)  # define a word cloud
    w.generate(txt)  # generate the word cloud
    w.to_file(r'C:/Users/Febird/Desktop/test_3.png')  # load the words in txt into the word cloud object w and save it

Add code to generate a word cloud of the top fifty words.

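Adding the following to your code produces a word cloud of the 50 most frequent words. Stop words and punctuation are filtered out before counting; otherwise they would take up most of the top 50. The frequency table is then passed to `generate_from_frequencies()` instead of being turned back into a string for `generate()`, because `generate_from_frequencies()` uses the counts as given and does not apply the `stopwords` argument itself:

```
from collections import Counter

# Keep real words only: drop stop words and tokens without letters or digits
words = [word for word in ls
         if word not in stopwords and any(ch.isalnum() for ch in word)]
# Count word frequencies and take the 50 most frequent words
top50_words = Counter(words).most_common(50)
# Generate the word cloud from the frequency table
w.generate_from_frequencies(dict(top50_words))
```

The complete code:

```
import jieba
import wordcloud
import imageio
from collections import Counter

mask = imageio.imread('C:/Users/Febird/Desktop/1.png')
with open('C:/Users/Febird/Desktop/傲世.txt', encoding='utf-8') as f:
    t = f.read()
ls = jieba.lcut(t)
with open('C:/Users/Febird/Desktop/stopWord.txt', 'r', encoding='utf-8') as stopwords_file:
    stopwords = {line.strip() for line in stopwords_file}
w = wordcloud.WordCloud(width=4000, height=4000, stopwords=stopwords, font_path="msyh.ttc", colormap='hot', background_color='pink', mask=mask)

# Keep real words only: drop stop words and tokens without letters or digits
words = [word for word in ls
         if word not in stopwords and any(ch.isalnum() for ch in word)]
# Count word frequencies and take the 50 most frequent words
top50_words = Counter(words).most_common(50)
# Generate the word cloud from the frequency table
w.generate_from_frequencies(dict(top50_words))
w.to_file(r'C:/Users/Febird/Desktop/test_3.png')
```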
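The counting step can be tried on its own, without jieba or wordcloud installed. The sketch below assumes the text has already been split into tokens; the function name `top_words` and the sample tokens are invented for illustration.

```python
from collections import Counter


def top_words(tokens, stopwords, n):
    """Return the n most frequent tokens, ignoring stop words and punctuation."""
    stop = set(stopwords)
    kept = [
        tok for tok in tokens
        # Drop stop words and tokens with no letter or digit (punctuation, spaces)
        if tok not in stop and any(ch.isalnum() for ch in tok)
    ]
    return Counter(kept).most_common(n)


# Hypothetical token list, shaped like the output of jieba.lcut()
sample = ["江湖", "，", "的", "江湖", "剑", "的", "，", "剑", "江湖", " "]
print(top_words(sample, ["的"], 2))  # [('江湖', 3), ('剑', 2)]
```

Without the filter, the comma and the stop word "的" would each outrank "剑" in this sample. Wrapping the result in `dict(...)` gives a word-to-count mapping, which is the input shape that WordCloud's `generate_from_frequencies()` accepts.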
