com.aspose.words Document: how to get the text and formatting in a file
Time: 2024-09-12 22:02:11
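To get the text and its formatting from an Aspose.Words `Document` object, you can follow the steps below. The examples use the .NET API in C#; the Java package `com.aspose.words` follows the same object model, with properties exposed as getter methods such as `getFont()`.
1. **Open the document**[^1]:
Load the Word document with the `Document` class constructor, as in this example: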
```csharp
// Open the Word document
Aspose.Words.Document doc = new Aspose.Words.Document("demo.doc");
```
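2. **Iterate over the paragraphs**:
`DocumentBuilder` is designed for inserting and modifying content; it has no method for reading text one paragraph at a time. To read the text, walk the document's node tree instead: `GetChildNodes(NodeType.Paragraph, true)` returns every paragraph, and `GetText()` returns the text of a node:
```csharp
// Assumes `using System;` and `using Aspose.Words;` at the top of the file
// Visit every paragraph in the document, including paragraphs nested in tables
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    // GetText() includes control characters such as the paragraph mark, so trim them
    string text = para.GetText().Trim();
    Console.WriteLine(text); // Process the text as needed
}
```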
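3. **Handle formatting**:
Each `Paragraph` exposes its paragraph-level formatting through `ParagraphFormat`, and each `Run` inside it exposes the font name, size, color, and other character formatting through `Run.Font`. For example:
```csharp
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    ParagraphFormat pf = para.ParagraphFormat; // Paragraph-level formatting (alignment, style, spacing)
    foreach (Run run in para.Runs) // Each run is a span of text with uniform character formatting
        Console.WriteLine($"{run.Text}: {run.Font.Name}, {run.Font.Size}pt, {run.Font.Color}, {pf.Alignment}");
}
```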