Spire.Doc Tutorial Series: How to Extract Comment Text and Images from Word Documents in Java

Translation | Tutorial | Editor: 吉煒煒 | 2024-12-27 10:38:45.807 | 106 views

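Overview: Comments in Word documents are commonly used for collaborative review and feedback. They may contain text and images that provide important reference information for improving the document. This article demonstrates how to use Spire.Doc for Java to extract comment text and images from Word documents in Java.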

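Comments in Word documents are commonly used for collaborative review and feedback. They may contain text and images that provide important reference information for improving the document. Extracting that text and those images helps you analyze and evaluate reviewers' feedback, giving you a full picture of the document's strengths, weaknesses, and suggested improvements. This article demonstrates how to use Spire.Doc for Java to extract comment text and images from Word documents in Java.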

Install Spire.Doc for Java

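First, add the Spire.Doc.jar file as a dependency in your Java program. If you use Maven, you can import the JAR by adding the following configuration to your project's pom.xml file.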

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.cn/repository/maven-public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>12.11.9</version>
    </dependency>
</dependencies>

Extract Text from Comments in a Word Document in Java

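Extracting comment text from a Word document in Java is straightforward. Iterate over all comments in the document, retrieve each one with the Document.getComments().get() method provided by Spire.Doc for Java, iterate over the paragraphs in the comment body, and call Paragraph.getText() on each paragraph to get its text. The detailed steps are as follows: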

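Create an object of the Document class.
Load a Word document with the Document.loadFromFile() method.
Iterate over all comments in the document.
For each comment, iterate over all paragraphs in its body.
For each paragraph, extract its text with the Paragraph.getText() method.
Save the extracted content to a text file.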

import com.spire.doc.*;
import com.spire.doc.documents.*;
import com.spire.doc.fields.*;
import java.io.*;

public class ExtractComments {
   public static void main(String[] args) throws IOException {

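       // Create a Document object
       Document doc = new Document();

       // Load a Word document
       doc.loadFromFile("/AI繪畫的利弊及法律應對.docx");

       // Collect the text of all comments so the file is written only once;
       // writing inside the loop would overwrite the file on every call
       StringBuilder result = new StringBuilder();

       // Iterate over every comment in the document
       for (int i = 0; i < doc.getComments().getCount(); i++) {
           // Get the comment at the current index
           Comment comment = doc.getComments().get(i);

           // Iterate over every paragraph in the comment body
           for (int j = 0; j < comment.getBody().getParagraphs().getCount(); j++) {
               // Get the current paragraph
               Paragraph para = comment.getBody().getParagraphs().get(j);

               // Append the paragraph's text
               result.append(para.getText()).append("\r\n");
           }
       }

       // Save the extracted comments to a text file
       writeStringToTxt(result.toString(), "/批注信息.txt");

       // Release resources
       doc.dispose();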
   }

   // Custom helper that writes a string to a text file
   public static void writeStringToTxt(String content, String txtFileName) throws IOException {
       FileWriter fWriter = new FileWriter(txtFileName);
       try {
           // Write to the text file
           fWriter.write(content);
       } catch (IOException ex) {
           ex.printStackTrace();
       } finally {
           try {
               // Flush and close the file writer
               fWriter.flush();
               fWriter.close();
           } catch (IOException ex) {
               ex.printStackTrace();
           }
       }
   }
}
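
The helper above uses FileWriter, which encodes text with the platform default charset; on older JDKs that may not be UTF-8, so comments containing non-ASCII characters could be garbled. As a hedged alternative, the sketch below joins the collected paragraph texts and writes them once with an explicit UTF-8 encoding through java.nio.file.Files. The class name CommentTextWriter and its two methods are hypothetical names chosen for illustration, and the sketch deliberately contains no Spire.Doc calls: it assumes the paragraph texts have already been gathered into a list.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of Spire.Doc
public class CommentTextWriter {

    // Joins paragraph texts, terminating each one with CRLF
    public static String joinParagraphs(List<String> paragraphs) {
        StringBuilder sb = new StringBuilder();
        for (String p : paragraphs) {
            sb.append(p).append("\r\n");
        }
        return sb.toString();
    }

    // Writes the content to the target file as UTF-8, replacing any existing file
    public static void writeUtf8(String content, Path target) throws IOException {
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for texts collected via Paragraph.getText()
        List<String> paragraphs = Arrays.asList("First comment", "Second comment");

        // A temporary file is used here; replace it with your own output path
        Path out = Files.createTempFile("comments", ".txt");
        writeUtf8(joinParagraphs(paragraphs), out);

        // Read the file back to confirm what was written
        String readBack = new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        System.out.print(readBack);
        Files.delete(out);
    }
}
```

In the extraction example, the list would be filled inside the paragraph loop and writeUtf8 called once after all comments have been processed.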


Extract Images from Comments in a Word Document in Java

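To extract images from the comments of a Word document, iterate over the child objects of each comment paragraph and look for DocPicture objects. Then get the image data with the DocPicture.getImageBytes() method and save it as an image file.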

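Create an object of the Document class.
Load a Word document with the Document.loadFromFile() method.
Create a list to store the extracted image data.
Iterate over the comments in the document.
For each comment, iterate over every paragraph in its body.
For each paragraph, iterate over all of its child objects.
Check whether the object is of type DocPicture.
If the object is a DocPicture, get the image data with the DocPicture.getImageBytes() method and add it to the list.
Save the image data in the list as image files.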

import com.spire.doc.*;
import com.spire.doc.documents.*;
import com.spire.doc.fields.*;
import java.io.*;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

public class ExtractCommentImages {
   public static void main(String[] args) {
       // Create a Document object
       Document document = new Document();

       // Load the Word document that contains comments
       document.loadFromFile("/AI繪畫的利弊及法律應對.docx");

       // Create a list to store the extracted image data
       List<byte[]> images = new ArrayList<>();

       // Iterate over the comments in the document
       for (int i = 0; i < document.getComments().getCount(); i++) {
           Comment comment = document.getComments().get(i);

           // Iterate over all paragraphs in the comment body
           for (int j = 0; j < comment.getBody().getParagraphs().getCount(); j++) {
               Paragraph paragraph = comment.getBody().getParagraphs().get(j);

               // Iterate over all child objects of the paragraph
               for (int k = 0; k < paragraph.getChildObjects().getCount(); k++) {
                   DocumentObject obj = paragraph.getChildObjects().get(k);

                   // Check whether the object is a picture
                   if (obj instanceof DocPicture) {
                       DocPicture picture = (DocPicture) obj;

                       // Get the image data and add it to the list
                       images.add(picture.getImageBytes());
                   }
               }
           }
       }

       // Specify the output directory and create it if it does not exist
       String outputDir = "/批注圖片/";
       new File(outputDir).mkdirs();

       // Save each image as a file; %d inserts the image index into the name
       for (int i = 0; i < images.size(); i++) {
           String fileName = String.format("批注圖片-%d.png", i);
           Path filePath = Paths.get(outputDir, fileName);
           try (FileOutputStream fos = new FileOutputStream(filePath.toFile())) {
               fos.write((byte[]) images.get(i));
           } catch (IOException e) {
               e.printStackTrace();
           }
       }

       // Release resources
       document.dispose();
   }
}
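
The example above saves every image with a .png extension, although pictures embedded in comments may be in other formats. Assuming the bytes returned by DocPicture.getImageBytes() are the original encoded image data (an assumption worth verifying against your own documents), one possible approach is to guess the extension from the file signature at the start of the byte array. The class ImageExtensionGuesser and its methods below are hypothetical names for illustration and use no Spire.Doc calls.

```java
// Hypothetical helper class, not part of Spire.Doc
public class ImageExtensionGuesser {

    // Guesses a file extension from the leading signature bytes;
    // returns "bin" when the signature is not recognised
    public static String guessExtension(byte[] data) {
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47)) return "png"; // \x89 P N G
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) return "jpg";       // JPEG start-of-image marker
        if (startsWith(data, 0x47, 0x49, 0x46, 0x38)) return "gif"; // "GIF8"
        if (startsWith(data, 0x42, 0x4D)) return "bmp";             // "BM"
        return "bin";
    }

    // Checks whether data begins with the given unsigned byte values
    private static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pngHeader = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        System.out.println(guessExtension(pngHeader));  // prints "png"
        System.out.println(guessExtension(jpegHeader)); // prints "jpg"
    }
}
```

With such a helper, the file name in the saving loop could be built from the image index and the guessed extension instead of a fixed .png suffix.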
