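Translation | Tutorial | Editor: Yang Penglian (楊鵬連) | 2021-02-25 11:14:15.837
Overview: LEADTOOLS automatically detects and recognizes all of this content. Below are the main steps for processing a variety of form types quickly and accurately, regardless of how the data is formatted.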
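LEADTOOLS Recognition Imaging SDK is a curated set of LEADTOOLS SDK features for building end-to-end document imaging applications within enterprise-level document automation solutions that require OCR, MICR, OMR, barcode, forms recognition and processing, PDF, print capture, archival, annotation, and image viewing functionality. This toolset draws on LEAD's award-winning image processing technology to intelligently identify document features that can be used to recognize and extract data from any type of scanned or faxed form image.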
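A state-of-the-art forms processing API can automate away data entry problems. Whether you are handling customer surveys, tax documents, or billing records, every industry uses forms to conduct business every day. Moving data from paper to a digital medium can be time-consuming. For that reason, LEADTOOLS developed proprietary features that extract text from images containing any combination of machine-printed text, handwritten text, MICR, MRZ, and OMR fields. LEADTOOLS automatically detects and recognizes all of it. Below are the main steps for processing a variety of form types quickly and accurately, regardless of how the data is formatted.
First, we need to initialize the forms engines. They do all the hard work of reading and recognizing the data: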
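    // Assumed namespaces (verify against your SDK version): Leadtools, Leadtools.Codecs,
    // Leadtools.Ocr, Leadtools.Forms.Common, Leadtools.Forms.Processing,
    // Leadtools.Forms.Recognition, Leadtools.Forms.Recognition.Ocr.

    // Shared by all of the methods in this tutorial.
    static RasterCodecs codecs;
    static FormRecognitionEngine recognitionEngine;
    static FormProcessingEngine processingEngine;
    static IOcrEngine formsOCREngine;

    static void InitFormsEngines()
    {
        Console.WriteLine("Initializing Engines");
        codecs = new RasterCodecs();
        recognitionEngine = new FormRecognitionEngine();
        processingEngine = new FormProcessingEngine();

        // Start the LEAD OCR engine, pointing it at the OCR runtime folder.
        formsOCREngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD);
        formsOCREngine.Startup(codecs, null, null, @"C:\LEADTOOLS21\Bin\Common\OcrLEADRuntime");

        // Register an OCR objects manager so the recognition engine can use OCR features.
        OcrObjectsManager ocrObjectsManager = new OcrObjectsManager(formsOCREngine);
        ocrObjectsManager.Engine = formsOCREngine;
        recognitionEngine.ObjectsManagers.Add(ocrObjectsManager);
        Console.WriteLine("Engines initialized successfully");
    }

Form recognition requires a master form and a filled form. The master form contains blank fields and serves as the template that defines the field regions. The filled form is a copy of that form with data in its fields.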
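To make the master/filled distinction concrete, here is a toy sketch in plain C# (written as a C# 9 top-level program). It is not LEADTOOLS code: the `masterFields` dictionary, the `ExtractFields` helper, and the sample text are all hypothetical, and a text grid stands in for a scanned image. The idea it illustrates is only that the master supplies named regions and the filled form supplies the data found inside them.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: none of these names come from LEADTOOLS.
// The "master form" maps each field name to the region (row, column, width)
// where that field's value is expected.
var masterFields = new Dictionary<string, (int Row, int Col, int Width)>
{
    ["Name"] = (0, 6, 12),
    ["City"] = (1, 6, 12),
};

// The "filled form" has the same layout, with data inside the field regions.
string[] filledForm =
{
    "Name: Jane Doe    ",
    "City: Charlotte   ",
};

// Read each field by cropping the filled form to the region the master defines.
// Regions that fall outside the page yield an empty string instead of throwing.
static Dictionary<string, string> ExtractFields(
    Dictionary<string, (int Row, int Col, int Width)> fields, string[] page)
{
    var results = new Dictionary<string, string>();
    foreach (var entry in fields)
    {
        var region = entry.Value;
        string line = region.Row < page.Length ? page[region.Row] : string.Empty;
        int start = Math.Min(region.Col, line.Length);
        int length = Math.Min(region.Width, line.Length - start);
        results[entry.Key] = line.Substring(start, length).Trim();
    }
    return results;
}

foreach (var pair in ExtractFields(masterFields, filledForm))
    Console.WriteLine($"{pair.Key} = {pair.Value}");
```

In the real SDK the regions live in the master form's field definitions and the cropping is followed by OCR, but the division of labor between the two forms is the same.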
The next step is to specify the master form:
    private static void CreateMasterFormAttributes()
    {
        Console.WriteLine("Processing Master Form");
        string[] masterFileNames = Directory.GetFiles(
            @"C:\LEADTOOLS21\Resources\Images\Forms\MasterForm Sets\OCR",
            "*.tif", SearchOption.AllDirectories);

        foreach (string masterFileName in masterFileNames)
        {
            string formName = Path.GetFileNameWithoutExtension(masterFileName);

            // Load all pages of the master image (first page 1, last page -1).
            using (RasterImage image = codecs.Load(masterFileName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
            {
                FormRecognitionAttributes masterFormAttributes =
                    recognitionEngine.CreateMasterForm(formName, Guid.Empty, null);

                // Add each page to the master form; RasterImage.Page is 1-based.
                for (int i = 0; i < image.PageCount; i++)
                {
                    image.Page = i + 1;
                    recognitionEngine.AddMasterFormPage(masterFormAttributes, image, null);
                }

                recognitionEngine.CloseMasterForm(masterFormAttributes);

                // Save the attributes as <formName>.bin in the current working directory.
                File.WriteAllBytes(formName + ".bin", masterFormAttributes.GetData());
            }
        }

        Console.WriteLine("Master Form Processing Complete");
        Console.WriteLine("=============================================================");
    }

Finally, we are ready to read the filled form:
    private static void RecognizeForm()
    {
        Console.WriteLine("Recognizing Form\n");

        // The .bin files are looked up next to the executable. CreateMasterFormAttributes
        // writes them to the current working directory, so the two must be the same folder.
        string projectDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
        string formToRecognize = @"C:\LEADTOOLS21\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif";

        using (RasterImage image = codecs.Load(formToRecognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
        {
            // Build recognition attributes for every page of the filled form.
            FormRecognitionAttributes filledFormAttributes = recognitionEngine.CreateForm(null);
            for (int i = 0; i < image.PageCount; i++)
            {
                image.Page = i + 1;
                recognitionEngine.AddFormPage(filledFormAttributes, image, null);
            }
            recognitionEngine.CloseForm(filledFormAttributes);

            string resultMessage = "The form could not be recognized";
            string[] masterFileNames = Directory.GetFiles(projectDirectory, "*.bin");

            foreach (string masterFileName in masterFileNames)
            {
                // Each master's field definitions live in an XML file with the same base name.
                string fieldsFileName = Path.GetFileNameWithoutExtension(masterFileName) + ".xml";
                string fieldsFullPath = Path.Combine(@"C:\LEADTOOLS21\Resources\Images\Forms\MasterForm Sets\OCR", fieldsFileName);
                processingEngine.LoadFields(fieldsFullPath);

                FormRecognitionAttributes masterFormAttributes = new FormRecognitionAttributes();
                masterFormAttributes.SetData(File.ReadAllBytes(masterFileName));

                FormRecognitionResult recognitionResult =
                    recognitionEngine.CompareForm(masterFormAttributes, filledFormAttributes, null);

                // Accept the first master form that matches with at least 80% confidence.
                if (recognitionResult.Confidence >= 80)
                {
                    List<PageAlignment> alignment = new List<PageAlignment>();
                    for (int k = 0; k < recognitionResult.PageResults.Count; k++)
                        alignment.Add(recognitionResult.PageResults[k].Alignment);

                    resultMessage = $"This form has been recognized as a {Path.GetFileNameWithoutExtension(masterFileName)}";
                    ProcessForm(image, alignment);
                    break;
                }
            }

            // Print the message directly; passing it as a format string would fail on stray braces.
            Console.WriteLine(resultMessage);
            Console.WriteLine("=============================================================\n");
        }
    }

    private static void ProcessForm(RasterImage image, List<PageAlignment> alignment)
    {
        processingEngine.OcrEngine = formsOCREngine;
        string resultsMessage = string.Empty;

        processingEngine.Process(image, alignment);

        foreach (FormPage formPage in processingEngine.Pages)
            foreach (FormField field in formPage)
                // Only text results expose Text; skip any other field type instead of throwing.
                if (field != null && field.Result is TextFormFieldResult textResult)
                    resultsMessage = $"{resultsMessage}{field.Name} = {textResult.Text}\n";

        if (string.IsNullOrEmpty(resultsMessage))
            Console.WriteLine("No fields were processed");
        else
            Console.WriteLine(resultsMessage);
    }

That is all it takes to extract data from a filled form. For a deeper look, see the tutorial on how to recognize and process forms.
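The recognition loop above stops at the first master form whose confidence reaches 80. When several master forms look alike, one possible variation is to score every master first and keep the highest-confidence match. The sketch below shows only that selection step in plain C# (a C# 9 top-level program). `TryPickBestMaster` is a hypothetical helper, not a LEADTOOLS API, and the names and scores are made up; in the tutorial the scores would come from `recognitionResult.Confidence` for each master.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of LEADTOOLS: pick the highest-scoring master form
// whose confidence is at or above the threshold. Returns false if none qualifies.
static bool TryPickBestMaster(
    IEnumerable<(string Name, int Confidence)> candidates, int threshold, out string bestName)
{
    bestName = string.Empty;
    int bestConfidence = -1; // confidences are assumed to be non-negative
    foreach (var candidate in candidates)
    {
        // Skip scores below the threshold; on a tie, the earlier candidate wins.
        if (candidate.Confidence >= threshold && candidate.Confidence > bestConfidence)
        {
            bestName = candidate.Name;
            bestConfidence = candidate.Confidence;
        }
    }
    return bestConfidence >= 0;
}

// Made-up scores for illustration only.
var scores = new List<(string Name, int Confidence)>
{
    ("Invoice", 82),
    ("W9_OCR", 97),
    ("W4", 41),
};

Console.WriteLine(TryPickBestMaster(scores, 80, out string match)
    ? $"This form has been recognized as a {match}"
    : "The form could not be recognized");
```

The trade-off is that every master form is compared before any field processing starts, which costs more time than breaking out at the first acceptable match.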
Free evaluation!
Download the LEADTOOLS SDK for free directly from our website. The trial is valid for 60 days and includes email support.