LEADTOOLS OCR Tutorial: Scan a Document and Recognize It as a Searchable PDF File

Original | Tutorial | Editor: 龔雪 | 2015-07-23 10:20:43 | 773 views

Summary: This LEADTOOLS OCR tutorial shows how to scan a document and recognize it as a searchable PDF file.

Follow the steps below to create and run a program that scans an image, runs OCR on it, and saves the recognition results as a searchable PDF file.

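1. Open Visual Studio.

2. From the menu, choose File -> New -> Project.

3. In the New Project dialog, select "Visual C#" under Templates, then select Windows Forms Application.

4. Enter "OcrTutorial4" in the Name field and click OK. If you wish, you can choose a different directory for the project.

5. In the Solution Explorer window, right-click "References" and choose "Add Reference" from the context menu. In the Reference Manager dialog, select "Framework", then click the "Browse (B)" button and navigate to the LEADTOOLS installation directory:

"<installation directory>\Bin\DotNet4\Win32", then select the following DLLs: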

Leadtools.dll

Leadtools.Codecs.dll

Leadtools.Twain.dll

Leadtools.ImageProcessing.Core.dll

Leadtools.Forms.dll

Leadtools.Forms.DocumentWriters.dll

Leadtools.Forms.Ocr.dll

Leadtools.Forms.Ocr.Advantage.dll

Leadtools.Codecs.Bmp.dll

Leadtools.Codecs.Cmp.dll

Leadtools.Codecs.Tif.dll

Leadtools.Codecs.Fax.dll

Note: The Leadtools.Codecs.*.dll references are named after the image formats they support, such as BMP, TIF, FAX, and JPG. Add the ones that match the formats you need.

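6. Drag three buttons from the Toolbox onto Form1. Keep the default names button1, button2, and button3, and change the button text as follows:

button1: Change Output Path

button2: Select Scanner

button3: Scan and Recognize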

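7. Switch to the code view of Form1 and add the following lines at the top of the file. If the file already contains using directives, add these after the existing ones: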

using Leadtools; 
using Leadtools.Twain;  
using Leadtools.ImageProcessing;  
using Leadtools.ImageProcessing.Core;  
using Leadtools.Forms;  
using Leadtools.Forms.DocumentWriters;  
using Leadtools.Forms.Ocr;

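8. Add the following code to the Form1 constructor:

// Replace these two values with the path to your license file and your Developer Key
string licenseFilePath = @"D:\Program Files\LEADTOOLS 19\Common\License\LEADTOOLS.LIC";
string developerKey = "***";
// Register the license with the toolkit before using any other LEADTOOLS class
RasterSupport.SetLicense(licenseFilePath, developerKey);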

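9. Add the following private fields to the Form1 class:

// OCR engine
private IOcrEngine _ocrEngine;
// OCR document
private IOcrDocument _ocrDocument;
// TWAIN session
private TwainSession _twainSession;
// Directory in which the PDF files are saved
private string _outputDirectory = @"D:\ScanImages";
// List of image processing commands used to clean up the scanned images
private List<RasterCommand> _imageProcessingCommands;
// Number of scans so far, used to build the output file name
private int _scanCount;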

10. Override the OnLoad method of Form1 and add the following code:

protected override void OnLoad(EventArgs e)
{
	// Create the OCR engine
	_ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);
	// Start the engine
	_ocrEngine.Startup(null, null, null, @"D:\Program Files\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");
	// Enable Simplified Chinese and English
	_ocrEngine.LanguageManager.EnableLanguages(new string[] { "zh-Hans", "en" });
	// Initialize TWAIN
	_twainSession = new TwainSession();
	_twainSession.Startup(this.Handle, "My Company", "My Product", "My Version", "My Application", TwainStartupFlags.None);

	// Subscribe to the TwainSession.AcquirePage event to receive the scanned images
	_twainSession.AcquirePage += new EventHandler<TwainAcquirePageEventArgs>(_twainSession_AcquirePage);

	// Set up the image processing commands we are going to use
	// You can add any commands for preprocessing; here we add only deskew and despeckle
	_imageProcessingCommands = new List<RasterCommand>();
	_imageProcessingCommands.Add(new DeskewCommand());
	_imageProcessingCommands.Add(new DespeckleCommand());

	base.OnLoad(e);
}
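The OnLoad code above passes a hard-coded runtime directory to Startup. If that directory is missing, it can help to fail early with a message that names the path. The following is a minimal sketch under stated assumptions: `OcrPaths` and `ResolveRuntimeDirectory` are hypothetical names that are not part of LEADTOOLS, and the `Bin\Common\OcrAdvantageRuntime` layout is taken from the path used in step 10.

```csharp
using System;
using System.IO;

static class OcrPaths
{
	// Hypothetical helper (not a LEADTOOLS API): builds the OCR runtime path
	// from the installation directory and verifies that it exists.
	public static string ResolveRuntimeDirectory(string installDirectory)
	{
		// Assumed layout, copied from the path shown in step 10
		string runtime = Path.Combine(installDirectory, "Bin", "Common", "OcrAdvantageRuntime");
		if (!Directory.Exists(runtime))
			throw new DirectoryNotFoundException("OCR runtime directory not found: " + runtime);
		return runtime;
	}
}
```

With such a helper, the last argument of Startup could be written as OcrPaths.ResolveRuntimeDirectory(@"D:\Program Files\LEADTOOLS 19"), which keeps the installation directory in one place.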

11. Override the OnFormClosed method of Form1 and add the following code:

protected override void OnFormClosed(FormClosedEventArgs e)
{
	// Dispose of the OCR engine
	_ocrEngine.Dispose();

	// Shut down TWAIN
	_twainSession.Shutdown();

	base.OnFormClosed(e);
}

12. Add the following code for button1 (Change Output Path):

private void button1_Click(object sender, EventArgs e)
{
	// Let the user pick a different output directory
	using (FolderBrowserDialog dlg = new FolderBrowserDialog())
	{
		dlg.SelectedPath = _outputDirectory;
		dlg.ShowNewFolderButton = true;
		if (dlg.ShowDialog(this) == DialogResult.OK)
			_outputDirectory = System.IO.Path.GetFullPath(dlg.SelectedPath);
	}
}

13. Add the following code for button2 (Select Scanner):

private void button2_Click(object sender, EventArgs e)
{
	// Select the scanner you want to use
	_twainSession.SelectSource(null);
}

14. Add the following code for button3 (Scan and Recognize):

private void button3_Click(object sender, EventArgs e)
{
	// Create the output directory if it does not exist
	if (!System.IO.Directory.Exists(_outputDirectory))
		System.IO.Directory.CreateDirectory(_outputDirectory);

	// Build the PDF file name
	string name = "Scanned" + _scanCount;
	_scanCount++;
	string pdfFileName = System.IO.Path.Combine(_outputDirectory, name + ".pdf");

	// Create a file-based OCR document to which the scanned pages will be added
	_ocrDocument = _ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile);

	// Scan; each acquired page is handled in _twainSession_AcquirePage
	_twainSession.Acquire(TwainUserInterfaceFlags.Show);

	// Save as PDF
	_ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, null);

	// Dispose of the document
	_ocrDocument.Dispose();

	// Show the result
	System.Diagnostics.Process.Start(pdfFileName);
}
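Because _scanCount starts at 0 each time the program is launched, a second run targets Scanned0.pdf again in the same directory. One way to avoid reusing a name is to look for the first free file name instead. This is a minimal sketch, not part of the original tutorial; `OutputNaming` and `NextPdfPath` are hypothetical names, and only System.IO is used.

```csharp
using System;
using System.IO;

static class OutputNaming
{
	// Hypothetical helper: returns the first "<prefix><N>.pdf" path in the
	// directory that does not exist yet, starting from N = 0.
	public static string NextPdfPath(string directory, string prefix = "Scanned")
	{
		// Creates the directory if needed; does nothing if it already exists
		Directory.CreateDirectory(directory);
		for (int i = 0; ; i++)
		{
			string candidate = Path.Combine(directory, prefix + i + ".pdf");
			if (!File.Exists(candidate))
				return candidate;
		}
	}
}
```

In button3_Click, pdfFileName could then be obtained with OutputNaming.NextPdfPath(_outputDirectory), which would also make the separate directory check and the _scanCount field unnecessary.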

15. Add the handler for the scan event:

private void _twainSession_AcquirePage(object sender, TwainAcquirePageEventArgs e)
{
	// The page that was just scanned
	RasterImage image = e.Image;

	// Run the preprocessing commands
	foreach (RasterCommand command in _imageProcessingCommands)
	{
		command.Run(image);
	}

	// Create the OCR page
	using (IOcrPage ocrPage = _ocrEngine.CreatePage(image, OcrImageSharingMode.AutoDispose))
	{
		// Recognize
		ocrPage.Recognize(null);

		// Add the recognized page to the document
		_ocrDocument.Pages.Add(ocrPage);
	}
}

16. Save the project, then build and run it.