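Performing OCR Text Recognition with the WinSoft Component

Introduction

When you handle documents supplied in printed form at work, OCR (Optical Character Recognition) reduces the effort of turning that information into data. WinSoft's OCR component is a Windows component for adding OCR functionality to VCL applications. With it, you can extract the text contained in an image as character data. For how to install the OCR component, see the installation video that accompanies this article.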

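The Demo's Components and What They Do

The demo application has two main panels, left and right, which host all of the other visual components. The left one is a TPanel containing a TImage, into which the selected image is loaded. Clicking the [Select Picture] button runs a TOpenPictureDialog from the button's OnClick event handler and loads an image from a file. The selected file is then displayed in the TImage on the left panel. The accompanying video shows the demo running.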

// Body of the OnClick handler for the [Select Picture] button
with OpenPictureDialog do
  if Execute then
  begin
    // Reset the UI state left over from the previous image
    Shape.Width := 0;
    Shape.Height := 0;
    ProgressBar.Position := 0;
    MemoText.Clear;
    MemoHtml.Clear;
    MemoUnlv.Clear;
    ImageWords.Picture := nil;
    ImageRegions.Picture := nil;
    ImageTextLines.Picture := nil;
    ImageComponents.Picture := nil;
    ImageParagraphs.Picture := nil;
    try
      // Show a preview and hand the same picture to the OCR component
      Image.Picture.LoadFromFile(FileName);
      Ocr.Picture.Assign(Image.Picture);
    except
      // No preview is possible: pass the file name to the OCR component instead
      Image.Picture := nil;
      ShowMessage('Preview of this image cannot be displayed. Click Recognize button to start OCR.');
      Ocr.PictureFileName := FileName;
    end;
    // Activate the component on first use; language data is read from <exe folder>\tessdata
    if not Ocr.Active then
    begin
      Ocr.DataPath := ExtractFilePath(Application.ExeName) + 'tessdata';
      Ocr.Active := True;
    end;
    ButtonRecognize.Enabled := True;
    Image.Cursor := crCross;
  end;

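The right panel offers more options. Once a text image file has been loaded, click the [Recognize] button to start text recognition. While recognition is running, the button caption changes to [Cancel], so you can stop the process before it completes. The recognition itself is done by the TOcr component, whose main purpose is precisely to recognize text in images.

Once the image has been assigned to the OCR component, as in the code above, the text it recognizes can be read from the Text property. Clicking [Recognize] does other things as well, such as preparing additional images through their Canvas, but this is the main part.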
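The [Recognize] handler itself is not listed in this article. The following is only a rough sketch of the flow described above, not the demo's actual code. It assumes a form class named TFormMain (hypothetical) that owns Ocr, ButtonRecognize, MemoText, and a Boolean field CancelRequest. The method name Recognize is a placeholder for whatever call starts recognition in TOcr; check the WinSoft documentation for the real one. Only the Text property is taken from the article.

```delphi
// Sketch only: meant to live in the demo's main form unit.
// TFormMain and Ocr.Recognize are assumptions, not confirmed API.
procedure TFormMain.ButtonRecognizeClick(Sender: TObject);
begin
  // A click while recognition is running is treated as a cancel request.
  if ButtonRecognize.Caption = 'Cancel' then
  begin
    CancelRequest := True;
    Exit;
  end;

  CancelRequest := False;
  ButtonRecognize.Caption := 'Cancel';
  try
    // ASSUMPTION: placeholder for the call that starts recognition.
    // How a running recognition observes CancelRequest is component
    // specific and is not shown here.
    Ocr.Recognize;

    // Publish the result only if the user did not cancel.
    if not CancelRequest then
      MemoText.Text := Ocr.Text;
  finally
    // Restore the caption whether recognition finished, failed, or was cancelled.
    ButtonRecognize.Caption := 'Recognize';
  end;
end;
```

Restoring the caption in a finally block keeps the button usable even if recognition raises an exception.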

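All of these results can be viewed through a TPageControl component, which acts as a tab control that changes the content of the right panel according to the selected page. The first page shows the plain text, followed by pages with the text in HTML format and in UNLV format. The "Regions" tab draws frames around the recognized regions. The "Paragraphs" tab separates the recognized text into paragraphs (when there is more than one). The "Text Lines" tab frames each recognized line of text. The "Words" tab frames each recognized word. The "Components" tab splits the recognized text into connected components (TRect) and displays them.

The following code produces these views.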

// Build one overlay image per tab. TColor values are written as $00BBGGRR.
if not CancelRequest then
  // TImage.Picture is never nil; test the loaded graphic instead
  if Image.Picture.Graphic <> nil then
  begin
    // Words
    ImageWords.Picture.Bitmap.Assign(Image.Picture.Graphic);
    ImageWords.Picture.Bitmap.PixelFormat := pf32bit;
    ImageWords.Canvas.Brush.Color := TColor($007FFF);
    for i := 0 to Ocr.WordCount - 1 do
      ImageWords.Canvas.FrameRect(RectToShapeRect(UseShape, Ocr.Words[i], ShapeRect));
    // Regions
    ImageRegions.Picture.Bitmap.Assign(Image.Picture.Graphic);
    ImageRegions.Picture.Bitmap.PixelFormat := pf32bit;
    ImageRegions.Canvas.Brush.Color := TColor($7F00FF);
    for i := 0 to Ocr.RegionCount - 1 do
      ImageRegions.Canvas.FrameRect(RectToShapeRect(UseShape, Ocr.Regions[i], ShapeRect));
    // Text lines
    ImageTextLines.Picture.Bitmap.Assign(Image.Picture.Graphic);
    ImageTextLines.Picture.Bitmap.PixelFormat := pf32bit;
    ImageTextLines.Canvas.Brush.Color := TColor($00FF7F);
    for i := 0 to Ocr.TextLineCount - 1 do
      ImageTextLines.Canvas.FrameRect(RectToShapeRect(UseShape, Ocr.TextLines[i], ShapeRect));
    // Connected components
    ImageComponents.Picture.Bitmap.Assign(Image.Picture.Graphic);
    ImageComponents.Picture.Bitmap.PixelFormat := pf32bit;
    ImageComponents.Canvas.Brush.Color := TColor($FF7F00);
    for i := 0 to Ocr.ConnectedComponentCount - 1 do
      ImageComponents.Canvas.FrameRect(RectToShapeRect(UseShape, Ocr.ConnectedComponents[i], ShapeRect));
    // Paragraphs (each paragraph exposes its rectangle as Location)
    ImageParagraphs.Picture.Bitmap.Assign(Image.Picture.Graphic);
    ImageParagraphs.Picture.Bitmap.PixelFormat := pf32bit;
    ImageParagraphs.Canvas.Brush.Color := TColor($7FFF00);
    for i := 0 to Ocr.ParagraphCount - 1 do
      ImageParagraphs.Canvas.FrameRect(Ocr.Paragraphs[i].Location);
  end;

If you click the [Cancel] button while text recognition is running, the CancelRequest flag is set to True, the code above is skipped, and no results are shown in the right panel.
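The five blocks in the overlay listing repeat the same three setup steps before drawing. As one possible tidy-up, those steps could be moved into a small helper. This is a sketch rather than part of the demo: the unit name OverlayHelpers and the procedure name PrepareOverlay are hypothetical, while the VCL calls inside are the same ones the listing already uses.

```delphi
unit OverlayHelpers; // hypothetical unit name

interface

uses
  Vcl.Graphics, Vcl.ExtCtrls;

// Copies Source into Target as a 32-bit bitmap and selects the brush colour
// that TCanvas.FrameRect will use for the frames drawn afterwards.
procedure PrepareOverlay(Target: TImage; Source: TGraphic; FrameColor: TColor);

implementation

procedure PrepareOverlay(Target: TImage; Source: TGraphic; FrameColor: TColor);
begin
  Target.Picture.Bitmap.Assign(Source);
  Target.Picture.Bitmap.PixelFormat := pf32bit;
  Target.Canvas.Brush.Color := FrameColor;
end;

end.
```

With such a helper, the word overlay would reduce to a call like PrepareOverlay(ImageWords, Image.Picture.Graphic, TColor($007FFF)) followed by the existing loop over Ocr.WordCount.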

The code for this OCR demo can be downloaded here.

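Supplement: Settings for Japanese Recognition

This OCR component uses the Tesseract OCR engine. Tesseract supports many languages, and with the Japanese language data (tessdata) it can also recognize Japanese. To recognize Japanese in the demo application, carry out the following additional steps.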

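Downloading the Japanese tessdata Files

Download the following two Japanese language data files from the GitHub repository and copy them into the demo application's execution folder. Note that the demo code shown earlier sets Ocr.DataPath to the tessdata subfolder of the executable's folder, so the files need to end up in whichever folder DataPath points to.

　jpn.traineddata (for horizontal text)
　jpn_vert.traineddata (for vertical text)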

OCR Component Property Settings

Set the following properties of the OCR component.

Property	Value
Language	lgCustom
LanguageCode	jpn+jpn_vert
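The same settings could also be applied in code instead of the Object Inspector. The sketch below uses only the property names and values given in this article. The form class TFormMain and the choice of the OnCreate handler are assumptions, and so is the ordering: the article does not say whether the language can be changed while the component is active, so the properties are set before Active as a conservative choice.

```delphi
// Sketch only: TFormMain is a hypothetical form that owns the Ocr component.
procedure TFormMain.FormCreate(Sender: TObject);
begin
  // Folder that holds jpn.traineddata and jpn_vert.traineddata
  Ocr.DataPath := ExtractFilePath(Application.ExeName) + 'tessdata';

  // lgCustom tells the component to take the language from LanguageCode
  Ocr.Language := lgCustom;

  // Tesseract joins several languages with '+':
  // horizontal Japanese plus vertical Japanese
  Ocr.LanguageCode := 'jpn+jpn_vert';

  Ocr.Active := True;
end;
```

If the demo's existing activation code is kept, only the Language and LanguageCode assignments are needed, placed before the line that sets Ocr.Active to True.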

The OCR component is a WinSoft product. To use the features described in this article, you need to purchase the OCR component from the WinSoft website. Support for the OCR component is provided by WinSoft.
