Embarcadero RAD Studio, Delphi, & C++Builder Blogs

This API Adds Machine Learning Computer Vision To Your App


Microsoft’s Azure has a broad collection of services you can access with an easy-to-use API.

Azure is Microsoft’s cloud hosting and computing platform with a catalog of more than 200 different products, including products that let you implement Machine Learning services. Each of those services has an API you can access using client libraries or a plain REST client. Delphi takes this ease of use one step further by providing a TAzureConnectionInfo component to implement some of those services quickly and with a minimal amount of code. We can also access the services using Delphi’s built-in REST client components.

Is Azure Read Client Free?

No, the Azure Read Client is not free, but the good news is that when you first register you get a number of services free for 12 months, plus a free-tier allowance so you can try things out as you develop, test, and launch your app. Some Azure services have very generous “always free” tiers. The service we are going to use in our OCR application is “Cognitive Search”, which is one of the always-free services. It has a permanent limit of 10,000 documents, but that’s more than enough for our testing purposes.

How do we get API credentials for the “Computer Vision” service resources?

For our OCR application, we are going to use the “Cognitive Services->Computer Vision” resource. First, you must have an “Azure subscription”. Go to the link below and create an “Azure subscription” if you don’t have one. It’s free to start!

https://azure.microsoft.com/en-us/free/cognitive-services/

Once you have the subscription, you can create the required service links. Go to this link to create a “Computer Vision” resource.

https://portal.azure.com/#create/Microsoft.CognitiveServicesComputerVision

Make sure you select the correct region, because you can’t go back and change it later. Then go to the resource you created, choose “Keys and Endpoint” from the left-hand menu, and copy one of the keys and the location. We need those in our application.

How do we connect to the Cognitive Services REST API?

We need TRESTClient, TRESTRequest, and TRESTResponse components to connect to the Cognitive Services REST API. Let’s drag and drop a TRESTClient onto the form and make some basic property changes: make sure the “Accept” property is set to “application/json”, and set the content type to “application/json” as well.

Then drop a TRESTRequest component onto the form and set its Client property to the client we created earlier. Set the method to “rmPOST”.

Finally, place a TRESTResponse component and point the Response property of the request component at this response object.

Add some edit boxes, buttons, and a memo to complete our interface.
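If you prefer to configure the components in code rather than in the Object Inspector, the setup above amounts to something like this. The component names (RESTClient1, RESTRequest1, RESTResponse1) are placeholders; use whatever names your form gives them:

```delphi
uses
  REST.Client, REST.Types;

procedure TForm1.ConfigureRest;
begin
  // Both headers must be application/json for the Computer Vision API
  RESTClient1.Accept := 'application/json';
  RESTClient1.ContentType := 'application/json';

  // Wire the request to the client and the response object
  RESTRequest1.Client := RESTClient1;
  RESTRequest1.Response := RESTResponse1;
  RESTRequest1.Method := rmPOST;
end;
```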

How do we use the API to post the image for processing?

We cannot post the image to Cognitive Services and get the result in one call. The image processing takes time, although usually less than five seconds. So first we need to submit our image to the service, read the “Operation-Location” header from the response, and then check the status at that location until it’s “succeeded”. If the server is still processing the image, the status will be “running” instead.

For each and every call to the Cognitive Services API, we need to provide our subscription key through an HTTP header. It’s the key we copied earlier from the resource we created. To do that, add a new parameter to the TRESTRequest component, set its Kind to “pkHTTPHEADER”, its name to “Ocp-Apim-Subscription-Key”, and its value to your key.
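Done in code, adding that header parameter looks something like this (the component name and the key literal are placeholders for your own):

```delphi
uses
  REST.Types; // pkHTTPHEADER, poDoNotEncode

// Attach the subscription key header to every request.
// poDoNotEncode prevents the key value from being URL-encoded.
RESTRequest1.Params.AddItem('Ocp-Apim-Subscription-Key',
  'YOUR_SUBSCRIPTION_KEY', pkHTTPHEADER, [poDoNotEncode]);
```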

We must provide the URL of the image we want to process as a JSON body. To do that, add a new parameter named “data”. Set its content type to “ctAPPLICATION_JSON”, its kind to “pkGETorPOST”, and its value to something like this:

{"url": "https://example.com/your-image.jpg"}
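Equivalently in code, assuming the same hypothetical component name as above:

```delphi
uses
  REST.Types; // pkGETorPOST, ctAPPLICATION_JSON

// The Read API expects a JSON body containing the image URL
RESTRequest1.Params.AddItem('data',
  '{"url": "https://example.com/your-image.jpg"}',
  pkGETorPOST, [], ctAPPLICATION_JSON);
```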

We also need to set the base URL of our REST client. To build it, we need the location of the resource we copied earlier. Replace “LOCATION” with the location of your resource.

https://LOCATION.api.cognitive.microsoft.com/vision/v3.2/read/analyze

(The version segment, v3.2 here, depends on the current Read API release.)

Now you are ready to post the request. Make sure the request method is “rmPOST” and execute it. If the request succeeds without any error, you will get a response with empty content but with some headers. The header we need for the next request is “Operation-Location”: this is the URL where our results will be available once processing is finished. As mentioned earlier, the request can take some time (usually 1-3 seconds) to process, but it’s hard to predict, so we must check whether it’s finished or not.

How do we get the OCR results from the Operation-Location?

To request the OCR reading results, we need to send a GET request to the “Operation-Location” URL, and we need to pass the subscription key in the header as before, so create the same header parameter again. Now send the GET request and you will get a response with a JSON body. If the “status” value is “running”, the image is still being processed and you have to request the results again later. If it is “succeeded”, you will get the OCR reading in JSON format. All you need to do is parse the JSON and you’re done!
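A minimal polling handler for that GET request might look like this. The component names, the FOperationLocation field (saved from the POST response), and the memo control are assumptions for illustration:

```delphi
procedure TForm1.btnGetResultClick(Sender: TObject);
var
  Status: string;
begin
  // Point the client at the Operation-Location URL from the POST response
  RESTClient1.BaseURL := FOperationLocation;
  RESTRequest1.Method := rmGET;
  RESTRequest1.Execute; // the subscription-key header parameter is still attached

  Status := RESTResponse1.JSONValue.GetValue<string>('status');
  if Status = 'succeeded' then
    Memo1.Lines.Text := RESTResponse1.Content // full JSON including the OCR lines
  else if Status = 'running' then
    ShowMessage('Still processing - try again in a moment');
end;
```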

You can download the demo application source code from this GitHub link.

https://github.com/checkdigits/OCRReadClient_example

Also, the source code for the post button looks something like this (component names are whatever you used on your form):

procedure TForm1.btnPostClick(Sender: TObject);
begin
  RESTRequest1.Method := rmPOST;
  RESTRequest1.Execute;
  // A 202 Accepted response carries the Operation-Location header
  if RESTResponse1.StatusCode = 202 then
    FOperationLocation := RESTResponse1.Headers.Values['Operation-Location'];
end;

Are you ready to add computer vision to your applications?
