Apps Google vision API: tune textBlock detection

Bernard D · Jul 18, 2019

Hello, I am writing an app that requires some OCR from the camera. I am currently using the Google Vision API, as it seems to be the best option, so I'm writing this question with that library in mind, but I am open to using another API.

One of my goals is to get the text from a sheet of paper like this one:

When the camera is correctly focused on the paper, I would like to get a single text block with seven lines. From these seven strings of text I will be able to extract the data I need.

My issue is that I never get a single text block. It looks like the space between the columns is too big, so I get 3 or sometimes 4 text blocks, from which I can't correctly extract the full lines.

The only workaround I have in mind for now is to break each textBlock down to its 'word' elements and use the position information of each word element to rebuild each line. But this seems like a lot of work.
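To make that workaround concrete, here is a minimal sketch in plain Python, not tied to the Vision API itself. It assumes every word element exposes its text and a bounding box; the `Word` class, the `group_into_lines` function and the `tolerance` parameter are hypothetical names I made up for illustration. Words whose vertical centers are close together are put on the same line, then sorted left to right.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Word:
    """Hypothetical stand-in for one OCR word element and its bounding box."""
    text: str
    left: float    # x coordinate of the left edge of the box
    top: float     # y coordinate of the top edge of the box
    height: float  # height of the box

    @property
    def center_y(self) -> float:
        return self.top + self.height / 2


def group_into_lines(words, tolerance=0.5):
    """Rebuild text lines from words, ignoring the original text blocks.

    A word joins the current line when its vertical center is closer than
    `tolerance` times the median word height to that line's mean center.
    """
    if not words:
        return []
    max_gap = tolerance * median(w.height for w in words)

    lines = []  # each entry: {"center": mean center_y, "words": [Word, ...]}
    for word in sorted(words, key=lambda w: w.center_y):
        if lines and abs(word.center_y - lines[-1]["center"]) < max_gap:
            line = lines[-1]
            line["words"].append(word)
            # Keep the line center as the mean of its words' centers.
            line["center"] = sum(w.center_y for w in line["words"]) / len(line["words"])
        else:
            lines.append({"center": word.center_y, "words": [word]})

    # Within each line, order the words from left to right.
    return [
        " ".join(w.text for w in sorted(line["words"], key=lambda w: w.left))
        for line in lines
    ]


if __name__ == "__main__":
    # Made-up sample: two lines whose columns were split into separate blocks.
    sample = [
        Word("08:00", 120, 11, 20),
        Word("MO", 10, 10, 20),
        Word("TU", 10, 40, 20),
        Word("12:00", 230, 9, 20),
        Word("09:30", 120, 41, 20),
    ]
    for text_line in group_into_lines(sample):
        print(text_line)
```

This only works if the paper is roughly level in the frame; a tilted photo would probably need the boxes to be deskewed first, and the `tolerance` value is a guess that would need tuning on real captures.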

Is there any way to change the parameters of the Google Vision API so that I can 'increase' the amount of white space allowed before the text is split into another textBlock? Or is there any other tuning that could make it work the way I want, maybe using the black lines or something similar? Note that there are always 7 lines on the paper, and they always start with the same letters as in my picture. Each line may have one or two columns with hours, or none, as in the last line of my picture.

Thanks!
