OCR Segmentation in MVTec HALCON

Published on October 18, 2019

This post, OCR Segmentation in MVTec HALCON, is the second in a series of seven posts from Pushing OCR Performance with MVTec HALCON: 1, 2, 3, 4, 5, 6, 7.

Segmentation, the OCR assistant's second tab (Fig. 1, below), is used to optimize the parameters for character segmentation. Here the relevant parameters (e.g. text polarity, basic symbol appearance, symbol shape, and symbol size) can be adjusted. Please note: symbol width and height are not maximum values but the approximate size of an uppercase character, measured in pixels.

Fig. 1: OCR assistant

Keep in mind that all Segmentation tab settings refer to segmentation only. ALL UPPER CASE (listed under Symbol Shape) therefore only concerns the expected segmentation regions and does not influence the classification part of OCR. With this option selected, all letters are expected to have approximately the height given in symbol height, and areas below the baseline might, in general, be discarded from segmentation.

Min. Fragment Size can be adjusted to exclude small noisy or cluttered regions that would otherwise be recognized as punctuation or separators. This value, given in pixels, defines the minimum size a fragment must have to be considered an individual symbol. To avoid losing desirable fragments such as the tittle (dot) on the letters i or j, the option Connect Fragments should be checked. This merges the tittle and the stem of the letter i into one region.
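The interplay of the two options can be illustrated with a minimal Python sketch. It is not HALCON code and does not reproduce HALCON's internal logic: the `Fragment` type, the function names, the single-text-line assumption, and the choice to merge before filtering are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Bounding box (inclusive pixel coordinates) and pixel area of one blob."""
    x0: int
    y0: int
    x1: int
    y1: int
    area: int


def overlaps_horizontally(a, b):
    """True if the column ranges of the two fragments intersect."""
    return a.x0 <= b.x1 and b.x0 <= a.x1


def merge(a, b):
    """Union of two bounding boxes; the areas are added."""
    return Fragment(min(a.x0, b.x0), min(a.y0, b.y0),
                    max(a.x1, b.x1), max(a.y1, b.y1), a.area + b.area)


def segment_symbols(fragments, min_fragment_size, connect_fragments):
    """Hypothetical helper: optionally merge fragments stacked above each
    other (single text line assumed), then drop everything that is still
    smaller than min_fragment_size."""
    if connect_fragments:
        symbols = []
        for frag in sorted(fragments, key=lambda f: f.x0):
            if symbols and overlaps_horizontally(symbols[-1], frag):
                symbols[-1] = merge(symbols[-1], frag)
            else:
                symbols.append(frag)
    else:
        symbols = list(fragments)
    return [s for s in symbols if s.area >= min_fragment_size]


# Letter "i": a stem, a small tittle above it, and an unrelated noise blob.
stem = Fragment(10, 8, 12, 30, 60)
tittle = Fragment(10, 2, 12, 4, 9)
noise = Fragment(40, 20, 41, 21, 4)

without_connect = segment_symbols([stem, tittle, noise], 20, False)
with_connect = segment_symbols([stem, tittle, noise], 20, True)
```

In this sketch, the size filter alone discards the tittle together with the noise blob, whereas connecting first keeps the tittle as part of the merged region for the letter i.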

To facilitate segmentation, a range of angles can also be defined to correct the orientation of the text line. Internally, Line Orientation is checked in a range between the minimum and maximum values specified. Take care in selecting suitable values here, since this range influences the run time of the calculation. Symbol Slant works similarly, but corrects characters that are themselves slanted relative to the baseline.
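To see why the angle range affects run time, consider a minimal Python sketch of an orientation search. It assumes a simple projection-profile approach in which every candidate angle in the range is evaluated once; this is a common technique chosen for illustration, not necessarily what HALCON does internally, and all function names are hypothetical.

```python
import math
from collections import Counter


def projection_score(points, angle_deg):
    """Rotate the points by -angle_deg and score how sharply they fall into rows."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    rows = Counter(round(-x * sin_a + y * cos_a) for x, y in points)
    # The sum of squared row counts is highest when the pixels pile into few rows.
    return sum(n * n for n in rows.values())


def estimate_line_orientation(points, angle_min_deg, angle_max_deg, step_deg=1.0):
    """Return the candidate angle (in degrees) with the best projection score.

    Every candidate costs one pass over all points, so a wider range
    (or a finer step) means proportionally more work in this sketch.
    """
    steps = int(round((angle_max_deg - angle_min_deg) / step_deg))
    candidates = [angle_min_deg + i * step_deg for i in range(steps + 1)]
    return max(candidates, key=lambda ang: projection_score(points, ang))


# Synthetic "text line": a bar four pixels thick, tilted by 5 degrees.
tilt = math.tan(math.radians(5.0))
line_points = [(x, x * tilt + t) for x in range(200) for t in range(4)]

best_angle = estimate_line_orientation(line_points, -10.0, 10.0)
```

On this synthetic input the search should recover the 5 degree tilt. With a range of -10 to 10 degrees it evaluates 21 candidates; doubling the range roughly doubles the cost of the sketch.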

Some advanced parameters for defining the text layout, such as Max. Number of Lines and Line Structure, can also be set. In general, these parameters narrow or expand the segmented regions. If information about the characteristics of the text is available, it can be helpful to set these parameters accordingly.

In the field Inspection (bottom of the Segmentation tab), the individual segmentation steps show how line orientation and slant are corrected and how the foreground and symbols are extracted.

Returning to the sample butter wrapper, the appropriate settings could be similar to the ones pictured in Fig. 1 (above, left). The characters are composed of individual dots which are relatively large and are all uppercase. The symbols are slightly slanted, which is corrected in a range from -14 to 10 degrees.
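As a rough illustration of slant correction over such a range, the following Python sketch searches -14 to 10 degrees for the shear that makes vertical strokes line up in columns. The shear-and-project approach, the sign convention for the slant angle, the function names, and the synthetic 8 degree test data are assumptions made for illustration; they are not taken from HALCON or from the butter wrapper image.

```python
import math
from collections import Counter


def column_score(points, slant_deg):
    """Undo a shear of slant_deg and score how sharply the points fall into columns."""
    shear = math.tan(math.radians(slant_deg))
    cols = Counter(round(x - y * shear) for x, y in points)
    # Upright strokes concentrate their pixels in few columns, raising the score.
    return sum(n * n for n in cols.values())


def estimate_slant(points, slant_min_deg, slant_max_deg, step_deg=1.0):
    """Return the candidate slant (in degrees) with the best column score."""
    steps = int(round((slant_max_deg - slant_min_deg) / step_deg))
    candidates = [slant_min_deg + i * step_deg for i in range(steps + 1)]
    return max(candidates, key=lambda s: column_score(points, s))


# Synthetic symbols: five strokes, 40 pixels tall, slanted by 8 degrees
# (slant sign follows this sketch's own convention: x grows with y).
true_shear = math.tan(math.radians(8.0))
stroke_points = [(x0 + y * true_shear, y)
                 for x0 in range(0, 60, 12) for y in range(40)]

slant = estimate_slant(stroke_points, -14.0, 10.0)
```

The search range has to contain the actual slant for the correction to succeed, which is why an asymmetric range such as -14 to 10 degrees can be a reasonable choice when the direction of the slant is roughly known.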

The next post examines the selection of suitable classifiers from the OCR assistant's OCR Classifier tab. (Please see navigation at the top of this page for additional posts.)


Post published by TIS Marketing on October 18, 2019.
