Selecting Suitable Classifiers in MVTec HALCON

Published on October 22, 2019

This post, Selecting Suitable Classifiers in MVTec HALCON, is the third in a series of seven posts from Pushing OCR Performance with MVTec HALCON: 1, 2, 3, 4, 5, 6, 7.

After successful segmentation, the next step is to select a suitable classifier for the OCR (OCR Classifier tab, Fig. 1 below). HALCON ships with an extensive list of pretrained classifiers, which may seem overwhelming at first. Fortunately, the available classifiers are sorted by font and character set; the character set can be displayed by clicking on the magnifying lens icon (top right). Each classifier comes in two versions: one trained without a rejection class (_NoRej) and one trained with a rejection class (_Rej). Classifiers trained with a rejection class are able to distinguish characters from noise or background clutter. As a result, a symbol that cannot be classified as one of the characters in the character set is classified as "not assignable" and is put into the rejection class.

Fig. 1: OCR assistant: OCR Classifier tab
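
The naming scheme described above (font, optional character set, then _Rej or _NoRej) can be sketched in a few lines of Python. This is only an illustration of the convention, not part of HALCON: the helper names `parse_classifier_name` and `select` are hypothetical, and the example names are assumptions modelled on the suffixes mentioned in the text, so the names in an actual installation may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClassifierName:
    font: str               # e.g. "DotPrint" or "Universal"
    charset: Optional[str]  # None: no character-set suffix (whole symbol range)
    rejection: bool         # True for _Rej, False for _NoRej


def parse_classifier_name(name: str) -> ClassifierName:
    """Split an assumed '<Font>[_<Charset>]_<Rej|NoRej>' name into its parts."""
    parts = name.split("_")
    if len(parts) < 2 or parts[-1] not in ("Rej", "NoRej"):
        raise ValueError(f"unexpected classifier name: {name!r}")
    charset = "_".join(parts[1:-1]) or None
    return ClassifierName(font=parts[0], charset=charset,
                          rejection=parts[-1] == "Rej")


def select(names: List[str], font: str, rejection: bool) -> List[str]:
    """Keep only the names that match the given font and rejection setting."""
    matches = []
    for name in names:
        parsed = parse_classifier_name(name)
        if parsed.font == font and parsed.rejection == rejection:
            matches.append(name)
    return matches


# Assumed example names; check the list shown in the OCR Classifier tab.
available = [
    "DotPrint_Rej", "DotPrint_NoRej",
    "DotPrint_0-9A-Z_Rej", "DotPrint_0-9A-Z_NoRej",
    "Universal_Rej", "Universal_NoRej",
]
dot_print_with_rejection = select(available, font="DotPrint", rejection=True)
```

With the assumed list above, `dot_print_with_rejection` contains the two Dot Print entries that end in _Rej: the one without a character-set suffix and the one restricted to 0-9A-Z.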

For dot matrix print, either the Dot Print classifier or the Universal classifier can be used. Since the sample image uses the whole range of symbols, a classifier whose name does not end in a specific symbol set can be selected. The effect of rejection-class-trained classifiers becomes apparent when a classifier for a smaller set of symbols is used. For example, if the rejection-class-trained classifier for Dot Print is trained on a symbol set consisting of the letters A-Z and the numbers 0-9, symbols such as colon, point and slash are classified into the rejection class (Fig. 2). Fig. 3 shows what happens when the classifier is trained without a rejection class: every symbol is assigned to one of the character classes, even if the confidence is very low.

<b>Fig. 2 (above, left):</b> <i>Classification via rejection-class-trained classifier and character set A-Z 0-9: symbols like colon, point and slash are classified as rejected.</i> <b>Fig. 3 (above, right):</b> <i>Classification via classifier without rejection class and character set A-Z 0-9: each symbol is assigned to the character class with the highest confidence, even if this confidence is very low.</i>
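
The difference between Fig. 2 and Fig. 3 can be illustrated with a toy example in Python. This is a minimal sketch, not HALCON's implementation: it assumes the classifier returns one confidence per class and picks the highest, and it models the rejection class as simply one more class. The label `REJECTED` and all confidence values are invented for illustration.

```python
from typing import Dict, Tuple

REJECTED = "<rejected>"  # hypothetical label for the rejection class


def classify(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the class with the highest confidence and that confidence."""
    label = max(scores, key=scores.get)
    return label, scores[label]


# Invented confidences for a colon, which is not in the set A-Z 0-9.
# Without a rejection class, only character classes are available, so the
# best of several poor matches wins.
colon_norej = {"1": 0.21, "I": 0.18, "8": 0.05}

# With a rejection class, the classifier has an extra class for clutter and
# unknown symbols, which can absorb the colon.
colon_rej = {"1": 0.08, "I": 0.07, "8": 0.02, REJECTED: 0.83}

label_norej, conf_norej = classify(colon_norej)
label_rej, conf_rej = classify(colon_rej)
```

In this sketch the classifier without a rejection class reports the colon as "1" with a low confidence, while the classifier with a rejection class reports it as rejected. When a _NoRej classifier is used, it can therefore be worth inspecting the confidence values before trusting a result.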

The next post takes a look at the Results tab. (Please see navigation at the top of this page for additional posts.)



Post published by TIS Marketing on October 22, 2019.
