Glare and reflections are a common problem in image processing. Given the reflective properties of various materials under polarized light, The Imaging Source's GigE and USB polarization cameras can be used to solve inspection tasks where conventional cameras provide limited visual information. Degree of Linear Polarization (DoLP) processing of polarization image data improves image contrast and allows for more effective image segmentation.
The simple integration of the polarization image data into HALCON enables these properties to be easily used in image processing applications, such as completeness checks, surface inspections, etc.
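As a rough illustration of what DoLP processing computes, the following Python sketch derives the degree of linear polarization for a single pixel from four intensity readings. It assumes the sensor delivers intensities behind polarizers at 0, 45, 90 and 135 degrees and uses the textbook Stokes-parameter definition; the function name dolp and its arguments are hypothetical and are not part of the camera SDK or HALCON.

```python
import math

def dolp(i0, i45, i90, i135):
    """Degree of linear polarization for one pixel (0 = unpolarized, 1 = fully polarized).

    Assumption: the four arguments are intensities measured behind
    polarizers oriented at 0, 45, 90 and 135 degrees.
    """
    s0 = (i0 + i45 + i90 + i135) / 2.0  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal components
    if s0 == 0:
        return 0.0                      # avoid division by zero on black pixels
    # Clamp to 1.0 because sensor noise can push the ratio slightly above it.
    return min(1.0, math.hypot(s1, s2) / s0)

unpolarized = dolp(50, 50, 50, 50)
fully_polarized = dolp(100, 50, 0, 50)
```

Pixels on glare-free, diffusely reflecting surfaces tend towards low values, while specular reflections tend towards high values, which is what makes the DoLP image useful for segmentation.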
The last post covered the training of a custom classifier tailored to particular application requirements. The focus of this post will be on using advanced parameters when training a custom classifier.
Since the classifier should be invariant with respect to scaling, symbol training regions (and later on, the ones for classification) are scaled to a fixed size. This size can be defined with the parameters Pattern Width and Pattern Height found in the Basic Features section (Fig. 1). If the scaling is too small, important information might not be included and result in underfitting. Conversely, too-large scaling could result in over-specification (overfitting) and increase calculation time for both training and classification.
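To make the effect of Pattern Width and Pattern Height more concrete, here is a minimal Python sketch that rescales a symbol (given as a nested list of gray values) to a fixed pattern size. It uses nearest-neighbor sampling as a simple stand-in; how HALCON interpolates internally is not covered in this post, and the function name scale_to_pattern is hypothetical.

```python
def scale_to_pattern(img, pattern_width, pattern_height):
    """Rescale a 2D list of gray values to a fixed size (nearest neighbor).

    Assumption: nearest-neighbor sampling is used only for illustration.
    """
    src_h, src_w = len(img), len(img[0])
    return [
        [
            img[(r * src_h) // pattern_height][(c * src_w) // pattern_width]
            for c in range(pattern_width)
        ]
        for r in range(pattern_height)
    ]

symbol = [
    [1, 2],
    [3, 4],
]
upscaled = scale_to_pattern(symbol, 4, 4)
downscaled = scale_to_pattern(upscaled, 2, 2)
```

Every symbol ends up with the same number of pixels regardless of its original size, so the feature vector has a constant length. A very small pattern discards detail, while a very large one multiplies the number of features the classifier has to process.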
Gray Values, Symbol Region or Gradient information can be selected under Basic Features. Additional features for the classifier (e.g. ratio, anisometry or convexity) can be selected in Advanced Training Parameters and Features.
In Advanced Training Parameters and Features, the classifier type can also be changed. The available approaches range from support-vector machines (SVM) to k-nearest neighbor (k-NN) and Hyperbox classifiers. (In-depth information on classifiers is discussed at The Imaging Source's regularly occurring Machine Learning in HALCON seminars.) For the purposes of this post, a multilayer perceptron (MLP) will be used as the default classifier. Often, the default parameters deliver satisfactory results. Therefore, the default classifier should only be changed if it is certain that misclassifications are a result of the underlying classifier structure.
Since multiple samples are usually necessary for a good training set, more than one image might be required. After adding samples to the training data and saving the training file, it is always possible to go back to the Setup tab and load a new image and a new region.
Best case, the segmentation will not need to be adapted to the new image and the teaching process can begin again. It is possible that the classifier (already trained on the previous samples) suggests the right classifications. These suggestions can be inserted into the text field by clicking the blue arrow underneath the text field. Training has to be executed again with the whole training file.
The Training File Browser (Fig. 2), which is available by clicking on the right symbol next to Selection of Training File, allows you to inspect the individual samples. It not only shows every sample with its segmented region, class and size but also enables the testing of a classifier on this data. If training samples are rare and also occur in rotated or scaled versions, it could also be useful to augment the data accordingly and add new samples to the training data set. To generate variations of the data, select the corresponding option in the Edit menu.
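The idea behind augmenting rare samples with rotated versions can be sketched in a few lines of Python. This is a conceptual illustration only, assuming a symbol stored as a nested list of pixel values and using nearest-neighbor inverse mapping; it does not reproduce what the Training File Browser does internally, and the function names are hypothetical.

```python
import math

def rotate_nearest(img, angle_deg):
    """Rotate a 2D list of pixel values about its center (nearest neighbor).

    Pixels that map outside the source image are filled with 0.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Inverse mapping: find the source pixel for each target pixel.
            dy, dx = r - cy, c - cx
            sc = round(cos_a * dx + sin_a * dy + cx)
            sr = round(-sin_a * dx + cos_a * dy + cy)
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = img[sr][sc]
    return out

def augment(img, angles):
    """Return one rotated variant of the sample per angle."""
    return [rotate_nearest(img, angle) for angle in angles]

sample = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
variants = augment(sample, [0, 180])
```

In practice, small angles that match the rotation expected in the application are more useful than large ones, since every variant keeps the class label of the original sample.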
If the training symbols and classification results on the training set are satisfactory, the classifier can be used like any other pre-trained classifier. Classifier training can also be performed without the assistant: the Code Generation tab lets you insert the corresponding code. If samples should be added selectively (in a semi-automatic way) rather than all at once, the desired training file can be extended with append_ocr_trainf.
This series of eight posts covered a simple way of getting started with OCR (e.g. basic operators in working with text model readers and how to create and modify a training file). If you have any questions feel free to contact us or visit one of our training courses.
Please click here to download image.
Previous posts covered working with the OCR assistant and the underlying methodology of a text model reader. This post focuses on the classification aspects of optical character recognition and, most importantly, the training of a custom classifier.
In some cases, the pre-trained fonts or symbol sets available in HALCON are unsuitable for the text at hand or do not deliver satisfactory results. Under these circumstances, an option to be considered is training a custom classifier tailored to the application's needs.
HALCON's OCR Assistant provides options for creating so-called training files. Training files contain information about the symbol region's ground truth class as well as information about their width and height.
The successful implementation of the segmentation step is crucial to the creation of such a training file (or when adding symbols with their respective class). This means that the image and the region containing the symbols have been defined and the correct segmentation parameters have been set. Please note: Quick Setup can be used to perform these tasks.
After successful segmentation, teaching works as follows: By clicking into the text field (fig. 2, above), the first symbol for labeling is displayed in the image field (right). This process should be repeated with every symbol which needs to be taught, one after the other, by typing the corresponding class name into the text field. If all symbols are labeled with their ground truth class, the information can be added to the training data by clicking the button add to training data.
The number of samples is automatically adapted in the next section and training of the classifier is enabled. The classifier filename and status give some basic information about training success (e.g. the classifier's confidence score if it is applied on the training data set). You can click on Train Now to start the training process.
The next post covers the training of custom classifiers, the creation of a training file and the usage of the Training File Browser in HDevelop. (Please see navigation at the top of this page for additional posts.)
Please click here to download image.
This post gives a short overview of the different approaches for optical character recognition (OCR) in HDevelop. The generated code from the previous post will be examined in detail with a view to working with text model readers. (The download link at the bottom of this post provides the generated code).
In general, the code generated by the OCR assistant is based on text model readers and consists of the following steps:
While additional steps are performed in between those listed above to ensure consistent border and domain handling, these operations do not directly affect OCR itself and so are not covered here. Looking at step 1 (creation of text model reader), a text model reader creates a text model describing the text to be segmented. Basically, there are two modes which can be used to create a text model: auto and manual. In the generated code, the manual mode has been used. Due to the available options in the assistant, the manual mode is always selected when generating code automatically.
One major advantage of the auto mode is its ability to pass a classifier when creating the text model--simultaneously performing segmentation and classification in one step. Retrieving results is also slightly more convenient in this mode. Additionally, a range for the anticipated width and height of segmented characters can be defined. The auto mode offers an advanced functionality--namely, the segmentation of dot-matrix characters which constitutes another advantage here.
While it's possible to segment dot-matrix print with a model created in manual mode, there are fewer parameters with which to define this behavior. In contrast to the auto mode, manual mode does not allow for the definition of a range for character width and height but rather only an approximate size. The manual mode isn't all bad though: it is able to segment imprinted letters and restrict segmentation to uppercase letters. An option to define segmentation behavior regarding small fragments (which might be discarded as clutter or noise) is an additional advantage offered in manual mode.
There are some additional differences between the two modes, but the choice of mode is always highly dependent upon the specific application and necessary restrictions for character segmentation. The manual mode may be the one selected when using the assistant, but most HALCON guides recommend auto mode.
After selecting the right mode for the text model reader, crucial parameters are defined. In general, every parameter available for the manual mode is characterized by the prefix manual_; all remaining parameters apply to auto mode.
With step 1 complete, it's time to look at step 2 (reading of OCR classifier). For manual mode it is necessary to read a suitable classifier manually using the operator read_ocr_class_mlp. The ending mlp indicates that the underlying classifier type is a multilayer perceptron which is one of the common approaches for OCR in HALCON. (To learn more about the different types of classifiers used in HALCON, consider attending one of our regular training courses on Machine Learning in HALCON.) Options for various pretrained classifiers were discussed in a previous blog post. In auto mode, a suitable classifier can be passed directly when creating a text model reader; in this case, step 2 is not necessary and can be skipped.
To finally perform segmentation of the text visible in the input image, the operator find_text is applied. This returns a result handle with information about the segmented regions and, if auto mode was used, information about the classification results. These results can then be accessed via the operators get_text_object to address segmented regions and get_text_result to address classification results.
If segmentation and classification are performed separately, the operators do_ocr_multi_class_mlp or do_ocr_single_class_mlp can be used for classification of segmented regions. The multi-class operator is able to process multiple regions but only yields the single best class for every region. The single-class operator can only process a single region at a time but provides information about alternative classifications and confidences. Applying the classifier to multiple regions at once is slightly faster than iteratively executing do_ocr_single_class_mlp on the individual regions.
To sum up: text model readers define a text model that describes the text to be segmented. There are two different modes, auto and manual, which both have advantages and disadvantages. After segmentation, classification has to be performed on the individual regions which were segmented using the text model.
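The difference between the two kinds of result can be pictured with a small Python sketch. It does not call HALCON; it assumes a classifier has already produced a confidence per class for each segmented region, and the function names and scores are made up for illustration.

```python
def classify_multi(scores_per_region):
    """Multi-class style: one best class per region, for all regions at once."""
    return [max(scores, key=scores.get) for scores in scores_per_region]

def classify_single(scores, num_alternatives=2):
    """Single-class style: the best classes of ONE region with their confidences."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:num_alternatives]

# Hypothetical confidences for two segmented regions.
regions = [
    {"8": 0.55, "B": 0.40, "3": 0.05},
    {"O": 0.48, "0": 0.47, "Q": 0.05},
]

best_classes = classify_multi(regions)
alternatives = [classify_single(scores) for scores in regions]
```

For the second region, the multi-class result hides how close the decision between "O" and "0" was, whereas the single-class result exposes both candidates so that a later step can resolve the ambiguity.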
Sometimes, pre-trained classifiers do not yield satisfying results in which case custom classifiers become necessary--a topic which will be discussed in subsequent posts. These posts also cover the creation of a training file and usage of the training file browser in HDevelop.
Please click here to download generated code from the previous post.
In addition to using one of the pre-trained classifiers, custom classifiers can be built by creating a training file and selecting suitable features. More often than not, however, it is enough to use one of the pretrained classifiers. (The creation of a training file deserves its own blog post, and so will be addressed as a separate post in this blog.)
After fine-tuning the segmentation and selecting a suitable classifier, the last two steps of setting up an OCR application are rather easy. To define which features should be displayed in the results table (Results tab, Fig. 2), select parameters and options in Display Parameters. Postprocessing of the classification results is also possible by enabling Word Processing. The classified sequence of characters is then compared to a regular expression or words in a lexicon file.
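How a regular expression can correct a classification result is sketched below in Python. The sketch assumes that each character position comes with a list of candidate classes ordered from most to least likely, and it tries the combinations with the fewest substitutions first. It is a simplified illustration of the idea, not the algorithm HALCON uses, and the function name is hypothetical.

```python
import re
from itertools import product

def best_matching_word(alternatives, pattern):
    """Return the first candidate word that fully matches the pattern, else None.

    alternatives: one list per character, candidates ordered best first.
    Candidates are tried in order of increasing rank sum, so words that
    deviate least from the top classification are preferred.
    """
    regex = re.compile(pattern)
    index_sets = product(*[range(len(a)) for a in alternatives])
    for indices in sorted(index_sets, key=sum):
        word = "".join(a[i] for a, i in zip(alternatives, indices))
        if regex.fullmatch(word):
            return word
    return None

# Hypothetical classification result: the top reading is "A812".
candidates = [["A", "4"], ["8", "B"], ["1", "I"], ["2", "Z"]]

two_letters_two_digits = best_matching_word(candidates, r"[A-Z]{2}\d{2}")
four_digits = best_matching_word(candidates, r"\d{4}")
```

The same input yields different words depending on the expected format, which is why the expression should describe the text as tightly as the application allows.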
After prototyping the OCR reader via point-and-click, go to the Code Generation tab which takes care of the rest. Simply set Generation Mode to Text Reading and select no alignment. If the goal is only to read the text in the image, alignment will not be necessary. If OCR is being used in an application where the text position changes, automatic localization of the corresponding region of interest might be necessary. Apply segmentation and classification in this new region, so that code can be generated to align the region. Be aware, however, that alignment only functions when a transformation matrix has been defined which localizes the new region of interest.
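The role of the transformation matrix in alignment can be illustrated with a short Python sketch: the corners of the region of interest defined on the reference image are mapped into the current image by a rigid transformation (rotation plus translation). The matrix layout and function names are assumptions chosen for the example; in a real application the matrix would come from a localization step such as matching.

```python
import math

def make_rigid_matrix(angle_deg, tx, ty):
    """Build a 2x3 matrix for a rotation about the origin followed by a translation."""
    a = math.radians(angle_deg)
    return [
        [math.cos(a), -math.sin(a), tx],
        [math.sin(a), math.cos(a), ty],
    ]

def transform_points(matrix, points):
    """Apply the 2x3 matrix to a list of (x, y) points."""
    return [
        (
            matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
        )
        for x, y in points
    ]

# Region of interest defined on the reference image (hypothetical corners).
roi = [(0.0, 0.0), (100.0, 0.0), (100.0, 30.0), (0.0, 30.0)]

# Hypothetical pose of the part in the current image: shifted, not rotated.
shifted_roi = transform_points(make_rigid_matrix(0.0, 10.0, 5.0), roi)
rotated_point = transform_points(make_rigid_matrix(90.0, 0.0, 0.0), [(1.0, 0.0)])[0]
```

Segmentation and classification are then applied inside the transformed region instead of the original one.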
At this point, the generated code can be additionally fine-tuned. The next blog post in the series covers creating text model readers and additional parameters. (Please see navigation at the top of this page for additional posts.)
Please click here to download the image.
Please click here to download the generated code.
Established in 1990, The Imaging Source is one of the leading manufacturers of industrial cameras, frame grabbers and video converters for production automation, quality assurance, logistics, medicine, science and security.
Our comprehensive range of cameras with USB 3.1, USB 3.0, USB 2.0, GigE interfaces and other innovative machine vision products are renowned for their high quality and ability to meet the performance requirements of demanding applications.