Langraf-SKRINS is a set of basic software modules for recognizing handwritten and printed text.
Areas of application: manuscript processing, questionnaire and test automation, testing, handwriting sorting, form processing, document management. Original software for final interpretation of the recognized text is developed to customer requirements.
Basic Module set
DRV-SKRINS-BIN - binarization, alignment, noise and artifact removal.
DRV-SKRINS-SEG - segmentation
DRV-SKRINS-DIG - digital recognition
DRV-SKRINS-TXT - text string recognition
DRV-SKRINS-BIN – intelligent graphics processing module to improve readability and software processing of text information
- Binarization of complex input images. Each input image pixel is classified based on knowledge of its neighboring pixels to determine whether the pixel belongs to text.
- Highlighting image sections containing text for binarization. Areas with pictures, form elements, background, etc. are excluded.
- Eliminating noise and binarization artifacts. Discarding non-text image defects and pixel groups. Aligning images to the text.
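
The binarization and noise removal steps above can be illustrated with a minimal sketch. It is not the DRV-SKRINS-BIN implementation: it assumes a simple local-mean threshold (a pixel counts as text when it is darker than the average of its neighborhood) and removes connected pixel groups below a size limit. The function names, the `radius`, `offset` and `min_size` parameters, and the sample image are all hypothetical.

```python
def binarize_local_mean(gray, radius=1, offset=10):
    """Mark a pixel as text (1) when it is darker than its neighborhood mean by `offset`."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighborhood window, clipped at the image borders.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            if gray[y][x] < sum(window) / len(window) - offset:
                out[y][x] = 1
    return out


def remove_small_components(binary, min_size=2):
    """Clear 4-connected groups of text pixels that contain fewer than `min_size` pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Flood fill to collect one connected component.
            stack, component = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out


# Hypothetical 5x6 grayscale image: light background, a dark vertical stroke
# in column 2 and a single dark speck in the top-right corner.
image = [[200] * 6 for _ in range(5)]
for row in image:
    row[2] = 50
image[0][5] = 50

binary = binarize_local_mean(image)
cleaned = remove_small_components(binary)
```

In this toy case the speck survives thresholding but is discarded as a one-pixel component, while the stroke is kept. A window this small would miss the interior of thick strokes, so a practical setup would need a larger neighborhood or a different classifier.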
DRV-SKRINS-SEG - module for processing structured graphic information divided into geometric forms.
Functionality combines two techniques to highlight information areas in the image.
Geometric selections – for highly structured graphic information, i.e. tables in the image, horizontal and vertical lines, forms of information input, special identification marks.
Adaptive algorithm: the exact coordinates of the image's structural elements are not required; a general idea of the document is enough.
Machine learning separation: for poorly structured graphic information, when blocks of information are visually separated from each other with logical splitting of text information, but no geometric separators. Based on the created document model, information is extracted from input images.
Combined version - both geometric and neural network - for input images containing both highly structured and weakly structured data.
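
As an illustration of the geometric selection technique, the sketch below finds horizontal and vertical ruling lines in a binarized image by counting text pixels per row and per column, then derives the cell ranges between them. This is one simple approach under stated assumptions, not the module's actual algorithm; `find_ruling_lines`, `cells_between`, the `min_fill` threshold and the sample grid are hypothetical.

```python
def find_ruling_lines(binary, min_fill=0.8):
    """Return indices of rows and columns that are at least `min_fill` covered by ink."""
    h, w = len(binary), len(binary[0])
    rows = [y for y in range(h) if sum(binary[y]) >= min_fill * w]
    cols = [x for x in range(w) if sum(binary[y][x] for y in range(h)) >= min_fill * h]
    return rows, cols


def cells_between(lines):
    """Return half-open (start, end) ranges lying between consecutive line positions."""
    # Adjacent indices (b - a == 1) belong to one thick line and enclose no cell.
    return [(a + 1, b) for a, b in zip(lines, lines[1:]) if b - a > 1]


# Hypothetical 7x7 binarized table: ruling lines every third row and column,
# plus one text pixel inside the top-left cell.
size = 7
grid = [[1 if (y % 3 == 0 or x % 3 == 0) else 0 for x in range(size)] for y in range(size)]
grid[1][1] = 1

rows, cols = find_ruling_lines(grid)
row_cells = cells_between(rows)
col_cells = cells_between(cols)
```

Each pair of a row range and a column range gives one table cell that can be passed on for recognition. Skewed or broken lines would need alignment first or a more tolerant line detector.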
DRV-SKRINS-DIG – handwritten and typewritten digit recognition module.
- Highlighting handwritten and typewritten numbers in the input image, finding coordinates of areas of interest;
- Pre-processing of selected areas, increasing the contrast of the text, removing distortion, aligning handwriting inclination;
- Segmentation of a selected region into regions with independent symbols, preprocessing and normalization of selected symbols;
- Deep machine learning methods to recognize typewritten and handwritten numbers on input images.
- The trained neural network can recognize non-standard ways of writing numbers that were not used in training.
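
The step of splitting a selected region into independent symbols can be sketched with a vertical projection profile: columns without ink are treated as gaps between symbols, and each symbol is then cropped to its own bounding rows. This is a simplified assumption about how such segmentation might work; touching or slanted digits need more than this. The names `split_symbols` and `crop_symbol` and the sample region are hypothetical.

```python
def split_symbols(binary):
    """Return half-open (start, end) column ranges that contain ink, split at empty columns."""
    w = len(binary[0])
    ink = [any(row[x] for row in binary) for x in range(w)]
    spans, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a symbol begins
        elif not has_ink and start is not None:
            spans.append((start, x))       # a symbol ends at the first empty column
            start = None
    if start is not None:
        spans.append((start, w))           # symbol touching the right border
    return spans


def crop_symbol(binary, span):
    """Cut one symbol out of the region and trim empty rows above and below it."""
    x0, x1 = span
    rows = [row[x0:x1] for row in binary]
    inked = [i for i, row in enumerate(rows) if any(row)]
    return rows[inked[0]:inked[-1] + 1]


# Hypothetical binarized region with two symbols separated by empty columns.
region = [
    [0, 1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

spans = split_symbols(region)
second = crop_symbol(region, spans[1])
```

The cropped symbols would then be resized to a common shape before being passed to a classifier.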
DRV-SKRINS-TXT - handwritten text string recognition module
- Highlighting handwritten and typewritten numbers in the input image, finding coordinates of areas of interest
- Pre-processing of selected areas, increasing the contrast of the text, removing distortion, aligning handwriting inclination
- Deep machine learning methods to recognize handwritten strings using an end-to-end approach, resulting in a handwriting probability matrix.
- Modern approaches in convolutional and recurrent neural networks for high descriptive power of the network in terms of input string vectorization (convolutional part) and high-quality processing of the character sequences represented by a text string (recurrent part).
- Dictionary-based phrase algorithm that works on the probability distribution matrix produced by the neural network.
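
How a probability matrix might be turned into text and checked against a dictionary can be shown with a small sketch. It assumes a CTC-style output, where each row of the matrix is one time step and index 0 is a "blank" symbol; the source does not state which decoding scheme the module uses. Greedy decoding plus nearest-word lookup by edit distance stands in for the dictionary phrase algorithm, and all names and values below are hypothetical.

```python
BLANK = 0  # assumed index of the CTC blank symbol


def greedy_decode(probs, symbols):
    """Take the most likely symbol per time step, merge repeats, drop blanks."""
    best = [max(range(len(step)), key=step.__getitem__) for step in probs]
    chars, prev = [], None
    for idx in best:
        if idx != prev and idx != BLANK:
            chars.append(symbols[idx])
        prev = idx
    return "".join(chars)


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def closest_word(text, dictionary):
    """Return the dictionary entry with the smallest edit distance to `text`."""
    return min(dictionary, key=lambda word: edit_distance(text, word))


# Hypothetical network output: 5 time steps over the symbols blank, c, a, t, o.
symbols = ["", "c", "a", "t", "o"]
probs = [
    [0.10, 0.60, 0.10, 0.10, 0.10],  # c
    [0.10, 0.70, 0.10, 0.05, 0.05],  # c again, merged with the previous step
    [0.80, 0.05, 0.05, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.30, 0.10, 0.40],  # o narrowly beats a
    [0.10, 0.05, 0.05, 0.70, 0.10],  # t
]

raw = greedy_decode(probs, symbols)
corrected = closest_word(raw, ["cat", "dog", "tree"])
```

Here the raw decoding yields "cot" and the lookup corrects it to "cat". A stronger variant would score dictionary candidates directly against the probability matrix instead of against the greedy string, so that the runner-up symbol at each step still contributes.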