Development of a Russian License Plate Recognition Library

by admin in Car Plate Recognition on July 22, 2020

Computer vision is being applied in more and more areas of human activity. For example, image recognition is used in automated production control systems, in the verification of electronic documents, and in information and control systems for various purposes. This paper discusses license plate recognition, which is relevant for automating the work of public security bodies, the traffic police, and security organizations. Recognition is performed using the OpenCV library.

Object of research: image recognition methods.

Subject of research: the process of detecting and recognizing license plates in digital images using the Viola-Jones method.

Purpose of the work: to develop a library that implements automatic recognition of Russian license plates.


Tasks:

  1. Explore methods for recognizing and classifying image regions.
  2. Study the Viola-Jones method as the basis for a modified algorithm that finds candidate license plate regions in an image.
  3. Study Haar cascades as classifiers and analyze the structure of the resulting XML files.
  4. Develop algorithms for finding the borders of the detected license plate.
  5. Develop an algorithm for normalizing the tilt angle of the license plate.
  6. Develop an algorithm for normalization and extraction of connected regions in the license plate image.
  7. Publish an article about building the Tesseract OCR library under MinGW.
  8. Publish an article about training a Haar classifier.
  9. Consider recognition of the characters found on the plate, using Tesseract OCR as an example.
  10. Develop a library that automates recognition of Russian license plates and test it on plates captured under real conditions.

1. License plate detection

1.1 The Viola-Jones method

The Viola-Jones method is currently a popular way to search for an object in an image because of its high speed and efficiency. It rests on three ideas: an integral representation of the image used to compute Haar features, a classifier built with the adaptive boosting algorithm, and a way of combining classifiers into a cascade structure. Together these ideas make it possible to search for an object in real time. Let's look at them in more detail.

The integral representation of an image is a matrix of the same size as the source image. Each element stores the sum of the intensities of all pixels located to the left of and above that element, that is, of all pixels in the rectangle from (0,0) to (x,y), for which the element is the lower right corner. The elements of the matrix L can be calculated by the formula:

\[ L(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j), \]

where I(i, j) is the brightness of the pixels of the original image.

The matrix is computed in time proportional to the number of pixels in the original image, so the integral image is calculated in one pass.

The matrix elements are calculated according to the recurrence:

\[ L(x, y) = I(x, y) - L(x-1, y-1) + L(x, y-1) + L(x-1, y), \]

where L is taken to be zero outside the image.

Using the integral representation, the total brightness of an arbitrary rectangular region of the image can be calculated quickly. An example calculation is given in Annex “A”.
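As an illustration of the integral representation, here is a minimal pure-Python sketch. It is not the library's code; the function names integral_image and rect_sum are chosen only for this example, and the image is a small list of lists rather than an OpenCV matrix.

```python
def integral_image(img):
    """Return L where L[y][x] is the sum of img over the rectangle (0,0)..(x,y), inclusive."""
    h, w = len(img), len(img[0])
    L = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            up = L[y - 1][x] if y > 0 else 0
            left = L[y][x - 1] if x > 0 else 0
            diag = L[y - 1][x - 1] if x > 0 and y > 0 else 0
            # One-pass recurrence: each element is built from three neighbours
            L[y][x] = img[y][x] + up + left - diag
    return L


def rect_sum(L, x0, y0, x1, y1):
    """Sum of the original pixels in the rectangle with corners (x0,y0) and (x1,y1), inclusive."""
    total = L[y1][x1]
    if x0 > 0:
        total -= L[y1][x0 - 1]
    if y0 > 0:
        total -= L[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += L[y0 - 1][x0 - 1]
    return total


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
L = integral_image(img)
print(L[2][2])                   # sum of the whole image
print(rect_sum(L, 1, 1, 2, 2))   # sum of the lower right 2x2 block
```

Once L is built, any rectangle sum needs at most four lookups, regardless of the rectangle's size.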

At the detection stage, the Viola-Jones method uses a window of a fixed size that moves across the image. For each image region covered by the window, Haar features are calculated, and the object is searched for with their help.

A feature is a mapping f: X → Df, where Df is the set of admissible values of the feature. If features f1, …, fn are given, the feature vector x = (f1(x), …, fn(x)) is called the feature description of the object x. Feature descriptions may be identified with the objects themselves. The set X = Df1 × … × Dfn is then called the feature space.

Features are divided into the following types depending on the set Df:

  • binary feature: Df = {0,1};
  • nominal feature: Df is a finite set;
  • ordinal feature: Df is a finite ordered set;
  • quantitative feature: Df is the set of real numbers.
Figure 1 - Haar primitives

A Haar feature is calculated over adjacent rectangular regions. The standard Viola-Jones method uses the rectangular primitives shown in Figure 1.

The computed value F of a Haar feature is

F = X - Y,

where X is the sum of the brightness values of the pixels covered by the light part of the primitive and Y is the sum of the brightness values of the pixels covered by the dark part. These sums are calculated using the integral image discussed above, so Haar features can be evaluated quickly, in constant time. Haar features give a point value of the brightness difference along the x-axis and y-axis, respectively.
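The following sketch shows how one two-rectangle Haar feature could be evaluated from an integral image. It assumes a primitive whose left half is light and whose right half is dark; the helper names (integral, area, haar_two_rect_horizontal) are hypothetical and the padded table layout is a choice made for this example only.

```python
def integral(img):
    # Padded table: S[y][x] holds the sum of img rows 0..y-1 and columns 0..x-1
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            S[y + 1][x + 1] = img[y][x] + S[y][x + 1] + S[y + 1][x] - S[y][x]
    return S


def area(S, x, y, w, h):
    # Sum over the w-by-h rectangle whose top-left pixel is (x, y): four lookups
    return S[y + h][x + w] - S[y][x + w] - S[y + h][x] + S[y][x]


def haar_two_rect_horizontal(S, x, y, w, h):
    """F = X - Y for a primitive with a light left half and a dark right half."""
    half = w // 2
    light = area(S, x, y, half, h)          # X: pixels under the light part
    dark = area(S, x + half, y, half, h)    # Y: pixels under the dark part
    return light - dark


img = [[200, 200, 10, 10],
       [200, 200, 10, 10]]
F = haar_two_rect_horizontal(integral(img), 0, 0, 4, 2)
print(F)
```

A large positive F means the window contains a bright-to-dark transition from left to right, which is the kind of brightness difference the feature is meant to capture.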

Since a single Haar feature is poorly suited for training or classification, a larger number of features is needed to describe an object with sufficient accuracy. Haar features are therefore fed into a cascade classifier, which quickly discards windows where the required object is not found and returns “true” or “false” as to whether the object is present.

The classifier is based on the boosting algorithm (from the English “boost”: improvement, strengthening), which selects the features most suitable for the desired object in a given part of the image. In general, boosting is a set of methods for improving the accuracy of analytical models. An effective model that makes few classification errors is called “strong”. A “weak” model, by contrast, cannot reliably separate classes or give accurate predictions and makes many errors. Boosting therefore means “strengthening” weak models: it is a procedure for sequentially building a composition of machine learning algorithms in which each successive algorithm seeks to compensate for the shortcomings of the composition of all previous algorithms.

At each iteration, the boosting algorithm produces a simple classifier of the form:

\[ h_j(x) = \begin{cases} 1, & p_j f_j(x) < p_j \theta_j \\ 0, & \text{otherwise} \end{cases} \]

where p_j is the direction of the inequality sign, θ_j is the threshold value, f_j(x) is the calculated value of the feature, and x is an image window of 24×24 pixels.

The resulting classifier has the minimum error with respect to the current values of the weights that are used in the training procedure to determine the error.
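A simple classifier of this kind can be sketched as a threshold on a single feature value, with a polarity that sets the direction of the inequality. The example below is an assumption-laden illustration, not the training code used in the project: the names weak_classify and best_stump are hypothetical, and the search over thresholds is a brute-force scan over the observed feature values.

```python
def weak_classify(f, theta, p):
    """Return 1 if p*f < p*theta, else 0 (p is +1 or -1 and flips the inequality)."""
    return 1 if p * f < p * theta else 0


def best_stump(values, labels, weights):
    """Pick the (theta, p) pair with the minimum weighted error for one feature."""
    best = None
    for theta in sorted(set(values)):
        for p in (1, -1):
            # Weighted error: total weight of the samples this stump misclassifies
            err = sum(w for f, y, w in zip(values, labels, weights)
                      if weak_classify(f, theta, p) != y)
            if best is None or err < best[0]:
                best = (err, theta, p)
    return best


values = [1.0, 2.0, 6.0, 7.0]   # feature value for each training window
labels = [1, 1, 0, 0]           # 1 = object, 0 = background
weights = [0.25] * 4            # current boosting weights
err, theta, p = best_stump(values, labels, weights)
print(err, theta, p)
```

In a full boosting loop the weights of misclassified samples would be increased after each round, so that the next stump concentrates on the windows the previous ones got wrong.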

Searching for license plates in a digital image

To find an object in a digital image, a trained classifier stored in XML format is used. The classifier is built on Haar primitives.

Structure of a classifier:

Figure 2 - Structure of a classifier

where maxWeakCount is the number of weak classifiers;

stageThreshold is the stage threshold, against which the combined response of the stage's weak classifiers is compared;

weakClassifiers is the set of weak classifiers on the basis of which the decision is made as to whether the object is in the image or not;

internalNodes and leafValues are the parameters of a specific weak classifier.

The first two values in internalNodes are not used, the third is the index of the feature in the overall feature table (located in the XML file under the features tag), and the fourth is the threshold of the weak classifier. If the value of the Haar feature is less than the threshold of the weak classifier, the first value of leafValues is selected; otherwise, the second.
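The rule for internalNodes and leafValues can be written out as a short sketch. The function names and the sample numbers are hypothetical, the feature values are assumed to be already computed for the current window, and the stage decision (pass when the sum of the selected leaf values reaches the stage threshold) is an assumption about how the threshold is applied rather than something stated in the text above.

```python
def weak_response(internal_nodes, leaf_values, feature_values):
    """Select a leaf value for one weak classifier.

    internal_nodes = [unused, unused, feature_index, threshold]
    leaf_values    = [value_if_below_threshold, value_otherwise]
    """
    feature_index = int(internal_nodes[2])
    threshold = internal_nodes[3]
    f = feature_values[feature_index]
    return leaf_values[0] if f < threshold else leaf_values[1]


def stage_passes(weak_classifiers, stage_threshold, feature_values):
    # Assumption: the window passes when the summed responses reach the threshold
    total = sum(weak_response(nodes, leaves, feature_values)
                for nodes, leaves in weak_classifiers)
    return total >= stage_threshold


# Two hypothetical weak classifiers in the (internalNodes, leafValues) layout
stage = [([0, -1, 0, 0.5], [-0.8, 0.9]),
         ([0, -1, 1, -0.2], [0.7, -0.6])]

accepted = stage_passes(stage, 1.0, feature_values=[0.7, -0.5])
rejected = stage_passes(stage, 1.0, feature_values=[0.1, 0.0])
print(accepted, rejected)
```

A window is reported as containing the object only if it passes every stage of the cascade in turn, which is what allows most background windows to be discarded after a few cheap checks.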

Figure 3 shows the result of the first stage of the cascade classifier. White spots mark candidate regions for further search.

On this basis, a cascade classifier is built that decides whether the object has been detected in the image or not. The presence or absence of the object in a window is determined by the difference between the feature value and the threshold obtained during training.

Figure 3.1 shows the search results obtained with the classifier.

Figure 3 - Result of the first stage of the cascade classifier
Figure 3.1 - Result of the license plate search

2. Preparing the license plate for recognition

2.1 Angle normalization algorithm

Figure 4 - Maximum tilt angles

After the license plate has been found (Figure 3), it must be prepared for recognition. To do this, we normalize the tilt angle of the plate.

Figure 5 - Result of the algorithm. Angle: 1.4°

Suppose that the plate may be rotated by an angle in the range from -10° to 10°. We generate a new frame for each angle in steps of 0.1° and process each frame independently. The rotation hypothesis that gives the best result wins. The maximum tilt angles are shown in Figure 4. For each frame, the lower border of the image is calculated. The winner is the angle for which the topmost border was calculated; this is the desired angle. The result of the algorithm is shown in Figure 5. The algorithm for finding the bottom border is described in more detail in section 2.2. A flowchart of the algorithm is given in Annex “B”.
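The hypothesis sweep can be sketched independently of the image operations. In the example below the function best_rotation and its score argument are hypothetical: score stands in for "rotate the frame by this angle, find the border, and rate the result", which in the real library would involve OpenCV calls that are not shown here.

```python
def best_rotation(score, lo=-10.0, hi=10.0, step=0.1):
    """Try every angle from lo to hi with the given step and return the best-scoring one."""
    n = int(round((hi - lo) / step))
    best_angle, best_score = None, None
    for i in range(n + 1):
        # Index-based stepping avoids accumulating floating-point error
        angle = round(lo + i * step, 6)
        s = score(angle)
        if best_score is None or s > best_score:
            best_angle, best_score = angle, s
    return best_angle


# Hypothetical score that peaks when the frame is rotated by 1.4 degrees
angle = best_rotation(lambda a: -abs(a - 1.4))
print(angle)
```

Because each angle is scored independently, the 201 hypotheses could also be evaluated in parallel; the sketch keeps them sequential for clarity.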

2.2 Algorithm for finding the bottom border of the license plate

Once the tilt angle of the plate has been found, we need to find its borders. The search for the bottom border is based on analyzing a brightness histogram.

Figure 6 - Principle of histogram construction

First, the image is processed by a rule that defines how pixels are split into black and white. This operation is called binarization (thresholding). After binarization, we count the number of black pixels in each column and build a histogram of the image from this information. The principle of histogram construction is shown in Figure 6. An example of a binarized license plate and its histogram is shown in Figure 7.
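The two steps described above, binarization and counting black pixels per column, can be sketched in a few lines. The fixed threshold of 128 and the function names are assumptions made for this example; the text does not specify which binarization rule the library uses.

```python
def binarize(gray, threshold=128):
    # 1 marks a black (dark) pixel, 0 marks a white pixel
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def column_histogram(binary):
    # Number of black pixels in each column of the binarized image
    return [sum(col) for col in zip(*binary)]


gray = [[250, 30, 240, 20],
        [245, 25, 235, 220],
        [240, 35, 230, 225]]
hist = column_histogram(binarize(gray))
print(hist)
```

The same counting applied to rows instead of columns (sum over each row) gives the profile along the other axis.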

Figure 7 - License plate histogram and binarization

The figure shows that the histogram has sharp jumps at the beginning and at the end of the license plate. The algorithm examines these jumps and, on their basis, builds a hypothesis for the lower border of the plate.

A flowchart of the bottom border search algorithm is given in Annex “G”.
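One simple way to look for sharp jumps in a histogram is to compare neighbouring bins. The sketch below is only an illustration of that idea under stated assumptions: the function name sharp_jumps, the min_jump parameter, and the sample histogram are all hypothetical, and the library's actual criterion for a border hypothesis may differ.

```python
def sharp_jumps(hist, min_jump):
    """Return the indices at which the histogram changes by at least min_jump."""
    return [i + 1 for i in range(len(hist) - 1)
            if abs(hist[i + 1] - hist[i]) >= min_jump]


# Hypothetical histogram: low values outside the plate, high values inside it
hist = [0, 1, 0, 9, 8, 9, 10, 9, 1, 0]
jumps = sharp_jumps(hist, min_jump=5)
start, end = jumps[0], jumps[-1]
print(start, end)
```

Here the first jump marks where the plate begins and the last jump marks the first bin after it ends, so the plate occupies hist[start:end].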

2.3 Algorithm for finding the top border of the license plate

Figure 8 - Operation of the top border search algorithm

In some cases the algorithm that uses the Haar cascade may not succeed; this is typical for images with a very low resolution. For such images we use an alternative border search algorithm based on brightness histograms.

2.4 Algorithm for finding the side borders of the license plate

At this point the plate has already been cropped at the top and bottom borders, and it remains to determine its side borders. Here we again build a brightness histogram, but there is a problem: the color of the car. If the car is white, the edges of the binarized image will be white; if the car is black, the edges will be black while the plate is white. It follows that we must use two hypotheses on each side: one to find the border on a white car and one to find it on a black car. The winning hypothesis is the one whose result is closest to the center of the plate. To improve the quality of the side border search, we apply basic morphological transformations before building the brightness histograms:

  • Erode - erosion (a shrinking operation).
  • Dilate - dilation (an expanding operation).
Figure 9 - Binarized frames of a black car and a white car

An example of binarized frames of a black car and a white car is shown in Figure 9. A flowchart of the side border search algorithm is given in Annex “E”.
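To show what the two morphological operations do to a binarized image, here is a pure-Python sketch with a 3x3 neighbourhood. It is a stand-in written for illustration, not the OpenCV implementation the library relies on; the kernel size, the border handling (neighbours outside the image are ignored), and the function names are assumptions.

```python
def _morph(binary, combine):
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # 3x3 neighbourhood, clipped at the image border
            neigh = [binary[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = combine(neigh)
    return out


def erode(binary):
    # A pixel survives only if its whole neighbourhood is set (regions shrink)
    return _morph(binary, min)


def dilate(binary):
    # A pixel is set if any pixel in its neighbourhood is set (regions grow)
    return _morph(binary, max)


img = [[1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1],
       [0, 0, 1, 1, 1],
       [0, 0, 1, 1, 1]]
opened = dilate(erode(img))
for row in opened:
    print(row)
```

Eroding and then dilating removes the isolated pixel in the corner while restoring the large block to its original size, which is why the pair is useful before building histograms.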

3. Segmenting the license plate into characters

Figure 10 - Result of the plate border search algorithm
Figure 11 - Conversion steps

The result of the side border search algorithm is an image of the plate that is suitable for the character segmentation algorithm. An example of a plate obtained after the side border search is shown in Figure 10.

To find the characters of the plate we use standard methods of the OpenCV library. First the image is converted to a black-and-white (grayscale) format and binarized. The binarized image is then processed with a median filter to remove the noise component. The next step is Gaussian blurring of the image. Blurring is needed to smooth the edges of the characters; as tests have shown, this transformation significantly increases the probability of finding a character successfully. In the next step we use the most popular edge detection method, the Canny edge detector. The detector's algorithm is fairly simple and consists of the following steps:

  1. Remove noise and extraneous detail from the image
  2. Calculate the image gradient
  3. Thin the edges
  4. Link the edges into contours
Figure 13 - Plate with a noise component
Figure 12 - Character selection

The conversion steps and their results are shown in Figure 11. After the edges have been detected, we use a standard function to find connected regions. The final step is to compare each region found with the character template of the license plate and to discard invalid regions. The result of character selection is shown in Figure 12. In addition, this method correctly identifies the characters of a plate in an image with a strong noise component; an example is given in Figure 13. A block diagram of the algorithm is given in Annex “Yo”.
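The last two steps, finding connected regions and discarding those that do not fit a character, can be sketched without OpenCV. The code below is a simplified stand-in: it labels 4-connected regions with a breadth-first search and filters them by height only. The function names and the height-based check are hypothetical; the library's real template comparison is not described in enough detail to reproduce here.

```python
from collections import deque


def connected_boxes(binary):
    """Return bounding boxes (x, y, w, h) of the 4-connected regions of 1-pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            xs, ys = [x0], [y0]
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
                        xs.append(nx)
                        ys.append(ny)
            boxes.append((min(xs), min(ys),
                          max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    return boxes


def plausible_characters(boxes, min_h, max_h):
    # Hypothetical template check: keep regions whose height fits a character
    return sorted(b for b in boxes if min_h <= b[3] <= max_h)


binary = [[0, 1, 0, 0, 1, 1, 0, 0],
          [0, 1, 0, 0, 1, 0, 0, 1],
          [0, 1, 0, 0, 1, 1, 0, 0]]
chars = plausible_characters(connected_boxes(binary), min_h=3, max_h=3)
print(chars)
```

Sorting the surviving boxes by their x coordinate yields the characters in reading order, and the single-pixel region on the right is dropped as noise.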


4. Recognizing the license plate characters

Character recognition is the final stage of the library's work. At this point we have an array of the characters extracted from the plate. For character recognition we decided to use a ready-made, time-tested product: Google's Tesseract OCR. The Tesseract OCR library uses a neural network, which we had to train on the character template of license plates. While building Tesseract OCR with the mingw32 compiler we ran into certain problems. We looked through many IT resources and found no detailed instructions for building it under MinGW, so we decided to write an article and publish it on Habrahabr. A link to the article and the article itself are given in Annex “F”. A block diagram of the character recognition algorithm is given in Annex “W”.

5. Main methods of the library

This section describes the main methods of the license plate recognition library.

| Method name | Input data | Output | Description |
|---|---|---|---|
| recognize | Image | Text | Image recognition |
| getLicenseText | - | Text | Returns the recognized license plate |
| getframes | - | Image | Returns the found region containing the license plate |
| findLetters | Image | Coordinates | Selects connected regions and normalizes the images |
| SetImage | Image | - | Sets the image for recognition |
| showSymbol | - | - | Displays on screen the images of the characters found on the plate |
| saveSymbols | - | - | Saves the characters found on the plate to the user's disk |
| saveFrame | - | - | Saves the found plate region to the user's disk |
| recognizeLetters | Coordinates | Text | Recognizes the characters found |
| showNormalImage | - | - | Displays the recognized image |

6. Practical application of the library: a test run at a secured site

The project resulted in an open license plate recognition library. A program written with the library's functions was tested at a private security company. For the test, vehicles entering the site were recorded on video; the program extracts the license plates, and their text representation can be processed further. Several frames from the video are presented as a demonstration. Figure 10 shows an image and the result of the recognition program.

The results of the recognition program are given in Figure 11.

Figure 11 - Demonstration of the program with video as input data


The tasks set at the outset of the project have been completed. In the course of the project we studied methods of recognizing and classifying image regions and the Viola-Jones method. We developed algorithms for finding the plate and normalizing its tilt angle, for finding the top, bottom, and side borders of the plate, for histogram analysis, and for normalization and extraction of connected regions in the plate image, and we implemented recognition of the characters found. We optimized the recognition algorithm and taught the library to work with plates that have a small noise component. We have developed a library that automates the recognition of Russian license plates, and it has been tested under real conditions.

Further development of the project includes improving the algorithm for cutting off the noise component of images.

List of sources used

  1. A Real-Time Mobile Vehicle License Plate Detection and Recognition. Kuo-Ming Hung and Ching-Tang Hsieh: a histogram approach to finding license plates [Electronic resource]. URL: (accessed 20.08.2014)
  2. Automated Number Plate Recognition Using Hough Lines and Template Matching. Saqib Rasheed, Asad Naeem and Omer Ishaq: plate search using HOG descriptors of vertical lines [Electronic resource]. URL: (accessed 21.08.2014)
  3. Survey of Methods for Character Recognition. Suruchi G. Dedgaonkar, Anjali A. Chandavale, Ashok M. Sapkal: a short review article on the recognition of letters and digits [Electronic resource]. URL: (accessed 20.08.2014)
  4. Krasheninnikov V. R. Fundamentals of image processing theory: a collection of laboratory works [Electronic resource]. Ulyanovsk: UlSTU, 2004. URL: (accessed 21.08.2014)
  5. Haar cascade work in OpenCV in pictures: theory and practice [Electronic resource].
    URL: (accessed 20.08.2014)
  6. Build Tesseract OCR libraries under MinGW [Electronic resource]. Author: Konstantin Kulakov.
    URL: (accessed 13.08.2015)

Annex “A”

Example of a calculation with the integral matrix

Suppose the rectangle ABCD contains the object of interest D (Figure A.1):

Figure A.1

The figure shows that the sum inside the rectangle can be expressed through sums and differences of adjacent rectangles by the following formula:

\[ S(ABCD) = L(A) + L(C) - L(B) - L(D) \]

An example calculation is shown in Figure A.2:

Figure A.2

Annex “B”

[Flowchart: tilt angle]

Annex “G”

[Flowchart: bottom border search]

Annex “D”

[Flowchart: top border search]

Annex “E”

[Flowchart: left border search]
[Flowchart: right border search]

Annex “Yo”

[Flowchart: license plate segmentation]

Annex “W”

[Flowchart: character recognition]
