## Pre-processing

The preprocessing of a document image in WISDOM++ involves the evaluation of the skew angle, the rotation of the document, and the computation of a spread factor.

Following Baird’s definition (Baird, 1987), the *skew angle* of a document image *I* is the orientation angle *θ* of its text baselines: it is positive when the image is rotated counter-clockwise and negative otherwise. In WISDOM++ the actual skew angle *θ* is estimated as the composition of two functions: *S*(*I*), which returns a *sample region* *R* of the document image *I*, and *E*(*R*), which returns the *estimation* of the skew angle in the sample region *R*. Selecting a sample region reduces the computational cost of the estimation step.

Once the sample region *R* has been selected, *E*(*R*) is computed. Let *H*_{θ} be the horizontal projection profile of *R* after a virtual rotation by an angle *θ*. When text lines run horizontally, the histogram *H*_{θ} shows sharply rising peaks whose base equals the character height. In the presence of a large skew angle, *H*_{θ} is instead characterized by smooth slopes and lower peaks. This observation is captured by a real-valued function *A*(*θ*), which has a global maximum at the correct skew angle. Finding the actual skew angle is thus cast as the problem of locating the global maximum of *A*(*θ*). Since this measure is not smooth enough to apply gradient techniques, the system adopts some peak-finding heuristics (Altamura et al., 1999).
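The estimation step can be illustrated with a minimal Python sketch. It is not the WISDOM++ implementation: the text does not define the alignment measure *A*, so the sketch assumes the sum of squared bin counts of the projection profile, and it replaces the peak-finding heuristics with a plain grid search over candidate angles. The function names, the angle range, and the step are hypothetical, and coordinates are taken with the y axis pointing up so that counter-clockwise angles are positive.

```python
import math
from collections import Counter


def profile(pixels, angle_deg):
    """Horizontal projection profile of the black pixels after a virtual
    rotation that undoes a candidate skew of angle_deg degrees.
    Only the pixel coordinates are transformed; no image is rotated."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    # Rotated y coordinate of each pixel, binned to the nearest integer row.
    return Counter(round(y * cos_a - x * sin_a) for x, y in pixels)


def alignment(pixels, angle_deg):
    """Assumed stand-in for the alignment measure A: sum of squared bin
    counts. Sharp, tall peaks in the profile give a large value."""
    return sum(n * n for n in profile(pixels, angle_deg).values())


def estimate_skew(pixels, lo=-10.0, hi=10.0, step=0.5):
    """Grid search for the candidate angle that maximises the measure.
    This replaces the peak-finding heuristics, which are not described here."""
    steps = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda a: alignment(pixels, a))


def synthetic_page(skew_deg, n_lines=4, spacing=20, width=200):
    """Black pixels of n_lines straight 'baselines' skewed by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, round(x * slope + k * spacing))
            for k in range(n_lines) for x in range(width)]


if __name__ == "__main__":
    page = synthetic_page(3.0)
    print(estimate_skew(page))
```

On real scans the measure has many local maxima, which is why a coarse-to-fine search or dedicated heuristics are usually preferred over a single fixed-step grid.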

WISDOM++ uses the estimated skew angle when the user asks the system to *rotate* the document.

Another parameter computed during the preprocessing phase is the *spread factor* of the document image. It is defined as the ratio of the average distance between the regions *R*_{i} (*avdist*) to the average height of the same regions (*avheight*). In simple documents with few, sparse regions this ratio is greater than 1.0, while in complex documents with closely spaced text regions it is less than 1.0. The spread factor is used to set some parameters of the segmentation algorithm.
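As a rough illustration of this ratio, the following Python sketch computes a spread factor from a list of regions. The text does not say how the distance between regions is measured, so the sketch assumes each region is given by its top and bottom rows and takes the distance to be the vertical gap between consecutive regions sorted from top to bottom. The function name and the input representation are hypothetical.

```python
def spread_factor(regions):
    """Ratio avdist / avheight for a list of (top, bottom) row pairs,
    with row numbers increasing downward.

    Assumption: the distance between regions is the vertical gap between
    consecutive regions after sorting them from top to bottom."""
    if len(regions) < 2:
        raise ValueError("at least two regions are needed to measure a distance")
    ordered = sorted(regions)
    # Overlapping regions contribute a gap of zero rather than a negative one.
    gaps = [max(0, nxt[0] - cur[1]) for cur, nxt in zip(ordered, ordered[1:])]
    avdist = sum(gaps) / len(gaps)
    avheight = sum(bottom - top for top, bottom in ordered) / len(ordered)
    return avdist / avheight


# Few, widely separated regions: ratio above 1.0.
sparse = [(0, 10), (50, 60), (100, 110)]
# Tall, closely packed regions: ratio below 1.0.
dense = [(0, 30), (35, 65), (70, 100)]

if __name__ == "__main__":
    print(spread_factor(sparse), spread_factor(dense))
```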

**Bibliography**

- Altamura, O., F. Esposito, D. Malerba. 1999. WISDOM++: An Interactive and Adaptive Document Analysis System. Proc. of the 5th Int. Conf. on Document Analysis and Recognition, 366-369, IEEE Computer Society Press, Los Alamitos.

- Baird, H.S. 1987. The Skew Angle of Printed Documents. Proc. Conf. of the Society of Photographic Scientists and Engineers, 14-21 (also in: R.K.L. O’Gorman, Ed., Document Image Analysis, 204-208, IEEE Computer Society, Los Alamitos (CA), 1995).


berardi@di.uniba.it