Following Baird’s definition (Baird, 1987), the skew angle of a document image I is the orientation angle θ of its text baselines: it is positive when the image is rotated counter-clockwise and negative otherwise. In WISDOM++ the skew angle θ is estimated by composing two functions: S(I), which returns a sample region R of the document image I, and E(R), which returns the estimate of the skew angle in the sample region R. Selecting a sample region is necessary to reduce the computational cost of the estimation step. Once the sample region R has been selected, E(R) is computed. Let Hθ be the horizontal projection profile of R after a virtual rotation by an angle θ. When text lines run horizontally, the histogram Hθ shows sharply rising peaks whose base equals the character height. In the presence of a large skew angle, Hθ is instead characterized by smooth slopes and lower peaks. This observation is captured by a real-valued function A(θ), which has a global maximum at the correct skew angle. Finding the actual skew angle is thus cast as the problem of locating the global maximum of A(θ). Since this measure is not smooth enough for gradient techniques, the system adopts some peak-finding heuristics (Altamura, 1999).
WISDOM++ uses the estimated skew angle when the user asks the system to rotate the document.
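The projection-profile idea described above can be illustrated with a minimal Python sketch. It is not the WISDOM++ implementation: the sharpness measure (sum of squared bin heights) is only an assumed stand-in for A(θ), whose exact form is not given here, and an exhaustive scan over candidate angles replaces the peak-finding heuristics. All function names and the synthetic test region are hypothetical.

```python
import math
from collections import Counter


def projection_profile(pixels, theta_deg):
    """Horizontal projection profile of black pixels after a virtual
    clockwise rotation by theta_deg, which undoes a counter-clockwise skew.

    pixels: iterable of (x, y) image coordinates, y growing downwards.
    """
    t = math.radians(theta_deg)
    s, c = math.sin(t), math.cos(t)
    # Only the rotated row coordinate matters for a horizontal profile.
    return Counter(round(x * s + y * c) for x, y in pixels)


def sharpness(profile):
    """Assumed stand-in for A(theta): sum of squared bin heights.
    Tall, narrow peaks score higher than a flat, smeared profile."""
    return sum(h * h for h in profile.values())


def estimate_skew(pixels, candidates):
    """Return the candidate angle with the sharpest profile.
    A plain exhaustive scan, not the heuristics used by WISDOM++."""
    pixels = list(pixels)
    return max(candidates,
               key=lambda a: sharpness(projection_profile(pixels, a)))


def synthetic_region(skew_deg, lines=4, width=200, height=3, pitch=20):
    """Hypothetical test data: solid bars standing in for text lines,
    rotated counter-clockwise (as seen on screen) by skew_deg."""
    t = math.radians(skew_deg)
    s, c = math.sin(t), math.cos(t)
    points = []
    for k in range(lines):
        for y in range(k * pitch, k * pitch + height):
            for x in range(width):
                points.append((round(x * c + y * s), round(-x * s + y * c)))
    return points


# Candidate angles from -5.0 to 5.0 degrees in steps of 0.5.
candidates = [i * 0.5 for i in range(-10, 11)]
region = synthetic_region(3.0)
print(estimate_skew(region, candidates))
```

Because only the rotated row coordinate of each black pixel is needed, the rotation stays "virtual": no rotated image is ever built, which keeps each evaluation of the measure cheap.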
Another parameter computed during the preprocessing phase is the spread factor of the document image. It is defined as the ratio of the average distance between the regions Ri (avdist) to the average height of the same regions (avheight). In simple documents with a few sparse regions this ratio is greater than 1.0, while in complex documents with closely spaced text regions it is less than 1.0. The spread factor is used to set some parameters of the segmentation algorithm.
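As a rough illustration of the ratio avdist / avheight, the following Python sketch computes a spread factor for a list of regions. How the distance between regions is measured is not specified above, so the sketch assumes that regions are described by their top and bottom rows and that the distance is the vertical gap between consecutive regions. The `Region` class and `spread_factor` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Region:
    top: int      # first row of the region
    bottom: int   # row just below the region

    @property
    def height(self):
        return self.bottom - self.top


def spread_factor(regions):
    """avdist / avheight, with avdist taken (by assumption) as the mean
    vertical gap between consecutive regions sorted top to bottom."""
    if len(regions) < 2:
        raise ValueError("at least two regions are needed")
    ordered = sorted(regions, key=lambda r: r.top)
    # Overlapping regions contribute a gap of zero.
    gaps = [max(0, b.top - a.bottom) for a, b in zip(ordered, ordered[1:])]
    avdist = sum(gaps) / len(gaps)
    avheight = sum(r.height for r in ordered) / len(ordered)
    if avheight == 0:
        raise ValueError("regions have zero average height")
    return avdist / avheight


# A sparse layout (small regions, large gaps) and a dense one.
sparse = [Region(0, 10), Region(40, 50), Region(90, 100)]
dense = [Region(0, 30), Region(35, 65), Region(70, 100)]
print(spread_factor(sparse), spread_factor(dense))
```

Under these assumptions the sparse layout yields a ratio above 1.0 and the dense layout a ratio below 1.0, matching the behaviour described in the text.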