
The segmentation of a document image in WISDOM++

The segmentation of the document image into rectangular blocks enclosing content portions is performed by means of a variant of the Run Length Smoothing Algorithm (RLSA).

The original algorithm (Wong et al., 1982) applies four operators to the document image:

Horizontal smoothing with a threshold C_h

Vertical smoothing with a threshold C_v

Logical AND of the two smoothed images

Additional horizontal smoothing with another threshold C_a.

Although conceptually simple, the original algorithm requires scanning the image four times.
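The four operators listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the WISDOM++ code: the function names (`smooth_run`, `smooth_horizontal`, `smooth_vertical`, `rlsa`) are hypothetical, the image is assumed to be a list of rows with 1 for black and 0 for white, and a white run is assumed to be filled when its length is less than or equal to the threshold, including runs that touch the image border.

```python
def smooth_run(line, threshold):
    """Turn every white run (0s) of length <= threshold into black (1s)."""
    out = list(line)
    n = len(line)
    i = 0
    while i < n:
        if line[i] == 0:
            j = i
            while j < n and line[j] == 0:
                j += 1          # j ends one past the white run
            if j - i <= threshold:
                for k in range(i, j):
                    out[k] = 1
            i = j
        else:
            i += 1
    return out


def smooth_horizontal(image, threshold):
    """Apply run-length smoothing to every row."""
    return [smooth_run(row, threshold) for row in image]


def smooth_vertical(image, threshold):
    """Apply run-length smoothing to every column."""
    columns = [list(col) for col in zip(*image)]
    smoothed = [smooth_run(col, threshold) for col in columns]
    return [list(row) for row in zip(*smoothed)]


def rlsa(image, c_h, c_v, c_a):
    """Four-pass RLSA: horizontal, vertical, logical AND, extra horizontal."""
    horizontal = smooth_horizontal(image, c_h)
    vertical = smooth_vertical(image, c_v)
    combined = [[a & b for a, b in zip(h_row, v_row)]
                for h_row, v_row in zip(horizontal, vertical)]
    return smooth_horizontal(combined, c_a)
```

Each of the four steps in `rlsa` walks over the whole image once, which is where the four scans come from.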

WISDOM++ implements a variant where:

the image is scanned only twice with no additional cost (Shih & Chen, 1996);

the smoothing parameters C_v and C_a are adaptively defined on the basis of the spread factor;

the segmentation is sped up because it is performed on a reduced document image with a resolution of 75 dpi.
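To see why working on a reduced image helps, consider a hypothetical scan at 300 dpi: reducing it to 75 dpi is a factor of 4 in each direction, so each smoothing pass visits 16 times fewer pixels. The sketch below shows one possible way to build such a reduced image. The source does not say how WISDOM++ performs the reduction, so both the function name `reduce_resolution` and the rule used here (an output pixel is black if any pixel in the corresponding block is black) are assumptions.

```python
def reduce_resolution(image, factor):
    """Downsample a binary image (1 = black) by an integer factor.

    Assumed rule: an output pixel is black if any pixel in the
    corresponding factor x factor block of the input is black.
    """
    height, width = len(image), len(image[0])
    reduced = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[y][x]
                     for y in range(top, min(top + factor, height))
                     for x in range(left, min(left + factor, width))]
            row.append(1 if any(block) else 0)
        reduced.append(row)
    return reduced
```

For a 300 dpi input, `reduce_resolution(image, 4)` would yield the 75 dpi image under these assumptions.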

Bibliography

Shih, F. Y., S.-S. Chen. 1996. Adaptive Document Block Segmentation and Classification. IEEE Trans. on Systems, Man, and Cybernetics, Part B, 26(5), 797-802.

Wong, K. Y., R. G. Casey, F. M. Wahl. 1982. Document Analysis System. IBM Journal of Research and Development, 26(6), 647-656.


berardi@di.uniba.it