WISDOM++

A WIse System for DOcument Management

LACAM @ Dipartimento di Informatica - Università degli Studi di Bari - Via Orabona, 4 -70126 Bari

A short description
The functional architecture
The distribution package
Glossary
Project team
Related publications
CORDIS Technology Marketplace
FAQs

A short description

WISDOM++ is an intelligent document processing system that can transform paper documents into XML format.
Distinguishing features:
High adaptivity. WISDOM++ is a knowledge-based system capable of supplying assistance to the user during the document analysis and recognition process. The knowledge base, declaratively expressed in the form of decision trees or rules, is automatically built from a set of training documents using machine learning techniques.
Real-time user interaction. WISDOM++ has been designed as a multi-user system where each authorized user has his/her own rule base.
Multi-page document management. WISDOM++ processes the pages of a multi-page document independently of one another at every step, because the optical scan function works on a single page at a time. The sequence of pages in a multi-page document is defined by the user.


The functional architecture

The document analysis and recognition process in WISDOM++ consists of the following steps:

Page acquisition. Each page is scanned at a resolution of 300 dpi and thresholded into a binary image. The bitmap of an A4-sized page measures 2,496 × 3,500 pixels, which at one bit per pixel takes 2,496 × 3,500 / 8 = 1,092,000 bytes, and is stored in TIFF format.
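
The byte count quoted for an A4 page corresponds to one bit per pixel. The following is a minimal sketch of that arithmetic and of a thresholding step, not the WISDOM++ implementation: the function names are hypothetical, and the fixed global cutoff of 128 is an assumption, since the thresholding method is not specified here.

```python
from typing import List


def threshold(gray_rows: List[List[int]], cutoff: int = 128) -> List[List[int]]:
    """Binarize 8-bit gray values: 1 = ink (dark pixel), 0 = background.

    A fixed global cutoff is an assumption made for illustration only.
    """
    return [[1 if value < cutoff else 0 for value in row] for row in gray_rows]


def bitmap_bytes(width_px: int, height_px: int) -> int:
    """Bytes needed for an uncompressed 1-bit image, each row padded to a whole byte."""
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px


if __name__ == "__main__":
    # Tiny 2 x 2 gray image, then the page size quoted in the text.
    print(threshold([[0, 200], [130, 90]]))
    print(bitmap_bytes(2496, 3500))
```

Since 2,496 is a multiple of 8, each row occupies exactly 312 bytes and no padding is needed for the page size given above.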

Document analysis.

Pre-processing, involving the evaluation of the skew angle, the rotation of the document image, and the computation of a spread factor.
Segmentation, which partitions the document image into blocks.
Block classification, which labels each block according to its type of content (separation of text from graphics).
Layout analysis, concerning the extraction of the layout structure of the document.
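
The layout structure produced by these steps can be pictured as a tree whose leaves are blocks, grouped into frames, grouped into pages. The sketch below is one possible representation under simplifying assumptions: all class and field names are hypothetical, and only a single level of frames is modelled, whereas a page may have several intermediate levels.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Block:
    """Rectangular area enclosing a portion of the page content (a leaf)."""
    x0: int
    y0: int
    x1: int
    y1: int
    content_type: str  # e.g. "text" or "graphics", as set by block classification


@dataclass
class Frame:
    """Rectangular area corresponding to a group of blocks."""
    blocks: List[Block] = field(default_factory=list)

    def bounding_box(self) -> Tuple[int, int, int, int]:
        # Smallest rectangle that encloses every block in the frame.
        return (
            min(b.x0 for b in self.blocks),
            min(b.y0 for b in self.blocks),
            max(b.x1 for b in self.blocks),
            max(b.y1 for b in self.blocks),
        )


@dataclass
class Page:
    frames: List[Frame] = field(default_factory=list)


@dataclass
class Document:
    """Root of the layout tree: the ordered set of pages."""
    pages: List[Page] = field(default_factory=list)

    def blocks(self) -> Iterator[Block]:
        # Walk the tree from root to leaves, page by page.
        for page in self.pages:
            for frame in page.frames:
                yield from frame.blocks


if __name__ == "__main__":
    frame = Frame([Block(10, 10, 200, 40, "text"), Block(10, 50, 180, 80, "text")])
    doc = Document([Page([frame])])
    print(frame.bounding_box(), len(list(doc.blocks())))
```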

Document Classification and Understanding. The problem of finding the logical structure of a document can be cast as the problem of defining a mapping from the layout structure into the logical one. In WISDOM++, this mapping is limited to the association of a page with a document class (document classification) and the association of page layout components with basic logical components (document understanding). The mapping is built by matching the document description against both models of classes of documents and models of the logical components of interest for that class.
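
To illustrate the idea of associating layout components with logical components, here is a small sketch in which each rule is a predicate over a component description. Everything in it is hypothetical: the attribute names, the thresholds and the labels are invented for illustration, and the rules are written by hand, whereas WISDOM++ learns its rules from training documents.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Component = Dict[str, Any]
Rule = Tuple[str, Callable[[Component], bool]]

# Hypothetical rules for one hypothetical document class.
# The first rule whose condition holds decides the logical label.
RULES: List[Rule] = [
    ("title", lambda c: c["type"] == "text" and c["y"] < 300 and c["height"] > 40),
    ("abstract", lambda c: c["type"] == "text" and 300 <= c["y"] < 900),
    ("figure", lambda c: c["type"] == "graphics"),
]


def label_component(component: Component, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the logical label of the first matching rule, or None if no rule fires."""
    for label, condition in rules:
        if condition(component):
            return label
    return None


if __name__ == "__main__":
    page = [
        {"type": "text", "y": 100, "height": 60},
        {"type": "text", "y": 500, "height": 200},
        {"type": "graphics", "y": 1500, "height": 400},
        {"type": "text", "y": 2000, "height": 10},
    ]
    for component in page:
        print(component, "->", label_component(component))
```

Components that match no rule stay unlabelled, which mirrors the fact that only the logical components of interest for a class are modelled.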

OCR. WISDOM++ allows the user to set up the text extraction process by selecting the logical components to which OCR is to be applied.

Transformation into a web-accessible format. The web-accessible version (XML format) of the original document contains both the text returned by the OCR and the pictures extracted from the original bitmap and converted into JPEG format. Text and images are spatially arranged so that the XML reconstruction of the document is as faithful as possible to the original bitmap. Moreover, the XML format retains the information extracted during the document understanding phase, since the Document Type Definition (DTD) is specialized for each class of documents in order to represent its specific logical structure.
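
As a rough illustration of this last step, the sketch below serializes labelled components to XML with Python's standard library. It is not the WISDOM++ export format: the element names, the attribute names and the image file name are assumptions, and the class-specific DTD is not reproduced.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List


def page_to_xml(doc_class: str, components: List[Dict[str, Any]]) -> str:
    """Serialize labelled components, keeping their position on the page.

    Text components carry the OCR output as element text; picture
    components carry a reference to the extracted JPEG file.
    """
    root = ET.Element("document", {"class": doc_class})
    for comp in components:
        elem = ET.SubElement(
            root,
            comp["label"],  # logical label used as the element name (assumption)
            {"x": str(comp["x"]), "y": str(comp["y"])},
        )
        if "text" in comp:
            elem.text = comp["text"]
        elif "image" in comp:
            elem.set("src", comp["image"])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = page_to_xml(
        "tpami",
        [
            {"label": "title", "x": 120, "y": 80, "text": "A Sample Title"},
            {"label": "figure", "x": 100, "y": 900, "image": "page1_fig1.jpg"},
        ],
    )
    print(xml_text)
```

Using the logical label as the element name is one simple way to keep the result of document understanding visible in the output; a per-class DTD could then constrain which labels are allowed.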


The distribution package

WISDOM++ 2.0 is an application that runs under Windows 98 or later.
Download the distribution package (wisdom++.zip, 59.5 MB) and unzip it into a temporary directory.
Sample multi-page documents are also available. They all belong to a single class (tpami) of scientific papers. They have been completely processed, and you can load their layout and logical structures from the database provided with this distribution package. The database also includes a set of learned rules used in the document classification and understanding processes. To view the document images, which are provided in TIFF format, make sure that they are unzipped into the "Doc" folder of the user "root" defined in the database.
See the User Guide for further details about system requirements, installation and usage of the system.

Warning: The system WISDOM++ 2.0 is free for evaluation, research and teaching purposes, but not for commercial purposes.

Please Acknowledge


Glossary

Block: Rectangular area enclosing a portion of the document content.

Frame: Rectangular area corresponding to a group of blocks.

Layout structure: Structure which associates the content of a document with a hierarchy of layout objects (such as blocks, frames and pages). The leaves of the layout tree are the blocks, while the root represents the set of pages of the whole document. The intermediate nodes of the layout tree associated with each page may include several frames.

Logical structure: Structure which associates the content of a document with a hierarchy of logical objects (such as sender/receiver of a business letter, title/authors of a scientific article, etc.).

OCR (Optical Character Recognition): Conversion of an image of printed text into machine-readable characters.

Skew angle: The orientation angle of the text baselines of a document image.

Spread factor: A ratio computed by WISDOM++ and used to set some parameters of the segmentation algorithm. In simple documents with few sparse regions the ratio is greater than 1.0, while in complex documents with closely written text regions it is lower than 1.0.

XML (eXtensible Markup Language): Metalanguage used by WISDOM++ to export the results of the whole document image analysis and recognition process, which are stored in the database.


Project team

Project Leader

Donato Malerba

LACAM Staff

Oronzo Altamura, Margherita Berardi, Michelangelo Ceci, Floriana Esposito

Students currently involved in the project

Fortunato A. Ammendolia

Previous collaborators

Francesco De Tommaso, Massimo Di Ceglie, Dario Gerbino, Francesca A. Lisi, Gaetano Lombardi, Palma Perrucci, Teresa Pinto, Ignazio Sardella, Giacomo Sidella, Rosa Maria Spadavecchia, Silvana Spagnoletta


Related publications

(in reverse chronological order)

M. Berardi, M. Lapi & D. Malerba (2004). An integrated approach for automatic semantic structure extraction in document images, in S. Marinai & A. Dengel (Eds.), Document Analysis Systems VI, 6th International Workshop, DAS 2004, Lecture Notes in Computer Science, 3163, 179-190. (pdf)

D. Malerba, F. Esposito, O. Altamura, M. Ceci & M. Berardi (2003). Correcting the Document Layout: A Machine Learning Approach, Proceedings of the Seventh International Conference on Document Analysis and Recognition, 97-102, IEEE Computer Society Press, Los Alamitos, CA. (pdf)

D. Malerba, M. Ceci & M. Berardi (2003). XML and Knowledge Technologies for Semantic-Based Indexing of Paper Documents, in V. Marík, W. Retschitzegger, & O. Stepánková (Eds.) Database and Expert Systems Applications, 14th International Conference, DEXA 2003, Lecture Notes in Computer Science, 2736, 256-265, Springer, Berlin, Germany. (pdf)

M. Berardi, M. Ceci, F. Esposito & D. Malerba (2003). Learning Logic Programs for Layout Analysis Correction, Proceedings of the 20th International Conference on Machine Learning (ICML 2003), 27-34. (pdf)

D. Malerba, F. Esposito & O. Altamura (2002). Adaptive layout analysis of document images, in H.-S. Hacid, Z.W. Ras, D.A. Zighed & Y. Kodratoff (Eds.), Foundations of Intelligent Systems, 13th International Symposium, ISMIS'2002, Lecture Notes in Artificial Intelligence, 2366, 526-534, Springer, Berlin, Germany. (pdf)

D. Malerba, F. Esposito, F.A. Lisi & O. Altamura (2001). Automated Discovery of Dependencies Between Logical Components in Document Image Understanding, Proceedings of the Sixth International Conference on Document Analysis and Recognition, 174-178, IEEE Computer Society Press, Los Alamitos, CA. (pdf)

O. Altamura, F. Esposito & D. Malerba (2001). Transforming Paper Documents into XML Format with WISDOM++, International Journal of Document Analysis and Recognition, Springer Verlag, 3(2), 175-198. (pdf)

F. Esposito, D. Malerba, & F.A. Lisi (2000). Machine Learning for Intelligent Processing of Printed Documents, Journal of Intelligent Information Systems, Kluwer Academic Publishers, 14(2/3), 175-198. (pdf)

O. Altamura, F. Esposito, & D. Malerba (1999). WISDOM++: An Interactive and Adaptive Document Analysis System, Proceedings of the Fifth International Conference on Document Analysis and Recognition, 366-369, IEEE Computer Society Press, Los Alamitos, CA. (pdf)

O. Altamura, F. Esposito, F.A. Lisi, & D. Malerba (1999). Symbolic Learning Techniques in Paper Document Processing, in P. Perner and M. Petrou (Eds.), Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Artificial Intelligence, 1715, 159-173, Springer: Berlin.

O. Altamura, F. Esposito, F.A. Lisi, & D. Malerba (1999). Attributional and Relational Learning Issues in Document Analysis and Recognition, Proc. of the ICML'99 Workshop on Machine Learning in Computer Vision, 20-31, Bled, Slovenia.

F. Esposito, D. Malerba, & F.A. Lisi (1999). Machine Learning for Intelligent Document Processing: The WISDOM System, in Z.W. Ras and A. Skowron (Eds.), Foundations of Intelligent Systems, Lecture Notes in Artificial Intelligence, 1609, 103-113, Springer: Berlin.

F. Esposito, D. Malerba, G. Semeraro, N. Fanizzi, & S. Ferilli (1998). Adding machine learning and knowledge intensive techniques to a digital library service, International Journal on Digital Libraries, 2, 1, 3-19.

F. Esposito, D. Malerba, G. Semeraro, & F.A. Lisi (1998). Machine learning issues in analysis, classification and understanding of document images. Workshop on Learning in Computer Vision.

D. Malerba, F. Esposito, G. Semeraro, & L. De Filippis (1997). Processing Paper Documents with WISDOM. In M. Lenzerini (Ed.), AI*IA 97: Advances in Artificial Intelligence, Lecture Notes in Artificial Intelligence, 1321, 439-442, Springer, Berlin, Germany.

F. Esposito, C.D. Antifora, G. De Gennaro, D. Malerba, & G. Semeraro (1997). Information Capture and Semantic Indexing of Digital Libraries Through Machine Learning Techniques. Proceedings of the Fourth International Conference on Document Analysis and Recognition, 722-727, IEEE Computer Society Press, Los Alamitos, CA.

F. Esposito, D. Malerba, G. Semeraro, N. Fanizzi, & S. Ferilli (1997). Adding Intelligence to Digital Libraries: IDL. Proceedings of the IJCAI Workshop on "AI in Digital Libraries", 23-31, Nagoya, Japan.

F. Esposito, D. Malerba, & G. Semeraro (1995). A Knowledge-Based Approach to the Layout Analysis, Proceedings of the Third International Conference on Document Analysis and Recognition, 466-471, IEEE Computer Society Press, Los Alamitos, CA. (pdf)

D. Malerba, G. Semeraro, & E. Bellisari (1995). LEX: A Knowledge-Based System for the Layout Analysis, Proceedings of the Third International Conference on the Practical Application of Prolog, 429-443.

F. Esposito, D. Malerba, & G. Semeraro (1994). Multistrategy Learning for Document Recognition, Applied Artificial Intelligence: An International Journal, 8(1), 33-84.


FAQs

None yet available. Send all requests/comments to: Margherita Berardi, Dipartimento di Informatica, Università degli Studi di Bari (Italy).


berardi@di.uniba.it