System for DOcument
Dipartimento di Informatica - Università degli Studi di Bari - Via Orabona, 4 - 70126 Bari
A short description
WISDOM++ is an intelligent document processing system that can transform
paper documents into XML format.
High adaptivity. WISDOM++ is a knowledge-based system capable
of supplying assistance to the user during the document analysis and recognition
process. The knowledge base, declaratively expressed in the form of decision
trees or rules, is automatically built from a set of training
documents using machine learning techniques.
Real-time user interaction. WISDOM++ has been designed as a
multi-user system where each authorized user has his/her own rule base.
Multi-page document management. WISDOM++ processes the pages
of a multi-page document independently of each other in all steps, since
the optical scan function is able to work on a single page at a time. The
sequence of pages in a multi-page document is defined by the user.
The functional architecture
The document analysis and recognition process in WISDOM++ consists of the following steps:
Pre-processing, involving the evaluation of the skew angle, the rotation of the document image, and the computation of a spread factor.
Segmentation, aiming at segmenting the document image into blocks.
Block classification, aiming at classifying blocks with respect to the type of content (separation of text from graphics).
Layout analysis, concerning the extraction of the layout structure of the document.
Document Classification and Understanding. The problem of finding
the logical structure of a document can be cast as the problem of
defining a mapping from the layout structure into the logical one. In WISDOM++,
this mapping is limited to the association of a page with a document class
(document classification) and the association
of page layout components with basic logical components (document
understanding). The mapping is built by matching the document
description against both models of classes of documents and
models of the logical components of interest for that class.
OCR. WISDOM++ allows the user to set up the text extraction process
by selecting the logical components to which OCR is to be applied.
Transformation into a web-accessible format. The web-accessible
version (XML format) of the original document contains both the text returned
by the OCR and the pictures extracted from the original bitmap and converted
into JPEG format. Text and images are spatially arranged so that the
XML reconstruction of the document is as faithful as possible to the
original bitmap. Moreover, the XML format maintains the information extracted
during the document understanding phase, since the Document Type Definition
(DTD) is specialized for each class of documents in order to represent
the specific logical structure.
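As an illustration of the transformation into XML described in this section, the following minimal Python sketch writes the logical components of one page as XML elements that keep their position on the page. It is not the WISDOM++ export code: the function export_page, the element names (page, title, figure, img) and the attribute names are hypothetical and were chosen only for this example. The actual element set is defined by the DTD of each document class.

```python
import xml.etree.ElementTree as ET


def export_page(doc_class, components):
    """Build an XML element for one page (hypothetical schema).

    doc_class  -- document class label assigned by classification
    components -- list of dicts with keys label, x, y, width, height and
                  either "text" (OCR output) or "image" (JPEG file name)
    """
    page = ET.Element("page", {"class": doc_class})
    for comp in components:
        # One element per logical component; the coordinates are kept so
        # that a viewer can arrange text and images as in the bitmap.
        elem = ET.SubElement(page, comp["label"], {
            "x": str(comp["x"]),
            "y": str(comp["y"]),
            "width": str(comp["width"]),
            "height": str(comp["height"]),
        })
        if "text" in comp:
            elem.text = comp["text"]
        else:
            ET.SubElement(elem, "img", {"src": comp["image"]})
    return page


# Made-up components for a page of the "tpami" class.
components = [
    {"label": "title", "x": 60, "y": 40, "width": 480, "height": 30,
     "text": "A Sample Title"},
    {"label": "figure", "x": 60, "y": 100, "width": 200, "height": 150,
     "image": "fig1.jpg"},
]
xml_text = ET.tostring(export_page("tpami", components), encoding="unicode")
print(xml_text)
```

Using the logical label as the element name is one simple way to keep the result of document understanding in the exported file. A class-specific DTD would then constrain which labels may appear for that class.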
The distribution package
WISDOM++ 2.0 is an application running under Windows 98 or later.
Download the distribution package (wisdom++.zip,
59.5 MB) and unzip it into a temporary directory.
Sample multi-page documents are also available.
They all belong to a single class (tpami) of scientific papers.
They have been completely processed and you can load their layout and logical structures
from the database provided with this distribution package. The database also includes a set of learned rules
used in the document classification and understanding processes.
To view the document images made available in TIFF format, make sure that they are unzipped into the "Doc" folder of the user "root" defined in the database.
See the User Guide
for further details about system requirements, installation and usage of
Warning: The system WISDOM++ 2.0 is free for evaluation, research and teaching purposes, but not for commercial purposes.
Block: Rectangular area enclosing a portion of the document
Frame: Rectangular area corresponding to a group of blocks.
Layout structure: Structure which associates the content of a
document with a hierarchy of layout objects (such as blocks, frames
and pages). The leaves of the layout tree are the blocks, while
the root represents the set of pages of the whole document. Intermediate
nodes of the layout tree associated to each page may include several frames.
Logical structure: Structure which associates the content of
a document with a hierarchy of logical objects (such as sender/receiver
of a business letter, title/authors of a scientific article, etc.).
OCR (Optical Character Recognition):
Skew angle: The orientation angle of the text baselines of a
Spread factor: Factor computed by WISDOM++ and used to define some parameters of the segmentation algorithm.
In simple documents with few sparse regions this ratio is greater than 1.0, while in complex documents with closely written text regions it is less than 1.0.
XML (eXtensible Markup Language): metalanguage used by WISDOM++ to export the results, stored in the database, of the whole document image analysis and recognition process.
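The layout structure defined in this glossary can be pictured as a small tree: the document at the root, pages below it, frames as intermediate nodes and blocks as leaves. The Python sketch below is only an illustration under that reading. The class names, the fields and the bounding_box helper are assumptions made for this example and do not reflect the data model used inside WISDOM++.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """Leaf: rectangular area enclosing a portion of the document."""
    x: int
    y: int
    width: int
    height: int
    content_type: str = "text"  # e.g. "text" or "graphics" (assumed labels)


@dataclass
class Frame:
    """Intermediate node: rectangular area corresponding to a group of blocks."""
    blocks: List[Block] = field(default_factory=list)

    def bounding_box(self) -> Tuple[int, int, int, int]:
        # Assumption: the frame rectangle is the smallest rectangle that
        # contains all of its blocks, returned as (x, y, width, height).
        left = min(b.x for b in self.blocks)
        top = min(b.y for b in self.blocks)
        right = max(b.x + b.width for b in self.blocks)
        bottom = max(b.y + b.height for b in self.blocks)
        return (left, top, right - left, bottom - top)


@dataclass
class Page:
    frames: List[Frame] = field(default_factory=list)


@dataclass
class Document:
    """Root: the set of pages of the whole document."""
    pages: List[Page] = field(default_factory=list)

    def leaves(self) -> List[Block]:
        # Collect the blocks of every frame of every page, in page order.
        return [b for p in self.pages for f in p.frames for b in f.blocks]


# A one-page document with a single frame that groups two text blocks.
frame = Frame(blocks=[Block(10, 10, 100, 20), Block(10, 40, 100, 60)])
doc = Document(pages=[Page(frames=[frame])])
print(len(doc.leaves()), frame.bounding_box())
```

A logical structure could be sketched in the same way, with logical components such as title or authors in place of frames and blocks.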
Oronzo Altamura, Margherita Berardi, Michelangelo Ceci, Floriana Esposito
Students currently involved in the project
Fortunato A. Ammendolia
Francesco De Tommaso, Massimo Di Ceglie, Dario Gerbino, Francesca A. Lisi, Gaetano Lombardi, Palma Perrucci, Teresa Pinto, Ignazio
Sardella, Giacomo Sidella, Rosa Maria Spadavecchia, Silvana Spagnoletta
(in inverse chronological order)
M. Berardi, M. Lapi & D. Malerba (2004). An integrated approach for automatic semantic structure extraction in document images. In S. Marinai & A. Dengel (Eds.),
Document Analysis Systems VI. 6th International Workshop, DAS 2004,
Lecture Notes in Computer Science, Vol. 3163, 179-190.
D. Malerba, F. Esposito, O. Altamura, M. Ceci & M. Berardi (2003). Correcting the Document Layout: A Machine Learning Approach, Proceedings of the Seventh International Conference on Document Analysis and Recognition, 97-102, IEEE Computer Society Press, Los Alamitos, CA.
D. Malerba, M. Ceci & M. Berardi (2003). XML and Knowledge Technologies for Semantic-Based Indexing of Paper Documents, in V. Marík, W. Retschitzegger, & O. Stepánková (Eds.) Database and Expert Systems Applications, 14th International Conference, DEXA 2003, Lecture Notes in Computer Science, 2736, 256-265, Springer, Berlin, Germany.
M. Berardi, M. Ceci, F. Esposito & D. Malerba (2003). Learning Logic Programs for Layout Analysis Correction, Proceedings of the 20th International Conference on Machine Learning (ICML 2003), 27-34.
D. Malerba, F. Esposito & O. Altamura (2002). Adaptive layout analysis of document images, in H.-S. Hacid, Z.W. Ras, D.A. Zighed & Y. Kodratoff (Eds.), Foundations of Intelligent Systems, 13th International Symposium, ISMIS'2002, Lecture Notes in Artificial Intelligence, 2366, 526-534, Springer, Berlin, Germany.
D. Malerba, F. Esposito, F.A. Lisi & O. Altamura (2001). Automated Discovery of Dependencies Between Logical Components in Document Image Understanding, Proceedings of the Sixth International Conference on Document Analysis and Recognition, 174-178, IEEE Computer Society Press, Los Alamitos, CA.
O. Altamura, F. Esposito & D. Malerba (2001). Transforming
Paper Documents into XML Format with WISDOM++,
International Journal of Document Analysis and Recognition, Springer Verlag, 3(2), 175-198.
F. Esposito, D. Malerba, & F.A. Lisi (2000). Machine
Learning for Intelligent Processing of Printed Documents,
Journal of Intelligent Information Systems, Kluwer Academic Publishers, 14(2/3), 175-198.
O. Altamura, F. Esposito, & D. Malerba (1999). WISDOM++: An Interactive
and Adaptive Document Analysis System, Proceedings of the Fifth International
Conference on Document Analysis and Recognition, 366-369, IEEE Computer
Society Press, Los Alamitos, CA.
O. Altamura, F. Esposito, F.A. Lisi, & D. Malerba (1999). Symbolic
Learning Techniques in Paper Document Processing, in P. Perner and M. Petrou
(Eds.), Machine Learning and Data Mining in Pattern Recognition, Lecture
Notes in Artificial Intelligence, 1715, 159-173, Springer: Berlin.
O. Altamura, F. Esposito, F.A. Lisi, & D. Malerba (1999). Attributional
and Relational Learning Issues in Document Analysis and Recognition, Proc.
of the ICML'99 Workshop on Machine Learning in Computer Vision, 20-31,
F. Esposito, D. Malerba, & F.A. Lisi (1999). Machine Learning for Intelligent
Document Processing: The WISDOM System, in Z.W. Ras and A. Skowron (Eds.),
Foundations of Intelligent Systems, Lecture Notes in Artificial Intelligence,
1609, 103-113, Springer: Berlin.
F. Esposito, D. Malerba, G. Semeraro, N. Fanizzi, & S. Ferilli (1998).
Adding machine learning and knowledge intensive techniques to a digital
library service, International Journal on Digital Libraries, 2, 1, 3-19.
F. Esposito, D. Malerba, G. Semeraro, & F.A. Lisi (1998). Machine learning
issues in analysis, classification and understanding of document images.
Workshop on Learning in Computer Vision.
D. Malerba, F. Esposito, G. Semeraro, & L. De Filippis (1997). Processing
Paper Documents with WISDOM. In M. Lenzerini (Ed.), AI*IA 97: Advances
in Artificial Intelligence, Lecture Notes in Artificial Intelligence, 1321,
439-442, Springer, Berlin, Germany.
F. Esposito, C.D. Antifora, G. De Gennaro, D. Malerba, & G. Semeraro
(1997). Information Capture and Semantic Indexing of Digital Libraries
Through Machine Learning Techniques. Proceedings of the Fourth International
Conference on Document Analysis and Recognition, 722-727, IEEE Computer
Society Press, Los Alamitos, CA.
F. Esposito, D. Malerba, G. Semeraro, N. Fanizzi, & S. Ferilli (1997).
Adding Intelligence to Digital Libraries: IDL. Proceedings of the IJCAI
Workshop on "AI in Digital Libraries", 23-31, Nagoya, Japan.
F. Esposito, D. Malerba, and G. Semeraro (1995). "A Knowledge-Based Approach
to the Layout Analysis." Proceedings of the Third International Conference
on Document Analysis and Recognition, 466-471. IEEE Computer Society Press,
Los Alamitos, CA.
D. Malerba, G. Semeraro, and E. Bellisari (1995). "LEX: A Knowledge-Based
System for the Layout Analysis." Proceedings of the Third International
Conference on the Practical Application of Prolog, 429-443.
F. Esposito, D. Malerba, and G. Semeraro (1994). "Multistrategy Learning
for Document Recognition". Applied Artificial Intelligence: An International
Journal, 8(1), 33-84.
None yet available. Send all requests/comments to: Margherita Berardi, Dipartimento di Informatica, Università
degli Studi di Bari (Italy).