In WISDOM++ block classifiers are automatically built by means of inductive learning techniques, namely decision tree induction. A tree-based method was chosen instead of the more common generalized linear models because of its inherent flexibility: decision trees can handle complicated interactions among features and produce results that are easy to interpret.
WISDOM++ discriminates between:
Each block is represented by a set of numeric attribute-value couples:
|Attribute||Description|
|Height||Height of the reduced block image.|
|Length||Length of the reduced block image.|
|Area||Area of the reduced block image (height*length).|
|Eccen||Eccentricity of the reduced block image (length/height).|
|Blackpix||Total number of black pixels in the reduced block image.|
|Bw_trans||Total number of black-white transitions in all rows of the reduced block image.|
|Pblack||Percentage of black pixels in the reduced block image (blackpix/area).|
|Mean_tr||Average number of black pixels per black-white transition (blackpix/bw_trans).|
|F1||Short run emphasis.|
|F2||Long run emphasis.|
|F3||Extra-long run emphasis.|
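Most of these attributes are simple functions of the reduced block bitmap. The following sketch shows how they can be computed with NumPy, assuming the block is a 2-D 0/1 array with 1 denoting a black pixel; the run-emphasis features F1-F3 are omitted because their exact definitions are not given here, and the reading of "black-white transition" as any colour change between horizontally adjacent pixels is an assumption.

```python
import numpy as np

def block_features(block):
    """Compute WISDOM++-style numeric features for a reduced block image.

    `block` is a 2-D 0/1 array (1 = black pixel). F1-F3 (run-emphasis
    features) are not computed here, since their definitions are not
    spelled out in the text.
    """
    block = np.asarray(block)
    height, length = block.shape
    area = height * length
    eccen = length / height
    blackpix = int(block.sum())
    # Transitions counted row by row; here taken to be any change of
    # colour between adjacent pixels in the same row (an assumption).
    bw_trans = int(np.sum(block[:, 1:] != block[:, :-1]))
    pblack = blackpix / area
    mean_tr = blackpix / bw_trans if bw_trans else float(blackpix)
    return {"height": height, "length": length, "area": area,
            "eccen": eccen, "blackpix": blackpix, "bw_trans": bw_trans,
            "pblack": pblack, "mean_tr": mean_tr}
```

For example, a 2x4 block with four black pixels and three row-wise colour changes yields pblack = 0.5 and mean_tr = 4/3.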
In a preliminary study, a decision tree was induced using a set of 5473 examples of pre-classified blocks obtained from 53 documents of various kinds. An average accuracy above 97% (Esposito et al., 1997) was obtained with an extension of C4.5 (Quinlan, 1993). However, this learning system operates in batch mode: it is not possible to revise the decision tree when some blocks are misclassified, unless a new tree is generated from scratch using an extended training set.
In order to give the user the possibility of training WISDOM++ on-line, we considered the incremental decision tree learning system ITI (Utgoff, 1989), which is the only one able to handle numerical attributes such as those used to describe the blocks.
ITI can operate in three different ways:
|Operational mode||Behaviour of the decision tree learning system|
|Batch||It builds a new tree from all training examples available|
|Normal||It first updates the frequency counts associated with each node of the tree as soon as a new instance is input, then it restructures the decision tree according to the updated frequency counts.|
|Error-correction||Analogous to the normal operational mode, except that the frequency counts are updated only in case of misclassification of the new instance.|
The main difference between the two incremental modes is that the normal mode is guaranteed to build the same decision tree independently of the order in which examples are presented, while the error-correction mode is not.
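The error-correction policy can be illustrated schematically: the model is updated only when it misclassifies an incoming example. In this sketch a trivial 1-nearest-neighbour table stands in for ITI's decision tree (an assumption for illustration; it is not ITI's actual algorithm or API), since the point is the update policy, not the model.

```python
def error_correction_train(stream):
    """Incremental training in error-correction mode: an example is
    incorporated only if the current model misclassifies it.

    `stream` yields (feature, label) pairs; a 1-NN table over a single
    numeric feature stands in for the decision tree.
    """
    stored = []  # (feature, label) pairs incorporated so far
    for x, y in stream:
        pred = None
        if stored:
            # Predict with the nearest stored example.
            pred = min(stored, key=lambda fl: abs(fl[0] - x))[1]
        if pred != y:
            # Misclassified (or no model yet): incorporate the example.
            stored.append((x, y))
    return stored
```

Running this on a stream of, say, (pblack, class) examples shows why error-correction mode produces much smaller trees: correctly classified examples are never incorporated, so only a fraction of the training stream is stored.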
In order to test the performance of ITI as a block classifier, a set of 112 real, single-page documents was used as input data. Three experiments were organized: ITI was trained in batch mode in the first, in pure error-correction mode in the second, and in a mixed normal/error-correction mode in the third.
The main characteristics of the learned trees are reported in the table below. The last column refers to the number of examples stored in the nodes of the induced trees.
|Size (Kb)||No. nodes||No. leaves||No. incorporated examples|
The decision tree built in batch mode takes more than 24 Mb, since all instances have to be stored, while the decision tree obtained in error-correction mode requires 982 Kb. Nevertheless, this difference in tree size does not correspond to a significant difference in predictive accuracy estimated on the independent test set. This justifies the use of the decision tree built in the error-correction operational mode within WISDOM++ 2.0.
Experimental results are reported in (Altamura et al., 1999).