Table of Contents
Learning in Document Analysis and Understanding
Overview
Objectives
Learning in Document Processing: Some Data
Document processing requires a large amount of knowledge
Hand-coding knowledge?
Typical machine learning applications
Comparing development times (Michie, 1989)
Overview
What is learning?
History
Neural modeling (1955-1965) (1986-…)
Decision-Theoretic Techniques (1955-1965)
Symbolic Concept-oriented Techniques (1962-1980)
Knowledge Intensive Learning Systems and Multistrategy Learning (1980 - today)
The general model of a Learning System
Learning Systems
Example: A handwriting recognition learning problem
Basic Questions
What do Machines learn?
Subsymbolic and Symbolic Learning
Both learning and performance rely on the ability to represent knowledge
Representing experience
Representing the knowledge
Levels of Concept Descriptions
The task
The degree of supervision
How do Machines Learn?
Inferences
PowerPoint slide
The Inductive Paradigm
Empirical Learning (inductively learning from many data)
Example
A small training set
How many hypotheses?
BIAS
How many examples do we need?
The deductive paradigm (explanation-based learning)
A multicriteria classification of machine learning methods
Overview
Statistical learning methods
Trainable classifiers
The basic model for a trainable pattern classifier
Discriminant analysis (Fisher 1936)
PowerPoint slide
Fisher classification functions
Overview
Decision tree learning
From Decision Trees to Decision Rules
Overview
Learning Rules Directly
Concept Learning
A Concept Learning Task
Representing Hypotheses
A Formalization
What information is available?
The inductive learning hypothesis
Concept Learning as Search
An example
Semantically distinct hypotheses
Efficient search: how?
General-to-specific ordering
PowerPoint slide
Terminology
Taking advantage of the general-to-specific ordering
Example
PowerPoint slide
FIND-S Algorithm
No revision in case of negative example: Why?
Limitations of Find-S
Limitations of Find-S (cont.)
Version Space
The List-Then-Eliminate algorithm
Pros and cons
Version Space: A compact representation
General boundary
Specific boundary
A Version Space
Candidate-Elimination algorithm
Candidate-Elimination algorithm (cont.)
PowerPoint slide
What does the Candidate-Elimination algorithm converge to?
Empty Version Space
Other characteristics
How can partially learned concepts be used?
An interactive learning algorithm
Dealing with noisy training instances
Dealing with noisy training instances (cont.)
What if the concept is not contained in the hypothesis space?
A hypothesis space that includes every possible hypothesis?
A fundamental property of inductive inference
Linear Regression
A formal definition of inductive bias
Modeling inductive systems by equivalent deductive systems
Bias of the Candidate-Elimination algorithm
Comparing the inductive bias of learning algorithms
Related work
Learning disjunctive concepts: How?
Sequential Covering algorithms
Sequential Covering Algorithm
LEARN-ONE-RULE
Sequential Covering + Candidate Elimination
General-to-specific search
The search space for rule preconditions
Beam search
Simultaneous vs. sequential covering algorithm
Computational complexity
Induce rules directly or convert a decision tree to a set of rules?
Replication problem
Single-concept rule learning
Alternatively ...
Changes to Sequential Covering algorithm
Classification of new cases
Default rule
Learning multiple concepts
Multiple-concept learning
Multiple classification
Learning multiple independent concepts
Learning multiple dependent concepts
Learning multiple dependent concepts (cont.)
Related work
Propositional rules
Overview
First-order rules
First-order rules and labeled graphs
Examples
Why do we need first-order representations?
Problems raised by attribute-value representations
A first-order representation for examples
PowerPoint slide
First-order rules as Prolog clauses
First-order rules as SQL queries
When to apply first-order learning algorithms?
Differences between propositional learning and first-order learning
Analogy between propositional and first-order learning systems
Terminology
Terminology (cont.)
Learning sets of first-order rules: FOIL
The Basic FOIL algorithm
PowerPoint slide
A FOIL example
Further details on FOIL
Limitations of FOIL
TILDE: Main characteristics
TILDE (cont.)
First-order decision tree
TILDE Method
PROGOL: Main characteristics
PROGOL: an example
INDUBI/CSL: Main characteristics
ATRE: Main characteristics
ATRE: Search strategy
Related work
Overview
Applying machine learning to document analysis & recognition: Where & How?
General comments
Stages of a Machine Learning application
Development stages
Peculiarities of applications to document processing
Machine learning for intelligent document processing: the case of WISDOM++
Overview
Document processing steps in WISDOM++
Learning in WISDOM++
WISDOM++: Blocks classification
WISDOM++: Document classification
WISDOM++: Document understanding
Related Work
Conclusions
Author: Malerba Donato
E-mail: malerba@di.uniba.it Home Page: http://www.di.uniba.it/~malerba/