Contents (in Italian) - Academic Year 2007/2008
Contents (in Italian) - Academic Year 2008/2009
Contents (in Italian) - Academic Year 2010/2011
Contents (in Italian) - Academic Year 2011/2012


NOTICES:



Academic Year 2014/2015

Mid-term test, 26/11/2014 (Results)
Mid-term test, 23/01/2015 (Results)

Academic Year 2013/2014

Mid-term test, 21/11/2013
Mid-term test, 16/01/2014
Results

Academic Year 2012/2013

Lecture notes

Introduction to the course
Knowledge Discovery in Databases: the process and the CRISP-DM methodology
Rule-based classification (see also: Chapter 2 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a survey on the separate-and-conquer approach to rule-based learning; a paper on multiple concept learning)
Decision trees (see also: Chapter 3 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1, 2 for a survey on decision tree learning; 3, 4, 5 for the simplification of decision trees)
Bayesian framework for classification (see also: Chapter 6 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression; Article on hierarchical text categorization)
Parametric and non parametric regression; Stepwise Model Tree Induction (see also Sections 1.2, 2.1, 2.3, 4.6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; this paper for model trees)
Variable Associations (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; 1 for a perspective on the relation between association measures and association rules; 2 for a seminal paper on mining association rules)

Lab activities:

Practice on WEKA

Presentation demonstrating all graphical user interfaces (GUI) in Weka.
Presentation which explains how to use Weka for exploratory data mining.
Introduction, Selection, Preprocessing, Transformation
Classification (J48 and Naive Bayes Classifier)
Instance Based Learning & Regression
Association Analysis


Mid-term test, 21/11/2012 (Results)


Lecture notes: Academic Year 2011/2012

Introduction to the course
Knowledge Discovery in Databases: the process and the CRISP-DM methodology
Rule-based classification (see also: Chapter 2 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a survey on the separate-and-conquer approach to rule-based learning)
Decision trees (see also: Chapter 3 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1, 2 for a survey on decision tree learning; 3, 4, 5 for the simplification of decision trees)
Bayesian framework for classification (see also: Chapter 6 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression; Article on hierarchical text categorization)
Parametric and non parametric regression; Stepwise Model Tree Induction (see also Sections 1.2, 2.1, 2.3, 4.6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; this paper for model trees)
Variable Associations (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; 1 for a perspective on the relation between association measures and association rules; 2 for a seminal paper on mining association rules)
Relational Data Mining (see also 1 for an introduction on Multi-Relational Data Mining, and 2 for efficiency issues)
Relational Classification Tools (see also FOIL, FOIL vs Related Systems, Progol and TILDE)
Spatial Data Mining: A Relational Approach (black-and-white slides) (see also A relational perspective on spatial data mining, Knowledge Discovery from Geographical Data)
Mining Spatial Association Rules with SPADA (see also 1, 2)
Slides of MSc. Pietro Leo's seminar

Lab activities: Academic Year 2011/2012

Practice on WEKA

Presentation demonstrating all graphical user interfaces (GUI) in Weka.
Presentation which explains how to use Weka for exploratory data mining.
Introduction, Selection, Preprocessing, Transformation
Classification (J48 and Naive Bayes Classifier)
Instance Based Learning & Regression
Association Analysis

Two examples of data mining case studies developed by graduate students

KDD Cup 1998 (by Sante Stanisci)
KDD Cup 2004 (by Daniele Manta and Ettore Campanozzi)


Data sets

Machine Learning Repository at University of California, Irvine: A repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
The StatLib Datasets Archive: A repository of datasets used in statistics and machine learning.
KDD Cup


Lecture notes: Academic Year 2010/2011

Introduction to the course
Business Intelligence, Datalight project report (see also project home page), Perspectives on BI
Knowledge Discovery in Databases: the process and the CRISP-DM methodology
Rule-based classification (see also: Chapter 2 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a survey on the separate-and-conquer approach to rule-based learning)
Decision trees (see also: Chapter 3 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1, 2 for a survey on decision tree learning; 3, 4, 5 for the simplification of decision trees)
Bayesian framework for classification (see also: Chapter 6 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression; Article on hierarchical text categorization)
Parametric and non parametric regression; Stepwise Model Tree Induction (see also Sections 1.2, 2.1, 2.3, 4.6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; this paper for model trees)
Variable Associations (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; 1 for a perspective on the relation between association measures and association rules; 2 for a seminal paper on mining association rules)
Relational Data Mining (see also 1 for an introduction on Multi-Relational Data Mining, and 2 for efficiency issues)
Relational Classification Tools (see also FOIL, FOIL vs Related Systems, Progol and TILDE)
Spatial Data Mining: A Relational Approach (black-and-white slides) (see also A relational perspective on spatial data mining, Knowledge Discovery from Geographical Data)
Mining Spatial Association Rules with SPADA (see also 1, 2)


Lab activities: Academic Year 2010/2011

Practice on Oracle Warehouse Builder

ES01 - Oracle DBMS: Introduction
ES02 - PL-SQL
ES03 - Trigger in Oracle
ES04 - Oracle Warehouse Builder
ES05 - Designing the Relational Target Warehouse
Logical schema to be analyzed

Practice on WEKA

Introduction, Selection, Preprocessing, Transformation
Classification (J48 and Naive Bayes Classifier)
Instance Based Learning & Regression
Association Analysis

Two examples of data mining case studies developed by graduate students

KDD Cup 1998 (by Sante Stanisci)
KDD Cup 2004 (by Daniele Manta and Ettore Campanozzi)


Data sets

Machine Learning Repository at University of California, Irvine: A repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
The StatLib Datasets Archive: A repository of datasets used in statistics and machine learning.
KDD Cup


Lecture notes: Academic Year 2008/2009


Module A - Business Intelligence

Introduction to the course
Business Intelligence, Datalight project report (see also project home page), Perspectives on BI
Knowledge Discovery in Databases: the process

Practice on Oracle Warehouse Builder
ES01 - Oracle DBMS: Introduction
ES02 - PL-SQL
ES03 - Trigger in Oracle
ES04 - Oracle Warehouse Builder
ES05 - Designing the Relational Target Warehouse
Logical schema to be analyzed


Module B - Data Mining

Rule-based classification (see also: Chapter 2 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a survey on the separate-and-conquer approach to rule-based learning)
Decision trees (see also: Chapter 3 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a multi-disciplinary survey on the automatic construction of decision trees; 2, 3, 4 for the simplification of decision trees)
Bayesian framework for classification (see also: Chapter 6 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression; Article on hierarchical text categorization)
Parametric and non parametric regression; Stepwise Model Tree Induction (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; this paper for model trees)
Variable Associations (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; 1 for a perspective on the relation between association measures and association rules; 2 for a seminal paper on mining association rules)
Multi-relational Data Mining (see also 1 for an introduction on Multi-Relational Data Mining, and 2 for efficiency issues)
Spatial Data Mining: A Relational Approach (black-and-white slides) (see also A relational perspective on spatial data mining, Knowledge Discovery from Geographical Data)
Mining Spatial Association Rules with SPADA (see also 1)


Practice on WEKA
Introduction, Selection, Preprocessing, Transformation
Classification (J48 and Naive Bayes Classifier)
Instance Based Learning & Regression
Association Analysis

Practice on KDB2000

An example of case study on data mining

Customer Relationship Management: Technical Report


Data sets

Machine Learning Repository at University of California, Irvine: A repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
The StatLib Datasets Archive: A repository of datasets used in statistics and machine learning.
Data sets used in the case studies available in Paolo Giudici's book


Lecture notes: Academic Year 2007/2008

Introduction to the course
Business Intelligence, Datalight project report (see also project home page), Perspectives on BI
Knowledge Discovery in Databases: the process
Rule-based classification (see also: Chapter 2 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a survey on the separate-and-conquer approach to rule-based learning)
Decision trees (see also: Chapter 3 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; 1 for a multi-disciplinary survey on the automatic construction of decision trees; 2, 3, 4 for the simplification of decision trees; 5 for model selection)
Bayesian framework for classification (see also: Chapter 6 of the text by T. Mitchell, Machine Learning, McGraw-Hill, 1997; Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression)
Parametric and non parametric regression (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; this paper for model trees)
Variable Associations (see also Chapter 6 of the text by A. Azzalini & B. Scarpa, Analisi dei dati e Data Mining. Springer, 2004; 1 for a perspective on the relation between association measures and association rules; 2 for a seminal paper on mining association rules)
Multi-relational Data Mining (see also 1 for an introduction on Multi-Relational Data Mining, and 2 for efficiency issues)
Spatial Data Mining: A Relational Approach (black-and-white slides)

Practice on OLAP

Pentaho Open Business Intelligence Suite
Mondrian: Installation
Working with Mondrian-JPivot
Introduction to MDX
OLAP

Practice on WEKA

Introduction, Selection, Preprocessing, Transformation
Classification (J48 and Naive Bayes Classifier)
Instance Based Learning & Regression
Clustering
Association Analysis

Project assignment


Seminars

Hierarchical and Pyramidal Symbolic Clustering, Prof. Paula Brito (Univ. of Porto, PT), 26 November 2007
Optimization Dynamics for Quadratic Problems and their Role in Machine Intelligence, Prof. Immanuel Bomze (Univ. of Vienna, A), 26 November 2007

Textbooks

  • P. Atzeni, S. Ceri, P. Fraternali, S. Paraboschi, R. Torlone. Basi di Dati - Architetture e linee di evoluzione. McGraw-Hill, 2003. (available in the DIB library)
  • T. Mitchell. Machine Learning. McGraw-Hill, 1997. (available in the DIB library)
  • R. J. Roiger, M. W. Geatz. Introduzione al Data Mining. McGraw-Hill, 2003. (available in the DIB library)
  • A. Azzalini, B. Scarpa. Analisi dei dati e Data Mining. Springer, 2004. (available in the MATH library)




  • Lecture notes: Academic Year 2006/2007

    (only additional topics with respect to current academic year)

    SEMINAR: Knowledge Discovery from Documents (see also 1 for knowledge discovery from/for paper documents; 2 for association rule mining on biomedical documents)
    Data mining query languages (see also 1 for a database perspective on Data Mining; 2 for a description of the SDMOQL language)
    Seminar on Distributed Data Mining by Ravi Pratap Singh


    Practice on WEKA

    Introduction, Selection, Preprocessing, Classification (J48)
    Classification (IBk and Naive Bayes)
    Regression
    Association rules and clustering

    Lab & Project assignment


