Workshop day:
September 17, 2007

Location:
Warsaw, Poland

 

MRDM 2007

6th Workshop on Multi-Relational Data Mining

in conjunction with 

The 18th European Conference on Machine Learning (ECML) and the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), ECML-PKDD 2007



Data mining algorithms look for patterns in data. While most existing data mining approaches look for patterns in a single data table, multi-relational data mining (MRDM) approaches look for patterns that involve multiple tables (relations) from a relational database. Mining data which consists of complex/structured objects also falls within the scope of this field, since the normalized representation of such objects in a relational database requires multiple tables.

MRDM aims at integrating results from existing fields such as inductive logic programming (ILP), KDD, statistics, machine learning and relational databases; producing new techniques for mining multi-relational data; and applying such techniques in practice.

Following the mainstream of MRDM research, the most common types of patterns and approaches considered in data mining have been extended to the multi-relational case, and MRDM now encompasses relational association rule discovery, relational classification rules, relational decision and regression trees, and probabilistic relational models, among others. At the same time, MRDM methods have been successfully applied across many application areas, ranging from the analysis of business data, through bioinformatics and pharmacology, to Web mining and spatial data mining.

MRDM methods are based on two alternative approaches: propositional and structural. The propositional approach requires the transformation of multi-relational data into a propositional (or attribute-value) representation by building features that capture relational properties of the data. This kind of transformation, called propositionalization, decouples feature construction from model construction, so that conventional propositional regression methods can be applied to the transformed data and a wider choice of robust and well-known algorithms becomes available. The structural approach takes into account the original data structure, so that the whole hypothesis space is directly explored by the mining method.
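
As a minimal sketch of the propositional approach (an illustration only, not a method taken from any of the workshop papers), the Python snippet below flattens a hypothetical two-table database into one attribute-value row per customer by aggregating the related order tuples. The table names, columns and aggregate features are all assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical target table: one row per customer (the objects to be described).
customers = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "north"},
]

# Hypothetical related table: zero or more orders per customer (one-to-many).
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
]


def propositionalize(customers, orders):
    """Build one attribute-value row per customer by aggregating its orders."""
    amounts = defaultdict(list)
    for order in orders:
        amounts[order["customer_id"]].append(order["amount"])

    rows = []
    for customer in customers:
        own = amounts[customer["id"]]
        rows.append({
            "id": customer["id"],
            "region": customer["region"],
            # Constructed features that summarize the related tuples.
            "n_orders": len(own),
            "total_amount": sum(own),
            "max_amount": max(own, default=0.0),
        })
    return rows


table = propositionalize(customers, orders)
for row in table:
    print(row)
```

The resulting single table can be passed to any conventional attribute-value learner. A structural method would instead search over patterns that refer to both tables directly.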

The purpose of this workshop is to bring together researchers and practitioners of data mining interested in methods for finding patterns in expressive languages from multi-relational / structured data and their applications. This workshop is the sixth of its kind. It follows the success of the previous workshops on Multi-Relational Data Mining, held both in Europe (ECML/PKDD 2001) and in the USA (KDD 2002, 2003, 2004 and 2005), reports on which appear in online proceedings and SIGKDD Explorations [Vols 4(2), 5(2), 6(2) and 7(2)]. Based on MRDM-02, a special issue of SIGKDD Explorations [Vol 5(1)] was co-edited by Saso Dzeroski and Luc De Raedt. Further information on the workshops can be found at the web sites of MRDM-2001, MRDM-2002, MRDM-2003, MRDM-2004 and MRDM-2005.

Why is the topic of interest?

The interest of the KDD community in MRDM has increased sharply over the last few years. Evidence for this is the success of the previous MRDM workshops, as well as the summer school on Relational Data Mining at ECML/PKDD-2002, the MRDM tutorial at KDD-2003, and the Dagstuhl Seminar on "Probabilistic, Logical and Relational Learning - Towards a Synthesis" (Dagstuhl, Germany, 30.01.2005 - 04.02.2005).

The MRDM field has reached relative maturity in recent years. Mining relational data is addressed at conferences such as ECML/PKDD and ILP, which regularly include papers on these topics within their data mining sessions. Despite these many contributions, we are still far from a deep understanding of the issues in MRDM, and a number of interesting questions remain open. For instance, one of the central research topics of MRDM is concerned with combining expressive knowledge representation formalisms, such as relational and first-order logic, with principled probabilistic and statistical approaches to inference and learning. This combination is needed in order to face the challenge of real-world learning and data mining problems in which the data are complex and heterogeneous and we are interested in finding useful predictive and/or descriptive patterns. The foundations, challenges and research opportunities raised by this problem were explored in the Dagstuhl Seminar on "Probabilistic, Logical and Relational Learning - Towards a Synthesis" organized in 2005. Several successful workshops on the topic of Statistical Relational Learning have also been organized at the AAAI, IJCAI and ICML conferences.

A non-exhaustive list of topics for MRDM research, in alphabetical order, is the following:

  • Applications of (multi-)relational data mining
  • Data mining problems that require (multi-)relational methods
  • Distance-based methods for structured/relational data
  • Inductive databases
  • Kernel methods for structured/relational data
  • Learning in probabilistic relational representations
  • Link analysis and discovery
  • Methods for (multi-)relational data mining
  • Mining (semi-)structured data, such as amino-acid sequences, chemical compounds, HTML and XML documents, spatio-temporal data, ...
  • Propositionalization methods
  • Relational neural networks
  • Relational pattern languages

Contact Information of Organizers

  • Donato Malerba (contact)
    Department of Informatics, University of Bari.
    Bari - Italy
    tel./fax: +39 080 5443269

  • Annalisa Appice
    Department of Informatics, University of Bari.
    Bari - Italy

  • Michelangelo Ceci
    Department of Informatics, University of Bari.
    Bari - Italy

Program Committee Members

  • Hendrik Blockeel (Katholieke Universiteit Leuven)
  • Jean-Francois Boulicaut (INSA Lyon)
  • Saso Dzeroski (Jozef Stefan Institute)
  • Peter Flach (University of Bristol)
  • Thomas Gaertner (Fraunhofer Institute for Autonomous Intelligent Systems)
  • Lise Getoor (University of Maryland)
  • David Jensen (University of Massachusetts)
  • Kristian Kersting (MIT Computer Science and Artificial Intelligence Laboratory)
  • Joerg-Uwe Kietz (Kdlabs AG, Zurich)
  • Arno Knobbe (Universiteit Utrecht)
  • Joost Kok (Leiden University)
  • Stefan Kramer (Technical University Munich)
  • Nada Lavrac (Jozef Stefan Institute)
  • Celine Rouveirol (University Paris Sud XI)
  • Michele Sebag (University Paris Sud XI)
  • Arno Siebes (Universiteit Utrecht)
  • Stefan Wrobel (Fraunhofer Institute for Autonomous Intelligent Systems / University of Bonn)


Submission Instructions

Authors are invited to submit original research papers and extended abstracts electronically in Portable Document Format (PDF).

Research papers should be at most 12 pages long, whereas extended abstracts should be at most 8 pages long.

Papers must be written in English.

Papers must be submitted electronically to mrdm2007@di.uniba.it.

Submitted papers will be evaluated by three reviewers. Acceptance will be based on relevance, technical soundness, originality, and clarity of presentation.

Accepted papers and extended abstracts will be published in the proceedings of the workshop.

Papers must be formatted using the Lecture Notes in Computer Science style available at http://www.springer.de/comp/lncs/authors.html.

In addition to the workshop proceedings, we intend to publish a selection of accepted papers in a journal special issue.



Important Dates

  • Deadline for submissions: June 30, 2007
  • Extended deadline for submissions: July 7, 2007
  • Notification: July 21, 2007
  • Camera ready: July 28, 2007
  • Workshop day: September 17, 2007


Invited Speaker

ProbLog and its Application to Link Mining in Biological Networks
Luc De Raedt, Department of Computer Science, Katholieke Universiteit Leuven, Belgium
Abstract. ProbLog is a recently introduced probabilistic extension of Prolog [De Raedt, Kimmig, Toivonen, IJCAI 07]. A ProbLog program defines a distribution over logic programs by specifying for each clause the probability that it belongs to a randomly sampled program; these probabilities are mutually independent. The semantics of ProbLog is then defined by the success probability of a query in a randomly sampled program. ProbLog has been applied to link mining and discovery in a large biological network. In the talk, I will also discuss various learning settings for ProbLog and link mining; in particular, I shall present techniques for probabilistic local pattern mining, probabilistic explanation based learning and theory compression from examples [De Raedt et al, ILP 96].
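
The semantics described in the abstract can be illustrated with a small brute-force sketch in Python. It is not the ProbLog system or its inference algorithm: it simply enumerates every possible sampled program over a hypothetical set of independent probabilistic edge facts and sums the probabilities of those in which a path query succeeds. The edge names and probabilities are assumptions made up for the example.

```python
from itertools import product

# Hypothetical probabilistic facts: each directed edge belongs to a sampled
# program independently with the given probability.
edges = {
    ("a", "b"): 0.8,
    ("b", "c"): 0.5,
    ("a", "c"): 0.1,
}


def reachable(present, source, target):
    """Return True if target can be reached from source using only `present` edges."""
    frontier, seen = [source], {source}
    while frontier:
        node = frontier.pop()
        if node == target:
            return True
        for (u, v) in present:
            if u == node and v not in seen:
                seen.add(v)
                frontier.append(v)
    return False


def success_probability(edges, source, target):
    """Sum the probabilities of all sampled programs in which the query succeeds."""
    facts = list(edges)
    total = 0.0
    # Enumerate every subset of facts, i.e. every possible sampled program.
    for included in product([True, False], repeat=len(facts)):
        weight = 1.0
        present = []
        for fact, keep in zip(facts, included):
            weight *= edges[fact] if keep else 1.0 - edges[fact]
            if keep:
                present.append(fact)
        if reachable(present, source, target):
            total += weight
    return total


p = success_probability(edges, "a", "c")
print(round(p, 4))
```

For this toy graph the result agrees with the direct calculation 1 - (1 - 0.1)(1 - 0.8 * 0.5) = 0.46. The enumeration is exponential in the number of facts, so it serves only to make the definition concrete, not as a practical inference method.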

Preliminary Programme

Monday, September 17

Registration: 8:30 - 9:00

Session 1: 9:00-10:30 (90 min.)
(Chair: Donato Malerba)

Opening remarks

Invited Talk: ProbLog and its Application to Link Mining in Biological Networks
Invited Speaker: Luc De Raedt

Title: Relational Transformation-based Tagging for Human Activity Recognition
Authors: Niels Landwehr, Bernd Gutmann, Ingo Thon, Matthai Philipose, Luc De Raedt

Session 2: 11:00-12:30 (90 min.)
(Chair: Luc De Raedt)

Title: Learning Ground ProbLog Programs from Interpretations
Authors: Fabrizio Riguzzi

Title: Learning Ground CP-logic Theories by means of Bayesian Network Techniques
Authors: Wannes Meert, Jan Struyf, Hendrik Blockeel

Title: Distributed Relational State Representations for Complex Stochastic Processes
Authors: Ingo Thon, Kristian Kersting

Title: Stratified Gradient Boosting for Fast Training of Conditional Random Fields
Authors: Bernd Gutmann, Kristian Kersting

Session 3: 14:00-15:30 (90 min.)
(Chair: Annalisa Appice)

Title: Towards a Framework for Relational Learning and Propositionalization
Authors: Ulrich Rückert, Stefan Kramer

Title: Mining Imbalanced Classes in Multirelational Classification
Authors: Hongyu Guo, Herna L. Viktor

Title: ILP: Compute Once, Reuse Often
Authors: Nuno A. Fonseca, Ricardo Rocha, Rui Camacho, Vitor Santos Costa

Title: A Restart Strategy for Fast Subsumption Check and Coverage Estimation
Authors: Ondrej Kuželka, Filip Železný

Session 4: 16:00-17:20 (80 min.)
(Chair: Michelangelo Ceci)

Title: Mining Frequent Patterns from Multi-Dimensional Relational Sequences
Authors: Nicola Di Mauro, Teresa M.A. Basile, Stefano Ferilli, Floriana Esposito

Title: Choosing the Right Patterns: An Experimental Comparison between Different Tree Inclusion Relations
Authors: Jeroen De Knijf, Ad Feelders

Title: A Multi-Relational Approach to Clustering Trajectory Data
Authors: Gianni Costa, Alfredo Cuzzocrea, Giuseppe Manco, Riccardo Ortale, Howard Scordio

Closing remarks

Opening ceremony starts at 17:30

Further Information

Address any further inquiry to:
MRDM 2007 Workshop chairs,
Department of Informatics, University of Bari,
Via Orabona 4, 70126 Bari (ITALY)
tel./fax: 080/5443269
mrdm2007@di.uniba.it



Last modified: August 20th, 2007 by mrdm2007@di.uniba.it