ECML/PKDD'02 Workshop on

"Mining Official Data" (MOD'02)

Helsinki, 20 August 2002

before the

13th European Conference on Machine Learning (ECML'02)

6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'02)

under the auspices of the

KDnet - the European Network of Excellence in Knowledge Discovery

Technical description

In statistics, the term "official data" denotes data collected in censuses and statistical surveys by National Statistics Institutes (NSIs), as well as administrative and registration records collected by government departments and local authorities. These data are used to produce "official statistics" that inform policy decisions and facilitate the understanding of economic, social, demographic, and other matters of interest to governments, government departments, local authorities, businesses, and the general public. For instance, population and economic census information is of great value in planning public services (education, fund allocation, public transport) and in private business decisions (siting new factories, shopping malls, or banks, and marketing particular products). Moreover, NSIs regularly collect survey data on specific topics, such as labour force, time use, and household budgets, to maintain up-to-date information on economic and social phenomena.

The application of data mining techniques to official data has great potential in supporting good public policy and in underpinning the effective functioning of a democratic society. Nevertheless, it is not straightforward and requires challenging methodological research, which is still at an early stage. In particular, to develop successful applications of data mining techniques to official data, the following issues must be dealt with:


The workshop will maintain a balance between theoretical issues and descriptions of case studies to promote synergy between theory and practice. Research contributions are welcome even if they have not been tested on "official data", provided they have a clear relation to some of the research issues reported in the technical description. Topics of interest include, but are not limited to:

Workshop Structure and Attendance

The workshop aims to be a highly communicative meeting place for researchers who work on similar topics but come from different communities. To this end, the workshop will consist of two invited talks, followed by short presentations and longer discussions. Each author will be encouraged to read another accepted paper and to comment on it after the original talk has been given.

All ECML/PKDD'02 MOD workshop participants must also register for the main ECML/PKDD conference. Workshop attendance will be limited to registered participants.

Submission Procedure

Authors are invited to submit original research contributions or experience reports in English. Submitted papers must be unpublished and substantially different from papers under review. Papers that have been or will be presented at small workshops/symposia whose proceedings are available only to the attendees may be submitted.

Papers should be double-spaced and no longer than 5000 words (about 12 single-spaced pages). Papers should be sent electronically (PostScript or PDF) no later than May 24, 2002 to

Papers will be selected on the basis of a review of the full papers. Authors should make certain that the data mining techniques they describe deal with the special issues associated with official data. Notification of acceptance will be given by June 14, 2002. Final camera-ready copies of accepted papers are due by June 28, 2002. The proceedings will be printed by the ECML/PKDD organizers and distributed at the workshop. Web publication of the proceedings is expected after the conference.

Style Guide

There is a joint paper style for the proceedings of all ECML/PKDD workshops. Submitted papers should be formatted according to the Springer-Verlag Lecture Notes in Artificial Intelligence guidelines. Authors' instructions and style files can be downloaded from

Important Dates

Submission deadline:  May 24, 2002
Notification of acceptance:  June 14, 2002
Camera-ready copies of papers:  June 28, 2002
Workshop:  August 20, 2002


Organizing Committee

This workshop will be organized by:

Paula Brito, Faculty of Economics, University of Porto, Portugal

Donato Malerba, Department of Informatics, University of Bari, Italy


Program Committee

Timo Alanko, Statistical R&D Unit, Statistics Finland, Helsinki, Finland

Edwin Diday, CEREMADE, Paris-9 Dauphine University, Paris, France

Floriana Esposito, Department of Informatics, University of Bari, Italy

Paulo Gomes, National Institute of Statistics (INE), Lisbon, Portugal

Haralambos Papageorgiou, Department of Mathematics, University of Athens, Athens, Greece

Willi Klösgen, Fraunhofer Institute for Autonomous Intelligent Systems, Sankt Augustin, Germany

Carlos Marcelo, National Institute of Statistics (INE), Lisbon, Portugal

Michael May, Fraunhofer Institute for Autonomous Intelligent Systems, Sankt Augustin, Germany

Monique Noirhomme, Institut d'Informatique, University Notre-Dame de la Paix, Namur, Belgium

Mireille Summa, CEREMADE, Paris-9 Dauphine University, Paris, France

Ian Turton, Centre for Computational Geography, University of Leeds, Leeds, UK

