Interpolative Clustering Tree Learner






 



A short description

Ubiquitous sensor stations are now deployed worldwide to measure geophysical variables (e.g. temperature, humidity, light) for a growing number of ecological and industrial processes. Although these variables are generally measured over large areas and long (potentially unbounded) periods of time, stations cannot cover every location in space. Moreover, the data produced are too voluminous to be recorded in full for future analysis. In this scenario, summarization, i.e. the computation of aggregates of data, can reduce the amount of data stored on disk, while interpolation, i.e. the estimation of unknown data at each location of interest, can supplement station records. Interpolative clustering is a data mining framework that addresses both tasks in time-evolving, multivariate geophysical applications. It yields a time-evolving clustering model that summarizes geophysical data, and it predicts data by computing a weighted linear combination of cluster prototypes. Clustering accounts for the local presence of spatial autocorrelation in the geophysical data. The weights of the linear combination reflect the inverse distance of the unseen data to each cluster geometry. The cluster geometry is represented through shape-dependent sampling of the geographic coordinates of the clustered stations.
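The prediction step described above (a weighted linear combination of cluster prototypes, with weights reflecting inverse distance to each cluster geometry) can be illustrated with a minimal Java sketch. This is not code from the distribution package: the class and method names are hypothetical, and two details are assumptions made for illustration only, namely that the distance to a cluster is the minimum Euclidean distance to its sampled coordinates, and that the weight is 1 / distance^power.

```java
/** Illustrative sketch only; names are hypothetical and not part of ict.jar. */
final class IdwPrototypeSketch {

    /** Cluster summary: one prototype value plus sampled coordinates describing its geometry. */
    static final class Cluster {
        final double prototype;
        final double[][] geometry; // each row is one sampled (x, y) coordinate

        Cluster(double prototype, double[][] geometry) {
            this.prototype = prototype;
            this.geometry = geometry;
        }
    }

    /** Assumption: distance to a cluster is the minimum Euclidean distance to its sampled points. */
    static double distanceToCluster(double x, double y, Cluster c) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] p : c.geometry) {
            best = Math.min(best, Math.hypot(x - p[0], y - p[1]));
        }
        return best;
    }

    /** Inverse-distance-weighted combination of the cluster prototypes at location (x, y). */
    static double predict(double x, double y, Cluster[] clusters, double power) {
        if (clusters.length == 0) {
            throw new IllegalArgumentException("at least one cluster is required");
        }
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (Cluster c : clusters) {
            double d = distanceToCluster(x, y, c);
            if (d == 0.0) {
                return c.prototype; // the query lies on a sampled point of this cluster
            }
            double w = 1.0 / Math.pow(d, power); // assumed weight: inverse distance to a power
            weightedSum += w * c.prototype;
            weightTotal += w;
        }
        return weightedSum / weightTotal;
    }
}
```

For example, with one cluster of prototype 10 sampled at (0, 0), another of prototype 20 sampled at (2, 0), and power 2, a query at (0.5, 0) yields 11: the nearer cluster dominates the estimate.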
Interpolative Clustering Tree (ICT) is a data mining algorithm that summarizes data sampled over space for a number of geophysical variables by leveraging the power of a spatially aware clustering algorithm. ICT determines a descriptive and interpolative model of georeferenced data sampled for the set of fields under examination. Data sampled at a specific time are reduced to a cluster model that is also used as the interpolation model. Clustering is performed by accounting for the local presence of spatial autocorrelation across the sampled data. For each cluster, a predictive model of the grouped data is associated with the cluster surface. This accounts for the relative spatial proximity of the objects to which the data refer, smoothing and compacting the data in the cluster. These cluster models are used instead of the original data to speed up interpolation without significantly weakening its robustness.
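To make the notion of local spatial autocorrelation concrete, the sketch below computes local Moran's I, a standard local indicator, for each station. This page does not say which statistic ICT actually uses, so the choice of local Moran's I is an assumption, as are the hypothetical class name, the distance-band neighbourhood (all other stations within `bandwidth`), and the equal weighting of neighbours. A positive value means a station resembles its neighbours; a value near zero means no local pattern.

```java
/** Illustrative sketch only; not taken from ict.jar. Assumes local Moran's I as the indicator. */
final class LocalMoranSketch {

    /**
     * Local Moran's I for every station.
     * coords[i] is the (x, y) position of station i; values[i] is its measurement.
     * Neighbours of i are all other stations within the given bandwidth (assumption),
     * each weighted equally (row-standardized weights).
     */
    static double[] localMoran(double[][] coords, double[] values, double bandwidth) {
        int n = values.length;

        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;

        // Second moment of the deviations from the mean.
        double m2 = 0.0;
        for (double v : values) {
            m2 += (v - mean) * (v - mean);
        }
        m2 /= n;

        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            double lag = 0.0;   // sum of neighbour deviations
            int neighbours = 0;
            for (int j = 0; j < n; j++) {
                if (j == i) {
                    continue;
                }
                double d = Math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1]);
                if (d <= bandwidth) {
                    lag += values[j] - mean;
                    neighbours++;
                }
            }
            // Isolated stations and constant data carry no autocorrelation signal.
            result[i] = (neighbours == 0 || m2 == 0.0)
                    ? 0.0
                    : ((values[i] - mean) / m2) * (lag / neighbours);
        }
        return result;
    }
}
```

For four stations on a line at x = 0, 1, 2, 3 with values 1, 1, 5, 5 and a bandwidth of 1, the two end stations score 1 (each matches its single neighbour), while the two middle stations score 0 because their neighbours' deviations cancel out.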
Time-evolving Interpolative Clustering Tree (TICT) is a data mining algorithm that uses an incremental strategy to yield time-evolving ICTs. The model learned at an earlier time is adapted to the changes that may appear in the data over time.

The distribution package

The Interpolative Clustering Tree learner is implemented as a Java system.

Bundle   Description
ICT      This rar bundle contains (1) ict.jar, which computes Interpolative Clustering Trees (ICT) from the training data of a spatial data collection and uses the computed interpolative clusters to predict the testing data of this collection; (2) an example batch file to run ICT.jar; and (3) spatial data collections.
TICT     This rar bundle contains (1) tict.jar, which computes Time-evolving Interpolative Clustering Trees (TICT) from the training data of a spatial data stream and uses the computed time-evolving interpolative clusters to predict the testing data of the stream; (2) an example batch file to run TICT.jar; and (3) spatial data streams.

Warning: The Interpolative Clustering Tree Learners are free for evaluation, research and teaching purposes, but not for commercial purposes.
Please Acknowledge
 



Related publications

 
Appice, A., Malerba, D. Leveraging the power of local spatial autocorrelation in geophysical interpolative clustering (2014) Data Mining and Knowledge Discovery, 28 (5-6), pp. 1266-1313.



Project team

  • Dr. Annalisa APPICE
  • Prof. Donato MALERBA


    Contact

    Name: Annalisa Appice
    Email address: annalisa.appice@uniba.it
    Tel. number: +39 080 5443262
    Fax: +39 080 5443262
     
