_____________________________________________________________________________
Requirements:
_____________________________________________________________________________

- PostgreSQL
- JVM
- Apache Spark
_____________________________________________________________________________
Instructions:
_____________________________________________________________________________

1) Import the provided datasets (distributed as PostgreSQL plain-text SQL dumps) into your PostgreSQL database;
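
Example (sketch):

A plain SQL dump is normally replayed with psql. The helper below is a minimal sketch of that step, not part of this project: the dump file name, database name, user, host and port are all hypothetical placeholders to replace with your own values.

```python
import subprocess


def psql_import_command(dump_path, database, user="postgres",
                        host="localhost", port=5432):
    """Build the psql invocation that replays a plain SQL dump into `database`."""
    return [
        "psql",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-d", database,
        "-v", "ON_ERROR_STOP=1",  # stop at the first failing statement
        "-f", dump_path,
    ]


def import_dump(dump_path, database, **kwargs):
    """Run the import; raises CalledProcessError if psql exits with an error."""
    subprocess.run(psql_import_command(dump_path, database, **kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical names, shown only to illustrate the resulting command line.
    print(" ".join(psql_import_command("pv_nrel.sql", "mydb")))
```

Calling import_dump("pv_nrel.sql", "mydb") would run the command; psql asks for the password unless it is supplied through the usual PostgreSQL mechanisms.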

2) Open config.properties and modify the following parameters:

- Set the path parameter to the directory containing the .jar file;
- Set the database parameters "db.*";
- Uncomment (remove the leading "#" from) the lines related to the dataset of interest;
- Select the proper table name and queries (trainQ, testQ) depending on the chosen setting (single target or multi target).

Example: 

PV NREL - Single target:

"
dataset=PV_NREL
idSplit=1
dateFile=./dates/pv_nrel_1.txt

table=nrel_dataset_hourly

clustering.trainQ=id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power
clustering.testQ=id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power

clustering.dateField=day
clustering.numTargets=1
clustering.targetField=power
"
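
Before running, it can help to sanity-check the edited file. The sketch below parses key=value lines and verifies that the date field and the target field appear among the columns of trainQ and testQ, as they do in the single-target example above. This rule is an assumption inferred from that example, not a documented requirement, and the function names are hypothetical. It is not meant for the multi-target setting, whose targetField format is not shown here.

```python
def load_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_clustering_config(props):
    """Return a list of problems found (empty if the checks pass)."""
    problems = []
    for query_key in ("clustering.trainQ", "clustering.testQ"):
        columns = [c.strip() for c in props.get(query_key, "").split(",") if c.strip()]
        for field_key in ("clustering.dateField", "clustering.targetField"):
            field = props.get(field_key, "")
            # Assumed rule: each field must be one of the selected columns.
            if field not in columns:
                problems.append(f"{field_key}={field!r} is not a column of {query_key}")
    return problems


COLUMNS = ("id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,"
           "windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power")

SAMPLE = f"""
# lines starting with '#' are ignored
dataset=PV_NREL
idSplit=1
dateFile=./dates/pv_nrel_1.txt
table=nrel_dataset_hourly
clustering.trainQ={COLUMNS}
clustering.testQ={COLUMNS}
clustering.dateField=day
clustering.numTargets=1
clustering.targetField=power
"""

if __name__ == "__main__":
    print(check_clustering_config(load_properties(SAMPLE)))
```

To check a real file, pass the contents of config.properties to load_properties instead of SAMPLE.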

3) Run with spark-submit (the leading "call" is Windows batch syntax; omit it on Linux/macOS):

call spark-submit --master local[*] --driver-memory 26G --driver-class-path postgresql-9.4-1201-jdbc41.jar --class MainParams clustering-spark-1.0-SNAPSHOT-jar-with-dependencies.jar WEIGHTED_NEW <dataset_name> <split_num> <dates_path> <table_name> <lsh_rdd_partitions> <lsh_dimensions> <lsh_num_neighbors> <lsh_num_permutations> <lsh_rdd_partitions> <min_cosine_similarity> <minPTS> <window_size> 0 0 0 1 0.0 <num_targets>

Example:
	PV_NREL - Split: 1 - Single Target
	LSH d=5, perm=20, part=4, numNeig=10, minCosine=0.95
	minPTS=5, window_size=30

call spark-submit --master local[*] --driver-memory 26G --driver-class-path postgresql-9.4-1201-jdbc41.jar --class MainParams dencast.jar WEIGHTED_NEW PV_NREL 1 ./dates/pv_nrel_1.txt nrel_dataset_hourly 4 5 10 20 4 0.95 5 30 0 0 0 1 0.0 1
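
Because the arguments are positional, it is easy to put a value in the wrong slot. The sketch below assembles the same command from named parameters. The function and parameter names are hypothetical. It assumes that both <lsh_rdd_partitions> slots take the same value (as in the example above) and copies the fixed values "0 0 0 1 0.0" from the template without interpreting them.

```python
def build_spark_submit(dataset, split, dates_path, table, *,
                       lsh_partitions, lsh_dimensions, lsh_neighbors,
                       lsh_permutations, min_cosine, min_pts, window_size,
                       num_targets,
                       app_jar="dencast.jar",
                       jdbc_jar="postgresql-9.4-1201-jdbc41.jar",
                       driver_memory="26G"):
    """Return the spark-submit command as a list of strings."""
    positional = [
        "WEIGHTED_NEW", dataset, split, dates_path, table,
        lsh_partitions, lsh_dimensions, lsh_neighbors, lsh_permutations,
        lsh_partitions,  # the template lists <lsh_rdd_partitions> twice
        min_cosine, min_pts, window_size,
        0, 0, 0, 1, 0.0,  # fixed values copied verbatim from the template
        num_targets,
    ]
    return [
        "spark-submit",
        "--master", "local[*]",
        "--driver-memory", driver_memory,
        "--driver-class-path", jdbc_jar,
        "--class", "MainParams",
        app_jar,
        *map(str, positional),
    ]


if __name__ == "__main__":
    # Reproduces the PV_NREL, split 1, single-target example.
    cmd = build_spark_submit(
        "PV_NREL", 1, "./dates/pv_nrel_1.txt", "nrel_dataset_hourly",
        lsh_partitions=4, lsh_dimensions=5, lsh_neighbors=10,
        lsh_permutations=20, min_cosine=0.95, min_pts=5,
        window_size=30, num_targets=1,
    )
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting issues with "local[*]".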