_____________________________________________________________________________
Requirements:
_____________________________________________________________________________
- PostgreSQL
- JVM
- Apache Spark

_____________________________________________________________________________
Instructions:
_____________________________________________________________________________
1) Import the datasets, which are provided as PostgreSQL plain SQL dumps;

2) Open config.properties and modify the following parameters:
   - The path parameter, to indicate the path containing the .jar file;
   - The database parameters "db.*";
   - Uncomment (remove the leading "#" from) the lines related to the dataset of interest;
   - Select the proper table name and queries (trainQ, testQ) depending on the chosen setting (single target or multi target).

   Example: PV NREL - Single target:
   "
   dataset=PV_NREL
   idSplit=1
   dateFile=./dates/pv_nrel_1.txt
   table=nrel_dataset_hourly
   clustering.trainQ=id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power
   clustering.testQ=id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power
   clustering.dateField=day
   clustering.numTargets=1
   clustering.targetField=power
   "

3) Run with spark-submit:

   call spark-submit --master local[*] --driver-memory 26G --driver-class-path postgresql-9.4-1201-jdbc41.jar --class MainParams clustering-spark-1.0-SNAPSHOT-jar-with-dependencies.jar WEIGHTED_NEW 0 0 0 1 0.0

   Example: PV_NREL - Split: 1 - Single Target
   LSH d=5, perm=20, part=4, numNeig=10, minCosine=0.95 minPTS=5

   call spark-submit --master local[*] --driver-memory 26G --driver-class-path postgresql-9.4-1201-jdbc41.jar --class MainParams dencast.jar WEIGHTED_NEW PV_NREL 1 ./dates/pv_nrel_1.txt nrel_dataset_hourly 4 5 10 20 4 0.95 5 30 0 0 0 1 0.0 1
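
Before launching the job, the dataset settings from step 2 can be sanity-checked. The following is only a sketch and not part of the project: the helper names parse_properties and check_config are hypothetical, and it assumes that the date field and the target field must both appear in the trainQ and testQ column lists, that several targets (if used) would be written comma-separated in clustering.targetField, and that clustering.numTargets matches their count. Adjust it if the project reads these keys differently.

```python
# Sketch: sanity-check the dataset block of config.properties.
# The keys come from the PV NREL example above; the checks themselves
# are assumptions about how the keys relate to each other.

EXAMPLE = """\
dataset=PV_NREL
idSplit=1
dateFile=./dates/pv_nrel_1.txt
table=nrel_dataset_hourly
clustering.trainQ=id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power
clustering.testQ=id,idplant,lat,lon,day,ora,anno,temperature,altitude,azimuth,pressure,windspeed,humidity,icon,dewpoint,windbearing,cloudcover,power
clustering.dateField=day
clustering.numTargets=1
clustering.targetField=power
"""


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def split_columns(value):
    """Split a comma-separated column list into clean names."""
    return [c.strip() for c in value.split(",") if c.strip()]


def check_config(props):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    date_field = props.get("clustering.dateField", "")
    # Assumption: multiple targets would be comma-separated.
    targets = split_columns(props.get("clustering.targetField", ""))

    for key in ("clustering.trainQ", "clustering.testQ"):
        columns = split_columns(props.get(key, ""))
        if date_field not in columns:
            problems.append(f"{key} is missing date field '{date_field}'")
        for target in targets:
            if target not in columns:
                problems.append(f"{key} is missing target field '{target}'")

    # Assumption: numTargets equals the number of listed target fields.
    num_targets = props.get("clustering.numTargets")
    if num_targets is not None and int(num_targets) != len(targets):
        problems.append("clustering.numTargets does not match clustering.targetField")
    return problems


if __name__ == "__main__":
    # An empty list means none of the checks above failed.
    print(check_config(parse_properties(EXAMPLE)))
```

To check a real file, read config.properties into a string and pass it to parse_properties; lines that are still commented out with "#" are ignored.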