Big Data Analytics


Cohorts: XXXIII, XXXIV

Lecturers: Michelangelo Ceci and Gianvito Pio

01 - Overview
02 - NoSQL
03 - Cassandra slides
04 - Cassandra VM
05 - Apache Spark
06 - GraphX - Documentation
07 - DENCAST


Cohort: XXXII

Lecturers: Michelangelo Ceci and Gianvito Pio



Lecture Notes:

01 - Overview
02 - MapReduce
03 - NoSQL
04 - Spark Presentation
05 - Introduction to GraphX (by Ankur Dave)
An exercise on Spark (with solution)


Cohorts: XXX, XXXI

Lecturer: Michelangelo Ceci



Lecture Notes:


01 - Overview
02 - MapReduce
03 - Spark Presentation (Thanks to Roberto Corizzo)
04 - Distributed Databases






Lab:


Spark word-count and linear regression examples (prepared by Roberto Corizzo):
 
Instructions
VM
VagrantFile
Word count: notebook (without solutions)
Word count: notebook (with solutions)
Linear regression: notebook (without solutions)
Linear regression: notebook (with solutions)



Exams:


Design a Spark solution for your ML problem. Details will be provided next week.





