Big Data Analytics
Cohorts: XXXIII, XXXIV
01 - Overview
02 - NoSQL
03 - Cassandra Slides
04 - Cassandra VM
05 - Apache Spark
06 - GraphX - Documentation
07 - DENCAST
Cohort: XXXII
Lecture Notes:
01 - Overview
02 - MapReduce
03 - NoSQL
04 - Spark Presentation
05 - Introduction to GraphX (by Ankur Dave)
An exercise on Spark (with solution)
Cohorts: XXX, XXXI
Lecture Notes:
01 - Overview
02 - MapReduce
03 - Spark Presentation (Thanks to Roberto Corizzo)
04 - Distributed Databases
Lab:
Spark word-count and linear regression examples (prepared by Roberto Corizzo):
Instructions
VM
Vagrantfile
Word count: notebook (without solutions)
Word count: notebook (with solutions)
Linear regression: notebook (without solutions)
Linear regression: notebook (with solutions)
Exams:
Design a Spark solution for your ML problem. Details will be provided next week.