Course Objectives
By the end of this course, students should be able to:
- Use standard software development tools such as the Linux command line (bash), git, and docker.
- Store and manipulate files in HDFS.
- Write pyspark scripts from within a Python notebook (Jupyter), and perform analysis to extract insights.
- Create both "external" and "internal" Hive tables, and understand the difference.
- Use Hive and/or Presto to extract insights.
- Consume streaming messages from Kafka, and join/enrich streaming data using ksql.
- Stream data into NoSQL datastores such as Elasticsearch or Cassandra, and visualize using Kibana.