Overview
This four-day training course teaches the key concepts and skills developers need to build high-performance, parallel applications with Apache Spark on the Cloudera Data Platform (CDP).
Hands-on exercises allow students to practice writing Spark applications that integrate with CDP core components, such as Hive and Kafka. Participants will learn how to use Spark SQL to query structured data, use Spark Streaming to perform real-time processing on streaming data, and work with “big data” stored in a distributed file system.
After taking this course, participants will be prepared to tackle real-world challenges and build applications that support fast, relevant decisions and interactive analysis across a wide variety of use cases, architectures, and industries.
What you'll learn
Through instructor-led discussion and interactive exercises, you will learn how to:
- Distribute, store, and process data in a CDP cluster
- Write, configure, and deploy Apache Spark applications
- Use Spark interpreters and Spark applications to explore, process, and analyze distributed data
- Query data using Spark SQL, DataFrames, and Hive tables
- Use Spark Streaming together with Kafka to process a data stream
Download DE: Developing Applications with Apache Spark 231107
Upon completion of the training, you will receive a Training Certificate of Completion.
All prices quoted in Singapore Dollars before GST.
Prices may be subject to change.
Optional examination vouchers are available for sale at a special rate of SGD 440 (before 8% tax) if paid during the registration process for the training course.