Talend Big Data Certified Developer Exam

Talend certification exams are designed to be challenging to ensure that you have the skills to successfully implement quality projects. Preparation is critical to passing.

This certification exam covers the Talend Big Data Basics, Talend Big Data Advanced – Spark Batch, and Talend Big Data Advanced – Spark Streaming learning plans. The emphasis is on the Talend and Big Data architectures, Hadoop ecosystems, Spark, Spark on YARN, Kafka, and Kerberos.

Certification exam details

Exam content is updated periodically. The number and difficulty of questions may change. The passing score is adjusted to maintain a consistent standard.

Duration: 90 minutes
Number of questions: 55
Passing score: 70%

Recommended experience

At least six months of experience using Talend products
General knowledge of Hadoop (HDFS, Hive, HBase, YARN), Spark, Kafka, Talend Big Data and cloud storage architectures, and Spark Universal
Experience with Talend Big Data solutions and Talend Studio, including metadata creation, configuration, and troubleshooting

Preparation

To prepare for this certification exam, Talend recommends:

Taking the Big Data Basics, Big Data – Spark Batch, and Big Data – Spark Streaming learning plans
Studying the training material in the Talend Big Data Certified Developer preparation training module
Reading the product documentation and Community Knowledge Base articles

Badge

After passing this certification exam, you are awarded the Talend Big Data Certified Developer badge. To learn more about the criteria to earn this badge, refer to the Talend Academy Badging program page.

Certification exam topics

Defining Big Data

Define Big Data
Describe the Hadoop ecosystem
Differentiate between Talend architecture and Big Data architecture
Describe cloud storage architecture in a Big Data context

Managing metadata in a Big Data environment

Manage Talend metadata stored in the repository
Describe the main elements of Hadoop cluster metadata
Create Hadoop cluster metadata
Create metadata connections to HBase, HDFS, YARN, and Hive

Managing data using Hive

Import data to a Hive table
Process data stored in a Hive table
Analyze Hive tables in the Profiling perspective
Manage Hive tables on Hive Warehouse Connector with CDP Public Cloud

Managing Spark in a Big Data environment

Describe the principal usage of Spark
Manage Spark Universal, including modes, environments, and distributions
Configure Spark Batch and Streaming Jobs
Troubleshoot Spark Jobs
Optimize Spark Jobs at runtime

Streaming with Talend Big Data

Describe the principal usage of Kafka
Use Kafka components in Streaming Jobs
Manage Big Data Streaming Jobs in Studio
Tune Streaming Jobs, including windowing, caching, and checkpointing

Configuring a Big Data environment

Manage Kerberos and security
Manage Apache Knox security with Cloudera Data Platform (CDP)

Managing data on Hadoop and cloud

Describe the principal usage of Hadoop (HDFS, HBase, and Hive) and cloud technologies
Export and import big data files to HDFS
Export and import big data files to the cloud
Export data to an HBase table

Managing Big Data Jobs

Differentiate between Big Data Batch and Big Data Streaming Jobs
Migrate and convert Jobs in a Big Data environment

Managing a Spark cluster

Define Spark on YARN
Describe the principal usage of YARN
Manage YARN, including client and cluster modes
Monitor Big Data Job executions
Use Studio to configure resource requests to YARN
