Geoinsyssoft Private Limited
Talend Developer Training Course content
Talend is a provider of open source data integration software and ETL tools. Its main product is Talend Open Studio, an open source data integration project based on Eclipse RCP that primarily supports ETL-oriented implementations. It is provided for on-premises deployment as well as in a software-as-a-service (SaaS) delivery model. Talend Open Studio is mainly used for integration between operational systems, for ETL (Extract, Transform, Load) in business intelligence and data warehousing, and for data migration.
Module 1: Introduction to Talend
1. Overview of the concept of Data Warehouse.
2. Dimensions, Hierarchy, Facts
3. DW models: Star and Snowflake schemas
4. Explain Talend and how it works
5. Explain Talend Open Studio and its usefulness
6. Explain metadata and repository
7. Hadoop Ecosystem introduction
Module 2: Components and Jobs
Installation and Configuration of Talend Administration Console
Types of Components
1. Basic Components - Overview
2. Component Properties
3. Database connectivity components
4. Explain how to create a new job
5. Create a delimited file and explain the whole process behind it
6. Use metadata and explain it
7. Explain concept of propagation.
8. Explain data integration schema
9. Use tFilterRow and string filters in job creation
10. Input delimited file creation
11. Hands-on
Module 3: Schema and Aggregation
1. Explain job design and its features, such as Edit Schema
2. Explain tMap and T merge
3. How to aggregate data
4. Define tReplicate and explain how it works
5. Use tLogRow and explain how it works
6. Define tMap properties
7. Lab Exercises
Module 4: Data Source Connectivity
1. Data extracted from source
2. Database source and target (MySQL/Oracle/PostgreSQL)
3. Create connection
4. Import/create schema or metadata
Module 5: Functions/Routines
1. Explain functions: how to call and use them
2. Define routines
3. Explain XML files and how they are used in Talend
4. Use data formatting functions and explain how they work
5. Define type casting.
Module 6: Transformation
1. Context variable
2. Parameterization in ETL
3. Use tRowGenerator and explain with an example
4. Explain sorting with an example
5. Define aggregator
6. Publish data using t flow
7. Explain how to run a job in a loop
8. Other main components on the palette
Module 7: Hadoop Connectivity (TOS BD Edition)
1. How to start Thrift Server
2. How an ETL tool connects to Hadoop
3. Define ETL method
4. How Hive can be implemented
5. How to import data into Hive, with an example
6. How to partition in Hive, with an example
7. Why can't the customer table be overwritten?
8. ETL component
9. Comparison between Hive and Pig
10. Loading data into the demo customer
11. ETL tool
12. Parallel execution
Module 8: Use Cases / Case Studies
1. Data integration and performance improvement
2. Sentiment analysis with Twitter Dataset
3. Log stream analysis using Apache weblogs
4. ETL offloading with Hadoop Ecosystem
5. Recommendation modeling using Apache Spark as ETL