Talend Open Studio for Data Integration: It is a powerful ETL tool that is free of cost !!.It allows you to efficiently and effectively manage all facets of data extraction, data transformation, and data loading. This leading open source ETL tool boosts developer productivity with a rich set of features including:
- An Eclipse-based graphical integrated development environment that enables easy data profiling and modeling, drag-and-drop job design, and efficient reuse of completed work across projects and modules.
- More than 400 built-in connectors that allow you to easily link a wide array of sources and targets.
- Robust ETL functionality such as string manipulations, management of slowly changing dimensions, and automatic lookup handling
- The ability to execute extract, load, and transform (ELT) processes as well as ETL processes, even within the same job.
Talend Open Studio for Data Integration comprises three major applications (Business Modeler, Job Designer, and Metadata Manager) within a single graphical development environment based on Eclipse, which is easily adapted to corporate needs.
It also provides a rich user manual for how to use various components of Talend and can be accessed from the following link
Input –A program generated file that gives the response of surveys conducted on people. Number of people for each questionnaire varies and hence the column count is not fixed. This file is a positional file with a specification of which char gives which column.
–column count can be captured from header record or from length as response of each person is of fixed length
Sample record for a questionnaire
questionid question response_from_users
1What do you prefer iphone or S2 1 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2
0-1 specify question id
60-* response of each person which is of one char(1 or 2)
Output-Table in MySql with fixed columns for question id,question but varying number for response
Create a positional file in talend with refence to your input file and specify the position which would act like delimiters(except for the last columns position after 60 as that is not fixed).
Similarly create a table in MySql with fixed columns(later we would alter it) and create a database connection for it in Talend.
For this conversion we develop a routine(in Java since Talend is written in Java it lets you use ur coding skills and develop your own components and functionality)and pass the columns of file as parameters
This routine takes the last field (after 60th position) and counts the number of columns (by dividing it by 2) and then applies an alter command to add these new columns on the fly. These columns are then inserted in a table for each row .
To call this routing, Talend provide a component called tJavaRow which gets called for each row from input.
**To load any library use tLibraryLoad
And its all done..
So talend lets you integrate your java code in a job and doesn’t restrict you to its own functionalities.