
Posts

Open Source ETL tools vs Commercial ETL tool

The ETL tools are evaluated on the following categories: √ Infrastructure √ Functionality √ Usability √ Platforms supported √ Debugging facilities √ Data Quality / profiling √ Performance √ Future prospects √ Reusability √ Scalability √ Batch vs Real-time √ Native connectivity. Pentaho Kettle vs Talend. Pentaho: Pentaho is a commercial open-source BI suite that has a product called Kettle for data integration. It uses an innovative metadata-driven approach and has a strong and very easy-to-use GUI. The company started around 2001 (2002 was when Kettle was integrated into it). It has a strong community of 13,500 registered users. It has a stand-alone Java engine that processes the jobs and tasks for moving data between many different databases and files. It can schedule tasks (but you need a scheduler such as cron for that). It can run remote jobs on "slave servers" on other machines. It has data quality features: fro...
Recent posts

Error Handling Mechanism in Talend Open Studio

Three Error Handling Strategies in Talend Open Studio. You can recover from some errors. Others, like system or network failures, are fatal. But even in the fatal case, your Talend Open Studio job should die gracefully, notifying the operations team and leaving the data in a good state. This post presents three error handling strategies for your Talend jobs. Some Talend Open Studio job errors are alternate paths that, though infrequent, occur often enough to justify special programming. This programming may come in the form of guard conditions: special logic that routes the special case to another subjob. For an example of this type of error, see this blog post on ETL Filter Patterns. Other errors are related to system and network activity or are bugs. There are a few ways to handle this class of error in Talend Open Studio. Do Nothing: For simple jobs, say an automated administrative t...

Downloading Talend

Downloading Talend is a simple process. Just visit the Talend website and download the version you require. Talend may be installed on Windows, OSX, Unix and Linux. Current Release: For the purpose of this documentation, we're installing Talend Open Studio (TOS) version 5.2 (which is the latest version at the time of writing). In May 2013, Talend 5.3.0 was released. Which installer should I download? There are two installers available from the Talend site: a Windows-only executable and a zip file that may be installed on all supported platforms. My personal preference is for the zipped version, as this is a no-install installation. Which platform should I use? This is a personal choice. Talend Open Studio runs on Windows, OSX, Unix and Linux. My personal choice has been to develop on Windows; however, I also develop on OSX from time to time. Windows has proven to be the best experience, so far. On OSX, the development environment can ...

Talend Installation Guide

Installing Talend: If you haven't already downloaded Talend, follow these instructions. You are then ready to install Talend on the platform of your choice: Windows, OSX or Linux. For the purpose of this documentation, we're installing Talend Open Studio (TOS) version 5.2 (which is the latest version at the time of writing). Windows: There are two installers available for Windows, an executable and the universal ZIP file. My personal choice is the ZIP file, as this provides a no-install installation on Windows. Once you've downloaded the universal ZIP file, unzip the file into an appropriate location. On Windows, my preference is to create a directory C:\Talend. In my experience, universal installer software does not work well with directory paths that contain spaces, so I would avoid Program Files. OSX: Talend is installed on OSX using the universal ZIP file. Once you've downloaded the universal ZIP file, unz...

Top Answers to Talend Interview Questions

1. Talend Characteristics (Criteria: Result). Distinguishing feature: First Data integration software as a service. Deployment: Business modeling, graphical development. ETL functionality: Makes ETL mapping faster and simpler for diverse data sources. 2. What does Talend stand for? Talend stands for Talend Open Studio. 3. What do you mean by Talend? Talend Open Studio is the open source data integration product produced by Talend, and it is designed to convert, combine and update data in various areas across a business. 4. When was Talend Open Studio launched? Talend launched in October 2006. 5. Talend is written in which language? It is written in Java. 6. What is the latest version of Talend Open Studio? The latest version is 5.6.0. 7. Differentiate between ETL and ELT. ETL stands for Extract, Transform and Load, which is a process that involv...

TALEND Interview questions and Answers

TALEND Interview questions and Answers (http://www.deepinopensource.com/talend-interview-questions/)
1. Talend – Merge multiple files into single file with sorting operation.
2. Loading Fact Table Using Talend
3. ROWNUM Analytical Function in Talend
4. SCD-2 Implementations in Talend
5. Deployment strategies in Talend
6. Custom Header Footer in Talend
7. Data Masking Using Talend
8. How to use Shared DB Connection in Talend
9. Load all rows from source to target except last 5
10. Late Arriving Dimension Using Talend
11. Date Dimension Using Talend
12. Dynamic Column Ordering Of Source File Using Talend
13. Incremental Load Using Talend
14. Getting Files From FTP Server
15. Initializing Context At Run Time Using Po...