Talend Open Studio

Talend Open Studio provides advanced capabilities that dramatically improve productivity in data integration job design, along with proven scalability for efficient execution.

Business modeling

Talend Open Studio: Business Modeler

Talend Open Studio’s Business Modeler leverages a top-down approach, allowing line-of-business stakeholders to get involved in the design of the integration processes.
The Business Modeler provides an easy-to-understand, non-technical view of a business workflow. It typically includes both the systems and processes already operating in the organization, and the ones that will be needed in the future. Systems, connections, steps and requirements are all designed using standardized workflow notation through an intuitive graphical toolbox.

 

 

Graphical development

Talend Open Studio: Mapper

Talend Open Studio’s Job Designer provides both a graphical and a functional view of the actual integration processes.

The Job Designer features the Component Library – a graphical palette of components and connectors. Integration processes are built by dragging and dropping components and connectors onto the diagram, drawing connections and relationships between them, and setting their properties. Most of these properties are inherited from the metadata.

The Component Library includes over 200 out-of-the-box components and connectors, providing basic functions such as mappings, transformations, and lookups; specialized functions such as data filtering, data multiplexing, or ELT; and support for most RDBMS, file formats, LDAP directories, etc. The Component Library can easily be extended using industry-standard languages such as Perl, Java, or SQL.
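To make the basic functions above concrete, here is a minimal Java sketch of what a lookup, a mapping and a filter do to a flow of rows. It is not the code Talend Open Studio generates; the class name, method name and row layout are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: class and method names are hypothetical, not Talend APIs.
class LookupMappingSketch {

    // Joins each input row {name, countryCode} against a lookup table.
    // Matching rows are mapped to {NAME, countryName}; rows without a
    // match are filtered out and collected in 'rejects'.
    static List<String[]> mapWithLookup(List<String[]> input,
                                        Map<String, String> countryLookup,
                                        List<String[]> rejects) {
        List<String[]> output = new ArrayList<>();
        for (String[] row : input) {
            String countryName = countryLookup.get(row[1]);   // lookup
            if (countryName == null) {
                rejects.add(row);                             // filter
                continue;
            }
            // mapping + transformation: upper-case the name
            output.add(new String[] {row[0].toUpperCase(), countryName});
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = new HashMap<>();
        lookup.put("FR", "France");
        lookup.put("US", "United States");

        List<String[]> input = new ArrayList<>();
        input.add(new String[] {"alice", "FR"});
        input.add(new String[] {"bob", "XX"});   // unknown code, will be rejected

        List<String[]> rejects = new ArrayList<>();
        List<String[]> out = mapWithLookup(input, lookup, rejects);

        for (String[] row : out) {
            System.out.println(String.join(",", row));
        }
        System.out.println("rejected rows: " + rejects.size());
    }
}
```

In a graphical job, each of these steps would be a component on the diagram rather than hand-written code; the sketch only shows the row-level logic such components represent.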

 

Metadata-driven design and execution

Talend Open Studio: File Wizard

Database Wizard

Talend Open Studio is a metadata-driven solution, in which all metadata is stored and managed in a centralized Metadata Repository, shared by all the modules. The Metadata Repository centralizes all project information and ensures the consistency of all integration processes.

Metadata related to source and target systems of the integration processes is easily loaded in the repository through advanced database or file introspection, facilitated by a number of wizards. Properties defined in the Metadata Repository are inherited by the various processes that make use of these systems.
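As a rough illustration of file introspection, the sketch below derives column names and types from the header and sample rows of a delimited file. The class name, the three-type scheme (Integer, Double, String) and the "name:type" output format are assumptions made for this example and do not describe how the Talend wizards work internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical schema-guessing helper, not a Talend API.
class FileSchemaSketch {

    // Guesses a column type from its sample values.
    // A column with no samples falls back to "String".
    static String inferType(List<String> values) {
        if (values.isEmpty()) {
            return "String";
        }
        boolean allInt = true;
        boolean allDouble = true;
        for (String v : values) {
            try { Integer.parseInt(v); } catch (NumberFormatException e) { allInt = false; }
            try { Double.parseDouble(v); } catch (NumberFormatException e) { allDouble = false; }
        }
        if (allInt) return "Integer";
        if (allDouble) return "Double";
        return "String";
    }

    // The first line is treated as the header; the remaining lines are samples.
    // Returns one "name:type" entry per header column.
    static List<String> inferSchema(List<String> lines, String delimiter) {
        String sep = Pattern.quote(delimiter);
        String[] header = lines.get(0).split(sep, -1);
        List<String> schema = new ArrayList<>();
        for (int col = 0; col < header.length; col++) {
            List<String> samples = new ArrayList<>();
            for (String line : lines.subList(1, lines.size())) {
                String[] cells = line.split(sep, -1);
                if (col < cells.length) {
                    samples.add(cells[col]);
                }
            }
            schema.add(header[col] + ":" + inferType(samples));
        }
        return schema;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "id;name;price",
            "1;Widget;9.5",
            "2;Gadget;12");
        System.out.println(inferSchema(lines, ";"));
    }
}
```

Once a schema like this is stored centrally, every process that reads the file can reuse it instead of redefining the columns, which is the inheritance described above.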

Beyond source and target system metadata, the Metadata Repository also stores business models, integration jobs, and the results of their execution – making it the single repository of information on all integration processes.

 

Real-time debugging

Talend Open Studio: RealTime Debugging

Talend Open Studio: Debug Mode

Talend Open Studio includes powerful debugging and tuning features that allow real-time tracking of data as it flows through the entire transformation process.

When an integration job is executed through the Job Designer interface – in graphical mode – statistics are displayed in real-time, showing the number of processed rows and rejected rows, as well as the throughput (rows per second) – allowing you to spot any bottleneck in the process immediately. It is also possible to activate a trace mode, which displays row-by-row behavior and shows the result of the transformations. Traditional debugging breakpoints and variables are also available.
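The statistics described above can be pictured as a small set of counters updated per row. The following Java sketch is a hypothetical model of that idea, not Talend's implementation; in particular, it assumes that throughput is measured over all rows read (processed plus rejected).

```java
// Illustrative only: a hypothetical per-flow statistics counter.
class FlowStatsSketch {
    private long processed;
    private long rejected;
    private final long startNanos;

    FlowStatsSketch(long startNanos) {
        this.startNanos = startNanos;
    }

    void rowProcessed() { processed++; }
    void rowRejected()  { rejected++; }

    long processed() { return processed; }
    long rejected()  { return rejected; }

    // Rows per second over all rows read so far (assumption: rejected
    // rows count toward throughput). Returns 0 if no time has elapsed.
    double rowsPerSecond(long nowNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= 0) {
            return 0.0;
        }
        return (processed + rejected) * 1_000_000_000.0 / elapsed;
    }

    public static void main(String[] args) {
        int[] amounts = {10, -3, 25, 7, -1};
        FlowStatsSketch stats = new FlowStatsSketch(System.nanoTime());
        for (int amount : amounts) {
            if (amount < 0) {
                stats.rowRejected();     // negative amounts are rejected
            } else {
                stats.rowProcessed();
            }
        }
        System.out.println("processed=" + stats.processed()
            + " rejected=" + stats.rejected()
            + " rows/s=" + stats.rowsPerSecond(System.nanoTime()));
    }
}
```

Comparing the rows-per-second figure of each connection in a job is what makes a slow step stand out as a bottleneck.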

Of course, all code generated by Talend Open Studio, regardless of the target language, is always visible and accessible from the design environment.

 

Robust execution

Talend Open Studio: Job Designer

Unlike many integration solutions, which are based on a centralized integration server or can only use RDBMS engines to process data, Talend Open Studio dynamically distributes the processing across a grid of systems – based on their available capacity. As a result, these systems do not need to be dedicated to executing integration processes. Instead, Talend Open Studio leverages available resources, regardless of their nature.
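The text does not specify how processing is assigned to systems, so the following Java sketch only illustrates the general idea of capacity-based distribution: each unit of work goes to the node that currently has the most free capacity. The node names, the integer capacity units and the tie-breaking rule are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hypothetical dispatcher, not Talend's scheduling logic.
class GridDispatchSketch {

    // Picks the node with the largest free capacity that can absorb 'cost',
    // reserves that capacity, and returns the node name.
    // Returns null when no node has enough free capacity.
    // Ties go to the node that appears first in the map.
    static String pickNode(Map<String, Integer> freeCapacity, int cost) {
        String best = null;
        int bestFree = -1;
        for (Map.Entry<String, Integer> e : freeCapacity.entrySet()) {
            if (e.getValue() >= cost && e.getValue() > bestFree) {
                best = e.getKey();
                bestFree = e.getValue();
            }
        }
        if (best != null) {
            freeCapacity.put(best, bestFree - cost);
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> free = new LinkedHashMap<>();
        free.put("node-a", 4);
        free.put("node-b", 10);

        System.out.println(pickNode(free, 6));   // largest free capacity wins
        System.out.println(pickNode(free, 3));   // tie resolved by map order
        System.out.println(pickNode(free, 5));   // nothing large enough left
    }
}
```

Because work is placed wherever capacity happens to be free, no node in this model has to be reserved for integration processing.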

Talend Open Studio is the only data integration solution that leverages both the traditional ETL (Extract-Transform-Load) approach and the ELT (Extract-Load-Transform) approach. ELT uses the power of the RDBMS engines to execute data transformations inside the database, achieving unmatched performance for high-volume batches. For each subset of a process, it is possible to choose the most suitable approach, and hence to obtain the highest level of performance and scalability. This architecture, which is especially suited to grids (large or small) of inexpensive servers as well as high-end systems, enables data to be processed at the location closest to its source (thus decreasing data transfers) and maximizes the utilization of computing resources.
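The difference between the two approaches can be sketched in a few lines of Java. In the ETL variant the transformation runs row by row in the integration engine; in the ELT variant the job only builds a SQL statement and the database does the work. The table names, column name and helper methods below are hypothetical, and the SQL is assembled by plain string concatenation for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: hypothetical helpers contrasting ETL and ELT.
class EtlVersusEltSketch {

    // ETL: rows are pulled into the engine, transformed here,
    // and would then be loaded into the target.
    static List<String> transformInEngine(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!name.isEmpty()) {                       // filter
                result.add(name.toUpperCase(Locale.ROOT)); // transform
            }
        }
        return result;
    }

    // ELT: the same filter and transform expressed as one statement
    // that the target database executes on data already loaded into it.
    static String buildEltStatement(String target, String source,
                                    String expression, String filter) {
        return "INSERT INTO " + target
            + " SELECT " + expression
            + " FROM " + source
            + " WHERE " + filter;
    }

    public static void main(String[] args) {
        System.out.println(transformInEngine(Arrays.asList("alice", "", "bob")));
        System.out.println(buildEltStatement(
            "customers_clean", "customers_raw", "UPPER(name)", "name <> ''"));
    }
}
```

With ELT no rows leave the database, which is why it tends to suit high-volume batches; ETL remains the natural choice when the source and target are different systems or the transformation cannot be expressed in SQL.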

 

Copyright © 2006-2008 Talend. All rights reserved