Open Framework, Information Management Strategy & Collaborative Governance | Data & Social Methodology - MIKE2.0 Methodology
Kettle

From MIKE2.0 Methodology

This article is currently under construction. It is in the early stages of development and is undergoing major changes. Users are encouraged to contribute so that it reaches the point where it is ready for a Peer Review.


Kettle, now known as Pentaho Data Integration, is one of the better-known open source ETL (Extract, Transform and Load) tools on the market. It is provided by Pentaho, a commercial open source BI vendor.


Relationship to MIKE2.0

SAFE Architecture

  • Used to deliver the ETL sub-component

Overall Implementation Guide

ETL Design and Implementation Activities of MIKE2.0, which are:

Alignment with Strategic Requirements for Infrastructure Development

Product Review

Key Features

  • Platform: runs on Windows, Unix and Linux.
  • GUI: graphical designer with visual transform indicators. Reporting is available from the metadata layer.
  • Code: a 100% Java application with a metadata-driven design. Advanced transformations are coded in JavaScript via an embedded GUI.
  • License: Mozilla Public License.
  • Source: source code is available through the "Get the code" link.
  • Support: a Pentaho forum, Issue Tracking, and a Pentaho Community with deep-dive technical articles that are better than those on some premium ETL vendor sites.
  • Connectivity: supports Oracle, DB2, SQL Server and Sybase, as well as the open source databases MySQL, PostgreSQL, Hypersonic, Firebird and Ingres. Connectivity to SAP R/3 is available for a license fee.
  • Scalability: supports a Parallel Processing Architecture by distributing ETL tasks across multiple servers.
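
The scalability point can be illustrated with a minimal Python sketch. This is not Kettle code and uses no Kettle API; the function names (`transform_partition`, `partition`, `run_partitioned`) and the row layout are hypothetical. It splits rows into partitions, transforms each partition in a separate worker, and merges the results, with threads standing in for the multiple servers a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def transform_partition(rows):
    """Hypothetical transform step: trim and upper-case the name field."""
    return [{**row, "name": row["name"].strip().upper()} for row in rows]


def partition(rows, n):
    """Split rows round-robin into n partitions."""
    return [rows[i::n] for i in range(n)]


def run_partitioned(rows, workers=2):
    """Transform each partition in its own worker, then merge the results."""
    parts = partition(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Collect inside the block so every partition has finished.
        results = list(pool.map(transform_partition, parts))
    merged = [row for part in results for row in part]
    # Round-robin partitioning interleaves rows, so restore order by id.
    return sorted(merged, key=lambda r: r["id"])
```

The sketch assumes each row can be transformed independently of the others; steps that need to see all rows (sorting, aggregation) would require a merge stage after the partitions complete.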

Pros

  • One of the oldest open source ETL tools, it has a large user community and renewed momentum from Pentaho's support.
  • Out-of-the-box integration with other Pentaho open source products such as BI, EII and EAI.
  • The GUI designer, the out-of-the-box transformer objects and the support for slowly changing dimensions should increase developer productivity.
  • Community articles show an enthusiastic sharing of tips and tricks.
  • The Mozilla Public License allows Kettle to be embedded in another product without license fees.
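
For readers unfamiliar with slowly changing dimensions, the following is a minimal Python sketch of a Type 2 update, where a changed record is closed and a new version is appended so that history is preserved. It is an illustration only and does not reflect how Kettle implements this internally; the function name `apply_scd2` and the field names (`key`, `attrs`, `valid_from`, `valid_to`) are assumptions made for the example.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply a Type 2 update: close changed rows and append new versions.

    dimension: list of dicts with keys key, attrs, valid_from, valid_to
    incoming:  dict mapping business key -> current attrs from the source
    """
    # Index the currently open version of each business key.
    current = {row["key"]: row for row in dimension if row["valid_to"] is None}
    for key, attrs in incoming.items():
        existing = current.get(key)
        if existing is not None and existing["attrs"] == attrs:
            continue  # unchanged, keep the current version
        if existing is not None:
            existing["valid_to"] = today  # close the old version
        dimension.append(
            {"key": key, "attrs": attrs, "valid_from": today, "valid_to": None}
        )
    return dimension
```

Writing this logic by hand for every dimension table is repetitive, which is why built-in support in an ETL tool is listed as a productivity benefit.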

Cons

  • Does not have a specialised data quality component or a partnership with a data quality vendor.
  • Potential performance overhead on high-volume data joins and lookups when the lookup database is accessed over a network. The streaming lookup works best for small lookup volumes. For very large lookup sources, the data should be stored in a database local to the ETL server.
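
The trade-off behind the lookup recommendation can be sketched in a few lines of Python. This is a generic in-memory lookup join, not Kettle's own implementation; the function name `stream_lookup` and the sample field names are hypothetical. The whole lookup source is loaded into a dictionary first, which avoids a network round trip per row but means memory use grows with the size of the lookup source.

```python
def stream_lookup(rows, lookup_rows, key):
    """Join a stream of rows against a lookup table held entirely in memory."""
    # The full lookup source is indexed up front: fast per-row access,
    # but only practical when the lookup source is small.
    index = {row[key]: row for row in lookup_rows}
    for row in rows:
        match = index.get(row[key])
        # Rows without a match pass through unchanged.
        yield {**row, **(match or {})}
```

A usage example under the same assumptions:

```python
lookup = [{"country": "NO", "region": "Nordics"}]
orders = [{"order": 1, "country": "NO"}, {"order": 2, "country": "IT"}]
enriched = list(stream_lookup(orders, lookup, "country"))
```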

Desired Enhancements

Functionality that users of the MIKE2.0 Methodology would like to see added to this product is as follows:

User Valuation Enhancements

Voting scores from MIKE2.0 Contributors on the value of the asset in the context of the overall methodology

Usage

Product Access

[1] available from sourceforge.net

Open Source Licensing

Comparable Open Source Products

Reference Implementations through MIKE2.0
