Another Open Source suite of ETL/data integration tools.
databases:
Oracle, Sybase, MySQL, XML, RSS, WebDAV, LDAP. Web site integration components.
platform:
Java on Windows
license:
GNU GPL (open source) or commercial, at the customer's choice.
YK's comment:
The tool is Java-based but seems to work on Windows only. It seems to have good XML/web integration capabilities. As with other Open Source offerings, the online demos are quite good.
Last updated: Sun, 16 Sep 2007 00:00:00 GMT
A competent ETL tool developed at Sun Microsystems, with an eye on the bleeding edge of integration architectures.
databases:
Oracle (v8 and higher), Sybase, DB2 (v5 and higher), SQL Server, Derby, Axion, spreadsheets, HTML tables (unnested), flat files (delimited or fixed width), others via JDBC.
platform:
Java
YK's comment:
ETL Integrator can expose ETL operations as web services, making it suitable for BI applications based on SOA. The editor is a NetBeans module.
Last updated: Mon, 10 Sep 2007 00:00:00 GMT
A business-rules-driven approach to data warehousing.
databases:
DB2
platform:
IBM (z/OS, AIX, iSeries, Linux)
license:
Proprietary with free evaluation.
YK's comment:
The tool generates native SQL for the target version of DB2.
Last updated: Wed, 15 Nov 2006 00:00:00 GMT
An advanced ETL tool with a grid computing architecture.
databases:
MySQL, Postgres, Oracle, SQL Server, DB2, ODBC
platform:
Windows or Linux
license:
GNU GPL (Open Source). Note that while the Open Studio developer tool is free to download and use, the "Enterprise Tools", called the "Talend Integration Suite", are described as a "subscription service" while still being Open Source.
YK's comment:
Very intuitive GUI. Having some experience with IBM DataStage, I was up and running with this tool in a matter of minutes. Supports both Java and Perl as the underlying languages.
Last updated: Sat, 22 Sep 2007 00:00:00 GMT
A Java framework for building ETL applications.
databases:
MySQL, Oracle, FlashFiler, dBase, JDBC, ...
platform:
Cross-platform
license:
Open Source Engine, Free GUI for non-commercial use.
YK's comment:
ETL processes can be described as XML documents, or apparently as Java programs. A separate GUI is available for a fee.
Last updated: Fri, 28 Dec 2007 22:22:12 GMT
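As a rough illustration of what "ETL processes described as XML documents" can mean in general, here is a minimal Python sketch. The `<job>`, `<extract>`, `<transform>` and `<load>` elements, the `OPS` table and the `run_job` function are all hypothetical names invented for this example. They are not the schema or API of the tool listed above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical job description: the element and attribute names are
# assumptions made for this sketch, not any real tool's format.
JOB_XML = """
<job>
  <extract delimiter=";"/>
  <transform field="name" op="upper"/>
  <load fields="id,name"/>
</job>
"""

# Transform operations the sketch knows how to apply.
OPS = {"upper": str.upper, "strip": str.strip}


def run_job(job_xml, source_text):
    """Interpret the XML job: read delimited text, transform, project fields."""
    job = ET.fromstring(job_xml.strip())

    # Extract: parse the delimited source into dictionaries keyed by header.
    delimiter = job.find("extract").get("delimiter", ",")
    rows = list(csv.DictReader(io.StringIO(source_text), delimiter=delimiter))

    # Transform: apply each declared operation to the named field.
    for step in job.findall("transform"):
        field = step.get("field")
        op = OPS[step.get("op")]
        for row in rows:
            row[field] = op(row[field])

    # Load: here we only return the selected fields as tuples.
    fields = job.find("load").get("fields").split(",")
    return [tuple(row[f] for f in fields) for row in rows]


result = run_job(JOB_XML, "id;name;city\n1;ada;London\n2;grace;New York\n")
```

The point of the declarative form is that the job can be changed by editing the document rather than the program that runs it.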
An ETL data warehousing tool.
databases:
JDBC
platform:
Cross-platform
license:
GNU LGPL (open source)
YK's comment:
Formerly Kettle. Evidently built by data warehouse experts. It has a few pretty advanced features. One example is a transformer that allows creating records in lookup tables from orphan rows.
Last updated: Fri, 03 Nov 2006 00:00:00 GMT
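The orphan-row feature mentioned in the comment above can be pictured with a short Python sketch. This is only an illustration of the general idea (insert a stub lookup record whenever an incoming row references a key that does not exist yet). The function `resolve_orphans`, its parameters and the sample data are hypothetical and are not taken from the tool.

```python
def resolve_orphans(fact_rows, lookup, key_field, placeholder="UNKNOWN"):
    """Create a stub lookup record for every row whose key is missing.

    `lookup` maps a natural key to a lookup-table record and is updated
    in place. Returns the keys that had to be created, in first-seen order.
    """
    created = []
    for row in fact_rows:
        key = row[key_field]
        if key not in lookup:
            # Orphan row: add a placeholder record so the reference resolves.
            lookup[key] = {"id": len(lookup) + 1, "name": placeholder}
            created.append(key)
    return created


customers = {"C1": {"id": 1, "name": "Acme"}}
orders = [
    {"order": 100, "customer": "C1"},
    {"order": 101, "customer": "C7"},  # no matching customer: an orphan
    {"order": 102, "customer": "C7"},  # same orphan key, created only once
]
new_keys = resolve_orphans(orders, customers, "customer")
```

The stub record keeps referential integrity during the load, and it can be enriched later when the real lookup data arrives.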
A Java/XML ETL Tool.
databases:
JDBC
platform:
Cross-platform
license:
GNU GPL (open source)
YK's comment:
N/A
Last updated: Tue, 10 Oct 2006 00:00:00 GMT
Comprehensive, high-performance data processing/transformation system. Features a simple, user-friendly, event-driven scripting interface that generates highly efficient Perl/C code.
databases:
Oracle-OCI, SQLite
platform:
Cygwin (MS Windows), All POSIX (Linux/BSD/Unix-like)
license:
GNU GPL (open source)
YK's comment:
N/A
Last updated: Tue, 10 Oct 2006 00:00:00 GMT
A parallel data transformation, presentation, and protection tool for large sequential and index files. Used for high-volume data integration, staging, loading and BI. Combines sort, merge, join, aggregate, convert, reformat, encrypt, cross-calculate, lookup, report, and other functions in one job script and I/O pass.
databases:
DB2, Oracle, flat files (delimited or fixed width, including CSV, LDIF and XML), COBOL index files. API for custom sources/targets.
platform:
Unix (AIX, HP-UX, Mac OS X, Solaris, Tru64, etc.), POSIX (Linux, BSD, etc.), Cygwin, Windows. IBM x, i, p and zSeries (AIX and Linux).
license:
Proprietary with free trial. Perpetual use, low cost.
YK's comment:
I have as yet no personal experience of the tool. The CoSort staff contributed the following details: Uses a 4GL called SortCL to define and manipulate multiple extract and legacy file sources and targets. Also has plug-in sort accelerators for DataStage and Informatica, fast extract (FACT) tools for Oracle and DB2, and can generate safe test data in real table and file formats.
Last updated: Tue, 10 Oct 2006 00:00:00 GMT
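To make the idea of combining several functions in one job concrete, here is a small Python sketch that sorts records and then aggregates them in a single pass over the sorted output. It is plain Python, not SortCL, and the function `sort_and_aggregate` and the sample records are hypothetical names chosen for this example.

```python
from itertools import groupby
from operator import itemgetter


def sort_and_aggregate(records, key, value):
    """Sort records by `key`, then total `value` per key in one pass."""
    ordered = sorted(records, key=itemgetter(key))
    # groupby relies on the input already being sorted by the same key,
    # so the totals are produced in a single sweep over `ordered`.
    summary = [
        (k, sum(r[value] for r in group))
        for k, group in groupby(ordered, key=itemgetter(key))
    ]
    return ordered, summary


sales = [
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 3},
    {"region": "west", "amount": 2},
]
ordered, summary = sort_and_aggregate(sales, "region", "amount")
```

Because the aggregation reuses the sort order, the data does not have to be read or sorted a second time for the summary.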
An ETL and data integration suite that generates portable Java code.
databases:
N/A
platform:
MS Windows, Web
license:
Proprietary with free trial
YK's comment:
N/A
Last updated: Tue, 10 Oct 2006 00:00:00 GMT
An extremely quick ETL framework using MySQL as its backend.
databases:
MySQL
platform:
Cygwin (MS Windows), All POSIX (Linux/BSD/Unix-like)
license:
Creative Commons License
YK's comment:
I know of at least two major data warehouses in Sweden using the Orlando Model. The implementation was easy, and the performance excellent even on cheap off-the-shelf hardware. Practically all you need to set up the tool is to feed it the data warehouse's data model and the characteristics of each data source, and it will figure out the rest. On the downside, it is a largish scripted program with no GUI, the interface to the developer being a configuration file. The original developers (lars@subside.com and federico@pietrolucci.net) seem to be nice guys and would probably be reasonable about helping with a new implementation.
Last updated: Thu, 23 Nov 2006 00:00:00 GMT
A sophisticated sorting tool.
databases:
N/A
platform:
MacOS X, POSIX (Linux/BSD)
license:
GNU GPL (open source)
YK's comment:
N/A
Last updated: Tue, 10 Oct 2006 00:00:00 GMT
Oracle's ETL software.
databases:
Oracle
platform:
Windows, Linux, SunOS, HP-UX, AIX
license:
Proprietary with free trial
YK's comment:
N/A
Last updated: Tue, 10 Oct 2006 00:00:00 GMT
Efficient in parallel environments. ETL processes are built in a GUI, then compiled and run natively.
databases:
DB2, Oracle, Teradata
platform:
Server: Linux, AIX, IBM Mainframe. Client: Windows
license:
Proprietary
YK's comment:
Formerly Ascential DataStage. There are two major versions of the software: parallel and server. The server version does not parallelize the process and is rather sluggish. The parallel version is a huge improvement performance-wise. The means of abstraction are rather limited in this tool, i.e., if you expect to be able to create reusable processes, you might consider evaluating other tools or contacting a DataStage wizard. The documentation lacks examples. In practice you need to buy courses from IBM in order to learn the tool.
Last updated: Tue, 31 Oct 2006 00:00:00 GMT
A low cost, high performance ETL tool.
databases:
MySQL 5.x
platform:
Windows, Linux Intel, SunOS, AIX
license:
Proprietary with free trial.
YK's comment:
Formerly Instant Data Warehouse. According to the developer the tool delivers "10x the productivity of high end ETL tools for less than 10% of the price". Still according to the developer, the tool generates most of the 'ETL components' directly from a mapping spreadsheet. I'll make sure to try this one out.
Last updated: Wed, 01 Nov 2006 00:00:00 GMT
If you would like to make suggestions or corrections to this material, send an e-mail to: contact@linuxetl.com