Updated at: August 19, 2017
kettle is a open source ETL tool which coded by Java, (support Windows & Unix/Linux)
Currently kettle was acquired by Pentaho and named as PDI(pentaho data intergration), there are both community(free) and enterprise version.
To install kettle, http://wiki.pentaho.com/display/EAI/01.+Installing+Kettle
data-integration directory hierarchy looks like:
I have highlighted 4 main components of kettle in the screenshot above, there are two executable files (.bat / .sh, run on Windows / Unix-like) for each one.
After you run
the first thing you should do, create a repository, kettle currently support 3 kinds of repository:
My suggestion is database repo, you can find details for any kind of repository here.
After you create a repo, connect in Spoon by admin/admin (default password: admin).
To open a transformation/job from repo, ‘Tools’ -> ‘Repository’ -> ‘Explore…’ ( Ctrl + E )
# This example runs a transformation from file on a windows platform: pan.bat /file:"D:\Transformations\Customer Dimension.ktr" /level:Basic # This example runs a transformation from file on a Linux box: pan.sh -file="/PRD/Customer Dimension.ktr" -level=Minimal
# This example runs a transformation from the repository on a windows platform: (Enter on a single line without returns...) pan.bat /rep:"Production Repository" /trans:"update Customer Dimension" /dir:/Dimensions/ /user:matt /pass:somepassword123 /level:Basic
# This example runs a job from file on a windows platform: kitchen.bat /file:D:\Jobs\updateWarehouse.kjb /level:Basic # This example runs a job from file on a Linux box: kitchen.sh -file=/PRD/updateWarehouse.kjb -level=Minimal
# This example runs a job from the repository on a windows platform: kitchen.bat /rep:"Production Repository" /job:"Update dimensions" /dir:/Dimensions /user:matt /pass:somepassword123 /level:Basic
enable carte on your remote host:
carte.sh 127.0.0.1 8081
add new slave server in Spoon:
run options: select the slave sever that you create
view execution result @
Go in deeper, Execute Scheduled Jobs on a Remote Carte Server
Use MySQL as an example, if you try to add a MySQL database connection without a mysql jdbc driver, you will meet this kind of error message:
Error connecting to database [Local MySQL DB] : org.pentaho.di.core.exception.KettleDatabaseException: Error occured while trying to connect to the database Exception while loading class org.gjt.mm.mysql.Driver
kettle will create several tables in your specified database for a database repository,
$HOME/.kettle/repositories.xml to different host’s same location
Notice: you may need to change the ip in
The mistake i had done was i had left the Result name populated thus the component presumed the call was for a function. Once i had removed the value from this field the procedure worked.
e.g. insert into an Oracle table you need to configure the oracle connection like this:
currently, I am planing to create a ETL automation tool with kettle, feel free to star this repo on github.