Benefits of using Docker with Talend

Jul 19, 2016 | BlogPosts, Talend, Tech Tips

If you use VMs often, perhaps mostly for testing or for a particular piece of software that happens to be Linux-based, then you need to take a look at Docker. Docker looks set to become the standard in virtualization, and it is the way you should be testing and developing. Microsoft, Apple, and the Linux Foundation are giving Docker serious support, and some Linux distributions come with Docker pre-installed.

As a Talend Developer you ought to get on board. Here’s an example of how Docker demonstrates its utility.

Let’s say you are working on a project that has gone live, and you are then asked to make some changes to a Talend job. However, you can’t run the job in the production environment, either because it might impact system performance or because the client simply does not want you to touch production for security reasons. Furthermore, the job depends on another set of technologies, such as Apache open source projects or databases. For the sake of this example, let’s say HP Vertica, which is an analytical database. HP Vertica does not run on Windows, so if your development platform of choice is Windows, you will need to create a Linux virtual machine, install all of the prerequisites that the software requires, and then install the database.

This process would take at least 40-50 minutes, and that is if you know what you are doing and already have all of the software available. Running the VM will also take up a decent amount of your machine’s resources.

With Docker, you can get HP Vertica up and running in less than 10 minutes, with significantly less demanding hardware requirements. This means you can start working on the problem at least 30 minutes sooner! You might be done by the time it would otherwise have taken you to install the VM, which you might end up using only 4-5 times in total. What a waste!
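To give a rough idea of what "up and running" involves, here is a minimal Python sketch that assembles a `docker run` command for a database container and runs it if the Docker CLI is available. The image name `example/vertica`, the container name `vertica-dev`, and the use of port 5433 are assumptions for illustration only; substitute whichever image and port you actually use.

```python
import shutil
import subprocess


def build_run_command(image, name, host_port, container_port):
    """Build the argument list for starting a detached container."""
    return [
        "docker", "run",
        "-d",                                   # run in the background
        "--name", name,                         # name it so it is easy to stop or remove later
        "-p", f"{host_port}:{container_port}",  # publish the database port to the host
        image,
    ]


def start_container(command):
    """Run the command if the docker CLI is on PATH; return True on success."""
    if shutil.which("docker") is None:
        print("docker CLI not found; would have run:", " ".join(command))
        return False
    return subprocess.run(command).returncode == 0


# Hypothetical image and container names; 5433 is an assumed database port.
VERTICA_COMMAND = build_run_command("example/vertica", "vertica-dev", 5433, 5433)
```

Calling `start_container(VERTICA_COMMAND)` would then launch the container, after which a Talend job could connect to the database on the published port of the Docker host.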

The Windows version of Docker currently requires VirtualBox, although much better support is expected under Windows 10 in the future. In the meantime, it needs to run everything in a VM, which by default uses only a small amount of RAM. If you are going to be using Hadoop, you will need to increase that to 8GB. The VM won’t actually reserve that amount of RAM whenever it’s running, but the higher limit will prevent some of the Hadoop stack tools from crashing.
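If your Windows setup uses Docker Toolbox with `docker-machine` and its VirtualBox driver (an assumption; your tooling may differ), the VM's memory is set when the machine is created. The sketch below only builds and prints that command. The machine name `hadoop-dev` is hypothetical, and the driver's `--virtualbox-memory` flag takes a value in megabytes.

```python
def build_machine_command(machine_name, memory_gb):
    """Build a docker-machine command that creates a VirtualBox VM with the given RAM."""
    memory_mb = memory_gb * 1024  # the VirtualBox driver expects megabytes
    return [
        "docker-machine", "create",
        "--driver", "virtualbox",
        "--virtualbox-memory", str(memory_mb),
        machine_name,                # hypothetical machine name
    ]


HADOOP_VM_COMMAND = build_machine_command("hadoop-dev", 8)
print(" ".join(HADOOP_VM_COMMAND))
```

The printed line can be pasted into a terminal. Keeping the gigabyte-to-megabyte conversion in one place makes it easy to try a smaller limit first if 8GB is more than your machine can spare.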

In my personal experience, Docker has saved me hours of work. Just as importantly, it has motivated me to test certain features or add functionality, because I know that I can quickly run a container and start using a database or a Linux tool that complements my Talend job. That has made me a better developer overall. Give it a go and let me know what you think. Links to resources are at the bottom of this article.

Here are some useful resources you may like to check out if you want to find out more about Docker and how it can help you:

Get Started with Docker for Windows

Self-Paced Training: Introduction to Docker

Welcome, Friends, to the Docker Docs


I hope you enjoyed this blog. If you found it useful, I suggest subscribing to the Datalytyx blog to receive a monthly email of recent articles and handy tips.

