Data integration in business environments can be a painful task. I mean REALLY painful. The volume of data is huge, it does not cross-validate, it is scattered across many heterogeneous formats, yada yada. You know the song. One day, I stumbled on Pentaho Data Integration (PDI). This was a real breakthrough.
First things first, it's not subject to "vendor lock-in". It can read most data formats out there and can write them back to pretty much anything. This is a huge plus because it makes the tool usable by a wide range of users and environments. Being written in Java also gives it an edge as an enterprise tool, since it is platform agnostic.
But the real advantages are not those trivial specifications. My love for PDI has much deeper roots. Simply put: it's powerful. Creating an integration process is a trivial matter. Drag and drop. Link. Execute. Those three simple steps will cover most of your business needs. Really, I mean it. Never again will I write a snippet of code to read a CSV file and write its contents to a database. Mark my words: NEVER! It is a waste of time, and a developer who keeps up with the times should know that.
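For the record, here is roughly the kind of snippet I am talking about. It is only a minimal sketch, not anything PDI generates or uses: the `load_csv` function, the `people` table and the `name` and `age` columns are all made up for illustration, and SQLite stands in for whatever database you actually have.

```python
import csv
import io
import sqlite3


def load_csv(conn, csv_file, table="people"):
    """Copy every row of a CSV stream into a SQLite table; return the row count.

    The table name and the (name, age) columns are hypothetical, chosen
    only to keep the example small.
    """
    reader = csv.DictReader(csv_file)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (name TEXT, age INTEGER)")
    # Convert each CSV record (a dict of strings) into a typed tuple.
    rows = [(record["name"], int(record["age"])) for record in reader]
    conn.executemany(f"INSERT INTO {table} (name, age) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # In-memory data and database so the sketch runs without any setup.
    sample = io.StringIO("name,age\nAlice,34\nBob,27\n")
    connection = sqlite3.connect(":memory:")
    print(load_csv(connection, sample))
```

And that is the easy case: one file, two columns, no bad rows, no encoding surprises. Every new source means another variation of the same boilerplate, which is exactly what the drag, link and execute routine spares you.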
What about the really juicy stuff?
As you suspected, there is much more to PDI than meets the eye. It can be clustered, it can use a database-backed repository for all processes, it comes with tools that generate documentation automatically, and it is supported by a huge community. Many tutorials exist to address most business needs and challenges. It's well made, very stable and easily extended with plugins for power users.
I strongly recommend giving it a try. The next version should be released soon and it will include many great new features. I met Matt Casters last June and had the chance to see for myself all the new functionality that will make it into the next release. We're talking about visual performance bottleneck exploration and some more neat stuff you won't find anywhere else.
Cheers, and have a good time integrating!