Is DataStage a user-friendly ETL tool? Do we need it?

I am close to completing a year of using DataStage, and like everyone else I have talked to about it, I have come to the same conclusion.

DATASTAGE IS HARD TO LEARN, and the learning path is time-consuming. The most frustrating parts, I felt, are the lack of an easily available demo version and the mediocre documentation. So I started wondering what there really is in DataStage, and why everyone is so crazy about ETL tools.

I guess there is such a big market for ETL tools because the world is losing genuine software developers every day. Err… yeah, it’s true. I guess no one would go for an ETL tool if they could get a first-rate PL/SQL developer who can deliver excellent loading times using a fully loaded parallel Oracle with pipelined transformations.

Let’s take the performance aspect. As I read in one of the DBA books, databases are evolving too fast for ETL tools to catch up. I just wonder how many years it will take for DataStage to support Oracle 10g Data Pump export/import, or to do a partition exchange load. And by the time it does, I guess Data Pump will have given way to a more distributed or more parallel version. I guess it might be the same case with most other databases.

So what is it that makes ETL tools so popular? Reusability? Development time? In the end I can come up with only one reason: they make the life of software devs easy… or is it the other way around… to kill the computing interest in software developers and produce a wave of lame drag-and-drop specialists… 🙂

2 responses to “Is DataStage a user-friendly ETL tool? Do we need it?”

  1. maree

    hi,

    I have used both stored procedures and DataStage.
    I remember changing a stored procedure from top to bottom in the middle of the night. This is a very stressful activity, and it is easier to redo or maintain ETL code. The important point is that rework will always occur, because the data size and environment during development are vastly different from what we encounter during roll-out.

    If you still think stored procedures are better, try changing a stored procedure developed by somebody else (in the middle of the night).

    Other advantages of ETL tools:
    ETL tools offer connections to diverse data sources
    metadata documentation and maintenance
    integration with profiling and cleansing
    better resource management (SQL uses memory, which is faster but costly, compared to disk I/O, which is slower but relatively cheaper)

  2. While I don’t exactly think that DataStage is ‘Drag & Drop’ as you say, I do see where you are coming from. In a lot of ways I think that DataStage makes for a more rounded developer. Let me give you an example.

    Say you need to load a Teradata table. There are several Teradata-supported load utilities available, including FastLoad, BTEQ, MultiLoad, and TPump. As a Teradata developer, you would pick one of them to do the job.

    Now assume that your next task is to load an Oracle table to support an Oracle ApEx application. The Teradata developer would be COMPLETELY lost, as he knows Teradata scripting and not PL/SQL. The DataStage developer, on the other hand, can say “no problem” and stumble his way around the Oracle stage until he figures out how to load the table.

    I believe that DataStage is anything but Drag & Drop; however, each release of the tool brings us closer and closer to making ETL programming streamlined and repeatable. Yes, there are countless bugs, and no, it isn’t easy to learn (especially things like the Modify stage), but knowing DataStage certainly gives you skills that you can then apply to other types of databases.

    That was a bit lengthy, but I hope you can identify with some of what I am saying.