There have always been requirements where you need to run certain UNIX commands or shell scripts within DataStage. Although not a popular demand, it's still something that falls in the ‘good to know’ category. There are a number of different ways you can do this. I will try to explain the methods I have tried out.
Specifying the shell script in job properties
If you have any pre- or post-processing requirements that have to be run along with the job, and you have implemented them in a shell script, then you can specify the script, along with its parameters, in the Before-job subroutine or After-job subroutine section of the job properties. You should first select the ‘ExecSH’ option from the drop-down list to indicate that you are going to run a shell script.
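As a rough sketch of what such a script could look like, here is a hypothetical pre-processing script that archives an input file before the job runs. The script name, the function name and both parameters are my own assumptions for illustration, not something DataStage requires. The only real convention relied on is that the script reports success or failure through its exit status.

```shell
#!/bin/bash
# Hypothetical script: archive_input.sh <source_file> <archive_dir>
# Copies the source file into the archive directory with a date suffix.

archive_input() {
    local src="$1" archive_dir="$2"

    # Fail early (non-zero status) if the input file is missing.
    if [ ! -f "$src" ]; then
        echo "Source file not found: $src" >&2
        return 1
    fi

    # Create the archive directory if needed, then copy the file.
    mkdir -p "$archive_dir" || return 1
    cp "$src" "$archive_dir/$(basename "$src").$(date +%Y%m%d)"
}

# Run only when both arguments are supplied; the exit status of the
# script is then the status of archive_input.
if [ "$#" -ge 2 ]; then
    archive_input "$1" "$2"
fi
```

In the input value field you would then enter the script path followed by its arguments, for example with job parameters such as #SourceFile# and #ArchiveDir# (again, hypothetical names). It is worth checking how your DataStage version reacts to a non-zero exit status before relying on it.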
Using the External filter stage
The External Filter stage allows us to run UNIX filters during the execution of a job. Filters are programs you have probably used many times on UNIX: they read data from the input stream, modify it, and send it on to the output stream. A filter that everyone is bound to have used is ‘grep’. Other common filters are cut, cat, head, sort, uniq, perl, sh, wc and tail. The External Filter stage allows us to run these commands while the data is being processed in the job. For example, you can use the grep command to filter the incoming data. Shown below is a simple design.
As per the command, we are using grep to filter the incoming data on the number 18.
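To see what a filter like this does outside DataStage, you can try it on the command line. The sketch below uses made-up sample rows standing in for the data on the input link; the function names are only for illustration. The stage is assumed to behave like the pipe here, feeding rows to the command on standard input and reading the result from standard output.

```shell
# Made-up sample rows standing in for the stage's input link.
sample_rows() {
    printf '%s\n' 'tom,18,london' 'ann,25,paris' 'raj,18,mumbai'
}

# The kind of command you would give to the stage as its filter.
# grep reads rows from stdin and writes only the matching rows to stdout.
filter_rows() {
    grep 18
}

# A stricter variant: match only when the second comma-separated
# field is exactly 18.
filter_rows_by_column() {
    awk -F, '$2 == 18'
}

sample_rows | filter_rows
# prints:
# tom,18,london
# raj,18,mumbai
```

Note that a plain grep 18 matches the text 18 anywhere in the row, so a value such as 2018 or 180 would also get through. If that matters, a column-aware filter such as the awk variant is a safer choice.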
Using the sequential file stage
Although not a frequently used option, the Sequential File stage does allow us to run UNIX filter commands inside it. In this example I have written a shell script that can be called inside the stage. The place where you have to enter the command is shown below.
The shell script was a straightforward one that reads directly from the input:
#------------ Shell script ------------#
#!/bin/bash
# Read each incoming row from standard input
while IFS= read -r data; do
    # Append the row, tagged with a marker, to the output file
    echo "$data, Modified by emerson" >> E:/Sample/sample.txt
    # -- You can add your processing logic here --#
done
#------------ End of script ------------#
The output would be as below.
So the next time someone asks you if you can run a shell script through the Sequential File stage, you needn’t hesitate to say YES.
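If you want to try the script's logic before plugging it into the stage, you can pipe a few lines into it on the command line. The sketch below is a variant of the script above, not the original: it writes to standard output instead of appending to a fixed file, and the function name tag_rows and its suffix argument are my own additions for illustration.

```shell
# Variant of the script above that writes to stdout instead of a file.
# tag_rows <suffix>: appends ", <suffix>" to every row read from stdin.
tag_rows() {
    local suffix="$1"
    # "|| [ -n "$data" ]" also handles a last row without a trailing newline.
    while IFS= read -r data || [ -n "$data" ]; do
        printf '%s, %s\n' "$data" "$suffix"
        # -- You can add your processing logic here --
    done
}

# Made-up sample rows piped through the function.
printf '%s\n' 'row1' 'row2' | tag_rows 'Modified by emerson'
# prints:
# row1, Modified by emerson
# row2, Modified by emerson
```

Writing to standard output keeps the script usable in any pipe, and you can still redirect the result to a file when you need one.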




You are doing a great job.
Thank you very much.
Some of the doubts I had about using UNIX in DataStage have been cleared.
Thanks again.
Looking forward to still more….
Happy to help.
Hi,
Could you please let me know how the transformer mapping is stored, or where it is stored?
Transformer mappings in your DataStage parallel jobs are converted to C++ code and compiled to be used in the job. As to where exactly that code is stored, my guess would be to check the RT_ folders in your project. Is there any specific reason that you are searching for this specific file? I've never had the chance to use it and hence never really bothered whether I could modify the file. If your job is a server job, then your transformer is written as BASIC code, which will be present along with your normal DataStage code.
Hey, good blog. I was just looking around some blogs, and it seems a fairly good platform you are using. I’m presently using Drupal for a couple of my sites, but I’m trying to change one of them over to a platform much the same as yours as a trial run. Is there anything in particular you’d recommend about it?
It's a WordPress blog. It's pretty easy to use and doesn't require much time to maintain either.
I liked your article; it is an interesting technology.
Thanks to Google I found you.
Very interesting and useful info…. Great effort.
Looking forward to still more info… Thanks dude.
It is very informative. However, I am facing a problem applying the grep command to a sequential file created in DataStage. I am able to view the file in DataStage and can run “wc” and “head” on it in UNIX, but the moment I run “grep” on the file, it says “Binary file test_5.data matches” and shows no results; it only mentions whether it matched or not. This will be very difficult to handle for production support, and hence I need to get this resolved.
The stage that creates the file in DataStage is preceded by multiple lookups with the “continue” option for no match, and the output fields are defined as “Char”. I have a transformer stage that checks the length of these fields and sets them to spaces if the length is < 1.
Please let me know if you have some idea about the same.
Thanks .
Is your file a fixed-length file or a tab-delimited file? If there is a chance that fields will come through as null, then you will have to handle that properly. You might be getting some junk characters. I'm just thinking off the top of my head now.
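One thing that might be worth checking: grep usually prints the “Binary file ... matches” message when the file contains NUL bytes, which could be the case if the Char fields are being padded with nulls rather than spaces. That is only a guess about this particular file. The sketch below builds a small made-up sample file standing in for test_5.data, counts its NUL bytes, and shows two possible workarounds. The -a option is assumed to be available in your grep (it is in GNU grep).

```shell
# Build a made-up sample file with NUL-padded fields.
sample=$(mktemp)
printf 'abc\000\000,18\nxyz\000\000,25\n' > "$sample"

# Diagnosis: count the NUL bytes in the file (keep only NULs, then count).
nul_count=$(tr -cd '\000' < "$sample" | wc -c)
echo "NUL bytes found: $nul_count"

# Workaround 1: tell grep to treat the file as text, then strip the
# NULs from the matching rows for display.
grep -a 18 "$sample" | tr -d '\000'

# Workaround 2: strip the NULs first, then grep as usual.
tr -d '\000' < "$sample" | grep 18
```

If the count comes back as zero for your file, the cause is probably something else, such as other non-printable characters in the data.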
Hi,
You're doing a great job. Your posts have helped me a lot.
Thank you,
Rams
This is so wonderful and so helpful. Thanks a ton, you are doing great work :-)