I have a script that takes multiple arguments, and I need to run it on multiple AWS instances in parallel. For the sake of simplicity, say I have three instances in AWS; I would like to run the following:

On instance-a: script.sh a b
On instance-b: script.sh s t
On instance-c: script.sh y z

I will be spawning the instances using an AMI which will have the runtime (MATLAB) and the program (using the runtime) installed as part of the image.

I was checking this link and I saw Capistrano mentioned. Will that work in my case? Is there any other lightweight alternative I could explore? I should mention that I need the return status and the output (a CSV file) generated on each instance.

Technext
  • 147

1 Answer

If you only have three instances, this will work with GNU Parallel (version >= 20161222 is needed for --results my.csv to work):

parallel --results my.csv ssh {1} script.sh {2} {3} ::: instance-a instance-b instance-c :::+ a s y :::+ b t z

But let me guess: You have many more instances listed in a file called hosts.txt:

instance-a
instance-b
instance-c

You do not care which instance runs which job; they are just workers. The arguments are in a tab-separated file such as input.tsv, with one job per line:

a[tab]b
s[tab]t
y[tab]z
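
In case it helps, here is a minimal sketch of creating the two files from the shell. The [tab] placeholders above stand for real tab characters, which printf writes as \t. The instance names and arguments are just the ones from the question; substitute your own.

```shell
# hosts.txt: one SSH login per line (placeholder instance names from the question)
printf '%s\n' instance-a instance-b instance-c > hosts.txt

# input.tsv: one job per line, the two arguments separated by a real tab.
# printf reuses the format string for each pair of arguments.
printf '%s\t%s\n' a b s t y z > input.tsv
```

Running cat -A input.tsv (GNU coreutils) shows tabs as ^I if you want to double-check the separators.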

Then you would run:

parallel --slf hosts.txt --results my.csv -a input.tsv --colsep '\t' script.sh

If your command returns 0 on success, you can even run on cheap spot-market servers: with --retries 5, GNU Parallel will re-do a job on another server if it fails on one (i.e. the command returns a non-zero exit status, for example because that server broke down).
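
A rough sketch of what that could look like, wrapped in a shell function so nothing runs until you call it. The function name run_on_workers and the file name jobs.log are made up for illustration. The --joblog option is an addition that the command above does not use: it writes one tab-separated line per job, including the host and the exit value, which may be a convenient place to read the return status from.

```shell
# Sketch only: same worker setup as the command above, plus retries and a job log.
# Assumes hosts.txt and input.tsv are in the current directory and that
# script.sh is on the PATH of every instance.
run_on_workers() {
  parallel --slf hosts.txt \
           --retries 5 \
           --joblog jobs.log \
           --results my.csv \
           -a input.tsv --colsep '\t' \
           script.sh
}
```

Call run_on_workers once the instances are reachable over ssh; afterwards my.csv holds each job's output and jobs.log its exit value.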

Ole Tange
  • 3,186