6

I'm looking for a tool for running a series of commands like the existing tool:

parallel -h
parallel [OPTIONS] command -- arguments
    for each argument, run command with argument, in parallel
parallel [OPTIONS] -- commands
    run specified commands in parallel

But I'd like these commands to be run over ssh on multiple computers, with some of the niceties of pssh or pdsh for communicating with many hosts. I've hacked out something that works, but its ssh handling is nothing compared to these tools - I can't stop all the remote jobs at once, or even see all of their outputs.

Even better if the tool has some basic load balancing, but I was thinking I'd use a separate tool for host selection. (A good tool for querying load, memory, and whether a computer is in interactive use would also be appreciated, but I've already written something that will suffice for host selection.) This isn't on a cluster, and I don't want to rely on daemons other than sshd, or ask admins to install a serious cluster job scheduler like Condor. I don't have root access on any of these computers.

Edit: To emphasize, I want to run different commands on each host - typically running the same program with different arguments, as in the first parallel usage example above.
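
A minimal sketch of that kind of dispatch using nothing but ssh, assuming bash on the local machine, key-based logins, and a bash-compatible shell on the remote side. The HOSTS list, the run_on_hosts name, and the DRY_RUN switch are made up for illustration. Arguments are handed out round-robin, with no load balancing:

```shell
#!/usr/bin/env bash
# Sketch only: one ssh session per argument, hosts assigned round-robin.
# HOSTS, run_on_hosts and DRY_RUN are illustrative names, not part of any tool.

HOSTS=(server.example.com server2.example.net)

# Usage: run_on_hosts COMMAND ARG...
run_on_hosts() {
    local cmd=$1; shift
    local i=0 host arg
    for arg in "$@"; do
        host=${HOSTS[i % ${#HOSTS[@]}]}
        if [ -n "${DRY_RUN:-}" ]; then
            # Only print which host would run which command.
            printf '%s: %s %s\n' "$host" "$cmd" "$arg"
        else
            # -n keeps ssh from reading stdin; BatchMode=yes makes ssh fail
            # instead of prompting for a password. Each output line is
            # prefixed with the host name so interleaved output stays readable.
            ssh -n -o BatchMode=yes "$host" "$cmd $(printf '%q' "$arg")" 2>&1 |
                sed "s/^/[$host] /" &
        fi
        i=$((i + 1))
    done
    wait    # block until every background ssh pipeline has finished
}
```

With DRY_RUN set to a non-empty value, run_on_hosts ./myprog a b c only prints the host assigned to each argument. Without it, every job starts at once, so this does not limit the number of jobs per host.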

Thomas
  • 191

5 Answers

3

Ah! It looks like the GNU version of parallel (not the one I had installed) does do this. No load balancing, and I haven't tried it out to see what it does with each stdout and stderr, but this is precisely what I wanted.

To run commands on more than one remote computer, run:
seq 10 | parallel --sshlogin server.example.com,server2.example.net echo
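
The same idea with arguments given on the command line rather than on stdin, as a rough sketch. run_remote is a hypothetical wrapper name; --tag, --sshlogin and ::: are GNU parallel's own syntax. As far as I know, GNU parallel by default holds each job's output until the job finishes, and --tag prefixes every output line with the argument that produced it:

```shell
#!/usr/bin/env bash
# Sketch only: run_remote is an illustrative wrapper, not part of GNU parallel.

# Usage: run_remote HOST1,HOST2,... COMMAND ARG...
# Runs COMMAND once per ARG, spread over the listed ssh logins.
run_remote() {
    local logins=$1 cmd=$2
    shift 2
    # --tag labels each output line with its argument;
    # ::: introduces the argument list.
    parallel --tag --sshlogin "$logins" "$cmd" ::: "$@"
}
```

For example, run_remote server.example.com,server2.example.net echo 1 2 3 would run echo once for each of 1, 2 and 3.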

Unfortunately I've written a script that gives status updates, has configurable output settings, and incorporates some simple load balancing, so I'll be sticking with it for now.

Thomas
  • 191
1

"Edit: To emphasize, I want to run different commands on each host."

If you want different commands, where is the parallel part? Parallel means starting the same command on a collection of hosts (running in parallel)... If you want to do different things on different hosts, that is a sequential process.

0

You really should look into one of the many clustering technologies out there. Try looking at Apache Hadoop. I also recently read a great article on the subject that you may find interesting, about setting up a 10,000-core cluster to do parallel computing: http://goo.gl/A8hgX

TheCompWiz
  • 7,429
0

I've used mussh for this; it's bash-based but runs in parallel. I'm pretty happy with it.

I've also seen a few talks on rshall (which, despite having RSH in the name, uses ssh natively) at the local Linuxfests. It's Perl-based and can use an external source for querying host lists, but it expects certain host information in specific formats.

Neither of these has queuing or job scheduling, although you could run them via cron or at if you wanted to.

Neither requires root access, but both require key-based authentication to the systems.

0

clusterssh is another tool that might be worth looking into. It's more interactive in that it will open and tile terminal windows for each host. You can run commands in each terminal separately, or in all (or some) of them at once. For example, you could run top on 12 systems at a time and then chase down a process in just one of them.

zilla
  • 89