Is it a good idea to call shell commands from within C?

Question

There's a unix shell command (udevadm info -q path -n /dev/ttyUSB2) that I want to call from a C program. With probably about a week of struggle, I could re-implement it myself, but I don't want to do that.

Is it widely accepted good practice for me to just call popen("my_command", "r");, or will that introduce unacceptable security problems and forwards compatibility issues? It feels wrong to me to do something like this, but I can't put my finger on why it would be bad.

Brian Agnew · Accepted Answer · 2017-06-20T09:22:20.717

It's not particularly bad, but there are some caveats.

How portable will your solution be? Will your chosen binary operate the same everywhere, output the results in the same format, etc.? Will its output differ depending on settings such as LANG?
How much extra load does this add to your process? Forking a binary generally takes more resources than making library calls. Is this acceptable in your scenario?
Are there security issues? Can someone substitute your chosen binary with another and perform nefarious deeds thereafter? Do you pass user-supplied args to your binary, and could they provide ;rm -rf / (for example)? Note that some APIs let you specify args more securely than just providing them on the command line.
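As a rough illustration of the last point, here is a minimal sketch of running a program with an explicit argument vector via fork and execvp, so that no shell ever parses the arguments. The helper name run_capture_line and its return convention are assumptions made for this example, not something from the question, and error handling is kept deliberately short.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given argument vector, with no
 * shell involved, and copy the first line of its stdout into buf.
 * Returns the child's exit status, or -1 on failure. */
int run_capture_line(char *const argv[], char *buf, size_t buflen)
{
    int fds[2];
    if (buflen == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: point stdout at the pipe, then replace the process image. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            dup2(fds[1], STDOUT_FILENO);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127);   /* only reached if exec failed */
    }

    /* Parent: read the first line, then drain the rest so the child
     * is not killed by SIGPIPE while still writing. */
    close(fds[1]);
    FILE *out = fdopen(fds[0], "r");
    if (out == NULL) {
        close(fds[0]);
        waitpid(pid, NULL, 0);
        return -1;
    }
    if (fgets(buf, (int)buflen, out) == NULL)
        buf[0] = '\0';
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    int c;
    while ((c = fgetc(out)) != EOF) {
        /* discard remaining output */
    }
    fclose(out);

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For the command in the question, the argument vector would be something like { "udevadm", "info", "-q", "path", "-n", devnode, NULL }. Because each element is handed to the program as-is, a devnode containing ; or other shell metacharacters is treated as a literal (and probably nonexistent) device name rather than as a second command.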

I'm generally happy executing binaries when I'm in a known environment that I can predict, when the binary output is easy to parse (if required - you may just need an exit code) and I don't need to do it too often.

As you've noted, the other issue is how much work it is to replicate what the binary does. Does it use a library that you could also use?

score 37 · Answer 2 · answered Jun 19 '17 at 23:31

It takes extreme care to guard against injection vulnerabilities once you've introduced a potential vector. It's in the forefront of your mind now, but later you may need the ability to select ttyUSB0-3, then that list will be used in other places so it will get factored out to follow the single responsibility principle, then a customer will have a requirement to put an arbitrary device in the list, and the developer making that change will have no idea that the list eventually gets used in an unsafe way.

In other words, code as if the most careless developer you know is making an unsafe change in a seemingly unrelated part of the code.

Second, the output of CLI tools isn't generally considered to be a stable interface unless the documentation specifically marks it as such. You might be okay counting on it for a script you run that you can troubleshoot yourself, but not for something you deploy to a customer.

Third, if you want an easy way to extract a value from your software, chances are someone else wants it too. Look for a library that already does what you want. libudev was already installed on my system:

#include <libudev.h>
#include <sys/stat.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    struct stat statbuf;

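    /* Note the leading slash: stat() needs the absolute device path. */
    if (stat("/dev/ttyUSB2", &statbuf) < 0)
        return -1;
    struct udev* udev = udev_new();
    if (udev == NULL)
        return -1;
    struct udev_device *dev = udev_device_new_from_devnum(udev, 'c', statbuf.st_rdev);
    if (dev == NULL) {
        udev_unref(udev);
        return -1;
    }

    printf("%s\n", udev_device_get_devpath(dev));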

    udev_device_unref(dev);
    udev_unref(udev);
    return 0;
}

There is other useful functionality in that library. My guess is if you needed this, some of the related functions might come in handy too.

score 16 · Answer 3 · answered Jun 19 '17 at 19:54

In your specific case, where you want to invoke udevadm, I'd suspect you could pull in udev as a library and make the appropriate function calls as an alternative?

For example, you could take a look at what udevadm itself is doing when invoked in "info" mode: https://github.com/gentoo/eudev/blob/master/src/udev/udevadm-info.c and make calls equivalent to those udevadm is making.

This would avoid a lot of the downside-tradeoffs enumerated in Brian Agnew's excellent answer -- e.g., not relying on the binary existing at a certain path, avoiding the expense of forking, etc.

Omnifarious · Answer 4 · 2017-06-20T16:22:28.993

Your question seemed to call for a forest answer, and the answers here seem like tree answers, so I thought I'd give you a forest answer.

This is very rarely how C programs are written. It is always how shell scripts are written, and sometimes how Python, Perl or Ruby programs are written.

People typically write in C for easy use of system libraries and direct low-level access to OS system calls as well as for speed. And C is a difficult language to write in, so if people don't need those things, then they don't use C. Also C programs are typically expected to only have dependencies on shared libraries and configuration files.

Shelling out to a sub-process isn't particularly fast, and it doesn't require fine-grained and controlled access to low-level system facilities, and it introduces a possibly surprising dependency on an external executable, so it is uncommon to see in C programs.

There are some additional concerns. The security and portability concerns people mention are completely valid. They are equally valid for shell scripts of course, but people are expecting those kinds of issues in shell scripts. But C programs are not typically expected to have this class of security concern, which makes it more dangerous.

But, in my opinion, the biggest concerns have to do with the way popen will interact with the rest of your program. popen has to create a child process, read its output and collect its exit status. In the meantime, that process' stderr will be connected to the same stderr as your program, which may cause confusing output, and its stdin will be the same as your program's, which might cause other interesting issues. You can solve that by including </dev/null 2>/dev/null in the string you pass to popen, since that string is interpreted by the shell.
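A minimal sketch of that redirection trick, assuming the command is a single simple command (the redirections would only apply to the last command in a pipeline or list). The helper name popen_first_line and the 512-byte command buffer are choices made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper: run a fixed command line through popen() with the
 * child's stdin and stderr detached from ours, and copy the first line of
 * its output into buf. Returns the exit status, or -1 on failure. */
int popen_first_line(const char *cmd, char *buf, size_t buflen)
{
    char full[512];
    int n = snprintf(full, sizeof full, "%s </dev/null 2>/dev/null", cmd);
    if (buflen == 0 || n < 0 || (size_t)n >= sizeof full)
        return -1;

    FILE *p = popen(full, "r");
    if (p == NULL)
        return -1;
    if (fgets(buf, (int)buflen, p) == NULL)
        buf[0] = '\0';
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */

    /* Drain any remaining output so the child can finish normally. */
    int c;
    while ((c = fgetc(p)) != EOF) {
        /* discard */
    }

    int status = pclose(p);           /* waits for the child */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling popen_first_line("udevadm info -q path -n /dev/ttyUSB2", buf, sizeof buf) would then leave the device path in buf without letting the child read from, or write diagnostics to, the caller's terminal. The cmd string must be a trusted constant here, since it still goes through sh.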

And popen creates a child process. If you do anything with signal handling or forking processes yourself you may end up getting odd SIGCHLD signals. Your calls to wait may interact oddly with popen and possibly create strange race conditions.

The security and portability concerns are there of course, as they are for shell scripts or anything that starts up other executables on the system. And you have to be careful that people using your program aren't able to get shell meta-characters into the string you pass into popen, because that string is given directly to sh, as in sh -c <string from popen as a single argument>.
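One way to keep meta-characters out of that string is to validate any variable part against a strict allowlist before building the command. The sketch below is only an illustration: the function names, the 32-character limit, and the rule "ASCII letters and digits only" are assumptions chosen to fit names like ttyUSB2, not a general-purpose sanitizer.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical allowlist check: accept only names such as "ttyUSB2" made of
 * ASCII letters and digits, so no shell meta-character can get through. */
bool is_safe_device_name(const char *name)
{
    size_t len = strlen(name);
    if (len == 0 || len > 32)
        return false;
    for (size_t i = 0; i < len; i++) {
        char c = name[i];
        bool ok = (c >= 'a' && c <= 'z') ||
                  (c >= 'A' && c <= 'Z') ||
                  (c >= '0' && c <= '9');
        if (!ok)
            return false;
    }
    return true;
}

/* Build the udevadm command line only after the name has been validated.
 * Returns 0 on success, -1 if the name is rejected or cmd is too small. */
int build_udevadm_command(const char *name, char *cmd, size_t cmdlen)
{
    if (!is_safe_device_name(name))
        return -1;
    int n = snprintf(cmd, cmdlen, "udevadm info -q path -n /dev/%s", name);
    return (n < 0 || (size_t)n >= cmdlen) ? -1 : 0;
}
```

With this in place, a name such as "ttyUSB2; rm -rf /" is rejected before any string reaches popen. Doing the check at the point where the command is built, rather than where the name is first read, keeps it effective even if the list of devices later comes from somewhere else.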

But I do not think they are why it is strange to see a C program using popen. The reason it is strange is because C is typically a low level language, and popen is not low level. And because use of popen places design constraints on your program because it will interact strangely with your program's standard input and output and make it a pain to do your own process management or signal handling. And because C programs are not typically expected to have dependencies on external executables.

Dave · Answer 5 · 2017-06-20T13:51:53.853

Your program may be subject to hacking, etc. One way to protect yourself from this type of activity is to create a copy of your current machine environment and run your program using a system called chroot.

From the viewpoint of your program, it's executing in a normal environment; from a security aspect, if somebody breaks your program, they only have access to the elements you provided when you made the copy.

Such a setup is called a chroot jail; for more details, see chroot jail.

It's normally used for setting up publicly accessible servers, etc.
