Error Checking in UNIX Shell Scripts

UNIX commands ALWAYS return an exit code. By convention, 0 (zero) means success, and any other exit code means failure for some reason or another.
All UNIX shells have a built-in variable, $?, that holds the exit code of the previous command. Let's see how it works:


[biff@home ~]$ ls > /dev/null
[biff@home ~]$ echo $?
0
[biff@home ~]$

In this example, I run the "ls" command with its output redirected to /dev/null. I then check the exit code and find that it executed successfully.
Now let's see what a failed command looks like:


[biff@home ~]$ cat /etc/shadow
cat: /etc/shadow: Permission denied
[biff@home ~]$ echo $?
1
[biff@home ~]$ 
[biff@home ~]$ cat /etc/shadow 2> /dev/null
[biff@home ~]$ echo $?
1
[biff@home ~]$

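First I tried to cat a file I didn't have permission to read, and the command failed. Then I did the same thing but redirected the error output with "2>" to /dev/null.
I did that to show how to get rid of unwanted messages in your shell scripts. Notice that the exit code is still 1: redirecting the message hides the error text, but it does not change the result.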
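
One thing to watch for: $? is overwritten by every command you run, including echo, so it is worth saving it right away if you need it more than once. Here is a minimal sketch of that idea. The variable names and the /nonexistent/file path are made up for illustration, and I am assuming that path really does not exist on your system.

```shell
#!/bin/sh

# Run a command that is expected to fail, discarding its error message.
# /nonexistent/file is a hypothetical path assumed not to exist.
cat /nonexistent/file 2> /dev/null
status=$?          # save the exit code before anything else overwrites it

echo "cat exited with $status"

# $? now refers to the echo above, not to cat.
after_echo=$?
echo "echo exited with $after_echo"
```

The saved value in status can be tested or printed as many times as you like, while $? itself has already moved on.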

Here are a few ways to do some error checking in your shell scripts:


#!/bin/sh

FILE="/etc/shadow"

cat "$FILE" 2> /dev/null

if [ $? -ne 0 ]
then
    echo "checking $FILE failed .. exiting"
    exit 1
fi

Here I check whether I was able to cat the file by checking the return code in my shell script. If I wasn't able to, I print a helpful message and exit.
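
Another way to write the same check is to let "if" test the command directly instead of looking at $? afterwards. The sketch below assumes a helper I am calling check_readable (a made-up name) and uses return instead of exit so you can try it without ending your shell. The /nonexistent/file path is also just an illustration.

```shell
#!/bin/sh

# check_readable is a hypothetical helper name used only for this sketch.
check_readable() {
    # "if ! command" branches on the command's exit code directly,
    # so there is no need to inspect $? afterwards.
    if ! cat "$1" > /dev/null 2>&1
    then
        echo "checking $1 failed"
        return 1
    fi
    echo "checking $1 succeeded"
}

check_readable /etc/passwd
# "||" runs the right-hand side only when the left-hand side fails.
check_readable /nonexistent/file || echo "a real script would exit 1 here"
```

Both forms do the same job. The explicit $? test is handy when you need the actual number, and the direct form is shorter when you only care about pass or fail.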

A good UNIX programmer checks the return code of every command a script executes. What if I can't create that file? What if I can't remove a file
or directory? What if I can't make a directory because of permissions? What if a particular command isn't found in the path of the person executing your script?
Things like these will make your program fail. We as shell coders need to learn how to handle these errors.
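
As one example of handling the "command not found" case, the sketch below uses the standard "command -v" builtin to ask the shell whether a command can be found before relying on it. The helper name require_cmd and the command name no_such_command_xyz are made up for this illustration.

```shell
#!/bin/sh

# require_cmd is a hypothetical helper: it reports whether a command
# can be found in the caller's PATH.
require_cmd() {
    if ! command -v "$1" > /dev/null 2>&1
    then
        echo "required command not found: $1" >&2
        return 1
    fi
}

require_cmd ls && echo "ls is available"
# no_such_command_xyz is assumed not to exist on this system.
require_cmd no_such_command_xyz || echo "a real script would exit 1 here"
```

Doing this kind of check near the top of a script means it fails early with a clear message instead of halfway through its work.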

The above if statement works great, but if I have a shell script that executes dozens or hundreds of commands, writing an if statement for each one gets tiresome.
Lucky for us, most UNIX shells support functions.



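#!/bin/sh

# Plain assignment: no leading "$" on the variable name.
FILE=/etc/shadow

# Run the command passed as arguments; exit the script if it fails.
# The name() form works in any POSIX sh, unlike the "function" keyword.
chkerr_exit() {
     "$@" 2> /dev/null
     if [ $? -ne 0 ]
     then
        # epic fail.
        echo "failed executing $*"
        exit 1
     fi
}

chkerr_exit cat "$FILE"
chkerr_exit rm "$FILE"
exit 0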

In my example, I pass each command I want to run to a function that executes it and checks the exit code. If the command fails, the script exits. This method
is much easier to manage in larger shell scripts.
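
Here is one possible variation on chkerr_exit, offered as a sketch rather than the one right way to do it. It takes the command as separate arguments, reports the actual exit code on stderr, and exits with that same code. The rc and sub_rc variable names are my own, and I run the calls inside a subshell only so the example can be tried without closing your terminal.

```shell
#!/bin/sh

# Variation on chkerr_exit: run the command given as arguments, and on
# failure report the exit code on stderr before exiting with that code.
chkerr_exit() {
    "$@" 2> /dev/null
    rc=$?
    if [ "$rc" -ne 0 ]
    then
        echo "failed executing: $* (exit code $rc)" >&2
        exit "$rc"
    fi
}

# The ( ... ) subshell means "exit" only ends the subshell, not this shell.
sub_rc=0
( chkerr_exit true; chkerr_exit false; echo "never reached" ) || sub_rc=$?
echo "subshell exited with $sub_rc"
```

Passing the original exit code along lets whatever called your script tell different failures apart.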

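*special NOTE: Always define your functions BEFORE you call them, and make sure any variable a function uses is set before that call. Declaring your variables at the top of the script, ahead of your functions, is an easy way to guarantee this.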
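
A small sketch of why the ordering matters: the shell looks up a variable when the function runs, so the value has to be set by then. Setting variables at the top of the script keeps you from getting caught out. The function name show_file is made up for this example.

```shell
#!/bin/sh

unset FILE   # start from a known state for the demonstration

# show_file is a hypothetical function for this sketch.
show_file() {
    # FILE is looked up when the function runs, not when it is defined.
    echo "FILE is '${FILE:-<unset>}'"
}

show_file            # FILE has not been set yet
FILE="/etc/passwd"
show_file            # now the function sees the value
```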

Some other built-in shell features allow you to test files and directories. Check out Wikipedia for the UNIX test operators. They work like this:


if [ ! -d someDirectory ]
then
    echo "directory does not exist"
    exit 1
fi

if [ -f /etc/passwd ]
then
    cat /etc/passwd 
fi

In this case I tested whether a directory didn't exist and whether /etc/passwd was a regular file. Using these tests will help you avoid errors at runtime.
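
To show a few more of the test operators together, here is a short sketch that combines -e (exists), -d (directory), -f (regular file) and -s (size greater than zero). The function name describe_path and the /nonexistent/file path are made up for illustration.

```shell
#!/bin/sh

# describe_path is a hypothetical helper that reports what kind of
# thing a path points to, using the standard test operators.
describe_path() {
    if [ ! -e "$1" ]
    then
        echo "$1: does not exist"
    elif [ -d "$1" ]
    then
        echo "$1: directory"
    elif [ -f "$1" ] && [ ! -s "$1" ]
    then
        echo "$1: empty regular file"
    elif [ -f "$1" ]
    then
        echo "$1: regular file"
    else
        echo "$1: something else"
    fi
}

describe_path /etc/passwd
describe_path /etc
describe_path /nonexistent/file
```

Checking what a path is before acting on it lets the script choose a sensible message instead of passing along whatever error the failing command prints.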