Ticker

6/recent/ticker-posts

Ad Code

Responsive Advertisement

Linux Fu: Mixing Bash and Python

Although bash scripts are regularly maligned, they do have a certain simplicity and ease of creation that makes them hard to resist. But sometimes you really need to do some heavy lifting in another language. I’ll talk about Python, but actually, you can use many different languages with this technique, although you might need a little adaptation, depending on your language of choice.

Of course, you don’t have to do anything special to call another program from a bash script. After all, that’s what it’s mainly used for: calling other programs. However, it isn’t very handy to have your script spread out over multiple files. They can get out of sync and if you want to send it to someone or another machine, you have to remember what to get. It is nicer to have everything in one file.

The Documents are All Here

The key is to use the often forgotten here document feature of bash. This works best when the program in question is an interpreter like Python.


#!/bin/bash
echo Welcome to our shell script

python <<__EOF_PYTHON_SCRIPT
print 'Howdy from Python!'
__EOF_PYTHON_SCRIPT

echo "And we are back!"

Variations on a Theme

Of course, any shell that supports here documents can do this, not just bash. You can use other interpreters, too. For example:

#!/bin/bash
echo Welcome to our shell script

perl <<__EOF_PERL_SCRIPT
print "Howdy from Perl\n"
__EOF_PERL_SCRIPT

echo "And we are back!"

It might be confusing, but you could even have some sections in Python and others in Perl or another language. Try not to get carried away.

Just in Case

There’s the rare case where an interpreter doesn’t take the interpreted file from the standard input. Awk is a notorious offender. While you can embed the entire script in the command line, that can be awkward and leads to quoting hassles. But it seems silly to write out a temporary file just to send to awk. Luckily, you don’t have to.

There are at least two common ways to handle this problem. The first is to use the bash process substitution. This basically creates a temporary file from a subshell’s standard output:

#!/bin/bash
# Hybrid bash/awk Word count program -- totally unnecessary, of course...

echo Counting words for "$@"
awk -f <( cat - <<EOF_AWK
    
    BEGIN {
        wcount=0;
        lcount=0;
    }
    
    {
        lcount++;
        wcount+=NF;
    }
    
    END {
        print "Lines=" lcount;
        print "Words=" wcount;
    }
    
    EOF_AWK
) "$@"

echo Hope that was enough
exit 0

Yet Another Way

There is another way to organize your process substitutions, so they are all gathered together at the end of the script surrounded by a marker such as “AWK_START” and “AWK_END” or any other pair of strings you like. The idea is to put each pseudo file in its own section at the end of the script. You can then use any number of techniques like sed or awk to strip those lines out and process substitute them like before.

There are two minor problems. First, the script needs to exit before the fake files start. That’s easy. You just have to make sure to code an exit at the end of the script, which you probably ought to do anyway. The other problem is searching for the marker text. If you search the file for, say, AWK_START, you need to make sure the search pattern itself isn’t found. You can fix this by using some arbitrary brackets in the search string or breaking up the search string. Consider this:

#!/bin/bash
# Hybrid bash/awk Word count program -- totally unnecessary, of course...

echo Counting words for "$@"
# use brackets
#awk -f <( sed -e '/[A]WK_START/,/[A]WK_END/!d' $0 ) "$@"
# or
AWK_PREFIX=AWK
awk -f <( sed -e "/${AWK_PREFIX}_START/,/${AWK_PREFIX}_END/!d" $0 ) "$@"

echo Hope that was enough
exit 0

# everything below here will be the supporting "files", in this case, just one for awk

# AWK_START

BEGIN {
    wcount=0;
    lcount=0;
}

{
    lcount++;
    wcount+=NF;
}

END {
    print "Lines=" lcount;
    print "Words=" wcount;
}

#AWK_END

There is no reason you could not have multiple fake files at the end, each with a different pair of markers. Do note, though, that the markers are sent to the program which is why they appear as comments. If these were going to a program that didn’t use # as a comment marker, you’d need to change the marker lines a bit, write a more complex sed expression, or add some commands to take off the first and last lines before sending it.

That’s a Wrap

You could argue that you can do all you need to do in one language and that’s almost certainly true. But having some tricks to embed multiple files inside a file can make creating and distributing scripts easier. This is somewhat similar to how we made self-installing archive files in a previous installment of Linux Fu. If you’d rather script in C or C++, you can do that too.

Enregistrer un commentaire

0 Commentaires