From Bash to Python and vice versa, for admins

Basic settings

When you launch a program from the shell, the way it is executed is determined by the first few bytes of the executable file. Scripts in interpreted languages start with the token “#!”, called a hashbang or shebang, followed by the path to the language interpreter.

bash:
#!/bin/bash
...shell commands

Python:
#!/usr/bin/env python
...python commands

The path /bin/bash is usual for bash, but the Python interpreter may live in /usr/bin/python, /usr/local/bin/python, or somewhere else. If you do not know the path, use the command env: it searches the directories listed in the variable $PATH for python and runs the first match.

Do not forget to set the execute permission on your script: chmod +x ./myscript.sh
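The same permission change can be made from Python itself. The following is a minimal sketch, assuming Python 3 on a POSIX system; the helper name make_executable and the file name are hypothetical and only serve the illustration.

```python
import os
import stat
import tempfile


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Demonstration on a throwaway script in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "myscript.py")
    with open(script, "w") as f:
        f.write("#!/usr/bin/env python\nprint('hi')\n")
    make_executable(script)
    final_mode = stat.S_IMODE(os.stat(script).st_mode)

print(oct(final_mode))
```

The existing permission bits are kept and only the execute bits are added, which is why the mode is read first and then combined with a bitwise OR.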

How to execute commands from the command line

bash:
bash -c 'echo "Hello World!"'

Python:
python -c 'print("Hello World!")'

Program initialization

Bash works with two groups of commands. The first group consists of keywords and built-in commands (e.g. if, for, trap, return, export, declare,... see man bash). The second group consists of user-defined functions and programs found in the system paths of the operating system (ls, stat, date, chmod, chown,...). The presence and parameters of these programs are defined by the POSIX standard, but on GNU/Linux many of them have additional functionality. System paths are listed in the environment variable $PATH, separated by colons. If you call a program that is stored somewhere else, you need to give an absolute or relative path to its file. If you want to load another shell script as a module, use the command source or the dot “.” command.

Python has a core set of built-in functions and syntax; other functions are defined in modules of Python's standard library, which you enable with import. For system administrators the most useful modules include re, sys, os and subprocess.

Names defined in a module are accessed with the module name as a prefix, e.g. sys.argv, os.mkdir("/tmp/temp"), etc. Alternatively, the statement from os import * puts all names from the os module into the current namespace, so they can be used without the prefix.

bash:
#!/bin/bash

source /etc/functions

Python:
#!/usr/bin/env python

import re, sys, os, subprocess
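The $PATH lookup described above can be reproduced with the os module. This is only a sketch, assuming Python 3; the function name find_in_path is hypothetical, and the standard library already offers shutil.which() for real use.

```python
import os


def find_in_path(program, path=None):
    """Return the first executable named `program` found in `path`, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):  # os.pathsep is ':' on POSIX systems
        # An empty entry in $PATH stands for the current directory.
        candidate = os.path.join(directory or ".", program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_in_path("ls"))
```

Note how every call carries its module prefix (os.environ, os.path.join, os.access), which is the effect of the plain import statement.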

Basic syntax comparison

Bash separates commands by a new line or a semicolon. Compound statements have explicit begin and end keywords, e.g. if-fi, case-esac, for-do-done,... Indentation is not required, but it helps to improve code readability. A long command can be split across lines by putting a backslash at the end of each line.

Python also separates commands by new lines or semicolons, but blocks must be indented, and the indentation alone determines what belongs to a block. You can use one or more spaces or tabs as indentation, as long as it is consistent within a block. The most common indentation is 4 spaces.

bash:
for i in {0..10}
do
	echo $i
	echo $((i*2))
done

for i in {0..10}; do echo $i; echo $((i*2)); done

Python:
for i in range(0, 11): # the upper bound is exclusive, so 11 matches bash {0..10}
	print i
	print i*2

for i in range(0, 11): print i; print i*2
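For readers on Python 3, the same loop might look as follows. This is a small sketch rather than a canonical form; the helper doubles is a hypothetical name used only to make the block boundaries visible.

```python
def doubles(last):
    """Return (i, i*2) pairs for i = 0..last inclusive, like bash {0..last}."""
    pairs = []
    for i in range(0, last + 1):    # range() excludes its upper bound
        pairs.append((i, i * 2))    # the indented line is the loop body
    return pairs                    # dedented: runs once, after the loop


for i, doubled in doubles(10):
    print(i)
    print(doubled)
```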

Variables

Bash has a very strict syntax for variable assignment: NAME=10, NAME="value". No space is allowed between the variable name and the assignment operator. This has an advantage: to find where a variable is changed, just run grep 'NAME[+]\?=' (plain NAME= also works, but this regexp additionally covers the operator +=, which appends to a string or an array). To get the value of a variable, prefix it with a dollar sign, i.e. $NAME or ${NAME}. Variables can hold integers, strings, arrays and associative arrays (hashes).

Python assigns variables in the usual way with the operator “=”. You can also use the augmented assignment operators that are common in the C language, so a += 10 is the same as a = a + 10 (but there is no i++; use i += 1 instead). In bash, these operators perform arithmetic only in an arithmetic context, e.g. (( a += 10 )).

Store the output of a system command in a variable:

bash:
date +%Y%m%d%H%M # run the command
var=$(date +%Y%m%d%H%M)
# Redirect standard error output
var2=$(ls non_existent_file 2>&1)

Python:
subprocess.call(['date', '+%Y%m%d%H%M']) # command line arguments are passed as a list
var = subprocess.check_output("date +%Y%m%d%H%M", shell=True)
# check_output() raises CalledProcessError on a non-zero exit status; appending 'exit 0' avoids that
var2 = subprocess.check_output("ls non_existent_file; exit 0", stderr=subprocess.STDOUT, shell=True)

Beware of a security issue: if you pass the command as a single string with shell=True instead of a list of arguments, and part of that string comes from user input, a malicious user can end the command with a semicolon and inject their own code!
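The difference can be seen in a short sketch, assuming Python 3 and an echo program on the $PATH; the function name echo_safely and the hostile input are made up for the illustration.

```python
import subprocess


def echo_safely(user_input):
    """Pass user input as one list item, so no shell ever parses it."""
    out = subprocess.check_output(["echo", user_input])
    return out.decode()


hostile = "hello; echo injected"
result = echo_safely(hostile)
print(result, end="")   # the semicolon is printed, not executed
```

Because the command is a list and shell=True is not used, the whole input reaches echo as a single argument. If a shell string cannot be avoided, shlex.quote() from the standard library can be used to escape each untrusted value.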

String operations

Bash distinguishes between 'single' and "double" quotes. Inside single quotes, special characters (e.g. !, $) and escape sequences (\n, \t,...) are not interpreted. If you do not quote a string, consecutive whitespace characters are collapsed into one.

Python does not care which kind of quotes you use. If you do not want escape sequences to be interpreted, use raw strings, e.g. r"\n", R'\n', or ur"\n", UR'\n' for Unicode in Python 2. To display a normal string with its escape sequences visible, use repr(string).

bash:
a="Hello"
a+=' World!' # single quotes keep '!' literal
${#a} # string length
${a:0:1} # returns 'H', a sub-string of length 1 from position 0
${a:6:5} # sub-string of length 5 from position 6, "World"
printf 'Hello%.0s' {1..3} # repeat 3×
printf "%d" \'A # print ASCII value of letter 'A'
printf \\$(printf '%03o' 65) # convert ASCII code 65 to a letter
s="jedna dva tři"; pole=( $s ) # split string into array items
${s/jedna/nula} # replace first occurrence
${s//a/B}       # replace all occurrences
echo 'měšťánek' | sed 's/[[:lower:]]*/\U&/' # lower-case to upper-case (GNU sed)
# Counting sub-strings is rather awkward
s="aaa;baaa;c;daaa;e"; s2=${s//[!aaa]/}; s3=${s2//aaa/a}; echo ${#s3}
echo 'hello world' | rev # print backwards
printf -v s "%03d %s" 10 'Test data' # save string '010 Test data' into $s
s=`printf "%03d %s" 10 'Test data'` # slower way to save a formatted string

Python:
a = 'Hello'
a += " World!"
len(a)
a[0]
a[6:6+5] # from position 6 up to (not including) position 11
'Hello' * 3 # the * operator repeats a string
ord('A')
chr(65)
pole = "jedna dva tři".split()
"jedna dva tři".replace('jedna', 'nula', 1)
"jedna dva tři".replace('a', 'B')
print U'měšťánek'.upper() # do not forget the prefix U for Unicode
# call count() in Python
s="aaa;baaa;c;daaa;e"; s.count('aaa')
'hello world'[::-1]
s = "%03d %s" % ( 10, 'Test data' )
s = "{0:03d} {1!s}".format( 10, 'Test data' ) # format string with format()
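The effect of raw strings and repr() mentioned above is easiest to see on string lengths. A minimal sketch, assuming Python 3, where all strings are Unicode and the u prefix is not needed:

```python
normal = "a\tb"       # \t is interpreted as a single tab character
raw = r"a\tb"         # raw string: the backslash and 't' stay two characters

print(len(normal))    # 3
print(len(raw))       # 4
print(repr(normal))   # shows the escape sequence instead of the tab itself

upper = "měšťánek".upper()   # non-ASCII letters are handled without a prefix
print(upper)
```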

Formatted output to terminal

Bash provides two builtins for output: echo and printf. The echo command accepts the parameter -e to interpret escape sequences and -n to suppress the trailing newline that it prints by default. The printf command is equivalent to the function of the same name in the C standard library. Because the behaviour and command line options of echo differ across UNIX platforms, it is better to use printf instead.

Python has print, and from version 3 on it is a function, so its arguments must be enclosed in parentheses. print can be used in a similar way to printf, but there are more options for formatting strings: the “%” operator or the method format(). Both return a formatted string which can then be passed to print/print().

bash:
a="Hello" b="World" # initialize variables on one line
echo $a$b  # output is HelloWorld
echo $a $b # output is Hello World
printf "%s %s\n" "$b" "$a" # output is World Hello
echo 'Error!' 1>&2 # redirect to standard error output (stderr)
p=("This" "is" "an" "array") # initialize array $p
# Redefining IFS changes the output delimiter; IFS must be restored afterwards
OLDIFS="$IFS"; IFS=, ; echo "${p[*]}"; IFS="$OLDIFS" # output is This,is,an,array

Python:
a="Hello"; b="World"
print a + b # use the operator + for string concatenation
print a, b  # values are separated by a space
print "%s %s" % (b, a)
print "{1} {0}".format(a, b) # same operation with format()
sys.stderr.write('Error!\n') # write() does not append a newline
p=["This", "is", "an", "array"]
# Use the method join() to concatenate the items with a delimiter
print ','.join(p)
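In Python 3 the same output operations use the print() function and its keyword arguments. The following sketch assumes Python 3; the variable names line_percent, line_format and line_joined are introduced here only for the illustration.

```python
import sys

a, b = "Hello", "World"
p = ["This", "is", "an", "array"]

line_percent = "%s %s" % (b, a)         # printf-style formatting
line_format = "{1} {0}".format(a, b)    # positional fields with format()
line_joined = ",".join(p)               # items joined by a delimiter

print(line_percent)
print(line_format)
print(a, b, sep="-")                    # sep replaces the default space
print(line_joined, end="\n")            # end replaces the default newline
print("Error!", file=sys.stderr)        # write to standard error
```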

Arrays

Unlike some other shells, bash supports one-dimensional arrays. Their usage is shown in the examples below. You can access each item of an array by its index, starting from zero, and you can also concatenate arrays, append new items at the end, or destroy arrays with the builtin unset.

Python has a data structure called list which you can use similarly to an array, but it also supports other methods (e.g. remove() removes an item by its content, index() returns the index of a given item, sort() sorts the list,...).

bash:
a=( Toje je pole "Prvek 3" 31337 ) # initialization, white space is the delimiter
set | grep ^a= # print all variables, but grep only $a; the output is:
a=([0]="Toje" [1]="je" [2]="pole" [3]="Prvek 3" [4]="31337") # another way of initialization
a[3]=${a[3]}", on fourth place" # change one item
echo ${a[*]} # print array
a+=666   # append string 666 to the first item
a+=(666) # append item 666 (an array with one item) to the end
for i in "${a[@]}"; do echo $i; done | sort # sort array (check 'man sort')
for i in "${a[@]}"; do echo $i; done | tac  # print backwards
echo ${#a[*]} # print number of items
echo ${a[*]//666/777} # replace all 666 by 777

Python:
a = [ 'Toje', 'je', 'pole', 'Prvek 3', 31337 ]
print a # print array
a[3]=a[3] + ", on fourth place"
print str(a).decode('unicode-escape') # use this to interpret UTF-8 escape sequences
a+=666 # ends with error 'int' object is not iterable
a[0]+="666" # append string to the first item
a.append(666)
print sorted(a) # print sorted list
a.sort() # sort items 'in-place'
print list(reversed(a)) # reversed() returns an iterator, so convert it to a list
a.reverse() # reverse items 'in-place'
print len(a)
[ str(b).replace('666','777') for b in a ] # converting int to str
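One difference worth knowing when moving these list examples to Python 3: a list that mixes strings and integers can no longer be sorted directly. A short sketch, assuming Python 3, with a sort key chosen here purely for illustration:

```python
a = ["Toje", "je", "pole", "Prvek 3", 31337]
a.append(666)                       # like bash a+=(666)

try:
    sorted(a)                       # Python 3 refuses to compare str with int
    mixed_sort_ok = True
except TypeError:
    mixed_sort_ok = False

as_text = sorted(a, key=str)        # compare every item by its string form
replaced = [str(item).replace("666", "777") for item in a]

print(mixed_sort_ok)
print(as_text)
print(len(a))
```

Sorting by key=str orders the items the way the bash pipeline through sort would in the C locale, since everything is compared as text.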

Note 1: There is a big syntax change in Python 3: print is a function, so its arguments must be enclosed in parentheses, for example print(len(a)). Python 3 also assumes UTF-8 source encoding by default, so no explicit encoding declaration is necessary.

Note 2: If you want to use national characters in Python 2.x code, you need to declare the encoding. Put the tag “# -*- coding: encoding_name -*-” in a comment on the first or second line:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

print 'To je ale nepříjemnost!'

You will get the following error message without the coding tag:

File "./skript.py", line 3
SyntaxError: Non-ASCII character '\xc5' in file ./skript.py on line 3, but no
encoding declared; see http://www.python.org/peps/pep-0263.html for details

Hash maps

Bash supports hashes (associative arrays) from version 4.0. They are indexed by string keys rather than integers. To declare an associative array, use the command declare -A name.

Python calls associative arrays dictionaries.

bash:
declare -A hash
hash['key']='value'
hash[other key]=777 # it is not necessary to quote the key
set | grep ^hash=   # print its definition
hash=([key]="value" ["other key"]="777" )
${!hash[*]} # all keys
${hash[*]}  # all values

Python:
hash = dict()
hash['key'] = 'value'
hash['other key']=777
print(hash) # print dictionary:
{'other key': 777, 'key': 'value'}
hash.keys()   # return list with keys
hash.values() # return list with values
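A few more everyday dictionary operations, sketched under the assumption of Python 3. The dictionary is called settings here instead of hash, an arbitrary choice that avoids shadowing Python's built-in function hash(); the key "port" and its default are invented for the example.

```python
settings = {}                         # same as dict()
settings["key"] = "value"
settings["other key"] = 777

for key, value in settings.items():   # iterate over key/value pairs
    print("%s -> %s" % (key, value))

port = settings.get("port", 22)       # get() returns a default for a missing key
has_key = "key" in settings           # membership test looks at the keys
del settings["other key"]             # remove one item
```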

Read and write files

Bash redirects the standard output (stdout) to a file with the operator “> file”. If the file does not exist and the user has write permission, it is created; if it already exists, it is truncated to zero length. To append data at the end instead, use the operator “>> file”. There are several ways to work with files, and the example below shows the more complex one of opening a new file descriptor.

Python works with files in a similar way to other programming languages. You must first open the file (open); then you can read bytes (read), single lines (readline) or the whole file as a list of lines (readlines), etc. You can also use the shell to redirect everything written to stdout into a file: ./skript.py > file.txt.

bash:
echo 'Hello World!' > file.txt
exec 3>soubor.txt # open the file for writing as descriptor 3
echo 'Hello World!' 1>&3 # redirect stdout to fd 3
sync # operating system command to flush buffers
exec 3>&- # close the descriptor

Python:
f = open('soubor.txt', 'w')    # modes r, w, rb, w+, r+
print>>f, 'Hello World!' # Python 2.x
print('Hello World!', file=f)  # Python 3.x
f.flush()                      # flush the file buffer
f.close()                      # close the file
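A common alternative to calling close() by hand is the with statement, which closes the file when its block ends. A minimal sketch, assuming Python 3; the file lives in a temporary directory so the example leaves nothing behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "soubor.txt")

    with open(path, "w") as f:        # the file is closed when the block ends
        f.write("Hello World!\n")
    with open(path, "a") as f:        # "a" appends, like the shell operator >>
        f.write("second line\n")
    with open(path) as f:             # the default mode is "r" (read text)
        lines = f.readlines()         # list of lines, newlines included

print(lines)
```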

Read and write pipes

Bash uses the vertical bar symbol “|”, the pipe, for connecting commands. The first command writes data to stdout, and the pipe connects that stdout to the standard input (stdin) of the next command, which processes the data and can send it further. This is how the many UNIX commands called filters work. If your script is to work as a filter, use the command read inside a while loop: it reads stdin line by line into a variable, which you can then process.

Python has functions for inter-process communication in the module subprocess. The module contains wrappers for starting processes and a class called Popen with methods for manipulating data streams. Popen.communicate() returns the tuple (stdoutdata, stderrdata), so Popen.communicate(...)[0] is the data from stdout. Popen.communicate() holds all the data in memory, so it is not recommended for huge or unbounded amounts of data.

bash:
ls | while IFS= read -r i # read line by line, keeping whitespace and backslashes
do
	echo "Input was: $i"
done

Python:
# Read output of command 'ls' into variable
filelist = subprocess.Popen("ls", shell=True, stdout=subprocess.PIPE).communicate()[0]
for i in filelist.splitlines(): # splitlines() avoids an empty item after the last newline
	print "Input was: " + i
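When the output may be large, one option is to read the pipe line by line instead of calling communicate(). The sketch below assumes Python 3.7 or newer (for the text argument); the child command is an arbitrary stand-in that prints two lines.

```python
import subprocess
import sys

# A child process that prints two lines; any command list would do here.
command = [sys.executable, "-c", "print('first'); print('second')"]

seen = []
with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
    for line in proc.stdout:          # yields lines as the child produces them
        seen.append("Input was: " + line.rstrip("\n"))
    # leaving the with-block closes the pipe and waits for the child to exit

print(seen)
```

Only one line is held in memory at a time, which mirrors the behaviour of the bash while read loop.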

Regular expressions

Bash supports extended regular expressions (regex) through the operator “=~” inside [[ ]]. You can also use standard utilities like grep and sed for text processing.

Python has the module re, which provides regular expressions including Perl-style extensions.

bash:
a="LinuxDays 2014"
regex='(.*)Days ([0-9]{4})' # better to store the regex in a variable, a space is problematic otherwise
if [[ $a =~ $regex  ]] ; then # if $a matches the regex do the following
        echo Input string: \"${BASH_REMATCH[0]}\"
        echo OS: \"${BASH_REMATCH[1]}\"
        echo Year: \"${BASH_REMATCH[2]}\"
fi

Python:
a = "LinuxDays 2014"
ro = re.compile('(.*)Days ([0-9]{4})') # creates a compiled regex object
result = ro.match(a) # returns a match object, or None if there is no match
if result: # if the match succeeded do the following
	print('Input string: "%s" ' % result.group(0) )
	print('OS: "%s" ' % result.group(1) )
	print('Year: "%s" ' % result.group(2) )
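Two of the Perl-style extensions in re are named groups and the distinction between match() and search(). The following sketch assumes Python 3 and uses a slightly modified pattern (letters only before "Days") so that the difference between the two calls becomes visible; the group names os and year are chosen for the illustration.

```python
import re

pattern = re.compile(r"(?P<os>[A-Za-z]+)Days (?P<year>[0-9]{4})")

match = pattern.match("LinuxDays 2014")        # match() anchors at the start
if match:
    os_name = match.group("os")                # groups can be read by name
    year = int(match.group("year"))
    print(os_name, year)

text = "Welcome to LinuxDays 2014"
found = pattern.search(text)                   # search() scans the whole string
not_at_start = pattern.match(text)             # None: text does not start with it
```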

Command line parameters

Bash passes positional parameters from the command line in variables named by their numerical positions: $1 is the first, $2 the second parameter, and so on up to $9. For higher positions, enclose the number in curly brackets: ${10}, ${11}, etc. To check the input parameters, use parameter expansion operators, e.g. A=${1:-something}, which checks whether $1 contains a value; when it is unset or empty, the value something is stored in $A. See a bash cheat sheet or the manual page for more.

Python supports command line parameters through the module sys. Positional parameters are accessed through the list sys.argv, indexed by position: sys.argv[1] is the first, sys.argv[2] the second, etc.

bash:
$0 # name the script was called with
$1 # first positional parameter
${128} # parameters above $9 must be in curly brackets
$# # number of command line parameters
for i in "$*"; do echo \"$i\"; done # "$*" expands to a single word "a b c"
for i in "$@"; do echo \"$i\"; done # "$@" expands to separate words "a" "b" "c"

Python:
sys.argv[0]
sys.argv[1]
sys.argv[128]
len(sys.argv) - 1 # sys.argv also contains the script name, $# does not count it
' '.join(sys.argv[1:]) # returns string of all parameters delimited by space
for i in sys.argv[1:] : print "'%s'" % i # process parameters one-by-one

Note: For more sophisticated processing of input parameters, use the external command getopt or the builtin getopts in bash, and getopt.getopt() in Python.
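A minimal sketch of getopt.getopt() in use, assuming Python 3. The options -v/--verbose and -o/--output, the default file name out.txt and the function name parse_args are all hypothetical and chosen only for this example.

```python
import getopt


def parse_args(argv):
    """Parse -v/--verbose and -o FILE/--output=FILE; return (settings, rest)."""
    settings = {"verbose": False, "output": "out.txt"}
    # "vo:" means -v takes no value and -o requires one
    opts, rest = getopt.getopt(argv, "vo:", ["verbose", "output="])
    for name, value in opts:
        if name in ("-v", "--verbose"):
            settings["verbose"] = True
        elif name in ("-o", "--output"):
            settings["output"] = value
    return settings, rest


print(parse_args(["-v", "-o", "report.txt", "file1", "file2"]))
```

In a real script the function would be called as parse_args(sys.argv[1:]), so that the script name in sys.argv[0] is skipped. The defaults in settings play the role of the bash idiom ${1:-something}.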