When you launch a program from the shell, the way it is executed is determined by the first few bytes of the executable file. For interpreted languages the file starts with the token “#!”, called hashbang or shebang, followed by the path to the language interpreter.
bash | Python |
---|---|
#!/bin/bash
...shell commands
|
#!/usr/bin/env python
...python commands
|
The path /bin/bash is usual for bash, but Python may live in /usr/bin/python, /usr/local/bin/python or somewhere else. If you do not know the path to python, use the command env, which executes the command that follows it in a (possibly modified) environment and finds python in one of the system paths listed in the variable $PATH.
Do not forget to set the execute permission on your script: chmod +x ./myscript.sh
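The lookup that env performs, and the effect of the execute flag, can be imitated from Python 3 with shutil.which(). This is only a sketch under assumptions: it expects a POSIX system, and the temporary directory and the file name myscript.sh are made up for the illustration.

```python
import os
import shutil
import stat
import tempfile

# Create a throw-away directory holding a hypothetical script "myscript.sh".
workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "myscript.sh")
with open(script, "w") as f:
    f.write("#!/bin/bash\necho Hello\n")

# Clear the execute bits so the starting state is known.
os.chmod(script, stat.S_IRUSR | stat.S_IWUSR)
# Without the execute flag the file is not reported as a runnable command.
found_before = shutil.which("myscript.sh", path=workdir)

# Equivalent of: chmod +x ./myscript.sh
mode = os.stat(script).st_mode
os.chmod(script, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

# A search restricted to our directory now finds it, as env would via $PATH.
found_after = shutil.which("myscript.sh", path=workdir)

print(found_before)  # None
print(found_after)   # full path of the script

shutil.rmtree(workdir)
```

Passing path= keeps the example independent of the real $PATH; without it, shutil.which() searches the directories from the environment.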
bash | Python |
---|---|
bash -c "echo Hello World\!" |
python -c "print 'Hello World!'" |
Bash works with two groups of commands. The first group contains built-in commands and keywords (e.g. if, for, trap, return, export, declare,... see man bash); the second group contains user-defined functions and programs stored in the system paths of the operating system (ls, stat, date, chmod, chown,...). The presence and parameters of these programs are defined by the POSIX standard, but on GNU/Linux many of them offer additional functionality. System paths are listed in the environment variable $PATH, separated by colons. If you call a program that is stored somewhere else, you need to give an absolute or relative path to its file. If you want to load a file as a module of another shell script, use the command source or the dot “.” command.
Python has a core set of built-in functions and syntax; other functions are defined in modules of Python's standard library, which you make available with import. For system administrators the most useful modules include re, sys, os and subprocess, which are used in the examples here.
Names defined in a module are accessed with the module name as a prefix, e.g. sys.argv, os.mkdir("/tmp/temp"), etc. Alternatively, the statement from os import * puts all public names of the os module into the current namespace, so they can be used without the prefix.
bash | Python |
---|---|
#!/bin/bash
source /etc/functions
|
#!/usr/bin/env python
import re, sys, os, subprocess
|
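The difference between the prefixed form and the wildcard import can be seen in a short Python 3 sketch. The alias p and the dictionary namespace are hypothetical names chosen for the example; the wildcard import is executed into a separate dictionary only so that its effect can be inspected.

```python
import os                 # names are reached through the module prefix
from os import path as p  # import one name, here under a hypothetical alias "p"

prefixed = os.path.basename("/tmp/temp/file.txt")
direct = p.basename("/tmp/temp/file.txt")

# A wildcard import copies the public names into the importing namespace;
# a separate dict is used here so the result can be examined.
namespace = {}
exec("from os import *", namespace)

print(prefixed, direct)                 # file.txt file.txt
print("getcwd" in namespace)            # True: getcwd() no longer needs "os."
print(namespace["open"] is os.open)     # True: the built-in open() is shadowed
```

The last line shows why the wildcard form deserves some care: os exports its own low-level open(), which replaces the built-in of the same name.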
Bash separates commands by a new line or a semicolon. Compound statements have defined begin and end marks, e.g. if-fi, case-esac, for do-done,... Indentation is not required, but it improves code readability. A long command can be split across several lines by putting a backslash at the end of each line that continues.
Python also separates commands by new lines or semicolons, but blocks must be indented: the indentation itself defines what belongs to a block. You can indent with one or more spaces or tabs, as long as you do so consistently within a block. The most common indentation is 4 spaces.
bash | Python |
---|---|
for i in {0..10}
do
  echo $i
  echo $((i*2))
done
# the same loop on one line
for i in {0..10}; do echo $i; echo $((i*2)); done
|
for i in range(0,11):
    print i
    print i*2
# the same loop on one line
for i in range(0,11): print i; print i*2;
|
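One detail worth checking when translating such loops: bash's {0..10} includes both ends, while Python's range() excludes the stop value. The following Python 3 sketch illustrates this together with block indentation; the variable names are chosen for the example only.

```python
inclusive = list(range(0, 11))     # 0..10, the stop value 11 is excluded
doubled = [i * 2 for i in inclusive]

for i in inclusive[:3]:
    # the indented lines form the body of the loop
    print(i)
    print(i * 2)
print("done")                      # back at the outer level, runs once

print(inclusive[-1], doubled[-1])  # 10 20
```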
Bash has a very strict syntax for variable assignment: NAME=10, NAME="value"; no space is allowed between the variable name and the assignment operator. This has an advantage: if you need to find where a variable is changed, just grep for NAME[+]\?= (plain NAME= is often enough, but the longer regexp also covers the operator +=, which appends to a string or an array). To get the value of a variable, prefix it with a dollar sign, i.e. $NAME or ${NAME}. Variables can hold integers, strings, arrays and associative arrays (hashes).
Python assigns variables in the common way, with the operator “=”. You can also use the augmented assignment operators that are common in the C language, so a += 10 is the same as a = a + 10 (but there is no i++; use i += 1 instead). In bash, these operators work on numbers only in arithmetic context, e.g. inside (( )).
bash | Python |
---|---|
date +%Y%m%d%H%M                   # perform command
var=$(date +%Y%m%d%H%M)            # store the output of the command in a variable
# Redirect standard error output
var2=$(ls non_existent_file 2>&1)
|
subprocess.call(['date', '+%Y%m%d%H%M'])   # CLI parameters passed as a list
var = subprocess.check_output("date +%Y%m%d%H%M", shell=True)
# A non-zero return value raises CalledProcessError in check_output(); ending the command with 'exit 0' avoids it
var2 = subprocess.check_output("ls non_existent_file; exit 0", stderr=subprocess.STDOUT, shell=True)
|
Beware of a security issue when you pass the command as a string instead of a list of command line arguments and user input is used in it. A malicious user can end the command with a semicolon and inject their own executable code!
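The injection risk can be demonstrated harmlessly in Python 3. In this sketch the value of user_input is an invented example of hostile input, and the code assumes a POSIX system with the echo command available; shlex.quote() is shown as one possible mitigation when a shell string cannot be avoided.

```python
import shlex
import subprocess

# Hypothetical user input that tries to smuggle in a second command.
user_input = "file.txt; echo INJECTED"

# Unsafe: the string is handed to the shell, which treats ';' as a separator.
unsafe = subprocess.run("echo " + user_input, shell=True,
                        capture_output=True, text=True)

# Safer: arguments as a list, no shell involved, the input stays one argument.
safe = subprocess.run(["echo", user_input],
                      capture_output=True, text=True)

# If a shell string is unavoidable, quote the untrusted part first.
quoted = subprocess.run("echo " + shlex.quote(user_input), shell=True,
                        capture_output=True, text=True)

print(unsafe.stdout.splitlines())  # ['file.txt', 'INJECTED']
print(safe.stdout.splitlines())    # ['file.txt; echo INJECTED']
print(quoted.stdout.splitlines())  # ['file.txt; echo INJECTED']
```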
Bash distinguishes between 'single' and "double" quotes. Inside single quotes, special characters (e.g. !, $, \n, \t,...) and escape sequences are not interpreted. If you do not put a string into quotes, consecutive whitespace characters are collapsed into one.
Python does not care which kind of quotes you use. If you do not want escape sequences to be interpreted, use raw strings, e.g. r"\n", R'\n', or ur"\n", UR'\n' for Unicode in Python 2. To display a normal string with its escape sequences visible, use repr(string).
bash | Python |
---|---|
a="Hello"
a+=' World!'
${#a}                          # string length
${a:0:1}                       # returns 'H', sub-string of length 1 from position 0
${a:6:5}                       # from position 6, sub-string of length 5: "World"
printf 'Hello%.0s' {1..3}      # repeat 3×
printf "%d" \'A                # print ASCII value of letter 'A'
printf \\$(printf '%03o' 65)   # convert ASCII code 65 to letter
s="jedna dva tři"; pole=( $s ) # separate string into array items
${s/jedna/nula}                # replace first occurrence
${s//a/B}                      # replace all occurrences
echo 'měšťánek' | sed 's/[[:lower:]]*/\U&/'   # lower-case to upper-case
# Counting sub-strings is rather awkward
s="aaa;baaa;c;daaa;e"; s2=${s//[!aaa]/}; s3=${s2//aaa/a}; echo ${#s3}
echo 'hello world' | rev       # print backwards
printf -v s "%03d %s" 10 'Test data'   # save string '010 Test data' into $s
s=`printf "%03d %s" 10 'Test data'`    # slower way to save a formatted string
|
a = 'Hello'
a += " World!"
len(a)
a[0]
a[6:6+5]                       # from position 6 up to, but not including, position 11
'Hello' * 3                    # you can use it directly in Python
ord('A')
chr(65)
pole = "jedna dva tři".split()
"jedna dva tři".replace('jedna', 'nula', 1)
"jedna dva tři".replace('a', 'B')
print U'měšťánek'.upper()      # do not forget to use prefix U for Unicode
# call count() in Python
s="aaa;baaa;c;daaa;e"; s.count('aaa')
'hello world'[::-1]
s = "%03d %s" % ( 10, 'Test data' )
s = "{0:03d} {1!s}".format( 10, 'Test data' )   # format string with format()
|
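The behaviour of quotes, raw strings and repr() described above can be checked with a few lines of Python 3. This is a small illustration only; the sample strings are taken from the table or made up for the purpose.

```python
newline = "\n"        # escape sequence is interpreted: one character
raw = r"\n"           # raw string: a backslash and the letter n, two characters

print(len(newline), len(raw))    # 1 2
print(repr("tab\there"))         # 'tab\there' with the escape sequence visible
print('single' == "single")      # True: the quote style does not matter

s = "aaa;baaa;c;daaa;e"
print(s.count("aaa"))            # 3
print("hello world"[::-1])       # dlrow olleh
```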
Bash prints output with the built-in commands echo and printf. The echo command accepts the parameter -e to interpret escape sequences and -n to suppress the new-line character that is printed by default. The printf command is the equivalent of the function of the same name in the C standard library. Because the behaviour and command line options of echo differ between UNIX platforms, it is better to use printf.
Python has print, and from version 3 on it is a function, so its parameters must be enclosed in parentheses. print can be used in a similar way as printf, but there are more options for formatting strings: a format string can be combined with data by the operator “%”, or the method format() can be used. Both return a formatted string, which can then be passed to print/print().
bash | Python |
---|---|
a="Hello" b="World"            # initialize variables on one line
echo $a$b                      # output is HelloWorld
echo $a $b                     # output is Hello World
printf "%s %s\n" $b $a         # output is World Hello
echo 'Error!' 1>&2             # redirect to standard error output (stderr)
p=("This" "is" "an" "array")   # initialize array $p
# Redefining IFS changes the output delimiter; IFS must be restored afterwards
OLDIFS="$IFS"; IFS=, ; echo "${p[*]}"; IFS="$OLDIFS"   # output is This,is,an,array
|
a="Hello"; b="World"
print a + b                    # use operator for string concatenation
print a, b                     # variables will be delimited by white space
print "%s %s" % (b, a)
print "{1} {0}".format(a, b)   # same operation with format()
sys.stderr.write('Error!')
p=["This", "is", "an", "array"]
# Use method join() to iterate array and print it with delimiter
print ','.join(p)
|
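For comparison, the same output operations can be written for Python 3, where print() is a function with the keyword parameters sep, end and file. The sketch below is an illustration rather than a complete reference; the in-memory buffer buf is used only so that the result can be inspected.

```python
import io
import sys

a, b = "Hello", "World"

line1 = "%s %s" % (b, a)            # printf-style operator
line2 = "{1} {0}".format(a, b)      # format() with positional indexes
line3 = ",".join(["This", "is", "an", "array"])

print(line1)                        # World Hello
print(line2)                        # World Hello
print(line3)                        # This,is,an,array
print(a, b, sep="")                 # HelloWorld, like echo $a$b
print("no newline", end="")         # like echo -n
print()
print("Error!", file=sys.stderr)    # like echo 'Error!' 1>&2

# print() can write into any file-like object, here an in-memory buffer.
buf = io.StringIO()
print(a, b, sep="-", file=buf)
```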
Unlike the plain POSIX shell, bash supports one-dimensional arrays. Their usage is shown in the examples below. You access items by indexes starting from zero, and you can also concatenate arrays, add new items at the end, or destroy an array with the built-in unset.
Python has a data structure called list, which you can use similarly to an array, but it also supports further methods (e.g. remove() removes an item by its content, index() returns the index of an item by its content, sort() sorts the list,...).
bash | Python |
---|---|
a=( Toje je pole "Prvek 3" 31337 )   # initialization, white space is the delimiter
set | grep ^a=                       # print all variables, but grep only $a; the output follows:
a=([0]="Toje" [1]="je" [2]="pole" [3]="Prvek 3" [4]="31337")   # another way of initialization
a[3]=${a[3]}", on fourth place"      # change one item
echo ${a[*]}                         # print array
a+=666                               # add string 666 to first item
a+=(666)                             # add item 666 (array with one item) to the end
for i in "${a[@]}"; do echo $i; done | sort   # sort array (check 'man sort')
for i in "${a[@]}"; do echo $i; done | tac    # print backwards
echo ${#a[*]}                        # print number of items
echo ${a[*]//666/777}                # replace all 666 by 777
|
a = [ 'Toje', 'je', 'pole', 'Prvek 3', 31337 ]
print a                              # print array
a[3]=a[3] + ", on fourth place"
print str(a).decode('unicode-escape')   # this way to interpret UTF-8 sequences
a+=666                               # ends with error 'int' object is not iterable
a[0]+="666"                          # add string to first item
a.append(666)
print sorted(a)                      # print sorted list
a.sort()                             # sort items 'in-place'
print list(reversed(a))              # reversed() returns an iterator, so convert it to a list
a.reverse()                          # reverse items 'in-place'
print len(a)
[ str(b).replace('666','777') for b in a ]   # changing type int to str
|
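The list methods mentioned above (remove(), index(), sorting) can be tried in a short Python 3 sketch. The list contains only strings here, because Python 3 refuses to sort a list that mixes strings and integers; the variable names are chosen for the example.

```python
items = ["Toje", "je", "pole", "Prvek 3"]

items.append("666")               # add one item at the end
items += ["777", "888"]           # concatenate another list
position = items.index("pole")    # index of the first matching item
items.remove("je")                # remove the first item equal to "je"
del items[0]                      # remove an item by its index

ordered = sorted(items)           # new sorted list, the original is untouched
backwards = items[::-1]           # reversed copy

print(position)    # 2
print(items)       # ['pole', 'Prvek 3', '666', '777', '888']
print(ordered)     # ['666', '777', '888', 'Prvek 3', 'pole']
print(backwards)   # ['888', '777', '666', 'Prvek 3', 'pole']
```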
Note 1: There is a big syntax change in Python 3: print is a function and its parameters must be enclosed in parentheses, so you need to write, for example, print(len(a)). Python 3 also treats source code as UTF-8 and strings as Unicode by default, so no explicit encoding declaration is necessary.
Note 2: If you want to use national characters in Python 2.x source code, you need to declare the encoding. Put the tag “# -*- coding: encoding_name -*-” in a comment on the first or second line:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
print 'To je ale nepříjemnost!'
You will get the following error message without the coding tag:
File "./skript.py", line 3
SyntaxError: Non-ASCII character '\xc5' in file ./pok2.py on line 3, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
Bash supports hashes (associative arrays) from version 4.0. They are indexed not by integers but by keys. To declare an associative array, use the command declare -A name.
Python calls associative arrays dictionaries.
bash | Python |
---|---|
declare -A hash
hash['key']='value'
hash[other key]=777            # it is not necessary to quote the key
set | grep ^hash=              # print its definition
hash=([key]="value" ["other key"]="777" )
${!hash[*]}                    # all keys
${hash[*]}                     # all values
|
hash = dict()
hash['key'] = 'value'
hash['other key']=777
print(hash)                    # print dictionary: {'other key': 777, 'key': 'value'}
hash.keys()                    # return list with keys
hash.values()                  # return list with values
|
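A few more everyday dictionary operations, sketched for Python 3. The dictionary is called table here instead of hash, because the name hash would shadow Python's built-in function hash(); the keys and values are illustrative.

```python
table = {"key": "value", "other key": 777}

table["third"] = [1, 2, 3]          # values can be of any type
missing = table.get("nope", "n/a")  # default value instead of a KeyError
has_key = "key" in table            # membership test on keys

for name, value in table.items():   # iterate over key/value pairs
    print("%s => %s" % (name, value))

del table["other key"]              # remove one entry
print(sorted(table))                # ['key', 'third']
```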
Bash redirects the standard output (stdout) to a file with the operator “> file”. If the file does not exist and the user has write permission, it is created; if it already exists, it is truncated to zero length. To add data at the end, use the operator “>> file”. There are several options to read data from a file, and the example below shows a more complex technique: opening a new file descriptor.
Python works with files in a similar way to other programming languages. You first open a file (open), then you can read bytes (read), a single line (readline) or all lines into a list (readlines), etc. You can also use the shell to redirect everything written to stdout into a file: ./skript.py > file.txt.
bash | Python |
---|---|
echo 'Hello World!' > file.txt
exec 3>soubor.txt              # file for writing by descriptor 3
echo 'Hello World!' 1>&3       # stdout redirects to fd 3
sync                           # operating system command to flush buffers
exec 3>&-                      # closing descriptor
|
f = open('soubor.txt', 'w')    # options r, w, rb, w+, r+
print>>f, 'Hello World!'       # Python 2.x
print('Hello World!', file=f)  # Python 3.x
f.flush()                      # flush the file buffer
f.close()                      # closing file
|
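In current Python the with statement is a common way to make sure a file is closed even when an error occurs. The following Python 3 sketch shows writing, appending and reading; the temporary directory and the file name example.txt are assumptions made so that the example does not touch real files.

```python
import os
import tempfile

# A temporary directory keeps the example away from real files.
workdir = tempfile.mkdtemp()
filename = os.path.join(workdir, "example.txt")

with open(filename, "w") as f:      # like: > file (create or truncate)
    f.write("Hello World!\n")

with open(filename, "a") as f:      # like: >> file (append)
    f.write("second line\n")

with open(filename) as f:           # the default mode is "r"
    lines = f.read().splitlines()

print(lines)   # ['Hello World!', 'second line']

os.remove(filename)
os.rmdir(workdir)
```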
Bash uses the vertical bar symbol “|”, the pipe, to connect commands. The first command writes data to stdout, and the pipe redirects this stdout to the standard input (stdin) of the next command, which processes the data and can send them further. This is how many UNIX commands, called filters, normally work. If your script is to work as a filter, use the command read inside a while loop: it reads stdin line by line into a variable, and you can then process the variable.
Python has functions for inter-process communication in the module subprocess. The module contains convenience functions for running commands and also a class called Popen with methods for manipulating data streams. Popen.communicate() returns a tuple (stdoutdata, stderrdata), so Popen.communicate(...)[0] holds the data from stdout. Because communicate() stores all the data in memory, it is not recommended for huge or unbounded amounts of data.
bash | Python |
---|---|
ls | while read i # read line by line
do
echo "Input was: $i"
done
|
# Read output of command 'ls' into variable
filelist = subprocess.Popen("ls", shell=True, stdout=subprocess.PIPE).communicate()[0]
for i in filelist.splitlines():
    print "Input was: " + i
|
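When the amount of data is large, the output of a child process can be read line by line instead of through communicate(). The following Python 3 sketch shows one way to do it; the child process is a Python one-liner standing in for a real command such as ls, which is an assumption made to keep the example portable.

```python
import subprocess
import sys

# The child is a small Python one-liner standing in for any filter or command;
# sys.executable avoids assuming a particular interpreter path.
child_code = "for n in range(3): print('line', n)"
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, text=True)

received = []
for line in proc.stdout:            # read line by line as the data arrives
    received.append("Input was: " + line.rstrip("\n"))

proc.stdout.close()
returncode = proc.wait()            # wait for the child and get its exit status

print(received)    # ['Input was: line 0', 'Input was: line 1', 'Input was: line 2']
print(returncode)  # 0
```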
Bash supports extended regular expressions (regex) through the operator “=~” inside [[ ]]. You can also use standard utilities such as grep and sed for text processing.
Python has the module re, which provides regular expressions similar to the extended ones and also supports Perl extensions.
bash | Python |
---|---|
a="LinuxDays 2014"
regex='(.*)Days ([0-9]{4})'          # better store the regex in a variable, spaces are problematic
if [[ $a =~ $regex ]] ; then         # if $a matches the regex, do the following
  echo Input string: \"${BASH_REMATCH[0]}\"
  echo OS: \"${BASH_REMATCH[1]}\"
  echo Year: \"${BASH_REMATCH[2]}\"
fi
|
a = "LinuxDays 2014"
ro = re.compile('(.*)Days ([0-9]{4})')   # creates new regex object
result = ro.match(a)
if result :                              # if a match object exists, do the following
    print('Input string: "%s" ' % result.group(0) )
    print('OS: "%s" ' % result.group(1) )
    print('Year: "%s" ' % result.group(2) )
|
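One point that often surprises people coming from bash: re.match() only matches at the beginning of the string, whereas bash's =~ finds a match anywhere, which corresponds to re.search(). The Python 3 sketch below illustrates the difference and also uses named groups, one of the extensions the re module offers; the group names os and year are chosen for the example.

```python
import re

text = "LinuxDays 2014"
pattern = re.compile(r"(?P<os>.*)Days (?P<year>[0-9]{4})")

result = pattern.match(text)        # match() is anchored at the start of the string
print(result.group(0))              # LinuxDays 2014
print(result.group("os"))           # Linux
print(result.group("year"))         # 2014

year_only = re.compile(r"[0-9]{4}")
print(year_only.match(text))              # None: the string does not start with digits
print(year_only.search(text).group(0))    # 2014: search() scans the whole string
print(year_only.sub("2015", text))        # LinuxDays 2015
```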
Bash passes positional parameters from the command line in variables named by their position: $1 is the first parameter, $2 the second, and so on up to $9. For higher numbers, enclose the number in curly brackets: ${10} … ${255}. To check input parameters, use the special variable operators, e.g. A=${1:-something} checks whether $1 contains a value; when it is unset or empty, the value something is stored in $A. Please check a bash cheatsheet or the manual page.
Python supports command line parameters through the module sys. Positional parameters are accessed through the list sys.argv, indexed by position: sys.argv[1] is the first, sys.argv[2] the second, etc.
bash | Python |
---|---|
$0                             # name under which the script was called
$1                             # first positional parameter
${128}                         # parameters > $9 must be in curly brackets
$#                             # number of command line parameters
for i in "$*"; do echo \"$i\"; done   # "$*" expands parameters as "a b c"
for i in "$@"; do echo \"$i\"; done   # "$@" expands parameters as "a" "b" "c"
|
sys.argv[0]
sys.argv[1]
sys.argv[128]
len(sys.argv) - 1              # sys.argv also counts the script name
' '.join(sys.argv[1:])         # returns string of all parameters delimited by space
for i in sys.argv[1:] : print "'%s'" % i   # process parameters one-by-one
|
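Python has no direct counterpart of the bash expansion ${1:-something}, but a small helper can imitate it. The function arg_or_default below is a hypothetical name introduced for this Python 3 sketch, and a literal list is used in place of sys.argv so that the result does not depend on how the example is started.

```python
def arg_or_default(argv, position, default):
    """Return argv[position], or default when it is missing or empty."""
    if position < len(argv) and argv[position] != "":
        return argv[position]
    return default

# In a real script this list would be sys.argv.
argv = ["./script.py", "first", ""]

print(arg_or_default(argv, 1, "something"))   # first
print(arg_or_default(argv, 2, "something"))   # something (empty parameter)
print(arg_or_default(argv, 3, "something"))   # something (missing parameter)
print(len(argv) - 1)                          # 2, the counterpart of $#
```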
Note: For more sophisticated processing of input parameters, use the bash built-in getopts or the external command getopt; in Python use getopt.getopt().
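A minimal Python 3 sketch of getopt.getopt() follows. The options -v and -o and the file names are invented for the illustration, and a literal list stands in for sys.argv[1:].

```python
import getopt

# Stand-in for sys.argv[1:]; the options and file names are invented.
args = ["-v", "-o", "out.txt", "file1", "file2"]

# "vo:" means: -v takes no value, -o requires one (marked by the colon).
options, rest = getopt.getopt(args, "vo:")

verbose = False
output = None
for name, value in options:
    if name == "-v":
        verbose = True
    elif name == "-o":
        output = value

print(options)   # [('-v', ''), ('-o', 'out.txt')]
print(rest)      # ['file1', 'file2']
```

An unknown option makes getopt.getopt() raise getopt.GetoptError, which a real script would catch in order to print a usage message.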