Python and the system
Python can be used to manage and to communicate with the operating system. The standard library contains functions for
- string processing
- time and date
- working with paths, directories and files
- networking, email
- interprocess and thread communication
- graphical user interface
- platform-specific services (Unix, MS Windows; older Python versions also shipped Mac, SGI, and SunOS modules)
and much more...
String processing
String objects (variables) offer a variety of methods (functions) that can be used to format and process the string. The description of the built-in methods can be found in the standard documentation about string methods.
Examples
Split
Returns a list of the words in the string S, using an optional separator as a delimiter string.
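A short sketch of how split behaves with and without a separator. The sample strings are made up for illustration.

```python
# Split on any run of whitespace (the default when no separator is given).
sentence = "Python can be used to manage the system"
words = sentence.split()
print(words)

# Split on an explicit separator, for example a comma-separated record.
record = "2009-07-12,12.5,ok"
fields = record.split(",")
print(fields)

# The optional second argument limits the number of splits.
key, value = "PATH=/usr/bin:/bin".split("=", 1)
print(key, value)
```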
Time and date
Example: profiling
A common task is to optimize the run time of a program. First you have to identify the bottlenecks. Simple profiling can be done with a timer that is read before and after the code of interest.
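A minimal timer sketch, assuming the standard time module is an acceptable stand-in for whatever timer the original example used. The function slow_sum is a hypothetical workload, not part of the lesson.

```python
import time

def slow_sum(n):
    """Hypothetical workload: sum the squares of the first n integers."""
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()          # high-resolution timer
result = slow_sum(100_000)
elapsed = time.perf_counter() - start

print("result: %d" % result)
print("elapsed time: %.6f s" % elapsed)
```

The measured time varies from run to run, so it is usually worth repeating the measurement a few times before drawing conclusions.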
More sophisticated profiling tools, such as the modules profile, cProfile and timeit, are described in the standard documentation.
Once you have found a bottleneck, the Python performance tips can help you to optimize your code.
Input output
This lesson deals with ways of reading and writing data in different formats.
Basic Python
The file object can be used for reading and writing plain text as well as unformatted binary data. A typical example writes a message to a file named out.txt, then reads the data back and prints it. The most important methods are:
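A sketch of writing and reading a text file. As an assumption made only for this example, out.txt is placed in a temporary directory so that nothing is left behind in the working directory. The message text is made up.

```python
import os
import tempfile

# A temporary directory keeps the example from touching the working directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "out.txt")

    # Write a message; the with-statement closes the file automatically.
    with open(path, "w") as f:
        f.write("Hello file!\nSecond line\n")

    # Read the complete file back as one string.
    with open(path) as f:
        data = f.read()

    # Read it again: first one line, then the remaining lines as a list.
    with open(path) as f:
        first = f.readline()
        rest = f.readlines()

print(data)
```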
write() writes a string to the file
read() reads the complete file
read(N) reads at most N characters (N bytes in binary mode)
readlines() reads the whole file and returns a list of lines, line breaks included
readline() reads only the next line
Pickle
The pickle module implements an algorithm for serializing and de-serializing a Python object structure. Pickling is the process whereby a Python object hierarchy is converted into a byte stream, and unpickling is the inverse operation, whereby a byte stream is converted back into an object hierarchy. In Python 2 the cPickle module is a much faster implementation and should be preferred. In Python 3 cPickle no longer exists as a separate module; pickle uses the fast C implementation automatically when it is available.
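A small round-trip sketch using pickle.dumps and pickle.loads. The dictionary measurement is a hypothetical object chosen only for illustration.

```python
import pickle

# Hypothetical nested object used only for illustration.
measurement = {"station": "A1", "values": [1.5, 2.5, 4.0], "valid": True}

# Pickling: object hierarchy -> byte stream.
stream = pickle.dumps(measurement)

# Unpickling: byte stream -> object hierarchy (a new, equal object).
restored = pickle.loads(stream)

print(type(stream).__name__)
print(restored == measurement)
```

Only unpickle data from sources you trust, because unpickling can execute arbitrary code.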
Comma Separated Values
The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets (Excel) and databases. The csv module reads and writes CSV files.
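A sketch of writing and reading CSV data with csv.writer and csv.reader. To keep it self-contained, an in-memory io.StringIO buffer stands in for a real file (an assumption of this example); the rows are invented.

```python
import csv
import io

rows = [["time", "value"], ["0.0", "1.5"], ["0.1", "2, approx."]]

# Write the rows; StringIO stands in for a file opened with newline="".
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows(rows)

# Read them back. The field containing a comma was quoted by the writer.
buffer.seek(0)
read_back = list(csv.reader(buffer))

print(buffer.getvalue())
print(read_back)
```

Note that the reader returns every field as a string; numbers have to be converted explicitly, for example with float().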
Pylab
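With pylab, numeric data stored in whitespace-separated columns can be read into an array with loadtxt (older versions of pylab provided load for this purpose):
    from pylab import loadtxt
    X = loadtxt('test.dat')  # data in two columns
    t = X[:,0]               # first column
    y = X[:,1]               # second column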
Interaction with the operating system
The modules sys and os provide the basic interface to the operating system. The module os creates a portable abstraction layer which is used by high-level modules like glob, socket, thread, time and fcntl.
Module sys
The module sys provides access to system-specific parameters and functions that are used or maintained by the interpreter.
Example argv
run system1.py parameter1 parameter2
['system1.py', 'parameter1', 'parameter2']
The script prints the command line arguments that are passed to it. argv[0] is the script name.
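The source of system1.py is not shown here, so the following is only a guess at what such a script might look like. The function name main is an assumption.

```python
import sys

def main(argv):
    # argv[0] is the script name, the remaining items are the parameters.
    print(argv)
    return argv[1:]

if __name__ == "__main__":
    main(sys.argv)
```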
A more sophisticated way of evaluating command line arguments is provided by the module argparse, which supersedes the older module optparse.
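A minimal option-parsing sketch. It uses argparse, the standard library's successor to optparse, rather than optparse itself. The option names (directory, --extension, --verbose) are hypothetical.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Demo of option parsing")
    parser.add_argument("directory", help="directory to process")
    parser.add_argument("-e", "--extension", default="ps",
                        help="file extension to look for")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra information")
    return parser

# An explicit list is parsed here; without it, parse_args() reads sys.argv[1:].
args = build_parser().parse_args(["/tmp", "-e", "pdf", "-v"])
print(args.directory, args.extension, args.verbose)
```

Calling the finished script with -h prints a help text that argparse generates from the descriptions above.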
Module os
The module os is a portable operating system interface.
Some examples:
os.system() Executes the command (a string) in a subshell
os.mkdir() Creates a directory
os.remove() Deletes a file
os.path.isdir() Test if directory
os.path.isfile() Test if file
os.path.exists() Test if file or directory exists
os.path.getsize() Size of a file
os.path.basename() Base name of pathname
os.walk() Directory tree generator
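The functions listed above can be tried out as follows. This is only a sketch: it works inside a temporary directory (an assumption made to keep the example harmless), and the file and directory names are invented.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    subdir = os.path.join(root, "data")
    os.mkdir(subdir)                          # create a directory

    path = os.path.join(subdir, "notes.txt")
    with open(path, "w") as f:
        f.write("12345")

    is_dir = os.path.isdir(subdir)
    is_file = os.path.isfile(path)
    size = os.path.getsize(path)              # size in bytes
    name = os.path.basename(path)

    # os.walk yields (directory, subdirectories, files) for each level.
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            found.append(os.path.join(dirpath, filename))

    os.remove(path)                           # delete the file
    still_there = os.path.exists(path)

print(is_dir, is_file, size, name, still_there)
```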
Module fnmatch
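The module fnmatch provides support for Unix shell-style wildcards such as *, ? and [seq]. It compares names against a pattern and does not access the file system itself.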
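A short sketch of wildcard matching with fnmatch.filter and fnmatch.fnmatch. The list of file names is made up.

```python
import fnmatch

names = ["report.pdf", "plot.ps", "data01.txt", "data02.txt", "README"]

# * matches any number of characters.
pdf_files = fnmatch.filter(names, "*.pdf")

# ? matches exactly one character.
data_files = fnmatch.filter(names, "data0?.txt")

# [ds] matches one character out of the given set.
is_ps = fnmatch.fnmatch("plot.ps", "*.p[ds]")

print(pdf_files, data_files, is_ps)
```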
Module glob
The module glob finds all the pathnames matching a specified pattern according to the rules used by the Unix shell.
The following example looks for all pdf files in the current working directory and converts them into postscript files.
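The original conversion script is not reproduced here, so this is a sketch under assumptions: it assumes the external tool pdf2ps (shipped with Ghostscript) does the conversion, and the function names and the dry_run switch are hypothetical. By default the commands are only printed, not executed.

```python
import glob
import os
import subprocess

def pdf_to_ps_commands(directory="."):
    """Return one conversion command per PDF file found in directory."""
    commands = []
    for pdf in sorted(glob.glob(os.path.join(directory, "*.pdf"))):
        ps = os.path.splitext(pdf)[0] + ".ps"
        # Assumption: the external tool pdf2ps (Ghostscript) is installed.
        commands.append(["pdf2ps", pdf, ps])
    return commands

def convert_all(directory=".", dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in pdf_to_ps_commands(directory):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)

convert_all(".")   # dry run: only prints the commands
```

Calling convert_all(".", dry_run=False) would actually run the conversions.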
Module shutil — High-level file operations
The shutil module offers a number of high-level operations on files and collections of files. In particular, functions are provided which support file copying and removal.
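A sketch of the most common shutil operations: copy, copytree, move and rmtree. It runs inside a temporary directory (an assumption of this example), and all file names are invented.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "original.txt")
    with open(source, "w") as f:
        f.write("some data\n")

    # Copy a single file (contents and permission bits).
    backup = shutil.copy(source, os.path.join(root, "backup.txt"))

    # Build a small directory tree, copy it, move the copy, remove the original.
    tree = os.path.join(root, "tree")
    os.mkdir(tree)
    shutil.copy(source, tree)
    copied_tree = shutil.copytree(tree, os.path.join(root, "tree_copy"))
    moved = shutil.move(copied_tree, os.path.join(root, "tree_moved"))
    shutil.rmtree(tree)

    remaining = sorted(os.listdir(root))

print(remaining)
```

Be careful with shutil.rmtree: it deletes a whole directory tree without asking.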
Unix Specific Services
Some features are unique to the Unix operating system, for example shell pipelines (data streams) that pipe the output of one program to another. The pipeline symbol is |. For example, the command ls -s | sort -rg pipes the output of ls -s to the sort program. The result is a list of filenames sorted by size.
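A Python pipeline to a Unix program can be established using the module pipes. Note that pipes is deprecated since Python 3.11 and was removed in Python 3.13; current code should use the module subprocess instead.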
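The sketch below builds such a pipeline with subprocess instead of pipes, since pipes is no longer available in recent Python versions. The helper name pipeline is hypothetical. The ls and sort call at the end is only run when both programs are found.

```python
import shutil
import subprocess

def pipeline(first, second):
    """Run `first | second` and return the text written by `second`."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second, stdin=p1.stdout,
                          stdout=subprocess.PIPE, text=True)
    p1.stdout.close()      # lets p1 notice if p2 exits early
    output = p2.communicate()[0]
    p1.wait()
    return output

# The shell pipeline from the text: ls -s | sort -rg
if shutil.which("ls") and shutil.which("sort"):
    print(pipeline(["ls", "-s"], ["sort", "-rg"]))
```

Each command is given as a list of arguments, so no shell is involved and file names with spaces need no quoting.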
System programming: walk example
The following script walks through a directory tree and looks for all files with the matching extension:
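The original listing is not reproduced here, so the following is a sketch of what walkdir.py might look like. The function names find_files and main are assumptions, as is the choice of python3 in the magic line.

```python
#!/usr/bin/env python3
"""walkdir.py: list all files below a directory that have a given extension."""
import os
import sys

def find_files(top, extension):
    """Return the paths of all files below top that end in .extension."""
    suffix = "." + extension.lstrip(".")
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        for filename in filenames:
            if filename.endswith(suffix):
                matches.append(os.path.join(dirpath, filename))
    return matches

def main(argv):
    # Expect exactly two parameters: the start directory and the extension.
    if len(argv) != 3:
        print("usage: %s directory extension" % argv[0])
        return
    for path in find_files(argv[1], argv[2]):
        print(path)

if __name__ == "__main__":
    main(sys.argv)
```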
Save the file as walkdir.py and use chmod +x walkdir.py to set the execution permissions of the file. The first line (the magic line, also called shebang) tells the shell to start the Python interpreter. The script can be executed in the bash shell using:
./walkdir.py $HOME/subdir ps
Without the magic line, the script has to be run like this:
python walkdir.py $HOME/sync/ ps
Or from within IPython, using the run command followed by the script name and the same arguments.
Exercise 1
Read in the following metadata file and extract the field ACQUISITION_DATE:
L5058011_01120090712_MTL.txt
Exercise 2
Dear User, according to our records you use too much disk space in your home directory. Please delete at least 862312312.29 Megabytes in the next 1.2 days. If you have any questions do not hesitate to send a mail to help-it@zmaw.de. best regards Central IT Services - ZMAW
Have you ever received such an email? If not, perfect! Otherwise you should look at least for the largest files.
- Write a function that walks through a directory tree and collects the pathnames and sizes of all files.
- Write a function that sorts the results of the first function according to the size of the files.
- Combine both functions in a script which can be called from the bash shell.
- Optional: limit the resulting list to the 10 largest files.
- Optional: automatically compress the largest files of a certain type.