Size: 8198
Comment:
|
Size: 10095
Comment:
|
Deletions are marked like this. | Additions are marked like this. |
Line 5: | Line 5: |
Line 9: | Line 8: |
Line 13: | Line 11: |
The [[http://docs.python.org/lib/bltin-file-objects.html|file object]] can be used for reading and writing plain text as well as unformatted binary data. The following code writes a message in the file with the name ''out.txt'', reads and print the data {{{#!python |
The [[http://docs.python.org/lib/bltin-file-objects.html|file object]] can be used for reading and writing plain text as well as unformatted binary data. The following code writes a message in the file with the name ''out.txt'', reads and print the data {{{ #!python |
Line 21: | Line 19: |
Line 29: | Line 26: |
Line 32: | Line 28: |
{{{#!python | {{{ #!python |
Line 39: | Line 37: |
The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets (Excel) and databases. The [[http://docs.python.org/lib/module-csv.html|csv module]] enables CSV file reading and writing |
The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets (Excel) and databases. The [[http://docs.python.org/lib/module-csv.html|csv module]] enables CSV file reading and writing |
Line 43: | Line 40: |
Line 45: | Line 41: |
*{{{fromfile(...)}}} *{{{.tofile(...)}}} |
* {{{fromfile(...)}}} * {{{.tofile(...)}}} = Scientific data format module matrix = ||<tablewidth="200px">Dataformat ||[[http://pysclint.sourceforge.net/pyhdf/|pyhdf]] ||[[http://www.gdal.org/|GDAL]] ||[[http://www.pyngl.ucar.edu/|PyNIO]] ||[[http://www.pytables.org/moin|PyTables]] || ||Netcdf ||read ||read/write ||read/write || || ||HDF4 ||read/write ||read/write ||read/write || || ||HDF5 || ||read/write || ||read/write || ||GRIB || ||read ||read/write || || |
Line 51: | Line 54: |
NASA's standard file format, the [[http://hdf.ncsa.uiuc.edu/index.html||Hierarchical Data Format (HDF)]] is a self-describing data format. HDF files can contain binary data and allow direct access to parts of the file without first parsing the entire contents. |
NASA's standard file format, developed by NCSA. The [[http://hdf.ncsa.uiuc.edu/index.html|HDF format]] is a self-describing data format. HDF files can contain binary data and allow direct access to parts of the file without first parsing the entire contents. |
Line 57: | Line 59: |
Line 58: | Line 61: |
[[http://pysclint.sourceforge.net/pyhdf/|pyhdf]] is a python interface to the NCSA HDF4 library. | [[http://pysclint.sourceforge.net/pyhdf/|pyhdf]] is a simple python interface to the NCSA HDF4 library. |
Line 61: | Line 64: |
{{{#!python |
{{{ #!python |
Line 76: | Line 81: |
Line 78: | Line 82: |
Line 80: | Line 84: |
Line 84: | Line 89: |
Line 89: | Line 93: |
{{{#!python | {{{ #!python |
Line 111: | Line 117: |
resources.mpFillOn = True | resources.mpFillOn = True |
Line 137: | Line 143: |
Line 141: | Line 145: |
The September mean sea ice concentration values derived from the [[http://nsidc.org/data/docs/daac/ssmi_instrument.gd.html|Special Sensor Microwave Imager (SSM/I)]] are stored in the netCDF file [[ftp://ftp.ifremer.fr/ifremer/cersat/products/gridded/psi-concentration/data/arctic/climatology/netcdf/climatology_09.nc.Z|climatology_09.nc]] . Compressed data with the extension {{{.Z}}} or {{{.gz}}} can be uncompressed using {{{uncompress}}} or {{{gunzip}}}, respectively. A description of the data and the algorithm can be found [[ftp://ftp.ifremer.fr/ifremer/cersat/products/gridded/psi-drift/documentation/ssmi.pdf|here]]. |
The September mean sea ice concentration values derived from the [[http://nsidc.org/data/docs/daac/ssmi_instrument.gd.html|Special Sensor Microwave Imager (SSM/I)]] are stored in the netCDF file [[ftp://ftp.ifremer.fr/ifremer/cersat/products/gridded/psi-concentration/data/arctic/climatology/netcdf/climatology_09.nc.Z|climatology_09.nc]] . Compressed data with the extension {{{.Z}}} or {{{.gz}}} can be uncompressed using {{{uncompress}}} or {{{gunzip}}}, respectively. A description of the data and the algorithm can be found [[ftp://ftp.ifremer.fr/ifremer/cersat/products/gridded/psi-drift/documentation/ssmi.pdf|here]]. |
Line 145: | Line 148: |
Line 152: | Line 154: |
== Envisat ASAR == | |
Line 153: | Line 156: |
{{{#!python | {{{ #!python |
Line 171: | Line 177: |
Line 173: | Line 178: |
== Landsat 7 == {{{ #!python import sys,os home=os.getenv('HOME') import gdal,glob from pylab import * def sensor_params(): Freqs={'B10':(0.45,0.52),'B20':(0.52,0.60),'B30':(0.63,0.69),'B40':(0.77,0.90),'B50':(1.55,1.75),'B70':(2.09,2.35),'B61':(10.4,12.5),'B62':(10.4,12.5),'B80':(0.52,0.90)} Resolution={'B10':1,'B20':1,'B30':1,'B40':1,'B50':1,'B70':1,'B61':0.5,'B62':0.5,'B80':2} # Relative to 30m return {'f':Freqs,'r':Resolution} def read_ls7(datadir,crop): SP=sensor_params() files=glob.glob(datadir+'*.TIF') B={} for fn in files: band=fn[-7:-4] C=((crop*SP['r'][band]).astype(int)) print 'Reading Band ',band,'Size Position',C B[band]=gdal.Open(fn).ReadAsArray(xsize=C[0],ysize=C[1],xoff=C[2],yoff=C[3]) return B def display_rgb(B,**kw): RGB=array([B['B30'],B['B20'],B['B10']]).transpose() if kw.has_key('crop'): C=kw['crop']#extent is (left, right, bottom, top) data values of the axes extent=(C[2],C[2]+C[0],C[3],C[3]+C[1]) imshow(RGB,origin='lower',interpolation='nearest',extent=extent) else: imshow(RGB,origin='lower',interpolation='nearest') show() return datadir=home+'/data/landsat/' crop=array([600,600,2600,4700]) # Crop region xsize,ysize,xoff,yoff B=read_ls7(datadir,crop) display_rgb(B,crop=crop) savefig('landsat7.png',dpi=100) }}} {{attachment:landsat7.png}} |
Contents
Reading different data formats
This lesson deals with the ways of reading and writing data in different formats
Basic Python
The file object can be used for reading and writing plain text as well as unformatted binary data. The following code writes a message in the file with the name out.txt, reads and print the data
write() writes a string to the file
read() reads complete file
read(N) reads N bytes
readlines() reads the file with linebreaks
readline() reads only the next line
Pickle
The pickle module implements an algorithm for serializing and de-serializing a Python object structure. Pickling is the process whereby a Python object hierarchy is converted into a byte stream, and unpickling is the inverse operation, whereby a byte stream is converted back into an object hierarchy. The cPickle module is a much faster implementation and should be preferred.
Comma Separated Values
The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets (Excel) and databases. The csv module enables CSV file reading and writing
NumPy/SciPy
Arrays can be read from a file and written to a file using the function and method:
fromfile(...)
.tofile(...)
Scientific data format module matrix
Dataformat |
||||
Netcdf |
read |
read/write |
read/write |
|
HDF4 |
read/write |
read/write |
read/write |
|
HDF5 |
|
read/write |
|
read/write |
GRIB |
|
read |
read/write |
|
HDF
NASA's standard file format, developed by NCSA. The HDF format is a self-describing data format. HDF files can contain binary data and allow direct access to parts of the file without first parsing the entire contents.
The HDF versions 4 and 5 are not compatible.
Different modules are available for reading and writing HDF files
HDF4 pyhdf
pyhdf is a simple python interface to the NCSA HDF4 library.
The following example demonstrates how to read level-3 data from the Multi-angle Imaging Spectral Radiometer (MISR) on the Terra Satellite
1 from scipy import array
2 from pylab import imshow,colorbar,title,savefig
3 from pyhdf.SD import SD
4
5 f=SD('MISR_AM1_CGLS_MAY_2007_F04_0025.hdf')
6 print f.datasets().keys()
7 data=array(f.select('NDVI average').get())
8 data[data<0]=0
9
10 imshow(data,interpolation='nearest',cmap=cm.YlGn)
11 colorbar(shrink=0.5)
12 title('Normalized Difference Vegetation Index')
Line 5 opens the HDF file object. line 6 prints the keywords of the included datasets. From this one can identify the keyword for the desired parameter. Line 7 reads the data in a SciPy array. Line 8 selects the negative (bad and missing) data and sets them to zero.
HDF5
PyTables is a package for managing hierarchical datasets and designed to efficiently and easily cope with extremely large amounts of data.
netCDF
PyNIO is a Python package that allows read and/or write access to a variety of data formats using an interface modelled on netCDF.
The following example demonstrates how to use the PyNGL module to read and display sea ice concentration data.
1 import os
2 from scipy import array,arange,nan
3 from PyNGL import Ngl
4 from PyNGL import Nio
5
6 def Ngl_map(C,lat,lon,psfile):
7 rlist = Ngl.Resources()
8 rlist.wkColorMap = 'posneg_1'
9 wks_type = "ps"
10 wks = Ngl.open_wks(wks_type,psfile,rlist)
11 resources = Ngl.Resources()
12 resources.sfXArray = lon[:,:]
13 resources.sfYArray = lat[:,:]
14 resources.mpProjection = "Stereographic"
15 resources.mpDataBaseVersion = "MediumRes"
16 resources.mpLimitMode = "LatLon"
17 resources.mpMinLonF = 0
18 resources.mpMaxLonF = 360
19 resources.mpMinLatF = 65
20 resources.mpMaxLatF = 90
21 resources.mpCenterLonF = 0.
22 resources.mpFillOn = True
23 igray = Ngl.new_color(wks,0.7,0.7,0.7)
24 resources.mpFillColors = [0,-1,igray,-1]
25 resources.cnLineDrawOrder = "Predraw"
26 resources.cnFillOn = True
27 resources.cnFillDrawOrder = "Predraw"
28 resources.cnLineLabelsOn = False
29 resources.nglSpreadColorStart = 8
30 resources.nglSpreadColorEnd = -2
31 resources.cnLevelSelectionMode = "ExplicitLevels" # Define own levels.
32 resources.cnLevels = arange(0.,100,10)
33 resources.lbTitleString = 'Concentration [%]'
34 resources.lbOrientation = "Horizontal"
35 resources.cnFillMode = "RasterFill"
36 resources.cnLinesOn = False
37 resources.tiMainString = "~F22~Arctic Sea Ice Coverage~C~~F21~September average from SSM/I"
38 map = Ngl.contour_map(wks,C[:,:],resources)
39
40 grid = Nio.open_file('grid_north_12km.nc')
41 lat=array(grid.variables['latitude'])
42 lon=array(grid.variables['longitude'])
43 nc = Nio.open_file('climatology_09.nc')
44 C=array(nc.variables['concentration'])[0,:,:].astype(float)
45 Ngl_map(C,lat,lon,'map')
46 os.system('gv map.ps &')
The data files can be downloaded from the ftp server of the Center for Satellite Exploitation and Research (CERSAT) which is one of the major world data centers for oceanography.
The September mean sea ice concentration values derived from the Special Sensor Microwave Imager (SSM/I) are stored in the netCDF file climatology_09.nc . Compressed data with the extension .Z or .gz can be uncompressed using uncompress or gunzip, respectively. A description of the data and the algorithm can be found here.
The main program that reads the data starts in line 40. At first the coordinates of the corresponding pixels are read into the variables lat and lon. The ice concentration data are read in the variable C. The function Ngl_map creates a postscript output in the file map.ps which is displayed using the command gv.
Various Satellite data formats
The Geospatial Data Abstraction Library (GDAL) has a single abstract data model for all supported formats
Envisat ASAR
The following code demonstrates how to read in an Envisat ASAR image using the gdal module.
1 import gdal, struct
2 from scipy import array,empty,uint16
3 from pylab import imshow
4
5 filename='/pf/u/u242023/ASA_IMM_1PNPDK20080410_095538_000001462067_00337_31954_4832.N1'
6
7 f=gdal.Open(filename)
8 a=f.GetRasterBand(1)
9 img=empty((a.YSize,a.XSize),dtype=uint16)
10
11
12 for yi in xrange(a.YSize):
13 scanline=struct.unpack("H"*a.XSize,a.ReadRaster(0,yi,a.XSize,1,a.XSize,1,gdal.GDT_UInt16))
14 img[yi,:]=array(scanline).astype(uint16)
15
16 imshow(img,vmin=0,vmax=3000,interpolation='nearest',origin='lower')
Landsat 7
1 import sys,os
2 home=os.getenv('HOME')
3 import gdal,glob
4 from pylab import *
5
6 def sensor_params():
7 Freqs={'B10':(0.45,0.52),'B20':(0.52,0.60),'B30':(0.63,0.69),'B40':(0.77,0.90),'B50':(1.55,1.75),'B70':(2.09,2.35),'B61':(10.4,12.5),'B62':(10.4,12.5),'B80':(0.52,0.90)}
8 Resolution={'B10':1,'B20':1,'B30':1,'B40':1,'B50':1,'B70':1,'B61':0.5,'B62':0.5,'B80':2} # Relative to 30m
9 return {'f':Freqs,'r':Resolution}
10
11
12 def read_ls7(datadir,crop):
13 SP=sensor_params()
14 files=glob.glob(datadir+'*.TIF')
15 B={}
16 for fn in files:
17 band=fn[-7:-4]
18 C=((crop*SP['r'][band]).astype(int))
19 print 'Reading Band ',band,'Size Position',C
20 B[band]=gdal.Open(fn).ReadAsArray(xsize=C[0],ysize=C[1],xoff=C[2],yoff=C[3])
21 return B
22
23 def display_rgb(B,**kw):
24 RGB=array([B['B30'],B['B20'],B['B10']]).transpose()
25 if kw.has_key('crop'):
26 C=kw['crop']#extent is (left, right, bottom, top) data values of the axes
27 extent=(C[2],C[2]+C[0],C[3],C[3]+C[1])
28 imshow(RGB,origin='lower',interpolation='nearest',extent=extent)
29 else:
30 imshow(RGB,origin='lower',interpolation='nearest')
31 show()
32 return
33
34 datadir=home+'/data/landsat/'
35 crop=array([600,600,2600,4700]) # Crop region xsize,ysize,xoff,yoff
36 B=read_ls7(datadir,crop)
37 display_rgb(B,crop=crop)
38 savefig('landsat7.png',dpi=100)