Contents
Useful links
numpy overview (numpy 1.5 currently installed at ZMAW)
http://ipython.org/IPython]]
Python Data Types
Strings
There is no character type. Characters are strings of length 1.
In [3]: string = 'Hello wolrd! I\'m a string' In [4]: print string Hello wolrd! I'm a string
Numbers
In [6]: number_i = 3 In [7]: number_f = 3.141597
Lists
In [9]: list_name = ['A', 3, number_f, [12., 34.], (1.,2.), {'a':[1,2,3,4]}] In [10]: print list_name[0] A In [11]: print list_name[3] [12.0, 34.0] In [17]: list_name.append(5.1234) In [18]: print list_name ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234]
Get help on objects
In [13]: list_name? Type: list String Form:['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}] Length: 6 Docstring: list() -> new empty list list(iterable) -> new list initialized from iterable's items In [14]: # help(list_name) In [15]: # list_name. "double tab" In [16]: # these two cannot be saved for the purpose here, but please try them out - these helps are very useful
Tuples
In [20]: tuple_name = ('A', 3, number_f, [12., 34.], (1.,2.), {'a':[1,2,3,4]})
What is the difference between list and tuples?
lists are mutable, i.e., values can change (many methods exist, i.e., append, remove, sort, etc.)
tuples are immutable, i.e., values cannot change after creation (only count and index methodes exist)
Dictionaries
In [25]: dict_name = {'key':'value', 'Th morning':'python course'} In [26]: print dict_name.get('key') value In [27]: print dict_name.keys() ['Th morning', 'key'] In [28]: print dict_name.values() ['python course', 'value'] In [29]: dict_name.update({'participants':34}) In [30]: print dict_name {'participants': 34, 'Th morning': 'python course', 'key': 'value'}
Copies of mutable objects
By assigning one variable to another, only references are created. If the variables are mutable and the first variable is changed, then the second variable is also changed (because is actually is the same one, only with a different reference).
To see if two variables acutally reference the same object you can use the is keyword. And to check the memory location of an object there is id. These two are good to play around with and get a feeling for how Python changes references after assignments. For examples see below at Copies of numpy arrays
Copies of lists
In order to create a real copy of a mutable object one needs to deepcopy the object if it consists of mutable values, otherwise a shallow copy is sufficient (if the single values of the mutable object are immutable).
In [32]: list_name_copy = list_name In [33]: print list_name_copy ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234] In [34]: list_name.append('new value') In [35]: print list_name ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234, 'new value'] In [36]: print list_name_copy # has also changed ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234, 'new value'] In [37]: import copy In [38]: list_name_deepcopy = copy.deepcopy(list_name) In [39]: list_name.append('next new value') In [40]: print list_name ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234, 'new value', 'next new value'] In [41]: print list_name_copy # has changed again ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234, 'new value', 'next new value'] In [42]: print list_name_deepcopy # did not change ['A', 3, 3.141597, [12.0, 34.0], (1.0, 2.0), {'a': [1, 2, 3, 4]}, 5.1234, 'new value']
For simple lists (that do not contain other lists) instead of using the copy module, you can also use slicing:
In [19]: list = ['astring', 12, 3.14259] In [20]: list_copy = list In [21]: list is list_copy Out[21]: True In [22]: list_copy = list[:] In [23]: list_copy is list Out[23]: False
but
In [34]: list = ['astring', 12, 3.14259, ['another', 'list']] In [35]: list_copy = list[:] In [36]: list_copy is list Out[36]: False In [37]: list_copy[3] is list[3] Out[37]: True
Copies of numpy arrays
Remember that numpy arrays are also mutable objects.
In [22]: arr = n.array([1,2,3,4], dtype=n.float64) In [23]: brr = arr In [24]: brr is arr Out[24]: True In [25]: id(arr) Out[25]: 25589568 In [26]: id(brr) Out[26]: 25589568 In [27]: id(arr[1]) Out[27]: 26177776 In [28]: id(brr[1]) Out[28]: 26177776 In [29]: brr[1] = 10. In [30]: arr Out[30]: array([ 1., 10., 3., 4.])
Assigning the result of an operation is of course safe.
In [30]: arr Out[30]: array([ 1., 10., 3., 4.]) In [31]: brr = 2 * arr In [32]: brr is arr Out[32]: False In [44]: brr[0]=20. In [45]: arr Out[45]: array([ 1., 10., 3., 4.]) In [49]: brr Out[49]: array([ 20., 20., 6., 8.])
To make confusion complete, slicing is not safe for arrays! Slicing an array creates a so called view. To copy an array use the copy method of the array.
In short: For numpy arrays, always use the copy method if you want a real copy!
In [65]: arr=n.array([1., 2., 3., 4.]) In [66]: brr = arr[:] In [67]: brr is arr Out[67]: False In [68]: id(brr) Out[68]: 47212608 In [69]: id(arr) Out[69]: 52434720 In [70]: brr[1]=20. In [71]: arr Out[71]: array([ 1., 20., 3., 4.]) In [72]: brr = arr.copy() In [73]: brr[0] = 30. In [74]: arr Out[74]: array([ 1., 20., 3., 4.]) In [75]: brr Out[75]: array([ 30., 20., 3., 4.])
Array creation
http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html
Import numpy and give it a shorter name
In [7]: import numpy as n
Create an array from an existing list
The most precise data type of the list's elements will be used (ie. if there is one float, the array will contain floats) unless you specify a type.
In [9]: list = [1, 2, 3, 4] In [10]: arr = n.array(list) In [11]: arr Out[11]: array([1, 2, 3, 4]) In [12]: arr_float = n.array(list, dtype=n.float64) In [13]: arr_float Out[13]: array([ 1., 2., 3., 4.]) In [14]: arr_float = n.array([1., 2., 3., 4.]) In [15]: arr_float Out[15]: array([ 1., 2., 3., 4.])
Create particular arrays
In [16]: arr = n.ones((3,3)) In [17]: arr Out[17]: array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]]) In [18]: arr = n.zeros((3,3)) In [19]: arr Out[19]: array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]]) In [20]: arr = n.empty((3,3)) In [21]: arr Out[21]: array([[ 0.00000000e+000, 6.95281578e-310, 1.62438192e-316], [ 1.16068313e-316, 1.16068313e-316, 1.63041663e-322], [ 6.95281481e-310, 6.95281481e-310, 1.58101007e-322]]) In [23]: arr = n.eye(3) In [24]: arr Out[24]: array([[ 1., 0., 0.], [ 0., 1., 0.], [ 0., 0., 1.]]) In [26]: arr = n.eye(3,5) In [27]: arr Out[27]: array([[ 1., 0., 0., 0., 0.], [ 0., 1., 0., 0., 0.], [ 0., 0., 1., 0., 0.]])
Create an array with evenly spaced numbers
In [28]: arr = n.arange(10) In [29]: arr Out[29]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [30]: arr = n.arange(10, 20, 2.5) In [31]: arr Out[31]: array([ 10. , 12.5, 15. , 17.5])
Create an array that looks like another one (same shape)
In [32]: brr = n.ones_like(arr) In [33]: brr Out[33]: array([ 1., 1., 1., 1.])
Create an array that equally divides an interval
In [39]: arr = n.linspace(0, 10, 26) In [40]: arr Out[40]: array([ 0. , 0.4, 0.8, 1.2, 1.6, 2. , 2.4, 2.8, 3.2, 3.6, 4. , 4.4, 4.8, 5.2, 5.6, 6. , 6.4, 6.8, 7.2, 7.6, 8. , 8.4, 8.8, 9.2, 9.6, 10. ])
Manipulate/reshape arrays
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html
In [54]: arr = n.arange(24) In [55]: arr Out[55]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]) In [56]: arr.reshape((4,6)) # default is ordering of elements like in C Out[56]: array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17], [18, 19, 20, 21, 22, 23]])
This does not change arr, to do this, we need to assign the result of reshape to arr.
Here we also change the order of the elements from C (row major) to Fortran (column major) order.
In [58]: arr = arr.reshape((4,6), order='F') In [59]: arr Out[59]: array([[ 0, 4, 8, 12, 16, 20], [ 1, 5, 9, 13, 17, 21], [ 2, 6, 10, 14, 18, 22], [ 3, 7, 11, 15, 19, 23]]) In [60]: arr.reshape((12,2)) Out[60]: array([[ 0, 4], [ 8, 12], [16, 20], [ 1, 5], [ 9, 13], [17, 21], [ 2, 6], [10, 14], [18, 22], [ 3, 7], [11, 15], [19, 23]])
Make a 1D array
In [61]: arr.flatten() Out[61]: array([ 0, 4, 8, 12, 16, 20, 1, 5, 9, 13, 17, 21, 2, 6, 10, 14, 18, 22, 3, 7, 11, 15, 19, 23])
Transpose
In [63]: arr Out[63]: array([[ 0, 4, 8, 12, 16, 20], [ 1, 5, 9, 13, 17, 21], [ 2, 6, 10, 14, 18, 22], [ 3, 7, 11, 15, 19, 23]]) In [64]: n.transpose(arr) Out[64]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]) In [65]: arr.T Out[65]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]])
Combine two arrays into a new one
In [89]: a = n.array([1,2,3]) In [90]: b = n.array([4,5,6]) In [91]: n.hstack((a,b)) Out[91]: array([1, 2, 3, 4, 5, 6]) In [92]: n.vstack((a,b)) Out[92]: array([[1, 2, 3], [4, 5, 6]]) In [93]: a = n.array([[1,2,3],[4,5,6]]) In [94]: b = n.array([[7,8,9],[10,11,12]]) In [95]: n.hstack((a,b)) Out[95]: array([[ 1, 2, 3, 7, 8, 9], [ 4, 5, 6, 10, 11, 12]]) In [96]: n.vstack((a,b)) Out[96]: array([[ 1, 2, 3], [ 4, 5, 6], [ 7, 8, 9], [10, 11, 12]])
Read netCDF file and ASCII file
Use the pyNIO class for reading binary data in many common file formats like netCDF, HDF, grib, etc (Mikhail will introduce that package).
Use the loadtxt function of the numpy module to read ASCII files.
In [48]: import numpy as np In [49]: import Nio In [50]: filename_netcdf = 'CEIP_EC_L3_DEHai_ebc.nc' In [51]: filename_ascii = 'DE-Hai.2005.synth.hourly.allvars.csv' In [96]: # read netcdf with pyNIO package In [97]: F = Nio.open_file(filename_netcdf, mode='r') n In [98]: temperature = F.variables['Ta'][:] In [99]: F.close() In [100]: # read ascii file with numpy.loadtxt In [101]: data = np.loadtxt(filename_ascii, delimiter=',', skiprows=1) # comma separated file; first line is header
Get information about arrays
In [102]: type(data) Out[102]: numpy.ndarray In [103]: type(temperature) Out[103]: numpy.ma.core.MaskedArray In [104]: data.dtype Out[104]: dtype('float64') In [105]: temperature.dtype Out[105]: dtype('float32') In [106]: data.shape Out[106]: (17520, 103) In [107]: temperature.shape Out[107]: (140256, 1, 1)
Array statistics
http://docs.scipy.org/doc/numpy/reference/routines.statistics.html
In [126]: n.mean(temperature) Out[126]: 8.3047627793191499 In [127]: n.std(temperature) Out[127]: 7.7843985224597105 In [128]: n.min(temperature) Out[128]: -16.67 In [137]: t1=temperature[:10000].flatten() In [138]: t2=temperature2[:10000].flatten() In [139]: n.corrcoef([t1,t2]) Out[139]: array([[ 1. , 0.00285879], [ 0.00285879, 1. ]])
Indexing and slicing an array
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
In [113]: print temperature[100:222] [ 0.41999999 0.50999999 0.56999999 0.64999998 0.81 0.94999999 1.09000003 1.17999995 1.35000002 1.41999996 1.41999996 1.57000005 1.66999996 1.79999995 1.86000001 1.90999997 1.95000005 1.99000001 2.0999999 2.31999993 2.52999997 2.48000002 2.3900001 2.51999998 2.68000007 2.70000005 2.70000005 2.6099999 2.45000005 2.3599999 2.32999992 2.22000003 2.06999993 1.96000004 2.00999999 2.00999999 2.05999994 2.1400001 2.1099999 2.0999999 2.06999993 2.08999991 2.1099999 2.0999999 2.0999999 2.00999999 2.04999995 2.13000011 2.0999999 2.19000006 2.26999998 2.31999993 2.33999991 2.24000001 2. 1.94000006 2.06999993 2.03999996 2.02999997 2.06999993 2.07999992 1.94000006 1.91999996 1.96000004 2.01999998 2.05999994 2.1099999 2.25999999 2.31999993 2.33999991 2.30999994 2.3599999 2.45000005 2.53999996 2.58999991 2.70000005 2.9000001 3.03999996 3.16000009 3.32999992 3.3900001 3.69000006 4.05000019 4.21999979 4.19000006 4.21999979 4.36000013 4.48999977 4.63999987 4.73000002 4.82000017 4.82000017 4.67000008 4.59000015 4.44999981 4.55000019 4.73999977 4.75 4.76000023 4.61999989 4.51000023 4.11000013 3.76999998 3.3599999 2.9000001 2.6400001 2.44000006 2.02999997 1.61000001 1.24000001 1.10000002 1.05999994 0.94999999 1.07000005 1.44000006 1.83000004 2.05999994 2.27999997 2.57999992 2.61999989 2.82999992 2.92000008] In [158]: temperature[:10] # first 10 elements Out[158]: array([-3.04999995, -0.60000002, -0.64999998, -0.57999998, -0.50999999, -0.49000001, -0.40000001, -0.36000001, -0.34999999, -0.28 ], dtype=float32) In [159]: temperature[-10:] # last 10 elements, temperature[-1] gives you the last one Out[159]: array([-0.14 , -0.23999999, -0.33000001, -0.34 , -0.43000001, -0.47 , -0.40000001, -0.47999999, -0.74000001, -0.81 ], dtype=float32) In [160]: temperature[0:10:2] # every 2nd of the first 10 elements Out[160]: array([-3.04999995, -0.64999998, -0.50999999, -0.40000001, -0.34999999], dtype=float32)
Where function: very helpful and powerful
http://docs.scipy.org/doc/numpy/reference/routines.statistics.html Set all missing values (-999.) to NaN and everything else to the value from temperature:
In [115]: print np.where(temperature == -9999.) (array([ 9960, 10010, 10293, 10969, 11302, 12649, 16227, 26818, 28628, 31754, 34538, 42551, 50044, 67411, 67420, 67421, 67422, 69171, 69172, 69173, 69174, 69175, 69176, 69177, 69178, 69179, 69180, 69181, 69182, 69183, 69184, 69185, 69395, 69396, 69397, 69398, 69406, 69914, 69915, 69916, 69917, 75428, 75429, 75430, 75431, 75432, 75433, 75434, 75435]),) In [116]: print type(np.where(temperature == -9999.)) <type 'tuple'> In [117]: print type(np.where(temperature == -9999.)[0]) <type 'numpy.ndarray'>
1 But: The much better way is to use masked arrays
In [118]: temp_nan = np.where(temperature == -9999., np.nan, temperature) In [119]: print temp_nan[9960] nan In [122]: print np.min(temp_nan) # doesn't work nan
Write out into ASCII file format
In [132]: temp_ascii = data[:,20] In [133]: np.savetxt('temp_ascii_out.csv', temp_ascii, delimiter=',')
A quick plot
Alex will introduce plotting, but since you asked: here it is.
In [129]: from matplotlib import pyplot as mpp In [130]: mpp.plot(data[:,20]) Out[130]: [<matplotlib.lines.Line2D at 0xb1c1c38c>] In [131]: mpp.show()
Elementwise functions
Many common functions are available as numpy routines.
Always use numpy routines when working on an array to profit from the performance of the underlying C implementation.
If a desired operation cannot be achieved by combining several numpy techniques (clever indexing, slicing, functions), write your own function that operates on a single element and use numpy.vectorize
In [97]: arr = n.arange(1, 10, 0.4) In [98]: arr Out[98]: array([ 1. , 1.4, 1.8, 2.2, 2.6, 3. , 3.4, 3.8, 4.2, 4.6, 5. , 5.4, 5.8, 6.2, 6.6, 7. , 7.4, 7.8, 8.2, 8.6, 9. , 9.4, 9.8]) In [99]: n.log(arr) Out[99]: array([ 0. , 0.33647224, 0.58778666, 0.78845736, 0.95551145, 1.09861229, 1.22377543, 1.33500107, 1.43508453, 1.5260563 , 1.60943791, 1.68639895, 1.75785792, 1.82454929, 1.88706965, 1.94591015, 2.00148 , 2.05412373, 2.10413415, 2.1517622 , 2.19722458, 2.24070969, 2.28238239]) In [100]: arr**2 Out[100]: array([ 1. , 1.96, 3.24, 4.84, 6.76, 9. , 11.56, 14.44, 17.64, 21.16, 25. , 29.16, 33.64, 38.44, 43.56, 49. , 54.76, 60.84, 67.24, 73.96, 81. , 88.36, 96.04]) In [101]: n.sqrt(arr) Out[101]: array([ 1. , 1.18321596, 1.34164079, 1.4832397 , 1.61245155, 1.73205081, 1.84390889, 1.94935887, 2.04939015, 2.14476106, 2.23606798, 2.32379001, 2.40831892, 2.48997992, 2.56904652, 2.64575131, 2.7202941 , 2.79284801, 2.86356421, 2.93257566, 3. , 3.06594194, 3.13049517])
Array arithmetic
Add a constant
In [102]: arr + 100.0 # add constant Out[102]: array([ 101. , 101.4, 101.8, 102.2, 102.6, 103. , 103.4, 103.8, 104.2, 104.6, 105. , 105.4, 105.8, 106.2, 106.6, 107. , 107.4, 107.8, 108.2, 108.6, 109. , 109.4, 109.8])
Add a constant row/column to all rows/columns
In [103]: arr = n.ones((4,4)) In [104]: arr Out[104]: array([[ 1., 1., 1., 1.], [ 1., 1., 1., 1.], [ 1., 1., 1., 1.], [ 1., 1., 1., 1.]]) In [105]: brr = n.arange(1, 5) In [106]: brr Out[106]: array([1, 2, 3, 4]) In [107]: arr + brr Out[107]: array([[ 2., 3., 4., 5.], [ 2., 3., 4., 5.], [ 2., 3., 4., 5.], [ 2., 3., 4., 5.]]) In [110]: brr = brr.reshape((4,1)) In [111]: brr Out[111]: array([[1], [2], [3], [4]]) In [112]: arr + brr Out[112]: array([[ 2., 2., 2., 2.], [ 3., 3., 3., 3.], [ 4., 4., 4., 4.], [ 5., 5., 5., 5.]])
Add/multiply elementwise
In [114]: arr = n.arange(25).reshape((5,5)) In [115]: arr Out[115]: array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]) In [120]: brr = n.arange(10, 260, 10).reshape((5,5)) In [121]: brr Out[121]: array([[ 10, 20, 30, 40, 50], [ 60, 70, 80, 90, 100], [110, 120, 130, 140, 150], [160, 170, 180, 190, 200], [210, 220, 230, 240, 250]]) In [122]: arr + brr Out[122]: array([[ 10, 21, 32, 43, 54], [ 65, 76, 87, 98, 109], [120, 131, 142, 153, 164], [175, 186, 197, 208, 219], [230, 241, 252, 263, 274]]) In [123]: arr * brr Out[123]: array([[ 0, 20, 60, 120, 200], [ 300, 420, 560, 720, 900], [1100, 1320, 1560, 1820, 2100], [2400, 2720, 3060, 3420, 3800], [4200, 4620, 5060, 5520, 6000]])
Linear algebra
http://docs.scipy.org/doc/numpy/reference/routines.linalg.html
Dot product
In [124]: arr = n.random.rand(5,5) In [125]: symm_arr = n.do n.dot n.double In [125]: symm_arr = n.dot(arr, n.transpose(arr)) In [126]: symm_arr Out[126]: array([[ 1.58587206, 0.84716432, 1.14919924, 1.28189122, 1.19506914], [ 0.84716432, 0.65720114, 1.00722336, 1.0351238 , 0.59206034], [ 1.14919924, 1.00722336, 1.62035768, 1.59009221, 0.79137279], [ 1.28189122, 1.0351238 , 1.59009221, 2.14853304, 1.39284491], [ 1.19506914, 0.59206034, 0.79137279, 1.39284491, 1.68863531]])
Eigenvectors, eigenvalues
In [127]: eval, evec = n.linalg.eig(symm_arr) In [128]: eval Out[128]: array([ 6.08869334e+00, 9.75455082e-01, 5.38114193e-01, 9.38976138e-02, 4.43900059e-03]) In [129]: evec Out[129]: array([[-0.44727346, -0.19413401, -0.76916546, 0.36518519, 0.19308733], [-0.30802641, 0.25639165, -0.14501538, -0.06950575, -0.90195484], [-0.45918029, 0.55669694, -0.05749307, -0.58276662, 0.36921484], [-0.56134928, 0.12795817, 0.58475995, 0.56423789, 0.09058196], [-0.42320992, -0.75517583, 0.20519597, -0.45147248, -0.06833777]]) In [130]: b = n.dot(symm_arr, evec[:, 0]) In [131]: b/eval[0] Out[131]: array([-0.44727346, -0.30802641, -0.45918029, -0.56134928, -0.42320992])
Masked arrays
http://docs.scipy.org/doc/numpy/reference/routines.ma.html
Creation from existing array
Set the data of the masked array and then manipulate the mask.
In [132]: arr = n.array([[4,3,1],[3,1,4],[4,4,2]]) In [133]: marr = n.ma.array(arr) In [134]: marr Out[134]: masked_array(data = [[4 3 1] [3 1 4] [4 4 2]], mask = False, fill_value = 999999) In [135]: marr.mask = [[True, True, False], [False, False, False], [True, False, True]] In [136]: marr Out[136]: masked_array(data = [[-- -- 1] [3 1 4] [-- 4 --]], mask = [[ True True False] [False False False] [ True False True]], fill_value = 999999)
Creation with mask set everywhere
In [138]: marrall = n.ma.masked_all((4,4)) In [139]: marrall Out[139]: masked_array(data = [[-- -- -- --] [-- -- -- --] [-- -- -- --] [-- -- -- --]], mask = [[ True True True True] [ True True True True] [ True True True True] [ True True True True]], fill_value = 1e+20) In [140]: marrall[3,2] = 2 In [141]: marrall[0:2,2:] = 5 In [142]: marrall Out[142]: masked_array(data = [[-- -- 5.0 5.0] [-- -- 5.0 5.0] [-- -- -- --] [-- -- 2.0 --]], mask = [[ True True False False] [ True True False False] [ True True True True] [ True True False True]], fill_value = 1e+20)
Creation from existing array and mask set by a condition
In [143]: arr = n.random.normal(0, 1, 25).reshape((5,5)) In [144]: arr Out[144]: array([[ 0.17329858, 0.74527996, 0.26280218, 0.72841021, -0.42252451], [-0.82555426, 0.34412862, -1.62284074, 0.70672411, 0.40375032], [ 0.86688535, 0.39196647, -0.46354791, -0.75160129, 0.20977493], [-1.5837415 , -0.58776249, -0.22477697, -0.52870726, -1.47657261], [ 0.71056709, 1.48120984, -0.8241472 , 0.01575747, 0.0033042 ]]) In [145]: marr = n.ma.masked_less(arr, 0) In [146]: marr Out[146]: masked_array(data = [[0.173298577831 0.745279962312 0.262802177587 0.728410213258 --] [-- 0.34412861582 -- 0.706724108565 0.403750315142] [0.86688535478 0.391966467504 -- -- 0.209774932895] [-- -- -- -- --] [0.710567094906 1.48120984326 -- 0.0157574720741 0.00330419672626]], mask = [[False False False False True] [ True False True False False] [False False True True False] [ True True True True True] [False False True False False]], fill_value = 1e+20) In [151]: n.ma.count(marr) Out[151]: 14 In [152]: n.ma.count_masked(marr) Out[152]: 11 In [159]: arr = n.array([[1, 2, n.nan],[n.inf, 3, 4]]) In [160]: arr Out[160]: array([[ 1., 2., nan], [ inf, 3., 4.]]) In [162]: marr = n.ma.fix_invalid(arr) In [163]: marr Out[163]: masked_array(data = [[1.0 2.0 --] [-- 3.0 4.0]], mask = [[False False True] [ True False False]], fill_value = 1e+20)
Fill values
In [164]: marr.fill_value = -9999 In [165]: marr.filled() Out[165]: array([[ 1.00000000e+00, 2.00000000e+00, -9.99900000e+03], [ -9.99900000e+03, 3.00000000e+00, 4.00000000e+00]])
Compression to 1D array
In [166]: marr.compressed() Out[166]: array([ 1., 2., 3., 4.])
Example with JSBACH init file
To run this example, change into a directory with a netcdf file that contains a variable 'slm' (like a JSBACH init file) or adapt the file name below. In that directory, start ipython and copy the code into the ipython shell. To run this code as a script, add
#!/usr/bin/env python
in the beginning and save it to a file. Boolean indexing is used here.
import Nio import matplotlib.pyplot as p import numpy as n f = Nio.open_file('jsbach_T63GR15_11tiles_2005.nc') slm = f.variables['slm'][:].copy() vegratiomax = f.variables['veg_ratio_max'][:].copy() elevation = f.variables['elevation'][:].copy() f.close() ### unpack vector with land points to map land = n.where(slm>0, True, False) # land mask is True where slm>0, else False data1d = n.random.rand(6222) # a random 1D vector representing some data data2d = n.ma.masked_all((96, 192)) data2d[land] = data1d # the boolean array land can be used to index another 2D array, the result is a 1D array with the points where land==True p.imshow(data2d) p.show() p.imshow(vegratiomax) p.show() maskedvrm = n.ma.masked_equal(vegratiomax, 0) p.imshow(maskedvrm) p.show() maskedvrm = n.ma.masked_less(vegratiomax, 0.5) p.imshow(maskedvrm) p.show() maskedvrm = n.ma.masked_outside(vegratiomax, 0.2, 0.4) p.imshow(maskedvrm) p.show() print vegratiomax.mean() print maskedvrm.mean() highpoints = n.where(elevation>3000) highpointsmap = n.ma.masked_all((96, 192)) highpointsmap.mask[highpoints] = False highpointsmap = n.ma.masked_all((96, 192)) highpointsmap[highpoints] = elevation[highpoints] highpointsmapfilled = highpointsmap.filled(0) p.imshow(highpointsmap) p.show()
The plotting is kept simple. Wait for the next seminar...