natcap.invest.pygeoprocessing_0_3_3 package

Submodules

natcap.invest.pygeoprocessing_0_3_3.fileio module

class natcap.invest.pygeoprocessing_0_3_3.fileio.CSVDriver(uri, fieldnames=None)

Bases: natcap.invest.pygeoprocessing_0_3_3.fileio.TableDriverTemplate

The CSVDriver class is a subclass of TableDriverTemplate.

get_fieldnames()
get_file_object(uri=None)
read_table()
write_table(table_list, uri=None, fieldnames=None)
exception natcap.invest.pygeoprocessing_0_3_3.fileio.ColumnMissingFromTable

Bases: exceptions.KeyError

A custom exception for when a key is missing from a table.

More descriptive than just throwing a KeyError. This class inherits the KeyError exception, so any existing exception handling should still work properly.
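
The compatibility claim above can be illustrated with a minimal sketch. The class body and both helper functions below are stand-ins written for illustration, not the library's code; they only assume that the exception subclasses KeyError as documented.

```python
class ColumnMissingFromTable(KeyError):
    """Stand-in for the documented exception; it subclasses KeyError."""


def get_column(row, column_name):
    # Hypothetical helper: raise the more descriptive error for a missing key.
    if column_name not in row:
        raise ColumnMissingFromTable(
            'Column %r is missing from the table' % column_name)
    return row[column_name]


def lookup_or_default(row, column_name, default=None):
    # Older handler that only knows about KeyError; it still catches the
    # subclass, so existing exception handling keeps working.
    try:
        return get_column(row, column_name)
    except KeyError:
        return default
```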

class natcap.invest.pygeoprocessing_0_3_3.fileio.DBFDriver(uri, fieldnames=None)

Bases: natcap.invest.pygeoprocessing_0_3_3.fileio.TableDriverTemplate

The DBFDriver class is a subclass of TableDriverTemplate.

get_fieldnames()

Return a list of strings containing the fieldnames.

get_file_object(uri=None, read_only=True)

Return the library-specific file object by using the input uri. If uri is None, use self.uri.

read_table()

Return the table object with data built from the table using the file-specific package as necessary. Should return a list of dictionaries.

write_table(table_list, uri=None, fieldnames=None)

Take the table_list input and write its contents to the appropriate URI. If uri == None, write the file to self.uri. Otherwise, write the table to uri (which may be a new file). If fieldnames == None, assume that the default fieldnames order will be used.

class natcap.invest.pygeoprocessing_0_3_3.fileio.TableDriverTemplate(uri, fieldnames=None)

Bases: object

The TableDriverTemplate classes provide a uniform, simple way to interact with specific tabular libraries. This allows us to interact with multiple filetypes in exactly the same way and in a uniform syntax. By extension, this also allows us to read and write to and from any desired table format as long as the appropriate TableDriver class has been implemented.

These driver classes exist for convenience, and though they can be accessed directly by the user, these classes provide only the most basic functionality. Other classes, such as the TableHandler class, use these drivers to provide a convenient layer of functionality to the end-user.

This class is merely a template to be subclassed for use with appropriate table filetype drivers. Instantiating this object will yield a functional object, but it won’t actually get you any relevant results.

get_fieldnames()

Return a list of strings containing the fieldnames.

get_file_object(uri=None)

Return the library-specific file object by using the input uri. If uri is None, use self.uri.

read_table()

Return the table object with data built from the table using the file-specific package as necessary. Should return a list of dictionaries.

write_table(table_list, uri=None, fieldnames=None)

Take the table_list input and write its contents to the appropriate URI. If uri == None, write the file to self.uri. Otherwise, write the table to uri (which may be a new file). If fieldnames == None, assume that the default fieldnames order will be used.
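
As a rough sketch of how a driver might fill in this template, the class below implements the four documented methods for CSV files with the standard csv module. The class name SketchCSVDriver and the fallback rule for the default fieldname order are assumptions made for illustration; the real CSVDriver may differ.

```python
import csv


class SketchCSVDriver(object):
    """Hypothetical driver that follows the TableDriverTemplate method names."""

    def __init__(self, uri, fieldnames=None):
        self.uri = uri
        self.fieldnames = fieldnames

    def get_file_object(self, uri=None):
        # Fall back to self.uri when no uri is given.
        return open(self.uri if uri is None else uri, newline='')

    def get_fieldnames(self):
        with self.get_file_object() as table_file:
            return list(csv.DictReader(table_file).fieldnames or [])

    def read_table(self):
        # Returns a list of dictionaries, one per row.
        with self.get_file_object() as table_file:
            return [dict(row) for row in csv.DictReader(table_file)]

    def write_table(self, table_list, uri=None, fieldnames=None):
        target = self.uri if uri is None else uri
        if fieldnames is None:
            # Assumption: the default order is the driver's own fieldnames,
            # else the key order of the first row (table must be non-empty).
            fieldnames = self.fieldnames or list(table_list[0].keys())
        with open(target, 'w', newline='') as table_file:
            writer = csv.DictWriter(table_file, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(table_list)
```

All values read back from a CSV file are strings in this sketch; any type casting would be up to the caller.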

class natcap.invest.pygeoprocessing_0_3_3.fileio.TableHandler(uri, fieldnames=None)

Bases: object

__iter__()

Allow this handler object’s table to be iterated through. Returns an iterable version of self.table.

create_column(column_name, position=None, default_value=0)

Create a new column in the internal table object with the name column_name. If position == None, it will be appended to the end of the fieldnames. Otherwise, the column will be inserted at index position. This function will also loop through the entire table object and create an entry with the default value of default_value.

Note that it’s up to the driver to actually add the field to the file on disk.

Returns nothing

find_driver(uri, fieldnames=None)

Locate the driver needed for uri. Returns a driver object as documented by self.driver_types.

get_fieldnames(case='lower')

Returns a python list of the fieldnames in the requested case.

case=’lower’ - a python string representing the desired case of the fieldnames. ‘lower’ for lower case, ‘orig’ for original case.

returns a python list of strings.

get_map(key_field, value_field)

Returns a python dictionary mapping values contained in key_field to values contained in value_field. If duplicate keys are found, they are overwritten in the output dictionary.

This is implemented as a dictionary comprehension on top of self.get_table_list(), so there shouldn’t be a need to reimplement this for each subclass of AbstractTableHandler.

If the table list has not been retrieved, it is retrieved before generating the map.

key_field - a python string.
value_field - a python string.

returns a python dictionary mapping key_fields to value_fields.
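
The dictionary comprehension described above might look like the following sketch. It takes the table list as an argument instead of calling self.get_table_list(), and the example rows are made up.

```python
def get_map(table_list, key_field, value_field):
    # Later rows overwrite earlier rows that share the same key.
    return {row[key_field]: row[value_field] for row in table_list}


table = [
    {'lucode': 1, 'name': 'forest'},
    {'lucode': 2, 'name': 'crop'},
    {'lucode': 1, 'name': 'swamp'},  # duplicate key, overwrites 'forest'
]
lucode_to_name = get_map(table, 'lucode', 'name')
```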

get_table()

Return the table list object.

get_table_dictionary(key_field, include_key=True)

Returns a python dictionary mapping a key value to all values in that particular row dictionary (including the key field, unless include_key is False). If duplicate keys are found, they are overwritten in the output dictionary.

key_field - a python string of the desired field value to be used as
the key for the returned dictionary.
include_key=True - a python boolean indicating whether the
key_field provided should be included in each row_dictionary.

returns a python dictionary of dictionaries.
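
A minimal sketch of this behaviour, assuming the table is a list of row dictionaries passed in directly (the handler itself works on its internal table):

```python
def get_table_dictionary(table_list, key_field, include_key=True):
    result = {}
    for row in table_list:
        row_copy = dict(row)  # avoid mutating the caller's rows
        key_value = row_copy[key_field]
        if not include_key:
            del row_copy[key_field]
        result[key_value] = row_copy  # duplicate keys are overwritten
    return result
```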

get_table_row(key_field, key_value)

Return the first full row where the value of key_field is equivalent to key_value. Raises a KeyError if key_field does not exist.

key_field - a python string.
key_value - a value of appropriate type for this field.

returns a python dictionary of the row, or None if the row does not exist.
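
The two outcomes (KeyError for a missing field, None for a missing row) can be sketched as below. This is an illustration over a plain list of dictionaries, not the handler's code; note that with an empty table this sketch returns None without ever checking the field.

```python
def get_table_row(table_list, key_field, key_value):
    for row in table_list:
        # row[key_field] raises KeyError when the field does not exist.
        if row[key_field] == key_value:
            return row  # first match wins
    return None
```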

set_field_mask(regexp=None, trim=0, trim_place='front')

Set a mask for the table’s self.fieldnames. Any fieldnames that match regexp will have trim number of characters stripped off the front or back, as selected by trim_place.

regexp=None - a python string or None. If a python string, this will be a regular expression. If None, this represents no regular expression.
trim - a python int.
trim_place - a string, either ‘front’ or ‘back’. Indicates where the trim should take place.

Returns nothing.
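
One way to picture the mask is the sketch below. The function name apply_field_mask is hypothetical, and matching with re.match (anchored at the start of the name) is an assumption; the handler may match differently.

```python
import re


def apply_field_mask(fieldnames, regexp=None, trim=0, trim_place='front'):
    if regexp is None or trim <= 0:
        return list(fieldnames)  # no mask set, names are unchanged
    masked = []
    for name in fieldnames:
        if re.match(regexp, name):
            # Strip `trim` characters from the chosen end of the name.
            name = name[trim:] if trim_place == 'front' else name[:-trim]
        masked.append(name)
    return masked
```

For example, a mask of regexp='^f_' with trim=2 would turn 'f_area' into 'area' and leave 'id' untouched.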

write_table(table=None, uri=None)

Invoke the driver to save the table to disk. If table == None, self.table will be written, otherwise, the list of dictionaries passed in to table will be written. If uri is None, the table will be written to the table’s original uri, otherwise, the table object will be written to uri.

natcap.invest.pygeoprocessing_0_3_3.fileio.get_free_space(folder='/', unit='auto')

Get the free space on the drive/folder marked by folder, expressed in the unit given by unit.

folder - (optional) a string uri to a folder or drive on disk. Defaults
to ‘/’ (‘C:’ on Windows)
unit - (optional) a string, one of [‘B’, ‘MB’, ‘GB’, ‘TB’, ‘auto’]. If
‘auto’, the unit returned will be automatically calculated based on available space. Defaults to ‘auto’.

returns a string marking the space free and the selected unit. Number is rounded to two decimal places.
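
A rough sketch of the unit handling, using shutil.disk_usage from the standard library. The function names, the use of powers of 1024, and the rule used for ‘auto’ (largest unit that yields a value of at least 1) are all assumptions for illustration.

```python
import shutil

# Assumption: binary multiples; the documented units have no 'KB' entry.
UNIT_SIZES = {'B': 1, 'MB': 1024 ** 2, 'GB': 1024 ** 3, 'TB': 1024 ** 4}


def format_free_space(free_bytes, unit='auto'):
    if unit == 'auto':
        # Assumption: pick the largest unit that yields a value >= 1.
        unit = 'B'
        for candidate in ('MB', 'GB', 'TB'):
            if free_bytes >= UNIT_SIZES[candidate]:
                unit = candidate
    # Round to two decimal places and append the unit.
    return '%s %s' % (round(free_bytes / float(UNIT_SIZES[unit]), 2), unit)


def get_free_space_sketch(folder='/', unit='auto'):
    return format_free_space(shutil.disk_usage(folder).free, unit)
```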

natcap.invest.pygeoprocessing_0_3_3.geoprocessing module

A collection of GDAL dataset and raster utilities.

class natcap.invest.pygeoprocessing_0_3_3.geoprocessing.AggregatedValues(total, pixel_mean, hectare_mean, n_pixels, pixel_min, pixel_max)

Bases: tuple

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__getstate__()

Exclude the OrderedDict from pickling

__repr__()

Return a nicely formatted representation string

hectare_mean

Alias for field number 2

n_pixels

Alias for field number 3

pixel_max

Alias for field number 5

pixel_mean

Alias for field number 1

pixel_min

Alias for field number 4

total

Alias for field number 0
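
The field aliases above are what collections.namedtuple generates. The sketch below rebuilds an equivalent tuple type from the documented field order; the values are made up for illustration.

```python
import collections

AggregatedValues = collections.namedtuple(
    'AggregatedValues',
    'total pixel_mean hectare_mean n_pixels pixel_min pixel_max')

# Illustrative values only: one dictionary per statistic, keyed by feature id.
result = AggregatedValues(
    total={1: 10.0}, pixel_mean={1: 2.5}, hectare_mean={1: 0.4},
    n_pixels={1: 4}, pixel_min={1: 1.0}, pixel_max={1: 4.0})

# Fields can be read by name or by position ("field number").
same_object = result.hectare_mean is result[2]
```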

exception natcap.invest.pygeoprocessing_0_3_3.geoprocessing.DatasetUnprojected

Bases: exceptions.Exception

An exception in case a dataset is unprojected

exception natcap.invest.pygeoprocessing_0_3_3.geoprocessing.DifferentProjections

Bases: exceptions.Exception

An exception in case a set of datasets are not in the same projection

exception natcap.invest.pygeoprocessing_0_3_3.geoprocessing.SpatialExtentOverlapException

Bases: exceptions.Exception

An exception class for cases when datasets or datasources don’t overlap in space.
Used to raise an exception if rasters, shapefiles, or both don’t overlap in regions where they should.
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.aggregate_raster_values_uri(raster_uri, shapefile_uri, shapefile_field=None, ignore_nodata=True, all_touched=False, polygons_might_overlap=True)

Collect stats on pixel values which lie within shapefile polygons.

Parameters:
  • raster_uri (string) – a uri to a raster. In order for hectare mean values to be accurate, this raster must be projected in meter units.
  • shapefile_uri (string) – a uri to a OGR datasource that should overlap raster; raises an exception if not.
Keyword Arguments:
 
  • shapefile_field (string) – the name of an integer-valued field in shapefile whose values are used as the keys of the output dictionaries; if None, the dictionaries hold a value calculated over the entire shapefile region that intersects the raster.
  • ignore_nodata – if operation == ‘mean’ then it does not account for nodata pixels when determining the pixel_mean, otherwise all pixels in the AOI are used for calculation of the mean. This does not affect hectare_mean which is calculated from the geometrical area of the feature.
  • all_touched (boolean) – if true will account for any pixel whose geometry passes through the pixel, not just the center point
  • polygons_might_overlap (boolean) – if True the function calculates aggregation coverage close to optimally by rasterizing sets of polygons that don’t overlap. However, this step can be computationally expensive for cases where there are many polygons. Setting this flag to False directs the function to rasterize in one step.
Returns:

result_tuple – named tuple of the form (‘aggregate_values’, ‘total pixel_mean hectare_mean n_pixels pixel_min pixel_max’).

Each of [total pixel_mean hectare_mean pixel_min pixel_max] contains a dictionary that maps shapefile_field value to the total, pixel mean, hectare mean, pixel min, and pixel max of the values under that feature. ‘n_pixels’ contains the total number of valid pixels used in that calculation. hectare_mean is None if raster_uri is unprojected.

Return type:

tuple

Raises:
  • AttributeError
  • TypeError
  • OSError
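
The statistics can be pictured with the pure-Python sketch below, which works on flat, parallel lists of pixel values and feature ids in place of a rasterized shapefile. The function name, the input layout, and the choice of 0.0 for an empty mean are assumptions; only the meaning of ignore_nodata and the use of feature area for hectare_mean come from the description above.

```python
def aggregate_values(values, zone_ids, nodata, zone_area_m2,
                     ignore_nodata=True):
    """Collect per-feature stats from parallel lists of values and ids."""
    stats = {}
    for value, zone in zip(values, zone_ids):
        entry = stats.setdefault(zone, {
            'total': 0.0, 'n_pixels': 0, 'n_all': 0,
            'pixel_min': None, 'pixel_max': None})
        entry['n_all'] += 1
        if value == nodata:
            continue  # nodata never contributes to total, min or max
        entry['total'] += value
        entry['n_pixels'] += 1
        if entry['pixel_min'] is None or value < entry['pixel_min']:
            entry['pixel_min'] = value
        if entry['pixel_max'] is None or value > entry['pixel_max']:
            entry['pixel_max'] = value
    for zone, entry in stats.items():
        count = entry['n_pixels'] if ignore_nodata else entry['n_all']
        # Assumption: report 0.0 when there are no pixels to average.
        entry['pixel_mean'] = entry['total'] / count if count else 0.0
        # hectare_mean uses the feature's area; 1 hectare = 10,000 m^2.
        entry['hectare_mean'] = entry['total'] / (zone_area_m2[zone] / 10000.0)
    return stats
```
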
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.align_dataset_list(dataset_uri_list, dataset_out_uri_list, resample_method_list, out_pixel_size, mode, dataset_to_align_index, dataset_to_bound_index=None, aoi_uri=None, assert_datasets_projected=True, all_touched=False)
Create a new list of datasets that are aligned based on a list of input datasets.

Take a list of dataset uris and generate a new set that is completely aligned with identical projections and pixel sizes.

Parameters:
  • dataset_uri_list (list) – a list of input dataset uris
  • dataset_out_uri_list (list) – a parallel dataset uri list whose positions correspond to entries in dataset_uri_list
  • resample_method_list (list) – a list of resampling methods for each output uri in dataset_out_uri list. Each element must be one of “nearest|bilinear|cubic|cubic_spline|lanczos”
  • out_pixel_size – the output pixel size
  • mode (string) – one of “union”, “intersection”, or “dataset” which defines how the output extents are defined as either the union or intersection of the input datasets or to have the same bounds as an existing raster. If mode is “dataset” then dataset_to_bound_index must be defined
  • dataset_to_align_index (int) – an int that corresponds to a position in dataset_uri_list. If positive, the output rasters are aligned to fix on the upper left hand corner of the output datasets. If negative, the bounding box aligns the intersection/union without adjustment.
  • all_touched (boolean) – if True and an AOI is passed, the ALL_TOUCHED=TRUE option is passed to the RasterizeLayer function when determining the mask of the AOI.
Keyword Arguments:
 
  • dataset_to_bound_index – if mode is “dataset” then this index is used to indicate which dataset to define the output bounds of the dataset_out_uri_list
  • aoi_uri (string) – a URI to an OGR datasource to be used for the aoi. Irrespective of the mode input, the aoi will be used to intersect the final bounding box.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.assert_datasets_in_same_projection(dataset_uri_list)

Assert that provided datasets are all in the same projection.

Tests if datasets represented by their uris are projected and in the same projection and raises an exception if not.

Parameters:

dataset_uri_list (list) – (description)

Returns:

is_true – True (otherwise exception raised)

Return type:

boolean

Raises:
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.assert_file_existance(dataset_uri_list)

Assert that provided uris exist in filesystem.

Verify that the uris passed in the argument exist on the filesystem; if not, raise an exception indicating which files do not exist.

Parameters:dataset_uri_list (list) – a list of relative or absolute file paths to validate
Returns:None
Raises:IOError – if any files are not found
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.calculate_disjoint_polygon_set(shapefile_uri)

Create a list of sets of polygons that don’t overlap.

Determining the minimal number of those sets is an NP-complete problem, so this is an approximation that builds up sets of maximal subsets.

Parameters:shapefile_uri (string) – a uri to an OGR shapefile to process
Returns:subset_list – list of sets of FIDs from shapefile_uri
Return type:list
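
A greedy approximation of this kind can be sketched as follows. The input here is a precomputed mapping from each feature id to the ids it overlaps, which is an assumption made to keep the example free of OGR; the library works from the shapefile itself and may order features differently.

```python
def calculate_disjoint_sets(overlaps):
    """overlaps maps each feature id to the set of ids it intersects."""
    remaining = set(overlaps)
    subset_list = []
    while remaining:
        subset = set()
        for fid in sorted(remaining):
            # Add fid only if it overlaps nothing already in this subset,
            # so the subset is maximal with respect to `remaining`.
            if not (overlaps[fid] & subset):
                subset.add(fid)
        subset_list.append(subset)
        remaining -= subset
    return subset_list
```

With features 0-1-2 overlapping in a chain and feature 3 isolated, this yields the sets {0, 2, 3} and {1}.
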
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.calculate_intersection_rectangle(dataset_list, aoi=None)

Return bounding box of the intersection of all rasters in the list.

Parameters:dataset_list (list) – a list of GDAL datasets in the same projection and coordinate system
Keyword Arguments:
 aoi – an OGR polygon datasource which may optionally also restrict the extents of the intersection rectangle based on its own extents.
Returns:bounding_box – a 4 element list that bounds the intersection of
all the rasters in dataset_list. [left, top, right, bottom]
Return type:list
Raises:SpatialExtentOverlapException – in cases where the dataset list and aoi don’t overlap.
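
The intersection itself reduces to a few min/max operations on the [left, top, right, bottom] boxes. The sketch below works on plain lists and uses a stand-in exception class; it assumes north-up coordinates where top is greater than bottom.

```python
class SpatialExtentOverlapException(Exception):
    """Stand-in for the documented exception."""


def intersect_bounding_boxes(bounding_boxes):
    """Each box is [left, top, right, bottom] with top > bottom."""
    left, top, right, bottom = bounding_boxes[0]
    for box in bounding_boxes[1:]:
        left = max(left, box[0])
        top = min(top, box[1])
        right = min(right, box[2])
        bottom = max(bottom, box[3])
    # An empty or inverted rectangle means the inputs share no area.
    if left >= right or bottom >= top:
        raise SpatialExtentOverlapException(
            'Bounding boxes do not overlap: %r' % (bounding_boxes,))
    return [left, top, right, bottom]
```
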
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.calculate_raster_stats_uri(dataset_uri)

Calculate min, max, stdev, and mean for all bands in dataset.

Parameters:dataset_uri (string) – a uri to a GDAL raster dataset that will be modified by having its band statistics set
Returns:None
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.calculate_slope(dem_dataset_uri, slope_uri, aoi_uri=None, process_pool=None)

Create slope raster from DEM raster.

Follows the algorithm described here: http://webhelp.esri.com/arcgiSDEsktop/9.3/index.cfm?TopicName=How%20Slope%20works

Parameters:
  • dem_dataset_uri (string) – a URI to a single band raster of z values.
  • slope_uri (string) – a path to the output slope uri in percent.
Keyword Arguments:
 
  • aoi_uri (string) – a uri to an AOI input
  • process_pool – a process pool for multiprocessing
Returns:

None
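
For reference, the ESRI slope method linked above uses a weighted finite difference over each 3x3 neighborhood. The sketch below computes percent slope for a single window; it is an illustration of that published formula, assumes square cells, and leaves out the edge and nodata handling a full raster implementation needs.

```python
import math


def slope_percent(window, cell_size):
    """window is a 3x3 list of z values: [[a, b, c], [d, e, f], [g, h, i]]."""
    (a, b, c), (d, _, f), (g, h, i) = window
    # Rate of change in x and y, weighting the middle row/column twice.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)
    # Percent slope is rise over run times 100.
    return 100.0 * math.sqrt(dz_dx ** 2 + dz_dy ** 2)
```

A plane that rises one unit per cell in x, such as [[0, 1, 2], [0, 1, 2], [0, 1, 2]] with a cell size of 1, has a slope of 100 percent.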

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.clip_dataset_uri(source_dataset_uri, aoi_datasource_uri, out_dataset_uri, assert_projections=True, process_pool=None, all_touched=False)

Clip raster dataset to bounding box of provided vector datasource aoi.

This function will clip source_dataset to the bounding box of the polygons in aoi_datasource and mask out the values in source_dataset outside of the AOI with the nodata values in source_dataset.

Parameters:
  • source_dataset_uri (string) – uri to single band GDAL dataset to clip
  • aoi_datasource_uri (string) – uri to ogr datasource
  • out_dataset_uri (string) – path to disk for the clipped dataset
Keyword Arguments:
 
  • assert_projections (boolean) – a boolean value for whether the dataset needs to be projected
  • process_pool – a process pool for multiprocessing
  • all_touched (boolean) – if true the clip uses the option ALL_TOUCHED=TRUE when calling RasterizeLayer for AOI masking.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.convolve_2d_uri(signal_path, kernel_path, output_path)

Convolve 2D kernel over 2D signal.

Convolves the raster in kernel_path over signal_path. Nodata values are treated as 0.0 during the convolution and masked to nodata for the output result where signal_path has nodata.

Parameters:
  • signal_path (string) – a filepath to a gdal dataset that’s the source input.
  • kernel_path (string) – a filepath to a gdal dataset that’s the kernel to convolve over the signal.
  • output_path (string) – a filepath to the gdal dataset that’s the convolution output of signal and kernel that is the same size and projection of signal_path. Any nodata pixels that align with signal_path will be set to nodata.
Returns:

None
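
The nodata rules can be shown on small nested lists. This sketch is a direct (slow) convolution written for illustration; centering the kernel on each pixel and ignoring nodata in the kernel are assumptions, and the library is free to use a faster method.

```python
def convolve_2d(signal, kernel, nodata):
    n_rows, n_cols = len(signal), len(signal[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    # Assumption: the kernel is centered on each output pixel.
    row_center, col_center = k_rows // 2, k_cols // 2
    output = [[nodata] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        for col in range(n_cols):
            if signal[row][col] == nodata:
                continue  # output stays nodata where the signal is nodata
            total = 0.0
            for k_row in range(k_rows):
                for k_col in range(k_cols):
                    # Convolution flips the kernel, hence the minus sign.
                    s_row = row - (k_row - row_center)
                    s_col = col - (k_col - col_center)
                    if 0 <= s_row < n_rows and 0 <= s_col < n_cols:
                        value = signal[s_row][s_col]
                        if value != nodata:  # nodata contributes 0.0
                            total += value * kernel[k_row][k_col]
            output[row][col] = total
    return output
```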

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.copy_datasource_uri(shape_uri, copy_uri)

Create a copy of an ogr shapefile.

Parameters:
  • shape_uri (string) – a uri path to the ogr shapefile that is to be copied
  • copy_uri (string) – a uri path for the destination of the copied shapefile
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.create_directories(directory_list)

Make directories provided in list of path strings.

This function will create any of the directories in the directory list if possible and raise an exception if any error other than the directory already existing occurs.

Parameters:directory_list (list) – a list of string uri paths
Returns:None
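
A common way to get this behaviour is to attempt the creation and tolerate only the "already exists" error. The sketch below assumes that approach; it is not taken from the library.

```python
import errno
import os


def create_directories(directory_list):
    for directory in directory_list:
        try:
            os.makedirs(directory)
        except OSError as error:
            # An already-existing directory is fine; anything else
            # (permissions, a file in the way, ...) is re-raised.
            if error.errno != errno.EEXIST or not os.path.isdir(directory):
                raise
```
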
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.create_raster_from_vector_extents(xRes, yRes, format, nodata, rasterFile, shp)

Create a blank raster based on a vector file extent.

This code is adapted from http://trac.osgeo.org/gdal/wiki/FAQRaster#HowcanIcreateablankrasterbasedonavectorfilesextentsforusewithgdal_rasterizeGDAL1.8.0

Parameters:
  • xRes – the x size of a pixel in the output dataset must be a positive value
  • yRes – the y size of a pixel in the output dataset must be a positive value
  • format – gdal GDT pixel type
  • nodata – the output nodata value
  • rasterFile (string) – URI to file location for raster
  • shp – vector shapefile to base extent of output raster on
Returns:

blank raster whose bounds fit within `shp`s bounding box and features are equivalent to the passed in data

Return type:

raster

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.create_raster_from_vector_extents_uri(shapefile_uri, pixel_size, gdal_format, nodata_out_value, output_uri)

Create a blank raster based on a vector file extent.

A wrapper for create_raster_from_vector_extents

Parameters:
  • shapefile_uri (string) – uri to an OGR datasource to use as the extents of the raster
  • pixel_size – size of output pixels in the projected units of shapefile_uri
  • gdal_format – the raster pixel format, something like gdal.GDT_Float32
  • nodata_out_value – the output nodata value
  • output_uri (string) – the URI to write the gdal dataset
Returns:

dataset – gdal dataset

Return type:

gdal.Dataset

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.create_rat(dataset, attr_dict, column_name)

Create a raster attribute table.

Parameters:
  • dataset – a GDAL raster dataset to create the RAT for (...)
  • attr_dict (dict) – a dictionary with keys that point to a primitive type {integer_id_1: value_1, ... integer_id_n: value_n}
  • column_name (string) – a string for the column name that maps the values
Returns:

dataset – a GDAL raster dataset with an updated RAT

Return type:

gdal.Dataset

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.create_rat_uri(dataset_uri, attr_dict, column_name)

Create a raster attribute table.

URI wrapper for create_rat.

Parameters:
  • dataset_uri (string) – a GDAL raster dataset to create the RAT for (...)
  • attr_dict (dict) – a dictionary with keys that point to a primitive type {integer_id_1: value_1, ... integer_id_n: value_n}
  • column_name (string) – a string for the column name that maps the values
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.dictionary_to_point_shapefile(dict_data, layer_name, output_uri)

Create a point shapefile from a dictionary.

The point shapefile created is not projected and uses latitude and
longitude for its geometry.
Parameters:
  • dict_data (dict) – a python dictionary with keys being unique ids that point to sub-dictionaries that have key-value pairs. These inner key-value pairs will represent the field-value pairs for the point features. At least two fields are required in the sub-dictionaries: ‘lati’ and ‘long’. These fields determine the geometry of the point. All the keys in the sub-dictionaries should have the same name and order, and all the values for a given key should have the same type. For example: 0 : {‘lati’:97, ‘long’:43, ‘field_a’:6.3, ‘field_b’:’Forest’,...}, 1 : {‘lati’:55, ‘long’:51, ‘field_a’:6.2, ‘field_b’:’Crop’,...}, 2 : {‘lati’:73, ‘long’:47, ‘field_a’:6.5, ‘field_b’:’Swamp’,...}
  • layer_name (string) – a python string for the name of the layer
  • output_uri (string) – a uri for the output path of the point shapefile
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.distance_transform_edt(input_mask_uri, output_distance_uri, process_pool=None)

Find the Euclidean distance transform on input_mask_uri and output the result as raster.

Parameters:
  • input_mask_uri (string) – a gdal raster to calculate distance from the non 0 value pixels
  • output_distance_uri (string) – will make a float raster w/ same dimensions and projection as input_mask_uri where all zero values of input_mask_uri are equal to the euclidean distance to the closest non-zero pixel.
Keyword Arguments:
 

process_pool – (description)

Returns:

None
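
What the output raster contains can be seen from a brute-force sketch on a small nested list. Distances here are in pixel units and the mask is assumed to contain at least one non-zero pixel; a real implementation would use a much faster algorithm.

```python
import math


def distance_transform(mask):
    # Coordinates of every non-zero (target) pixel.
    targets = [(row, col)
               for row, line in enumerate(mask)
               for col, value in enumerate(line) if value != 0]
    result = []
    for row, line in enumerate(mask):
        result.append([
            # Non-zero pixels are at distance 0; zero pixels get the
            # Euclidean distance to the closest non-zero pixel.
            0.0 if value != 0 else min(
                math.hypot(row - t_row, col - t_col)
                for t_row, t_col in targets)
            for col, value in enumerate(line)])
    return result
```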

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.extract_datasource_table_by_key(datasource_uri, key_field)

Return vector attribute table of first layer as dictionary.

Create a dictionary lookup table of the features in the attribute table of the datasource referenced by datasource_uri.

Parameters:
  • datasource_uri (string) – a uri to an OGR datasource
  • key_field – a field in datasource_uri that refers to a key value for each row such as a polygon id.
Returns:

attribute_dictionary – a dictionary of the form {key_field_0: {field_0: value0, field_1: value1}...}

Return type:

dict

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_bounding_box(dataset_uri)

Get bounding box where coordinates are in projected units.

Parameters:dataset_uri (string) – a uri to a GDAL dataset
Returns:bounding_box

[upper_left_x, upper_left_y, lower_right_x, lower_right_y] in projected coordinates

Return type:list
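
For a north-up raster, the four values follow directly from the GDAL geotransform and the raster size. The helper below is a hypothetical illustration that takes those values as arguments and assumes no rotation terms.

```python
def bounding_box_from_geotransform(geotransform, n_cols, n_rows):
    # GDAL order: (origin_x, pixel_width, row_rotation,
    #              origin_y, column_rotation, pixel_height)
    origin_x, pixel_width, _, origin_y, _, pixel_height = geotransform
    return [
        origin_x,                          # upper_left_x
        origin_y,                          # upper_left_y
        origin_x + n_cols * pixel_width,   # lower_right_x
        origin_y + n_rows * pixel_height,  # lower_right_y (height is negative)
    ]
```
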
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_cell_size_from_uri(dataset_uri)

Get the cell size of a dataset in units of meters.

Raises an exception if the raster is not square since this’ll break most of the pygeoprocessing algorithms.

Parameters:dataset_uri (string) – uri to a gdal dataset
Returns:cell size of the dataset in meters
Return type:size_meters
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_dataset_projection_wkt_uri(dataset_uri)

Get the projection of a GDAL dataset as well known text (WKT).

Parameters:dataset_uri (string) – a URI for the GDAL dataset
Returns:proj_wkt – WKT describing the GDAL dataset projection
Return type:string
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_datasource_bounding_box(datasource_uri)

Get datasource bounding box where coordinates are in projected units.

Parameters:datasource_uri (string) – a uri to an OGR datasource
Returns:bounding_box

[upper_left_x, upper_left_y, lower_right_x, lower_right_y] in projected coordinates

Return type:list
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_datatype_from_uri(dataset_uri)

Return datatype for first band in gdal dataset.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:datatype for dataset band 1
Return type:datatype
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_geotransform_uri(dataset_uri)

Get the geotransform from a gdal dataset.

Parameters:dataset_uri (string) – a URI for the dataset
Returns:a dataset geotransform list
Return type:geotransform
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_lookup_from_csv(csv_table_uri, key_field)

Read CSV table file in as dictionary.

Creates a python dictionary to look up the rest of the fields in a csv table indexed by the given key_field

Parameters:
  • csv_table_uri (string) – a URI to a csv file containing at least the header key_field
  • key_field – (description)
Returns:

lookup_dict – a dictionary of the form {key_field_0: {header_1: val_1_0, header_2: val_2_0, etc.}}, depending on the values of those fields

Return type:

dict
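
The shape of the returned lookup can be sketched with the standard csv module. The helper name is hypothetical and it reads from a file object where the documented function takes a URI; values stay as strings here, whereas the library may cast them.

```python
import csv
import io


def lookup_from_csv(csv_file, key_field):
    lookup = {}
    for row in csv.DictReader(csv_file):
        # Index each row by its key_field value; the row keeps all fields.
        lookup[row[key_field]] = dict(row)
    return lookup


example = io.StringIO('lucode,name,depth\n1,forest,0.5\n2,crop,0.2\n')
lookup_dict = lookup_from_csv(example, 'lucode')
```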

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_lookup_from_table(table_uri, key_field)

Read table file in as dictionary.

Creates a python dictionary to look up the rest of the fields in a table file indexed by the given key_field. This function is case insensitive to field header names and returns a lookup table with lowercase keys.

Parameters:
  • table_uri (string) – a URI to a dbf or csv file containing at least the header key_field
  • key_field – (description)
Returns:

lookup_dict – a dictionary of the form {key_field_0: {header_1: val_1_0, header_2: val_2_0, etc.}} where key_field_n is the lowercase version of the column name.

Return type:

dict

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_nodata_from_uri(dataset_uri)

Return nodata value from first band in gdal dataset cast as numpy datatype.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:nodata value for dataset band 1
Return type:nodata
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_raster_properties(dataset)

Get width, height, X size, and Y size of the dataset as dictionary.

This function can be expanded to return more properties if needed

Parameters:dataset (gdal.Dataset) – a GDAL raster dataset to get the properties from
Returns:dataset_dict – a dictionary with the properties stored
under relevant keys. The current list of things returned is: width (w-e pixel resolution), height (n-s pixel resolution), XSize, YSize
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_raster_properties_uri(dataset_uri)

Get width, height, X size, and Y size of the dataset as dictionary.

Wrapper function for get_raster_properties() that passes in the dataset URI instead of the dataset itself

Parameters:dataset_uri (string) – a URI to a GDAL raster dataset
Returns:value – a dictionary with the properties stored under
relevant keys. The current list of things returned is: width (w-e pixel resolution), height (n-s pixel resolution), XSize, YSize
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_rat_as_dictionary(dataset)

Get Raster Attribute Table of the first band of dataset as a dictionary.

Parameters:dataset (gdal.Dataset) – a GDAL dataset that has a RAT associated with the first band
Returns:rat_dictionary – a 2D dictionary where the first key is the
column name and second is the row number
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_rat_as_dictionary_uri(dataset_uri)

Get Raster Attribute Table of the first band of dataset as a dictionary.

Parameters:dataset_uri (string) – a URI to a GDAL dataset that has a RAT associated with the first band
Returns:value – a 2D dictionary where the first key is the column
name and second is the row number
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_row_col_from_uri(dataset_uri)

Return number of rows and columns of given dataset uri as tuple.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:rows_cols – 2-tuple (n_row, n_col) from dataset_uri
Return type:tuple
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_spatial_ref_uri(datasource_uri)

Get the spatial reference of an OGR datasource.

Parameters:datasource_uri (string) – a URI to an ogr datasource
Returns:a spatial reference
Return type:spat_ref
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.get_statistics_from_uri(dataset_uri)

Get the min, max, mean, stdev from first band in a GDAL Dataset.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:statistics – min, max, mean, stddev
Return type:tuple
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.iterblocks(raster_uri, band_list=None, largest_block=1048576, astype=None, offset_only=False)

Iterate across all the memory blocks in the input raster.

Result is a generator of block location information and numpy arrays.

This is especially useful when a single value needs to be derived from the pixel values in a raster, such as the sum total of all pixel values, or a sequence of unique raster values. In such cases, raster_local_op is overkill, since it writes out a raster.

As a generator, this can be combined multiple times with itertools.izip() to iterate ‘simultaneously’ over multiple rasters, though the user should be careful to do so only with prealigned rasters.

Parameters:
  • raster_uri (string) – The string filepath to the raster to iterate over.
  • band_list (list of ints or None) – A list of the band numbers for which the matrices should be returned. Defaults to None, which will return all bands. Bands may be specified in any order, and band indexes may be specified multiple times. The blocks returned on each iteration will be in the order specified in this list.
  • largest_block (int) – Attempts to iterate over raster blocks with this many elements. Useful in cases where the blocksize is relatively small, memory is available, and the function call overhead dominates the iteration. Defaults to 2**20. A value of anything less than the original blocksize of the raster will result in blocksizes equal to the original size.
  • astype (list of numpy types) – If none, output blocks are in the native type of the raster bands. Otherwise this parameter is a list of len(band_list) length that contains the desired output types that iterblock generates for each band.
  • offset_only (boolean) – defaults to False. If True, iterblocks only returns the offset dictionary and doesn’t read any binary data from the raster. This can be useful when iterating over blocks in order to write to an output.
Returns:

If offset_only is False, on each iteration a tuple is returned containing a dict of block data and n 2-dimensional numpy arrays, where n is the number of bands requested via band_list. The dict of block data has these attributes:

data[‘xoff’] - The X offset of the upper-left-hand corner of the block.
data[‘yoff’] - The Y offset of the upper-left-hand corner of the block.
data[‘win_xsize’] - The width of the block.
data[‘win_ysize’] - The height of the block.

If offset_only is True, the function returns only the block data and does not attempt to read binary data from the raster.
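
The offset dictionaries and the "derive a single value" use case can be sketched without GDAL. Both functions below are hypothetical stand-ins that treat a nested list as the raster and take the block size as an argument; they only mirror the documented dictionary keys.

```python
def iter_block_offsets(n_cols, n_rows, block_xsize, block_ysize):
    for yoff in range(0, n_rows, block_ysize):
        for xoff in range(0, n_cols, block_xsize):
            yield {
                'xoff': xoff,
                'yoff': yoff,
                # Blocks on the right and bottom edges may be smaller.
                'win_xsize': min(block_xsize, n_cols - xoff),
                'win_ysize': min(block_ysize, n_rows - yoff),
            }


def blockwise_sum(raster, block_xsize, block_ysize):
    """raster is a list of rows; returns the sum of all pixel values."""
    total = 0
    for data in iter_block_offsets(
            len(raster[0]), len(raster), block_xsize, block_ysize):
        rows = raster[data['yoff']:data['yoff'] + data['win_ysize']]
        for row in rows:
            total += sum(row[data['xoff']:data['xoff'] + data['win_xsize']])
    return total
```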

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.load_memory_mapped_array(dataset_uri, memory_file, array_type=None)

Get the first band of a dataset as a memory mapped array.

Parameters:
  • dataset_uri (string) – the GDAL dataset to load into a memory mapped array
  • memory_file (string) – a path to a file OR a file-like object that will be used to hold the memory map. It is up to the caller to create and delete this file.
Keyword Arguments:
 

array_type – the type of the resulting array, if None defaults to the type of the raster band in the dataset

Returns:

memory_array – a memmap numpy array of the data contained in the first band of dataset_uri

Return type:

memmap numpy array

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.make_constant_raster_from_base_uri(base_dataset_uri, constant_value, out_uri, nodata_value=None, dataset_type=gdal.GDT_Float32)

Create new gdal raster filled with uniform values.

A helper function that creates a new gdal raster from base, and fills it with the constant value provided.

Parameters:
  • base_dataset_uri (string) – the gdal base raster
  • constant_value – the value to fill the new raster with
  • out_uri (string) – the uri of the output raster
Keyword Arguments:
 
  • nodata_value – the value to set the constant raster’s nodata value to. If not specified, it will be set to constant_value - 1.0
  • dataset_type – the datatype of the new raster; defaults to a 32-bit float.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.new_raster(cols, rows, projection, geotransform, format, nodata, datatype, bands, outputURI)

Create a new raster with the given properties.

Parameters:
  • cols (int) – number of pixel columns
  • rows (int) – number of pixel rows
  • projection – the datum
  • geotransform – the coordinate system
  • format (string) – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
  • bands (int) – the number of bands in the raster
  • outputURI (string) – the file location for the output raster. If format is ‘MEM’ this can be an empty string
Returns:

a new GDAL raster with the parameters as described above

Return type:

dataset

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.new_raster_from_base(base, output_uri, gdal_format, nodata, datatype, fill_value=None, n_rows=None, n_cols=None, dataset_options=None)

Create a new, empty GDAL raster dataset with the spatial reference and geotransform of the base GDAL raster dataset.

Parameters:
  • base – the GDAL raster dataset to base the output size and transforms on
  • output_uri (string) – a string URI to the new output raster dataset.
  • gdal_format (string) – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
Keyword Arguments:
 
  • fill_value – the value to fill in the raster on creation
  • n_rows – if set, the resulting raster has n_rows rows; if not, the number of rows in the output dataset equals that of the base.
  • n_cols – similar to n_rows, but for the columns.
  • dataset_options – a list of dataset options that gets passed to the gdal creation driver, overrides defaults
Returns:

a new GDAL raster dataset.

Return type:

dataset

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.new_raster_from_base_uri(base_uri, output_uri, gdal_format, nodata, datatype, fill_value=None, n_rows=None, n_cols=None, dataset_options=None)

Create a new, empty GDAL raster dataset with the spatial reference and geotransform of the base GDAL raster dataset.

A wrapper for the function new_raster_from_base that opens up the base_uri before passing it to new_raster_from_base.

Parameters:
  • base_uri (string) – a URI to a GDAL dataset on disk.
  • output_uri (string) – a string URI to the new output raster dataset.
  • gdal_format (string) – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
Keyword Arguments:
 
  • fill_value – the value to fill in the raster on creation
  • n_rows – if set, the resulting raster has n_rows rows; if not, the number of rows in the output dataset equals that of the base.
  • n_cols – similar to n_rows, but for the columns.
  • dataset_options – a list of dataset options that gets passed to the gdal creation driver, overrides defaults
Returns:

nothing

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.pixel_size_based_on_coordinate_transform(dataset, coord_trans, point)

Get width and height of cell in meters.

Calculates the pixel width and height in meters given a coordinate transform and reference point on the dataset that’s close to the transform’s projected coordinate system. This is only necessary if dataset is not already in a meter coordinate system, for example dataset may be in lat/long (WGS84).

Parameters:
  • dataset (gdal.Dataset) – a GDAL dataset whose coordinates are in lat/long decimal degrees
  • coord_trans (osr.CoordinateTransformation) – an OSR coordinate transformation from dataset coordinate system to meters
  • point (tuple) – a reference point close to the coordinate transform coordinate system. Must be in the same coordinate system as dataset.
Returns:

pixel_diff – a 2-tuple containing (pixel width in meters, pixel height in meters)

Return type:

tuple

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.pixel_size_based_on_coordinate_transform_uri(dataset_uri, *args, **kwargs)

Get width and height of cell in meters.

A wrapper for pixel_size_based_on_coordinate_transform that takes a dataset uri as an input and opens it before sending it along.

Parameters:
  • dataset_uri (string) – a URI to a gdal dataset
  • *args, **kwargs – all other parameters are passed along to pixel_size_based_on_coordinate_transform
Returns:

result – (pixel_width_meters, pixel_height_meters)

Return type:

tuple

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.rasterize_layer_uri(raster_uri, shapefile_uri, burn_values=[], option_list=[])

Rasterize datasource layer.

Burn the layer from ‘shapefile_uri’ onto the raster from ‘raster_uri’. Will burn ‘burn_value’ onto the raster unless ‘field’ is not None, in which case it will burn the value from the shapefile’s field.

Parameters:
  • raster_uri (string) – a URI to a gdal dataset
  • shapefile_uri (string) – a URI to an ogr datasource
Keyword Arguments:
 
  • burn_values (list) – the equivalent value for burning into a polygon. If empty uses the Z values.
  • option_list (list) – a Python list of options for the operation. Example: ["ATTRIBUTE=NPV", "ALL_TOUCHED=TRUE"]
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.reclassify_dataset_uri(dataset_uri, value_map, raster_out_uri, out_datatype, out_nodata, exception_flag='values_required', assert_dataset_projected=True)

Reclassify values in a dataset.

A function to reclassify values in dataset to any output type. By default the values except for nodata must be in value_map.

Parameters:
  • dataset_uri (string) – a uri to a gdal dataset
  • value_map (dictionary) – a dictionary of values of {source_value: dest_value, ...} where source_value’s type is a positive integer type and dest_value is of type out_datatype.
  • raster_out_uri (string) – the uri for the output raster
  • out_datatype (gdal type) – the type for the output dataset
  • out_nodata (numerical type) – the nodata value for the output raster. Must be the same type as out_datatype
Keyword Arguments:
 
  • exception_flag (string) – either ‘none’ or ‘values_required’. If ‘values_required’ raise an exception if there is a value in the raster that is not found in value_map
  • assert_dataset_projected (boolean) – if True this operation will test if the input dataset is not projected and raise an exception if so.
Returns:

nothing

Raises:

Exception – if exception_flag == ‘values_required’ and a value in the raster is not a key in value_map
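The reclassification rule can be sketched on a plain nested list. This is a minimal illustration rather than the library's code: reclassify_sketch and its in_nodata argument are hypothetical, ValueError stands in for the generic Exception, and the handling of unmapped values when exception_flag is ‘none’ is an assumption of this sketch only.

```python
def reclassify_sketch(pixel_rows, value_map, in_nodata, out_nodata,
                      exception_flag='values_required'):
    """Map every pixel through value_map (hypothetical stand-in helper)."""
    reclassified = []
    for row in pixel_rows:
        out_row = []
        for value in row:
            if value == in_nodata:
                # Nodata pixels never need an entry in value_map.
                out_row.append(out_nodata)
            elif value in value_map:
                out_row.append(value_map[value])
            elif exception_flag == 'values_required':
                raise ValueError(
                    'value %r is not a key in value_map' % (value,))
            else:
                # Assumption of this sketch only: unmapped values
                # become the output nodata value.
                out_row.append(out_nodata)
        reclassified.append(out_row)
    return reclassified


landcover = [[1, 2, 255], [2, 3, 1]]
carbon = reclassify_sketch(
    landcover, {1: 0.5, 2: 1.5, 3: 4.0}, in_nodata=255, out_nodata=-1.0)
print(carbon)
```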

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.reproject_dataset_uri(original_dataset_uri, pixel_spacing, output_wkt, resampling_method, output_uri)

Reproject and resample GDAL dataset.

A function to reproject and resample a GDAL dataset given an output pixel size and output reference. Will use the datatype and nodata value from the original dataset.

Parameters:
  • original_dataset_uri (string) – a URI to a GDAL dataset on disk
  • pixel_spacing – output dataset pixel size in projected linear units
  • output_wkt – output projection in Well Known Text
  • resampling_method (string) – a string naming one of the following resampling methods: “nearest|bilinear|cubic|cubic_spline|lanczos”
  • output_uri (string) – location on disk to dump the reprojected dataset
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.reproject_datasource(original_datasource, output_wkt, output_uri)

Reproject OGR DataSource object.

Changes the projection of an ogr datasource by creating a new shapefile based on the output_wkt passed in. The new shapefile then copies all the features and fields of the original_datasource as its own.

Parameters:
  • original_datasource – an ogr datasource
  • output_wkt – the desired projection as Well Known Text (by layer.GetSpatialRef().ExportToWkt())
  • output_uri (string) – the filepath to the output shapefile
Returns:

the reprojected shapefile.

Return type:

output_datasource

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.reproject_datasource_uri(original_dataset_uri, output_wkt, output_uri)

Reproject OGR DataSource file.

URI wrapper for reproject_datasource that takes in the uri for the datasource that is to be projected instead of the datasource itself. This function directly calls reproject_datasource.

Parameters:
  • original_dataset_uri (string) – a uri to an ogr datasource
  • output_wkt – the desired projection as Well Known Text (by layer.GetSpatialRef().ExportToWkt())
  • output_uri (string) – the path to where the new shapefile should be written to disk.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.resize_and_resample_dataset_uri(original_dataset_uri, bounding_box, out_pixel_size, output_uri, resample_method)

Resize and resample the given dataset.

Parameters:
  • original_dataset_uri (string) – a GDAL dataset
  • bounding_box (list) – [upper_left_x, upper_left_y, lower_right_x, lower_right_y]
  • out_pixel_size – the pixel size in projected linear units
  • output_uri (string) – the location of the new resampled GDAL dataset
  • resample_method (string) – the resampling technique, one of “nearest|bilinear|cubic|cubic_spline|lanczos”
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.temporary_filename(suffix='')

Get path to new temporary file that will be deleted on program exit.

Returns a temporary filename using mkstemp. The file is deleted on exit using the atexit register.

Keyword Arguments:
 suffix (string) – the suffix to be appended to the temporary file
Returns: a unique temporary filename
Return type: fname
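
The mkstemp-plus-atexit pattern mentioned above can be sketched with the standard library alone. The name temporary_filename_sketch and the inner cleanup helper are hypothetical; the actual function may differ in details such as error handling.

```python
import atexit
import os
import tempfile


def temporary_filename_sketch(suffix=''):
    """Create an empty temp file and schedule its removal at exit."""
    file_handle, path = tempfile.mkstemp(suffix=suffix)
    # Only the path is needed, so close the OS-level handle right away.
    os.close(file_handle)

    def remove_file(path_to_remove):
        """Delete the file if it still exists; ignore failures at exit."""
        try:
            os.remove(path_to_remove)
        except OSError:
            pass

    atexit.register(remove_file, path)
    return path


tmp_path = temporary_filename_sketch(suffix='.tif')
print(tmp_path)
```

Closing the handle immediately matters on platforms where an open handle would prevent other code from reopening or deleting the file.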
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.temporary_folder()

Get path to new temporary folder that will be deleted on program exit.

Returns a temporary folder using mkdtemp. The folder is deleted on exit using the atexit register.

Returns: path – an absolute, unique and temporary folder path.
Return type: string
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.tile_dataset_uri(in_uri, out_uri, blocksize)
Resample gdal dataset into tiled raster with blocks of blocksize X blocksize.
Parameters:
  • in_uri (string) – dataset to base data from
  • out_uri (string) – output dataset
  • blocksize (int) – the side length of each square block in the raster; this seems to have a lower limit of 16, but is untested
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.transform_bounding_box(bounding_box, base_ref_wkt, new_ref_wkt, edge_samples=11)

Transform input bounding box to output projection.

This transform accounts for the fact that the reprojected square bounding box might be warped in the new coordinate system. To account for this, the function samples points along the original bounding box edges and attempts to make the largest bounding box around any transformed point on the edge whether corners or warped edges.

Parameters:
  • bounding_box (list) – a list of 4 coordinates in the base_ref_wkt coordinate system describing the bound in the order [xmin, ymin, xmax, ymax]
  • base_ref_wkt (string) – the spatial reference of the input coordinate system in Well Known Text.
  • new_ref_wkt (string) – the spatial reference of the desired output coordinate system in Well Known Text.
  • edge_samples (int) – the number of interpolated points along each bounding box edge to sample along. A value of 2 will sample just the corners while a value of 3 will sample the corners and the midpoint.
Returns:

A list of the form [xmin, ymin, xmax, ymax] that describes the largest fitting bounding box around the original warped bounding box in the new_ref_wkt coordinate system.
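
The edge-sampling idea can be sketched without any projection library. In this illustration transform_point is a hypothetical callable standing in for the coordinate transformation between the two spatial references, and warp is a made-up distortion chosen so that an edge midpoint bulges past the corners.

```python
def transform_bounding_box_sketch(bounding_box, transform_point,
                                  edge_samples=11):
    """Bound the transformed edges of [xmin, ymin, xmax, ymax].

    edge_samples must be at least 2 (corners only).
    """
    xmin, ymin, xmax, ymax = bounding_box
    fractions = [i / float(edge_samples - 1) for i in range(edge_samples)]
    edge_points = []
    for fraction in fractions:
        x = xmin + fraction * (xmax - xmin)
        y = ymin + fraction * (ymax - ymin)
        # One sample on each of the bottom, top, left and right edges.
        edge_points.extend([(x, ymin), (x, ymax), (xmin, y), (xmax, y)])
    transformed = [transform_point(px, py) for px, py in edge_points]
    xs = [point[0] for point in transformed]
    ys = [point[1] for point in transformed]
    return [min(xs), min(ys), max(xs), max(ys)]


def warp(x, y):
    """Made-up transform that pushes points upward away from x=0 and x=1."""
    return (x, y + x * (1 - x))


corners_only = transform_bounding_box_sketch([0, 0, 1, 1], warp, 2)
with_midpoints = transform_bounding_box_sketch([0, 0, 1, 1], warp, 3)
print(corners_only, with_midpoints)
```

Sampling only the corners misses the bulge along the top edge, which is why more samples give a box that better encloses the warped shape.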

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.unique_raster_values(dataset)

Get list of unique integer values within given dataset.

Parameters: dataset – a gdal dataset of some integer type
Returns: unique_list – a list of dataset’s unique non-nodata values
Return type: list
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.unique_raster_values_count(dataset_uri, ignore_nodata=True)

Return a dict from unique int values in the dataset to their frequency.

Parameters: dataset_uri (string) – uri to a gdal dataset of some integer type
Keyword Arguments:
 ignore_nodata (boolean) – if set to False, the nodata count is also included in the result
Returns: itemfreq – a dictionary mapping each unique value to its count.
Return type: dict
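
The value-to-frequency result can be pictured with a small standard-library sketch on a nested list. unique_values_count_sketch and its explicit nodata argument are hypothetical; the real function reads the pixels and nodata value from the dataset.

```python
from collections import Counter


def unique_values_count_sketch(pixel_rows, nodata, ignore_nodata=True):
    """Return a dict mapping each pixel value to how often it occurs."""
    counts = Counter()
    for row in pixel_rows:
        counts.update(row)
    if ignore_nodata:
        # Drop the nodata entry if present.
        counts.pop(nodata, None)
    return dict(counts)


pixels = [[1, 1, 2], [2, 2, -1], [3, -1, -1]]
freq = unique_values_count_sketch(pixels, nodata=-1)
freq_with_nodata = unique_values_count_sketch(
    pixels, nodata=-1, ignore_nodata=False)
print(freq, freq_with_nodata)
```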
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.unique_raster_values_uri(dataset_uri)

Get list of unique integer values within given dataset.

Parameters: dataset_uri (string) – a uri to a gdal dataset of some integer type
Returns: value – a list of dataset’s unique non-nodata values
Return type: list
natcap.invest.pygeoprocessing_0_3_3.geoprocessing.vectorize_datasets(dataset_uri_list, dataset_pixel_op, dataset_out_uri, datatype_out, nodata_out, pixel_size_out, bounding_box_mode, resample_method_list=None, dataset_to_align_index=None, dataset_to_bound_index=None, aoi_uri=None, assert_datasets_projected=True, process_pool=None, vectorize_op=True, datasets_are_pre_aligned=False, dataset_options=None, all_touched=False)

Apply local raster operation on stack of datasets.

This function applies a user defined function across a stack of datasets. It can align the output dataset grid with one of the input datasets, output a dataset that is the union or intersection of the input dataset bounding boxes, and control the interpolation techniques of the input datasets, if necessary. The datasets in dataset_uri_list must be in the same projection; the function will raise an exception if not.

Parameters:
  • dataset_uri_list (list) – a list of file uris that point to files that can be opened with gdal.Open.
  • dataset_pixel_op (function) – a function that must take in as many arguments as there are elements in dataset_uri_list. The arguments can be treated as interpolated or actual pixel values from the input datasets and the function should calculate the output value for that pixel stack. The function follows a parallel paradigm and does not know the spatial position of the pixels in question at the time of the call. If the bounding_box_mode parameter is “union” then the values of input dataset pixels that may be outside their original range will be the nodata values of those datasets. Known bug: if dataset_pixel_op does not return a value in some cases the output dataset values are undefined even if the function does not crash or raise an exception.
  • dataset_out_uri (string) – the uri of the output dataset. The projection will be the same as the datasets in dataset_uri_list.
  • datatype_out – the GDAL output type of the output dataset
  • nodata_out – the nodata value of the output dataset.
  • pixel_size_out – the pixel size of the output dataset in projected coordinates.
  • bounding_box_mode (string) – one of “union”, “intersection”, or “dataset”. If “union”, the output dataset bounding box will be the union of the input datasets; if “intersection”, it will be their intersection. An exception is raised if the mode is “intersection” and the input datasets have an empty intersection. If “dataset”, the bounding box will be as large as the dataset selected by dataset_to_bound_index, which must be defined in that case.
Keyword Arguments:
 
  • resample_method_list (list) – a list of resampling methods, one for each input uri in dataset_uri_list. Each element must be one of “nearest|bilinear|cubic|cubic_spline|lanczos”. If None, the default is “nearest” for all input datasets.
  • dataset_to_align_index (int) – an int that corresponds to the position in one of the dataset_uri_lists that, if positive aligns the output rasters to fix on the upper left hand corner of the output datasets. If negative, the bounding box aligns the intersection/ union without adjustment.
  • dataset_to_bound_index – if mode is “dataset” this indicates which dataset should be the output size.
  • aoi_uri (string) – a URI to an OGR datasource to be used for the aoi. Irrespective of the mode input, the aoi will be used to intersect the final bounding box.
  • assert_datasets_projected (boolean) – if True this operation will test if any datasets are not projected and raise an exception if so.
  • process_pool – a process pool for multiprocessing
  • vectorize_op (boolean) – if True the model will try to numpy.vectorize dataset_pixel_op. If dataset_pixel_op is designed to maximize array broadcasting, set this parameter to False, else it may inefficiently invoke the function on individual elements.
  • datasets_are_pre_aligned (boolean) – if False, this operation will first align and interpolate the input datasets based on the rules provided in bounding_box_mode, resample_method_list, dataset_to_align_index, and dataset_to_bound_index. If True, the input dataset list must already be aligned, probably by raster_utils.align_dataset_list
  • dataset_options – this is an argument list that will be passed to the GTiff driver. Useful for blocksizes, compression, etc.
  • all_touched (boolean) – if true the clip uses the option ALL_TOUCHED=TRUE when calling RasterizeLayer for AOI masking.
Returns:

None

Raises:

ValueError – invalid input provided
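
The contract for dataset_pixel_op can be illustrated with a pure-Python sketch that applies an operation to each pixel stack of already aligned rasters. apply_pixel_op_sketch, add_op and NODATA are hypothetical names; the sketch performs no alignment, resampling or file I/O.

```python
NODATA = -1


def add_op(value_a, value_b):
    """Example pixel op: one argument per input raster.

    It returns a value on every code path, which avoids the undefined
    output described in the known bug above.
    """
    if value_a == NODATA or value_b == NODATA:
        return NODATA
    return value_a + value_b


def apply_pixel_op_sketch(raster_stack, pixel_op):
    """Call pixel_op once per pixel position across equally sized rasters."""
    n_rows = len(raster_stack[0])
    n_cols = len(raster_stack[0][0])
    return [
        [pixel_op(*[raster[row][col] for raster in raster_stack])
         for col in range(n_cols)]
        for row in range(n_rows)]


raster_a = [[1, 2], [NODATA, 4]]
raster_b = [[10, 20], [30, NODATA]]
summed = apply_pixel_op_sketch([raster_a, raster_b], add_op)
print(summed)
```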

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.vectorize_points(shapefile, datasource_field, dataset, randomize_points=False, mask_convex_hull=False, interpolation='nearest')

Interpolate values in shapefile onto given raster.

Takes a shapefile of points and a field defined in that shapefile and interpolates the values in the points onto the given raster.

Parameters:
  • shapefile – ogr datasource of points
  • datasource_field – a field in shapefile
  • dataset – a gdal dataset; must be in the same projection as shapefile
Keyword Arguments:
 
  • randomize_points (boolean) – (description)
  • mask_convex_hull (boolean) – (description)
  • interpolation (string) – the interpolation method to use for scipy.interpolate.griddata(). Default is ‘nearest’
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing.vectorize_points_uri(shapefile_uri, field, output_uri, interpolation='nearest')

Interpolate values in shapefile onto given raster.

A wrapper function for pygeoprocessing.vectorize_points, that allows for uri passing.

Parameters:
  • shapefile_uri (string) – a uri path to an ogr shapefile
  • field (string) – a string for the field name
  • output_uri (string) – a uri path for the output raster
  • interpolation (string) – interpolation method to use on points, default is ‘nearest’
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.geoprocessing_core module

natcap.invest.pygeoprocessing_0_3_3.geoprocessing_core.distance_transform_edt()

Calculate the Euclidean distance transform on input_mask_uri and output the result into an output raster.

Parameters:
  • input_mask_uri – a gdal raster to calculate distance from the 0 value pixels
  • output_distance_uri – will make a float raster w/ same dimensions and projection as input_mask_uri where all non-zero values of input_mask_uri are equal to the euclidean distance to the closest 0 pixel.
Returns:

nothing
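
What the output raster contains can be shown with a brute-force sketch on a small nested list. This is only an illustration of the definition, not the library's algorithm: distance_transform_sketch is a hypothetical name, distances are in pixel units, and the mask is assumed to contain at least one 0 pixel.

```python
import math


def distance_transform_sketch(mask):
    """Return per-pixel Euclidean distance to the nearest 0-valued pixel."""
    zero_pixels = [
        (row_index, col_index)
        for row_index, row in enumerate(mask)
        for col_index, value in enumerate(row)
        if value == 0]
    distances = []
    for row_index, row in enumerate(mask):
        out_row = []
        for col_index, value in enumerate(row):
            if value == 0:
                out_row.append(0.0)
            else:
                # Brute force: check every 0 pixel and keep the closest.
                out_row.append(min(
                    math.hypot(row_index - zero_row, col_index - zero_col)
                    for zero_row, zero_col in zero_pixels))
        distances.append(out_row)
    return distances


mask = [[0, 1, 1], [1, 1, 1]]
distances = distance_transform_sketch(mask)
print(distances)
```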

natcap.invest.pygeoprocessing_0_3_3.geoprocessing_core.new_raster_from_base()

Create a new, empty GDAL raster dataset with the spatial reference and geotransform of the base GDAL raster dataset.

Parameters:
  • base – the GDAL raster dataset to base the output size and transforms on
  • output_uri – a string URI to the new output raster dataset.
  • gdal_format – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
  • fill_value – (optional) the value to fill in the raster on creation
  • n_rows – (optional) if set, the resulting raster has n_rows rows; if not, the number of rows in the output dataset equals that of the base.
  • n_cols – (optional) similar to n_rows, but for the columns.
  • dataset_options – (optional) a list of dataset options that gets passed to the gdal creation driver, overrides defaults
Returns:

a new GDAL raster dataset.

natcap.invest.pygeoprocessing_0_3_3.geoprocessing_core.new_raster_from_base_uri()

A wrapper for the function new_raster_from_base that opens up the base_uri before passing it to new_raster_from_base.

base_uri - a URI to a GDAL dataset on disk.

All other arguments to new_raster_from_base are passed in.

Returns nothing.

natcap.invest.pygeoprocessing_0_3_3.geoprocessing_core.reclassify_by_dictionary()

Convert all the non-default values in dataset to the values mapped to by rules. If there is no rule for an input value it is replaced by the default output value (which may or may not be the raster’s nodata value ... it could just be any default value).

Parameters:
  • dataset – GDAL raster dataset
  • rules – a dictionary of the form {‘dataset_value1’ : ‘output_value1’, ... ‘dataset_valuen’ : ‘output_valuen’} used to map dataset input types to output
  • output_uri – the location to hold the output raster on disk
  • format – either ‘MEM’ or ‘GTiff’
  • default_value – output raster dataset default value (may be nodata)
  • datatype – a GDAL output type
Returns:

the mapped raster as a GDAL dataset

Module contents

__init__ module for pygeoprocessing, imports all the geoprocessing functions into the pygeoprocessing namespace

natcap.invest.pygeoprocessing_0_3_3.aggregate_raster_values_uri(raster_uri, shapefile_uri, shapefile_field=None, ignore_nodata=True, all_touched=False, polygons_might_overlap=True)

Collect stats on pixel values which lie within shapefile polygons.

Parameters:
  • raster_uri (string) – a uri to a raster. In order for hectare mean values to be accurate, this raster must be projected in meter units.
  • shapefile_uri (string) – a uri to a OGR datasource that should overlap raster; raises an exception if not.
Keyword Arguments:
 
  • shapefile_field (string) – the name of the field in shapefile whose integer values are used as the keys of the output dictionaries; if None, the dictionaries hold a single value computed over the entire shapefile region that intersects the raster.
  • ignore_nodata – if True, nodata pixels are not counted when determining pixel_mean; otherwise all pixels in the AOI are used for calculation of the mean. This does not affect hectare_mean, which is calculated from the geometrical area of the feature.
  • all_touched (boolean) – if True, accounts for any pixel that a polygon’s geometry passes through, not just pixels whose center point lies inside the polygon
  • polygons_might_overlap (boolean) – if True the function calculates aggregation coverage close to optimally by rasterizing sets of polygons that don’t overlap. However, this step can be computationally expensive for cases where there are many polygons. Setting this flag to False directs the function to rasterize in one step.
Returns:

result_tuple – named tuple of the form (‘aggregate_values’, ‘total pixel_mean hectare_mean n_pixels pixel_min pixel_max’)

Each of [total pixel_mean hectare_mean pixel_min pixel_max] contains a dictionary that maps each shapefile_field value to the total, pixel mean, hectare mean, pixel min, or pixel max of the values under that feature. ‘n_pixels’ contains the total number of valid pixels used in that calculation. hectare_mean is None if raster_uri is unprojected.

Return type:

tuple

Raises:
  • AttributeError
  • TypeError
  • OSError
natcap.invest.pygeoprocessing_0_3_3.align_dataset_list(dataset_uri_list, dataset_out_uri_list, resample_method_list, out_pixel_size, mode, dataset_to_align_index, dataset_to_bound_index=None, aoi_uri=None, assert_datasets_projected=True, all_touched=False)
Create a new list of datasets that are aligned based on a list of input datasets.

Takes a list of dataset uris and generates a new set that is completely aligned with identical projections and pixel sizes.

Parameters:
  • dataset_uri_list (list) – a list of input dataset uris
  • dataset_out_uri_list (list) – a parallel dataset uri list whose positions correspond to entries in dataset_uri_list
  • resample_method_list (list) – a list of resampling methods for each output uri in dataset_out_uri list. Each element must be one of “nearest|bilinear|cubic|cubic_spline|lanczos”
  • out_pixel_size – the output pixel size
  • mode (string) – one of “union”, “intersection”, or “dataset”, which defines the output extents as either the union or intersection of the input datasets, or as the bounds of an existing raster. If mode is “dataset” then dataset_to_bound_index must be defined
  • dataset_to_align_index (int) – an int that corresponds to the position in one of the dataset_uri_lists that, if positive aligns the output rasters to fix on the upper left hand corner of the output datasets. If negative, the bounding box aligns the intersection/ union without adjustment.
  • all_touched (boolean) – if True and an AOI is passed, the ALL_TOUCHED=TRUE option is passed to the RasterizeLayer function when determining the mask of the AOI.
Keyword Arguments:
 
  • dataset_to_bound_index – if mode is “dataset” then this index is used to indicate which dataset to define the output bounds of the dataset_out_uri_list
  • aoi_uri (string) – a URI to an OGR datasource to be used for the aoi. Irrespective of the mode input, the aoi will be used to intersect the final bounding box.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.assert_datasets_in_same_projection(dataset_uri_list)

Assert that provided datasets are all in the same projection.

Tests if datasets represented by their uris are projected and in the same projection and raises an exception if not.

Parameters:

dataset_uri_list (list) – (description)

Returns:

is_true – True (otherwise exception raised)

Return type:

boolean

Raises:
  • DatasetUnprojected – if one of the datasets is unprojected.
  • DifferentProjections – if at least one of the datasets is in a different projection
natcap.invest.pygeoprocessing_0_3_3.assert_file_existance(dataset_uri_list)

Assert that provided uris exist in filesystem.

Verify that the uris passed in the argument exist on the filesystem. If not, raise an exception indicating which files do not exist.

Parameters: dataset_uri_list (list) – a list of relative or absolute file paths to validate
Returns: None
Raises: IOError – if any files are not found
natcap.invest.pygeoprocessing_0_3_3.calculate_disjoint_polygon_set(shapefile_uri)

Create a list of sets of polygons that don’t overlap.

Determining the minimal number of those sets is an NP-complete problem, so this is an approximation that builds up sets of maximal subsets.

Parameters: shapefile_uri (string) – a uri to an OGR shapefile to process
Returns: subset_list – list of sets of FIDs from shapefile_uri
Return type: list
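
One way to picture the approximation is a greedy pass that keeps adding features to the current set as long as they overlap nothing already in it, then repeats with whatever is left. The sketch below is an assumption-laden illustration, not the library's algorithm: it uses axis-aligned boxes of the form [xmin, ymin, xmax, ymax] instead of real polygon geometry, and boxes_overlap and disjoint_sets_sketch are hypothetical names.

```python
def boxes_overlap(box_a, box_b):
    """True if two [xmin, ymin, xmax, ymax] boxes share interior area."""
    return (box_a[0] < box_b[2] and box_b[0] < box_a[2] and
            box_a[1] < box_b[3] and box_b[1] < box_a[3])


def disjoint_sets_sketch(boxes_by_fid):
    """Greedily partition FIDs into sets whose boxes do not overlap."""
    remaining = sorted(boxes_by_fid)
    subset_list = []
    while remaining:
        current = []
        leftover = []
        for fid in remaining:
            if any(boxes_overlap(boxes_by_fid[fid], boxes_by_fid[other])
                   for other in current):
                # Conflicts with the current set; try again next round.
                leftover.append(fid)
            else:
                current.append(fid)
        subset_list.append(set(current))
        remaining = leftover
    return subset_list


boxes = {
    0: [0, 0, 2, 2],
    1: [1, 1, 3, 3],
    2: [4, 4, 5, 5],
    3: [2.5, 2.5, 4.5, 4.5],
}
subsets = disjoint_sets_sketch(boxes)
print(subsets)
```

Each resulting set is maximal in the sense that no leftover feature could be added to it without creating an overlap, although the number of sets is not guaranteed to be minimal.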
natcap.invest.pygeoprocessing_0_3_3.calculate_intersection_rectangle(dataset_list, aoi=None)

Return bounding box of the intersection of all rasters in the list.

Parameters: dataset_list (list) – a list of GDAL datasets in the same projection and coordinate system
Keyword Arguments:
 aoi – an OGR polygon datasource which may optionally also restrict the extents of the intersection rectangle based on its own extents.
Returns: bounding_box – a 4 element list that bounds the intersection of all the rasters in dataset_list. [left, top, right, bottom]
Return type: list
Raises: SpatialExtentOverlapException – in cases where the dataset list and aoi don’t overlap.
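
The intersection itself reduces to a few min/max comparisons. The sketch below works on plain [left, top, right, bottom] lists and assumes north-up rasters where top is greater than bottom; intersection_rectangle_sketch is a hypothetical name, and ValueError stands in for SpatialExtentOverlapException.

```python
def intersection_rectangle_sketch(bounding_boxes):
    """Intersect [left, top, right, bottom] boxes (north-up assumed)."""
    left = max(box[0] for box in bounding_boxes)
    top = min(box[1] for box in bounding_boxes)
    right = min(box[2] for box in bounding_boxes)
    bottom = max(box[3] for box in bounding_boxes)
    if left >= right or bottom >= top:
        # Stand-in for the library's SpatialExtentOverlapException.
        raise ValueError('bounding boxes do not overlap')
    return [left, top, right, bottom]


overlap = intersection_rectangle_sketch([[0, 10, 10, 0], [5, 15, 20, 5]])
print(overlap)
```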
natcap.invest.pygeoprocessing_0_3_3.calculate_raster_stats_uri(dataset_uri)

Calculate min, max, stdev, and mean for all bands in dataset.

Parameters: dataset_uri (string) – a uri to a GDAL raster dataset that will be modified by having its band statistics set
Returns: None
natcap.invest.pygeoprocessing_0_3_3.calculate_slope(dem_dataset_uri, slope_uri, aoi_uri=None, process_pool=None)

Create slope raster from DEM raster.

Follows the algorithm described here: http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=How%20Slope%20works

Parameters:
  • dem_dataset_uri (string) – a URI to a single band raster of z values.
  • slope_uri (string) – a path to the output slope uri in percent.
Keyword Arguments:
 
  • aoi_uri (string) – a uri to an AOI input
  • process_pool – a process pool for multiprocessing
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.clip_dataset_uri(source_dataset_uri, aoi_datasource_uri, out_dataset_uri, assert_projections=True, process_pool=None, all_touched=False)

Clip raster dataset to bounding box of provided vector datasource aoi.

This function will clip source_dataset to the bounding box of the polygons in aoi_datasource and mask out the values in source_dataset outside of the AOI with the nodata values in source_dataset.

Parameters:
  • source_dataset_uri (string) – uri to single band GDAL dataset to clip
  • aoi_datasource_uri (string) – uri to ogr datasource
  • out_dataset_uri (string) – path to disk for the clipped dataset
Keyword Arguments:
 
  • assert_projections (boolean) – a boolean value for whether the dataset needs to be projected
  • process_pool – a process pool for multiprocessing
  • all_touched (boolean) – if true the clip uses the option ALL_TOUCHED=TRUE when calling RasterizeLayer for AOI masking.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.convolve_2d_uri(signal_path, kernel_path, output_path)

Convolve 2D kernel over 2D signal.

Convolves the raster in kernel_path over signal_path. Nodata values are treated as 0.0 during the convolution and masked to nodata for the output result where signal_path has nodata.

Parameters:
  • signal_path (string) – a filepath to a gdal dataset that’s the source input.
  • kernel_path (string) – a filepath to a gdal dataset that’s the kernel to convolve over the signal.
  • output_path (string) – a filepath to the gdal dataset that’s the convolution output of signal and kernel that is the same size and projection of signal_path. Any nodata pixels that align with signal_path will be set to nodata.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.copy_datasource_uri(shape_uri, copy_uri)

Create a copy of an ogr shapefile.

Parameters:
  • shape_uri (string) – a uri path to the ogr shapefile that is to be copied
  • copy_uri (string) – a uri path for the destination of the copied shapefile
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.create_directories(directory_list)

Make directories provided in list of path strings.

This function will create any of the directories in the directory list if possible and raise an exception if any error other than the directory already existing occurs.

Parameters:directory_list (list) – a list of string uri paths
Returns:None
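
A minimal sketch of the behavior described above, assuming the "already exists" case is detected through the errno of the raised OSError. The function name create_directories_sketch is hypothetical and this is not the library's code.

```python
import errno
import os


def create_directories_sketch(directory_list):
    """Create each directory, ignoring only 'already exists' errors."""
    for dir_path in directory_list:
        try:
            os.makedirs(dir_path)
        except OSError as error:
            # Re-raise anything that is not "directory already exists".
            if error.errno != errno.EEXIST:
                raise
```

Calling it twice with the same path is harmless, while a permissions error would still propagate to the caller.
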
natcap.invest.pygeoprocessing_0_3_3.create_raster_from_vector_extents(xRes, yRes, format, nodata, rasterFile, shp)

Create a blank raster based on a vector file extent.

This code is adapted from http://trac.osgeo.org/gdal/wiki/FAQRaster#HowcanIcreateablankrasterbasedonavectorfilesextentsforusewithgdal_rasterizeGDAL1.8.0

Parameters:
  • xRes – the x size of a pixel in the output dataset; must be a positive value
  • yRes – the y size of a pixel in the output dataset; must be a positive value
  • format – gdal GDT pixel type
  • nodata – the output nodata value
  • rasterFile (string) – URI to file location for raster
  • shp – vector shapefile to base extent of output raster on
Returns:

a blank raster whose bounds fit within the bounding box of `shp` and whose features are equivalent to the passed-in data

Return type:

raster

natcap.invest.pygeoprocessing_0_3_3.create_raster_from_vector_extents_uri(shapefile_uri, pixel_size, gdal_format, nodata_out_value, output_uri)

Create a blank raster based on a vector file extent.

A wrapper for create_raster_from_vector_extents

Parameters:
  • shapefile_uri (string) – uri to an OGR datasource to use as the extents of the raster
  • pixel_size – size of output pixels in the projected units of shapefile_uri
  • gdal_format – the raster pixel format, something like gdal.GDT_Float32
  • nodata_out_value – the output nodata value
  • output_uri (string) – the URI to write the gdal dataset
Returns:

dataset – gdal dataset

Return type:

gdal.Dataset

natcap.invest.pygeoprocessing_0_3_3.create_rat(dataset, attr_dict, column_name)

Create a raster attribute table.

Parameters:
  • dataset – a GDAL raster dataset to create the RAT for (...)
  • attr_dict (dict) – a dictionary with keys that point to a primitive type {integer_id_1: value_1, ... integer_id_n: value_n}
  • column_name (string) – a string for the column name that maps the values
Returns:

dataset – a GDAL raster dataset with an updated RAT

Return type:

gdal.Dataset

natcap.invest.pygeoprocessing_0_3_3.create_rat_uri(dataset_uri, attr_dict, column_name)

Create a raster attribute table.

URI wrapper for create_rat.

Parameters:
  • dataset_uri (string) – a URI to a GDAL raster dataset to create the RAT for (...)
  • attr_dict (dict) – a dictionary with keys that point to a primitive type {integer_id_1: value_1, ... integer_id_n: value_n}
  • column_name (string) – a string for the column name that maps the values
natcap.invest.pygeoprocessing_0_3_3.dictionary_to_point_shapefile(dict_data, layer_name, output_uri)

Create a point shapefile from a dictionary.

The point shapefile created is not projected and uses latitude and longitude for its geometry.

Parameters:
  • dict_data (dict) – a python dictionary with keys being unique ids that point to sub-dictionaries of key-value pairs. These inner key-value pairs represent the field-value pairs for the point features. At least two fields are required in the sub-dictionaries: ‘lati’ and ‘long’. These fields determine the geometry of the point. All the sub-dictionaries should have keys with the same names and order, and the values for a given key should have the same type. For example: 0 : {‘lati’:97, ‘long’:43, ‘field_a’:6.3, ‘field_b’:‘Forest’,...}, 1 : {‘lati’:55, ‘long’:51, ‘field_a’:6.2, ‘field_b’:‘Crop’,...}, 2 : {‘lati’:73, ‘long’:47, ‘field_a’:6.5, ‘field_b’:‘Swamp’,...}
  • layer_name (string) – a python string for the name of the layer
  • output_uri (string) – a uri for the output path of the point shapefile
Returns:

None
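
Before calling this function it can help to check that dict_data has the expected shape. The sketch below is a hypothetical pre-check (check_point_dict is not part of the package) that assumes the required geometry fields are named 'lati' and 'long', as in the parameter description.

```python
def check_point_dict(dict_data):
    """Return the shared, sorted field names or raise ValueError."""
    field_names = None
    for point_id, fields in dict_data.items():
        if 'lati' not in fields or 'long' not in fields:
            raise ValueError('point %r is missing lati/long' % (point_id,))
        if field_names is None:
            field_names = sorted(fields)
        elif sorted(fields) != field_names:
            raise ValueError('point %r has different fields' % (point_id,))
    return field_names


dict_data = {
    0: {'lati': 97, 'long': 43, 'field_a': 6.3, 'field_b': 'Forest'},
    1: {'lati': 55, 'long': 51, 'field_a': 6.2, 'field_b': 'Crop'},
}
shared_fields = check_point_dict(dict_data)
```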

natcap.invest.pygeoprocessing_0_3_3.distance_transform_edt(input_mask_uri, output_distance_uri, process_pool=None)

Find the Euclidean distance transform on input_mask_uri and output the result as raster.

Parameters:
  • input_mask_uri (string) – a gdal raster to calculate distance from the non 0 value pixels
  • output_distance_uri (string) – will make a float raster with the same dimensions and projection as input_mask_uri where all zero values of input_mask_uri are set to the euclidean distance to the closest non-zero pixel.
Keyword Arguments:
 

  • process_pool – a process pool for multiprocessing

Returns:

None
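
To make the output semantics concrete, here is a brute-force sketch of a Euclidean distance transform on a small nested-list mask. It is an illustration only: the helper name is hypothetical, distances are assumed to be in pixel units, and the quadratic search is far slower than any real implementation.

```python
import math


def distance_transform_brute_force(mask):
    """Distance from each zero pixel to the closest non-zero pixel."""
    targets = [
        (row, col)
        for row, values in enumerate(mask)
        for col, value in enumerate(values)
        if value != 0
    ]
    result = []
    for row, values in enumerate(mask):
        out_row = []
        for col, value in enumerate(values):
            if value != 0 or not targets:
                out_row.append(0.0)  # non-zero pixels are at distance 0
            else:
                out_row.append(min(
                    math.hypot(row - t_row, col - t_col)
                    for t_row, t_col in targets))
        result.append(out_row)
    return result


distances = distance_transform_brute_force([[0, 0, 1], [0, 0, 0]])
```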

natcap.invest.pygeoprocessing_0_3_3.extract_datasource_table_by_key(datasource_uri, key_field)

Return vector attribute table of first layer as dictionary.

Create a dictionary lookup table of the features in the attribute table of the datasource referenced by datasource_uri.

Parameters:
  • datasource_uri (string) – a uri to an OGR datasource
  • key_field – a field in datasource_uri that refers to a key value for each row such as a polygon id.
Returns:

attribute_dictionary – a dictionary of the form {key_field_0: {field_0: value0, field_1: value1}...}

Return type:

dict

natcap.invest.pygeoprocessing_0_3_3.get_bounding_box(dataset_uri)

Get bounding box where coordinates are in projected units.

Parameters:dataset_uri (string) – a uri to a GDAL dataset
Returns:bounding_box – [upper_left_x, upper_left_y, lower_right_x, lower_right_y] in projected coordinates

Return type:list
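
As an illustration of how such a bounding box relates to a GDAL geotransform, the sketch below derives it from the six geotransform coefficients and the raster dimensions. It assumes a north-up raster with no rotation terms; the helper name is hypothetical and the library may compute the box differently.

```python
def bounding_box_from_geotransform(geotransform, n_cols, n_rows):
    """Return [upper_left_x, upper_left_y, lower_right_x, lower_right_y]."""
    # GDAL order: origin x, pixel width, row rotation,
    #             origin y, column rotation, pixel height (negative north-up)
    origin_x, pixel_width, _, origin_y, _, pixel_height = geotransform
    return [
        origin_x,
        origin_y,
        origin_x + n_cols * pixel_width,
        origin_y + n_rows * pixel_height,
    ]


box = bounding_box_from_geotransform(
    (100.0, 30.0, 0.0, 500.0, 0.0, -30.0), n_cols=4, n_rows=3)
```
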
natcap.invest.pygeoprocessing_0_3_3.get_cell_size_from_uri(dataset_uri)

Get the cell size of a dataset in units of meters.

Raises an exception if the raster’s pixels are not square since this will break most of the pygeoprocessing algorithms.

Parameters:dataset_uri (string) – uri to a gdal dataset
Returns:cell size of the dataset in meters
Return type:size_meters
natcap.invest.pygeoprocessing_0_3_3.get_dataset_projection_wkt_uri(dataset_uri)

Get the projection of a GDAL dataset as well known text (WKT).

Parameters:dataset_uri (string) – a URI for the GDAL dataset
Returns:proj_wkt – WKT describing the GDAL dataset projection
Return type:string
natcap.invest.pygeoprocessing_0_3_3.get_datasource_bounding_box(datasource_uri)

Get datasource bounding box where coordinates are in projected units.

Parameters:datasource_uri (string) – a uri to an OGR datasource
Returns:bounding_box – [upper_left_x, upper_left_y, lower_right_x, lower_right_y] in projected coordinates

Return type:list
natcap.invest.pygeoprocessing_0_3_3.get_datatype_from_uri(dataset_uri)

Return datatype for first band in gdal dataset.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:datatype for dataset band 1
Return type:datatype
natcap.invest.pygeoprocessing_0_3_3.get_geotransform_uri(dataset_uri)

Get the geotransform from a gdal dataset.

Parameters:dataset_uri (string) – a URI for the dataset
Returns:a dataset geotransform list
Return type:geotransform
natcap.invest.pygeoprocessing_0_3_3.get_lookup_from_csv(csv_table_uri, key_field)

Read CSV table file in as dictionary.

Creates a python dictionary to look up the rest of the fields in a csv table indexed by the given key_field

Parameters:
  • csv_table_uri (string) – a URI to a csv file containing at least the header key_field
  • key_field – the header of the column in csv_table_uri whose values are used as the keys of the lookup dictionary
Returns:

lookup_dict – a dictionary of the form {key_field_0: {header_1: val_1_0, header_2: val_2_0, etc.}...} depending on the values of those fields

Return type:

dict
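
The shape of the returned lookup can be sketched with the standard csv module. This is a rough Python 3 illustration, not the library's implementation: the helper name lookup_from_csv_text is hypothetical, it reads from a string rather than a URI, and it leaves all values as strings, whereas the real function may cast them.

```python
import csv
import io


def lookup_from_csv_text(csv_text, key_field):
    """Index the rows of a CSV by the value in the key_field column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lookup = {}
    for row in reader:
        key = row[key_field]
        # Store the rest of the fields under the key value.
        lookup[key] = {
            header: value
            for header, value in row.items()
            if header != key_field
        }
    return lookup


table = "lucode,name,cost\n1,Forest,6.3\n2,Crop,6.2\n"
lookup = lookup_from_csv_text(table, 'lucode')
```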

natcap.invest.pygeoprocessing_0_3_3.get_lookup_from_table(table_uri, key_field)

Read table file in as dictionary.

Creates a python dictionary to look up the rest of the fields in a table file indexed by the given key_field. This function is case insensitive to field header names and returns a lookup table with lowercase keys.

Parameters:
  • table_uri (string) – a URI to a dbf or csv file containing at least the header key_field
  • key_field – the header of the column in table_uri whose values are used as the keys of the lookup dictionary
Returns:

lookup_dict – a dictionary of the form {key_field_0: {header_1: val_1_0, header_2: val_2_0, etc.}...} where key_field_n is the lowercase version of the column name.

Return type:

dict

natcap.invest.pygeoprocessing_0_3_3.get_nodata_from_uri(dataset_uri)

Return nodata value from first band in gdal dataset cast as numpy datatype.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:nodata value for dataset band 1
Return type:nodata
natcap.invest.pygeoprocessing_0_3_3.get_raster_properties(dataset)

Get width, height, X size, and Y size of the dataset as dictionary.

This function can be expanded to return more properties if needed

Parameters:dataset (gdal.Dataset) – a GDAL raster dataset to get the properties from
Returns:dataset_dict – a dictionary with the properties stored under relevant keys. The current list of things returned is: width (w-e pixel resolution), height (n-s pixel resolution), XSize, YSize
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.get_raster_properties_uri(dataset_uri)

Get width, height, X size, and Y size of the dataset as dictionary.

Wrapper function for get_raster_properties() that passes in the dataset URI instead of the dataset itself

Parameters:dataset_uri (string) – a URI to a GDAL raster dataset
Returns:value – a dictionary with the properties stored under relevant keys. The current list of things returned is: width (w-e pixel resolution), height (n-s pixel resolution), XSize, YSize
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.get_rat_as_dictionary(dataset)

Get Raster Attribute Table of the first band of dataset as a dictionary.

Parameters:dataset (gdal.Dataset) – a GDAL dataset that has a RAT associated with the first band
Returns:rat_dictionary – a 2D dictionary where the first key is the column name and second is the row number
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.get_rat_as_dictionary_uri(dataset_uri)

Get Raster Attribute Table of the first band of dataset as a dictionary.

Parameters:dataset_uri (string) – a URI to a GDAL dataset that has a RAT associated with the first band
Returns:value – a 2D dictionary where the first key is the column name and second is the row number
Return type:dictionary
natcap.invest.pygeoprocessing_0_3_3.get_row_col_from_uri(dataset_uri)

Return number of rows and columns of given dataset uri as tuple.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:rows_cols – 2-tuple (n_row, n_col) from dataset_uri
Return type:tuple
natcap.invest.pygeoprocessing_0_3_3.get_spatial_ref_uri(datasource_uri)

Get the spatial reference of an OGR datasource.

Parameters:datasource_uri (string) – a URI to an ogr datasource
Returns:a spatial reference
Return type:spat_ref
natcap.invest.pygeoprocessing_0_3_3.get_statistics_from_uri(dataset_uri)

Get the min, max, mean, stdev from first band in a GDAL Dataset.

Parameters:dataset_uri (string) – a uri to a gdal dataset
Returns:statistics – min, max, mean, stddev
Return type:tuple
natcap.invest.pygeoprocessing_0_3_3.iterblocks(raster_uri, band_list=None, largest_block=1048576, astype=None, offset_only=False)

Iterate across all the memory blocks in the input raster.

Result is a generator of block location information and numpy arrays.

This is especially useful when a single value needs to be derived from the pixel values in a raster, such as the sum total of all pixel values, or a sequence of unique raster values. In such cases, raster_local_op is overkill, since it writes out a raster.

As a generator, this can be combined multiple times with itertools.izip() to iterate ‘simultaneously’ over multiple rasters, though the user should be careful to do so only with prealigned rasters.

Parameters:
  • raster_uri (string) – The string filepath to the raster to iterate over.
  • band_list (list of ints or None) – A list of the band numbers for which the matrices should be returned. Defaults to None, which will return all bands. Bands may be specified in any order, and band indexes may be specified multiple times. The blocks returned on each iteration will be in the order specified in this list.
  • largest_block (int) – Attempts to iterate over raster blocks with this many elements. Useful in cases where the blocksize is relatively small, memory is available, and the function call overhead dominates the iteration. Defaults to 2**20. A value of anything less than the original blocksize of the raster will result in blocksizes equal to the original size.
  • astype (list of numpy types) – If None, output blocks are in the native type of the raster bands. Otherwise this parameter is a list of len(band_list) length that contains the desired output types that iterblocks generates for each band.
  • offset_only (boolean) – defaults to False. If True, iterblocks only returns the offset dictionary and doesn’t read any binary data from the raster. This can be useful when iterating over blocks in order to write to an output.
Returns:

If offset_only is false, on each iteration, a tuple containing a dict of block data and n 2-dimensional numpy arrays are returned, where n is the number of bands requested via band_list. The dict of block data has these attributes:

data[‘xoff’] - The X offset of the upper-left-hand corner of the block.

data[‘yoff’] - The Y offset of the upper-left-hand corner of the block.

data[‘win_xsize’] - The width of the block.

data[‘win_ysize’] - The height of the block.

If offset_only is True, the function returns only the block data and does not attempt to read binary data from the raster.
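
The offset dictionaries can be pictured with a small generator that tiles a raster of known dimensions. This sketch only mimics the shape of the block data described above; the function name iter_block_offsets and its explicit block-size arguments are assumptions, and it reads nothing from disk.

```python
def iter_block_offsets(n_cols, n_rows, block_xsize, block_ysize):
    """Yield offset dicts covering an n_cols by n_rows raster."""
    for yoff in range(0, n_rows, block_ysize):
        # Blocks on the bottom edge may be shorter than block_ysize.
        win_ysize = min(block_ysize, n_rows - yoff)
        for xoff in range(0, n_cols, block_xsize):
            # Blocks on the right edge may be narrower than block_xsize.
            win_xsize = min(block_xsize, n_cols - xoff)
            yield {
                'xoff': xoff,
                'yoff': yoff,
                'win_xsize': win_xsize,
                'win_ysize': win_ysize,
            }


blocks = list(iter_block_offsets(n_cols=5, n_rows=3, block_xsize=4, block_ysize=2))
covered = sum(b['win_xsize'] * b['win_ysize'] for b in blocks)
```

Every pixel falls in exactly one block, so the block areas add up to the raster size.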

natcap.invest.pygeoprocessing_0_3_3.load_memory_mapped_array(dataset_uri, memory_file, array_type=None)

Get the first band of a dataset as a memory mapped array.

Parameters:
  • dataset_uri (string) – the GDAL dataset to load into a memory mapped array
  • memory_file (string) – a path to a file OR a file-like object that will be used to hold the memory map. It is up to the caller to create and delete this file.
Keyword Arguments:
 

  • array_type – the type of the resulting array, if None defaults to the type of the raster band in the dataset

Returns:

memory_array – a memmap numpy array of the data contained in the first band of dataset_uri

Return type:

memmap numpy array

natcap.invest.pygeoprocessing_0_3_3.make_constant_raster_from_base_uri(base_dataset_uri, constant_value, out_uri, nodata_value=None, dataset_type=gdal.GDT_Float32)

Create new gdal raster filled with uniform values.

A helper function that creates a new gdal raster from base, and fills it with the constant value provided.

Parameters:
  • base_dataset_uri (string) – the gdal base raster
  • constant_value – the value to set the new base raster to
  • out_uri (string) – the uri of the output raster
Keyword Arguments:
 
  • nodata_value – the value to set the constant raster’s nodata value to. If not specified, it will be set to constant_value - 1.0
  • dataset_type – the datatype to set the dataset to, default will be a float 32 value.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.new_raster(cols, rows, projection, geotransform, format, nodata, datatype, bands, outputURI)

Create a new raster with the given properties.

Parameters:
  • cols (int) – number of pixel columns
  • rows (int) – number of pixel rows
  • projection – the datum
  • geotransform – the coordinate system
  • format (string) – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
  • bands (int) – the number of bands in the raster
  • outputURI (string) – the file location for the output raster. If format is ‘MEM’ this can be an empty string
Returns:

a new GDAL raster with the parameters as described above

Return type:

dataset

natcap.invest.pygeoprocessing_0_3_3.new_raster_from_base(base, output_uri, gdal_format, nodata, datatype, fill_value=None, n_rows=None, n_cols=None, dataset_options=None)

Create a new, empty GDAL raster dataset with the spatial reference and geotransform of the base GDAL raster dataset.

Parameters:
  • base – the GDAL raster dataset on which to base the output size and transforms
  • output_uri (string) – a string URI to the new output raster dataset.
  • gdal_format (string) – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
Keyword Arguments:
 
  • fill_value – the value to fill in the raster on creation
  • n_rows – if set, makes the resulting raster have n_rows rows; if not, the number of rows of the outgoing dataset is equal to that of the base.
  • n_cols – similar to n_rows, but for the columns.
  • dataset_options – a list of dataset options that gets passed to the gdal creation driver, overrides defaults
Returns:

a new GDAL raster dataset.

Return type:

dataset

natcap.invest.pygeoprocessing_0_3_3.new_raster_from_base_uri(base_uri, output_uri, gdal_format, nodata, datatype, fill_value=None, n_rows=None, n_cols=None, dataset_options=None)

Create a new, empty GDAL raster dataset with the spatial reference and geotransform of the base GDAL raster dataset.

A wrapper for the function new_raster_from_base that opens up the base_uri before passing it to new_raster_from_base.

Parameters:
  • base_uri (string) – a URI to a GDAL dataset on disk.
  • output_uri (string) – a string URI to the new output raster dataset.
  • gdal_format (string) – a string representing the GDAL file format of the output raster. See http://gdal.org/formats_list.html for a list of available formats. This parameter expects the format code, such as ‘GTiff’ or ‘MEM’
  • nodata – a value that will be set as the nodata value for the output raster. Should be the same type as ‘datatype’
  • datatype – the pixel datatype of the output raster, for example gdal.GDT_Float32. See the following header file for supported pixel types: http://www.gdal.org/gdal_8h.html#22e22ce0a55036a96f652765793fb7a4
Keyword Arguments:
 
  • fill_value – the value to fill in the raster on creation
  • n_rows – if set, makes the resulting raster have n_rows rows; if not, the number of rows of the outgoing dataset is equal to that of the base.
  • n_cols – similar to n_rows, but for the columns.
  • dataset_options – a list of dataset options that gets passed to the gdal creation driver, overrides defaults
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.pixel_size_based_on_coordinate_transform(dataset, coord_trans, point)

Get width and height of cell in meters.

Calculates the pixel width and height in meters given a coordinate transform and reference point on the dataset that’s close to the transform’s projected coordinate system. This is only necessary if dataset is not already in a meter coordinate system, for example dataset may be in lat/long (WGS84).

Parameters:
  • dataset (gdal.Dataset) – a projected GDAL dataset in the form of lat/long decimal degrees
  • coord_trans (osr.CoordinateTransformation) – an OSR coordinate transformation from dataset coordinate system to meters
  • point (tuple) – a reference point close to the coordinate transform coordinate system. must be in the same coordinate system as dataset.
Returns:

pixel_diff – a 2-tuple containing (pixel width in meters, pixel height in meters)

Return type:

tuple

natcap.invest.pygeoprocessing_0_3_3.pixel_size_based_on_coordinate_transform_uri(dataset_uri, *args, **kwargs)

Get width and height of cell in meters.

A wrapper for pixel_size_based_on_coordinate_transform that takes a dataset uri as an input and opens it before sending it along.

Parameters:
  • dataset_uri (string) – a URI to a gdal dataset
  • *args, **kwargs – all other parameters are passed along to pixel_size_based_on_coordinate_transform
Returns:

result – (pixel_width_meters, pixel_height_meters)

Return type:

tuple

natcap.invest.pygeoprocessing_0_3_3.rasterize_layer_uri(raster_uri, shapefile_uri, burn_values=[], option_list=[])

Rasterize datasource layer.

Burn the layer from ‘shapefile_uri’ onto the raster from ‘raster_uri’. Will burn ‘burn_values’ onto the raster unless an attribute field is given in ‘option_list’ (for example “ATTRIBUTE=NPV”), in which case it will burn the values from that shapefile field.

Parameters:
  • raster_uri (string) – a URI to a gdal dataset
  • shapefile_uri (string) – a URI to an ogr datasource
Keyword Arguments:
 
  • burn_values (list) – the equivalent value for burning into a polygon. If empty uses the Z values.
  • option_list (list) – a Python list of options for the operation. Example: [“ATTRIBUTE=NPV”, “ALL_TOUCHED=TRUE”]
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.reclassify_dataset_uri(dataset_uri, value_map, raster_out_uri, out_datatype, out_nodata, exception_flag='values_required', assert_dataset_projected=True)

Reclassify values in a dataset.

A function to reclassify values in dataset to any output type. By default the values except for nodata must be in value_map.

Parameters:
  • dataset_uri (string) – a uri to a gdal dataset
  • value_map (dictionary) – a dictionary of values of {source_value: dest_value, ...} where source_value’s type is a positive integer type and dest_value is of type out_datatype.
  • raster_out_uri (string) – the uri for the output raster
  • out_datatype (gdal type) – the type for the output dataset
  • out_nodata (numerical type) – the nodata value for the output raster. Must be the same type as out_datatype
Keyword Arguments:
 
  • exception_flag (string) – either ‘none’ or ‘values_required’. If ‘values_required’ raise an exception if there is a value in the raster that is not found in value_map
  • assert_dataset_projected (boolean) – if True this operation will test if the input dataset is not projected and raise an exception if so.
Returns:

None

Raises:

Exception – if exception_flag == ‘values_required’ and a value from the raster at ‘dataset_uri’ is not a key in ‘value_map’
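
The mapping and the ‘values_required’ check can be sketched on a nested list of pixel values. This is an illustration under assumptions, not the library's code: reclassify_values is a hypothetical helper, input nodata is passed explicitly, and with exception_flag ‘none’ unmapped values are assumed to become out_nodata.

```python
def reclassify_values(pixels, value_map, nodata_in, out_nodata,
                      exception_flag='values_required'):
    """Map each pixel through value_map, passing nodata through."""
    result = []
    for row in pixels:
        out_row = []
        for value in row:
            if value == nodata_in:
                out_row.append(out_nodata)
            elif value in value_map:
                out_row.append(value_map[value])
            elif exception_flag == 'values_required':
                raise ValueError('value %r not in value_map' % (value,))
            else:
                # Assumption: unmapped values become nodata in 'none' mode.
                out_row.append(out_nodata)
        result.append(out_row)
    return result


reclassified = reclassify_values(
    [[1, 2], [255, 1]], {1: 0.5, 2: 1.5}, nodata_in=255, out_nodata=-1.0)
```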

natcap.invest.pygeoprocessing_0_3_3.reproject_dataset_uri(original_dataset_uri, pixel_spacing, output_wkt, resampling_method, output_uri)

Reproject and resample GDAL dataset.

A function to reproject and resample a GDAL dataset given an output pixel size and output reference. Will use the datatype and nodata value from the original dataset.

Parameters:
  • original_dataset_uri (string) – a URI to a gdal Dataset written to disk
  • pixel_spacing – output dataset pixel size in projected linear units
  • output_wkt – output projection in Well Known Text
  • resampling_method (string) – a string representing one of the following resampling methods: “nearest|bilinear|cubic|cubic_spline|lanczos”
  • output_uri (string) – location on disk to dump the reprojected dataset
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.reproject_datasource(original_datasource, output_wkt, output_uri)

Reproject OGR DataSource object.

Changes the projection of an ogr datasource by creating a new shapefile based on the output_wkt passed in. The new shapefile then copies all the features and fields of the original_datasource as its own.

Parameters:
  • original_datasource – an ogr datasource
  • output_wkt – the desired projection as Well Known Text (by layer.GetSpatialRef().ExportToWkt())
  • output_uri (string) – the filepath to the output shapefile
Returns:

the reprojected shapefile.

Return type:

output_datasource

natcap.invest.pygeoprocessing_0_3_3.reproject_datasource_uri(original_dataset_uri, output_wkt, output_uri)

Reproject OGR DataSource file.

URI wrapper for reproject_datasource that takes in the uri for the datasource that is to be projected instead of the datasource itself. This function directly calls reproject_datasource.

Parameters:
  • original_dataset_uri (string) – a uri to an ogr datasource
  • output_wkt – the desired projection as Well Known Text (by layer.GetSpatialRef().ExportToWkt())
  • output_uri (string) – the path to where the new shapefile should be written to disk.
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.resize_and_resample_dataset_uri(original_dataset_uri, bounding_box, out_pixel_size, output_uri, resample_method)

Resize and resample the given dataset.

Parameters:
  • original_dataset_uri (string) – a URI to a GDAL dataset
  • bounding_box (list) – [upper_left_x, upper_left_y, lower_right_x, lower_right_y]
  • out_pixel_size – the pixel size in projected linear units
  • output_uri (string) – the location of the new resampled GDAL dataset
  • resample_method (string) – the resampling technique, one of “nearest|bilinear|cubic|cubic_spline|lanczos”
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.temporary_filename(suffix='')

Get path to new temporary file that will be deleted on program exit.

Returns a temporary filename using mkstemp. The file is deleted on exit using the atexit register.

Keyword Arguments:
 suffix (string) – the suffix to be appended to the temporary file
Returns:a unique temporary filename
Return type:fname
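
The mkstemp plus atexit pattern described above can be sketched as follows. The function name temporary_filename_sketch and the inner cleanup helper are hypothetical, and the library's own cleanup logic may differ.

```python
import atexit
import os
import tempfile


def temporary_filename_sketch(suffix=''):
    """Create a temporary file and schedule its removal at exit."""
    file_handle, path = tempfile.mkstemp(suffix=suffix)
    os.close(file_handle)  # only the path is needed, not the open handle

    def remove_file(path_to_remove):
        try:
            os.remove(path_to_remove)
        except OSError:
            pass  # the file may already have been deleted by the caller

    atexit.register(remove_file, path)
    return path


temp_path = temporary_filename_sketch(suffix='.tif')
```
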
natcap.invest.pygeoprocessing_0_3_3.temporary_folder()

Get path to new temporary folder that will be deleted on program exit.

Returns a temporary folder using mkdtemp. The folder is deleted on exit using the atexit register.

Returns:path – an absolute, unique and temporary folder path.
Return type:string
natcap.invest.pygeoprocessing_0_3_3.tile_dataset_uri(in_uri, out_uri, blocksize)

Resample gdal dataset into tiled raster with blocks of blocksize X blocksize.

Parameters:
  • in_uri (string) – dataset to base data from
  • out_uri (string) – output dataset
  • blocksize (int) – defines the side of the square for the raster, this seems to have a lower limit of 16, but is untested
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.transform_bounding_box(bounding_box, base_ref_wkt, new_ref_wkt, edge_samples=11)

Transform input bounding box to output projection.

This transform accounts for the fact that the reprojected square bounding box might be warped in the new coordinate system. To account for this, the function samples points along the original bounding box edges and attempts to make the largest bounding box around any transformed point on the edge whether corners or warped edges.

Parameters:
  • bounding_box (list) – a list of 4 coordinates in the base_ref_wkt coordinate system describing the bound in the order [xmin, ymin, xmax, ymax]
  • base_ref_wkt (string) – the spatial reference of the input coordinate system in Well Known Text.
  • new_ref_wkt (string) – the spatial reference of the desired output coordinate system in Well Known Text.
  • edge_samples (int) – the number of interpolated points along each bounding box edge to sample along. A value of 2 will sample just the corners while a value of 3 will sample the corners and the midpoint.
Returns:

A list of the form [xmin, ymin, xmax, ymax] that describes the largest fitting bounding box around the original warped bounding box in the new_ref_wkt coordinate system.
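
The edge-sampling idea can be sketched without any projection library by passing in a point transform as a plain function. This is an illustration only: transform_bounding_box_sketch and its transform_point argument are hypothetical stand-ins for the coordinate transformation built from the two WKT strings.

```python
def transform_bounding_box_sketch(bounding_box, transform_point, edge_samples=11):
    """Transform sampled edge points and return their enclosing box."""
    if edge_samples < 2:
        raise ValueError('edge_samples must be at least 2')
    xmin, ymin, xmax, ymax = bounding_box
    points = []
    for index in range(edge_samples):
        fraction = index / float(edge_samples - 1)
        x = xmin + fraction * (xmax - xmin)
        y = ymin + fraction * (ymax - ymin)
        # One sample on each of the four edges at this fraction.
        points.extend([(x, ymin), (x, ymax), (xmin, y), (xmax, y)])
    transformed = [transform_point(x, y) for x, y in points]
    x_values = [point[0] for point in transformed]
    y_values = [point[1] for point in transformed]
    return [min(x_values), min(y_values), max(x_values), max(y_values)]


def bulge(x, y):
    """Toy warp that pushes points upward most strongly at x = 0.5."""
    return (x, y + x * (1.0 - x))


corners_only = transform_bounding_box_sketch([0.0, 0.0, 1.0, 1.0], bulge, 2)
with_midpoints = transform_bounding_box_sketch([0.0, 0.0, 1.0, 1.0], bulge, 3)
```

With only the corners sampled, the warped top edge is missed; adding the midpoints captures the bulge and enlarges the box.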

natcap.invest.pygeoprocessing_0_3_3.unique_raster_values(dataset)

Get list of unique integer values within given dataset.

Parameters:dataset – a gdal dataset of some integer type
Returns:unique_list – a list of dataset’s unique non-nodata values
Return type:list
natcap.invest.pygeoprocessing_0_3_3.unique_raster_values_count(dataset_uri, ignore_nodata=True)

Return a dict from unique int values in the dataset to their frequency.

Parameters:dataset_uri (string) – uri to a gdal dataset of some integer type
Keyword Arguments:
 ignore_nodata (boolean) – if set to false, the nodata count is also included in the result
Returns:itemfreq – values to count.
Return type:dict
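
A minimal sketch of the counting behavior, using collections.Counter on a nested list of integer pixel values. The helper name and the explicit nodata argument are assumptions; the real function reads both from the dataset.

```python
from collections import Counter


def unique_values_count(pixels, nodata, ignore_nodata=True):
    """Return {value: frequency} for the pixel values."""
    counts = Counter(value for row in pixels for value in row)
    if ignore_nodata:
        counts.pop(nodata, None)  # drop the nodata entry if present
    return dict(counts)


pixels = [[1, 1, 2], [255, 2, 1]]
without_nodata = unique_values_count(pixels, nodata=255)
with_nodata = unique_values_count(pixels, nodata=255, ignore_nodata=False)
```
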
natcap.invest.pygeoprocessing_0_3_3.unique_raster_values_uri(dataset_uri)

Get list of unique integer values within given dataset.

Parameters:dataset_uri (string) – a uri to a gdal dataset of some integer type
Returns:value – a list of dataset’s unique non-nodata values
Return type:list
natcap.invest.pygeoprocessing_0_3_3.vectorize_datasets(dataset_uri_list, dataset_pixel_op, dataset_out_uri, datatype_out, nodata_out, pixel_size_out, bounding_box_mode, resample_method_list=None, dataset_to_align_index=None, dataset_to_bound_index=None, aoi_uri=None, assert_datasets_projected=True, process_pool=None, vectorize_op=True, datasets_are_pre_aligned=False, dataset_options=None, all_touched=False)

Apply local raster operation on stack of datasets.

This function applies a user defined function across a stack of datasets. It can align the output dataset grid with one of the input datasets, output a dataset that is the union or intersection of the input dataset bounding boxes, and control the interpolation techniques of the input datasets, if necessary. The datasets in dataset_uri_list must be in the same projection; the function will raise an exception if not.

Parameters:
  • dataset_uri_list (list) – a list of file uris that point to files that can be opened with gdal.Open.
  • dataset_pixel_op (function) – a function that must take in as many arguments as there are elements in dataset_uri_list. The arguments can be treated as interpolated or actual pixel values from the input datasets and the function should calculate the output value for that pixel stack. The function follows a parallel paradigm and does not know the spatial position of the pixels in question at the time of the call. If the bounding_box_mode parameter is “union” then the values of input dataset pixels that may be outside their original range will be the nodata values of those datasets. Known bug: if dataset_pixel_op does not return a value in some cases the output dataset values are undefined even if the function does not crash or raise an exception.
  • dataset_out_uri (string) – the uri of the output dataset. The projection will be the same as the datasets in dataset_uri_list.
  • datatype_out – the GDAL output type of the output dataset
  • nodata_out – the nodata value of the output dataset.
  • pixel_size_out – the pixel size of the output dataset in projected coordinates.
  • bounding_box_mode (string) – one of “union”, “intersection”, or “dataset”. If “union”, the output dataset bounding box will be the union of the input datasets. If “intersection”, it will be their intersection, and an exception is raised if the input datasets have an empty intersection. If “dataset”, the bounding box will be as large as the dataset selected by dataset_to_bound_index, which must be defined in that mode.
Keyword Arguments:
 
  • resample_method_list (list) – a list of resampling methods, one for each input uri in dataset_uri_list. Each element must be one of “nearest|bilinear|cubic|cubic_spline|lanczos”. If None, the default is “nearest” for all input datasets.
  • dataset_to_align_index (int) – an int that corresponds to a position in dataset_uri_list. If positive, aligns the output rasters to fix on the upper left hand corner of the dataset at that position. If negative, the bounding box aligns the intersection/union without adjustment.
  • dataset_to_bound_index – if mode is “dataset” this indicates which dataset should be the output size.
  • aoi_uri (string) – a URI to an OGR datasource to be used for the aoi. Irrespective of the mode input, the aoi will be used to intersect the final bounding box.
  • assert_datasets_projected (boolean) – if True this operation will test if any datasets are not projected and raise an exception if so.
  • process_pool – a process pool for multiprocessing
  • vectorize_op (boolean) – if true the model will try to numpy.vectorize dataset_pixel_op. If dataset_pixel_op is designed to make maximal use of array broadcasting, set this parameter to False, else it may inefficiently invoke the function on individual elements.
  • datasets_are_pre_aligned (boolean) – if set to False, this operation will first align and interpolate the input datasets based on the rules provided in bounding_box_mode, resample_method_list, dataset_to_align_index, and dataset_to_bound_index. If set to True, the input dataset list must already be aligned, probably by raster_utils.align_dataset_list
  • dataset_options – this is an argument list that will be passed to the GTiff driver. Useful for blocksizes, compression, etc.
  • all_touched (boolean) – if true the clip uses the option ALL_TOUCHED=TRUE when calling RasterizeLayer for AOI masking.
Returns:

None

Raises:

ValueError – invalid input provided
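
The “union” and “intersection” bounding box modes can be sketched on plain lists. This assumes boxes in the [upper_left_x, upper_left_y, lower_right_x, lower_right_y] order used elsewhere in this module, with y increasing northward; merge_bounding_boxes is a hypothetical helper, not the library's implementation.

```python
def merge_bounding_boxes(bounding_boxes, mode):
    """Combine boxes by 'union' or 'intersection'."""
    ulx_values, uly_values, lrx_values, lry_values = zip(*bounding_boxes)
    if mode == 'union':
        return [min(ulx_values), max(uly_values),
                max(lrx_values), min(lry_values)]
    if mode == 'intersection':
        merged = [max(ulx_values), min(uly_values),
                  min(lrx_values), max(lry_values)]
        # An empty intersection has no width or no height.
        if merged[0] >= merged[2] or merged[1] <= merged[3]:
            raise ValueError('bounding boxes do not intersect')
        return merged
    raise ValueError('unknown mode %r' % (mode,))


box_a = [0, 10, 10, 0]
box_b = [5, 15, 20, 5]
union_box = merge_bounding_boxes([box_a, box_b], 'union')
intersection_box = merge_bounding_boxes([box_a, box_b], 'intersection')
```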

natcap.invest.pygeoprocessing_0_3_3.vectorize_points(shapefile, datasource_field, dataset, randomize_points=False, mask_convex_hull=False, interpolation='nearest')

Interpolate values in shapefile onto given raster.

Takes a shapefile of points and a field defined in that shapefile and interpolates the values in the points onto the given raster

Parameters:
  • shapefile – ogr datasource of points
  • datasource_field – a field in shapefile
  • dataset – a gdal dataset; must be in the same projection as shapefile
Keyword Arguments:
 
  • randomize_points (boolean) – (description)
  • mask_convex_hull (boolean) – (description)
  • interpolation (string) – the interpolation method to use for scipy.interpolate.griddata(). Default is ‘nearest’
Returns:

None

natcap.invest.pygeoprocessing_0_3_3.vectorize_points_uri(shapefile_uri, field, output_uri, interpolation='nearest')

Interpolate values in shapefile onto given raster.

A wrapper function for pygeoprocessing.vectorize_points, that allows for uri passing.

Parameters:
  • shapefile_uri (string) – a uri path to an ogr shapefile
  • field (string) – a string for the field name
  • output_uri (string) – a uri path for the output raster
  • interpolation (string) – interpolation method to use on points, default is ‘nearest’
Returns:

None