InVEST +VERSION+ documentation

Crop Production



Please note that the Crop Production model is currently under active development. We do not recommend using this model in decision-making contexts until it has been further tested.


Expanding agricultural production and closing yield gaps is a key strategy for many governments and development agencies focused on poverty alleviation and achieving food security. However, conversion of natural habitats to agricultural production sites impacts other ecosystem services that are key to sustaining the economic benefits that agriculture provides to local communities. Intensive agricultural practices can add to pollution loads in water sources, often necessitating future costly water purification methods. Overuse of water also threatens the supply available for hydropower or other services. Still, crop production is essential to human well-being and livelihoods. The InVEST crop production model allows detailed examination of the costs and benefits of this vital human enterprise, allowing exploration of questions such as:

  • How would different arrangement or selection of cropping systems compare to current systems in terms of total production? Could switching crops yield higher economic returns or nutritional value?
  • What are the impacts of crop intensification on ecosystem services? If less land is used to produce equal amounts of food by increasing intensification, is the net result on ecosystem services production positive or negative?
  • How can we evaluate different strategies for meeting increasing food demand while minimizing the impact on ecosystem services?

The Model

The InVEST crop production model will produce estimates of crop yield, from existing data, percentile summaries, and modeled predictions. For existing or modeled crop yields, the model can also generate estimates of crop value.

Observed data: The crop yield model supplies observed yields, based on FAO and sub-national datasets for 175 crops, as tons/ha (Monfreda et al. 2008). If a crop type submitted by the user is not grown in that region, the model will not return a value for those pixels; crops can be moved around within a region in which they are grown, but novel cropping systems cannot be introduced in minimum mode. The model will also return existing inputs for that crop (in that region) as a percent of land irrigated (for 15 crops for which there are data), and amount of N, P, and K applied/ha (for 140 crops for which there are data). The model can provide nutrition information for all crops and economic production if additional cost information is provided for fertilizer, nutrients, labor, seed, and machinery (this information is already included in the model for 12 staple crops in 2012: barley, maize, oil palm, potato, rapeseed, rice, rye, soybean, sugar beet, sugar cane, sunflower, and wheat).

Percentile summaries: This option allows the user to explore yields under different management scenarios, picking from a range of “intensification” levels. The dataset associated with this model contains CSV tables for each crop, listing the yield for the 25th, 50th, 75th, and 95th percentiles, amongst observed yield data in each of the crop’s climate bins.

Modeled yields: For 12 staple crops for which yields have been modeled globally by Mueller et al. (2011), the model can provide estimates of both yields and inputs (fertilizer and irrigation), in the same units as above. These crops include barley, maize, oil palm, potato, rapeseed, rice, rye, soybean, sugar beet, sugar cane, sunflower, and wheat. To run this model, the user must provide rasters of nitrogen, phosphate, and potash application rate (kg/ha) and an irrigation raster (0 for pixels that are not irrigated and 1 for pixels that are) that cover all cropped areas of interest. The model returns crop yields and economic and nutritional value.

The crop value model can use the yields and/or inputs generated by the yield model, or can be run with yield maps derived from other models or data sources (e.g., SSURGO). Crop yields can be valued in terms of economic returns or in terms of nutrition. To calculate economic returns, the model requires yield maps, as well as maps of fertilizer and irrigation rates corresponding to those yields, and combines this information with crop price and cost datasets to calculate the total expected returns (yields x area x price – inputs x input costs – area x other costs). To calculate nutrition, the model only requires yield maps of all food crops produced, and the user can select from 33 macro and micronutrients to map or summarize. This model can be combined with our nutrition demand model, which multiplies population density by recommended daily allowances of the same nutrients, to determine what proportion of nutritional requirements can be met from local food production.

How it Works

Calculating Yield and Production

Method 1: Observed Regional Yields (Observed)

\(ProductionPerCell_{crop,x,y} = { ObservedLocalYieldPerHectare_{crop,x,y} * HectaresPerCell }\)

\(ProductionTotal_{crop} = \sum_{x,y}{ ProductionPerCell_{crop,x,y} }\)

Method 2: Climate-specific Distribution of Observed Yields (Percentile)

\(YieldPerHectare_{crop,percentile,x,y} = \left( ObservedClimateBinYield_{crop, precentile, climatebin} \mid ClimateBin_{x, y} \right)\)

\(ProductionPerCell_{crop,percentile,x,y} = YieldPerHectare_{crop,percentile,x,y} * HectaresPerCell\)

\(ProductionTotal_{crop,percentile} = \sum_{x,y}{ ProductionPerCell_{crop,percentile,x,y} }\)

Method 3: Yield Regression Model with Climate-specific Parameters (Modeled)

\(PercentMaxYieldNitrogen_{x,y} = \left( 1 - Bnp_{crop,climatebin} * e^{-Cn_{crop,climatebin} * NitrogenAppRate_{x,y}} \mid ClimateBin_{x, y} \right)\)

\(PercentMaxYieldPhosphorus_{x,y} = \left( 1 - Bnp_{crop,climatebin} * e^{-Cp_{crop,climatebin} * PhosphorusAppRate_{x,y}} \mid ClimateBin_{x, y} \right)\)

\(PercentMaxYieldPotassium_{x,y} = \left( 1 - Bk_{crop,climatebin} * e^{-Ck_{crop,climatebin} * PotassiumAppRate_{x,y}} \mid ClimateBin_{x, y} \right)\)

\(MaxYieldNitrogen_{x,y} = MaxYield_{crop,climatebin} * PercentMaxYieldNitrogen_{x,y}\)

\(MaxYieldPhosphorus_{x,y} = MaxYield_{crop,climatebin} * PercentMaxYieldPhosphorus_{x,y}\)

\(MaxYieldPotassium_{x,y} = MaxYield_{crop,climatebin} * PercentMaxYieldPotassium_{x,y}\)

\(YieldPerHectare_{crop,x,y} = \left\{ \begin{matrix} min\left( MaxYieldNitrogen, MaxYieldPhosphorus, MaxYieldPotassium \right) & if & irrigated \\ min\left( MaxYieldNitrogen, MaxYieldPhosphorus, MaxYieldPotassium, MaxYieldRainfed \right) & if & rainfed \end{matrix} \right\}\)

\(ProductionPerCell_{crop,x,y} = YieldPerHectare_{crop,x,y} * HectaresPerCell_{x,y}\)

\(ProductionTotal_{crop} = \sum_{x,y}{ ProductionPerCell_{crop,x,y} }\)

Calculating Nutritional Contents from Production

\(NutrientAmount_{crop, nutrient} = NutrientAmountPerTonCrop_{crop, nutrient} * ProductionTotal_{crop} * (1 - FractionRefuse)\)

\(NutrientAmountTotal_{nutrient} = \sum_{crops}{ NutrientAmount_{crop, nutrient} }\)

Calculating Economic Returns

\(KilogramInputTotalCosts_{crop, x, y} = \sum_{fertilizer} \left( { FertKgPerHectare_{fertilizer,x,y} * CostPerKg_{crop, fertilizer} * HectaresPerCell } \right)\)

\(HectareInputTotalCosts_{crop, x, y} = { \sum_{inputs}{ CostPerHectare_{input,x,y}} * HectaresPerCell }\)

\(Cost_{crop, x, y} = KilogramInputTotalCosts_{crop, x, y} + HectareInputTotalCosts_{crop, x, y}\)

\(Revenue_{crop, x, y} = Production_{crop, x, y} * Price_{crop}\)

\(Returns_{crop, x, y} = Revenue_{crop, x, y} - Cost_{crop, x, y}\)

\(ReturnsTotal_{crop} = \sum_{x, y} Returns_{crop, x, y}\)

Limitations and Simplifications

The current version of the model is a coarse global model driven mostly by climate and optionally by management. This model is therefore not able to capture the variation in productivity that occurs across heterogeneous landscapes. A rocky hill slope and a fertile river valley, if they share the same climate, would be assigned the same yield in the current model. This is a problem if the question of interest is where: where to prioritize future habitat conversion; or where farming is most productive and least destructive.

Spatial downscaling of the current coarse global model is necessary to make the crop model more useful in local land-use decisions. Our approach will be to acquire local yield data that can be compared to the regression model yields to determine where the model is overestimating yields and where it is underestimating. The resulting differences can be related to other variables such as slope, aspect, elevation, soil fertility, and soil depth, and any significant relationships can be used to refine the current model. The coarse model will still be used to arrive at the general magnitude of yield for a given climate and intensification level, and the finer-scale differences will essentially tune the coarse model up or down. To do this we need:

  • Field-level (or better) yield data across a wide representation of soils, topographies, and climates
  • Soil and topographic data at the same level of resolution as the yield data

If you have or intend to take such data and are interested in collaborating with us, please contact Becky Chaplin-Kramer at

Data Needs

Running the Model

General Parameters

  1. Workspace Folder The selected folder is used as the workspace where all intermediate and final output files will be written. If the selected folder does not exist, it will be created. If datasets already exist in the selected folder, they will be overwritten.
  2. Results Suffix (Optional) This text will be appended to the end of the output folders to help separate outputs from multiple runs. Please see the Interpreting Results section for an example folder structure for outputs.
  3. Lookup Table (CSV) The table should contain three columns: a ‘name’ column, a ‘code’ column, and an ‘is_crop’ column.
name code is_crop
other 0 false
maize 1 true
soybean 2 true
rice 3 true
... ...  
  1. Crop Management Scenario Map (Raster) A GDAL-supported raster representing a crop management scenario. Each cell value in the raster should be a valid integer code that corresponds to a lulc-class in the Lookup Table file. The NoData value should be set to a number not existing in the LULC Lookup Table.
int int
int int
  1. Global Dataset Folder A directory of raster datasets and CSV tables representing climate bins, yields and regressions developed by the Foley lab for this model. These data are not currently distributed with the InVEST installer and must be downloaded from:

Folder Structure

|-- spatial_dataset_folder
    |-- climate_bin_maps
    |   |-- [crop]_climate_bin_map (*.tif)
    |-- climate_percentile_yield
    |   |-- [crop]_percentile_yield_table.csv
    |-- climate_regression_yield
    |   |-- [crop]_regression_yield_table.csv
    |-- observed_yield
        |-- [crop]_yield_map (*.tif)

Embedded Data for Functions Based on Climate (Percentile and Regression Functions)

Crop Climate-Bin Maps (Rasters) A set of GDAL-supported rasters representing the climate-bin that a given area of land is located within for each particular crop. Each raster contains a set of values between 0 and 100. Zero-values represent areas that do not exist within a climate-bin, such as an ocean. Values 1 through 100 correspond to particular climate-bins. The climate-bin maps reside in the ‘climate_bin_maps’ folder of the provided spatial dataset.

int int
int int

Embedded Data for Observed Regional Yields

Observed Crop Yield Maps (Rasters) A set of GDAL-supported rasters representing the observed regional crop yield. Each cell value in the raster should be a non-negative float value representing the amount of crop produced in units of tons per hectare (tons/hectare). The observed yield maps reside in the ‘observed_yield’ folder of the provided spatial dataset.

float float
float float

Embedded Data for Climate-specific Distribution of Observed Yields

Percentile Yield Table (CSV) The provided CSV tables should contain information about the average crop yield occurring within each climate-bin across several income levels for each crop in units of tons per hectare (tons/ha). The table must have a ‘climate_bin’ column containing values 0 through 100. The table must have at least one additional column representing a percentile yield within the given climate-bin for a particular crop - an example set of columns could be: ‘yield_25th’, ‘yield_50th’, ‘yield_75th’, ‘yield_95th’. So, this example table would have the following columns: ‘crop’, ‘climate_bin’, ‘yield_25th’, ‘yield_50th’, ‘yield_75th’, ‘yield_95th’. Each file should be prepended with the name of the crop in lowercase, followed by an underscore to help the program parse the file. The tables reside in the ‘climate_percentile_yield’ folder of the provided spatial dataset.

climate_bin yield_25th yield_50th yield_75th yield_95th ...
1 <float> <float> <float> <float> ...
2 <float> <float> <float> <float> ...
3 <float> <float> <float> <float> ...
... ... ... ... ... ...

e.g. ‘maize_percentile_yield_table.csv’

Embedded Data for Yield Regression Model with Climate-specific Parameters

Regression Model Yield Table (CSV) The provided CSV tables should contain information useful for calculating the yield of a crop located in a particular climate-bin based on the limiting factor. The table must have the following columns: ‘climate_bin’, ‘yield_ceiling’, ‘yield_ceiling_rf’, ‘b_nut’, ‘b_K2O’, ‘c_N’, ‘c_P2O5’, and ‘c_K2O’. Each file should be prepended with the name of the crop in lowercase, followed by an underscore to help the program search for the matching file. Currently, the regression model yield function is useful to a small subset of the crops provided in the dataset. The tables reside in the ‘climate_regression_yield’ folder of the provided spatial dataset.

climate_bin yield_ceiling yield_ceiling_rf b_nut b_K2O c_N c_P2O5 c_K2O
1 <float> <float> <float> <float> <float> <float> <float>
2 <float> <float> <float> <float> <float> <float> <float>
3 <float> <float> <float> <float> <float> <float> <float>
... ... ... ... ... ... ... ...

e.g. ‘maize_regression_yield_table.csv’

Parameters for Yield Regression Model with Climate-specific Parameters

  1. Yield Function Determines how yield is estimated in the model.
  2. Percentile Column Required for Percentile Yield Function. This input is used to select the column of yield values from the tables in the climate_percentile_yield folder of the global dataset.
  3. Fertilizer Folder (Rasters) Required for Regression Yield Function. A set of GDAL-supported rasters representing the amount of Nitrogen (N), Phosphorus (P2O5), and Potash (K2O) applied to each area of land. These maps are required for the regression model yield function and are an optional input for all yield functions when calculating economic returns. Each cell value in the raster should be a non-negative float value representing the amount of fertilizer applied in units of kilograms per hectare (kgs/ha). Each file must be named by their fertilizer (nitrogen, phosphorus, potash) in lowercase, followed by the ‘.tif’ file extension. The Fertilizer Maps should have the same dimensions and projection as the provided Crop Management Scenario Map. Global fertilizer datasets are available for download from
float float
float float

Folder Structure

|-- fertilizer_maps_folder
    |-- nitrogen.tif
    |-- phosphorus.tif
    |-- potash.tif
  1. Irrigation Map (Raster) Required for Regression Yield Function. A GDAL-supported raster representing whether irrigation occurs or not. A zero value indicates that no irrigation occurs. A one value indicates that irrigation occurs. The Irrigation Map should have the same dimensions and projection as the provided Crop Management Scenario Map.
int int
int int

Parameters for Calculating Nutritional Contents from Production

  1. Nutrient Contents Table (CSV) A CSV table containing information about the nutrient contents of each crop. The values provided are assumed to be given in relation to one ton of harvest crop biomass. The ‘crop’ and ‘fraction_refuse’ columns must be provided in the table. The ‘fraction_refuse’ column is expected to contain a value between 0 and 1 representing the fraction of the harvested crop that is considered refuse and does not contain any nutritional value.
crop fraction_refuse protein lipid energy ca ph ...
maize <float> <float> <float> <float> <float> <float> ...
soybean <float> <float> <float> <float> <float> <float> ...
... ... ... ... ... ... ... ...

Parameters for Calculating Economic Returns

  1. Economics Table (CSV) A CSV table containing information related to the market price of a given crop and the costs involved with producing that crop.
crop price_per_ton cost_nitrogen_per_kg cost_phosphorus_per_kg cost_potash_per_kg cost_labor_per_ha cost_machine_per_ha cost_seed_per_ha cost_irrigation_per_ha
maize <float> <float> <float> <float> <float> <float> <float> <float>
soybean <float> <float> <float> <float> <float> <float> <float> <float>
... ... ... ... ... ... ... ... ...

Interpreting Results

Outputs Folder Structure

A unique set of outputs shall be created for each yield function that is run such that the folder structure may look as follows:

|-- outputs
    |-- yield.tif
    |-- nutritional_contents.csv
    |-- financial_analysis.csv


  1. Crop Yield Map (Raster) A set of GDAL-supported rasters spatially representing the per-cell yield. Each cell value in the raster shall be a non-negative float value representing the yield area under the given scenario in units of tons.
float float
float float
  1. Nutritional Contents Table (CSV)
crop total_yield (nutrient_a) (nutrient_b) (etc.)
maize <float> <float> <float> ...
soybean <float> <float> <float> ...
... ... ... ... ...
  1. Financial Analysis Table (CSV)
crop total_yield costs returns revenues
maize <float> <float> <float> <float>
soybean <float> <float> <float> <float>
... ... ... ... ...


Monfreda et al. 2008

Mueller et al. 2012

Appendix I

Available Crop Data within Global Dataset

Crop Observed Model Percentile Model Regression Model
Abaca Yes Yes No
Agave Yes Yes No
Alfalfa Yes Yes No
Almond Yes Yes No
Aniseetc Yes Yes No
Apple Yes Yes No
Apricot Yes Yes No
Areca Yes Yes No
Artichoke Yes Yes No
Asparagus Yes Yes No
Avacado Yes Yes No
Bambara Yes Yes No
Banana Yes Yes No
Barley Yes Yes Yes
Bean Yes Yes No
Beetfor Yes Yes No
Berrynes Yes Yes No
Blueberry Yes Yes No
Brazil Yes Yes No
Broadbean Yes Yes No
Buckwheat Yes Yes No
Cabbage Yes Yes No
Cabbagefor Yes Yes No
Canaryseed Yes Yes No
Carob Yes Yes No
Carrot Yes Yes No
Carrotfor Yes Yes No
Cashew Yes Yes No
Cashewapple Yes Yes No
Cassava Yes Yes No
Castor Yes Yes No
Cauliflower Yes Yes No
Cerealnes Yes Yes No
Cherry Yes Yes No
Chestnut Yes Yes No
Chickpea Yes Yes No
Chicory Yes Yes No
Chilleetc Yes Yes No
Cinnamon Yes Yes No
Citrusnes Yes Yes No
Clove Yes Yes No
Clover Yes Yes No
Cocoa Yes Yes No
Coconut Yes Yes No
Coffee Yes Yes No
Coir Yes No No
Cotton Yes Yes No
Cowpea Yes Yes No
Cranberry Yes Yes No
Cucumberetc Yes Yes No
Currant Yes Yes No
Date Yes Yes No
Eggplant Yes Yes No
Fibrenes Yes Yes No
Fig Yes Yes No
Flax Yes Yes No
Fonio Yes Yes No
Fornes Yes Yes No
Fruitnes Yes Yes No
Garlic Yes Yes No
Ginger Yes Yes No
Gooseberry Yes Yes No
Grape Yes Yes No
Grapefruitetc Yes Yes No
Grassnes Yes Yes No
Greenbean Yes Yes No
Greenbroadbean Yes Yes No
Greencorn Yes Yes No
Greenonion Yes Yes No
Greenpea Yes Yes No
Groundnut Yes Yes No
Gums Yes No No
Hazelnut Yes Yes No
Hemp Yes Yes No
Hempseed Yes Yes No
Hop Yes Yes No
Jute Yes Yes No
Jutelikefiber Yes Yes No
Kapokfiber Yes Yes No
Kapokseed Yes Yes No
Karite Yes Yes No
Kiwi Yes Yes No
Kolant Yes Yes No
Legumenes Yes Yes No
Lemonlime Yes Yes No
Lentil Yes Yes No
Lettuce Yes Yes No
Linseed Yes Yes No
Lupin Yes Yes No
Maize Yes Yes Yes
Maizefor Yes Yes No
Mango Yes Yes No
Mate Yes Yes No
Melonetc Yes Yes No
Melonseed Yes Yes No
Millet Yes Yes No
Mixedgrain Yes Yes No
Mixedgrass Yes Yes No
Mushroom Yes Yes No
Mustard Yes Yes No
Nutmeg Yes Yes No
Nutnes Yes Yes No
Oats Yes Yes No
Oilpalm Yes Yes Yes
Oilseedfor Yes Yes No
Oilseednes Yes Yes No
Okra Yes Yes No
Olive Yes Yes No
Onion Yes Yes No
Orange Yes Yes No
Papaya Yes Yes No
Pea Yes Yes No
Peachetc Yes Yes No
Pear Yes Yes No
Pepper Yes Yes No
Peppermint Yes Yes No
Persimmon Yes Yes No
Pigeonpea Yes Yes No
Pimento Yes Yes No
Pineapple Yes Yes No
Pistachio Yes Yes No
Plantain Yes Yes No
Plum Yes Yes No
Popcorn Yes Yes No
Poppy Yes Yes No
Potato Yes Yes Yes
Pulsenes Yes Yes No
Pumpkinetc Yes Yes No
Pyrethrum Yes Yes No
Quince Yes Yes No
Quinoa Yes Yes No
Ramie Yes Yes No
Rapeseed Yes Yes No
Raspberry Yes Yes No
Rice Yes Yes Yes
Rootnes Yes Yes No
Rubber Yes Yes No
Rye Yes Yes No
Ryefor Yes Yes No
Safflower Yes Yes No
Sesame Yes Yes No
Sisal Yes Yes No
Sorghum Yes Yes No
Sorghumfor Yes Yes No
Soybean Yes Yes Yes
Sourcherry Yes Yes No
Spicenes Yes Yes No
Spinach Yes Yes No
Stonefruitnes Yes Yes No
Strawberry Yes Yes No
Stringbean Yes Yes No
Sugarbeet Yes Yes Yes
Sugarcane Yes Yes Yes
Sugarnes Yes Yes No
Sunflower Yes Yes Yes
Swedefor Yes Yes No
Sweetpotato Yes Yes No
Tangetc Yes Yes No
Taro Yes Yes No
Tea Yes Yes No
Tobacco Yes Yes No
Tomato Yes Yes No
Triticale Yes Yes No
Tropicalnes Yes Yes No
Tung Yes Yes No
Turnipfor Yes Yes No
Vanilla Yes Yes No
Vegetablenes Yes Yes No
Vegfor Yes Yes No
Vetch Yes Yes No
Walnut Yes Yes No
Watermelon Yes Yes No
Wheat Yes Yes Yes
Yam Yes Yes No
Yautia Yes Yes No

Fertilizer Units

Band 1: Kg/ha

Band 2: Precision

  • any previous number + .25 = any one of the previous data types but scaling of application rates was maxed out at a doubling when trying to match the FAO consumption

Appendix II - Statistics

Climate Bin Fertilizer

Climate Bin Correlation Coefficient