Hawth's Analysis Tools for ArcGIS




Input: a line (polyline) layer and any number of raster layers representing continuous data (not categorical data)
Output: a statistical summary of the raster data along each line is written to the line attribute table

  • this tool calculates the length of line that falls within each of the raster cells, and creates a statistical summary based on those segments
  • calculates: length weighted mean, length weighted standard deviation (see below), the minimum and maximum values encountered, and a segment count (see below)
  • the user can control which of these statistical summary metrics are written to the attribute table using check-mark boxes
  • this is a processing-intensive tool, so it can take considerable time to run on layers containing many thousands of line features
  • NoData: this tool ignores NoData cells and builds the statistical summary from the remaining cell values
  • -999 is used to represent NoData in the summary statistics produced (e.g. if no summary statistics can be produced for a given line)
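
The NoData rules above can be illustrated with a minimal Python sketch. This is not the tool's own code: the function name, the (length, value) pair layout, and the `raster_nodata` parameter are assumptions made for illustration only.

```python
NODATA_OUT = -999  # value written when no statistic can be produced for a line


def min_max_ignoring_nodata(segments, raster_nodata):
    """Return (min, max) of the cell values along one line.

    segments: list of (length, cell_value) pairs, one per line segment.
    raster_nodata: the value that marks NoData cells (hypothetical parameter).
    """
    # Drop every segment that falls in a NoData cell.
    values = [value for _, value in segments if value != raster_nodata]
    if not values:
        # Nothing left to summarise, so report the -999 sentinel.
        return (NODATA_OUT, NODATA_OUT)
    return (min(values), max(values))
```

For a line that crosses only NoData cells, this sketch returns -999 for both statistics, mirroring the behaviour described in the list above.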


  • both the vector and raster layers MUST be in a projected coordinate system (this tool will not produce logical results if a geographic coordinate system is used)
  • the projections of the vector and raster layers MUST match exactly: this tool does not perform on-the-fly projection changes
  • this tool is designed to work with shapefiles; other vector formats have not been tested



Getting started, and sampling issues. A line layer and one or more raster layers must be loaded into ArcMap in order to use this tool. Ensure that you have write access to the attribute table of the line layer. All of the raster layers should represent continuous data (e.g. elevation, biomass, slope, etc.); the tool produces a statistical summary of raster values along each line and as such cannot be used with categorical raster data. Note that the sampling unit is the polyline associated with each record in the attribute table. It is possible for one polyline to contain several component lines that are spatially disparate (i.e. multipart polylines in ESRI’s terminology). When you examine the results of this tool it is therefore essential to understand whether some or all of your polylines actually consist of several spatially distinct component lines, as this can have a profound influence on how you use or interpret the results. If you wish to separate those component lines into different polylines for the purpose of statistical analysis, use the Convert Multipart to Singlepart tool in ArcToolbox on your dataset before running this tool.

Field names and naming conventions. Automatic handling of naming is essential to allow the user to process many rasters at once. However, this does mean that you should be aware of the naming convention used. There are two options: 1) if only one summary statistic is selected you have the option of using the first 10 letters of the raster name as the new field name, with no prefix or suffix appended, or 2) the first 6 letters of the raster name are used and a four letter suffix that represents the summary statistic is appended. For either of these options to work, your raster names need to be unique in the first 6 or 10 letters of the name. This tool takes the field names from the raster layer names shown in the ArcMap table of contents, and you can temporarily change those names there (click twice slowly on the raster name). It is worth taking the time to change long raster names to shorter names that are meaningful and unique in the first 6 characters before you start the tool.
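
As a quick check before running the tool, the uniqueness requirement above can be tested with a few lines of Python. This is only a sketch: the function name is hypothetical, the comparison is assumed to be case-sensitive, and the four letter suffixes are not reproduced here because they are not listed above.

```python
def find_name_clashes(raster_names, prefix_len=6):
    """Group raster layer names that share the same first `prefix_len` characters.

    Returns a dict mapping each shared prefix to the names that collide on it.
    Use prefix_len=10 for the single-statistic option, 6 otherwise.
    """
    groups = {}
    for name in raster_names:
        groups.setdefault(name[:prefix_len], []).append(name)
    # Keep only the prefixes claimed by more than one raster.
    return {prefix: names for prefix, names in groups.items() if len(names) > 1}
```

For example, `find_name_clashes(["elevation_dem", "elevation_srtm", "slope"])` reports that both elevation layers share the prefix `elevat`, so one of them should be renamed first.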

Segment Count. A polyline consists of one or more straight-line geometry segments, the lines that connect consecutive nodes or vertices. When the polyline is intersected with the raster, each of these geometry segments is further broken into smaller segments, each of which corresponds to the piece of the geometry segment that passes through a single raster cell. The segment count recorded by this tool is different from the number of segments recorded in the geometry of the polyline. E.g. if the polyline has a start node, 3 vertices, and an end node, then the segment count from the geometry perspective is 4. The segment count reported here refers specifically to the number of segments that result when the geometry segments are split into pieces at the boundaries of raster cells.
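
The splitting described above can be sketched in Python for a single straight geometry segment. This is an illustration under stated assumptions, not the tool's implementation: square cells, a grid origin at the lower-left corner, rows counted upward from that origin, and a hypothetical function name and signature.

```python
import math


def split_segment_by_cells(x0, y0, x1, y1, cell_size, origin_x=0.0, origin_y=0.0):
    """Split one straight segment at raster cell boundaries.

    Returns a list of (length, col, row) tuples, one per piece.
    """
    dx, dy = x1 - x0, y1 - y0
    total = math.hypot(dx, dy)
    # Collect the positions t in [0, 1] along the segment where it crosses a grid line.
    cuts = {0.0, 1.0}
    for start, delta, origin in ((x0, dx, origin_x), (y0, dy, origin_y)):
        if delta == 0:
            continue  # parallel to this set of grid lines, so no crossings
        lo, hi = sorted((start, start + delta))
        k = math.ceil((lo - origin) / cell_size)
        while origin + k * cell_size < hi:
            t = (origin + k * cell_size - start) / delta
            if 0.0 < t < 1.0:
                cuts.add(t)
            k += 1
    ordered = sorted(cuts)
    pieces = []
    for t_a, t_b in zip(ordered, ordered[1:]):
        # The midpoint of each piece identifies the cell it lies in.
        t_mid = (t_a + t_b) / 2.0
        col = math.floor((x0 + t_mid * dx - origin_x) / cell_size)
        row = math.floor((y0 + t_mid * dy - origin_y) / cell_size)
        pieces.append(((t_b - t_a) * total, col, row))
    return pieces
```

A horizontal segment from (0.5, 0.5) to (2.5, 0.5) on a grid with a cell size of 1 yields three pieces, so it contributes 3 to the segment count even though it is a single geometry segment. Crossings that are nearly but not exactly coincident (e.g. passing very close to a cell corner) can produce very short pieces in this sketch.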

Length weighted mean (LWM). The LWM is calculated by multiplying the length of each segment (see above) by the raster cell value of that segment, summing this product across all segments, and finally dividing that sum by the total length of the polyline:

$$\mathrm{LWM} = \frac{1}{L} \sum_{i=1}^{n} l_i v_i$$

where $l_i$ is the length of segment $i$, $v_i$ is the value of the raster cell for that segment, $n$ is the segment count, and $L$ is the total line length.
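
The calculation described above can be written as a short Python sketch. The function name and the (length, value) pair layout are illustrative assumptions, and the total line length is assumed here to be the sum of the segment lengths passed in (i.e. any NoData segments have already been removed).

```python
def length_weighted_mean(segments):
    """Length weighted mean of the cell values along one line.

    segments: list of (length, value) pairs, one per segment.
    """
    total_length = sum(length for length, _ in segments)
    if total_length == 0:
        return -999  # no usable length, so report the tool's NoData value
    weighted_sum = sum(length * value for length, value in segments)
    return weighted_sum / total_length
```

For a line with 2 units of length in a cell of value 10 and 1 unit in a cell of value 40, the result is (2 × 10 + 1 × 40) / 3 = 20, closer to 10 than the unweighted mean of 25 because more of the line lies in the first cell.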

Length weighted standard deviation (LWSD). This value is calculated by subtracting the LWM (see above) from the raster cell value of each segment, squaring this difference, multiplying by the length of the segment divided by the total line length, summing this across all segments, dividing by n-1 where n is the segment count, and finally taking the square root of this value:

$$\mathrm{LWSD} = \sqrt{\frac{\sum_{i=1}^{n} \frac{l_i}{L} \left(x_i - \bar{x}\right)^2}{n - 1}}$$

where $n$ is the number of segments, $x_i$ is the raster cell value of segment $i$, $\bar{x}$ is the length weighted mean, $l_i$ is the length of segment $i$, and $L$ is the total line length.
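
The LWSD steps above translate into the following Python sketch. As before, the function name and input layout are assumptions for illustration. Returning -999 when fewer than two segments are available is also an assumption, made because the n-1 divisor would otherwise be zero.

```python
import math


def length_weighted_sd(segments):
    """Length weighted standard deviation of the cell values along one line.

    segments: list of (length, value) pairs, one per segment.
    """
    n = len(segments)
    total_length = sum(length for length, _ in segments)
    if n < 2 or total_length == 0:
        return -999  # assumed handling: the statistic cannot be produced
    mean = sum(length * value for length, value in segments) / total_length
    # Squared deviation from the LWM, weighted by each segment's share of the line.
    weighted_sq = sum(
        (length / total_length) * (value - mean) ** 2 for length, value in segments
    )
    return math.sqrt(weighted_sq / (n - 1))
```

With segments (2, 10) and (1, 40), the LWM is 20, the weighted sum of squares is (2/3) × 100 + (1/3) × 400 = 200, and the LWSD is the square root of 200 / 1, roughly 14.14.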
While this statistic does provide some indication of the variability of the raster data that underlies the line, it should be interpreted with caution. Longer lines are likely to have larger LWSD values simply because they cross more raster cells. Often, the length of polylines is arbitrary and results from decisions made by digitizers or software regarding where to break polylines. Arbitrary line lengths make this statistic particularly difficult to interpret. Applications that involve non-arbitrary lines, like transect lines or possibly animal movement paths, might find this statistic more useful as long as you interpret it carefully.
For any serious evaluation of variability along a line/transect I think some of the rigorous spatial statistical techniques, like the family of local quadrat variance statistics, would be the most appropriate approach. I recommend the following excellent text for further information:
Dale, M.R.T. 1999. Spatial Pattern Analysis in Plant Ecology. Cambridge University Press.
