GME

Geospatial Modelling Environment

pointdistances (Distances Among Points)

Calculates distances between points, either among the points within a single point data source or between the points of two point data sources.

Description

This tool calculates distances between points, either among the points within a single point data source or between the points of two point data sources. It can also identify the N nearest neighbours, and it offers three output formats. The tool is designed to produce flexible distance matrix outputs that can be used in subsequent analyses.

The simplest implementation of the tool is to specify an input point dataset, identify the unique ID field for that dataset, and specify the output file. The tool will calculate distances between each of the points within this dataset and write the result using the default format.

There are three formats to choose from. The first (1D) writes a table that has three columns: the unique ID of each of the two points, and the distance between them. This format is the default because it can accommodate very large numbers of points, and it does not report distances twice (i.e. if the tool calculates the distance between points A and B, it will not also report the distance between points B and A). The second option (2D) writes an NxN table (the full distance matrix), where N is the total number of points in the dataset. This option is only recommended with small numbers of points because the output is unlikely to be useful for large numbers of points (you would not be able to open it in Excel or import it into Access, for instance). The 2D format reports all values twice (e.g. A and B, and B and A), and all the cells on the diagonal are 0 (the distance between a point and itself). The final format (SUMMARY) generates a simple statistical summary (mean, minimum, maximum, standard deviation, and optionally median) of the distances between a given point and all the other points. The output table has N rows, one per point, with the summary statistics as columns.
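The three layouts can be illustrated with a minimal Python sketch. This is not GME's implementation: the sample points, the function names, and the use of the sample standard deviation are all assumptions made for illustration (the documentation does not say which standard deviation GME reports).

```python
import math
import statistics
from itertools import combinations

# Hypothetical sample data: unique ID -> (x, y) in projected units (e.g. metres).
points = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0)}


def dist(p, q):
    """Planar (Cartesian) distance between two (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def format_1d(pts):
    """One row per unordered pair: (id1, id2, distance). No pair is repeated."""
    return [(a, b, dist(pts[a], pts[b])) for a, b in combinations(sorted(pts), 2)]


def format_2d(pts):
    """Full NxN matrix: every pair appears twice and the diagonal is 0."""
    ids = sorted(pts)
    return ids, [[dist(pts[a], pts[b]) for b in ids] for a in ids]


def format_summary(pts):
    """One row per point, summarising its distances to all other points."""
    rows = {}
    for a in sorted(pts):
        d = [dist(pts[a], pts[b]) for b in pts if b != a]
        rows[a] = {
            "mean": statistics.mean(d),
            "min": min(d),
            "max": max(d),
            # Assumption: sample standard deviation (needs at least 2 distances).
            "sd": statistics.stdev(d),
        }
    return rows


pairs = format_1d(points)
ids, matrix = format_2d(points)
summary = format_summary(points)
```

With N points the 1D layout holds N(N-1)/2 rows while the 2D layout holds N x N cells, which is why the 1D layout scales better to large datasets.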

Nearest neighbours can be identified at the same time using the 'nearest' and 'nout' parameters. The nearest neighbour output is written to a separate table, which lists, for each point, the IDs of its n nearest neighbours in order of increasing distance, together with the distance to each one.
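As a rough illustration of what the nearest neighbour table contains, the following Python sketch ranks the other points by distance and keeps the first n. The function name and sample data are hypothetical, and the brute-force approach is an assumption chosen for clarity, not a description of how GME does it.

```python
import math

# Hypothetical sample data: unique ID -> (x, y) in projected units.
points = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (0.0, 1.0), "D": (10.0, 0.0)}


def nearest_neighbours(pts, n):
    """For each point, return its n closest other points as (id, distance),
    ordered from nearest to farthest."""
    result = {}
    for a, (ax, ay) in pts.items():
        # Distance from point a to every other point, paired with that point's ID.
        ranked = sorted(
            (math.hypot(ax - bx, ay - by), b)
            for b, (bx, by) in pts.items()
            if b != a
        )
        result[a] = [(b, d) for d, b in ranked[:n]]
    return result


neighbours = nearest_neighbours(points, 2)
```

Here `neighbours["A"]` lists C first (distance 1) and then B (distance 5), mirroring the ordered ID and distance columns described above.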

The 'multiplier' option provides you with a way of changing the units of the distance value. By default the distance is calculated in coordinate system units, e.g. meters for UTM. If you wanted these distances in kilometers then you would set the multiplier to be 0.001.

To change the delimiting character in the output table (if you are in a country that prefers not to use the comma as the delimiting character), please refer to the 'delimiter' command, which must be issued before running this tool. However, you only have to issue that command once per session as it is a global setting.

The median is not calculated by default because it can be computationally expensive for large numbers of points. The median is calculated only if you specify both the 'format=SUMMARY' and 'median=TRUE' options.

This tool will attempt to ignore points with null geometries. These records can arise in a number of ways, but are particularly common in the output of address geocoding algorithms (the records that the algorithm failed to resolve). They are also common in GPS telemetry databases. It is advisable to either repair or remove these records prior to running GME commands. You will be warned if this command detects null geometries.

This tool should not be used with data in a geographic coordinate system. The distance calculations here assume the coordinates are Cartesian (i.e. a projected coordinate system), not spherical.
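To see why this matters, the following Python sketch compares a planar distance computed directly on longitude/latitude degrees with a great-circle distance from the haversine formula. The haversine formula, the spherical Earth radius of 6371 km, and the sample coordinates are assumptions brought in for illustration; they are not part of GME.

```python
import math


def planar(p, q):
    """Cartesian distance, treating the coordinates as plain x/y values."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in km on a sphere (assumed mean Earth radius)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Two hypothetical point pairs, each one degree of longitude apart (lon, lat).
equator = ((10.0, 0.0), (11.0, 0.0))
north = ((10.0, 60.0), (11.0, 60.0))

planar_equator = planar(*equator)
planar_north = planar(*north)
great_equator = haversine_km(*equator[0], *equator[1])
great_north = haversine_km(*north[0], *north[1])
```

Both planar results are 1.0 "degree", yet the true separation at 60 degrees latitude is roughly half of the separation at the equator (about 111 km). Projecting the data first avoids this distortion.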

Note that this tool will not overwrite existing files, so you should ensure the output file does not already exist.

Syntax

pointdistances(in, fld, out, [in2], [fld2], [format], [nearest], [nout], [multiplier], [median], [where]);

in: the input point dataset (calculates distances between points within this dataset - but see 'in2')
fld: the unique ID field in the input dataset
out: the output delimited text file to create
[in2]: the second point layer (calculates distances between points between datasets)
[fld2]: the unique ID field in the second input dataset (MUST be specified if 'in2' is specified)
[format]: the structure of the output table: 1D (default, one record per row), 2D (an NxN matrix), SUMMARY (summary statistics for each point only) (options: 1D, 2D, SUMMARY)
[nearest]: the number of nearest neighbours to identify for each input point (default=0); note that 'nout' must also be specified with this option
[nout]: the output delimited text file to create for the nearest neighbour dataset
[multiplier]: multiplies the distance by this value before writing to output (default=1)
[median]: (TRUE/FALSE) determines whether the median is also calculated if the 'summary' format is specified (this can add considerably to processing time for large datasets, default=FALSE)
[where]: the filter/selection statement that will be applied to the first ('in', not 'in2') point feature class to identify a subset of points to process

Example

pointdistances(in="C:\data\locs.shp", fld="RECID", out="C:\data\distances.csv", format="SUMMARY", multiplier=0.001);

pointdistances(in="C:\data\predators.shp", fld="ANID", out="C:\data\distances.csv", in2="C:\data\prey.shp", fld2="PREYID");

pointdistances(in="C:\data\firestations.shp", fld="STID", out="C:\data\distances.csv", in2="C:\data\houses.shp", fld2="PROPID", nearest=10, nout="C:\data\nearestheigh.csv", median=TRUE);


Copyright © 2001-2014 Hawthorne L. Beyer, Ph.D., Spatial Ecology LLC