Summary statistics are calculated by the Aggregate Points, Summarize Within, Summarize Nearby, Join Features, and Dissolve Boundaries tools.
Equations
For line and polygon features, mean and standard deviation are calculated as a weighted mean and a weighted standard deviation. The weight is the length or area of the feature that falls within the boundary. None of the statistics for point features are weighted.
The following table shows which of standard deviation, weighted mean, and weighted standard deviation applies to each feature type:
Statistic | Features |
---|---|
Standard Deviation | Points |
Weighted Mean | Lines and polygons |
Weighted Standard Deviation | Lines and polygons |
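As a sketch, the three statistics in the table above can be written in their usual textbook forms. The symbols are introduced here for illustration: N is the number of non-null observations, x_i are the observed values, w_i are the weights (the length or area of each feature within the boundary), and N' is the number of nonzero weights. The N - 1 denominator and the (N' - 1)/N' correction are assumptions about the normalization, not confirmed details of the tools.

```latex
% Standard deviation (points), sample form assumed
s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}},
\qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i

% Weighted mean (lines and polygons)
\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}

% Weighted standard deviation (lines and polygons), one commonly cited form
s_w = \sqrt{\frac{\sum_{i=1}^{N} w_i (x_i - \bar{x}_w)^2}
                 {\frac{N' - 1}{N'} \sum_{i=1}^{N} w_i}}
```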
Note:
Null values are excluded from all statistical calculations. For example, the mean of 10, 5, and a null value is:
(10+5)/2=7.5
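The null handling described in the note can be sketched in a few lines of Python. The function name `mean_ignoring_nulls` is hypothetical, and `None` stands in for a null attribute value; this illustrates the rule, not the tools' implementation.

```python
from statistics import mean


def mean_ignoring_nulls(values):
    """Average only the non-null entries; None represents a null value."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # every value was null, so there is nothing to average
    return mean(present)


print(mean_ignoring_nulls([10, 5, None]))  # 7.5
```

The null is dropped before counting, so the divisor is 2 rather than 3.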
Points
Point layers are summarized using only the point features within the boundary areas.
A real-life scenario in which points could be summarized is determining the total number of students in each school district. Each point represents a school. The Type field gives the type of school (primary school, middle school, or secondary school), and the Population field gives the number of students enrolled at each school.
The figure below shows a hypothetical point and boundary layer, and the table summarizes the attributes for the point layer.
ObjectID | District | Type | Population |
---|---|---|---|
1 | A | Primary school | 280 |
2 | A | Primary school | 408 |
3 | A | Primary school | 356 |
4 | A | Middle school | 361 |
5 | A | Middle school | 450 |
6 | A | Secondary school | 713 |
7 | B | Primary school | 370 |
8 | B | Primary school | 422 |
9 | B | Primary school | 495 |
10 | B | Middle school | 607 |
11 | B | Middle school | 574 |
12 | B | Secondary school | 932 |
The calculations and results for District A are given in the table below. From the results, you can see that District A has 2,568 students. When running a tool, the results would also be given for District B.
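Statistic | Result District A |
---|---|
Sum | 280 + 408 + 356 + 361 + 450 + 713 = 2,568 |
Minimum | Minimum of: 280, 408, 356, 361, 450, 713 = 280 |
Maximum | Maximum of: 280, 408, 356, 361, 450, 713 = 713 |
Mean | 2,568 / 6 = 428 |
Standard Deviation | |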
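The point statistics for both districts can be reproduced from the attribute table with Python's standard library. This is a minimal sketch: the `schools` list and the `summarize` helper are hypothetical names, and `statistics.stdev` computes the sample standard deviation (N - 1 denominator), which is an assumption about the normalization the tools use.

```python
from statistics import mean, stdev

# (district, enrollment) pairs copied from the point attribute table.
schools = [
    ("A", 280), ("A", 408), ("A", 356), ("A", 361), ("A", 450), ("A", 713),
    ("B", 370), ("B", 422), ("B", 495), ("B", 607), ("B", 574), ("B", 932),
]


def summarize(district):
    """Unweighted statistics over the points that fall in one district."""
    values = [pop for d, pop in schools if d == district]
    return {
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        # Assumption: sample standard deviation (divides by N - 1).
        "stdev": stdev(values),
    }


district_a = summarize("A")
district_b = summarize("B")
print(district_a)
print(district_b)
```

For District A this gives the sum of 2,568 stated above and a mean of 428. Swapping `stdev` for `statistics.pstdev` would give the population form instead.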
Lines
Line layers are summarized using only the proportions of the line features that are within the boundary areas.
Tip:
When summarizing lines, use fields with counts or amounts so proportional calculations make logical sense in your analysis. For example, use population rather than population density.
A real-life scenario in which you can use this analysis is determining the total volume of water in rivers within a specified boundary. Each line represents a river that is partially located inside the boundary.
The figure below shows a hypothetical line and boundary layer, and the table summarizes the attributes for the line layer.
River | Length (miles) | Volume (gallons) |
---|---|---|
Yellow | 3 | 6,000 |
Blue | 8 | 10,000 |
The calculations for volume are given in the table below. From the results, you can see that the total volume is 9,000 gallons.
Note:
The calculations use the proportions of the lines within the boundary area. For example, the yellow line has a total volume of 6,000 gallons with two of its three total miles within the boundary. Therefore, the calculations are performed using 4,000 gallons as the volume for the yellow line:
6000*(2/3)=4000
Statistic | Result |
---|---|
Sum | 4,000 + 5,000 = 9,000 |
Minimum | Minimum of: 4,000, 5,000 = 4,000 |
Maximum | Maximum of: 4,000, 5,000 = 5,000 |
Mean | |
Standard Deviation | |
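The line calculations can be sketched as follows. The yellow river's 2 miles inside the boundary comes from the note above. The blue river's 4 miles is not stated and is inferred here from the 9,000-gallon total, so treat it as an assumption. The weighted standard deviation uses one commonly cited normalization, with N' as the number of nonzero weights, which is also an assumption rather than a confirmed detail of the tools.

```python
from math import sqrt

# (name, total miles, miles inside boundary, total volume in gallons)
# Blue's 4 miles inside is inferred from the stated 9,000-gallon sum.
rivers = [
    ("Yellow", 3, 2, 6000),
    ("Blue", 8, 4, 10000),
]

# Apportion each volume by the share of the line inside the boundary.
volumes = [vol * inside / total for _, total, inside, vol in rivers]
# The weight of each line is its length inside the boundary.
weights = [inside for _, _, inside, _ in rivers]

total_volume = sum(volumes)
weighted_mean = sum(w * v for w, v in zip(weights, volumes)) / sum(weights)

# Assumed form: divide by ((N' - 1) / N') * sum of weights.
# This needs at least two nonzero weights to avoid dividing by zero.
n_nonzero = sum(1 for w in weights if w > 0)
squared = sum(w * (v - weighted_mean) ** 2 for w, v in zip(weights, volumes))
weighted_stdev = sqrt(squared / ((n_nonzero - 1) / n_nonzero * sum(weights)))

print(volumes, total_volume)
print(round(weighted_mean, 2), round(weighted_stdev, 2))
```

Because the longer stretch of the blue river carries more weight, the weighted mean lands closer to 5,000 than to 4,000.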
Polygons
Polygon layers are summarized using only the proportions of the polygon features that are within the boundary areas.
Tip:
When summarizing polygons, use fields with counts or amounts so proportional calculations make logical sense in your analysis. For example, use population rather than population density.
A real-life scenario in which you can use this analysis is determining the population in a city neighborhood. The blue outline represents the boundary of the neighborhood and the smaller polygons represent census blocks.
The figure below shows a hypothetical polygon and boundary layer, and the table summarizes the attributes for the polygon layer.
Census block | Area (miles²) | Population |
---|---|---|
Yellow | 6 | 3,200 |
Green | 6 | 4,700 |
Pink | 2.5 | 1,000 |
Blue | 8 | 4,500 |
Orange | 4 | 3,600 |
The calculations for population are given in the table below. From the results, you can see that there are 10,841 people in the neighborhood and an average (mean) of approximately 2,666 people per census block.
Note:
The calculations use the proportions of the polygons within the boundary area. For example, the yellow polygon has a total population of 3,200 with four of its six total square miles within the boundary. Therefore, the calculations are performed using approximately 2,133 as the population for the yellow polygon:
3200*(4/6)≈2133
Statistic | Result |
---|---|
Sum | 10,841 |
Minimum | |
Maximum | |
Mean | ≈ 2,666 |
Standard Deviation | |
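The proportional allocation from the note above, and the reason the tip recommends counts over densities, can be illustrated with the yellow census block. The `apportion` helper is a hypothetical name, and the sketch assumes population is spread evenly across the block. The overlap areas for the other blocks are not given, so only the yellow block is shown.

```python
def apportion(value, total_size, size_inside):
    """Scale a count by the share of the feature inside the boundary."""
    return value * size_inside / total_size


# Yellow census block: 3,200 people, 4 of 6 square miles inside the boundary.
yellow_inside = apportion(3200, 6, 4)
print(round(yellow_inside))  # 2133

# Applying the same scaling to a density gives a misleading number.
density = 3200 / 6                         # people per square mile, whole block
scaled_density = apportion(density, 6, 4)  # shrinks the density by one third

# With an even spread, the density of the part inside is unchanged.
density_inside = yellow_inside / 4
print(round(density, 2), round(scaled_density, 2), round(density_inside, 2))
```

Scaling the count gives a sensible population for the overlap, while scaling the density understates it, which is why fields with counts or amounts are the safer input.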