Tuesday, February 20, 2018

Module 6: Data Classification


This week's assignment was to use ArcGIS to create two maps with four data frames each. Each data frame applies a different classification method to the same data: the population over 65 in Miami-Dade County, FL. The first map uses the percentage of the population over 65. The second map uses the count of people over 65 normalized by square mile. One of the learning outcomes is to see how differently the same data is distributed under different classification methods and different aggregations of the data (percent of total versus count normalized by square mile). The four classification methods used in this lab are Natural Breaks, Standard Deviation, Quantile, and Equal Interval.
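
To make the two aggregations concrete, here is a minimal Python sketch of how each value could be calculated for a single census tract. The tract names, field names, and numbers are invented for illustration and are not taken from the lab data.

```python
# Hypothetical census-tract records; every number here is invented for illustration.
tracts = [
    {"name": "A", "total_pop": 4000, "pop_over_65": 600, "area_sq_mi": 0.5},
    {"name": "B", "total_pop": 1200, "pop_over_65": 480, "area_sq_mi": 6.0},
]

def percent_over_65(tract):
    """Share of the tract's residents who are over 65, as a percentage."""
    return 100.0 * tract["pop_over_65"] / tract["total_pop"]

def density_over_65(tract):
    """Residents over 65 per square mile of tract area."""
    return tract["pop_over_65"] / tract["area_sq_mi"]

for t in tracts:
    print(t["name"], percent_over_65(t), density_over_65(t))
```

In this made-up example, tract B has the higher percentage over 65 but the much lower count per square mile, so the two maps would rank the same two tracts in opposite order.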

Equal Interval: Subtracts the minimum value from the maximum value to get the range of the data, then divides that range into classes of equal width. The number of classes is chosen by the map maker. This option leaves no gaps in the data range and is fairly easy to understand. However, it can split similar values into different classes and/or group dissimilar values together.
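
The range-divided-by-classes idea can be sketched in a few lines of Python. The function name and the sample values are my own, chosen only to illustrate the arithmetic; ArcGIS does this calculation internally.

```python
def equal_interval_breaks(values, num_classes):
    """Return the upper bound of each class for an equal-interval scheme."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_classes  # every class spans the same width
    # The last break is set to the maximum itself to avoid floating-point drift.
    return [lo + width * i for i in range(1, num_classes)] + [hi]

sample = [2, 3, 4, 5, 6, 7, 8, 40]  # hypothetical values with one high outlier
print(equal_interval_breaks(sample, 4))
```

With this sample the class width is 9.5, so seven of the eight values land in the first class and the single outlier sits alone in the last one, with two empty classes in between.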

Quantile: Orders all the data numerically and divides it into classes with an equal number of observations. This option is also fairly easy to understand. However, it can leave gaps in the data range, and it can force the same or similar values into separate classes and/or group dissimilar values into the same class.
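
A rough Python sketch of the equal-count idea follows. It simply slices the sorted list into pieces of (nearly) equal length; GIS software may handle ties and break values differently, so treat the function and sample data as illustrative assumptions.

```python
def quantile_classes(values, num_classes):
    """Split sorted values into classes holding (nearly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for i in range(num_classes):
        # Integer arithmetic spreads any remainder across the classes.
        start = i * n // num_classes
        end = (i + 1) * n // num_classes
        classes.append(ordered[start:end])
    return classes

sample = [2, 3, 4, 5, 6, 7, 8, 40]  # hypothetical values with one high outlier
print(quantile_classes(sample, 4))
```

Each class holds two values, but the last class pairs 8 with 40, which shows how an equal count can group dissimilar values together.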

Natural Breaks: Runs a mathematical algorithm on the data to place similar values together and maximize the differences between classes. How the algorithm determines the class breaks is much more complicated to explain. This option does not put the same values in different classes and should not group drastically different values together. This classification method is popular among cartographers and is the default classification method in ArcMap.
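
To give a feel for what "place similar values together" means, here is a brute-force Python sketch that tries every way of cutting the sorted data into classes and keeps the one with the smallest total squared deviation inside the classes. This is only an illustration of the goal, assuming that measure of similarity; it is not how ArcMap computes its breaks, and it is only practical for a handful of values.

```python
from itertools import combinations

def sum_sq_dev(group):
    """Sum of squared deviations of a group's values from the group mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)

def natural_breaks_classes(values, num_classes):
    """Brute-force search for the grouping with the least within-class spread."""
    ordered = sorted(values)
    n = len(ordered)
    best_classes, best_cost = None, float("inf")
    # Try every way of cutting the sorted list into contiguous groups.
    for cuts in combinations(range(1, n), num_classes - 1):
        bounds = (0,) + cuts + (n,)
        classes = [ordered[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_sq_dev(c) for c in classes)
        if cost < best_cost:
            best_classes, best_cost = classes, cost
    return best_classes

sample = [1, 2, 3, 10, 11, 12, 30, 31]  # hypothetical values in three clusters
print(natural_breaks_classes(sample, 3))
```

For this sample the search settles on the three visible clusters, placing the breaks in the gaps between them rather than at fixed widths or fixed counts.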

Standard Deviation: Takes a bell curve approach to classification, with class breaks at equal intervals of standard deviation above and below the mean. When the data is close to normally distributed, the majority of the data falls in the middle classes around the average value, and the other classes hold fewer and fewer data points the farther they are from the mean. This option requires a basic understanding of statistics, both of bell curve data and of how deviation from the mean defines the classes. This method leaves no gaps in the data range. Like Equal Interval, though, its fixed-width classes can still separate similar values that fall on either side of a break.
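
The sketch below shows one way to assign classes by distance from the mean in Python. It assumes breaks at whole standard deviations and uses the population standard deviation; both are my assumptions for illustration, and ArcGIS offers other interval sizes.

```python
import math
from statistics import mean, pstdev

def std_dev_classes(values):
    """Label each value by the standard-deviation band it falls in.

    Band k covers mean + k*sigma up to (but not including) mean + (k+1)*sigma,
    so band 0 sits just above the mean and band -1 just below it.
    Assumes the values are not all identical (sigma must be non-zero).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [math.floor((v - mu) / sigma) for v in values]

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values: mean 5, std dev 2
print(std_dev_classes(sample))
```

Most of the sample values fall in the bands next to the mean, while the lowest and highest values land in the outer bands.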

The data was symbolized with graduated color so the map can be read intuitively: lighter colors for lower numbers, ranging to darker colors for higher numbers. I implemented cartographic design principles by positioning the page title in the largest font at the top of the page and the individual data frame titles in a smaller font within their frames. The data source, author, and date are all positioned in the smallest font at the bottom of the page.

In my opinion, the presentation best suited to show the distribution is the percent over 65 data in the Natural Breaks classification. The count per square mile data seemed to wash out the pattern: the values were lower, so the map showed lower values overall. The percent over 65 takes into account both the areas that are populated and the share of that population that is over 65. The Natural Breaks method lets the classes be formed from the data itself. Forcing the values into equal ranges with Equal Interval distorts the data at the high end of the values. Quantile forces an equal count of values into each class and then sets the ranges from the data, which does not allow for a natural separation in the data. Standard Deviation assumes a bell curve, but this data set is weighted toward the lower range and has one outlier on the high end that skews the classification.
 

