
In a city with multiple stations, outlying values are filtered out when the hourly median is calculated, because the median discounts extreme values. The data analyst also applied “tagging” to the whole dataset to help identify unusual data patterns and potentially outlying values. The tagged values and the stations they belong to were then visually checked and compared against multiple related measures, such as the PM2.5 of nearby areas (where available), special weather conditions, and the PM2.5 value for the same period in the previous year.

Any values determined to be erroneous through this data analysis process were discounted.
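
The way an hourly median discounts an extreme station reading can be illustrated with a minimal sketch. It is not the analyst's actual pipeline: the function name `hourly_city_median`, the `(hour, station, pm25)` tuple layout, and the sample readings are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import median


def hourly_city_median(readings):
    """Return {hour: median PM2.5 across stations} for one city.

    `readings` is an iterable of (hour, station_id, pm25) tuples.
    This layout is hypothetical, not the dataset's real schema.
    """
    by_hour = defaultdict(list)
    for hour, _station, value in readings:
        if value is not None:  # skip missing readings
            by_hour[hour].append(value)
    return {hour: median(values) for hour, values in sorted(by_hour.items())}


# Made-up readings: station C reports an implausible spike in the first hour.
readings = [
    ("2020-01-01 00:00", "A", 35.0),
    ("2020-01-01 00:00", "B", 38.0),
    ("2020-01-01 00:00", "C", 985.0),
    ("2020-01-01 01:00", "A", 33.0),
    ("2020-01-01 01:00", "B", 36.0),
    ("2020-01-01 01:00", "C", 34.0),
]

result = hourly_city_median(readings)
for hour, value in result.items():
    print(hour, value)
```

In this made-up sample, the spike from station C does not pull the first hour's city value upward the way a mean would, since the median simply takes the middle of the three station readings. With only one or two stations reporting, the median offers little or no protection, which is presumably why the tagging and visual checks described above are still needed.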