Questions about Advan Neighborhood Patterns
Hello,
Our team has a few questions about the Advan Neighborhood Patterns dataset:
First, we were curious about this line in the [Neighborhood Patterns documentation](https://docs.deweydata.io/docs/advan-research-neighborhood-patterns) (under Key Concepts -> Visit Attribution):
‘In addition, if there is only one device in the home and daytime areas it is not reported at all; if there are between 2 and 4 devices, they are reported as 4; and, starting January 2023 in the US, only the 65th percentile of areas are included.’
My interpretation of the above sentence is that, for example, the raw values in any given DEVICE_HOME_AREAS field are filtered so that only values above the (original) 65th percentile are included. This would have the effect of dropping the low-count CBGs from each DEVICE_HOME_AREAS field. Is this correct?
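To make our reading concrete, here is a minimal Python sketch of how we imagine the three rules being applied to a single DEVICE_HOME_AREAS value. Everything in it is an assumption on our side rather than Advan's actual method: the function name `censor_home_areas`, the CBG keys, the nearest-rank percentile definition, the strict "greater than" comparison, and the order in which the rules are applied.

```python
def censor_home_areas(raw_counts, percentile=65):
    """Apply our ASSUMED reading of the documented rules to one row.

    raw_counts: dict mapping a CBG id to its raw (uncensored) device count.
    """
    if not raw_counts:
        return {}

    # Assumption: the cutoff is the nearest-rank percentile of the raw counts.
    ordered = sorted(raw_counts.values())
    rank = (percentile * len(ordered) + 99) // 100  # integer ceiling
    cutoff = ordered[rank - 1]

    reported = {}
    for cbg, count in raw_counts.items():
        if count <= cutoff:
            continue  # assumed: at or below the 65th percentile is dropped
        if count < 2:
            continue  # a single device is not reported at all
        # Counts of 2 to 4 are reported as 4; larger counts are unchanged.
        reported[cbg] = max(count, 4)
    return reported


# Hypothetical raw counts for one row (made-up CBG ids).
example = {
    "cbg_a": 1, "cbg_b": 1, "cbg_c": 1, "cbg_d": 1, "cbg_e": 1,
    "cbg_f": 1, "cbg_g": 1, "cbg_h": 3, "cbg_i": 9,
}
print(censor_home_areas(example))
```

If this reading is right, most low-count CBGs would never appear in the field, which is the behavior we would like to confirm.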
Second, we have a few questions related to the recently posted CBG Normalization notice:
First, does this apply only to the Monthly/Weekly Patterns datasets, or does it also apply to the Neighborhood Patterns data? The notice lists the affected columns, but the ‘DEVICE_HOME_AREAS’ column (per-CBG home device counts) that we use from Neighborhood Patterns is not among them, although I assume the notice probably still applies to Neighborhood Patterns.
Assuming the notice does apply to ‘DEVICE_HOME_AREAS’ in Neighborhood Patterns, it sounds like we are discouraged from using the raw device counts in ‘DEVICE_HOME_AREAS’ directly. Instead, we should treat them as relative shares (i.e., divide each device count in DEVICE_HOME_AREAS by the sum of the device counts in DEVICE_HOME_AREAS) and then multiply by the overall device count, RAW_DEVICE_COUNTS, to get per-CBG estimates. Does that match your understanding of how we should be using DEVICE_HOME_AREAS going forward?
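In case it helps, this is roughly the calculation we have in mind, as a short Python sketch. The function name `rescale_home_areas` and the example numbers are made up for illustration, and we are assuming DEVICE_HOME_AREAS has already been parsed into a dictionary of CBG to device count.

```python
def rescale_home_areas(device_home_areas, raw_device_count):
    """Turn per-CBG counts into shares, then scale by the overall device count.

    device_home_areas: dict of CBG id -> reported device count (assumed parsed).
    raw_device_count: the row's overall device count (RAW_DEVICE_COUNTS).
    """
    total = sum(device_home_areas.values())
    if total == 0:
        return {}  # nothing reported, so there is nothing to rescale

    # share of each CBG = count / sum of counts; estimate = share * overall count
    return {
        cbg: raw_device_count * count / total
        for cbg, count in device_home_areas.items()
    }


# Hypothetical row: reported counts sum to 20, overall device count is 40.
estimates = rescale_home_areas({"cbg_a": 4, "cbg_b": 6, "cbg_c": 10}, 40)
print(estimates)
```

With these made-up numbers the shares are 0.2, 0.3, and 0.5, so the per-CBG estimates always sum back to the overall device count.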
Thanks!