FAQs - Advan Research

How is the data sourced?

The data is sourced from a panel of mobile devices and aggregated.

How can I be confident the geographic coverage will meet my research needs?

Reach out to [email protected] to request a one-month sample of the data. This will allow you to review all POI covered in the dataset.

What is the source of the POI?

SafeGraph provides the source of POI and geometry used to recognize pings.

How are visits attributed and defined?

Advan does not filter on dwell time. If a device is captured within the boundaries of the geofence for any amount of time, it is registered as a visit. This differs from legacy SafeGraph products, which required a minimum dwell time of 4 minutes to register a visit.

A visit is defined as a device being captured within the polygon boundaries of a POI, with no minimum dwell time. Exit and reentry within the same day are not counted as multiple visits: we count only 1 visit per day per device. Using Weekly Patterns as an example, if a device visits a POI at least once every day in a given week, this is reflected as 7 in raw_visit_counts and 1 in raw_visitor_counts.

Consider a person/device that visits Grand Teton National Park during the week of 01/01/2022 - 01/06/2022:

  • If a device visits the park on the 1st, leaves the park that evening, and returns on the 2nd, then raw_visit_counts == 2 and raw_visitor_counts == 1.
  • If a device visits on the 1st, stays in the park (polygon) overnight, and leaves on the 2nd, this counts as two visits, because the device was captured in the polygon on two separate days.
  • If a device visits on the morning of the 1st, leaves the park at noon, and returns at 3pm that day, this counts as one visit.
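The counting rule above can be sketched in a few lines. This is only an illustration of "one visit per device per day", not Advan's actual pipeline: the `count_visits` function, the `(device_id, timestamp)` input shape, and the use of the timestamp's calendar date as the "day" are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def count_visits(pings):
    """Return (visit_count, visitor_count) for a single POI.

    `pings` is an iterable of (device_id, timestamp) pairs that are assumed
    to already fall inside the POI polygon. No dwell-time filter is applied.
    """
    days_by_device = defaultdict(set)
    for device_id, ts in pings:
        # Several pings on the same calendar day collapse into one visit.
        days_by_device[device_id].add(ts.date())

    visit_count = sum(len(days) for days in days_by_device.values())
    visitor_count = len(days_by_device)
    return visit_count, visitor_count


pings = [
    ("device_a", datetime(2022, 1, 1, 9, 0)),   # morning of the 1st
    ("device_a", datetime(2022, 1, 1, 15, 0)),  # same-day reentry at 3pm
    ("device_a", datetime(2022, 1, 2, 10, 0)),  # back again on the 2nd
]
print(count_visits(pings))  # (2, 1)
```

The same-day reentry adds nothing to the visit count, while the ping on the 2nd adds a second visit for the same visitor.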

Why is there a missing day in November?

Daylight saving time ends in November, which causes a missing day in the data. This has been corrected in Weekly Patterns for years after 2018.

What is the coverage of National Parks?

Advan stopped generating data for National Parks after Dec 2022. They are now filtered out because of the size of their geofences.

Why does the raw_visitors value appear to be less than visitor_home_aggregation? Shouldn’t visitor_home_aggregation be the same as or less than the raw count?

The raw_visitors value may be smaller than visitor_home_aggregation because they come from different panels.

  • raw_visitors uses a consistent panel optimized for accurate year-over-year visitation metrics.
  • visitor_home_aggregation uses a larger, more volatile panel designed for broader trade area insights, including data from devices allowed for trade area analysis but not for "home" or "work" calculations.

These panels overlap but aren't identical, so the numbers won’t match. Additionally, data with fewer than four visitors per CBG is excluded.

What are the differences between Weekly and Monthly Patterns?

The best way to think about the two datasets is that the underlying visits and algorithms are the same, but they are aggregated at different timescales and delivered at different frequencies.

Below are some differences between Weekly Patterns and Patterns:

  1. Each delivery of Weekly Patterns covers one week, starting Monday and ending at end of day on Sunday. The data is available three days later, on Wednesday of each week, providing more frequent actionable data. (Note: A very early, now deprecated, version of Weekly Patterns (v1) ran from Sunday to end of day Saturday and was delivered on Tuesdays.)
  2. In Weekly Patterns, we include a visits_by_each_hour column to enable you to get a more detailed view of the week.
  3. Weekly Patterns does not include popularity_by_hour (covered by visits_by_each_hour) and popularity_by_day (covered by visits_by_each_hour).
  4. We update our Places file monthly and start using the new file for our visits generation on the first of each calendar month. This means that if we introduce a new place on the 1st of a month and the Weekly Patterns file straddles two months, you will only see visits for the new place in those days of the week that are in the new month. This edge case only affects a tiny fraction of places each month.
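The weekly window and the hourly column can be illustrated with a short sketch. The helper names `week_window` and `daily_totals` are hypothetical, and the layout of visits_by_each_hour (a parsed list of 7 × 24 = 168 counts beginning at Monday 00:00) is an assumption that should be checked against the delivered files.

```python
from datetime import date, timedelta


def week_window(day):
    """Return (monday, sunday, delivery_day) for the week containing `day`."""
    monday = day - timedelta(days=day.weekday())  # weekday(): Monday == 0
    sunday = monday + timedelta(days=6)
    delivery_day = sunday + timedelta(days=3)     # the following Wednesday
    return monday, sunday, delivery_day


def daily_totals(visits_by_each_hour):
    """Collapse 168 hourly counts into 7 daily totals, Monday first.

    Assumes the list starts at Monday 00:00 and has no gaps.
    """
    if len(visits_by_each_hour) != 7 * 24:
        raise ValueError("expected 168 hourly values")
    return [sum(visits_by_each_hour[d * 24:(d + 1) * 24]) for d in range(7)]


monday, sunday, delivery_day = week_window(date(2023, 3, 1))
print(monday, sunday, delivery_day)

# Days of this week that fall in March, i.e. the only days on which a place
# first introduced on March 1st could show visits.
march_days = [monday + timedelta(days=i) for i in range(7)
              if (monday + timedelta(days=i)).month == 3]
print(len(march_days))  # 5
```

The week containing March 1st, 2023 starts on Monday, February 27th, so it straddles two months; under the Places-file edge case described in item 4, a place added on March 1st would only have visits for the last five days of that week.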

What are the years used for the Census Block Group (CBG) data?

Advan uses 2018 CBGs (which are based on the 2010-2019 boundaries).

What version of NAICS code does Advan use?

SafeGraph uses the 2017 NAICS code version and, since Advan sources its POIs from SafeGraph, Advan does as well.

What is the meaning of a null/blank value for distance_from_home?

Advan discovered a bug where, starting in Nov 2023, distance_from_home is null for all POIs for all periods. It is currently being investigated.

Does Advan do any normalization with their data?

Yes, Advan has normalized attributes like normalized_visits_by_state_scaling, which are already scaled for population and panel.

What happened to closed_on_date in the Advan data? Does it still exist?

If a POI still has foot traffic, or a new business moves into its location, then the POI would likely not have a closed-on date (it is still operating, or the new business has a new placekey, respectively). To identify the closed_on date, one can join to the SafeGraph Places dataset (since Advan uses it as its GIS provider).

For SafeGraph Places, opened_on and closed_on dates are determined from metadata at the source level. If a new POI from an existing source repeatedly appears in SafeGraph's build pipeline, it is flagged as opened_on during the month in which it first appears. Similarly, if a POI from an existing source repeatedly disappears in that build pipeline, it is flagged as closed_on during the month in which it first disappears. opened_on dates are only inferred for POIs with a safegraph_brand_id, whereas closed_on dates are attempted for both branded and non-branded POIs. POIs with an opened_on or closed_on date have been determined to be accurate within a ~60-day margin of error.
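As a rough sketch of the join described above, the snippet below attaches a closed_on value to patterns rows by matching on placekey. The function name `attach_closed_on`, the dict-based row format, and the sample placekeys and dates are all made up for illustration; the real column names and formats should be checked against the delivered files.

```python
def attach_closed_on(patterns_rows, places_rows):
    """Left-join closed_on from Places rows onto patterns rows by placekey.

    Patterns rows with no matching Places row, or whose Places row has no
    closed_on value, get closed_on = None.
    """
    closed_on_by_placekey = {
        row["placekey"]: row.get("closed_on") for row in places_rows
    }
    return [
        {**row, "closed_on": closed_on_by_placekey.get(row["placekey"])}
        for row in patterns_rows
    ]


# Hypothetical sample rows.
patterns = [
    {"placekey": "pk-1", "raw_visit_counts": 40},
    {"placekey": "pk-2", "raw_visit_counts": 12},
]
places = [
    {"placekey": "pk-1", "closed_on": None},
    {"placekey": "pk-2", "closed_on": "2023-05"},
]

for row in attach_closed_on(patterns, places):
    print(row["placekey"], row["closed_on"])
```

Given the ~60-day margin of error noted above, a closed_on value obtained this way is probably best treated as approximate.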

Advan Monthly Patterns foot traffic has NA values for raw_visit_counts for a few months for particular placekeys. In such cases, do we replace NA with 0 visit counts for that placekey or does it mean that the information is not collected for that placekey in that month?

Advan foot traffic datasets are pre-appended to the SafeGraph Places data regardless of whether or not Advan is monitoring visits for those POI, so a number of POI in that dataset will have no visits data. This could be for various reasons (e.g., not a POI type they calculate visits for, not enough visit data, a “point” POI with no geometry, etc.). An NA therefore generally reflects missing visits data rather than a measured count of zero.
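One cautious way to handle these gaps is to keep missing months separate from measured ones instead of filling them with 0. The sketch below is an assumption about analysis practice, not guidance from Advan; the `summarize_visits` helper and the sample values are hypothetical.

```python
def summarize_visits(monthly_counts):
    """Summarize a {month: raw_visit_counts or None} mapping.

    Months with None are reported as missing and left out of the mean,
    rather than being treated as zero visits.
    """
    observed = {m: v for m, v in monthly_counts.items() if v is not None}
    missing = sorted(m for m, v in monthly_counts.items() if v is None)
    mean = sum(observed.values()) / len(observed) if observed else None
    return {"mean_observed": mean, "missing_months": missing}


# Hypothetical series for one placekey.
monthly = {"2023-01": 120, "2023-02": None, "2023-03": 90}
print(summarize_visits(monthly))
# {'mean_observed': 105.0, 'missing_months': ['2023-02']}
```

Filling the gap with 0 would instead pull the mean down to 70, which overstates the drop if the month was simply not measured.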

Where is the afternoon_tea_device_home_areas attribute in Neighborhood Patterns?

This attribute is hidden in Dewey. The structure of the attributes in Neighborhood Patterns, and the amount of data included in each one, makes the overall “width” of certain rows too large, surpassing the maximum limit of our platform.

We chose this as the one column to hide since it was the least used attribute and it brought the overall amount of data per row below the maximum.

Why does the number of POIs surge from June 2023 to July 2023?

The number of POIs in the SafeGraph Places dataset (which Advan uses for their POIs) changes month over month and is constantly being added to. During this month, SafeGraph added some 3 million new POIs, which caused the surge.

Can counts, like home location, be summed across CBGs within a county?

Home/daytime assignment is unique per device per month, so CBG-level numbers can be added together to get county-level panels.
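A minimal sketch of that roll-up is shown below. It relies on the standard Census GEOID layout, in which a 12-digit CBG code begins with the 2-digit state and 3-digit county FIPS codes. The `county_totals` helper and the sample counts are hypothetical, and the tract portions of the sample codes are made up.

```python
from collections import Counter


def county_totals(cbg_counts):
    """Sum per-CBG device counts into per-county totals.

    Keys are 12-digit CBG GEOIDs; the first 5 digits identify the county.
    """
    totals = Counter()
    for cbg, count in cbg_counts.items():
        # Restore a leading zero that is lost if the code was read as an int.
        cbg = str(cbg).zfill(12)
        totals[cbg[:5]] += count
    return dict(totals)


# Hypothetical counts for one month.
cbg_counts = {
    "560399676001": 12,
    "560399676002": 7,
    "560399677001": 5,
    60371234001: 4,  # read as an integer, so the leading zero was dropped
}
print(county_totals(cbg_counts))  # {'56039': 24, '06037': 4}
```

Because the assignment is unique per device per month, this addition is only safe within a single month; summing the same county across months would count devices more than once.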

Why is the first week of 2019 missing from the Weekly Patterns dataset?

Each week in the dataset starts on a Monday, and the first “full” week of 2019 starts on Monday, January 7th. For consistency, they decided to have the first week of that year start on a Monday.
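The date arithmetic behind this answer can be checked with a few lines. The `first_monday` helper is hypothetical, and whether other years of the dataset follow the same "first full week" convention is an assumption to verify against the files.

```python
from datetime import date, timedelta


def first_monday(year):
    """Return the date of the first Monday in `year`."""
    jan1 = date(year, 1, 1)
    # weekday(): Monday == 0, so this is the number of days until Monday.
    return jan1 + timedelta(days=(7 - jan1.weekday()) % 7)


print(first_monday(2019))  # 2019-01-07
```

January 1st, 2019 fell on a Tuesday, so the days from the 1st through the 6th precede the first Monday-based week of that year.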