Missing Data Questions and Unusual Data Patterns
3 days ago by Camilla Schneier
Hi! I've observed some odd patterns in the Veraset data and I would like some help with the following questions.
I constructed a measure of time at home per device from January 2019 to May 2025. For 2019-2022 I used Veraset Visits only; from 2023 onward I also add Home Visits, because the data is split into separate datasets.
- The documentation says that "Starting in 2022, home and work visits were split into separate datasets due to privacy requirements." I believe the split actually starts in 2023, since I can only download Veraset Home Visits starting in 2023.
- I find some unusual patterns in the data starting in 2022. So far I have focused on Cook County, IL:
  - The number of devices declines throughout 2019, 2020, and 2021, spikes on a few dates in late 2021 and early 2022, and then remains low starting in 2022.
  - The total time at home observed across all devices follows a logical pattern between January 2019 and January 2022 but then dips sharply in 2022. This seems to be driven by devices that are observed at home for only a short period.
  - There are outliers in the time-spent-at-home data. For example, on Saturday, December 4, 2021, the average time at home before 4am drops, and this drop is driven by outlier devices that spend less than 30 minutes in total at home.
  - A few days have a large spike in the number of devices observed (I believe April 22 or April 23, 2022 are examples).
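To make the patterns above easier to reproduce, here is a minimal sketch of the kind of per-day diagnostics involved: device counts, total time at home, the share of devices with very little time at home, and a simple spike flag. The record layout `(device_id, date, minutes_at_home)`, the function name `daily_diagnostics`, the 30-minute cutoff as a parameter, and the "more than 2x the median daily count" spike rule are all assumptions for illustration. They are not Veraset's schema or a recommended method, and the sample records are made up.

```python
from collections import defaultdict
from statistics import median


def daily_diagnostics(records, short_minutes=30, spike_factor=2.0):
    """Summarize hypothetical (device_id, date, minutes_at_home) records by day."""
    # Sum minutes per device per day, since a device can have several home visits.
    minutes = defaultdict(float)
    for device_id, day, mins in records:
        minutes[(day, device_id)] += mins

    # Collect the per-device daily totals under each date.
    per_day = defaultdict(list)
    for (day, _device), total in minutes.items():
        per_day[day].append(total)

    # Use the median daily device count as a rough baseline for spikes.
    typical = median(len(totals) for totals in per_day.values())

    summary = {}
    for day in sorted(per_day):  # ISO date strings sort chronologically
        totals = per_day[day]
        summary[day] = {
            "devices": len(totals),
            "total_minutes": sum(totals),
            # Share of devices seen at home for less than `short_minutes` that day.
            "short_share": sum(t < short_minutes for t in totals) / len(totals),
            # Assumed rule: flag days with far more devices than the median day.
            "spike": len(totals) > spike_factor * typical,
        }
    return summary


# Made-up sample records, for illustration only.
records = [
    ("a", "2022-04-21", 600), ("b", "2022-04-21", 20),
    ("a", "2022-04-22", 500), ("b", "2022-04-22", 10), ("c", "2022-04-22", 5),
    ("d", "2022-04-22", 15), ("e", "2022-04-22", 700),
    ("a", "2022-04-23", 300), ("a", "2022-04-23", 250), ("b", "2022-04-23", 25),
]
result = daily_diagnostics(records)
for day, stats in result.items():
    print(day, stats)
```

Comparing `short_share` with `devices` by day would show whether a dip in total time at home coincides with an influx of devices that are barely observed at home.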
Can you please help with this?