Discussions

Possible measurement error in number of transactions in 2022 for Starbucks?

Is there any documented measurement error in Starbucks's transaction counts for 2022? I noticed that 99.5% of all Starbucks stores in the data set had fewer than 3,000 monthly transactions. However, some stores, like the one in the attached exhibit, had 10 times that number of transactions in a single month in 2022, which is surprising. I have a list of several other such Starbucks stores.

Admin

Hi Partha,

Thanks for the question. I am unable to see the attachment, but it is possible that there are outliers in the data. I've attached a notebook detailing how to address biases and fluctuations in total spend. It may be beneficial to normalize using panel totals for a region, since the raw spend will likely fluctuate with the panel. You can access these on Dewey here.

Does this help answer your question?

Thanks!
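
The normalization suggested above can be sketched roughly as follows. This is not the attached notebook, only a minimal illustration: the function name, the dictionary layout, and the numbers are hypothetical and do not come from the dataset.

```python
def normalize_by_panel(store_counts, panel_totals):
    """Divide each month's raw store count by that month's regional panel total.

    Both arguments are hypothetical dicts keyed by month string,
    e.g. {"2022-01": 1200}. Months with a missing or zero panel
    total are skipped rather than guessed at.
    """
    normalized = {}
    for month, count in store_counts.items():
        total = panel_totals.get(month)
        if not total:
            continue
        normalized[month] = count / total
    return normalized


# Illustrative numbers only: the raw count rises 25%, but so does the panel,
# so the normalized share stays flat.
store_counts = {"2022-01": 1200, "2022-02": 1500}
panel_totals = {"2022-01": 100000, "2022-02": 125000}
shares = normalize_by_panel(store_counts, panel_totals)
```

Comparing shares rather than raw counts helps separate real changes at a store from changes in the size of the panel.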

Thanks for the information about the bias in the data set overall. However, I was more interested in something related to a particular brand: Starbucks. I'm attaching a picture depicting this which I found for several Starbucks stores I checked:


Regarding the possibility of outliers, I want to confirm whether these outliers are the result of faulty measurement and can simply be ignored, or whether you believe that a change in the number of sources could have caused them.

To follow up on that thought, could you please tell me how the spending data is collected each month? I read SafeGraph's FAQ, which says it does not come from the mobile phone user panel. Information about how transactions are assigned to a particular POI would be very useful in interpreting findings from this data.

Admin

On the second part of your question, you can read more about how a transaction is attributed to a store in the SafeGraph docs: https://docs.safegraph.com/docs/spend. The panel used to create this dataset consists of credit card transactions, not mobile devices, so a transaction can only be attributed to a store if there is enough metadata in the transaction information to locate the specific store where it occurred.

Because it is a panel of card transactions, the dataset is not meant to cover every transaction for a particular store. The outlier example has been flagged to the SafeGraph team, but for now, it is likely best to ignore the data point if there is a 10x spike in a given month.

Admin

Heard back from the partner: "This was an error from our supplier where they over-attributed transactions to this particular store"
They recommend dropping those outliers.
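
One way to drop such outliers is sketched below. The partner did not specify a rule, so the median baseline and the default factor of 10 (taken from the "10x spike" mentioned earlier in the thread) are assumptions, and the function name and sample numbers are hypothetical.

```python
from statistics import median


def drop_spikes(monthly_counts, factor=10.0):
    """Return a copy of one store's monthly counts without spike months.

    A month is treated as a spike when its count is at least `factor`
    times the store's own median monthly count. Both the median baseline
    and the factor are assumptions, not a documented SafeGraph rule.
    """
    baseline = median(monthly_counts.values())
    if baseline <= 0:
        # No usable baseline, so leave the data unchanged.
        return dict(monthly_counts)
    return {m: c for m, c in monthly_counts.items() if c < factor * baseline}


# Illustrative numbers only, not taken from the dataset.
counts = {"2022-01": 2100, "2022-02": 2300, "2022-03": 25000, "2022-04": 2200}
cleaned = drop_spikes(counts)
```

The median is used as the baseline because, unlike the mean, it is barely moved by the spike itself.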

Marked as answered by Evan Barry
