Spend Patterns
Overview
SafeGraph Spend Patterns data aggregates anonymized credit and debit transactions at specific points of interest (POIs) over the course of a month. Attributes include aggregated transaction volume and amounts, the transaction intermediary (Apple Pay, DoorDash, etc.), and anonymized customer details. Spend data is available for both online and offline transactions.
| Data Information | Value |
|---|---|
| Refresh Cadence | Monthly |
| Historical Coverage | 2019-Present |
| Geographic Coverage | United States |
Key Concepts
Panel
Details on the panel used to aggregate Spend Patterns anonymized consumer transaction data can be accessed as a related file on Dewey: Spend Patterns - Entire US Panel Summary.
Brand Info
The brand_info dataset can be accessed as a related file on Dewey: Brand Info (Places, Patterns, Geometry, Spend).
A SafeGraph brand is defined as a store which has multiple locations all under the same logo or store banner.
Date Granularity
- The underlying transaction data being aggregated are resolvable only at the daily level. Columns such as date_range_start and date_range_end are provided down to the hour to facilitate consistent joins with other SafeGraph datasets; they do not reflect the actual granularity of the transaction timing.
- Whenever possible, the transaction dates used in spend_by_day, spend_per_transaction_by_day, and spend_by_day_of_week reflect the date of the actual transaction. For some transactions, however, the reported date is the date the financial institution processed the transaction, which is typically the next business day.
  - This means that Saturday and Sunday spend will appear lower in the data and Monday spend will appear higher (i.e., Sat/Sun spend is attributed to Mon), but this affects only these three columns.
  - Debit (a.k.a. bank) card transactions are more likely than credit card transactions to have this bias, so weekend numbers are more likely to reflect credit card transactions.
- A column called day_counts gives a count of how many times each day of the week occurred in the given month (e.g., there were 4 Tuesdays in the month). You can use this column to determine whether an increase in spend in a given month reflects a real phenomenon or simply the fact that the month contained more Mondays.
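The day_counts adjustment described above can be sketched as follows: each weekday's spend is divided by the number of times that weekday occurred in the month, so months with five Mondays become comparable to months with four. This is a minimal sketch that assumes `spend_by_day_of_week` and `day_counts` arrive as JSON objects keyed by weekday name. That encoding, the helper name `spend_per_weekday_occurrence`, and the sample figures are illustrative assumptions, not taken from the schema.

```python
import json


def spend_per_weekday_occurrence(spend_by_day_of_week: str, day_counts: str) -> dict:
    """Return average spend per occurrence of each weekday in the month."""
    spend = json.loads(spend_by_day_of_week)
    counts = json.loads(day_counts)
    return {
        day: spend[day] / counts[day]
        for day in spend
        if counts.get(day)  # skip weekdays that are missing or have a zero count
    }


# Hypothetical row: the month contained 5 Mondays and 4 Tuesdays.
row = {
    "spend_by_day_of_week": '{"Monday": 500.0, "Tuesday": 320.0}',
    "day_counts": '{"Monday": 5, "Tuesday": 4}',
}

normalized = spend_per_weekday_occurrence(
    row["spend_by_day_of_week"], row["day_counts"]
)
print(normalized)
```

Comparing the normalized values month over month, rather than the raw weekday totals, removes the effect of calendar composition. It does not remove the processing-date bias that shifts some weekend spend to Monday.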
Online vs In-Person Transactions
- Prior to being aggregated to POIs, individual transactions are classified by origin as online or in-person based on a proprietary model leveraging information about the transaction, the merchant, the customer, and other factors.
- This allows SafeGraph to understand what proportion of transactions attributed to a POI (and their corresponding spend) were made physically versus online.
- Certain POIs lend themselves more to online versus in-person transactions. For example, self-storage POIs are more likely to have online transactions where payment is not made at the physical location. On the other end of the spectrum, gas station POIs are more likely to have in-person transactions where payment is made at the physical location.
- Note that online transactions that cannot be tied to an individual physical location will not be included in columns such as raw_total_spend, raw_num_transactions, etc. For example, purchases made online and shipped directly to a residence may not reference a specific store because they might be filled from a warehouse or distribution center, whereas a "buy online, pick up in store" purchase presents a connection to a physical store.
  - There is one exception to this general rule: transactions that cannot be tied to a physical location (whether online or offline) are included in cross-shopping columns (e.g., related_cross_shopping_physical_brands_pct, related_cross_shopping_online_merchants_pct, etc.).
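To illustrate the online versus in-person split, the sketch below computes the share of a POI's attributed spend that was made online. It assumes a per-POI column holding the online portion of spend, here called `online_spend`. That column name, the helper `online_share`, and the figures are assumptions for illustration; only `raw_total_spend` is named in the text above.

```python
from typing import Optional


def online_share(online_spend: float, raw_total_spend: float) -> Optional[float]:
    """Fraction of a POI's attributed spend that was classified as online."""
    if raw_total_spend <= 0:
        return None  # no attributed spend, so the share is undefined
    return online_spend / raw_total_spend


# Hypothetical monthly figures for two POIs.
pois = {
    "self_storage": {"online_spend": 900.0, "raw_total_spend": 1000.0},
    "gas_station": {"online_spend": 50.0, "raw_total_spend": 1000.0},
}

shares = {
    name: online_share(values["online_spend"], values["raw_total_spend"])
    for name, values in pois.items()
}
print(shares)
```

Because online transactions that cannot be tied to a physical location are excluded from the totals, this share describes only spend attributed to the POI, not a merchant's overall online business.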
Transaction Intermediaries
- Transaction intermediaries can be apps that facilitate the transaction between the POI and the customer (e.g., DoorDash for restaurant POIs).
- They can also be payment processors through which the transaction takes place (e.g., Apple Pay).
- A transaction can also have multiple intermediaries. Paying for a DoorDash order through Apple Pay means the transaction is counted once under Apple Pay and once under DoorDash.
- There is also some nuance with specific values which show up in this column:
  - No Intermediary does not mean that the transaction was made in cash. It means that no intermediary metadata was available and/or that it was a direct bank or credit card charge.
  - Similarly, Visa as an intermediary does not mean the customer used a Visa card. Visa has a shared checkout option similar to PayPal, and that is what the "Visa" intermediary refers to.
  - Square mostly means the store has a Square POS system, but some Square intermediaries are not necessarily POS (e.g., Square Online), so "Square" covers that payment processing method as well.
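One consequence of the counting rule above is that per-intermediary counts can sum to more than the number of transactions. The sketch below tallies intermediaries over a handful of hypothetical transactions. The record shape (one list of intermediary names per transaction) and the helper name `count_intermediaries` are assumptions for illustration, not the underlying data format.

```python
from collections import Counter


def count_intermediaries(transactions: list) -> Counter:
    """Count each intermediary once per transaction it appears in."""
    counts = Counter()
    for intermediaries in transactions:
        if not intermediaries:
            # No metadata available, or a direct bank/credit card charge.
            counts["No Intermediary"] += 1
        else:
            counts.update(set(intermediaries))
    return counts


# Hypothetical transactions at a restaurant POI.
transactions = [
    ["DoorDash", "Apple Pay"],  # DoorDash order paid through Apple Pay
    ["Apple Pay"],
    [],
]

counts = count_intermediaries(transactions)
print(counts)
print(sum(counts.values()), "intermediary counts across", len(transactions), "transactions")
```

Here three transactions yield four intermediary counts, so intermediary counts should not be treated as shares that add up to the transaction total.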
Privacy
To preserve privacy, SafeGraph applies differential privacy techniques to the following columns: bucketed_customer_income and customer_home_city.
SafeGraph adds Laplacian noise to the values in these columns. After the noise is added, an attribute (e.g., a city) is reported only if at least 2 customers are observed for it.
SafeGraph takes the added precaution of ensuring that no city can appear in customer_home_city if fewer than 4 panelists have that home city assigned. This is to prevent re-identifying panelists who come from rare or unique cities.
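The noise-then-threshold idea can be sketched as below. This is an illustration of the general technique, not SafeGraph's implementation: the noise scale, the helper names, and the sample counts are all assumptions, and the separate 4-panelist rule for customer_home_city is not modeled.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_counts(counts: dict, scale: float = 1.0,
                     min_customers: float = 2, seed=None) -> dict:
    """Add Laplace noise to each count, then keep only groups at or above the threshold."""
    rng = random.Random(seed)
    noisy = {group: value + laplace_noise(scale, rng) for group, value in counts.items()}
    return {group: value for group, value in noisy.items() if value >= min_customers}


# Hypothetical customer counts by home city for one POI.
home_city_counts = {"Springfield": 40, "Shelbyville": 12, "Ogdenville": 1}

print(privatize_counts(home_city_counts, scale=1.0, seed=0))
```

Applying the threshold after the noise, as the text describes, means a group near the cutoff may appear in one release and not in another, so small values in these columns are best read as approximate.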