Discussions
Similarweb website traffic -- reduced coverage?
Hi, I am a new Dewey user looking to use Similarweb's website traffic data through Dewey. However, after downloading the dataset, the coverage seems smaller than anticipated. I have two questions: (1) Although the time coverage on the main page is Oct 17, 2022 to Jul 01, 2025, the support doc says historical data goes back to 2019. Is there a way to retrieve data prior to Oct 2022? (2) More importantly, I see about 8,000 unique websites in the downloaded data. I had thought that Similarweb's coverage of websites was much broader. Is there a specific way to download the data (or an API call) that returns the broader set of websites?
Veraset data until present but ends in May 2025?
Hello, I'm trying to use the bulk API to download Veraset data. The documentation says the data runs through the present; however, when querying the metadata I get:
The Latitude and Longitude "NULL" value in Assessor History
Hi, why are the values of the Latitude and Longitude "NULL" in the Assessor History dataset?
"exclude" filter doesn't seem to include rows with missing values
To keep my download sizes under 5% (when downloading data from the Attom Tax Assessor dataset), I was previously using the filter for "MSANAME". However, when I use the exclude filter, it drops the MSAs I've listed (desired) but also drops rows that are missing an MSANAME value (not desired, because I'm trying to get a complete universe of data). This seems like either a bug or an undocumented quirk, and it could lead to incomplete datasets being downloaded. It seems intuitive to me that N/A values should be kept when performing an 'exclude X value' filter, or that including/excluding N/A values should be available as an additional filter. Is this something that can be addressed?
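One possible explanation, offered as a guess rather than a confirmed description of how Dewey implements the filter: if the exclude filter is translated into a SQL-style `NOT IN` condition, rows with a NULL MSANAME are dropped because a comparison against NULL is never true. The sketch below reproduces that behavior with Python's built-in sqlite3 module. The table name, row values, and MSA names are made up for illustration; only the MSANAME column comes from the dataset.

```python
import sqlite3

# Hypothetical miniature table; only the MSANAME column name is from the real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assessor (attom_id INTEGER, msaname TEXT)")
conn.executemany(
    "INSERT INTO assessor VALUES (?, ?)",
    [
        (1, "Provo-Orem, UT"),
        (2, "Reno, NV"),
        (3, None),              # row with a missing MSANAME
        (4, "Boise City, ID"),
    ],
)

excluded = ["Provo-Orem, UT", "Reno, NV"]
placeholders = ", ".join("?" for _ in excluded)

# Plain exclusion: NULL NOT IN (...) evaluates to NULL, so row 3 is filtered out.
naive_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT attom_id FROM assessor "
        f"WHERE msaname NOT IN ({placeholders}) ORDER BY attom_id",
        excluded,
    )
]

# NULL-safe exclusion: keep missing values explicitly.
null_safe_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT attom_id FROM assessor "
        f"WHERE msaname IS NULL OR msaname NOT IN ({placeholders}) "
        f"ORDER BY attom_id",
        excluded,
    )
]

print("naive exclude keeps:", naive_ids)
print("NULL-safe exclude keeps:", null_safe_ids)
```

If this is what is happening, a workaround would be to download the rows with a missing MSANAME as a separate batch and append them to the excluded-MSA download, provided the platform offers a way to filter on missing values.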
missing variables and inconsistent column names in Attom Tax Assessor downloads
I've been downloading 5% increments of the Attom Tax Assessor data over the span of several days, but noticed that the column names have inconsistencies between the batches I download. For instance, the Attom ID column is labeled "_ATTOM_ID_" for my Utah sample, but labeled "X_ATTOM_ID_" for my Nevada sample, and labeled "ATTOMID" for my California sample. I also noticed that the property latitude and longitude seemed to be missing from a sample I downloaded covering AZ, CO, ID, and WY (even though I've been selecting all columns to be downloaded).
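Until the cause is known, one way to combine the batches is to map every header variant to a single canonical key before merging. The sketch below is a minimal example using only the three variants quoted above; `canonical_column` and `rename_columns` are hypothetical helper names, not part of any Dewey or Attom tooling. The "X" prefix may come from a reader that rewrites names starting with an underscore (R's `make.names` does this), but that is an assumption.

```python
import re

def canonical_column(name: str) -> str:
    """Collapse header variants such as _ATTOM_ID_, X_ATTOM_ID_, and ATTOMID to one key."""
    # Drop a leading "X" only when it is directly followed by an underscore.
    # Caveat: a genuine column that starts with "X_" would also lose its prefix.
    stripped = re.sub(r"^X(?=_)", "", name.strip())
    # Remove underscores and any other non-alphanumeric characters, then upper-case.
    return re.sub(r"[^A-Za-z0-9]", "", stripped).upper()

def rename_columns(header):
    """Return the header with every column name replaced by its canonical key."""
    return [canonical_column(col) for col in header]

variants = ["_ATTOM_ID_", "X_ATTOM_ID_", "ATTOMID"]
print({v: canonical_column(v) for v in variants})
```

Comparing the canonical headers across batches with a set difference would also show which columns, such as the property latitude and longitude, are missing from a given download.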
bucketed_dwell_times for weekly pattern plus files
I noticed that the documentation for the weekly-pattern-plus files (https://docs.deweydata.io/docs/weekly-patterns-plus#additional-column-information) lists 61–120 and 121–240 as separate bucketed dwell time categories, but the file has only one combined category (61-240): {"<5":0, "5-20":0, "21-60":0, "61-240":0, ">240":1018}. Am I missing something? Can someone help me figure this out?
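To check whether this affects every row or only some, the bucket labels in the file can be compared against the documented ones. A minimal sketch follows, assuming the column value is a JSON object once the escaped quotes are cleaned up. The `documented` set is reconstructed from my reading of the docs page (the two split buckets plus the labels seen in the file), so treat it as an assumption.

```python
import json

# Example value from the file, with the escaped quotes normalized.
raw = '{"<5":0, "5-20":0, "21-60":0, "61-240":0, ">240":1018}'
observed = json.loads(raw)

# Assumed documented labels: 61-120 and 121-240 are listed as separate buckets.
documented = {"<5", "5-20", "21-60", "61-120", "121-240", ">240"}

observed_labels = set(observed)
missing = documented - observed_labels      # documented but absent from the file
unexpected = observed_labels - documented   # present in the file but not documented

print("missing from file:", sorted(missing))
print("not in documentation:", sorted(unexpected))
```

Running the same comparison over every row of bucketed_dwell_times would show whether the combined 61-240 bucket is used throughout or only in certain weeks.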