Using DuckDB to download Advan Weekly Patterns

Hello, I was following the instructions for using DuckDB to download the foot traffic / Weekly Patterns data from Advan Research. I was able to get the file list and define the date partition range. However, I keep getting the error below when I generate a data preview or query the relevant data for download:


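import duckdb

con = duckdb.connect()

# Enable remote file access over HTTP(S) and tune the session
con.execute("INSTALL httpfs;")
con.execute("LOAD httpfs;")
con.execute("SET enable_http_metadata_cache=true;")
con.execute("SET enable_object_cache=true;")
con.execute("PRAGMA threads = 8;")
con.execute("SET http_timeout = 120000;")

# urls: list of download links from the file list step (defined earlier)
# Preview
preview_df = con.execute("""
SELECT *
FROM read_parquet($urls, hive_partitioning=1)
WHERE nacis_code = '722511'  -- column name as typed; possibly meant naics_code
LIMIT 5
""", {"urls": urls}).df()
preview_df.head()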

InvalidInputException Traceback (most recent call last)

Cell In[35], line 10
7 con.execute("SET http_timeout = 120000;")
9 # Preview
---> 10 preview_df = con.execute("""
11 SELECT *,
12 FROM read_parquet($urls, hive_partitioning=1)
13 where nacis_code = '7225'
14 LIMIT 5
15 """, {"urls": urls}).df()
16 preview_df.head()

InvalidInputException: Invalid Input Error: No magic bytes found at end of file 'https://api.deweydata.io/api/v1/downloads/REMOVED PART OF THE LINK'

LINE 3: FROM read_parquet($urls, hive_partitioning=1)
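
For context, the "magic bytes" in this message are the 4-byte marker `PAR1` that a Parquet file carries at both its start and its end. If a link returns something else (for example a gzip-compressed CSV, or an HTML/JSON error page), `read_parquet` cannot find that marker. Below is a minimal sketch of the same check on a local copy of one file. The helper name `has_parquet_magic` and the sample file names are hypothetical, and it assumes the file has already been saved to disk; it only inspects the first and last bytes and does not validate the rest of the file.

```python
import os
import tempfile

# Marker found at the start and end of a Parquet file with a plaintext footer
PARQUET_MAGIC = b"PAR1"


def has_parquet_magic(path):
    """Return True if the file at `path` starts and ends with b"PAR1"."""
    # A file shorter than two markers cannot be a valid Parquet file
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = f.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


if __name__ == "__main__":
    # Hypothetical demo files: one shaped like Parquet, one starting with gzip bytes
    with tempfile.TemporaryDirectory() as tmp:
        parquet_like = os.path.join(tmp, "sample.parquet")
        gzip_like = os.path.join(tmp, "sample.csv.gz")
        with open(parquet_like, "wb") as f:
            f.write(PARQUET_MAGIC + b"payload" + PARQUET_MAGIC)
        with open(gzip_like, "wb") as f:
            f.write(b"\x1f\x8b" + b"compressed csv bytes")
        print(parquet_like, has_parquet_magic(parquet_like))
        print(gzip_like, has_parquet_magic(gzip_like))
```

If a downloaded file fails this check, the links are probably not pointing at Parquet files, and a different reader would be needed for whatever format they actually are.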

Could you please point out which part of my code is wrong? Thank you very much.