Company Insights
Overview
A complete, worldwide company database of over 30 million businesses. It includes basic company information such as name, location, category, and industry, along with aggregated trends such as employee count and average tenure.
| Data Information | Value |
|---|---|
| Refresh Cadence | Monthly |
| Historical Coverage | 2010 - Present |
| Geographic Coverage | Global |
Key Concepts
Reading in PySpark
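People Data Labs Company data has a slightly tricky encoding: some fields span multiple lines, and quotes inside quoted fields are not backslash-escaped, which is what PySpark's CSV reader expects by default. PySpark misparses the files unless it is told about both (Pandas handles them fine).
You can read the data with the following options: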
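df = (
    spark.read.option("header", "true")
    # Let a single record span several lines.
    .option("multiLine", "true")
    # Treat a double quote as the escape character for quotes inside quoted fields.
    .option("escape", '"')
    .option("quote", '"')
    # path_to_company_data is a placeholder for the location of the company data files.
    .csv(path_to_company_data)
)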
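To see what these options compensate for, here is a minimal sketch using Python's standard `csv` module on a made-up sample. It assumes that quotes inside quoted fields are escaped by doubling them, which is what setting `escape` to `"` implies. The sample text and the names `SAMPLE` and `parse_rows` are illustrative only and are not part of the People Data Labs data or any API.

```python
import csv
import io

# Hypothetical sample, not real People Data Labs data. The last record has a
# field that spans two lines and contains quotes escaped by doubling ("").
SAMPLE = (
    'name,summary\n'
    '"Acme Corp","Makes anvils"\n'
    '"Globex","Known as ""the"" supplier\nof widgets"\n'
)


def parse_rows(text):
    """Parse CSV text whose quoted fields may contain newlines and doubled quotes."""
    reader = csv.reader(io.StringIO(text), quotechar='"', doublequote=True)
    return list(reader)


rows = parse_rows(SAMPLE)

# A naive line-based split counts physical lines, not records.
naive_line_count = len(SAMPLE.splitlines())

print(len(rows), "records (header included) vs", naive_line_count, "physical lines")
print(repr(rows[2][1]))
```

A line-based split sees four lines, while a quote-aware parser sees three records. A reader that works line by line, or that expects a different escape character, will therefore produce shifted or broken rows on data shaped like this; `multiLine` and `escape` configure Spark's CSV reader for the quote-aware behaviour.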