Company Insights

Overview

A worldwide company database covering over 30 million businesses. It includes basic company information such as name, location, category, and industry, along with aggregated trends such as employee count and average tenure.

Data Information      Value
Refresh Cadence       Monthly
Historical Coverage   2010 - Present
Geographic Coverage   Global

Key Concepts

Reading in PySpark

People Data Labs Company data uses a slightly tricky CSV encoding: fields can span multiple lines, and quoted fields contain unescaped quotes. PySpark's default CSV reader trips over both (pandas handles the files fine).

You can read the data with the following options:

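# Assumes an active SparkSession named `spark`; path_to_company_data is a placeholder.
df = (
    spark.read.option("header", "true")
    .option("multiLine", "true")  # fields may span multiple lines
    .option("escape", '"')  # use " as the escape character instead of the default backslash
    .option("quote", '"')
    .csv(path_to_company_data)
)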
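
To get a feel for the encoding those options are working around, here is a minimal sketch using Python's standard csv module. The sample rows and column names are made up for illustration and are not taken from the dataset. It also assumes that embedded quotes are written as doubled quotes, which is what setting the escape character to a quote suggests, though the source does not state this explicitly.

```python
import csv
import io

# Hypothetical sample: the second record has a field that spans two lines
# and contains doubled quotes around the word "rocket".
SAMPLE = (
    'name,summary\n'
    '"Acme, Inc.","Maker of ""rocket"" skates\nand other gear"\n'
    '"Globex","Energy"\n'
)


def parse_rows(text):
    """Parse CSV text where embedded quotes are doubled and fields may span lines."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        quotechar='"',
        doublequote=True,  # treat "" inside a quoted field as one literal quote
    )
    return list(reader)


rows = parse_rows(SAMPLE)
print(len(rows))   # header plus two records
print(rows[1][1])  # the multi-line field with its embedded quotes restored
```

If a parser treated every newline as a record boundary, or expected backslash-escaped quotes, the second record would be split or mangled. That is the failure mode the multiLine and escape options are meant to avoid.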