Company Insights
Overview
A worldwide company database covering more than 30 million businesses. It includes basic company information such as name, location, category, and industry, along with aggregated trends such as employee count and average tenure.
| Data Information | Value |
|---|---|
| Refresh Cadence | Monthly |
| Historical Coverage | 2010 - Present |
| Geographic Coverage | Global |
Schema
Under People and Resume Data, PDL.COMPANY.COMPANY is the primary table. You can join additional tables to it using the COMPANY_ID identifier. The schema below describes the PDL.COMPANY.COMPANY table. For more information on accessing datasets via API, review our docs page.
| Name | Description |
|---|---|
| LATEST_FUNDING_STAGE | The stage of the company’s most recent funding event |
| MIC_EXCHANGE | The MIC code for the company's ticker exchange |
| ULTIMATE_PARENT | The ID of the ultimate owning company |
| DATASET_VERSION | Version of the dataset |
| SIZE | Number of employees at the company (range) |
| LINKEDIN_ID | Main LinkedIn profile ID for the company |
| FOUNDED | The founding year of the company |
| TICKER | The company ticker for public companies |
| INDUSTRY | The self-reported industry of the company |
| COUNT_LINKEDIN_FOLLOWER | Count of LinkedIn followers |
| LINKEDIN_SLUG | LinkedIn slug for the company |
| AVERAGE_EMPLOYEE_TENURE | Average years of employee tenure |
| COUNT_FUNDING_ROUNDS | Number of funding rounds announced |
| GICS_SECTOR | GICS sector classification for public companies |
| COMPANY_ID | Unique identifier for the company |
| ULTIMATE_PARENT_TICKER | Ultimate parent's stock symbol (if public) |
| LINKEDIN_URL | Main LinkedIn profile URL for the company |
| NAME | Company's main common name |
| DISPLAY_NAME | Company's displayed name |
| TOTAL_FUNDING_RAISED | Total amount of funding raised in USD |
| EMPLOYEE_COUNT | Current number of employees |
| LAST_FUNDING_DATE | Date of the most recent funding event |
| TYPE | Company type |
| FACEBOOK_URL | Main Facebook profile URL for the company |
| WEBSITE | Primary company website |
| IMMEDIATE_PARENT | Direct owner of the company |
| ULTIMATE_PARENT_MIC_EXCHANGE | MIC exchange of the ultimate parent (if public) |
| TWITTER_URL | Main Twitter profile URL for the company |
| SUMMARY | Company description |
| INFERRED_REVENUE | Estimated annual revenue in USD |
| HEADLINE | Company's headline summary |
Additional fields, such as the new ones below, are accessible in tables of the same name:
| Name | Description |
|---|---|
| top_next_employers | The top ten companies employees moved to, and how many employees moved there, across all time periods |
| top_previous_employers | The top ten companies current employees previously worked for, and how many current employees came from each, across all time periods |
| top_next_employers_12_month | The top ten next employers, counting only employee changes within the last 12 months |
| top_previous_employers_12_month | The top ten previous employers, counting only employee changes within the last 12 months |
| employee_count_by_sub_role | The number of current employees broken down by Job Title Sub Role |
| employee_growth_rate_12_month_by_sub_role | The twelve-month rate of change by Job Title Sub Role |
| employee_count_by_class | The number of current employees broken down by Job Title Class |
| employee_growth_rate_12_month_by_class | The twelve-month rate of change by Job Title Class |
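Joining one of these additional tables to the primary company table on COMPANY_ID might look roughly like the sketch below. It uses an in-memory SQLite database purely as a stand-in for the real warehouse so that it runs anywhere. The rows are made up, and the columns JOB_TITLE_CLASS and CLASS_EMPLOYEE_COUNT are hypothetical names chosen for illustration, since the layout of the additional tables is not documented here.

```python
import sqlite3

# In-memory SQLite stands in for the real warehouse; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (
        COMPANY_ID TEXT PRIMARY KEY,
        NAME TEXT,
        EMPLOYEE_COUNT INTEGER
    );
    -- Hypothetical layout: JOB_TITLE_CLASS and CLASS_EMPLOYEE_COUNT are
    -- assumed column names, not taken from the published schema.
    CREATE TABLE employee_count_by_class (
        COMPANY_ID TEXT,
        JOB_TITLE_CLASS TEXT,
        CLASS_EMPLOYEE_COUNT INTEGER
    );
    INSERT INTO company VALUES ('c1', 'Acme', 120), ('c2', 'Globex', 40);
    INSERT INTO employee_count_by_class VALUES
        ('c1', 'engineering', 70),
        ('c1', 'sales', 50),
        ('c2', 'engineering', 40);
    """
)

# Join the additional table to the primary table on COMPANY_ID.
joined = conn.execute(
    """
    SELECT c.NAME, e.JOB_TITLE_CLASS, e.CLASS_EMPLOYEE_COUNT
    FROM company AS c
    JOIN employee_count_by_class AS e
      ON e.COMPANY_ID = c.COMPANY_ID
    ORDER BY c.NAME, e.JOB_TITLE_CLASS
    """
).fetchall()

for name, job_class, count in joined:
    print(name, job_class, count)
```

Against the real dataset, the same join condition would apply to PDL.COMPANY.COMPANY and whichever additional table you need, with the actual column names substituted.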
Key Concepts
Reading in PySpark
People Data Labs Company data uses a slightly tricky encoding: fields can span multiple lines and contain unescaped quotes inside quoted values. This trips up PySpark's default CSV reader (Pandas handles it fine).
You can read the data as follows:
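# Assumes an active SparkSession named `spark`; set path_to_company_data to the file location.
df = (
    spark.read.option("header", "true")
    .option("multiLine", "true")  # allow a quoted field to span several lines
    .option("escape", '"')  # treat a doubled quote inside a quoted field as a literal quote
    .option("quote", '"')
    .csv(path_to_company_data)
)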
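To see what this encoding looks like without a Spark cluster, here is a minimal sketch using Python's built-in csv module. The sample text is invented to mimic the pattern described above (a quoted field that spans two lines and contains doubled quotes) and is not real dataset content. The helper name read_rows is hypothetical.

```python
import csv
import io

# Invented sample mimicking the described encoding: the SUMMARY field of the
# first record spans two lines and contains doubled quotes.
sample = (
    'COMPANY_ID,NAME,SUMMARY\n'
    '1,Acme,"Makers of ""fine"" widgets\n'
    'since 1999"\n'
    '2,Globex,Plain summary\n'
)


def read_rows(text):
    """Parse CSV text, honoring multi-line fields and doubled quotes."""
    reader = csv.DictReader(io.StringIO(text), quotechar='"', doublequote=True)
    return list(reader)


rows = read_rows(sample)
for row in rows:
    # repr() makes the embedded newline and quotes visible.
    print(row["COMPANY_ID"], row["NAME"], repr(row["SUMMARY"]))
```

The csv module treats a doubled quote as a literal quote by default, which is the same behavior the PySpark `escape` option is set up to reproduce.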