People And Resume
Overview
This collection comprises 18 interconnected datasets that together form the People Data Labs People and Resume data product. These datasets contain over 50 years of people-based data including job history, education history, associated addresses, social media handles, contact information, etc.
The data are structured around a core Person table that serves as the root record in the schema, containing a unique persistent identifier, current job attributes, and source metadata (num_sources, num_records). Every other table in this collection references Person, either directly via the person_id foreign key or indirectly through the Education and Experience tables.
Data Description
The People and Resume collection consists of the following interconnected tables:
Person
The Person dataset is the top-level record. It holds a unique persistent identifier, current job attributes, and source metadata, and it is the parent of every other dataset, either directly via person_id or indirectly through Education and Experience.
Profile
Contains social profiles associated with each person. Each profile entry includes network (the canonical platform identifier, e.g. linkedin, twitter, facebook), username (account handle on that platform), and url (profile web address). The person_id field links to the Person table.
LinkedIn URLs
Stores all LinkedIn profile URLs ever observed for a person, including the current linkedin_url plus historical aliases that have resolved to the same person identity. The person_id field links to the Person table.
Email
Contains email addresses associated with an individual, along with their classifications. Each entry includes address (fully parsed email) and type (classification from the Canonical Email Types list, such as personal, professional, or current_professional). The current work email is also surfaced separately as work_email. The person_id field links to the Person table.
Education
Contains educational experiences for an individual. Each entry includes school (institution name, location, and domain), degrees, majors, minors, and start/end dates when available. The person_id field links to the Person table, and education_id serves as the primary key referenced by the Degree, Major, and Minor tables.
Degree
A normalized table containing standardized academic degree classifications (e.g. bachelors, masters, doctorates, associates) drawn from the Canonical Education Degrees list. The education_id field links to the Education table.
Major
Contains all majors the person earned at a school, drawn from the Canonical Education Majors list. The education_id field links to the Education table.
Minor
Contains all minors the person earned at a school, drawn from the Canonical Education Majors list. The education_id field links to the Education table.
Certification
Contains certifications obtained by an individual. Each entry contains the certification name, the issuing organization, and start/end dates when available. The person_id field links to the Person table.
Experience
Contains an individual's work experience. Each entry includes title (job position with name, role, sub_role, levels, and class), company (employer data that mirrors the Company Schema), location_names, start_date, end_date, and a flag indicating whether the role is the person's primary current role. The person_id field links to the Person table, and id serves as the primary key, referenced as experience_id by the Experience Location Name and Experience Title Level tables.
Experience Location Name
Contains experience locations associated with a particular experience entry. Locations are denormalized name strings (city, region, country) and may include multiple values when the role spanned more than one location. The experience_id field links to the Experience table.
Experience Title Level
Contains derived seniority level(s) of the person's job title for a given experience entry (for example: owner, partner, cxo, vp, director, manager, senior, training, unpaid, entry). The experience_id field links to the Experience table.
Job Title Level
Contains the derived level(s) of the person's current job title, using the same enumeration as experience.title.levels. The person_id field links to the Person table.
Skill
Contains the person's self-reported skills, normalized to lowercase strings. The person_id field links to the Person table.
Interest
Contains the person's self-reported interests, normalized to lowercase strings. The person_id field links to the Person table.
Location Name
Contains the location (city, state, and country) of the person's current address as a denormalized name string. The location_names field captures the full set of location name strings observed across sources. The person_id field links to the Person table.
Region
Contains the administrative region of the person's current address, for example a U.S. state or a non-U.S. first-level subdivision. The person_id field links to the Person table.
Country
Contains the country of the person's current address. The person_id field links to the Person table.
Coverage
The datasets contain over 50 years of people-based data. Geographic coverage includes location information at the country, region (administrative subdivision), and city level for both current addresses and work experience locations.
Methodology
Educational degrees are standardized using the Canonical Education Degrees list (e.g. bachelors, masters, doctorates, associates). Academic majors and minors are drawn from the Canonical Education Majors list. Email addresses are classified according to the Canonical Email Types list (such as personal, professional, or current_professional).
Job title seniority levels are derived and classified into categories including: owner, partner, cxo, vp, director, manager, senior, training, unpaid, and entry. These derived levels apply both to historical experience entries and current job titles.
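As a rough illustration of how a consumer might check level values against the categories listed above, the following sketch separates known levels from unrecognized ones. The function name and the sample input are hypothetical, and because the text says the categories "include" these values, the set below should not be assumed to be exhaustive.

```python
# Level values taken from the list in the text above; the real
# enumeration may contain values not named in this document.
TITLE_LEVELS = frozenset({
    "owner", "partner", "cxo", "vp", "director",
    "manager", "senior", "training", "unpaid", "entry",
})


def split_levels(levels):
    """Partition level strings into (known, unknown), preserving order.

    `split_levels` is a hypothetical helper, not part of any PDL API.
    """
    known = [lvl for lvl in levels if lvl in TITLE_LEVELS]
    unknown = [lvl for lvl in levels if lvl not in TITLE_LEVELS]
    return known, unknown


if __name__ == "__main__":
    # A title can carry more than one level, so the input is a list.
    print(split_levels(["vp", "director", "intern"]))
```

Keeping unrecognized values in a separate list, rather than discarding them, makes it easier to notice when the upstream enumeration changes.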
Social profiles include canonical platform identifiers (e.g. linkedin, twitter, facebook) with associated usernames and URLs. LinkedIn URL data includes both current profile URLs and historical aliases that have resolved to the same person identity.
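Because historical aliases resolve to the same person identity, one way to use the LinkedIn URLs table is as a lookup from any observed URL to a person_id. The sketch below assumes rows shaped as (person_id, url) pairs. The sample rows, the helper names, and the normalization step (trim, lowercase, drop trailing slash) are illustrative assumptions rather than documented behavior.

```python
def normalize_url(url: str) -> str:
    """Assumed normalization: trim whitespace, lowercase, drop trailing slash."""
    return url.strip().lower().rstrip("/")


def build_alias_index(rows):
    """Map every observed URL (current or historical) to its person_id."""
    index = {}
    for person_id, url in rows:
        index[normalize_url(url)] = person_id
    return index


def resolve_person(index, url):
    """Return the person_id for a URL, or None if it was never observed."""
    return index.get(normalize_url(url))


# Hypothetical sample rows: two URLs for p1 (one current, one alias).
ROWS = [
    ("p1", "linkedin.com/in/jane-doe"),
    ("p1", "linkedin.com/in/jane-doe-1a2b3c"),
    ("p2", "linkedin.com/in/john-roe"),
]

if __name__ == "__main__":
    idx = build_alias_index(ROWS)
    print(resolve_person(idx, "linkedin.com/in/jane-doe-1a2b3c"))
```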
Skills and interests are normalized to lowercase strings. Location names are stored as denormalized strings combining city, region, and country information.
The Person table aggregates source metadata including num_sources and num_records, which track the number of sources and records contributing to each person profile.
Additional Notes
The data structure uses a relational model with the Person table as the root entity. Foreign key relationships connect child tables to parent tables via person_id (linking to Person), education_id (linking to Education), and experience_id (linking to Experience).
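The parent/child chain described above can be sketched with an in-memory SQLite database. This is only a minimal illustration: the table names (person, education, degree), the non-key columns (job_title, school_name, degree), and the sample rows are assumptions made for the example. Only person_id and education_id come from the description above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database mirroring Person -> Education -> Degree."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id TEXT PRIMARY KEY,
            job_title TEXT               -- hypothetical column
        );
        CREATE TABLE education (
            education_id TEXT PRIMARY KEY,
            person_id TEXT REFERENCES person(person_id),
            school_name TEXT             -- hypothetical column
        );
        CREATE TABLE degree (
            education_id TEXT REFERENCES education(education_id),
            degree TEXT                  -- hypothetical column
        );
        INSERT INTO person VALUES ('p1', 'data engineer');
        INSERT INTO education VALUES ('e1', 'p1', 'example university');
        INSERT INTO degree VALUES ('e1', 'bachelors'), ('e1', 'masters');
        """
    )
    return conn


def degrees_for_person(conn: sqlite3.Connection, person_id: str):
    """Follow person_id to Education, then education_id to Degree."""
    return conn.execute(
        """
        SELECT e.school_name, d.degree
        FROM person AS p
        JOIN education AS e ON e.person_id = p.person_id
        JOIN degree AS d ON d.education_id = e.education_id
        WHERE p.person_id = ?
        ORDER BY d.degree
        """,
        (person_id,),
    ).fetchall()


if __name__ == "__main__":
    print(degrees_for_person(build_demo_db(), "p1"))
```

The same two-hop pattern would apply to the Experience branch, joining on person_id and then on experience_id.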
Work experience includes company information that mirrors the Company Schema. Experience entries include a flag indicating whether a role represents the person's primary current role.
The dataset descriptions are drawn from People Data Labs documentation, including the Person Schema, Company Schema, and Company Data Field Bundles.