FAQs - WageScape

Where does WageScape data come from?

WageScape uses data from publicly available sources across the web, including aggregated job boards. Each month it adds 24.5 million publicly available job postings with employer-reported salary ranges. By scanning the internet and aggregating information from publicly available job listings, WageScape compiles a vast dataset that represents the entire job market.

How is WageScape data validated?

WageScape uses several methodologies to validate the data and remove outliers, inaccurate postings, duplicates, and jobs with estimated pay ranges. Because the data comes straight from the source, it is as accurate as the pay advertised by employers, and you can review each data point exactly as it was originally posted.
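As a rough illustration of this kind of cleaning step, the sketch below drops exact duplicates and postings flagged as estimated, then removes pay outliers with a standard interquartile-range (IQR) rule. WageScape's actual validation methods are not described here, so the IQR rule, the field names (company, title, location, pay_low, pay_high, pay_is_estimated), and the 1.5 multiplier are all assumptions made for the example.

```python
from statistics import quantiles


def clean_postings(postings, iqr_multiplier=1.5):
    """Drop duplicates and estimated ranges, then filter IQR outliers.

    `postings` is a list of dicts with hypothetical keys:
    company, title, location, pay_low, pay_high, pay_is_estimated.
    """
    seen = set()
    kept = []
    for p in postings:
        # Treat postings with identical employer, title, place, and pay as duplicates.
        key = (p["company"], p["title"], p["location"], p["pay_low"], p["pay_high"])
        if key in seen or p["pay_is_estimated"]:
            continue
        seen.add(key)
        kept.append(p)

    # Quartiles are not meaningful for very small samples; return them as-is.
    if len(kept) < 4:
        return kept

    # Use the midpoint of each advertised range as the value to screen.
    mids = [(p["pay_low"] + p["pay_high"]) / 2 for p in kept]
    q1, _, q3 = quantiles(mids, n=4)
    spread = iqr_multiplier * (q3 - q1)
    lower, upper = q1 - spread, q3 + spread
    return [p for p, m in zip(kept, mids) if lower <= m <= upper]
```

A production pipeline would likely use fuzzier duplicate matching and compare pay within a role and location rather than across the whole sample.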

Does WageScape have every job posting?

WageScape is made up of millions of individual data points and adds over 24.5 million jobs each month. That is more job data than salary surveys have. However, the data in WageScape remains a sample of the market: not every job posting is captured, for reasons ranging from quality checks to the data collection processes WageScape employs.

What data (i.e., predictors for salary) are used in the WageScape AI model?

The WageScape AI model is trained on multiple parameters, but those with the highest level of influence are role, location, company, and industry. The models are trained on the prior 6 months of data when sufficient data exists, and on the last 12 months when lower sample counts require it.
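The window rule can be sketched in a few lines of Python. This is only an illustration: the threshold for "sufficient data" is not stated, so MIN_SAMPLES below is a hypothetical value, and the function name and day counts (183 and 365) are assumptions as well.

```python
from datetime import date, timedelta

MIN_SAMPLES = 30  # hypothetical threshold; the real cutoff is not published here


def training_window(posting_dates, as_of, min_samples=MIN_SAMPLES):
    """Return (window_in_months, dates_in_window) for one role/location slice.

    Prefers the most recent ~6 months and falls back to ~12 months when the
    shorter window holds fewer than `min_samples` postings.
    """
    six_month_start = as_of - timedelta(days=183)
    recent = [d for d in posting_dates if six_month_start <= d <= as_of]
    if len(recent) >= min_samples:
        return 6, recent

    twelve_month_start = as_of - timedelta(days=365)
    wider = [d for d in posting_dates if twelve_month_start <= d <= as_of]
    return 12, wider
```

Calling training_window(dates, date(2024, 7, 1)) on a thinly populated role would return the 12-month slice, while a high-volume role would stay on the 6-month slice.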

How does WageScape track subsidiaries after M&A events? Does it retroactively classify subsidiaries under the parent company, or maintain separate records?

Where subsidiaries are known for a company, WageScape records the parent company in company_parent. WageScape does not retroactively assign parents to companies for data collected before the transaction.

What kind of labor market data does WageScape provide?

WageScape offers real-time hiring data collected from online job listings, covering over 600 million records globally. Their system captures:

  • Who is hiring
  • What roles are open
  • What compensation is offered
  • What skills and qualifications are required

This data is updated regularly and normalized across job titles, companies, and industries to ensure usability across sectors and research applications.

How complete is WageScape’s U.S. job market coverage?

WageScape captures data on approximately 90% of all new jobs created in the U.S. The coverage is:

  • Based on company websites and job boards
  • Benchmarked against JOLTS (Job Openings and Labor Turnover Survey)
  • Representative across all industries and job levels — except some C-suite and niche expert roles, which are underrepresented

How far back does the historical data go?

  • U.S. data: Back to 2016
  • International data: Coverage begins in 2021, varying by country

Historical access enables analysis of long-run labor market trends, including pre/post-COVID shifts and the evolution of remote work.

How does WageScape infer salary when not explicitly listed?

WageScape uses proprietary modeling to estimate salary for jobs without explicit pay in the listing. They apply methods like:

  • Inferring pay from related job listings
  • Using search behavior simulations (e.g., what salary a job seeker would expect)
  • Modeling salary bands based on normalized job attributes

This results in much higher pay coverage than typical job scraping approaches.

What compensation details are available in the dataset?

WageScape includes:

  • Base salary estimates (quantified)
  • Prevalence of bonuses, equity, commissions (qualitative flag only)

They do not quantify non-base compensation due to limited visibility in source listings, but they indicate when such benefits are mentioned.

Can researchers see salary ranges, not just point estimates?

Yes. Their Precision Pay tables:

  • Preserve the original low–high ranges (e.g., "$17.50–$22/hr")
  • Annotate multiple salary sources: job posting, listing platform estimates, WageScape’s own inferences, and AI-derived models

This supports granular analysis of pay bands, transparency requirements, and equity comparisons.
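For researchers who want numeric bounds from a preserved range string, a small parser along the following lines may help. It is a sketch only: the exact string formats in the Precision Pay tables are not documented here, so the regular expression, the function name parse_pay_range, and the unit labels are assumptions based on the single example above.

```python
import re

# Matches strings such as "$17.50–$22/hr" or "60,000 - 80,000 / year".
# \u2013 is the en dash used in the example range; a plain hyphen also works.
_RANGE = re.compile(
    r"\$?\s*(?P<low>[\d,]+(?:\.\d+)?)\s*[\u2013-]\s*"
    r"\$?\s*(?P<high>[\d,]+(?:\.\d+)?)"
    r"(?:\s*/\s*(?P<unit>hr|hour|yr|year))?",
    re.IGNORECASE,
)

_UNITS = {"hr": "hour", "hour": "hour", "yr": "year", "year": "year"}


def parse_pay_range(text):
    """Return {'low', 'high', 'unit'} for a pay-range string, or None."""
    match = _RANGE.search(text)
    if match is None:
        return None
    low = float(match.group("low").replace(",", ""))
    high = float(match.group("high").replace(",", ""))
    unit = match.group("unit")
    return {
        "low": low,
        "high": high,
        # Unit stays None when the string does not state one.
        "unit": _UNITS[unit.lower()] if unit else None,
    }
```

Keeping low and high separate, rather than collapsing to a midpoint, preserves the band width that pay-transparency analyses usually need.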

Are job postings geo-tagged?

Yes. Each job is assigned a specific location:

  • Typically down to the ZIP code
  • With available latitude and longitude
  • Indicates job location, not just the company HQ

This allows spatial analysis of hiring trends and geographic variation in wages and skills demand.
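One simple form of spatial analysis is selecting the jobs within a given radius of a point. The sketch below uses the standard haversine great-circle formula; the record keys latitude and longitude and the function names are assumptions, not documented field names.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def jobs_within(jobs, lat, lon, radius_km):
    """Keep jobs whose (hypothetical) latitude/longitude fall inside the radius."""
    return [
        job for job in jobs
        if haversine_km(lat, lon, job["latitude"], job["longitude"]) <= radius_km
    ]
```

For large extracts, a spatial index or a bounding-box prefilter would be faster than computing the distance for every row.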

How is the data structured on Dewey?

The dataset is available in a relational format with six linked tables:

  • Job metadata (posting date, company, location, etc.)
  • Time series tracking (open/closed status)
  • Job titles
  • Attributes/tags (~11,000 skills and features)
  • Compensation information
  • Normalized job roles (~6,000 standardized titles)

All tables are linked by a common job_id.
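To show how the job_id link works in practice, here is a minimal sketch using Python's built-in sqlite3 module with three toy tables. Only job_id comes from the description above; the table names (jobs, compensation, tags), the other column names, and the sample rows are hypothetical stand-ins for the real schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with toy tables keyed by job_id."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, company TEXT, zip TEXT);
        CREATE TABLE compensation (job_id INTEGER, pay_low REAL, pay_high REAL);
        CREATE TABLE tags (job_id INTEGER, tag TEXT);

        INSERT INTO jobs VALUES
            (1, 'Acme', '94105'), (2, 'Globex', '10001'), (3, 'Initech', '73301');
        INSERT INTO compensation VALUES
            (1, 60000, 80000), (2, 45000, 55000), (3, 70000, 90000);
        INSERT INTO tags VALUES
            (1, 'remote'), (1, 'full-time'), (2, 'on-site'),
            (3, 'remote'), (3, 'part-time');
        """
    )
    return con


def jobs_with_tag(con, tag):
    """Join job metadata, pay, and tags on job_id for one tag value."""
    return con.execute(
        """
        SELECT j.job_id, j.company, c.pay_low, c.pay_high
        FROM jobs AS j
        JOIN compensation AS c ON c.job_id = j.job_id
        JOIN tags AS t ON t.job_id = j.job_id
        WHERE t.tag = ?
        ORDER BY j.job_id
        """,
        (tag,),
    ).fetchall()
```

Calling jobs_with_tag(build_demo_db(), "remote") returns the pay ranges for the two toy jobs tagged as remote. The same join pattern extends to the other tables.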

Can you distinguish full-time vs part-time or remote jobs?

Yes. These are tracked in the job attribute (tags) table, which includes:

  • Work setting (on-site, hybrid, remote)
  • Employment type (full-time, part-time, contractor)
  • Academic and experience requirements

Most jobs in the dataset are full-time, but part-time flags are clearly identified where available.

Who is using this data in academia, and how?

Researchers at institutions like Stanford, MIT, and global universities use WageScape for:

  • Curriculum design (e.g., evolving data science skills)
  • AI and automation impact studies
  • Labor market forecasting post-COVID
  • DEI and gender gap analysis
  • Capstone/practicum coursework (with real-time data)

Use cases span labor economics, education, business, and social science disciplines.

Is the data representative of the labor market?

Yes. The dataset:

  • Is broadly representative across sectors and geographies
  • Slightly underrepresents top executive roles
  • Captures job demand trends at both macro and micro levels
  • Is validated against external benchmarks like JOLTS

This makes it a reliable proxy for research on wages, employment, and hiring dynamics.