Discussions
Duplicates in WageScape Title Data
I am working with WageScape's Title data, and I am noticing that there are millions of observations with identical Job Posting IDs, which the data dictionary suggests should be the unique job posting identifier. At a glance, it looks like there are 423M observations in WageScape's Job Postings With Salary database, 407M in Role Mapping, and 454 in Time Logs, yet there are 818M observations in Titles. What is the best way to handle these duplicates, and why are there so many? Are these duplicates the result of some job postings having multiple titles?
Global Data
It's a bit unclear to me which data sets exclusively encompass the United States and which are global in their coverage. For example, Job Postings with Salary
from WageScape claims to comprise "individual roles around the world", but then when I click on the documentation the metadata claims `Geographical Coverage: United States
.
“Salary” and “Salary_modeled” in WageScape dataset
I am exploring the WageScape data and wondering what “Salary” and “Salary_modeled” exactly represent. I see the following descriptions for the two variables on Dewey: