FAQs - RentHub

Why is rental data difficult to collect?

Unlike for-sale real estate transactions, rental transactions are private and are not recorded at the municipal level. Rental data is fragmented across owner sites, listing platforms, and classifieds. RentHub aggregates this data into a standardized, research-ready format.

What’s included in RentHub’s dataset?

RentHub’s dataset includes:

  • Over 1 million listings weekly
  • Historical coverage from 2014 to present
  • Daily tracking of ~500,000 apartment complexes
  • Geographic coverage across the entire U.S.
  • Full dataset refreshed bi-weekly

Each listing includes rent, square footage, bedroom and bathroom counts, amenities, geolocation, marketing description, and timestamps (posted and scraped dates).

How does RentHub handle data cleanliness?

RentHub does not simply offer raw listings. It structures the data, deduplicates it, and adds identifiers:

  • Unique IDs for properties and units
  • Indicators for listing duration (posted-to-delisted lifecycle)
  • Fields to track amenity premiums (e.g., units with granite countertops)

The platform is optimized to support clean joins and longitudinal analysis.
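As a rough illustration of how amenity fields could support a premium comparison, the sketch below contrasts the median rent of units that list an amenity with the median rent of units that do not. The record layout (`unit_id`, `rent`, `amenities`) and the sample values are assumptions made for this example, not RentHub's actual schema, and a raw median difference does not control for size, location, or time.

```python
from statistics import median

# Hypothetical listing records; field names and values are illustrative only.
listings = [
    {"unit_id": "u1", "rent": 2100, "amenities": {"granite countertops", "pool"}},
    {"unit_id": "u2", "rent": 1900, "amenities": {"pool"}},
    {"unit_id": "u3", "rent": 2300, "amenities": {"granite countertops"}},
    {"unit_id": "u4", "rent": 2000, "amenities": set()},
]


def amenity_premium(rows, amenity):
    """Median rent of units with the amenity minus median rent of units without it."""
    with_amenity = [r["rent"] for r in rows if amenity in r["amenities"]]
    without = [r["rent"] for r in rows if amenity not in r["amenities"]]
    if not with_amenity or not without:
        return None  # the comparison is undefined unless both groups exist
    return median(with_amenity) - median(without)


premium = amenity_premium(listings, "granite countertops")
print(premium)
```

A more careful analysis would likely compare units within the same property or market and period rather than across the whole sample.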

How has RentHub data been used in research?

Dr. Le Jiang Le presented research on how university presence affects nearby rental markets in Southern California. Her key findings using RentHub data include:

  • Proximity to a university raises asking rents by more than $200, plus a further $120 for each kilometer closer
  • Units near universities are more likely to be studios/1-bedrooms, furnished, dense, and amenity-rich
  • These “university rental markets” are statistically distinct from surrounding areas

The research relied on RentHub’s street-level geolocation and high temporal granularity, validating results against ACS survey data.

Can RentHub data be joined with other datasets?

Yes. RentHub is integrating Placekey, a universal spatial identifier that makes it easier to join RentHub data with parcel, commercial, or demographic datasets. While parcel-level joins aren't yet supported natively, they’re on the roadmap.

What legal and ethical standards guide RentHub’s scraping?

RentHub collects only public, factual listing data. It never collects images or content behind login walls. It follows current legal norms for web scraping in order to stay compliant and meet ethical standards.

When merging the Rental Data with the Listing and Property Mapping by ID, some listings aren't matched, so they have no UNIT_ID or PROPERTY_ID. What are possible reasons for this?

RentHub's methodology for aggregating data and assigning unit_id and property_id values has evolved and improved over time. As a result, recent listing observations (approximately from 2019 onward) include the unit_id and property_id fields. Unfortunately, some earlier listing observations in the dataset may not contain these fields.
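One way to check whether unmatched listings are concentrated in earlier years is to do a left join and count the rows that come back without a UNIT_ID, grouped by posting year. The sketch below uses plain Python dictionaries; the join column name `ID`, the string date format, and the sample rows are assumptions for illustration and may differ from the delivered files.

```python
from collections import Counter

# Hypothetical rows; the "ID" join key and ISO date strings are assumptions.
rental_data = [
    {"ID": "a1", "DATE_POSTED": "2017-03-01"},
    {"ID": "a2", "DATE_POSTED": "2018-06-15"},
    {"ID": "a3", "DATE_POSTED": "2021-02-10"},
]
mapping = [
    {"ID": "a3", "UNIT_ID": "u9", "PROPERTY_ID": "p4"},
]


def left_join(rows, mapping_rows, key="ID"):
    """Keep every listing row; fill UNIT_ID/PROPERTY_ID with None when unmatched."""
    lookup = {m[key]: m for m in mapping_rows}
    merged = []
    for row in rows:
        match = lookup.get(row[key], {})
        merged.append({
            **row,
            "UNIT_ID": match.get("UNIT_ID"),
            "PROPERTY_ID": match.get("PROPERTY_ID"),
        })
    return merged


merged = left_join(rental_data, mapping)

# Count unmatched listings by the year portion of DATE_POSTED.
unmatched_by_year = Counter(
    row["DATE_POSTED"][:4] for row in merged if row["UNIT_ID"] is None
)
print(dict(unmatched_by_year))
```

If most unmatched rows fall before roughly 2019, that would be consistent with the explanation above.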

If I want to know how long a unit has been on the market, is it recommended to look at the max and min of DATE_POSTED and take that difference?

Yes, RentHub considers this a reasonable approach for deriving how long a unit has been on the market. Note that SCRAPED_TIMESTAMP is a different field: it records when RentHub acquired the listing observation, which does not necessarily correspond to the date the listing was posted to the channel.
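The max-minus-min approach can be sketched as follows. The record layout and sample dates are assumptions for illustration; only the UNIT_ID and DATE_POSTED field names come from the questions above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical listing observations for two units.
observations = [
    {"UNIT_ID": "u1", "DATE_POSTED": date(2023, 1, 5)},
    {"UNIT_ID": "u1", "DATE_POSTED": date(2023, 1, 19)},
    {"UNIT_ID": "u1", "DATE_POSTED": date(2023, 2, 2)},
    {"UNIT_ID": "u2", "DATE_POSTED": date(2023, 3, 1)},
]


def days_on_market(rows):
    """Days between the earliest and latest DATE_POSTED for each unit."""
    posted = defaultdict(list)
    for row in rows:
        posted[row["UNIT_ID"]].append(row["DATE_POSTED"])
    return {unit: (max(d) - min(d)).days for unit, d in posted.items()}


result = days_on_market(observations)
print(result)
```

Two caveats with this sketch: a unit with a single DATE_POSTED value yields 0 days, and a unit listed in separate vacancy periods would have those periods lumped into one span unless the data is first split by period.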

For some units, the DATE_POSTED variable changes often. Does this mean the unit was reposted or something changed from the previous posting?

The DATE_POSTED variable indicates the date a listing appeared on an advertising channel. If the same unit appears with a different DATE_POSTED value, the listing was most likely reposted. RentHub can review a specific example to provide further context.
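A simple way to flag likely reposts is to count the distinct DATE_POSTED values seen for each unit. This is a minimal sketch under the assumption that repeated observations of the same posting carry the same DATE_POSTED; the record layout and sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical observations; the first two rows are the same posting seen twice.
observations = [
    {"UNIT_ID": "u1", "DATE_POSTED": date(2023, 1, 5)},
    {"UNIT_ID": "u1", "DATE_POSTED": date(2023, 1, 5)},
    {"UNIT_ID": "u1", "DATE_POSTED": date(2023, 1, 19)},  # likely a repost
    {"UNIT_ID": "u2", "DATE_POSTED": date(2023, 3, 1)},
]


def repost_counts(rows):
    """Likely reposts per unit: distinct DATE_POSTED values minus the first posting."""
    dates = defaultdict(set)
    for row in rows:
        dates[row["UNIT_ID"]].add(row["DATE_POSTED"])
    return {unit: len(d) - 1 for unit, d in dates.items()}


reposts = repost_counts(observations)
print(reposts)
```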