Rent Map Data Sources
I just finished updating my rent map to handle the recent Padmapper UI refresh, and someone asked how Padmapper not including Craigslist listings affected the map. This confused me: I had thought Padmapper got its data by buying it from 3Taps, which scraped Google's cache of Craigslist. But it turns out that Padmapper and 3Taps settled Craigslist's lawsuit against them, and Padmapper has only gotten its listings from other sources since then.
One issue, though, is that the cheapest apartments might be listed only on Craigslist [1] and not on the other services that Padmapper pulls from. To get a rough check on this, I took ten random listings from the Boston Craigslist page and tried to figure out which Padmapper listing each one went with.
2br at 300 2nd Ave Needham for $3697. In Padmapper. My map doesn't go this far out.
2br on Crawford St in Watertown for $1700. Not in Padmapper. My map predicts $2065.
1br near Broadway at Oxford St in Arlington for $1495. Not in Padmapper. My map predicts $1775.
2br on Harvey St in Cambridge for $2900. In Padmapper. My map predicts $2800.
Studio at 1110 Comm Ave in Allston for $1575. In Padmapper. My map predicts $1590.
3br on Tremont St in Cambridge for $3000. Not in Padmapper. My map predicts $2995.
3br on Bromfield Rd in Somerville for $3000. In Padmapper. My map predicts $3155.
3br in Cambridge for $6000. Not in Padmapper. This listing is kind of nuts, since it gives no pictures or address information beyond just "Cambridge". Even for Harvard Sq my map only predicts $3650.
2br on Somerville Ave in Somerville for $1700. Not in Padmapper. My map predicts $2605.
2br at 400 Foxborough Blvd in Foxboro for $2215. In Padmapper. My map doesn't go this far out.
Predictions summary:
Listing   Estimate   Error   In Padmapper
$1700     $2605      +53%    no
$1700     $2065      +21%    no
$1495     $1775      +19%    no
$3000     $3155      +5%     yes
$1575     $1590      +1%     yes
$3000     $2995      -0%     no
$2900     $2800      -3%     yes
$6000     $3650      -39%    no (dubious)
While this is a small sample, it looks like the predictions are pretty good for the ones in Padmapper (which is what you would expect) and consistently too high (0%, 19%, 21%, 53%, avg=23%) for the ones not in Padmapper.
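Here's a rough sketch of how those group averages can be reproduced. It assumes the error is measured as (estimate - listing) / listing, leaves out the two listings my map doesn't cover, and skips the dubious $6000 listing; the names `LISTINGS`, `pct_error`, and `mean_error` are just made up for illustration.

```python
from statistics import mean

# (listing price, map estimate, in Padmapper?, dubious?) for the eight
# sampled listings that fall inside the map's coverage area.
LISTINGS = [
    (1700, 2605, False, False),  # Somerville Ave, Somerville
    (1700, 2065, False, False),  # Crawford St, Watertown
    (1495, 1775, False, False),  # Arlington
    (3000, 3155, True, False),   # Bromfield Rd, Somerville
    (3000, 2995, False, False),  # Tremont St, Cambridge
    (1575, 1590, True, False),   # 1110 Comm Ave, Allston
    (2900, 2800, True, False),   # Harvey St, Cambridge
    (6000, 3650, False, True),   # "Cambridge", no address given
]


def pct_error(listing, estimate):
    """Signed error of the estimate, as a percentage of the listing price."""
    return 100 * (estimate - listing) / listing


def mean_error(rows, in_padmapper):
    """Average signed error for one group, skipping dubious listings."""
    return mean(
        pct_error(listing, estimate)
        for listing, estimate, in_pm, dubious in rows
        if in_pm == in_padmapper and not dubious
    )


print(f"in Padmapper:     {mean_error(LISTINGS, True):+.1f}%")
print(f"not in Padmapper: {mean_error(LISTINGS, False):+.1f}%")
```

On this sample that works out to roughly +1% for the listings in Padmapper and roughly +23% for the ones that aren't.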
Fixing this is pretty tricky. I could do a larger sample to try to get a better sense of what the error is, and then adjust my map down by the combination of how much lower the non-Padmapper apartments are and what fraction aren't in Padmapper. In this case, ignoring the dubious listing, 4 of 9 weren't on Padmapper, and my estimates for those were too high by an average of 23%. Since 4/9 of 23% is about 10%, that would mean adjusting all my estimates down by 10%. On the other hand, as people's listing behavior changes this could get obsolete pretty quickly, and it's a pain to calculate the first time, let alone on an ongoing basis. Ideas?
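As a minimal sketch of that adjustment, assuming the simple approach of weighting the average overestimate by the fraction of listings missing from Padmapper (the function names here are hypothetical, and nothing like this is in the map yet):

```python
def blended_adjustment_pct(missing, total, mean_overestimate_pct):
    """Percent to lower every estimate by: the share of listings not in
    Padmapper times how much the map overestimates those listings."""
    return (missing / total) * mean_overestimate_pct


def adjust(estimate, adjustment_pct):
    """Lower a single map estimate by the given percentage."""
    return estimate * (1 - adjustment_pct / 100)


# 4 of the 9 non-dubious listings weren't in Padmapper, and the map was
# about 23% too high on those.
adjustment = blended_adjustment_pct(4, 9, 23.0)
print(f"adjust estimates down by about {adjustment:.0f}%")
print(f"a $2000 estimate becomes ${adjust(2000, adjustment):.0f}")
```

This treats the sample fractions as if they held everywhere on the map, which is exactly the part that could go stale as listing behavior changes.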
[1] Or, worse for my map, listed only with signs in windows or something else not available online.