NSW Using Regression to Estimate Property Prices

Discussion in 'Property Analysis' started by Yamas, 24th Jan, 2020.

  1. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney
    I have my eye on the Surry Hills/Redfern market and was wondering whether anyone had considered or used regression to estimate property values more precisely. I looked at the vast majority of apartment sales in Redfern and Surry Hills over the past year and came up with the following outputs.

    Constant $177,981.61
    Area (per sqm) $5,850.17
    Bedroom $57,197.74
    Bathroom $44,591.74
    Car Space $77,753.11

    Essentially, the value of an apartment is
    177,981 + 5,850 * (floor area in sqm) + 57,197 * (#beds) + 44,591 * (#baths) + 77,753 * (#car spaces)

    For those with a statistics background, my significance F and p-values are rock solid (4.84E-24 and generally <0.05 respectively).

    Using this I was able to identify that statistically, the most undervalued property sold in the last year was 43/146-152 Pitt St. It sold for $1,030,000, when the model predicts it had a value of $1,277,033 (assuming 130sqm of floorspace, 3 beds, 2 baths and 1 car space) (for a discount of $247,033).
    Statistically, the most overvalued property sold in the last year was 209/1A Great Buckingham St. It sold for $1,860,000 (assuming 160sqm of floor space, 2 beds, 2 baths, 1 car space) when the model suggests a value of $1,395,340 (for a premium of $464,659). For reference, Great Buckingham is a premium street and people pay extra just to live on it.
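
As a rough illustration of the arithmetic above, the fitted equation can be written as a small function and applied to the two sales mentioned. The coefficients and the floor areas are the ones quoted in the post (the areas are the poster's own assumptions); the function and variable names are made up for this sketch.

```python
# Coefficients as quoted in the post (apartments, Redfern / Surry Hills).
CONSTANT = 177_981.61
PER_SQM = 5_850.17
PER_BEDROOM = 57_197.74
PER_BATHROOM = 44_591.74
PER_CAR_SPACE = 77_753.11


def predict_price(area_sqm, beds, baths, car_spaces):
    """Return the model's estimate in dollars (hypothetical helper name)."""
    return (CONSTANT
            + PER_SQM * area_sqm
            + PER_BEDROOM * beds
            + PER_BATHROOM * baths
            + PER_CAR_SPACE * car_spaces)


# (label, sale price, assumed area, beds, baths, car spaces) from the post.
sales = [
    ("43/146-152 Pitt St", 1_030_000, 130, 3, 2, 1),
    ("209/1A Great Buckingham St", 1_860_000, 160, 2, 2, 1),
]

for label, sold, area, beds, baths, cars in sales:
    estimate = predict_price(area, beds, baths, cars)
    gap = sold - estimate  # negative: sold below the model; positive: above
    print(f"{label}: model ${estimate:,.0f}, sold ${sold:,.0f}, gap ${gap:+,.0f}")
```

The gap is simply the regression residual for each sale, so ranking all sales by it is one way to reproduce the "most undervalued" and "most overvalued" list.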

    Anyone got any stories about applying this? Any words of caution?

    I am going to recreate the model for houses too.
     

    New Town and Lindsay_W like this.
  2. Beano

    Beano Well-Known Member

    Joined:
    7th Apr, 2016
    Posts:
    2,420
    Location:
    Brisbane
    I did the same type of calculation on purchase too
    1: work out how much land is attached to the unit
    2: the aspect of the unit
    3: the body corporate cost
    4: the rates
    5: the age of the unit
    6: deferred maintenance of the unit and block.
     
  3. Lindsay_W

    Lindsay_W Well-Known Member

    Joined:
    1st Jul, 2015
    Posts:
    2,465
    Location:
    QLD
    This is quite interesting BUT can I ask what it matters if you calculate a property's value to be, say, $1 million using the above calculations but the valuation comes in at $800K? Do you consider it a bargain even if you can't convince a valuer to value it up to $1M and therefore can't access the perceived equity etc?
    Or the opposite, such as the example you used where someone paid a $464,659 premium (based on your figures): doesn't the market set the value in this case? As in, it's worth what people are willing to pay for it?
     
    NSWelshman likes this.
  4. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney
    Scenario 1: The benefit is that regression analysis is reactive and proactive. A 45sqm studio has come on the market on Riley St with a guide of 650k. Using this model, I know that the fair value of the apartment is around 585k, so the guide is well over.
    Scenario 2: Let’s say that a slightly uncommon combo comes on the market. Let’s say a 60sqm 2/2/0 (typically, a 2 bedder would have a parking spot in the area). Using the model, you can determine what the fair value is based on the observations.
    Scenario 3: let’s say you are negotiating on a property and the agent says “parking spots in this area add 100k to the value of a property”. You can say, “actually, I’ve crunched the numbers and parking spots are worth a touch under 80k”.
     
  5. Trainee

    Trainee Well-Known Member

    Joined:
    24th May, 2017
    Posts:
    6,043
    Location:
    Australia
    Seems that even if the model can estimate fair value, it doesn't help with what people are going to pay.

    Also how do you allow for renovated, not renovated, age?

    How does it work in rising or falling markets? In a rising market you will underestimate values, and vice versa in a falling market.

    How well would the model have worked in 2016 (rising market), and 2018 (falling market)?

    Most importantly, what has the model done for you? Have you managed to buy an undervalued property and does the bank valuation agree with your model?
     
  6. Scott No Mates

    Scott No Mates Well-Known Member

    Joined:
    18th Jun, 2015
    Posts:
    20,384
    Location:
    Sydney or NSW or Australia
    Or sometime in 2017/2019 when the market turned?

    What does it do in times of thin trading, e.g. regional areas?

    Conversely, what if it's a dead cat bounce? Governments intervene with new FHB (first home buyer) incentives, introduce/remove tax concessions, economy tanks/booms etc?
     
    NSWelshman likes this.
  7. The Y-man

    The Y-man Moderator Staff Member

    Joined:
    18th Jun, 2015
    Posts:
    8,955
    Location:
    Melbourne
    Given the posts above, I'd say you'd have to go to multi-criteria decision analysis (MCDA/MCDM), and possibly fuzzy at that.

    The Y-man
     
  8. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney
    Assuming my interpretation of the meaning of the outputs is correct, the model can explain around 77% of the variation in property values using just the area of floor space, number of bedrooms, bathrooms and car spaces. The remaining 23% is not explained by these four inputs and may be attributed to any number of other inputs (some of which you mentioned). But if anyone tells you they have an absolutely perfect model, they are fibbing.

    The model only used property sales which occurred over the previous year, so it is quite contemporaneous. Essentially, the output is an average over the course of the year. If you know today that a particular suburb is rising or falling, you can subjectively account for that.
    I'm not pretending this is the holy grail of property price forecasting; it is supposed to be a tool to supplement your property purchase journey.
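
To put a rough number on how far a one-year sales window can lag a moving market, here is a minimal sketch. It assumes a constant monthly growth rate and equal weight on every month in the window, neither of which comes from the thread; the function name and the growth rates are purely illustrative.

```python
def trailing_average_gap(monthly_growth, months=12):
    """Ratio of the window's average price level to the latest month's level."""
    # Price level in each month of the window, starting from 1.0.
    levels = [(1 + monthly_growth) ** m for m in range(months)]
    return (sum(levels) / months) / levels[-1]


for growth in (0.01, 0.0, -0.01):  # rising, flat, falling (illustrative rates)
    ratio = trailing_average_gap(growth)
    print(f"{growth:+.0%} per month: window average is {ratio - 1:+.1%} vs latest month")
```

A ratio below 1 means a model fitted on the window would tend to sit below today's prices (rising market), and a ratio above 1 means it would sit above them (falling market).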

    Finally, to the most important point, I only developed it today (it was a thought bubble). But I'm hoping to record some sales over the coming weeks to see how well the predicted sale prices match the final sale prices.
     
    Archaon likes this.
  9. Scott No Mates

    Scott No Mates Well-Known Member

    Joined:
    18th Jun, 2015
    Posts:
    20,384
    Location:
    Sydney or NSW or Australia
    That'd be where experience, market knowledge, purpose, motivation etc come into play.
     
    Lindsay_W likes this.
  10. Lindsay_W

    Lindsay_W Well-Known Member

    Joined:
    1st Jul, 2015
    Posts:
    2,465
    Location:
    QLD
    I guess my question wasn't quite clear sorry.
    For my own understanding, what is the purpose of using this IF it doesn't match what a valuer values the property at or what someone is willing to pay for it? What benefit does it give you? Do people actually sell their properties at Fair Value rather than Market value?
     
  11. significance

    significance Well-Known Member

    Joined:
    6th Sep, 2019
    Posts:
    164
    Location:
    Queensland
    Have you run the usual stats checks on the model? e.g. Have you trained it with a random sample of the data and then evaluated the stats for data outside the training set? Have you looked for patterns in the residuals? Have you checked for homoscedasticity?

    I think that the biggest problem with this approach is that your predictor variables are not independent from each other (e.g. the number of bathrooms is likely highly correlated with the number of bedrooms and the number of bedrooms is highly correlated with overall area), so linear regression modelling is likely to give misleading results -- you might want to consider using an approach that doesn't assume independence (e.g. random forest modelling) or else run the regression on the principal components rather than the raw covariates.
     
  12. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney
    In theory at least, the value determined by this model and a valuation offered by a valuer should be approximately the same. The methodologies utilised might be different, but in broad terms, the outcome should be the same. The model can benefit buyers and sellers. The model benefits/educates a seller when there is a perception in the market that the house should be undervalued for a particular reason (and whether that reason is justified). Similarly, it benefits/educates a buyer (investors in particular) in knowing what people really value in property in a particular region.

    Using another example I developed, I had a look at all the house sales (i.e. land component). I found that in a pocket of the eastern suburbs of Sydney, the area of the land is not a good indicator of the value of a property. Normally, we would associate larger land with higher values, but I can statistically show this isn’t the case.

    2.PNG

    Land size only accounts for $671/sqm, but statistically, it could be $0. I (you) might think this is ridiculous. But have a look at the sales over the past year. For (say) 3 bedroom, 2 bathroom houses, there is very little correlation between the land component and the sale price. One of the smallest pieces of land sold for the largest amount (94sqm at $2.7 million), and substantially larger bits of land sold for less (144sqm at $1.73 million). Essentially, from a statistics standpoint, people pay a flat rate for land (regardless of size), and then a considerable premium for each bedroom, bathroom and carspace.

    1.PNG

    You can use this as leverage when negotiating (from both sides).

    If you were a prospective buyer of a house on a large block of land, and were negotiating with the agent and they were saying ‘there is a huge land component that you are getting, bring up your price a little’, you can say ‘statistically, sales show that the size of the land plays an insignificant part in the price of the property… intrinsically, people pay for bedrooms and bathrooms (and to a lesser extent car parking spaces)’.

    Conversely, if you were a prospective seller of a house on a small block of land and had received an offer with the agent saying ‘they put in an offer which is low, but they are saying that the land size is a postage stamp and it is justified’, you can say ‘statistically, there is a small land component, but it’s land close to the CBD and whether it’s 100sqm or 200sqm it doesn’t really matter’ [probably because it's exceedingly difficult to redevelop a sub 200sqm block into (say) dual occupancy etc].
     
    Thorazine and Lindsay_W like this.
  13. significance

    significance Well-Known Member

    Joined:
    6th Sep, 2019
    Posts:
    164
    Location:
    Queensland
    Except that your model doesn't allow for correlated covariates. Number of bedrooms and land size are highly correlated. If people are paying for more land, that usually means they'll be getting more bedrooms but your model is not capable of distinguishing between the two effects. It also looks as though you are giving us fit statistics from the training data set and haven't actually tested the model with an independent data set.
     
  14. Trainee

    Trainee Well-Known Member

    Joined:
    24th May, 2017
    Posts:
    6,043
    Location:
    Australia
    How does the model beat just going to opens, researching past sales and understanding the micromarket?
     
    thatbum and Lindsay_W like this.
  15. significance

    significance Well-Known Member

    Joined:
    6th Sep, 2019
    Posts:
    164
    Location:
    Queensland
    A good model will always beat gut feeling. My husband has been in a rugby tipping comp with his friends for the past few years. He's a keen rugby follower and pays attention to what is happening from week to week. I don't follow rugby at all. Last year, I got him to give me the results of the previous year's games, which teams were home vs away and what he'd predicted for each game (result and margin). I put it together in a random forest model and used it to analyse his tipping and then to make predictions for the next season. Results:
    1) I was able to show him that he was under-estimating the home advantage, which he used to improve his own rank in that year's tipping competition;
    2) Despite no interest in rugby, my tips were at the top of the ranking table for his group of 12 rugby-watching friends for most of the season (though it dropped off a little towards the end).

    A good model will help you to hone your gut feeling and will also help you to take emotional reactions out of the equation.
     
    Thorazine likes this.
  16. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney
    Broadly, there are two separate types of properties, and I will discuss each separately:
    * Apartments - I agree, generally, floor space might be expected to correlate with number of bedrooms. You cannot have a 5 bedroom 40sqm apartment, and it would be highly implausible to have a 1 bedroom 200sqm. Generally, a studio might be <40sqm, a 1 bedder around 50sqm, 2 bedder around 80sqm etc. From the sample that I have, the r squared between area and number of bedrooms is 0.73. This results in a VIF of 3.7 (1/(1-0.73)). VIF thresholds are subjective, but from what I have read, a VIF of <5 generally indicates that collinearity is not severe enough to be a problem. I have a theory for how I could resolve this issue if I needed to: using a simple but intuitive equation, I instantly reduced the r squared from 0.73 to 0.58.
    * Houses - This is less of an issue because a house may not cover the full land plot and is dependent on the number of levels. I have tested two different markets. In a regional market, the r squared of the bedrooms and land area is 0.21. In a metropolitan area, the r squared was 0.44. My understanding is that this does not point to an issue of collinearity.
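
For reference, the VIF arithmetic can be checked with a few lines. This sketch treats the figures quoted in the post as r squared values, which is what the formula VIF = 1/(1 - R^2) expects; the labels and function name are made up for illustration.

```python
def vif(r_squared):
    """Variance inflation factor for a predictor whose regression on the
    other predictors has the given R^2."""
    return 1.0 / (1.0 - r_squared)


# R^2 values quoted in the post.
for label, r2 in [
    ("apartments, area vs bedrooms", 0.73),
    ("apartments, after the adjustment", 0.58),
    ("houses, regional", 0.21),
    ("houses, metropolitan", 0.44),
]:
    print(f"{label}: R^2 = {r2:.2f}, VIF = {vif(r2):.2f}")
```

Two caveats: if 0.73 were a correlation coefficient rather than an r squared, the r squared would be 0.73 squared (about 0.53) and the VIF closer to 2.1; and with four predictors, the R^2 that belongs in the formula comes from regressing each predictor on all of the others, not just on one of them.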
     
  17. significance

    significance Well-Known Member

    Joined:
    6th Sep, 2019
    Posts:
    164
    Location:
    Queensland
    A VIF of 3.7 means that your standard error is about 92% higher than it would be if the variables were uncorrelated (the inflation factor is the square root of the VIF, about 1.92). Work out your confidence intervals and increase the margins by 92% (and then by whatever you need to account for the correlation between number of bathrooms and number of bedrooms) if you are going to use this approach.

    In my field, an r squared of 0.44 is considered very high, but this isn't my field. I'd still treat this model with caution and would prefer an approach more suited to use with correlated variables.
     
  18. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney

    Thanks for your feedback. I appreciate it.
     
  19. Gestalt

    Gestalt Well-Known Member

    Joined:
    20th May, 2018
    Posts:
    81
    Location:
    Brisbane
    Congratulations on producing this model. It looks like you’ve put a fair bit of effort into it.

    Just because a model isn’t perfect doesn’t mean it isn’t useful. Determining the right price for a property will almost always involve the application of judgment and other soft skills (less so for units vs houses).

    I’d never formulate a purchase price solely based on a model. But this seems as good a starting point as any.

    If nothing else, it provides a helpful sanity check. If the asking price or range is way out of whack with the model’s price, it will at least impose the intellectual discipline of forcing you to consider what factors might explain the variance.
     
    significance likes this.
  20. Yamas

    Yamas Active Member

    Joined:
    14th Oct, 2018
    Posts:
    30
    Location:
    Sydney
    Thanks :)

    Broadly, I think everyone knows that a model is never going to be perfect (it may not even be good). Indeed, with some of these models, the inputs only account for around 60% of the variance in prices. As you note, a good sanity check.