Markov-Switching Regressions Part 1 – Manual R Code for the basic quasi-maximum likelihood (univariate case)

I am sorry that I haven’t posted any new blog entries for the last couple months. I have been busy with my work and my side project which I do not want to discuss openly. Anyway,  I would like to talk about Markov-switching regression today.

When we run time series regressions,  it is usually the case where our estimated coefficients  may not be stable and be subject to regime shifts. (This can be checked by CUSUM squared tests and other break point tests).  To model this kind of regime shift , several types of econometric models were developed by Econometricians. One of them is the Markov-switching type regressions, which has gained recognition ever since the influential work by Hamilton (1989) . Nowadays, we have many statistical packages to run markov switching univariate or multivariate regressions. However,  I never found anyone trying to post the manual code for the markov-switching regressions. So, let us write the code together! 🙂

The code I am going to present is the Quasi-likelihood Estimation method for a dynamic 2-state markov-switchhing AR(1) model:

Let Y_t be the observed variables. Assume our model has the following form:

 Y_t = \alpha_s+ \beta_s Y_{t-1}+\epsilon_t

where s= 0,1 signals 2 different states. \alpha_s and \beta_s are subject to regime shift depending on the state changes

First, Assuming asymptotic normality, and let \Theta = (\alpha_s, \beta_s), we can write the density of Y_t conditional on Y_{t-1} and s_t=i  as:

f(Y_t|s_t, Y_{t-1}; \Theta)= \frac{1}{\sqrt{2\pi\sigma_\epsilon^2}}exp{\frac{-\epsilon_{st}^2}{2\sigma_\epsilon}}

Second, we have: f(Y_t|Y_{t-1}; \Theta)=P(s_t = 0|Y_{t-1};\Theta) f(Y_t|s_t=0,Y_{t-1};\Theta)+P(s_t=1|Y_{t-1};\Theta)f(Y_t|s_t=1,Y_{t-1};\Theta)

Third, we have: P(s_t=i|Y_t; \Theta)=\frac{P(s_t=i|Y_{t-1};\Theta)f(Y_t|s_t=i; \Theta)}{f(Y_t|Y_{t-1}; \Theta)}

The fourth equation we have is: P(s_{t+1}=i| Y_t; \Theta) = p_{0i}P(s_t=0|Y_t; \Theta)+p_{1i}P(s_t=i|Y_t;\Theta)

where p_{0i} and p_{1i} are the transition probabilities from previous state to the next state. For further details of the Markov-switching model, please refer to most papers by Hamilton (1989,1994, 2005) etc.. Here, I only briefly outlined the key equations to set up the likelihood equation

Now if we are given initial values of P(s_t = 0|Y_{t-1};\Theta), we iterate the 4 equations to obtain the sum of the likelihood. Then the quasi-likelihood equation is:

L(\Theta)= \frac{1}{T}\sum_{t=1}^{T}{lnf(y_t|Y_{t-1};\Theta)}

To maximize the equation above, we can use the optimx package in R or the nlm function in R

Here I post my R code for setting up the quasi-likelihood function for the example above:

likeli_fun <-function(param){

y <- data[,1] #Here, use your own dataset, dependent variable, put it in column 1
alpha<-param[1] #Intercept in regime 0
alpha1 <- param[2] #intercept in regime 1
var <-param[3]^2 # variance of the error
p00 <- param[4] #The transition probability
p11 <-param[5]
beta <-param[6] #AR(1) slope in regime 0
beta1 <-param[7] #AR(1) slope in regime 1

#Set initial value for prob1 and prob0:
A <- matrix(c(1-p00,-1+p11,1,-1+p00,1-p11,1),nrow=3,ncol=2) #This is the A matrix mentioned in Hamilton 1994

inv_A <-solve(t(A)%*%A)
sol_A <- inv_A%*%t(A)  #Why we do this? Please check out Hamilton (1994)

prob0<-sol_A[1,3] #Recommended initial values by Hamilton (1994)

#Start iterating to get the likelihood
T <- length(y)
likv <-0
j_iter <-2
while (j_iter <= N) {

eps0 <- y[j_iter]-alpha-beta*y[j_iter-1]
eps1 <- y[j_iter]-alpha1-beta1*y[j_iter-1]
f0 <- (1/sqrt(2*pi*var))*exp(-0.5*eps0*eps0/var) #equation 1 for regime0
f1 <- (1/sqrt(2*pi*var))*exp(-0.5*eps1*eps1/var) #equation 1 for regime1

con_fz <- prob0*f0+prob1*f1  #Equation 2

fil_prob0 <- (prob0*f0)/con_fz   #Equation 3
fil_prob1 <- (prob1*f1)/con_fz

prob0 <- p00*fil_prob0+(1-p00)*fil_prob1 #Equation 4
prob1 <- (1-p11)*fil_prob0+p11*fil_prob1
lik <- log(con_fz)   # Take the log
likv <- likv-lik     # add to the last likelihood
j_iter <- j_iter+1   #Start another round of iteration

Next step, all you need to do in R is to fire up nlm or optimx in R to maximize the likelihood to compute the estimates of the parameters.

In my next post, let us start using the packages in R and command in Eviews to do some interesting analysis using Markov-switching regressions.

Does bubble exist in Toronto’s Housing Market?

Is it a bubble? How long will it last?

Everyone keeps talking about 33% year-over-year house price growth in March in Toronto….. Toronto’s housing market is definitely on fire! However, is there a bubble? Well, let us do some econometricks! 🙂


Most of the data I will use for today’s study come from the Statistics Canada’s CANSIM database. Since city-level data are not available. I use Ontario’s monthly economic data ranging from 1997M1 to 2016M12 as a proxies for Toronto’s Economic variables. (Note that this may create some bias in our estimates but what else can we do? Canada has a huge lack of data! Even though Conference Board Canada has time series economic data for Toronto, I actually have some doubt in the accuracy of their data. Plus,  as a poor Econometrickster, I  don’t have money to buy the subscription data from Conference Board Canada)

For the house price data of Toronto, I use the MLS HPI from the CREA website:

Now what I will do is to replicate the method used in this paper written by 3 Chinese economists from:

Let’s run some regressions

The first regression we run is the following, here’s the eviews output below:

ln(P_t) = \beta_0 +\beta_1ln(Income_t)+\beta_2ln(Rate_t)+\beta_3ln(P_{t-1})+\epsilon_t

Where Income_t denotes real disposable income er capita, which is not available. We gotta use monthly wage data from Statistics Canada, deflated by the CPI index. Rate_t denotes the real interest rate. I use the bank rate (minus the inflation rate) to get the real interest rate… P_t is the house price level, which I use the house price data from CREA. Below is the output from EViews:


Well….Results are not that great. All the variables are not significant except the AR(1) lag… This can be partially explained by the fact that all the economic variables I use are proxy variables, which introduces bias in the estimates. However, let us move on. The coefficient we are mostly interested in is the autoregressive coefficient, which is 0.986 in our case. Now we need to estimate the real growth momentum of house price h_t:

h_t = (P_t/P_{t-1})^{0.986}-1

pic 2

Above shows the estimated pure real price growth

Then we run the following simple AR model of order 2 with constant intercept:

h_t = \alpha_0+\alpha_1h_{t-1}+\alpha_2h_{t-2}+v_t

The coefficient we are mostly interested in is \alpha_1. If \alpha_1> 0.4, it is said to be a huge warning sign of speculative bubble

Now, instead of using the whole sample size,  I am gonna use the rolling regression methods to compute the monthly growth speculative bubble index  \alpha_1 for each month in Eviews. I set the rolling window sample size to be 32 observations per roll and here are the estimates from the last roll estimation:


Well, 0.491417 is way above 0.4, isn’t it?….That means Toronto’s housing market is probably already in a bubble state!! HOWEVER, this estimate I got at home is different from the one I got at work. (Data are more accurate but since the data I use at work are not public, I am not allowed to utilize them outside of work…..) The estimate I got by using more accurate data at my workplace is around 0.38 for the latest quarter (yes, I use quarterly data at my work place), which is also very close to the warning threshold.  Anyways, the conclusion is that the housing market in the GTA area is either on the verge or already in a bubble state!

Below shows the estimated bubble index I estimated using the rolling regressions :


(Note that estimates could be subject to upward bias because the proxies for fundamental economic variables fail to capture Toronto’s fundamentals. Also the size of the rolling window also plays a role in the estimates)

On one hand, we keep talking about demand side problem in Toronto but supply side is also a big problem for Toronto. Why people love single detached homes so much in this country? Canadians need to realize that land is limited resources and building more high-rise condo buildings is one solution to satisfy the demand side.

Speculative demand is a driving force for sure but the inelastic supply response is also another cause and the incompetent Ontario and Toronto governments should be responsible for this mess.

History suggests that Toronto’s housing market is very persistent. Actually my cointegration analysis at work also suggests the persistence of GTA’s housing market.

That is to say, whether it is in booms or busts, the state will last for a very long period of time. Therefore, if we base our evidence from the history, we can argue that the bubble will be persistent and a sudden burst of the bubble will not be likely in 1 or 2 years of time because there is still high degrees of growth momentum.

However, when the bubble does burst, it will be very nasty for sure.

Cointegration Tests that allow for structural breaks?

Structural breaks can sometimes be a huge hassle to deal with when you are trying to investigate the long run relationship between the variables by running standard cointegration tests such as Engle-Granger or the standard Johansen test.

Now, when the structural breaks are present, one solution is the residual-based cointegration test proposed by Gregory and Hansen (1996).   The relevant R programs and example can be found on website of Bruce E. Hansen by clicking here. Note that p is number of variables and r is number of cointegrating rank being tested

The test proposed by Johansen et al. (2000) appears to be another solution. The relevant R program for computing the critical values can be found at Dave Giles’ website here. (In the program, remember to set the correct breakpoint proportion and the value of q!!)

To do the Johansen et al. (2000) test, it can be decomposed into the following steps:

Step1: Identify the structural breaks.

Step2:  incorporate the date dummy (D2), trend*dummy interaction term(D2*@trend), as well as the shift indicator dummy(I2)  as exogenous variables into the original VAR. In Dave Giles example,  the variables he adds are  D2(-2), I2(-1), trend*D2(-2), and I2.

step 3: construct the usual Johansen trace statistics

(How to calculate the trace statistics? See this paper here )

The asymptotic critical values depend on the proportion of the way through the sample that the break occurs (λ = 0.44 in our case); and on (pr), where p is the number of variables under test p = 2, here), and r is the cointegrating rank being tested. So, for us,  r  = 0, 1.  Unlike Gregory-Hansen (1996) test, the Johansen et al. (2000) test can be modified to allow for two structural breaks.


I downloaded the EViews workfile,  which is available here


Now Time for some sleep before tomorrow’s work….Zzzzzzz