## Tomorrow’s Watchlist (July 16th)

Long List:

1. $GOT.V (Goliath Resources Limited)

Ascending triangle plus a bullish cross. Lots of chatter around it too.

2. GTT.V (GT Gold Corp)

Bullish cross… bullish crosses everywhere…

Short List (plays for short-selling):

1. TGL.TO

Bulkowski’s bump-and-run reversal pattern… way above the upper Bollinger Band, and RSI at a peak level.

2. LI.V

Perhaps there is a gap to be filled?

I am excited about Monday…Let us make some money…

## The Beginning of My Trading Journey

This will be the first post of my trading journey.

First of all, let me tell you something about my trading experience:

I started trading in early 2017, when the legalization of marijuana in Canada was underway and blockchain technology and cryptocurrencies became popular buzzwords. Making money had never been as easy in my life as it was in 2017. Everyone was euphoric, and I, a newbie in stock trading, also thought the market would stay bullish for a while. At some point in 2017, I made tons of money and thought that if the market continued like that, I would become a millionaire soon. (I even forgot about updating this site, of course.)

Of course, shit happened.

After mid-January 2018, the whole stock market, especially the Venture and CSE markets here, went into bearish mode… A huge amount of my unrealized gains vanished. Bitcoin plummeting, a bearish cannabis-sector outlook, and trade-war fears… all were yelling one thing: “The market is turning bearish!”

And now, here I am, typing this blog post and still living an ordinary life. After experiencing such a dramatic cycle, I made up my mind to learn more and become a great trader…!

I hope all the readers here can witness each step I take on this journey from now on. I will share my thoughts, my technical analysis (TA), and some econometric trading models in my posts. Let the journey begin!

P.S. I find that econometric models and simple machine-learning models cannot really provide good predictions of stock movements. They work reasonably well when the trend is obvious, but when the overall trend changes, or in a sideways market, they do not work well.

## Plotting Choropleth Map using STATA

I feel very sorry for not updating this site for such a long time. I have had too much work to do lately, and over the past year I have spent some time developing a new hobby: trading in the capital markets. I will talk more about trading and compare different trading strategies in the future 🙂

Today I will show how to plot maps using Stata. I am definitely not a GIS guru, but for people who specialize in urban or real estate economics, plotting maps in Stata can be very handy, especially for empirical work. All you need is to install two ado packages in Stata: spmap and shp2dta.

Type “ssc install spmap” and “ssc install shp2dta” in Stata to install them.

## Data

For illustration purposes, I will use the property tax report data from the City of Vancouver Open Data Catalogue; the download link is below. We are going to make a map of average land value by FSA in Vancouver. FSA stands for forward sortation area, which is simply the first three characters of the postal code.

https://data.vancouver.ca/datacatalogue/propertyTax.htm

We use the year-2017 data (download the CSV file). The dataset includes many variables and hundreds of thousands of observations; we focus on a single variable: current_land_value

The next file we need is the shapefile, which is downloadable from the Statistics Canada website: http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/bound-limit-2011-eng.cfm

Download the one labeled “Forward sortation areas”.

## Plot Our Map

Put both files into the same folder. In Stata, set the working directory to that folder using the cd command.

The next step is to use the following command:

shp2dta using gfsa000b11a_e, database(map) coordinates(coord)

The command above converts the shapefile into Stata data files: the coordinates file is named “coord” and the database file is named “map”.

Next, import the CSV file into Stata and create the FSA identifier variable by typing:

gen CFSAUID = substr(property_postal_code,1,3)

We name it CFSAUID instead of FSA because that is the variable name in our map data. In order to merge the two datasets on FSA, the key variable must have the same name in both.

Next, we compute the average land value for each FSA by collapsing the data to the FSA level:

collapse current_land_value, by(CFSAUID)

Next, we merge the two datasets by typing:

merge 1:1 CFSAUID using map.dta
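For readers who don’t use Stata, the substr → collapse → merge logic above can be illustrated in plain Python. The postal codes, land values, and the tiny FSA-to-polygon lookup below are all made up for illustration; they are not the City of Vancouver data.

```python
# Illustrative Python sketch of the Stata substr/collapse/merge steps,
# using a few made-up property records (hypothetical data).
from collections import defaultdict

records = [
    {"property_postal_code": "V5K 0A1", "current_land_value": 900_000},
    {"property_postal_code": "V5K 0A4", "current_land_value": 1_100_000},
    {"property_postal_code": "V6P 1B2", "current_land_value": 2_000_000},
]

# gen CFSAUID = substr(property_postal_code, 1, 3)
for r in records:
    r["CFSAUID"] = r["property_postal_code"][:3]

# collapse current_land_value, by(CFSAUID)  -> mean land value per FSA
totals, counts = defaultdict(float), defaultdict(int)
for r in records:
    totals[r["CFSAUID"]] += r["current_land_value"]
    counts[r["CFSAUID"]] += 1
avg_by_fsa = {fsa: totals[fsa] / counts[fsa] for fsa in totals}

# merge 1:1 CFSAUID using map.dta  (here: a toy FSA -> polygon-id lookup)
map_dta = {"V5K": 1, "V6P": 2}
merged = {fsa: (map_dta.get(fsa), avg_by_fsa[fsa]) for fsa in avg_by_fsa}
print(merged)  # {'V5K': (1, 1000000.0), 'V6P': (2, 2000000.0)}
```

The point is just that the merge key (CFSAUID) must exist, with the same name, on both sides before the join.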

Next, we use the spmap command:

spmap current_land_value using coord if CFSAUID != "V6T", id(_ID) fcolor(Blues) title("Average land price by FSA in Vancouver in 2017")

We exclude the V6T area for graphical purposes (or you can include it, but you will see what I mean if you do…).

Below is the map:

The map looks pretty nice, right? As you can see, the east side and the west side of the City of Vancouver look drastically different in terms of land value!

In the next post, I will talk more about trading and, possibly, part 2 of Markov-switching regressions. Stay tuned!

## Markov-Switching Regressions Part 1 – Manual R Code for the basic quasi-maximum likelihood (univariate case)

I am sorry that I haven’t posted any new blog entries for the last couple of months. I have been busy with my work and a side project that I do not want to discuss openly. Anyway, today I would like to talk about Markov-switching regression.

When we run time-series regressions, our estimated coefficients may not be stable and may be subject to regime shifts (this can be checked with CUSUM-of-squares and other breakpoint tests). To model this kind of regime shift, econometricians have developed several types of models. One of them is the Markov-switching regression, which has gained recognition ever since the influential work of Hamilton (1989). Nowadays we have many statistical packages to run univariate or multivariate Markov-switching regressions, yet I have never seen anyone post the manual code for them. So, let us write the code together! 🙂

The code I am going to present uses quasi-maximum-likelihood estimation for a dynamic two-state Markov-switching AR(1) model:

Let $Y_t$ be the observed variables. Assume our model has the following form:

$Y_t = \alpha_s+ \beta_s Y_{t-1}+\epsilon_t$

where $s = 0,1$ indexes the two states; $\alpha_s$ and $\beta_s$ shift when the state changes.

First, assuming the errors are normal, and letting $\Theta$ collect all the parameters (the regime-specific intercepts and slopes, $\sigma_\epsilon$, and the transition probabilities), we can write the density of $Y_t$ conditional on $Y_{t-1}$ and $s_t=i$ as:

$f(Y_t|s_t=i, Y_{t-1}; \Theta)= \frac{1}{\sqrt{2\pi\sigma_\epsilon^2}}\exp\left(\frac{-\epsilon_{i,t}^2}{2\sigma_\epsilon^2}\right)$

where $\epsilon_{i,t} = Y_t - \alpha_i - \beta_i Y_{t-1}$ is the residual under regime $i$.

Second, we have: $f(Y_t|Y_{t-1}; \Theta)=P(s_t = 0|Y_{t-1};\Theta) f(Y_t|s_t=0,Y_{t-1};\Theta)+P(s_t=1|Y_{t-1};\Theta)f(Y_t|s_t=1,Y_{t-1};\Theta)$

Third, we have: $P(s_t=i|Y_t; \Theta)=\frac{P(s_t=i|Y_{t-1};\Theta)f(Y_t|s_t=i, Y_{t-1}; \Theta)}{f(Y_t|Y_{t-1}; \Theta)}$

The fourth equation is: $P(s_{t+1}=i| Y_t; \Theta) = p_{0i}P(s_t=0|Y_t; \Theta)+p_{1i}P(s_t=1|Y_t;\Theta)$

where $p_{0i}$ and $p_{1i}$ are the transition probabilities from the previous state to the next. For further details of the Markov-switching model, please refer to Hamilton (1989, 1994, 2005). Here I only briefly outline the key equations needed to set up the likelihood.

Now, given an initial value of $P(s_t = 0|Y_{t-1};\Theta)$, we iterate the four equations above to accumulate the log-likelihood. The quasi-log-likelihood is then:

$L(\Theta)= \frac{1}{T}\sum_{t=1}^{T}\ln f(Y_t|Y_{t-1};\Theta)$

To maximize the equation above, we can use the optimx package or the built-in nlm function in R. Both minimize by default, so we will return the negative average log-likelihood.

Here I post my R code for setting up the quasi-likelihood function for the example above:


likeli_fun <- function(param){

y <- data[,1]        # your dataset: put the dependent variable in column 1
alpha0 <- param[1]   # intercept in regime 0
alpha1 <- param[2]   # intercept in regime 1
var <- param[3]^2    # variance of the error term
p00 <- param[4]      # transition probability P(s_{t+1}=0 | s_t=0)
p11 <- param[5]      # transition probability P(s_{t+1}=1 | s_t=1)
beta0 <- param[6]    # AR(1) slope in regime 0
beta1 <- param[7]    # AR(1) slope in regime 1

# Initial values for prob0 and prob1: the ergodic state probabilities,
# computed from the A matrix in Hamilton (1994)
A <- matrix(c(1-p00,-1+p11,1,-1+p00,1-p11,1),nrow=3,ncol=2)
inv_A <- solve(t(A)%*%A)
sol_A <- inv_A%*%t(A)   # Why do we do this? Please check Hamilton (1994)

prob0 <- sol_A[1,3]     # recommended initial values by Hamilton (1994)
prob1 <- sol_A[2,3]

# Iterate the filter to accumulate the log-likelihood
T <- length(y)
likv <- 0
j_iter <- 2
while (j_iter <= T) {

  eps0 <- y[j_iter]-alpha0-beta0*y[j_iter-1]
  eps1 <- y[j_iter]-alpha1-beta1*y[j_iter-1]
  f0 <- (1/sqrt(2*pi*var))*exp(-0.5*eps0*eps0/var)   # equation 1 for regime 0
  f1 <- (1/sqrt(2*pi*var))*exp(-0.5*eps1*eps1/var)   # equation 1 for regime 1

  con_fz <- prob0*f0+prob1*f1          # equation 2

  fil_prob0 <- (prob0*f0)/con_fz       # equation 3
  fil_prob1 <- (prob1*f1)/con_fz

  prob0 <- p00*fil_prob0+(1-p11)*fil_prob1  # equation 4 (note: 1-p11, not 1-p00)
  prob1 <- (1-p00)*fil_prob0+p11*fil_prob1

  likv <- likv-log(con_fz)   # accumulate the NEGATIVE log-likelihood
  j_iter <- j_iter+1         # next iteration
}
return(likv/T)   # nlm/optimx minimize, so we return -L(Theta)
}



The last step is simply to fire up nlm or optimx in R on the function above to obtain the parameter estimates.
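For readers who want to sanity-check the recursion outside R, here is a self-contained Python sketch of the same four-equation filter, returning the negative average log-likelihood. The AR(1) series and the parameter values are made up for illustration (the initial state probabilities use the closed form of the ergodic distribution, which is what the A-matrix calculation delivers):

```python
# A Python sketch of the two-state Hamilton filter described above,
# returning the negative average log-likelihood (what nlm would minimize).
# Data and parameters below are toy values, not the post's dataset.
import math
import random

def neg_avg_loglik(param, y):
    a0, a1, sigma, p00, p11, b0, b1 = param
    var = sigma ** 2
    # ergodic initial state probabilities
    prob0 = (1 - p11) / (2 - p00 - p11)
    prob1 = 1 - prob0
    likv = 0.0
    for t in range(1, len(y)):
        eps0 = y[t] - a0 - b0 * y[t - 1]      # regime-0 residual
        eps1 = y[t] - a1 - b1 * y[t - 1]      # regime-1 residual
        f0 = math.exp(-0.5 * eps0 * eps0 / var) / math.sqrt(2 * math.pi * var)
        f1 = math.exp(-0.5 * eps1 * eps1 / var) / math.sqrt(2 * math.pi * var)
        con_fz = prob0 * f0 + prob1 * f1      # equation 2
        fil0 = prob0 * f0 / con_fz            # equation 3
        fil1 = prob1 * f1 / con_fz
        prob0 = p00 * fil0 + (1 - p11) * fil1 # equation 4
        prob1 = (1 - p00) * fil0 + p11 * fil1
        likv -= math.log(con_fz)
    return likv / len(y)

# toy AR(1) series with no actual regime shift, just to exercise the filter
random.seed(1)
y = [0.0]
for _ in range(200):
    y.append(0.5 * y[-1] + random.gauss(0, 1))

nll = neg_avg_loglik((0.0, 0.5, 1.0, 0.9, 0.9, 0.5, 0.3), y)
print(round(nll, 3))
```

Feeding this function to a numerical minimizer over the seven parameters is the direct analogue of calling nlm on the R version.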

In my next post, let us start using the R packages and EViews commands to do some interesting analysis with Markov-switching regressions.

## Does a bubble exist in Toronto’s Housing Market?

Everyone keeps talking about the 33% year-over-year house price growth in Toronto in March… Toronto’s housing market is definitely on fire! But is there a bubble? Well, let us do some econometricks! 🙂

## Data

Most of the data I will use for today’s study come from Statistics Canada’s CANSIM database. Since city-level data are not available, I use Ontario’s monthly economic data from 1997M1 to 2016M12 as proxies for Toronto’s economic variables. (Note that this may bias our estimates, but what else can we do? Canada has a huge lack of data! Even though the Conference Board of Canada has time-series economic data for Toronto, I have some doubts about the accuracy of their data. Plus, as a poor econometrickster, I don’t have the money for a Conference Board subscription.)

For the house price data of Toronto, I use the MLS HPI from the CREA website: http://www.crea.ca/housing-market-stats/mls-home-price-index/hpi-tool/

What I will do now is replicate the method used in this paper by three Chinese economists: http://file.scirp.org/pdf/JSSM20090100006_39362604.pdf

## Let’s run some regressions

The first regression we run is the following:

$ln(P_t) = \beta_0 +\beta_1ln(Income_t)+\beta_2ln(Rate_t)+\beta_3ln(P_{t-1})+\epsilon_t$

where $Income_t$ denotes real disposable income per capita, which is not available, so I use monthly wage data from Statistics Canada, deflated by the CPI. $Rate_t$ denotes the real interest rate; I use the bank rate minus the inflation rate. $P_t$ is the house price level, for which I use the CREA data above. Below is the output from EViews:

Well… the results are not that great. None of the variables are significant except the AR(1) lag… This can be partially explained by the fact that all the economic variables I use are proxies, which biases the estimates. However, let us move on. The coefficient we are most interested in is the autoregressive one, which is 0.986 in our case. Now we estimate the real growth momentum of the house price, $h_t$:

$h_t = (P_t/P_{t-1})^{0.986}-1$

The expression above gives the estimated pure real price growth.
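As a quick numerical check of the formula, with made-up prices and the estimated coefficient 0.986, a 10% raw monthly price gain translates into a slightly smaller “pure” growth figure:

```python
# h_t = (P_t / P_{t-1})^0.986 - 1, with hypothetical prices (not CREA data)
p_prev, p_now = 100.0, 110.0
h_t = (p_now / p_prev) ** 0.986 - 1
print(round(h_t, 4))  # prints 0.0985, slightly below the raw 10% growth
```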

Then we run the following simple AR(2) model with a constant intercept:

$h_t = \alpha_0+\alpha_1h_{t-1}+\alpha_2h_{t-2}+v_t$

The coefficient we are most interested in is $\alpha_1$: if $\alpha_1 > 0.4$, it is said to be a strong warning sign of a speculative bubble.

Now, instead of using the whole sample, I am going to use rolling regressions to compute the monthly speculative-bubble index $\alpha_1$ for each month in EViews, with a rolling window of 32 observations.
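The mechanics of that rolling step can be sketched in Python: for each 32-observation window, regress $h_t$ on a constant, $h_{t-1}$, and $h_{t-2}$ by OLS and record $\alpha_1$. The $h_t$ series below is simulated and the OLS solver is a minimal hand-rolled one, so this only illustrates the procedure, not the EViews estimates:

```python
# Rolling AR(2) bubble index: per-window OLS of h_t on (1, h_{t-1}, h_{t-2}).
# The h series is simulated toy data, not the Toronto series.
import random

def ols3(X, y):
    """Solve the normal equations (X'X) b = X'y for 3 regressors."""
    k = 3
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    M = [XtX[i] + [Xty[i]] for i in range(k)]
    for c in range(k):                 # forward elimination with pivoting
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    b = [0.0] * k
    for c in range(k - 1, -1, -1):     # back substitution
        b[c] = (M[c][k] - sum(M[c][j] * b[j] for j in range(c + 1, k))) / M[c][c]
    return b

def rolling_alpha1(h, window=32):
    out = []
    for s in range(len(h) - window + 1):
        w = h[s:s + window]
        X = [[1.0, w[t - 1], w[t - 2]] for t in range(2, window)]
        y = [w[t] for t in range(2, window)]
        out.append(ols3(X, y)[1])      # alpha_1 for this window
    return out

random.seed(42)
h = [0.0, 0.0]
for _ in range(120):                   # toy AR(2) with alpha1=0.6, alpha2=0.1
    h.append(0.001 + 0.6 * h[-1] + 0.1 * h[-2] + random.gauss(0, 0.01))

idx = rolling_alpha1(h)
print(sum(a > 0.4 for a in idx), "of", len(idx), "windows flag a bubble")
```

With overlapping 32-observation windows, each window yields 30 usable observations for 3 parameters, so individual $\alpha_1$ estimates are noisy; that is one reason the window size matters.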

Below is the bubble index estimated using the rolling regressions:

(Note that the estimates could be subject to upward bias, because the proxies for the fundamental economic variables fail to capture Toronto’s fundamentals. The size of the rolling window also plays a role in the estimates.)

When the bubble does burst, it will be very nasty for sure.

## Cointegration Tests that allow for structural breaks?

Structural breaks can sometimes be a huge hassle to deal with when you are trying to investigate the long run relationship between the variables by running standard cointegration tests such as Engle-Granger or the standard Johansen test.

Now, when structural breaks are present, one solution is the residual-based cointegration test proposed by Gregory and Hansen (1996). The relevant R programs and an example can be found on Bruce E. Hansen’s website by clicking here. Note that p is the number of variables and r is the cointegrating rank being tested.

The test proposed by Johansen et al. (2000) is another solution. The relevant R program for computing the critical values can be found on Dave Giles’ website here. (In the program, remember to set the correct breakpoint proportion and the value of q!)

The Johansen et al. (2000) test can be decomposed into the following steps:

Step 1: Identify the structural breaks.

Step 2: Incorporate the date dummy (D2), the trend-dummy interaction term (D2*@trend), and the shift-indicator dummy (I2) as exogenous variables in the original VAR. In Dave Giles’ example, the variables he adds are D2(-2), I2(-1), trend*D2(-2), and I2.

Step 3: Construct the usual Johansen trace statistics.

(How to calculate the trace statistics? See this paper here )
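For step 3, given the eigenvalues $\lambda_i$ from the reduced-rank regression, the trace statistic for cointegrating rank $r$ is $-T\sum_{i=r+1}^{p}\ln(1-\lambda_i)$. A minimal sketch, with made-up eigenvalues (not from any real dataset):

```python
# Johansen trace statistic from the eigenvalues of the reduced-rank
# regression: trace(r) = -T * sum of ln(1 - lambda_i) over i = r+1, ..., p.
import math

def trace_stat(eigenvalues, T, r):
    lams = sorted(eigenvalues, reverse=True)
    return -T * sum(math.log(1 - lam) for lam in lams[r:])

lams = [0.30, 0.10]   # hypothetical eigenvalues, p = 2 variables
T = 100
print(round(trace_stat(lams, T, 0), 2))  # test of r = 0: uses both eigenvalues
print(round(trace_stat(lams, T, 1), 2))  # test of r = 1: uses only the smallest
```

The resulting statistics are then compared against the break-adjusted critical values from the Johansen et al. (2000) response-surface program mentioned above.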

The asymptotic critical values depend on the proportion of the way through the sample at which the break occurs (λ = 0.44 in our case), and on (p − r), where p is the number of variables under test (p = 2 here) and r is the cointegrating rank being tested; so, for us, r = 0, 1. Unlike the Gregory-Hansen (1996) test, the Johansen et al. (2000) test can be modified to allow for two structural breaks.

I downloaded the EViews workfile, which is available here.

Now Time for some sleep before tomorrow’s work….Zzzzzzz