Beating the Naive Model in the Stock Market

Calvin Ku
May 22, 2017 · 5 min read

Recently, while studying for Udacity's Self-Driving Car Nanodegree, I came across something really amazing: the Kalman filter. It is widely used in self-driving cars to deal with the localization problem.

The idea of the Kalman filter is simple. We assume our current location follows a Gaussian distribution with mean µ and standard deviation σ. Each time we take a measurement or we move, we update this distribution. For example, if we looked up and saw a pyramid, the chance of us being in Egypt would increase. And if we moved toward the pyramid, the chance of us getting closer to it would also increase.

Essentially, the Kalman filter is just Bayes' rule plus the law of total probability. The good thing about the Kalman filter is that it lets us deal with uncertainty with ease. When we looked up and saw a pyramid, there's a chance that the pyramid was just a picture and we were too drunk to notice. Likewise, when we moved toward the pyramid, we could have been too drunk to realize we were actually walking backwards. When we update the probability distribution of our location, we need to take these chances into account, and this is exactly what the Kalman filter is good at.
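To make this concrete, here is a minimal one-dimensional sketch of the two steps. The function names are mine and this is the textbook Gaussian update, not production code:

    def measure(mu, var, z, z_var):
        # Bayes rule: multiplying two Gaussians gives a sharper belief
        new_mu = (z_var * mu + var * z) / (var + z_var)
        new_var = (var * z_var) / (var + z_var)
        return new_mu, new_var

    def move(mu, var, motion, motion_var):
        # Total probability: moving adds the motion's uncertainty to ours
        return mu + motion, var + motion_var

    # We think we're at 10 (variance 4); a noisy sensor says 12 (variance 4)
    mu, var = measure(10., 4., 12., 4.)  # -> (11.0, 2.0): more certain
    mu, var = move(mu, var, 1., 2.)      # -> (12.0, 4.0): less certain

Notice that measuring always shrinks the variance, while moving always grows it. The full Kalman filter is this same cycle, generalized to vectors and matrices.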

So what does any of this have to do with the stock market?

The thing with the stock market is that people go long on a stock because they believe it is undervalued, and short it if they believe it is overvalued. In other words, people believe there is something called intrinsic value, and that however the price is disturbed, the market will eventually settle around that value.

Isn’t this strikingly similar to the localization problem we just talked about earlier?

If we treat the intrinsic value as the ground truth of where we are (our location), all the price data we see as measurements, and the momentum as our motion prediction, then we can use the Kalman filter to update our belief about the intrinsic value.
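In standard Kalman filter notation, this reads as a linear state-space model, with the hidden state holding the price and its momentum:

$$x_t = F x_{t-1} + w_t, \qquad w_t \sim \mathcal{N}(0, Q)$$
$$z_t = H x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R)$$

The predict step pushes our belief through F; the update step folds each new measurement z_t back in. This is exactly the move/measure cycle from the localization story.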

It’s all talk until the code runs. Let’s see how this idea holds against the naive model.

To test the idea, I use JP Morgan (JPM) stock data from 1983-12-30 to 2016-09-26. For the naive model, I use the last observed Adjusted Close as the prediction for the next Adjusted Close. The result looks like this:

Naive forecast for JPM Adjusted Close

The r² is really high at 99.878%, which comes as no surprise. This part of the chart was selected so that the difference between the data and the predictions is visible.
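For reference, the naive baseline boils down to a one-line shift. Here is a minimal sketch, assuming the adjusted prices are in a pandas Series (the function name is mine):

    import pandas as pd
    from sklearn.metrics import r2_score

    def naive_r2(adj_close: pd.Series) -> float:
        pred = adj_close.shift(1)   # last observed close as the next forecast
        mask = pred.notna()         # the first day has no prediction
        return r2_score(adj_close[mask], pred[mask])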

So how does the Kalman model perform against this?

We can clearly see from the chart that the Kalman model fits the data slightly better than the naive model. But is our gut feeling right about this?

Yes! The r² has improved quite a bit, to 99.914%. Considering how close to 1 we already were, that isn't bad at all.

Here’s my implementation of the Kalman filter in Python.

    import numpy as np

    class Kalman(object):
        def __init__(self, init_price, noise=1):
            self.dt = 1  # time scale
            self.noise = noise

            self.x = np.array([init_price, 0.])
            # State vector: [price, price_rate] (2x1)

            self.P = np.array([[1., 0.], [0., 1.]])
            # Uncertainty covariance matrix (2x2)

            self.F = np.array([[1., self.dt], [0., 1.]])
            # Prediction (state transition) matrix (2x2)

            self.Q = np.array([[noise, 0.], [0., noise]])
            # Unpredictable external factor noise covariance matrix (2x2)

            self.H = np.array([1., 0.])
            # Measurement mapping function (1x2)

            # Sensor noise variances (scalars); set these before calling update()
            self.R_h = None  # High
            self.R_l = None  # Low
            self.R_c = None  # Adjusted Close
            self.R_o = None  # Open

            self.S = None  # Innovation covariance (scalar)
            self.y = None  # Error/innovation (scalar)
            self.K = None  # Kalman gain (2x1)

        def predict(self):
            self.x = np.matmul(self.F, self.x)
            # Predict today's adj close
            self.P = np.matmul(np.matmul(self.F, self.P), self.F.T) + self.Q

        def update(self, measurement, sensor_type):
            self.y = measurement - np.matmul(self.H, self.x)
            # Error between measurement and prediction

            if sensor_type == 'high':
                self.S = np.matmul(np.matmul(self.H, self.P), self.H.T) + self.R_h
            elif sensor_type == 'low':
                self.S = np.matmul(np.matmul(self.H, self.P), self.H.T) + self.R_l
            elif sensor_type == 'close':
                self.S = np.matmul(np.matmul(self.H, self.P), self.H.T) + self.R_c
            else:
                self.S = np.matmul(np.matmul(self.H, self.P), self.H.T) + self.R_o

            self.K = np.matmul(self.P, self.H.T) * (1 / self.S)
            # Calculate Kalman gain (2x1)

            # Update x and P
            self.x = self.x + self.K * self.y
            self.P = np.matmul(np.eye(2) - np.outer(self.K, self.H), self.P)
            # np.outer, not np.matmul: with 1-D arrays, matmul would give a scalar

I use four "measurements" (the High, Low, and Adjusted Close of a given day, plus the Open of the next day) to predict the Adjusted Close of the next day. By the way, I call it Adjusted Close because that's what Yahoo Finance calls it, but all of the values have been adjusted by me, so don't worry. I also use the 6-month standard deviations of the four variables for their measurement noises. And for the motion noise, I just use one dollar. We can always come back and tune these parameters later.
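Putting it together, the forecasting loop might look something like this. It's a sketch under my assumptions: a DataFrame of adjusted OHLC columns, a 126-trading-day window for the 6-month standard deviation, and squared standard deviations as the measurement variances. The exact details of my original run may differ:

    import numpy as np
    import pandas as pd

    def kalman_forecast(df: pd.DataFrame, noise=1.0) -> pd.Series:
        window = 126                      # roughly 6 months of trading days
        stds = df.rolling(window).std()   # rolling measurement noise estimates
        kf = Kalman(df['Close'].iloc[window - 1], noise=noise)
        preds = []
        for t in range(window, len(df) - 1):
            kf.predict()                  # project yesterday's state to today
            kf.R_h = stds['High'].iloc[t] ** 2
            kf.R_l = stds['Low'].iloc[t] ** 2
            kf.R_c = stds['Close'].iloc[t] ** 2
            kf.R_o = stds['Open'].iloc[t] ** 2
            kf.update(df['High'].iloc[t], 'high')
            kf.update(df['Low'].iloc[t], 'low')
            kf.update(df['Close'].iloc[t], 'close')
            kf.update(df['Open'].iloc[t + 1], 'open')  # next day's open
            preds.append(np.matmul(kf.F, kf.x)[0])     # forecast of tomorrow's close
        return pd.Series(preds, index=df.index[window + 1:])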

So now we know it works for JP Morgan. The question is: does it work for most stocks?

To answer this question, I randomly pick 200 stocks from the S&P 500 and run the same comparison.
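The sampling and comparison is just a loop over tickers. A sketch, assuming a dict `data` mapping each ticker to its adjusted OHLC DataFrame and a list `sp500` of symbols (both names are mine), reusing the helpers above:

    import random
    import pandas as pd
    from sklearn.metrics import r2_score

    def compare(df):
        pred = kalman_forecast(df)
        actual = df['Close'].reindex(pred.index)
        return naive_r2(actual), r2_score(actual, pred)

    sample = random.sample(sp500, 200)
    scores = pd.DataFrame([compare(data[t]) for t in sample],
                          index=sample, columns=['naive', 'kalman'])
    print(scores.describe())  # mean and std of r² for both models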

Here’s the result.

Naive model vs Kalman model on S&P 500

We can see from the mean that the improvement is about as much as in the JPM case, around 0.1%. The Kalman model is also more consistent in making good predictions, with a standard deviation of 0.002575 as opposed to 0.003214 for the naive model.

And this makes sense. The Kalman model we're using here can be seen as a drift model (extrapolating the last price by its recent rate of change), modified with some related variables (high, low, open) and with uncertainty taken into account. Still, it's great to see that it beats the naive model with ease.

The next question is: can we make money using this? I don’t think so, haha. But we can test it out easily. And that’ll be the topic of my next article. Stay tuned!
