Recently I’ve been doing some research on Elasticsearch to see if it suits my needs at work. The thing I care about most is its Python support. If you do data science, there are three Python libraries for Elasticsearch you might wanna look into: elasticsearch-py, elasticsearch-dsl-py, and eland.

elasticsearch-py provides low-level APIs with which you can do most of what you need to do with Elasticsearch. elasticsearch-dsl-py is a higher-level client library that is more Pythonic and sits on top of elasticsearch-py. To see the difference, here’s some code straight from their GitHub:

elasticsearch-py vs elasticsearch-dsl-py

Why use DataLoader?

Because you don’t want to implement your own mini-batch code each time. And since you’re gonna write up some wrapper for it anyway, the guys at FAIR thought they’d just do it for you to save you the trouble. It’s also standardized, so anyone reading your code can easily figure out how you prepare your data. And I think this wrapper they’ve come up with is pretty good.
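For a sense of what that hand-rolled mini-batch code tends to look like, here’s a minimal pure-Python sketch. The function name `iterate_minibatches` and its parameters are made up for illustration. This is not how PyTorch implements anything, just the kind of helper you’d otherwise end up rewriting in every project.

```python
import random


def iterate_minibatches(samples, batch_size, shuffle=True, seed=None):
    """Yield lists of at most `batch_size` items taken from `samples`.

    Hypothetical helper for illustration only; not part of PyTorch.
    """
    indices = list(range(len(samples)))
    if shuffle:
        # A private Random instance keeps the global RNG state untouched.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_indices = indices[start:start + batch_size]
        yield [samples[i] for i in batch_indices]


data = list(range(10))

# Without shuffling, the batches simply follow the original order;
# the last batch is smaller because 10 is not a multiple of 4.
ordered_batches = list(iterate_minibatches(data, batch_size=4, shuffle=False))

# With shuffling, every sample still shows up exactly once per pass.
shuffled_batches = list(iterate_minibatches(data, batch_size=4, seed=0))
```

Even this toy version has to deal with shuffling and the leftover partial batch, which is the sort of detail a shared, standardized wrapper takes off your hands.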

How it works

Basically the DataLoader works with the Dataset object. So to use the DataLoader you need to get your data into this Dataset wrapper. To do this you only need…

Recently while studying for the Self-Driving Car Nanodegree from Udacity, I came across something really amazing, called the Kalman filter. It is used widely in self-driving cars to deal with the localization problem.

The idea of the Kalman filter is simple. We assume our current location follows a Gaussian distribution with mean µ and standard deviation σ. Each time we take a measurement or move, we update this distribution. For example, if we looked up and saw a pyramid, the chance of us being in Egypt would increase. …
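To make the two updates concrete, here’s a minimal one-dimensional sketch. It assumes the measurement and the motion are themselves Gaussian, and it works with the variance σ² rather than σ to keep the formulas short. The function names and the numbers are picked for illustration only.

```python
def measurement_update(mean, var, meas_mean, meas_var):
    """Combine the current belief with a measurement (product of two Gaussians)."""
    new_mean = (meas_var * mean + var * meas_mean) / (var + meas_var)
    new_var = 1.0 / (1.0 / var + 1.0 / meas_var)
    return new_mean, new_var


def motion_update(mean, var, move_mean, move_var):
    """Shift the belief by a noisy move; means add and variances add."""
    return mean + move_mean, var + move_var


# Prior belief: position around 10 with variance 8 (fairly unsure).
mean, var = 10.0, 8.0

# A measurement says 13 with variance 2 (more trustworthy than the prior).
mean, var = measurement_update(mean, var, meas_mean=13.0, meas_var=2.0)
after_measurement = (mean, var)  # roughly (12.4, 1.6)

# Then we move by about 1 with variance 2.
mean, var = motion_update(mean, var, move_mean=1.0, move_var=2.0)
after_motion = (mean, var)  # roughly (13.4, 3.6)
```

Notice that the measurement pulls the mean toward the more certain source and always shrinks the variance, while moving always grows it. That back-and-forth between gaining and losing certainty is the whole loop.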

Calvin Ku

Sapere aude.
