Sunday, July 31, 2011

LinkedIn Intern Hackday 2011 and Node.js

Yesterday I had the best birthday ever being part of the LinkedIn Intern Hackday.

First, a little background on this event:

It is sponsored by LinkedIn and was put together by Adam Nash and Jim Brikman.

Any intern in Silicon Valley is invited. Just form a small group, come to LinkedIn's Mountain View headquarters, and pull an all-nighter hacking up some awesomeness. From all the projects, we pick the top three, American Idol style.

Interestingly, Node.js was a popular choice of technology stack among the projects.


The momentum behind this technology is getting stronger and stronger. I see this trend continuing for the following reasons:

Pervasiveness of JavaScript:

JavaScript has become the de facto programming language on the client side. It filled the void for writing complex programs in the browser, where Java applets failed.

As JavaScript matures, I see it becoming more and more pervasive on the server side. Having a consistent language stack between client and server is desirable. With support for CommonJS from companies like Google, it does seem possible for JavaScript to become a strong contender in the server-side landscape.

Cloud Computing:

Cloud computing is here, and the cost of running a service in the cloud is measured by usage. Squeezing every ounce of power from your machine instance is therefore highly desirable. The Node.js philosophy of asynchronous event handling makes a lot of sense in this environment, because every bit of CPU time wasted while blocked on I/O is costly, and that cost is now measurable in dollars.
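
To make the asynchronous event handling argument concrete, here is a minimal sketch. It is written in Python rather than JavaScript, with the standard asyncio module standing in for Node's event loop. The names handle_request, serve_all and run_demo, as well as the 0.1 second delay, are made up for illustration. The point is only that one thread can keep many requests in flight while each of them waits on I/O.

```python
import asyncio
import time


async def handle_request(request_id, io_delay):
    # Simulate waiting on I/O (a database or network call). While this
    # coroutine waits, the event loop is free to run other handlers.
    await asyncio.sleep(io_delay)
    return "request %d done" % request_id


async def serve_all(count, io_delay):
    # Start every handler on the same thread and wait for all of them.
    tasks = [handle_request(i, io_delay) for i in range(count)]
    return await asyncio.gather(*tasks)


def run_demo(count=50, io_delay=0.1):
    # Returns the handler results and the total wall-clock time taken.
    start = time.perf_counter()
    results = asyncio.run(serve_all(count, io_delay))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print("%d requests in %.2f seconds" % (len(results), elapsed))
```

Handled one after another with blocking waits, 50 requests at 0.1 seconds each would take about 5 seconds. Here the waits overlap, so the total should stay close to a single delay.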


Tech trends are often set by the younger generation. The projects in this intern hackday are collectively a good sample of what the next generation of whizzes and geeks is going to work on. Node.js set the tone.


I spent some time learning about Node.js and found that, although it is still very young, its potential impact is significant. I can see it someday becoming serious competition for Ruby/Rails and/or Python/Django.

I see server-side Java being pushed more towards the backend and eventually finding room in custom backends like NoSQL and search systems.

I am picking up a JavaScript book and it is going to be exciting!

Friday, July 8, 2011

Python, Tornado, Kafka, Oh My!

Something about Kafka:

LinkedIn's Kafka was recently accepted into the Apache Incubator.

Kafka is a high-throughput distributed publish-subscribe messaging system written in Scala.

Kafka scales very well as both the dataset and the number of subscribers grow. For detailed performance results, check this out.
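
For readers new to publish-subscribe, here is a toy, in-memory sketch of the idea. It is not Kafka's API: the ToyBroker class and its publish and poll methods are hypothetical names invented for this illustration. It only mirrors, very loosely, the way Kafka keeps each topic as an append-only log and lets every subscriber read from its own position.

```python
from collections import defaultdict


class ToyBroker(object):
    """Toy in-memory broker for illustration only, not a Kafka client."""

    def __init__(self):
        self.logs = defaultdict(list)    # topic -> append-only message list
        self.offsets = defaultdict(int)  # (subscriber, topic) -> next index

    def publish(self, topic, message):
        # Publishers only append; they know nothing about subscribers.
        self.logs[topic].append(message)

    def poll(self, subscriber, topic):
        # Each subscriber reads from its own offset, so adding a new
        # subscriber does not affect the others or the publisher.
        start = self.offsets[(subscriber, topic)]
        messages = self.logs[topic][start:]
        self.offsets[(subscriber, topic)] = len(self.logs[topic])
        return messages


if __name__ == "__main__":
    broker = ToyBroker()
    broker.publish("app-update", "event-1")
    broker.publish("app-update", "event-2")
    print(broker.poll("analytics", "app-update"))
    print(broker.poll("search", "app-update"))
```

Both subscribers see both events, because reading does not remove anything from the log.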

Capturing internet events:

We were looking to build a data application that captures mobile activities.

Requirements are:
  • High volume
  • Data sent over the internet

Kafka is the obvious choice for streaming messages to our backend systems, but of course we don't want to expose our Kafka endpoint on the web.

So we need to build an HTTP proxy to front our Kafka cluster.

Python and Tornado:

Being a recent Python convert (by learning Django from Lei), I wanted to build this proxy in Python.

Django is a little heavy for this use case; all I needed was an HTTP server.

Luckily, Ikai facebooked me his talk on Tornado - a lightweight HTTP server in Python.

Given that Kafka already has a Python client, voila: we have an HTTP proxy that listens for events and pumps them to Kafka.

Here is the code:

import tornado.ioloop
import tornado.web

from kafka import KafkaProducer

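class KafkaHandler(tornado.web.RequestHandler):
    # The topic and producer are class attributes, shared by all requests.
    topic = "app-update"
    producer = KafkaProducer('localhost', 9092)

    def post(self):
        # Forward the raw request body to Kafka as a single message.
        d = self.request.body
        self.producer.send([d], self.topic)
        print(d)

application = tornado.web.Application([
    (r"/app-update", KafkaHandler),
])
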
if __name__ == "__main__":