Something about Kafka:
LinkedIn's Kafka was recently accepted into the Apache Incubator.
Kafka is a high-throughput distributed publish-subscribe messaging system written in Scala.
Kafka scales very well as both the dataset and the number of subscribers grow. For detailed performance results, check this out.
Capturing internet events:
We were looking to build a data application that captures mobile activity. Two things stood out:
- High volume
- Data sent over the internet
Kafka was the obvious choice for streaming messages to our backend systems, but of course we don't want to expose our Kafka endpoint on the web.
So we need to build an HTTP proxy in front of our Kafka cluster.
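To make the idea concrete before picking a framework, here is a minimal sketch of such a proxy using only Python's standard library. The names `make_proxy` and `EventHandler`, the `send` callback, and the 204 response are my assumptions for illustration. `send` stands in for whatever actually publishes to Kafka, so the cluster address never leaves the backend.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_proxy(send, host="127.0.0.1", port=0):
    """Build an HTTP server that forwards each POST body to send(body).

    `send` is a hypothetical stand-in for the Kafka publisher; it is the
    only piece that needs to know about the cluster. port=0 asks the OS
    for any free port.
    """

    class EventHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read exactly as many bytes as the client announced.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            send(body)  # hand the raw event to the publisher
            self.send_response(204)  # accepted, nothing to return
            self.end_headers()

        def log_message(self, *args):
            pass  # keep the sketch quiet; a real proxy would log requests

    return HTTPServer((host, port), EventHandler)
```

Calling `make_proxy(some_send_function).serve_forever()` would start accepting events. Clients only ever see the HTTP endpoint.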
Python and Tornado:
Django is a little heavy for this use case; all I needed was an HTTP server, and Tornado gives me exactly that.
Since Kafka already has a Python client, voilà: we have an HTTP proxy that listens for events and pumps them into Kafka.
Here is the code:
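import tornado.ioloop, tornado.web
from kafka import KafkaProducer
topic = "app-update"
producer = KafkaProducer('localhost', 9092)  # broker host and port
class MainHandler(tornado.web.RequestHandler):
    def post(self):
        d = self.request.body  # the raw POST body is the event
        producer.send([d], topic)  # assumed: send(messages, topic) as in Kafka's bundled Python client
application = tornado.web.Application([(r"/", MainHandler)])  # route is an assumption
if __name__ == "__main__":
    application.listen(8888)  # port is an assumption
    tornado.ioloop.IOLoop.instance().start()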
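For completeness, here is a rough sketch of what the sending side could look like, using only the standard library. The helper name `post_event` and the choice of JSON as the payload format are my assumptions. The proxy itself just forwards whatever bytes it receives.

```python
import json
import urllib.request


def post_event(url, event, timeout=5):
    """POST one event as JSON to the proxy and return the HTTP status code."""
    body = json.dumps(event).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call would look like `post_event("http://localhost:8888/", {"action": "launch"})`, where the URL is a placeholder for wherever your proxy is listening.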