Co-founder @ SMERGERS and

We often come across problems that need better and faster solutions. It may seem that we have tried everything and have nothing left to try for the given scenario. Yet some solutions are simply amazing: I have come across a few where performance bottlenecks were solved using a very simple design.

1. HTTP reverse proxying

Apache is the web server used by many websites. It is very good at serving dynamic content, but it has some limitations. As load on the server increases, Apache's performance starts to degrade, whereas lightweight servers like nginx perform really well for static content.

An Apache thread (or process) is tied to a client and stays busy for as long as data is being transferred between the client and the server. This can be a problem if many clients access the site over slow connections: the slower the connection, the longer it takes to serve the data and free up the thread.

Client <-----> Server(Apache)

On the other hand, nginx is event-driven: a single worker thread manages many connections at once. Unlike with Apache, slower connections will not tie up the server. How do we take advantage of this? By placing an instance of nginx between the client and the server, we can get a significant performance boost. This technique is called reverse proxying.

Client <-----> Nginx on port 80(Reverse proxy) <-----> Server(Apache on some other port)

Whenever a client makes a request, nginx, listening on port 80, accepts it and forwards it to Apache, which listens on another port, processes the request and returns the HTTP response. Since nginx and Apache run on the same machine, data transfer between them is really fast, so Apache threads are released almost immediately while nginx takes care of delivering the response to the slow client. We get the best of both servers: Apache for dynamic content and nginx for static content. More information about reverse proxying can be found on Mark Maunder's blog.
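To get a feel for why this helps, here is a rough back-of-the-envelope model in Python. It is only a sketch: the response size and both bandwidth figures are numbers I made up for illustration, and it assumes nginx buffers the whole response so that Apache can hand it over and move on.

```python
# Illustrative model only: every size and bandwidth below is an assumed value.

def transfer_seconds(size_bytes: float, bytes_per_second: float) -> float:
    """Time needed to push size_bytes through a link of the given throughput."""
    return size_bytes / bytes_per_second

RESPONSE_BYTES = 500 * 1024           # assumed 500 KiB response
SLOW_CLIENT_BPS = 50 * 1024           # assumed 50 KiB/s client connection
LOOPBACK_BPS = 500 * 1024 * 1024      # assumed 500 MiB/s between nginx and Apache

# Without a proxy, the Apache thread is busy until the slow client has everything.
direct_busy = transfer_seconds(RESPONSE_BYTES, SLOW_CLIENT_BPS)

# With a buffering proxy, Apache hands the response to nginx locally and is freed.
proxied_busy = transfer_seconds(RESPONSE_BYTES, LOOPBACK_BPS)

# Responses one Apache thread can deliver per minute (ignoring processing time).
direct_per_minute = 60 / direct_busy
proxied_per_minute = 60 / proxied_busy

print(f"Apache thread busy, direct:  {direct_busy:.4f} s")
print(f"Apache thread busy, proxied: {proxied_busy:.4f} s")
```

With these made-up numbers the Apache thread is occupied for 10 seconds per response when it talks to the slow client directly, and for about a millisecond when nginx sits in front of it. Real figures will differ, but the gap is what matters.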

2. Compression in MS SQL Database

When I attended a boot camp as part of the Microsoft Student Partner programme, Vinod Kumar spoke about MS SQL Server and mentioned that data is compressed before it is saved to disk, which results in better performance. One may wonder how doing extra work can speed up operations. It so happens that disk I/O is far more time-consuming than processor operations, so transferring compressed data saves a lot of time.

Transfer time of Compressed data + Decompression < Transfer time of Uncompressed data

This is analogous to sending a zipped file to a friend via email. Internet speed is the bottleneck, and we rely on the processing speed of our computers to compress and decompress the files, which results in faster sending and receiving of data.
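The inequality above can be tried out with a small Python sketch. Note that this uses zlib from the standard library, not SQL Server's own compression, and the sample rows, the disk throughput and the decompression throughput are all assumptions I picked for illustration.

```python
import zlib

def transfer_seconds(size_bytes: float, bytes_per_second: float) -> float:
    """Time needed to move size_bytes at the given throughput."""
    return size_bytes / bytes_per_second

# Made-up, repetitive rows standing in for the contents of a table.
raw = b"".join(b"%06d,ACME Corp,Bangalore,ACTIVE\n" % i for i in range(20000))
packed = zlib.compress(raw)
restored = zlib.decompress(packed)    # round trip to confirm nothing is lost

DISK_BPS = 100 * 1024 * 1024          # assumed disk read throughput
DECOMPRESS_BPS = 400 * 1024 * 1024    # assumed decompression output rate

# Right-hand side of the inequality: read the uncompressed data from disk.
uncompressed_time = transfer_seconds(len(raw), DISK_BPS)

# Left-hand side: read the smaller compressed data, then decompress it.
compressed_time = (transfer_seconds(len(packed), DISK_BPS)
                   + transfer_seconds(len(raw), DECOMPRESS_BPS))

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
print(f"uncompressed read:           {uncompressed_time * 1000:.3f} ms")
print(f"compressed read + decompress: {compressed_time * 1000:.3f} ms")
```

Under these assumptions, compression wins whenever the compressed data is smaller than three quarters of the original. If the data barely compresses, or the CPU is the slower part, the inequality flips.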

3. AWS Data transfers

This one was the least expected, but it really happens. I attended the Hadoop Summit last year and went to a talk by Simone Brunozzi, where he explained how one can run MapReduce jobs on Amazon's MapReduce platform. During the Q&A session, one of the delegates asked the following question:

Since MapReduce deals with humongous amounts of data, to even take advantage of the platform we have to upload all the data to Amazon's servers, which is very troublesome since it may take several weeks to transfer all the data.

And this was the answer:

You can courier us your hard drives; we will plug them in at our high-speed data centers, and you will have access to your data on our MapReduce platform within 2 to 3 days.

Many were surprised, and some asked, "What?! Do you really do that?" The answer was yes. Simone went on to quote an example in which Tanenbaum was asked to devise a backup solution for storing huge volumes of bank transaction details. Considering the connectivity available at that time, Tanenbaum said:

"Never underestimate the power of bullock carts carrying hard drives on a highway!"
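A quick calculation shows why the courier can win. The sketch below is only illustrative: the 50 TB data set and the 100 Mbit/s uplink are assumed values, and the 3 days is the upper end of the turnaround quoted in the answer above.

```python
SECONDS_PER_DAY = 86400

def upload_days(data_terabytes: float, link_megabits_per_second: float) -> float:
    """Days needed to push the data through a fully used network link."""
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_megabits_per_second * 1e6)
    return seconds / SECONDS_PER_DAY

DATA_TB = 50          # assumed size of the data set
LINK_MBPS = 100       # assumed uplink, used at full capacity around the clock
COURIER_DAYS = 3      # upper end of the "2 to 3 days" quoted in the talk

network_days = upload_days(DATA_TB, LINK_MBPS)

# Effective bandwidth of the courier: the same data spread over the delivery time.
courier_mbps = DATA_TB * 1e12 * 8 / (COURIER_DAYS * SECONDS_PER_DAY) / 1e6

print(f"upload over the network: {network_days:.1f} days")
print(f"courier, effective rate: {courier_mbps:.0f} Mbit/s")
```

With these numbers the upload takes about 46 days, which matches the "several weeks" in the question, while the box of hard drives behaves like a link of roughly 1.5 Gbit/s.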

So, before we devise complex solutions, it is important to check whether there is something simpler and more elegant.
