About data caching

Published Sunday, September 9, 2012 9:04 PM


In my Code Project article “Demystifying concurrent lazy load pattern” I explained the common mistakes in lazy load implementations and how to implement a fast concurrent lazy load cache. In this post I will discuss why caching itself is important and what some implementation strategies for caching are.


Why caching?


Simply put... performance! A typical business application relies on lots of data, and that data usually resides in a database. Because the data lives outside the application process, accessing it can be much slower. If the external data is on a hard drive (and it usually is), accessing it can be thousands of times slower than accessing data in application memory. The great thing about cached data is that it will not only be a few thousand times faster to read, but it will also scale much better under heavy load than your poor database.
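To make the idea concrete, here is a minimal sketch of a read-through cache. It is only an illustration under stated assumptions: the names `ReadThroughCache` and `load_product_from_database` are hypothetical, the loader is a stand-in for a real database query, and the sketch is single-threaded and ignores invalidation.

```python
class ReadThroughCache:
    """Keeps loaded values in memory so each key reaches the slow source once."""

    def __init__(self, loader):
        self._loader = loader  # function that fetches a value from the slow source
        self._values = {}      # in-memory store: key -> loaded value
        self.misses = 0        # how many times the slow source was actually hit

    def get(self, key):
        if key not in self._values:
            # Cache miss: pay the cost of the slow source once and remember the result.
            self.misses += 1
            self._values[key] = self._loader(key)
        return self._values[key]


def load_product_from_database(product_id):
    """Hypothetical stand-in for a database query."""
    return {"id": product_id, "name": f"Product {product_id}"}


cache = ReadThroughCache(load_product_from_database)
for _ in range(100):
    product = cache.get(7)  # only the first call invokes the loader
```

After the loop the loader has been called once for key 7, and the other 99 reads were served from application memory. A production cache would also have to deal with concurrent access and stale data, which this sketch deliberately leaves out.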

Nowadays, caching is becoming more and more important with the emergence of high-load web sites with many concurrent users. In addition, we see the rise of distributed key-value databases acting as a shared cache.

In our application, caching data made a critical business feature, Order lines import, about 50 times faster. Speed went from 1 row per second to 50 rows per second.

Sounds great, and best of all it is easy: it just takes a little care about your data. Unfortunately, I have seen many problematic caching implementations that either break occasionally in production or perform very poorly. Those implementations are the main reason for writing this article.






by vukoje