LFU Cache


LFU Cache Algorithm

LFU, or least frequently used, is a cache eviction algorithm that evicts the entries accessed least often. The implementation presented here is used in the Apache KahaDB module, in the class LFUCache. It is based on the paper http://dhruvbird.com/lfu.pdf, with some notable differences. The algorithm has a run-time complexity of O(1) for all of its operations, which include insertion, access and deletion (eviction).

Data structures

The algorithm requires three data structures. The first is a hash table used to cache the key/value pairs, so that given a key we can retrieve its cache entry in O(1). The second is a doubly linked list for each frequency of access. The maximum frequency is capped at the cache size to avoid creating more and more frequency lists; a cache of maximum size 4 therefore ends up with at most 4 different frequencies. Each frequency has a doubly linked list keeping track of the cache entries that currently belong to that frequency.
The third data structure links these frequency lists together. It can be either an array or another linked list, so that on accessing a cache entry it can be promoted to the next frequency list in O(1) time. In this article it is an array, since indexing into an array is faster than traversing a linked list.
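To make the layout concrete, here is a rough sketch of those three structures in Java. The class and field names are illustrative only (the real KahaDB LFUCache differs), and a LinkedHashSet stands in for each doubly linked list so that removing an arbitrary entry stays O(1):

    import java.util.HashMap;
    import java.util.LinkedHashSet;
    import java.util.Map;

    // Illustrative sketch only; not the actual KahaDB LFUCache source.
    class SimpleLfuCache<K, V> {

        // Holds the key, the value and the entry's current frequency.
        static class CacheEntry<K, V> {
            final K key;
            V value;
            int frequency;                  // index into frequencyLists, starts at 0
            CacheEntry(K key, V value) { this.key = key; this.value = value; }
        }

        // 1. Hash table: key -> cache entry, O(1) lookup.
        final Map<K, CacheEntry<K, V>> map = new HashMap<>();

        // 2 + 3. One "list" per frequency, linked together by an array.
        //        Frequencies are capped at the cache size, so maxSize slots are enough.
        final LinkedHashSet<CacheEntry<K, V>>[] frequencyLists;

        final int maxSize;
        final float evictionFactor;

        @SuppressWarnings("unchecked")
        SimpleLfuCache(int maxSize, float evictionFactor) {
            this.maxSize = maxSize;
            this.evictionFactor = evictionFactor;
            this.frequencyLists = new LinkedHashSet[maxSize];
            for (int i = 0; i < maxSize; i++) {
                frequencyLists[i] = new LinkedHashSet<>();
            }
        }
    }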

Put values into the LFU cache

Insertion is done in O(1) time. The new entry is added to the hash table as well as to the first frequency's linked list, both O(1) operations. If the cache has reached its maximum size, a few entries have to be evicted first, based on the eviction factor: the number of entries deleted is max size * eviction factor. Only when the eviction factor is 1 and the cache reaches its limit does the insert take O(n) time. Entries are removed from the cache in order of frequency, starting from the lowest frequency list and moving on to the next one, and so on.
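Under the same assumptions as the sketch above, put could look roughly like this; the evict helper is sketched later in the Eviction section, and the real LFUCache handles details such as overwriting an existing key differently:

    // Sketch of put: O(1) unless the cache is full and eviction runs.
    public void put(K key, V value) {
        CacheEntry<K, V> existing = map.get(key);
        if (existing != null) {
            existing.value = value;          // overwrite keeps the current frequency
            return;
        }
        if (map.size() >= maxSize) {
            evict();                         // drops maxSize * evictionFactor entries
        }
        CacheEntry<K, V> entry = new CacheEntry<>(key, value);
        map.put(key, entry);                 // O(1) hash table insert
        frequencyLists[0].add(entry);        // new entries start in frequency list 0
    }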

The test below creates an LFUCache and adds key/value pairs. Internally, a cache entry is created for each key/value pair to hold the key, the value and the frequency. Since the entry is new to the cache, its frequency will be 0. The cache entry is added to the hash map as well as to the linked list assigned to frequency 0.

    public void testLfuPut() {
        // max size 4, eviction factor 0.9
        LFUCache lfu = new LFUCache(4, 0.9f);
        lfu.put("a", 1);   // each put creates a cache entry with frequency 0
        lfu.put("b", 2);
        lfu.put("c", 3);
        lfu.put("d", 4);
    }

Put new entries into LFU Cache

Get key ‘c’

Accessing c removes it from the first frequency list and adds it to the next one.
Access key in LFU Cache

Get key ‘d’

Accessing key d removes it from the first frequency list and adds it to the next one, where it is now linked to c.

Get key ‘c’

Accessing key c again promotes it to the next frequency list, at index 2.
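In code, the accesses so far are plain get calls on the cache created in the earlier test; the comments describe the list movements from the figures above:

    lfu.get("c");   // c: frequency list 0 -> 1
    lfu.get("d");   // d: frequency list 0 -> 1, linked to c
    lfu.get("c");   // c: frequency list 1 -> 2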

 


A few more retrievals

Access keys in order c, c, d, d, b, b, a, a
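Expressed as calls on the same cache:

    lfu.get("c"); lfu.get("c");
    lfu.get("d"); lfu.get("d");
    lfu.get("b"); lfu.get("b");
    lfu.get("a"); lfu.get("a");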

 

Access d and then c. Since c is already in the highest frequency list, it is simply removed from the list and added again, so c moves closer to the header node.

 

Promote most recently used
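A rough sketch of get for the illustrative SimpleLfuCache above shows the promotion and the cap at the highest frequency list (again, the real LFUCache differs in detail):

    // Sketch of get: promote the entry to the next frequency list, capped at the top.
    public V get(K key) {
        CacheEntry<K, V> entry = map.get(key);
        if (entry == null) {
            return null;                     // cache miss
        }
        frequencyLists[entry.frequency].remove(entry);     // O(1) unlink
        entry.frequency = Math.min(entry.frequency + 1, frequencyLists.length - 1);
        frequencyLists[entry.frequency].add(entry);        // O(1) relink
        // If the entry was already in the highest list, the remove/add above
        // simply re-inserts it, making it the last candidate for eviction there
        // (the article's "moves closer to the header node").
        return entry.value;
    }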

Eviction

Suppose the eviction factor is 0.8f. Then, on reaching the maximum cache size, a new entry will first push (int) (4 * 0.8) = 3 entries out of the cache before adding itself. In the above example, it will delete the entries a, b and d.
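A sketch of the corresponding eviction for the illustrative class above; the count is truncated to an int, and the minimum of one evicted entry is my assumption rather than something stated in the article:

    // Sketch of eviction: walk the frequency lists from lowest to highest
    // and drop maxSize * evictionFactor entries.
    private void evict() {
        int toEvict = Math.max(1, (int) (maxSize * evictionFactor));   // e.g. (int) (4 * 0.8) = 3
        for (int freq = 0; freq < frequencyLists.length && toEvict > 0; freq++) {
            java.util.Iterator<CacheEntry<K, V>> it = frequencyLists[freq].iterator();
            while (it.hasNext() && toEvict > 0) {
                CacheEntry<K, V> victim = it.next();
                it.remove();                 // unlink from the frequency list
                map.remove(victim.key);      // and from the hash table
                toEvict--;
            }
        }
    }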

 

LFU Cache Eviction

 
