W2D5: Hashmaps and LRU Caches
Hot on the heels of learning about big O notation, Friday’s task was to first take in about 3 hours of lecture via online videos, and then get to work designing a LRU Cache from scratch. The goal was to design a data structure which can store, lookup, and delete elements all in constant time regardless of the size of the data set, and can store recently used items in RAM to make rendering more efficient. We did this by first making our own versions of a linked list, an ordered list of items in memory that can be easily modified without changing the reference path to each of the objects in the list, unlike other data structures such as an array (in an array, objects can be referenced by their position in the array, but if I take something out or add something, everything’s position changes, whereas a linked list will maintain its relationships regardless of where I insert an element). We then designed a hash map (ruby’s built in Hash class, often notated with curly braces { }), which can be designed through a clever combination of an array of variable size (size increases as more things are stored in it, so that lookup times stay constant) in which each element is a linked list of stored objects. Objects are stored by assigning them a unique but consistent hash number, from which their correct storage bucket can be determined through use of the modulo operator, which returns the remainder of a division operation (if i have 16 memory “buckets” and I divide the unique number for this object and get a remainder of 5, I can store it, and look for it later, in the fifth bucket, which will make accessing times constant as we will only have to look through the elements stored in that bucket, and since we increase our number of buckets proportional to the number of objects we store, our average list length will be ~1 item). The LRU cache is then a combination of this hash map and another linked list, where the linked list represents the items stored in memory, and each time we access an item we move it to the front of the linked list. Once we have too many items to keep in memory, this list will tell us which items we have used least recently so that we can remove them from our RAM. The hash is necessary to keep our lookup times down, as we can have the return value for various lookup keys be the actual objects stored in the LRU cache linked list (see above for my brief explanation of why looking things up in a hash is faster).
This post is more technical than others I have written, but this day had a lot of content that I wanted to articulate as clearly as I could. After we wrapped up pair programming we broke into quick circle groups to reflect upon our progress and meet with a few members of the previous cohort for a quick Q&A. Tomorrow we get to work learning SQL!











