Every semester, without fail, I get dozens of questions about tree rotations. It gets confusing to keep track of the direction of the rotation and the steps to reorganize the tree afterwards. It doesn't help that we don't spend too much time covering it in class either. This post seeks to make understanding tree rotations, their uses, and their algorithms just a little bit easier!
This post assumes familiarity with Binary Search Trees [1] [2] [3] and AVL constraints [4] [5]. If you're unfamiliar with either, then I recommend brushing up on those concepts before jumping into tree rotations. Before we start, it's important to note that tree rotations happen sparingly, and you'd only perform a rotation if a subtree is unbalanced. Otherwise, if your tree is balanced, there's no reason to do a tree rotation, and doing so might actually cause more problems. With that out of the way, let's get started!
Let's start with a Binary Search Tree. Since tree rotations are a way to balance our search tree, we'll make sure that our binary search tree is unbalanced. Since there are different ways to balance search trees [6], our goal will be for our tree to adhere to AVL balance constraints. Let's look at the example below:
Figure 1: A balanced AVL BST.
Let's say we remove Node F. Now our tree looks like this:
Figure 2: Our tree from Figure 1, with Node F removed.
Since this is an AVL tree, the balance factor of Node F's ancestors has changed, which causes Node C to have a balance factor of +2. Before removing Node F, Node C had a balance factor of +1. We'll have to rebalance our tree in order for it to adhere to AVL constraints. This is where tree rotations come in!
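If it helps to see the balance factor as code, here is a minimal sketch. The `Node` struct and the `height` and `balance_factor` helpers are hypothetical names made up for illustration, and the sign convention (right height minus left height) is an assumption inferred from the +2 on the right-heavy Node C. The subtree shape (C with children F and G, and G with children H and I) is likewise inferred from the walkthrough text, since the figures aren't reproduced here.

```ruby
# Hypothetical minimal node; the post does not specify a representation.
Node = Struct.new(:value, :left, :right)

# Height of an empty subtree is -1, so a single leaf has height 0.
def height(node)
  return -1 if node.nil?
  1 + [height(node.left), height(node.right)].max
end

# Assumed convention: right height minus left height,
# so a positive number means the subtree leans to the right.
def balance_factor(node)
  height(node.right) - height(node.left)
end

g = Node.new("G", Node.new("H"), Node.new("I"))
c = Node.new("C", Node.new("F"), g)
puts balance_factor(c)  # prints 1

c.left = nil            # remove Node F
puts balance_factor(c)  # prints 2
```

A node whose balance factor leaves the range -1..+1 is the one that needs a rotation.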
Our Special Rotation Algorithm:
We'll need to rotate Node F's parent, Node C, the node with the +2 balance factor, over to the left.
Figure 3: Our tree from Figure 2, but now we're preparing for a left rotation around Node C.
A leftward rotation will allow for Node C's right child, Node G, to become the parent of that entire subtree, while Node C becomes the new left child of Node G [7]. Node H becomes the right child of Node C. This preserves the ordering of the BST, since we want to make sure we maintain those constraints as well (a value smaller than the parent goes to the left, while a value bigger than the parent goes to the right).
Figure 4: Our new AVL tree after rotating around Node C.
Note that Node I never changes its position; it just stays put relative to Node G. Node G's former left child, Node H, does change: it becomes the right child of Node G's "new" left child, Node C. If Node H had subtrees of its own, they would move along with it unchanged.
Note that our new balance factor for Node G is -1 and Node C is +1. These are now within an acceptable range for our AVL tree constraints.
Final Algorithm:
Now that we know the steps we need to take, let's write down our algorithm in a cleaner way.
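One way the left rotation could be written down is sketched below. This is not necessarily how your course presents it: the `Node` struct and the `rotate_left` name are assumptions made for illustration, and the example subtree (C with right child G, whose children are H and I) is inferred from the walkthrough text.

```ruby
# Hypothetical minimal node; the post does not specify a representation.
Node = Struct.new(:value, :left, :right)

# Rotate left around `root` and return the new root of the subtree.
# The caller still has to re-attach the returned node to root's old parent.
def rotate_left(root)
  return root if root.nil? || root.right.nil?  # nothing to rotate onto
  pivot = root.right
  root.right = pivot.left  # pivot's old left subtree (Node H) moves under root
  pivot.left = root        # old root (Node C) becomes pivot's left child
  pivot                    # pivot (Node G) is the new subtree root
end

g = Node.new("G", Node.new("H"), Node.new("I"))
c = Node.new("C", nil, g)

new_root = rotate_left(c)
puts new_root.value             # prints G
puts new_root.left.value        # prints C
puts new_root.left.right.value  # prints H
puts new_root.right.value       # prints I
```

Only two references change, which is why a rotation takes constant time no matter how large the subtrees hanging off H and I are.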
Since we just performed a left rotation, we can now infer the steps for a right rotation. Let's start with a slightly different AVL Tree (which, with the correct values, could also be a heap! [8]).
Figure 5: A brand new AVL tree, oOoOoOo...
Let's pretend we removed Node E. This would change our balance factor for Node B to -2 [9]. In order to fix this, we would need to do a rightward rotation around Node B. After doing so, Node B's balance factor will be 0, with Node D's balance factor becoming +1.
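A right rotation is the mirror image of the left one. The sketch below uses a hypothetical `Node` struct and `rotate_right` name, and a small made-up tree of integers rather than the tree from Figure 5, so treat it as an illustration of the mechanics only.

```ruby
# Hypothetical minimal node; the post does not specify a representation.
Node = Struct.new(:value, :left, :right)

# Rotate right around `root` and return the new root of the subtree.
# The caller still has to re-attach the returned node to root's old parent.
def rotate_right(root)
  return root if root.nil? || root.left.nil?  # nothing to rotate onto
  pivot = root.left
  root.left = pivot.right  # pivot's old right subtree moves under root
  pivot.right = root       # old root becomes pivot's right child
  pivot                    # pivot is the new subtree root
end

# A left-heavy subtree: 30 has only a left child, 20, which has two children.
root = Node.new(30, Node.new(20, Node.new(10), Node.new(25)), nil)

new_root = rotate_right(root)
puts new_root.value             # prints 20
puts new_root.left.value        # prints 10
puts new_root.right.value       # prints 30
puts new_root.right.left.value  # prints 25
```

Notice that 25 ends up to the right of 20 and to the left of 30, so the BST ordering is preserved.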
That's all there is to it! I recommend practicing drawing your own tree with different balance factors, then removing or adding a node, and trying to use a rotation to balance your tree. It's good practice that will help you visualize the process. Until next time!
Sources & Side Notes:
[1] Stanford's BST Lecture Slides.
[2] CMU's BST Notes.
[3] UB's BST Lecture Slides.
[4] Stanford's AVL Lecture Slides.
[5] AVL Tree Visualizer.
[6] AVL vs. Red Black Trees.
[7] UB's Tree Rotation Lecture Slides.
[8] Algosaurus - Heaps. Please note that you shouldn't do tree rotations on heaps; they'll break the structure of a heap. Please.
[9] Removing Node E doesn't affect Node A's balance factor because the height of its children never changes.
Remember how we left off last Slog talking about how to do bst_insert and bst_delete in a recursive way? We’re finally here.
I’d been looking at the definitions of recursion and BSTs before this class, trying to puzzle it out, to no avail. Looking at it now, that was because I kept trying to apply an iterative way of thinking, when what we needed was a whole new approach.
For the deletion method, iteration does it by finding the node to be deleted and then pointing its parent’s reference at its right subtree. Easy, done. Recursion does it by calling bst_delete again on each node it visits on the way down. It’s repetitive and memory-consuming, but it can be done in a few short lines of code.
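Here is a rough sketch of what a recursive delete might look like. It is written in Ruby for illustration, not taken from the class; the `Node` struct, the signature of `bst_delete`, and the `in_order` helper are all assumptions. Each call handles one node and hands the rest of the work to the call below it, returning whatever should sit in that position afterwards.

```ruby
# Hypothetical minimal node; not the class's actual representation.
Node = Struct.new(:value, :left, :right)

# Remove `value` from the subtree rooted at `node`.
# Returns the root of the subtree after the removal.
def bst_delete(node, value)
  return nil if node.nil?  # value was not in this subtree
  if value < node.value
    node.left = bst_delete(node.left, value)
  elsif value > node.value
    node.right = bst_delete(node.right, value)
  elsif node.left.nil?
    return node.right      # no left child: the right subtree takes over
  else
    # Copy up the largest value of the left subtree, then delete it down there.
    largest = node.left
    largest = largest.right while largest.right
    node.value = largest.value
    node.left = bst_delete(node.left, largest.value)
  end
  node
end

# Helper for display only: values in sorted order.
def in_order(node)
  return [] if node.nil?
  in_order(node.left) + [node.value] + in_order(node.right)
end

root = Node.new(50, Node.new(30, Node.new(20), Node.new(40)), Node.new(70))
root = bst_delete(root, 50)
p in_order(root)  # prints [20, 30, 40, 70]
puts root.value   # prints 40
```

The recursion only follows one path from the root down, so the extra memory it uses is proportional to the height of the tree.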
Both have their good and bad points, but memory-wise, it feels like iteration would be so much more efficient. Remember the old saying that “No recursive function is more efficient than its iterative equivalent”? That certainly comes into play here.
We did a brief intro on binary search trees last time, and here we are now, going deeper and implementing more of our code. (This seems to be a common theme in our classes.)
The methods we’re exploring today are insert and delete: how do you insert a value into the tree and keep it ordered? Insert and delete may have been easier for other trees, but for a binary search tree we have to keep to the definition: values lower than the value in the parent node go in the left child, and values higher than the value in the parent node go in the right.
Insert was easier than delete. You simply trace the tree, going left if the value is lower, and right if the value is higher, and append the value if the child is empty.
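As a sketch of that tracing process, here is an iterative insert in Ruby. The `Node` struct, the signature of `bst_insert`, and the `in_order` helper are assumptions for illustration, and so is the choice to silently ignore duplicate values, since the class version isn't shown here.

```ruby
# Hypothetical minimal node; not the class's actual representation.
Node = Struct.new(:value, :left, :right)

# Insert `value` into the tree rooted at `root` and return the root.
# Duplicates are ignored (an assumption made for this sketch).
def bst_insert(root, value)
  return Node.new(value) if root.nil?
  current = root
  loop do
    if value < current.value
      if current.left.nil?
        current.left = Node.new(value)   # empty spot on the left: append here
        break
      end
      current = current.left             # otherwise keep going left
    elsif value > current.value
      if current.right.nil?
        current.right = Node.new(value)  # empty spot on the right: append here
        break
      end
      current = current.right            # otherwise keep going right
    else
      break                              # value already in the tree
    end
  end
  root
end

# Helper for display only: values in sorted order.
def in_order(node)
  return [] if node.nil?
  in_order(node.left) + [node.value] + in_order(node.right)
end

root = nil
[50, 30, 70, 20, 40].each { |v| root = bst_insert(root, v) }
p in_order(root)  # prints [20, 30, 40, 50, 70]
```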
Delete, on the other hand... We had to consider two cases: one where the current node has no left child, and one where it does. If the current node has no left child, we simply connect its right subtree to the parent node. (I say simply, but in practice, it’s really not.) In the second case, we go to the rightmost node of the left child, put its value in the current node, and connect that rightmost node’s left child to where the rightmost node was before.
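A sketch of the iterative version, under one reading of the two cases above (the node either has no left child, or it has one). The `Node` struct, the signature of `bst_delete`, and the `in_order` helper are hypothetical, and this is an illustration rather than the code written in lecture.

```ruby
# Hypothetical minimal node; not the class's actual representation.
Node = Struct.new(:value, :left, :right)

# Remove `value` from the tree rooted at `root` and return the root.
def bst_delete(root, value)
  parent = nil
  current = root
  # Walk down to the node holding `value`, remembering its parent.
  while current && current.value != value
    parent = current
    current = value < current.value ? current.left : current.right
  end
  return root if current.nil?  # value is not in the tree

  if current.left.nil?
    # Case 1: no left child, so the right subtree takes the node's place.
    return current.right if parent.nil?
    if parent.left.equal?(current)
      parent.left = current.right
    else
      parent.right = current.right
    end
  else
    # Case 2: find the rightmost node of the left subtree.
    rightmost_parent = current
    rightmost = current.left
    while rightmost.right
      rightmost_parent = rightmost
      rightmost = rightmost.right
    end
    current.value = rightmost.value
    # Connect rightmost's left child to where rightmost used to be.
    if rightmost_parent.equal?(current)
      rightmost_parent.left = rightmost.left
    else
      rightmost_parent.right = rightmost.left
    end
  end
  root
end

# Helper for display only: values in sorted order.
def in_order(node)
  return [] if node.nil?
  in_order(node.left) + [node.value] + in_order(node.right)
end

root = Node.new(50,
                Node.new(30, Node.new(20), Node.new(40)),
                Node.new(70, Node.new(60), Node.new(80)))
root = bst_delete(root, 50)
p in_order(root)  # prints [20, 30, 40, 60, 70, 80]
```

The `equal?` checks compare object identity, which matters here because two different struct nodes holding the same values would otherwise count as equal.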
We wrote it iteratively this week, and the teacher asked us this: how could we write it recursively? I was thinking about this, but I can’t see how. Iteration uses a method where you “point” to the node, then delete it. How would you write that recursively at all? Maybe I’ll make a post if I figure it out before the next lecture.
To get back to my binary search tree project, I’m going to show you some code. Not the ugliest I have ever written, but not the prettiest by any means. I am proud of it though. Through this smallish (it does NOT fit Sandi Metz’s criteria of a 5 line max per) method, I was able to understand recursion better. I told you I would get back to recursion and I am doing it here!
This is an excerpt from a Node Class. Basically, each node knows its own value (stored as ‘value’) and up to two other nodes; one with a lower value (the ‘left_child’), and one with a higher value (the ‘right_child’).
I was struggling with the idea of having calls to a method from another instance of Node within the parent Node and being able to pass the information back that a search_value was either found (true) or not (stay false). This is complicated because each Node has no knowledge of its parent nodes, so it can’t send them information. I started out with something like this:
In addition to having nested if statements (a sure sign I am struggling to understand what I am doing.. “if I do this and this and this, maybe? it will work?”), this flat out doesn’t work. When the child Nodes set value_found = true, it does nothing to the parent Node’s value_found (leaving it at the default ‘false’). Unacceptable.
Ah yes! Turning value_found from a local variable into a class variable makes it available to every instance of the Node class. Now each Node can modify it and every other Node can see that change! Hooray, problem solved... except it isn’t. Once one node sets @@value_found to true, there is no way to set it back. I could add more nested ifs and elses and basically make my code totally unreadable and a beast to refactor... or I could admit that using a class variable isn’t making the best use of the recursive process and search for a better way to get the desired result. The solution I came up with helped clean up my code (a bit) while passing around the info I wanted:
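def can_find?(search_value)
  value_found = false
  if value == search_value
    # This node holds the value we are looking for.
    value_found = true
  elsif right_child && (search_value > value)
    # Larger values can only be in the right subtree.
    value_found = true if right_child.can_find?(search_value)
  elsif left_child && (search_value < value)
    # Smaller values can only be in the left subtree.
    value_found = true if left_child.can_find?(search_value)
  end
  value_found # returned so the parent node can see the answer
end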
This way the parent node sets value_found to true if its child returns it true. The Nodes don’t have to know something all the Nodes know (through the class variable) and, what’s cool to me, is that now I can see some easy ways to start refactoring my code here!
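One possible direction for that refactoring is sketched below. It is only a guess at where the cleanup could go, not the final code from my project. It keeps the `value`, `left_child`, and `right_child` names from the excerpt, but the rest of the class (the constructor and the accessors) is assumed so that the example runs on its own.

```ruby
class Node
  attr_reader :value
  attr_accessor :left_child, :right_child

  def initialize(value)
    @value = value
  end

  # True if search_value is stored in this node or anywhere below it.
  def can_find?(search_value)
    return true if value == search_value
    # Pick the only side that could hold the value, then let that child answer.
    child = search_value > value ? right_child : left_child
    !child.nil? && child.can_find?(search_value)
  end
end

root = Node.new(8)
root.left_child = Node.new(3)
root.right_child = Node.new(10)
root.left_child.right_child = Node.new(6)

puts root.can_find?(6)  # prints true
puts root.can_find?(7)  # prints false
```

Because each call simply returns its child's answer, there is no value_found variable left to keep in sync at all.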