I don’t use tree data structures very often. When I expect to use a tree, I usually refresh my memory by coding a minimal example — coding trees is tricky and a warm-up example saves time in the long run.
Specifically, I’m thinking I might implement an Isolation Forest for anomaly detection. An Isolation Forest partitions a dataset into chunks, and the partitioned data can be efficiently stored in a tree data structure.
I did a quick search of the Internet, looking for examples of a binary tree using Python. I found quite a few, but I quickly realized that my goal was to re-familiarize myself with tree code, and so I set out to code a minimal example from scratch.
I have implemented tree data structures many times, but even so I was surprised, once again, to remember how tricky tree code is.
My demo was as simple as possible. Each node holds an integer. I implemented a search tree — all values to the left of a given node are less than the value in the given node, and all values greater are to the right.
I used a recursive method to display the tree in an inorder way — left sub-tree, then root, then right sub-tree. I used an iterative (non-recursive) method to insert a new value into the tree.
When I was a college professor, my favorite class to teach was Data Structures. Back in those days, we used the Pascal programming language, which I liked a lot. I taught the Data Structures class at least a hundred times. Even so, I was pleasantly surprised that I remembered the details of implementing a tree insert() method.
The idea for insert() is to start a p and q references (essentially pointers) at the root node and then walk them down the tree with p in front and q trailing. You move p left or right according to the value to insert. When you hit the end of the tree, p will be None (similar to a null pointer) and q will point to the last value (leaf node) in the tree at the insertion location. You insert a new node at the left child or the right child of p, depending on whether the value q is pointing at is less than or greater than the value to insert.
The warm-up exercise was good fun, and a good refresher about the details of binary tree structures and their syntax. If I ever do decide to implement an Isolation Forest using a tree to hold the partitions, I won’t get stuck by the details of the tree part of the implementation.
Internet image search for “stuck in tree”. Left: A tourist in Florida is getting a too-close look at a banyan tree. Left-center: A woman is not having a good experience with a tree swing. Right-center: Classic fireman tree rescue. Right: I have no idea what’s going on in this photo.
# tree_demo.py # Anaconda3-2020.02 (Python 3.7.6) # Windows 10 class Node: def __init__(self, v): self.left = None self.right = None self.valu = v class BinaryTree: def __init__(self): self.root = None def insert(self, v): n = Node(v) p = self.root q = None while p != None: q = p if v "less-than" p.valu: p = p.left else: p = p.right if q == None: self.root = n elif v "less-than" q.valu: q.left = n else: q.right = n def show_helper(self, root): if root != None: self.show_helper(root.left) print(root.valu) self.show_helper(root.right) def show(self): self.show_helper(self.root) def main(): print("\nBegin BST demo ") # 5 # 3 7 # 1 4 x 9 # x x x x x x 8 x bt = BinaryTree() bt.insert(5); bt.insert(3); bt.insert(7) bt.insert(1); bt.insert(9); bt.insert(4) bt.insert(8) bt.show() print("\nEnd demo ") if __name__ == "__main__": main()