2-3-4 Trees
2-3-4 Trees: Introduction
- Motivation: Keep a search tree balanced
- Basic ideas:
- Keep 1, 2, or 3 values per node
- Values in nodes and subtrees are ordered (as in BST)
- Split nodes as needed, pushing values up
- All leaves are the same depth
- 2-3 trees invented by John Hopcroft (1970)
2-3-4 Trees: Definition
- Each node in a 2-3-4 tree is either a 2-node, 3-node, or 4-node
- 2-nodes have 2 children
- 3-nodes have 3 children
- 4-nodes have 4 children
- A node's type can change as values are added (for example, a 2-node becomes a 3-node)
- The number of children depends on the number of values in the node:
- 2-nodes have 1 value
- 3-nodes have 2 values
- 4-nodes have 3 values
- The values and children maintain an ordering (like a binary search tree)
- Also known as (2,4) trees
2-3-4 Trees: Example
- Examples (each key is a single letter):
- Node with 2 values and no children: ER
- Node with 3 values and no children: EIR
- Node with 2 values and three children: ER / AAC, HIN, S
- Node with 3 values and four children: EIR / AAC, GH, N, S
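The definition and the example nodes above can be sketched as a small data structure. This is only an illustrative layout, not a prescribed one: the class name `Node234` and its methods are hypothetical, and keys are assumed to be single letters as in the examples.

```python
class Node234:
    """One node of a 2-3-4 tree (hypothetical layout for illustration)."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)                # 1 to 3 keys, kept in sorted order
        self.children = list(children or [])  # empty for a leaf

    def is_leaf(self):
        return not self.children

    def kind(self):
        """Return 2, 3, or 4: one more than the number of keys."""
        return len(self.keys) + 1

    def is_valid(self):
        """Check the local shape rules: 1-3 sorted keys, and keys + 1 children unless a leaf."""
        if not 1 <= len(self.keys) <= 3:
            return False
        if self.keys != sorted(self.keys):
            return False
        return self.is_leaf() or len(self.children) == len(self.keys) + 1


# The 3-node with three children from the examples: ER / AAC, HIN, S
three_node = Node234("ER", [Node234("AAC"), Node234("HIN"), Node234("S")])
print(three_node.kind(), three_node.is_valid())  # 3 True
```

Storing keys and children in two parallel lists keeps the rule "children = values + 1" easy to check, at the cost of a variable-size node.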
2-3-4 Trees: Finding Nodes
- 2-node: left ≤ key; key ≤ right
- 3-node: left ≤ lkey; lkey ≤ middle ≤ rkey; rkey ≤ right
- 4-node:
left ≤ lkey;
lkey ≤ lmiddle ≤ mkey;
mkey ≤ rmiddle ≤ rkey;
rkey ≤ right
- How do we search for a value?
- Notice: because both comparisons use ≤, a duplicate of a key can appear in the subtree on either side of it
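One possible answer to the search question, as a minimal sketch: at each node, skip the keys smaller than the target, stop if the next key matches, and otherwise descend into the child that lies between the neighbouring keys. The names `Node` and `contains` are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                    # 1 to 3 sorted keys
    children: list = field(default_factory=list)  # empty for a leaf


def contains(node, target):
    """Return True if target is stored in the subtree rooted at node."""
    while node is not None:
        i = 0
        # Skip the keys that are smaller than the target.
        while i < len(node.keys) and target > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == target:
            return True
        if not node.children:    # reached a leaf without a match
            return False
        node = node.children[i]  # child lying between keys[i-1] and keys[i]
    return False


# The 4-node example tree: EIR / AAC, GH, N, S
root = Node(list("EIR"), [Node(list("AAC")), Node(list("GH")),
                          Node(list("N")), Node(list("S"))])
print(contains(root, "H"), contains(root, "Z"))  # True False
```

Each step moves down one level, so the number of nodes visited is at most the height of the tree.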
2-3-4 Trees: Adding Values
- Traverse tree from top to bottom
- Split any 4-nodes met on the way down by moving the middle value up
- Middle value will always have a place to go. Why?
- If new value is same as node value, go right
- Only add values in bottom nodes
2-3-4 Trees: Example Adding Values
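- Let's add A S E A R C H I N G E X A M P L E to a tree (levels are separated by /, nodes on a level by commas; in the last step a semicolon separates the children of AEG from those of NR)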
- Empty ; add A
- A ; add S
- AS ; add E
- AES ; add A
- E / AA, S ; add R, C, H
- E / AAC, HRS ; add I, N
- ER / AAC, HIN, S ; add G
- EIR / AAC, GH, N, S ; add E
- I / E, R / AAC, EGH, N, S ; add X, A, M, P, L, E
- I / AEG, NR / A, AC, EE, H; LM, P, SX
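The insertion rules and the worked example above can be reproduced with a short sketch. It is a minimal illustration rather than a reference implementation: the names `Node`, `child_index`, `split_child`, `insert`, and `levels` are hypothetical, keys are assumed to be single-character strings, 4-nodes are split on the way down, and equal values go right.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # 1 to 3 sorted keys
        self.children = children or []  # empty for a leaf


def child_index(node, value):
    """Index of the child (or leaf slot) for value; equal values go right."""
    i = 0
    while i < len(node.keys) and value >= node.keys[i]:
        i += 1
    return i


def split_child(parent, i):
    """Split the 4-node parent.children[i], moving its middle key up into parent."""
    full = parent.children[i]
    left = Node(full.keys[:1], full.children[:2])
    right = Node(full.keys[2:], full.children[2:])
    parent.keys.insert(i, full.keys[1])
    parent.children[i:i + 1] = [left, right]


def insert(root, value):
    """Insert value top-down and return the (possibly new) root."""
    if root is None:
        return Node([value])
    if len(root.keys) == 3:
        # Splitting the root is the only step that adds a level.
        root = Node([], [root])
        split_child(root, 0)
    node = root
    while node.children:
        i = child_index(node, value)
        if len(node.children[i].keys) == 3:
            split_child(node, i)
            i = child_index(node, value)  # the moved-up key may change the path
        node = node.children[i]
    # node is now a leaf with at most 2 keys, so there is room for the value.
    node.keys.insert(child_index(node, value), value)
    return root


def levels(root):
    """Return the keys of each node, level by level."""
    out = []
    row = [root] if root is not None else []
    while row:
        out.append(["".join(n.keys) for n in row])
        row = [c for n in row for c in n.children]
    return out


root = None
for letter in "ASEARCHINGEXAMPLE":
    root = insert(root, letter)
print(levels(root))
# [['I'], ['AEG', 'NR'], ['A', 'AC', 'EE', 'H', 'LM', 'P', 'SX']]
```

Because every 4-node on the path is split before descending into it, the parent of a node being split is never a 4-node itself, so the middle value always has room to move up.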
Why the Tree Remains Balanced
- Height only grows when the root is split
- Thus the depth of every leaf increases at the same time
- Thus, tree remains balanced!
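How short a balanced tree stays can be made concrete with a bound that is not on the slides but follows from the definition. Assume a tree with n values and h levels, all leaves on level h. The tree holds the fewest values when every node is a 2-node and the most when every node is a 4-node:

```latex
2^{h} - 1 \;\le\; n \;\le\; 4^{h} - 1
\quad\Longrightarrow\quad
\log_{4}(n+1) \;\le\; h \;\le\; \log_{2}(n+1)
```

So a search or a top-down insertion visits a number of nodes that grows only logarithmically with n.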
2-3-4 Tree Implementation
- Having a variable number of values and pointers per node complicates
implementation
- Red-Black Trees are a binary tree implementation of 2-3-4 Trees
- Every node has a color: red or black
- A red node is part of its parent in the equivalent 2-3-4 tree
- Red-black trees allow a simpler data structure (ordinary binary nodes), but the
implementation is more complicated
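The correspondence between the two structures can be sketched as follows: a black node together with its red children forms one 2-3-4 node. The names `RBNode` and `keys_of_234_node` are hypothetical, and the sketch only reads an existing red-black tree; it does not implement red-black insertion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RBNode:
    key: str
    red: bool = False  # False means black
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def keys_of_234_node(black):
    """Keys that a black node and its red children contribute to one 2-3-4 node."""
    keys = []
    if black.left is not None and black.left.red:
        keys.append(black.left.key)
    keys.append(black.key)
    if black.right is not None and black.right.red:
        keys.append(black.right.key)
    return keys


# A black I with red children E and R plays the role of the 4-node EIR.
four_node = RBNode("I", left=RBNode("E", red=True), right=RBNode("R", red=True))
print(keys_of_234_node(four_node))  # ['E', 'I', 'R']
```

A black node with one red child corresponds to a 3-node, and a black node with no red children to a 2-node.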
2-3-4 Trees and Other Balanced Trees
- We looked at a top-down tree (splits happen on the way down); bottom-up trees also exist
- Can also create 2-3 trees
- 2-3-4 Trees are a kind of B-Tree
- B-Trees are used in storing databases on disk