- 1 Introduction
- 2 Multiway Search Trees
- 3 Searching
- 4 Secondary Structure
- 5 General Performance Analysis
- 6 Definition of a (2,4)-tree
- 7 Insertion
- 8 Deletion
- 9 Interactive Java Applet

We begin by examining general multiway search trees which we then specialize to obtain certain desired properties.

[Top] [Bottom] [Home]

Given this definition, a binary search tree is a multiway search tree.

*T*is ordered, meaning that the all the elements in subtrees to the left of an item are less than the item itself, and all the elements in subtrees to the right of an item are greater.- Each internal node of
*T*has at least 2 children. - Each
*d*-node (node with*d*children)*v*of*T*, with children*v*_{1},...,*v*stores_{d}*d*-1 items (*k*_{1},*x*_{1}),...,(*k*_{d-1},*x*_{d-1}). Where the*k*'s are keys and_{i}*x*is the element associated with key number_{i}*i*. - External nodes are empty

- For a
*d*-node*v*, compare the sought key with the keys*k*_{1},...,*k*_{d-1}stored at*v*. - If
*k*is found then the search is a SUCCESS. - Otherwise return to step 1 using the child
*v*such that_{i}*k*_{i-1}<=*k*._{i} - If at any point we arrive at an empty (external) node, then the search FAILS.

[Top] [Bottom] [Home]

Why is this?

Note that we have more than one item stored at each node of our tree, how are we going to store

Side note: Bootstrapping

A clever solution to this problem is to use a multiway search tree as the secondary structure! This seems to be a circular argument doesn't it? How can we use a multiway search tree to build a multiway search tree? Circularity is avoided by using a technique known as bootstrapping. We are not going to go over this technique here because (2,4)-trees deal with our problem in a much simpler way, but I welcome the interested reader to read up on bootstrapping techniques.

[Top] [Bottom] [Home]

Hence the search time

- O(
*d*_{max}) (if our secondary structure is list-based) - O(log
*d*_{max}) (if our secondary structure is tree-based)

- O(
*h**d*_{max}),**or** - O(
*h*log*d*_{max})

In the following sections, we will define (2,4)-trees and see how they fare based on these constraints!

[Top] [Bottom] [Home]

- SIZE: every node can have
*no more*than 4 children. - DEPTH: all external nodes have the same depth.

- if follows from the the SIZE property
that the number of items at each node is less than or equal to 4.
Hence
*d*_{max}is constant and our search time is already down to O(*h*)! - luckily, the DEPTH property ensures
that the tree is balanced, but also that the height is restricted
to THETA(log
*n*) (where*n*is the number of nodes in the tree). Click here if you want to see the proof

So we've got to see if we can insert and delete items from this tree efficiently without disturbing these properties. This will be the topic of the next two sections.

[Top] [Bottom] [Home]

- The SIZE and DEPTH depth properties of (2,4)-trees can be maintained upon insertion of a new item.
- The maintenance cost is bounded above by the height of the tree

- perform a SEARCH for
*k*in*T*.

**if**it succeeds then we don't need to INSERT the item and we're done,

**otherwise**(if it fails) then the search terminates at an external node*z*. - let
*v*be the parent node of*z*, insert*k*into the appropriate place in*v*and add a new child*w*to*v*on the left of*z*(see figure 6.2(a)).

Consider what would happen if

We fix this little glitch by SPLITTING

- Split
*v*by replacing it with*v'*(a 3-node that stores keys*k*_{1}and*k*_{2}) and*v"*(a 2-node that stores*k*_{4}). - Store
*k*_{3}in what*was*the parent of*v*(if*v*was the root we create a new node and store it in there). Call that node*u*. - make
*v'*and*v"*the children of*u*. (if*v*was the*i*^{th}child of*u*, then*v'*and*v"*become the*i*^{th}and*(i+1)*^{st}children of*u*).

First observe that this procedure has perfectly taken care of the overflow situation at node

Notice also that this procedure will eventually terminate at a node that doesn't overflow or at the root (where we create a brand new node that obviously doesn't overflow).

Finally, you may have spotted the fact that the we change the depth of the tree when we create a new root. However this doesn't violate our DEPTH property since every node in the tree's depth increases by 1 (hence the tree stays balanced).

- We began with a search procedure which take O(log
*n*) time. - Next we insert the key into
*v*in O(1) time. - Finally, we had a maximum of O(
*h*) split operations to maintain the SIZE property. As we showed in section 6.6,*h*is THETA(log*n*). Hence we have at worst case O(log*n*) splits. - Each split affects a constant number of items in a constant number
of nodes of
*T*and hence takes O(1) time.

[Top] [Bottom] [Home]

- perform a SEARCH for
*k*in*T*,

**if it fails then we don't need to DELETE the item so we exit,**

**otherwise**(if it succeds) then we find*k*in a node*v*with only external children (by our assumption). - now we simply remove
*k*from*v*and delete the external node child to the left of*k*(see figure 6.4(a)).

This is because (as shown in figure 6.4(b))

Let

*v*has a sibling*w*that is a 3-node or a 4-node.

In this case we perform a TRANSFER operation as follows (figure 6.5):- Move a child of
*w*to*v*. - Move a key from
*w*to*u*. - Move a key from
*u*to*v*.

- It adds a child to
*v*and removes one from*w*(thereby making*v*a 2-node and*w*a 2 or 3-node, resolving the underflow at*v*). - But now we must transfer the corresponding keys without
destroying the ordered nature of the tree. We do this by pushing
the extra key from
*w*through the appropriate key in the parent*u*, and finally into*v*. - The net effect is that the number of keys in
*w*has been reduced by 1, and the number of keys in*v*has been increased by 1 as desired!

**Figure 6.5:**deletion of key 4 resulting in an underflow at*v*. (a) transfer operation, (b) the resulting tree after the transfer.

- Move a child of
*v*has no such siblings (i.e., they are all 2-nodes).

This case requires a FUSION of two nodes as follows (figure 6.6):- Merge
*v*with a 2-node sibling*w*creating a new node*v'*. - Move a key from
*u*(*v*'s parent) to*v'*.

*v'*has 3 children, but stores only one key, and*u*has lost a child (since two of it's children merged into one), hence we move a key from*u*to*v'*to preserve the tree.

- Merge

Observe also that this process also terminates at the root (an underflow at the root simply causes its deletion), and is hence bounded above by O(log

Almost done! However, remember that everything above only applies if the key we're trying to remove is stored at a node with only external nodes for children (if you don't remember making this assumption click here)! What do we do if this is not the case? Well, there just so happens to be a very easy way to swap any key in a (2,4)-tree with one that is stored in a node with the property we desire without destroying the ordering in the tree. This is done as follows:

Swap the internal key

A natural question to ask would be "how do we find this element?" This is accomplished by performing the following steps:

- Let
*v*be the internal node in which the element we wish to delete (*k*_{i}) is stored. - Let
*w*be the right-most internal node in the subtree rooted at the*i*^{th}child of*v*. - Swap
*k*_{i}with the last item of*w*.

- We began with a search procedure which takes O(log
*n*) time. - This may be followed with a swap which also takes O(log
*n*). - Finally, we had a maximum total of O(log
*n*) TRANSFER or FUSION operations (each of which takes constant time).

[Top] [Bottom] [Home]

- Netscape 4 or above is recommended (otherwise the animation stinks)
- Input a number between 1 and 99 in the text box at the top and click on either the "Insert", "Delete", or "Search" button to perform the corresponding function with that number.
- The speed of the animation can be adjusted in using the menu at the top right (selecting "Instant" will disregard any animation and simply display the tree after the function has been completed).
- An overflow is indicated in red.
- An underflow is indicated in light blue.
- When elements are moved from one node to another, they are colored differently (in green or red) so that you can track their movements.
- When entire subtrees are moved, they are highlighted in light blue.
- Although the applet allows for as many as a hundred elements to be input, the screen gets pretty crowded beyond 50, and nodes will start to overlap. The reason for allowing this many numbers is to allow flexibility in your choice of where you would like an element to be placed.
- Here is a link to a similar applet using AVL trees instead of (2,4)-trees. Unfortunately, there is no documentation about AVL trees at this site, just the applet. However, this applet has many more features than mine, and is better animated.

[Top] [Bottom] [Home]

Sam Bakhtiar SANJABI Last modified: Tue Apr 25 08:30:10 EDT 2000