1. Balanced Search Trees
- (a) I'm not going to write out every step, and I apologize
for my unevocative ASCII art, but you should end up with
a tree like this:
-
___10 22____
/ | \
1 2 5 12 16 18 30 45 50
- (b) One ordering which produces a different tree is to
insert the numbers in increasing order,
(1, 2, 5, ..., 50), producing the following tree:
-
_____5 16 30_____
/ | \ \
1 2 10 12 18 22 45 50
- (c) By the depth property of a (2,4)-tree, all
external nodes have the same height h. The number of external
nodes in T is minimized when every node has 2 children (the
minimum number of children per node by the size
property), and maximized when every node has 4 children
(the maximum number of children per node by the size
property).
The minimum and maximum numbers of external nodes in a (2,4)-tree
are hence 2h and 4h respectively.
Because T is a tree, we know that it has exactly n+1 external
nodes. This yields the following inequalities:
The first inequality gives us
, and the second yields
, hence
-
Therefore, h is
.
2. Element Uniqueness
Designing the algorithm is easy. Find smax,
the largest number in S. This takes a single run through
the elements: O(n). We can now pick
smax + 1, a number larger than any
element of S, which therefore must not itself be in S.
The proof is also pretty easy -- if you understand what
you're supposed to do. Proof by contradiction. Suppose there
exists an algorithm which runs in less than
time and produces a number guaranteed not to be in S.
This algorithm must perform less than
comparisons in general, since if it did that many comparisons
it would be
. But each comparison only involves two elements of S.
So in performing less than
comparisons, the algorithm must be examining less than n
of the elements. Thus, there must be at least one element,
si, which it never compares against
at all. But clearly this supposed algorithm can't guarantee
that the number it produces is not equal to si,
if it never compares anything with si.
Thus no correct algorithm can run faster than
in the worst case.
3. Non-collinearity of Points
This was a tough one.
The idea is to first calculate the maximum (finite) slope,
mmax, of any line through two points
in S. Once we have mmax, we
can construct a point X such that, for any point p
in S, the line
has slope > mmax. This guarantees
that X is not collinear with any pair of points in
S.
Calculating mmax.
This is really the only tricky part. Naively, we would consider
every pair of points in S, and find the pair which
defined the line with the greatest slope. But this of course
would take
time: too slow. To speed this up, we use a clever observation:
if we sort the points by x-coordinate, mmax
must be defined by a pair of adjacent points. Why? Suppose
mmax is defined by pi
and pj, and some other point pi+1has
an x-coordinate somewhere between xi
and xj. Now, suppose pi+1
lies on the line
; then pi and pi+1
define mmax, as required. Conversely,
suppose pi+1 doesn't lie on
. In this case it's not hard to see that either
or
must have greater slope than
-- contradicting the assumption that pi
and pj defined mmax.
It follows that mmax must be defined
by a pair of points with adjacent x-coordinates. So,
we can sort the points by x-coordinates (
), and do a linear search to find the adjacent pair of points
which define mmax, in total time
.
Constructing X. This is easier. Do
linear searches through the points in S to find xmin,
xmax, ymin,
and ymax. This gives us the bounding
box of the n points. Call the upper left corner point
of this bounding box
c = (xmin, ymax).
Draw a line L through c with slope mmax.
Now, let u be the point on L with x-coordinate
xmax + 1. Finally, let X be
the point (xu, yu
+ 1).
Proving X is non-collinear. For any
point p in S, consider the slope mp,X
of the line
. Since c is at the upper left of all the points,
and X lies above and to the right of all the points,
. Also, since X lies directly above u,
mc,X > mc,u.
But we constructed u so that
mc,u = mmax.
Therefore,
mp,X > mmax,
as desired. And, since no line through a pair of points in
S has slope > mmax, it
follows that no pair of points in S is collinear with
X.
4. Maximal Points of a Set
This took some creativity, but really wasn't hard once you understood
what a ``maximal'' point was. Our algorithm must find all points
p such that no other point q in Sdominates
p (ie, satisfies xq > xp
and yq > yp).
(Note that q only dominates p if both
its x- and y-coordinates are greater than p's,
not if only one coordinate is greater! In other words, p
is maximal iff the quadrant above it and to its right is empty.
Many students were confused about this.) The complexity constraint
is again
, so presorting is an option.
The idea is to sort the points by x-coordinate and
go through them from right to left, keeping track as we go
of ymax, the greatest y-coordinate
seen so far. When considering a point p, there are
two cases: either
, or
ymax > yp.
If
, no point to the right of p also lies above
it. Thus no point dominates p, and therefore p
is a maximal point. On the other hand, if
ymax > yp,
there exists some point already seen (ie, to the right of
p) with greater y-coordinate. So this point
dominates p, and p is not a maximal point. (There
is a small subtlety here in the case where multiple points
have the same x-coordinate, but it's not hard to deal
with by presorting the points appropriately.)
In short, by presorting and then traversing the points once
from right to left, we can conclude whether each point p
is maximal as we traverse it. The running time for the whole
algorithm is
for the presorting + O(n) for the linear traversal
=
. (Of course, by symmetry of x and y, you can
accomplish the same thing by sorting by y-coordinate
instead and traversing keeping track of xmax.)
Sylvain Bouix
2000-10-24