In the algorithm SELECT, the input elements are divided into groups of 5. Will the algorithm work in linear time if they are divided into groups of 7? Argue that SELECT does not run in linear time if groups of 3 are used.
Suppose the input elements are divided into
Suppose
Suppose the input elements are divided into
Suppose
Therefore SELECT does not run in linear time if groups of 3 are used.
Analyze SELECT to show that if
$n \ge 140$ , then at least$\lceil n/4 \rceil$ elements are greater than the median-of-medians$x$ and at least$\lceil n/4 \rceil$ elements are less than$x$ .
Show how quicksort can be made to run in
$O(n \lg n)$ time in the worst case, assuming that all elements are distinct.
Use median as pivot, since we can find median in
Suppose that an algorithm uses only comparisons to find the $i$th smallest element in a set of
$n$ elements. Show that it can also find the$i - 1$ smaller elements and$n-i$ larger elements without performing additional comparisons.
Suppose that you have a "black-box" worst-case linear-time median subroutine. Give a simple, linear-time algorithm that solves the selection problem for an arbitrary order statistic.
def black_box_median(a, p, r):
return sorted(a)[(p + r) // 2]
def partition(a, p, r, x):
i = p - 1
for k in range(p, r - 1):
if a[k] == x:
a[k], a[r - 1] = a[r - 1], a[k]
break
for k in range(p, r - 1):
if a[k] < x:
i += 1
a[i], a[k] = a[k], a[i]
i += 1
a[i], a[r - 1] = a[r - 1], a[i]
return i
def select(a, p, r, i):
if p + 1 == r:
return a[p]
x = black_box_median(a, p, r)
q = partition(a, p, r, x)
k = q - p + 1
if i == k:
return a[q]
if i < k:
return select(a, p, q, i)
return select(a, q + 1, r, i - k)
The $k$th quantiles of an
$n$ -element set are the$k - 1$ order statistics that divide the sorted set into$k$ equal-sized sets (to within 1). Give an$O(n \lg k)$ -time algorithm to list the $k$th quantiles of a set.
Pre-calculate the positions of the quantiles in
def partition(a, p, r):
x = a[r - 1]
i = p - 1
for k in range(p, r - 1):
if a[k] < x:
i += 1
a[i], a[k] = a[k], a[i]
i += 1
a[i], a[r - 1] = a[r - 1], a[i]
return i
def randomized_partition(a, p, r):
x = random.randint(p, r - 1)
a[x], a[r - 1] = a[r - 1], a[x]
return partition(a, p, r)
def randomized_select(a, p, r, i):
while True:
if p + 1 == r:
return p, a[p]
q = randomized_partition(a, p, r)
k = q - p + 1
if i == k:
return q, a[q]
if i < k:
r = q
else:
p = q + 1
i -= k
def k_quantiles_sub(a, p, r, pos, f, e, quantiles):
if f + 1 > e:
return
mid = (f + e) // 2
q, val = randomized_select(a, p, r, pos[mid])
quantiles[mid] = val
k_quantiles_sub(a, p, q, pos, f, mid, quantiles)
k = q - p + 1
for i in xrange(mid + 1, e):
pos[i] -= k
k_quantiles_sub(a, q + 1, r, pos, mid + 1, e, quantiles)
def k_quantiles(a, k):
num = len(a) / k
mod = len(a) % k
pos = [num for _ in xrange(k)]
for i in xrange(mod):
pos[i] += 1
for i in xrange(1, k):
pos[i] += pos[i - 1]
quantiles = [0 for _ in xrange(k)]
k_quantiles_sub(a, 0, len(a), pos, 0, len(pos), quantiles)
return quantiles
Describe an
$O(n)$ -time algorithm that, given a set$S$ of$n$ distinct numbers and a positive integer$k \le n$ , determines the$k$ numbers in$S$ that are closest to the median of$S$ .
Find the median in
def black_box_kth(a, k):
return sorted(a)[k-1]
def black_box_median(a):
return sorted(a)[(len(a) - 1) // 2]
def k_closest(a, k):
median = black_box_median(a)
b = [abs(a[i] - median) for i in xrange(len(a))]
kth = black_box_kth(b, k)
closest = []
for i in xrange(len(a)):
if abs(a[i] - median) < kth:
closest.append(a[i])
for i in xrange(len(a)):
if abs(a[i] - median) == kth:
closest.append(a[i])
if len(closest) >= k:
break
return closest
Let
$X[1\dots n]$ and$Y[1\dots n]$ be two arrays, each containing$n$ numbers already in sorted order. Give an$O(\lg n)$ -time algorithm to find the median of all$2n$ elements in arrays$X$ and$Y$ .
We can find the median in
def median_of_two(a, b):
if len(a) == 1:
return min(a[0], b[0])
mid = (len(a) - 1) // 2
k = mid + 1
if a[mid] <= b[mid]:
return median_of_two(a[-k:], b[:k])
return median_of_two(a[:k], b[-k:])
Professor Olay is consulting for an oil company, which is planning a large pipeline running east to west through an oil field of
$n$ wells. The company wants to connect a spur pipeline from each well directly to the main pipeline along a shortest route (either north or south), as shown in Figure 9.2. Given the$x$ - and$y$ -coordinates of the wells, how should the professor pick the optimal location of the main pipeline, which would be the one that minimizes the total length of the spurs? Show how to determine the optimal location in linear time.
Find the median of