1. What are blocks and why are they used on secondary memory? Why can blocks not be very small or very large? Give an argument to suggest roughly how many bytes a block should hold. (ans)

  2. Given the rotation speed of a disk, , in revolutions per minute, and the recording density, , in bytes per track, give formulas for a) , the transfer time per byte, in microseconds, and b) , the average latency, in milliseconds. (ans)

  3. What are the relationships between N, the number of records in a file, n, the number of blocks in the file, b, the number of records per block, and , the load factor? a) in terms of the others. b) n in terms of the others.

  4. The following sequences of integers are input to the replacement-selection phase of a sort for secondary storage. There is room in RAM for only two integers. What is the average run length in each case? a) 9, 1, 8, 2, 3, 7, 6, 4, 5 b) 8, 7, 6, 5, 4, 3, 2, 1

  5. a) The merge phase of a sort for secondary storage has 1,000,000 initial runs as input, and does 10-way merges. How many passes of the data will be needed to complete the sort? b) If RAM capacity during replacement selection is 10,000 bytes, how many bytes long is the sorted file, assuming the average number of initial runs were generated?

  6. How should a k-way merge be terminated?

  7. Searches can be successful or unsuccessful. Sequential files can be ordered or unordered. For each possible combination, write down the expected number of accesses that will be made to a file of n blocks under uniform usage for a transaction which requests two records.

  8. Homogenous B-trees of order 3 are built from the following sequences of data. What is the load factor of the resulting trees? a) 1, 3, 4, 6, 8, 2, 5, 7 b) 1, 2, 3, 6, 7, 4, 5

  9. Using the formulas for homogenous B-trees of order 3, determine a) the smallest possible height for 8 records and b) the biggest possible height for 7 records.

  10. What are two ways to make B-trees more efficient? Briefly discuss the form taken by the improvement in each case.

  11. Derive expressions without in them for a) and b) (Tedious algebra is not necessary as long as you say what should be done.)

  12. What are two important differences between B-trees and kd-trees?

  13. The points (3,7) and (4,0) are nearest-neighbours in Z-order, but 50 apart in 2-dimensional space. Here is another 1-dimensional ordering of 2-dimensional space, in which several pairs of nearest neighbours are at least 50 apart. a) What are all these pairs? b) Use this to compare briefly the two orderings as ways of representing symmetrical data. Is the second ordering symmetrical?

    The new ordering is given by where r = x + y, , and x and y are the possible values of the two fields, { 0, 1, 2, 3, 4, 5, 6, 7}

    Hint. Draw a picture and neglect q.

  14. A variant of Z-order is based on a kd-trie in which the discriminant cycles through x, then x again, then y (), where x and y are the two fields. Write down the order of the following sixteen points, (0,0), (0,1), (0,2), (0,3), (1,0), (1,1), (1,2), (1,3), (2,0), (2,1), (2,2), (2,3), (3,0), (3,1), (3,2), (3,3) under this ordering, or draw the picture.

  15. A file containing every possible record on fields 0, 1, 2, 3, 4, 5, 6, 7 is queried on the range (3,2)..(5,4). a) If only one direct access, followed by nothing but sequential accesses, is made, how many records will be retrieved? b) If only the 16 records are to be retrieved, how many direct accesses must be made. (As opposed to sequential accesses.) Suppose the blocksize is 2.

  16. The following integer values are hashed using division-remainder to five blocks of secondary storage, of capacity b = 2. What is the probe factor a) optimistically? b) with linear probing?

    280, 531, 427, 329, 191, 350, 346, 189, 313

  17. The integer values from the last question are hashed to seven blocks, still using division-remainder and with b = 2. What is the load factor?

  18. Nine records are inserted into a linearly hashed file, initially empty. If the split criterion is split unless this makes , how many pages will have been allocated for the nine records? (Assume that is calculated using all the records that have been inserted.)

  19. A file of five blocks has been created using division-remainder hashing with R=100 bytes, b=25, and , on a device with access-transfer ratio . a) How would you sort the following search keys to use them in a high-activity search? b) Is this the best way to search this file?

    280, 531, 427, 329, 191, 350, 346, 189, 313

  20. A 40-block file is to be accessed by tidy function on a field, f, which has values from 1 to 1,000,000. A quarter of the records are uniformly distributed between 1 and 500,000; a quarter are uniformly distributed between 500,000 and 707,106; a third quarter are uniform between 707,107 and 866,025; and the final quarter are uniform between 866,026 and 1,000,000.

    a) Describe the tidy function, using mathematical expressions, words, and drawing. (This may help you: 2/2 = 0.7071067, 3/2 = 0.8660254)

    b) What is for this tidy function?

  21. Design static multipaging, in two dimensions and using blocksize b=1, for the following sixteen tuples.

    (±0.20,±0.98),(±0.56,±0.83), (±0.83,±0.56),(±0.98,±0.20)

    (That is, (±0.20,±0.98) means the four tuples, (0.20, 0.98), (0.20, -0.98), (-0.20, 0.98), (-0.20, -0.98), and so on.)

    a) What are n, the number of blocks, , the number of values for the ith attribute, and , the number of segments for the ith attribute?

    b) Draw the tuples and the optimal segment boundaries.

    c) What is pi?

    d) Can this multipaging be improved? How?