I/O-algorithms, Spring 2009
Project 3 -- Theoretical Homework
The goal of this project is to use the theoretical techniques we have discussed in the first part of
the course to design new algorithms and data structures. The project should be done in the same
groups as projects 1 and 2. A report with the solutions is due on Tuesday, May 26, 2009. Remember
to argue for correctness and complexity of each of your solutions. The evaluation of the project
will be part of the final grade.
1. Design an I/O-efficient algorithm for removing duplicates from a multiset of N elements (you
cannot assume that the range of the elements is known). The output of the algorithm should
be the K distinct elements among the N input elements in sorted order, and the algorithm
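should run in
$O\left(\max\left\{\frac{N}{B}\log_{M/B}\frac{N}{B} - \sum_{i=1}^{K}\frac{N_i}{B}\log_{M/B} N_i,\ \frac{N}{B}\right\}\right)$
I/Os, where $N_i$ is the number of copies of the i'th distinct element in the input multiset.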
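As a sanity check on the target bound, it may help to evaluate it in the two extreme cases. The block below is only a sketch: it assumes the bound is read as the maximum of N/B and the sorting term minus a sum over the K distinct elements, and it assumes N is at least M so that the logarithm of N/B is at least 1.

```latex
% Case 1: all elements distinct, so K = N and N_i = 1 for every i.
% Each \log_{M/B} N_i = 0, so the sum vanishes and the sorting bound remains:
\max\left\{ \frac{N}{B}\log_{M/B}\frac{N}{B} - 0,\ \frac{N}{B} \right\}
  = \frac{N}{B}\log_{M/B}\frac{N}{B}

% Case 2: all elements identical, so K = 1 and N_1 = N.
% The first term becomes non-positive, so the scanning bound remains:
\frac{N}{B}\log_{M/B}\frac{N}{B} - \frac{N}{B}\log_{M/B} N
  = -\frac{N}{B}\log_{M/B} B \le 0
\quad\Longrightarrow\quad
\max\left\{ -\frac{N}{B}\log_{M/B} B,\ \frac{N}{B} \right\} = \frac{N}{B}
```

Under this reading, the bound moves between the cost of sorting (no duplicates) and the cost of a single scan (one distinct element), which is the behaviour a correct solution should exhibit.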
(Hint: Use merge-sort and remove duplicates as soon as they are found. Analyze the algorithm