600.466 - Final Project
Due: Thursday, May 11, 2000 at 5 PM (there can be no extensions)

The final project in this class provides an opportunity for you to apply the
information retrieval and classification techniques you have learned in this class
to an interesting application of your choice.
Students have wide flexibility in selecting a project description, and many of you have
already chosen projects in consultation with the instructor.
For those of you still looking for project options, below are several possible project
descriptions that will give you an idea of the focus, scope and complexity expected of final
projects. However, you are very strongly encouraged to do a project on a topic other than
Option 1, and you must obtain specific approval in advance to do Options 1 or 2.
Option 1: FriendFinder - a web Robot/Agent
Johns Hopkins can be a lonely place. But there are potentially hundreds of students here
and at other area universities who share many of your interests. Build a web robot
to find them by locating and analyzing all personal home pages at JHU.
The intuition behind this assignment is that if you take a collection of personal home
pages of yourself, your friends and others who share your interests, convert them to a
vector representation (as in HW2 and 3), and compute a centroid or profile v_F, and then
take a collection of home pages of people who do not seem to share your interests and
compute their vector centroid v_NF, new home pages can be classified simply by finding
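
A minimal sketch of the centroid idea, under stated assumptions: the text does not
say how a new page is compared against v_F and v_NF, so this example assumes cosine
similarity and assigns the page to whichever centroid scores higher. The raw
term-frequency vectors and the function names (to_vector, centroid, cosine, classify)
are hypothetical stand-ins, not the representation actually used in HW2 and 3.

```python
import math
from collections import Counter


def to_vector(text):
    """Raw term-frequency vector; a stand-in for the HW2/HW3 representation."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average a non-empty list of sparse term vectors into one profile."""
    if not vectors:
        raise ValueError("need at least one vector to compute a centroid")
    total = Counter()
    for v in vectors:
        total.update(v)  # adds term weights together
    return {term: weight / len(vectors) for term, weight in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(page_text, v_f, v_nf):
    """Label a page by the closer centroid; ties go to 'non-friend'."""
    v = to_vector(page_text)
    return "friend" if cosine(v, v_f) > cosine(v, v_nf) else "non-friend"
```

In use, v_F would be built from the pages of people who share your interests and
v_NF from the pages of people who do not, and each newly crawled home page would be
passed to classify along with both centroids.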


Source: Amir, Yair - Department of Computer Science, Johns Hopkins University

