Summary: A Kosher Source of Ham
Lyryx Learning, Inc.
210 - 1422 Kensington Road NW
Calgary, Alberta, Canada T2N 3P9
Department of Computer Science
University of Calgary
2500 University Drive NW
Calgary, Alberta, Canada T2N 1N4
(Appeared in MIT Spam Conference, 2009)
Testing content-based anti-spam systems requires a plentiful source of both
spam and ham. We examine the viability of Usenet postings as a ham source.
While Usenet postings have been used before for this purpose, we refine the idea
and show empirically that it is the text of Usenet replies that provides the best cut
Measuring the efficacy of spam filters and comparing different spam filters is an
important task. Filters cannot be methodically improved without measurement; informed