To obtain the content summary of a database, a metasearcher could rely on the database to supply the summary (e.g., by following a protocol like STARTS [12], or possibly using Semantic Web [1] tags in the future). Unfortunately, many web-accessible text databases are completely autonomous and do not report any detailed metadata about their contents to facilitate metasearching. To handle such databases, a metasearcher could rely on manually generated descriptions of the database contents. Such an approach would not scale to the thousands of text databases available on the web [2], and would likely not produce the good-quality, fine-grained content summaries required by database selection algorithms. In this paper, we present.