Measuring semantic similarity between words using page counts and snippets