The problem breaks down into two parts: defining the clusters (i.e., a list of members for each family) and building multiple alignments of the members. Previous approaches to constructing comprehensive family databases have concentrated either on aligning short conserved regions, 6–8 often starting from the manually constructed clusters in Prosite, 9 or on full domain alignments, using clusters that were derived either manually from PIR 2 or automatically. 10 An issue here is whether to aim for conserved regions only or for whole domain alignments. Short conserved motifs, in the form of either a pattern or an alignment, can indicate when a protein contains a known domain. Motif matches are often useful for indicating functional sites. However,.