Open Access Research

Comparative genomics of gene-family size in closely related bacteria

Ravindra Pushker, Alex Mira and Francisco Rodríguez-Valera

Address: Evolutionary Genomics Group, Universidad Miguel Hernández, Campus de San Juan, Apartado 18, 03550 San Juan de Alicante, Alicante, Spain.

Correspondence: Alex Mira.

Published: 18 March 2004
Genome Biology 2004, 5:R27
Received: 12 December 2003
Revised: 23 January 2004
Accepted: 6 February 2004

The electronic version of this article is the complete one and can be found online.

© 2004 Pushker et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.

Abstract

Background: The wealth of genomic data in bacteria is helping microbiologists understand the factors involved in gene innovation. Among these, the expansion and reduction of gene families appear to play a fundamental role, but the factors influencing gene-family size are unclear.

Results: The relative content of paralogous genes in bacterial genomes increases with genome size, largely due to the expansion of gene-family size in large genomes.
Bacteria undergoing genome reduction display a parallel process of redundancy elimination, by which gene families are reduced to one or a few members. Gene-family size is also influenced by sequence divergence and physiological function. Large gene families show wider sequence divergence, suggesting they are probably older, and certain functions (such as metabolite transport mechanisms) are overrepresented in large families. The size of a given gene family is remarkably similar in strains of the same species and in closely related species, suggesting that homologous gene families are vertically transmitted and depend little on horizontal gene transfer (HGT).

Conclusions: The remarkable preservation of copy