Algorithms for Molecular Biology
BioMed Central

Research (Open Access)

Evaluating deterministic motif significance measures in protein databases

Pedro Gabriel Ferreira and Paulo J Azevedo

Address: Department of Informatics, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal

Email: Pedro Gabriel Ferreira - pedrogabriel@; Paulo J Azevedo - pja@

Published: 24 December 2007
Received: 15 May 2007
Accepted: 24 December 2007

Algorithms for Molecular Biology 2007, 2:16 doi 1748-7188-2-16

This article is available from: http content 2 1 16

© 2007 Ferreira and Azevedo; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http licenses by), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background
Assessing the outcome of motif mining algorithms is an essential task, as the number of reported motifs can be very large. Significance measures play a central role in automatically ranking those motifs and therefore in reducing the analysis work. Spotting the most interesting and relevant motifs then depends on choosing the right measures. The combined use of several measures may provide more robust results. However, caution has to be taken in order to avoid spurious evaluations.

Results
From the set of conducted experiments, it was verified that several of the selected significance measures behave very similarly in a wide range of situations, therefore providing redundant information. Some measures proved more appropriate for ranking highly conserved motifs, while others are more appropriate for weakly conserved ones.
Support appears to be a very important feature to consider for correct motif ranking. We observed that not all the measures are suitable for situations with poorly balanced class information, like for .