References
- Fitch, W.M. (1970). Distinguishing homologous from analogous proteins. Syst Zool 19, 99-113. https://doi.org/10.2307/2412448
- Wu, C.H., Huang, H., Yeh, L.S., and Barker, W.C. (2003). Protein family classification and functional annotation. Comput Biol Chem 27, 37-47. https://doi.org/10.1016/S1476-9271(02)00098-1
- Goodstadt, L., and Ponting, C.P. (2006). Phylogenetic reconstruction of orthology, paralogy, and conserved synteny for dog and human. PLoS Comput Biol 2, e133. https://doi.org/10.1371/journal.pcbi.0020133
- Clamp, M., Fry, B., Kamal, M., Xie, X., Cuff, J., Lin, M.F., Kellis, M., Lindblad-Toh, K., and Lander, E.S. (2007). Distinguishing protein-coding and noncoding genes in the human genome. Proc Natl Acad Sci U S A 104, 19428-19433. https://doi.org/10.1073/pnas.0709013104
- Redfern, O., Grant, A., Maibaum, M., and Orengo, C. (2005). Survey of current protein family databases and their application in comparative, structural and functional genomics. J Chromatogr B Analyt Technol Biomed Life Sci 815, 97-107. https://doi.org/10.1016/j.jchromb.2004.11.010
- Smith, T.F., and Waterman, M.S. (1981). Identification of common molecular subsequences. J Mol Biol 147, 195-197. https://doi.org/10.1016/0022-2836(81)90087-5
- Lipman, D.J., and Pearson, W.R. (1985). Rapid and sensitive protein similarity searches. Science 227, 1435-1441. https://doi.org/10.1126/science.2983426
- Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J Mol Biol 215, 403-410. https://doi.org/10.1016/S0022-2836(05)80360-2
- Itoh, M., Goto, S., Akutsu, T., and Kanehisa, M. (2005). Fast and accurate database homology search using upper bounds of local alignment scores. Bioinformatics 21, 912-921. https://doi.org/10.1093/bioinformatics/bti076
- Moreno-Hagelsieb, G., and Latimer, K. (2008). Choosing BLAST options for better detection of orthologs as reciprocal best hits. Bioinformatics 24, 319-324. https://doi.org/10.1093/bioinformatics/btm585
- Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673-4680. https://doi.org/10.1093/nar/22.22.4673
- Jones, C.D., Custer, A.W., and Begun, D.J. (2005). Origin and evolution of a chimeric fusion gene in Drosophila subobscura, D. madeirensis and D. guanche. Genetics 170, 207-219. https://doi.org/10.1534/genetics.104.037283
- Sayah, D.M., Sokolskaja, E., Berthoux, L., and Luban, J. (2004). Cyclophilin A retrotransposition into TRIM5 explains owl monkey resistance to HIV-1. Nature 430, 569-573. https://doi.org/10.1038/nature02777
- Long, M., Betran, E., Thornton, K., and Wang, W. (2003). The origin of new genes: glimpses from the young and old. Nat Rev Genet 4, 865-875.
- Fumasoni, I., Meani, N., Rambaldi, D., Scafetta, G., Alcalay, M., and Ciccarelli, F.D. (2007). Family expansion and gene rearrangements contributed to the functional specialization of PRDM genes in vertebrates. BMC Evol Biol 7, 187. https://doi.org/10.1186/1471-2148-7-187
- Ben-Shlomo, I., Yu Hsu, S., Rauch, R., Kowalski, H.W., and Hsueh, A.J. (2003). Signaling receptome: a genomic and evolutionary perspective of plasma membrane receptors involved in signal transduction. Sci STKE 2003, RE9.
- Alexeyenko, A., Tamas, I., Liu, G., and Sonnhammer, E.L. (2006). Automatic clustering of orthologs and inparalogs shared by multiple proteomes. Bioinformatics 22, e9-15. https://doi.org/10.1093/bioinformatics/btl213
- Tian, W., and Skolnick, J. (2003). How well is enzyme function conserved as a function of pairwise sequence identity? J Mol Biol 333, 863-882. https://doi.org/10.1016/j.jmb.2003.08.057
- Hegyi, H., and Gerstein, M. (1999). The relationship between protein structure and function: a comprehensive survey with application to the yeast genome. J Mol Biol 288, 147-164. https://doi.org/10.1006/jmbi.1999.2661
- Rost, B. (1999). Twilight zone of protein sequence alignments. Protein Eng 12, 85-94. https://doi.org/10.1093/protein/12.2.85
- Hochreiter, S., Heusel, M., and Obermayer, K. (2007). Fast model-based protein homology detection without alignment. Bioinformatics 23, 1728-1736. https://doi.org/10.1093/bioinformatics/btm247
- Ben-Hur, A., and Brutlag, D. (2003). Remote homology detection: a motif based approach. Bioinformatics 19 Suppl 1, i26-33. https://doi.org/10.1093/bioinformatics/btg1002
- Tong, A.H., Drees, B., Nardelli, G., Bader, G.D., Brannetti, B., Castagnoli, L., Evangelista, M., Ferracuti, S., Nelson, B., Paoluzi, S., et al. (2002). A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules. Science 295, 321-324. https://doi.org/10.1126/science.1064987
- Kunik, V., Meroz, Y., Solan, Z., Sandbank, B., Weingart, U., Ruppin, E., and Horn, D. (2007). Functional representation of enzymes by specific peptides. PLoS Comput Biol 3, e167. https://doi.org/10.1371/journal.pcbi.0030167
- Li, W., and Godzik, A. (2006). Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658-1659. https://doi.org/10.1093/bioinformatics/btl158
- Remm, M., and Sonnhammer, E. (2000). Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res 10, 1679-1689. https://doi.org/10.1101/gr.GR-1491R
- Kamachi, Y., Cheah, K.S., and Kondoh, H. (1999). Mechanism of regulatory target selection by the SOX high-mobility-group domain proteins as revealed by comparison of SOX1/2/3 and SOX9. Mol Cell Biol 19, 107-120.
- Hurst, L.D., Pal, C., and Lercher, M.J. (2004). The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet 5, 299-310.
- Ogul, H., and Mumcuoglu, E.U. (2007). A discriminative method for remote homology detection based on n-peptide compositions with reduced amino acid alphabets. Biosystems 87, 75-81. https://doi.org/10.1016/j.biosystems.2006.03.006
- Janin, J. (1979). Surface and inside volumes in globular proteins. Nature 277, 491-492. https://doi.org/10.1038/277491a0
- Wolfenden, R., Andersson, L., Cullis, P.M., and Southgate, C.C. (1981). Affinities of amino acid side chains for solvent water. Biochemistry 20, 849-855. https://doi.org/10.1021/bi00507a030
- Kyte, J., and Doolittle, R.F. (1982). A simple method for displaying the hydropathic character of a protein. J Mol Biol 157, 105-132. https://doi.org/10.1016/0022-2836(82)90515-0
- Rose, G.D., Geselowitz, A.R., Lesser, G.J., Lee, R.H., and Zehfus, M.H. (1985). Hydrophobicity of amino acid residues in globular proteins. Science 229, 834-838. https://doi.org/10.1126/science.4023714
- Massey, K.A., Blakeslee, C.H., and Pitkow, H.S. (1998). A review of physiological and metabolic effects of essential amino acids. Amino Acids 14, 271-300. https://doi.org/10.1007/BF01318848
- Karplus, P.A. (1997). Hydrophobicity regained. Protein Sci 6, 1302-1307. https://doi.org/10.1002/pro.5560060618
- Windholz, M. (1984). The Merck Index Online. Science 226, 1250.
- Seo, J., Gordish-Dressman, H., and Hoffman, E.P. (2006). An interactive power analysis tool for microarray hypothesis testing and generation. Bioinformatics 22, 808-814. https://doi.org/10.1093/bioinformatics/btk052
- Edwards, A.W. (1969). Statistical methods in scientific inference. Nature 222, 1233-1237. https://doi.org/10.1038/2221233a0
- Whittaker, E.T., and Robinson, G. (1967). The calculus of observations; an introduction to numerical analysis, 4th edition (New York: Dover Publications).
- Remm, M., Storm, C.E., and Sonnhammer, E.L. (2001). Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol 314, 1041-1052. https://doi.org/10.1006/jmbi.2000.5197
- Kanehisa, M. (2002). The KEGG database. Novartis Found Symp 247, 91-101; discussion 101-103, 119-128, 244-252.
- Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. (2004). The KEGG resource for deciphering the genome. Nucleic Acids Res 32, D277-280. https://doi.org/10.1093/nar/gkh063
- Murzin, A.G., Brenner, S.E., Hubbard, T., and Chothia, C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247, 536-540.
- Andreeva, A., Howorth, D., Chandonia, J.M., Brenner, S.E., Hubbard, T.J., Chothia, C., and Murzin, A.G. (2008). Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res 36, D419-425.