Enterovirus

Project is already public.

Enterovirus belongs to the Picornaviridae family and is one of the five featured viruses in the VIPR database (www.viprbrc.org). To detect positively selected amino acid sites (for instance, sites visible to the immune system) in the 11 mature proteins, all available nucleotide sequences for each gene were downloaded from the VIPR database, namely 1337, 1337, 1337, 1337, 2587, 2587, 2584, 2587, 1102, 2588 and 8 for genes VP4, VP2, VP3, VP1, 2A, 2B, 2C, 3A, VPg, 3Cpro and RdRp, respectively.

Two protocols for sequence filtering, namely the removal of identical nucleotide sequences and the removal of identical amino acid sequences, were implemented separately. In the first protocol (N prefix), after removing identical nucleotide sequences, as well as sequences with ambiguous nucleotides or in-frame stop codons, we ended up with 686, 941, 912, 1004, 1895, 1708, 2064, 1639, 581, 1913 and 8 sequences for genes VP4, VP2, VP3, VP1, 2A, 2B, 2C, 3A, VPg, 3Cpro and RdRp, respectively. In the second protocol (A prefix), after removing identical amino acid sequences, as well as untranslated nucleotide sequences with ambiguous nucleotides or in-frame stop codons, we ended up with 194, 471, 489, 654, 1028, 705, 1074, 730, 165, 929 and 8 sequences for genes VP4, VP2, VP3, VP1, 2A, 2B, 2C, 3A, VPg, 3Cpro and RdRp, respectively.

The phi test for recombination, as implemented in SplitsTree, was used to look for evidence of recombination in these datasets, since recombination could affect the results by creating false positively selected amino acid sites when using codeML. In both protocols, the phi test did not find statistically significant evidence for recombination (P>0.05) for genes VP2, VP3, VP1, 2A, 2B, 2C, 3A, 3Cpro and RdRp, and did find evidence for recombination (P<0.05) for gene VP4. Also in both protocols, the VPg gene sequences had too few informative characters and the phi test could not be used.
Even so, the approach used for genes with evidence for recombination (P<0.05) was chosen for this gene.

When more than 100 sequences were available for a given gene and no evidence for recombination was found, five datasets of 50 randomly selected sequences were analyzed. In this case, sequences were aligned using Muscle, phylogenies were inferred using MrBayes, and positively selected amino acid sites were inferred using codeML, as implemented in ADOPS. When more than 100 sequences were available for a given gene and either evidence for recombination was found or there were too few informative characters for the phi test, five datasets of 50 randomly selected sequences were analyzed using OmegaMap, after aligning the sequences with Muscle and inferring a phylogeny with MrBayes, as implemented in ADOPS. In this case, the details of the OmegaMap run are shown in the Notes tab of the corresponding ADOPS project, but positively selected amino acid sites can still be viewed in the PSS tab. As usual, the details of every project can be checked by opening the other tabs.
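
The two filtering protocols, the random subsampling and the choice between codeML and OmegaMap described above can be sketched in Python as follows. This is a minimal illustration and not the code used for this project. The function names (filter_sequences, subsample, choose_method), the input format (a dict mapping sequence names to nucleotide strings), the order of the filtering steps, the use of the standard genetic code, the treatment of any stop codon (including a terminal one) as in-frame, and the fixed random seed are all assumptions. The phi test itself is not reimplemented: its P-value is taken as an input, with None standing for the case where the test could not be run.

```python
import random

# Standard genetic code (NCBI table 1), built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(nt):
    """Translate an unambiguous, in-frame DNA string; a trailing partial codon is ignored."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def is_clean(nt):
    """True if the sequence has only A, C, G, T and no stop codon in frame 1."""
    nt = nt.upper()
    if set(nt) - set("ACGT"):
        return False  # ambiguous nucleotide (N, R, Y, ...) or other symbol
    return "*" not in translate(nt)  # assumption: a terminal stop also counts


def filter_sequences(records, protocol):
    """Apply protocol 'N' (unique nucleotide) or 'A' (unique amino acid).

    Assumption: invalid sequences are dropped first, then the first
    occurrence of each distinct sequence is kept.
    """
    if protocol not in ("N", "A"):
        raise ValueError("protocol must be 'N' or 'A'")
    seen, kept = set(), {}
    for name, nt in records.items():
        nt = nt.upper()
        if not is_clean(nt):
            continue
        key = nt if protocol == "N" else translate(nt)
        if key in seen:
            continue
        seen.add(key)
        kept[name] = nt
    return kept


def subsample(records, n_datasets=5, size=50, seed=0):
    """Draw n_datasets independent random subsets of `size` sequences each.

    Assumption: the draws are independent, so a sequence may appear in
    more than one dataset. Raises ValueError if fewer than `size` exist.
    """
    rng = random.Random(seed)
    names = sorted(records)
    return [{n: records[n] for n in rng.sample(names, size)}
            for _ in range(n_datasets)]


def choose_method(phi_p, alpha=0.05):
    """OmegaMap if recombination is detected or the phi test could not be run."""
    if phi_p is None or phi_p < alpha:
        return "OmegaMap"
    return "codeML"


if __name__ == "__main__":
    toy = {"s1": "ATGGCT", "s2": "ATGGCT", "s3": "ATGGCC",
           "s4": "ATGNCT", "s5": "ATGTAA", "s6": "ATGGAT"}
    print(sorted(filter_sequences(toy, "N")))  # unique nucleotide sequences
    print(sorted(filter_sequences(toy, "A")))  # unique amino acid sequences
    print(choose_method(None), choose_method(0.2))
```

In the toy input, s2 duplicates s1 at the nucleotide level, s3 differs from s1 only by a synonymous change, s4 contains an ambiguous nucleotide and s5 contains a stop codon, so the A protocol keeps fewer sequences than the N protocol, as in the counts reported above.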