Human Immunodeficiency Virus 2 (HIV-2)

Project is already public.

The Human Immunodeficiency Virus 2 belongs to the Retroviridae family and is one of the two featured viruses in the HIV database (https://www.hiv.lanl.gov). To detect positively selected amino acid sites (for instance, sites visible to the immune system) in the nine mature proteins, all available nucleotide sequences for each gene were downloaded from the HIV database: 128, 111, 142, 110, 109, 109, 168, 112 and 112 sequences for genes ENV, GAG, NEF, POL, REV, TAT, VIF, VPR and VPX, respectively. The sequences were available for download only as aligned files, which were first processed to produce FASTA format files for further analysis.

Two sequence-filtering protocols were then implemented separately: the removal of identical nucleotide sequences and the removal of identical amino acid sequences. In the first protocol (N prefix), identical nucleotide sequences were removed, as well as sequences with ambiguous nucleotides and sequences presenting in-frame stop codons, leaving 105, 76, 86, 70, 77, 70, 91, 62 and 84 sequences for genes ENV, GAG, NEF, POL, REV, TAT, VIF, VPR and VPX, respectively. In the second protocol (A prefix), identical amino acid sequences were removed, as well as the untranslated sequences with ambiguous nucleotides and those presenting in-frame stop codons, leaving 105, 74, 85, 67, 73, 64, 85, 58 and 72 sequences for genes ENV, GAG, NEF, POL, REV, TAT, VIF, VPR and VPX, respectively.

The phi test for recombination, as implemented in SplitsTree, was used to look for evidence of recombination in these datasets, since recombination could affect the results by creating false positively selected amino acid sites when using codeML. In both protocols, the phi test found no statistically significant evidence for recombination (P>0.05) for genes ENV, GAG, NEF, REV, TAT, VIF, VPR and VPX, but did find evidence for recombination (P<0.05) for gene POL.
When no evidence for recombination was found, the dataset sequences were aligned using Muscle, phylogenies were inferred using MrBayes, and positively selected amino acid sites were inferred using codeML, as implemented in ADOPS. When evidence for recombination was found, the sequences were aligned with Muscle and a phylogeny was inferred with MrBayes, as implemented in ADOPS, and the dataset was then analyzed using OmegaMap. In this case, the details of the OmegaMap run are shown in the Notes tab of the corresponding ADOPS project, but the positively selected amino acid sites can still be viewed in the PSS tab. As usual, the details of every project can be checked by opening the other tabs.
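
The two filtering protocols described above (N prefix and A prefix) can be illustrated with a minimal Python sketch. It is not the code used to build these datasets. The function names (read_fasta, filter_sequences and so on) are hypothetical, and several details are assumptions: alignment gaps are written as "-", translation uses the standard genetic code, a single terminal stop codon is tolerated, the first copy of each duplicate is the one retained, and the ambiguity and stop-codon checks run before duplicate removal.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def ungap(seq):
    """Remove alignment gaps (assumed to be '-') and upper-case the sequence."""
    return seq.replace("-", "").upper()


def translate(seq):
    """Translate an unambiguous nucleotide sequence; trailing partial codons are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def is_clean(seq):
    """True if the sequence has no ambiguous nucleotides and no in-frame stop codons."""
    if set(seq) - set("ACGT"):
        return False
    protein = translate(seq)
    # Assumption: one stop codon at the very end of the sequence is acceptable.
    body = protein[:-1] if protein.endswith("*") else protein
    return "*" not in body


def filter_sequences(records, by="nucleotide"):
    """Keep clean sequences, dropping duplicates at the nucleotide (N) or amino acid (A) level."""
    seen, kept = set(), []
    for name, raw in records:
        seq = ungap(raw)
        if not is_clean(seq):
            continue
        key = seq if by == "nucleotide" else translate(seq)
        if key in seen:
            continue
        seen.add(key)
        kept.append((name, seq))
    return kept


example = [
    ("a", "ATGAAATAA"),
    ("b", "ATG---AAATAA"),  # same nucleotides as "a" once gaps are removed
    ("c", "ATGAAGTAA"),     # synonymous change: same protein as "a"
    ("d", "ATGNAATAA"),     # ambiguous nucleotide
    ("e", "ATGTAAAAA"),     # in-frame stop codon
]
print([name for name, _ in filter_sequences(example, by="nucleotide")])  # ['a', 'c']
print([name for name, _ in filter_sequences(example, by="amino_acid")])  # ['a']
```

In this toy example the nucleotide-level filter keeps one more sequence than the amino acid-level filter, because synonymous variants survive the first protocol but collapse in the second. This is consistent with the A-prefix datasets never being larger than the corresponding N-prefix datasets.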