Evolutionarily conserved RNA secondary structures in coding and non-coding sequences at the 3′ end of the hepatitis G virus/GB-virus C genome
Hepatitis G virus (HGV)/GB virus C (GBV-C) causes persistent, non-pathogenic infection in a large proportion of the human population. Epidemiological and genetic evidence indicates a long-term association of HGV/GBV-C and related viruses with a range of primate species, and cospeciation of these viruses with their hosts during primate evolution. Using a combination of covariance scanning and analysis of variability at synonymous sites, we previously demonstrated that the coding regions of HGV/GBV-C may contain extensive secondary structure of undefined function (Simmonds & Smith, Journal of Virology 73, 5787-5794, 1999). In this study we have carried out a detailed comparison of the structure of the 3′ untranslated region (3′UTR) of HGV/GBV-C with that of the upstream NS5B coding sequence. By investigating free energies on folding, applying secondary structure prediction algorithms and analysing covariance between HGV/GBV-C genotypes 1-4 and the more distantly related HGV/GBV-C chimpanzee variant, we obtained evidence for extensive RNA secondary structure formation in both regions. In particular, the NS5B region contained long stem-loop structures of up to 38 internally paired nucleotides which were evolutionarily conserved between human and chimpanzee HGV/GBV-C variants. The prediction of similar structures in the same region of hepatitis C virus may allow the functions of these structures to be determined with a more tractable experimental model.
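The idea behind covariance analysis can be illustrated with a minimal sketch. It is not the procedure used in this study, and the function name `classify_pair`, the category labels and the toy alignment are all hypothetical. Given aligned RNA sequences and two columns proposed to pair, it asks whether every sequence retains a canonical (Watson-Crick or G-U) pair at those columns, and whether substitutions at one column are matched by compensating substitutions at the other.

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair(alignment, i, j):
    """Classify alignment columns i and j (0-based) as a candidate base pair.

    Returns one of (labels are illustrative assumptions):
      "unpaired"       - at least one sequence cannot pair at (i, j)
      "invariant"      - every sequence has the same pair
      "covariant"      - two observed pairs differ at BOTH positions
      "semi-covariant" - pairs differ at one position only (e.g. G-C vs G-U)
    """
    observed = {(seq[i], seq[j]) for seq in alignment}
    if not all(pair in PAIRS for pair in observed):
        return "unpaired"
    if len(observed) == 1:
        return "invariant"
    for a in observed:
        for b in observed:
            if a[0] != b[0] and a[1] != b[1]:
                return "covariant"
    return "semi-covariant"


# Toy alignment (invented for illustration, not HGV/GBV-C sequence data).
alignment = ["GGAUCC", "GAAUUC", "GGAUCU"]

results = {
    (0, 5): classify_pair(alignment, 0, 5),
    (1, 4): classify_pair(alignment, 1, 4),
    (2, 3): classify_pair(alignment, 2, 3),
    (0, 1): classify_pair(alignment, 0, 1),
}
```

In this toy case columns 1 and 4 change together (G-C in two sequences, A-U in the third) while pairing is preserved, which is the kind of compensatory pattern taken as evidence for a conserved stem. A real analysis would additionally need to account for the phylogenetic relatedness of the sequences and for the background level of variability.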