Sequence variability in the 5' non-coding region of hepatitis C virus: identification of a new virus type and restrictions on sequence diversity.
Simmonds P., McOmish F., Yap PL., Chan SW., Lin CK., Dusheiko G., Saeed AA., Holmes EC.
We have analysed the pattern of nucleotide sequence variability in the 5' non-coding region (5' NCR) of geographically dispersed variants of hepatitis C virus (HCV). Phylogenetic analysis of sequences in this region indicated the existence of a new virus type, provisionally termed type 4, the identity of which was confirmed by further analysis of the more variable part of the HCV core protein coding region. The geographical distribution of HCV type 4 was distinct from that of other HCV types: it was particularly widespread in Africa and absent or rare in Europe and the Far East. Much of the variability in the 5' NCR appears to be constrained by a requirement for specific secondary structures in the viral RNA. In one of the most variable regions of the 5' NCR (positions -169 to -114), most of the nucleotide changes that are characteristic of different HCV types were covariant, with complementary substitutions at other positions. According to the proposed secondary structure of the 5' NCR, such changes preserved base pairing within a stem-loop structure, whereas the nucleotide insertions found in some 5' NCR sequences, including those of type 4, localized exclusively to the non-base-paired terminal loop. The specific nucleotide substitutions in the 5' NCR that differentiate each of the four HCV types can be detected by restriction enzyme cleavage, providing a rapid and reliable method for virus typing.
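The covariation argument above can be illustrated with a minimal sketch. It assumes two aligned RNA sequences of equal length and a list of stem base-pair coordinates. The toy hairpin, its coordinates, and the names `can_pair` and `classify_stem_changes` are hypothetical and are not HCV data or the authors' method. Treating G-U wobble pairs as compatible with a stem is also an assumption of this sketch.

```python
# Base pairs treated as compatible with an RNA stem in this sketch:
# Watson-Crick pairs plus G-U wobble (an assumption, not taken from the source).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(x, y):
    """Return True if bases x and y can form one of the allowed pairs."""
    return (x.upper(), y.upper()) in PAIRS


def classify_stem_changes(reference, variant, stem_pairs):
    """Label each stem pair (i, j) by how the variant differs from the reference.

    Labels:
      unchanged          neither position differs
      covariant          both positions differ and the new bases still pair
      single_compatible  one position differs and the bases still pair
      disruptive         at least one position differs and pairing is lost
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    labels = {}
    for i, j in stem_pairs:
        changed = (reference[i] != variant[i]) + (reference[j] != variant[j])
        if changed == 0:
            labels[(i, j)] = "unchanged"
        elif can_pair(variant[i], variant[j]):
            labels[(i, j)] = "covariant" if changed == 2 else "single_compatible"
        else:
            labels[(i, j)] = "disruptive"
    return labels


# Invented hairpin: 5-bp stem (indices 0-4 paired with 13-9), 4-nt loop (5-8).
reference = "GGCACUUCGGUGCC"
variant = "GGCGCUACGGCGUC"  # A-U -> G-C at (3, 10); C -> U at 12; loop change at 6
stem_pairs = [(0, 13), (1, 12), (2, 11), (3, 10), (4, 9)]

for pair, label in classify_stem_changes(reference, variant, stem_pairs).items():
    print(pair, label)
```

In this toy case the paired A-U to G-C change is labelled covariant, and the substitution in the loop is ignored because it lies outside every stem pair. This mirrors the pattern described above, in which type-specific changes preserve base pairing in the stem while other variation falls in the unpaired terminal loop.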