
Comparative phylogenies of yellow fever isolates from Peru and Brazil

Juliet E. Bryant, Alan D.T. Barrett
DOI: http://dx.doi.org/10.1016/S0928-8244(03)00238-4, pp. 103-118. First published online: 1 November 2003


We recently reported phylogenetic evidence to support the presence of enzootic transmission foci of yellow fever virus (YFV) in Peru [Bryant et al., Emerg. Infect. Dis. (2003)]. Because the prevailing paradigm of YFV transmission in Brazil is that of ‘wandering epizootics’ rather than discrete enzootic foci, we have now compared the molecular phylogenies of YFV isolates from Peru and Brazil, and re-examined the question of virus mobility by mapping the spatio-temporal distribution of genetic variants from these areas. Sequences were obtained for two genomic regions from 50 strains of YFV collected between 1954 and 2000 comprising 223 codons of the structural proteins (premembrane and envelope genes, ‘prM/E’), and a distal region spanning the carboxy terminus of NS5 and part of the 3′ non-coding region (‘EMF’). Peruvian and Brazilian isolates formed two monophyletic clades with no evidence to support recombination between lineages. Variation within both coding and non-coding regions revealed similar substitution rates and overall levels of diversity within each clade. The branching structure of the prM/E and EMF trees of Brazilian sequences showed strong agreement of intra-lineage relationships; in contrast, the EMF sequences of Peruvian isolates failed to fully support the subclade structure of the prM/E phylogeny. These phylogenies suggest that transmission cycles of YFV in Peru and Brazil may sometimes be locally maintained within specific locales, but have also on occasion become very widely dispersed.

  • Yellow fever virus
  • Flavivirus
  • Peru
  • Brazil
  • Phylogeny
  • Virus evolution
  • Virus envelope protein

1 Introduction

Yellow fever (YF) is an important arboviral disease that has re-emerged in South America and Africa in the last two decades [2,3]. Despite a highly effective vaccine, there has been an upsurge of YF activity in Peru and Brazil in the last decade. Over 2000 cases have been reported in South America since 1998, and reports of confirmed cases are believed to vastly under-report the true incidence of disease [4]. Urban YF (i.e. transmission of yellow fever virus (YFV) by the peridomestic mosquito Aedes aegypti) has not been reported in Brazil since 1942 [5]. However, many densely populated coastal cities in South America are infested by A. aegypti. Surveillance and monitoring of YF endemic/epidemic viral activity is thus a critically important public health objective.

YFV is the prototype member of the genus Flavivirus, family Flaviviridae. It has a positive-sense, single-stranded RNA genome of approximately 10 kb encoding a single long open reading frame (ORF) that is cleaved into three structural (C, prM, E) and seven non-structural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5) proteins. In addition, the genome contains 3′ and 5′ non-coding regions that are essential to virus replication, and are suggested to contain important determinants of viral infectivity for various mosquito species [6]. Early studies of YFV strains from different geographic locations established the existence of distinctive variants among New and Old World isolates based on immunochemical properties of the E protein [7,8]. More recently, nucleotide sequencing studies of structural gene regions [9] and NS4A and 3′NCR regions [6] delineated seven genotypes of YFV worldwide: five genotypes in Africa, and two in South America. The Brazilian and Peruvian YFVs represent the two major South American YFV genotypes I and II, respectively.

The principal vector of YFV in South America is Haemagogus janthinomys; however, other species of this genus and also of the genus Sabethes play a role in the maintenance cycle of YFV. Monkeys are believed to be the main hosts and the source of amplification of the virus [10,11]. The most widely accepted paradigm of YFV ecoepidemiology is that of ‘epizootic waves’ in which the virus reservoir is ‘constantly moving’ from place to place rather than being maintained over time in the same location. The periodicity of YF epidemics and fluctuations of transmission intensity of the virus have been attributed to the level of immunity in human and simian populations [5,10,12–15], as well as climatic factors affecting vector populations [16].

Our laboratory has recently studied the geographic and temporal distribution of YF variants in Peru, and shown that rather than circulating (‘wandering’) as one intermixing population, different subpopulations of YFV appear to persist in discrete foci within the foothills of the Andes high forest. The current study was undertaken to corroborate evidence for geographic subpopulations of the Peruvian YFVs by examining a distal region of the genome. In addition, we wished to compare patterns of genetic diversity among YFV isolates from Brazil and Peru. We examined two regions of the YFV genome, a fragment comprising 223 codons of the structural proteins (premembrane and envelope, prM/E), and a distal region spanning the carboxy terminus of NS5 and part of the 3′ non-coding region (EMF). Our aim was to describe the spatio-temporal distribution of variants, and examine evidence that YFV is maintained within circumscribed foci of endemicity. In addition to shedding light on YFV genetics and evolution, these results may be relevant to targeting vaccination strategies and vector control efforts.

2 Materials and methods

2.1 Virus isolates used in this study

Fifty virus isolates from Peru and Brazil were obtained from the World Arbovirus Reference Collection, at the University of Texas Medical Branch (UTMB), Galveston, TX (Table 1). Fig. 1 depicts the geographic places of origin for the viruses used in this study. The Peruvian viruses were originally isolated within the laboratories of the Instituto Nacional de Salud (INS) and NMRCD in Lima, at intervals from 1977 to 1999. The Peruvian isolates represent seven of the 14 hydrographic river basins identified as YF endemic zones in Peru. With the exception of Peru81b (isolate 1914b) which was collected from a sentinel mouse, all the Peruvian strains were obtained from human clinical samples. A detailed study of the prM/E gene sequences of these isolates was recently reported [1].

Table 1

List of 50 South American yellow fever strains used in this study

Strain ID | Sequence ID | Passage history | Source | Department | Community
BeH 111 | BRAZIL54 | C6/36, SM10 | Human | Para | Oriboca
BeAR 189 | BRAZIL55C | SM1 C6/36#1 | Sabethes sp. | Para | Pirelli Marituba
BeAN 23536 | BRAZIL60 | SM1 C6/36#1 | Monkey-macaco | Para | Belem Brasilia Km94
BeAR 46299 | BRAZIL62A | C6/36#1 | Haemagogus sp. | Para | Belem Brasilia Km94
BeAR 44824 | BRAZIL62B | SM1 C6/36#1 | Haemagogus sp. | Para | Belem Brasilia km87
BeAN 142028 | BRAZIL68A | C6/36#1 | Monkey-macaco | Para | Abaetetuba
BeAR 142658 | BRAZIL68C | SM2, c6/36#1 | Haemagogus sp. | Para | Barcarena
BeAN 142027 | BRAZIL68D | Original, C6/36#1 | Saguinus midas | Para | Abaetetuba
BeH 203410 | BRAZIL71 | Original, C6/36#1 | Human | Para | Peixe Boi
BeAR 233164 | BRAZIL73A | Mosq 4 | Haemagogus sp. | Goias | Pirenopolis
BeAR 232869 | BRAZIL73B | Mosq 1, SM2 | Haemagogus sp. | Goias | Faz. Cangalha Formosa
BeAR 233436 | BRAZIL73C | Original, C6/36#1 | Haemagogus sp. | Goias | Bela Vista
BeH 350698 | BRAZIL78A | SM2, Mosq 1 | Human | Para | Tome Acu
BeH 379501 | BRAZIL80C | SM2 | Human | Maranhao | Imperatriz
BeH 425381 | BRAZIL84A | Original, C6/36#1 | Human | Amapa | Tribo Oyampi
BeAR 511437 | BRAZIL91A | Original, C6/36#1 | Haemagogus sp. | Para | Bacarena
BeH 511843 | BRAZIL91B | SM1, C6/36#2 | Human | Roraima | Tribo Yanomamy
BeAR 512943 | BRAZIL92A | C6/36#1 | Hg. janthinomys | Mato Grosso | Sidrolandia
BeAR 513008 | BRAZIL92B | SM1 C6/36#1 | Sabethes sp. | Mato Grosso | Sidrolandia
BeH 512722 | BRAZIL92C | SM2, C6/36#1 | Human | Mato Grosso | Campo Grande
BeAR 513292 | BRAZIL92E | SM1 C6/36#1 | Sabethes sp. | Mato Grosso | Jaraguari
BeAR 527785 | BRAZIL94A | SM1 C6/36#1 | Sabethes sp. | Minas Gerais | Arinos
BeAR 527198 | BRAZIL94B | SM1 C6/36#1 | Haemagogus sp. | Minas Gerais | Arinos
BeAR 544276 | BRAZIL96A | SM1 C6/36#1 | Haemagogus sp. | Rondonia | Cabixi
BeAR 628124 | BRAZIL2000A | SM1 C6/36#1 | Hg. janthinomys | Tocantins | Parana
1362/77 | PERU77A | C6/36#2 | Human | Ayacucho | San Francisco
1368 | PERU77B | SM1, Vero1, C6/36#2 | Human | Ayacucho | Tribolina
1371 | PERU77C | SM1, Vero1, C6/36#2 | Human | Ayacucho | Chontacocha
287/78 | PERU78 | SM1, Mosq 2 | Human | Ayacucho | San Francisco
R 35740 | PERU79 | SM1, Mosq 2 | Human | Ayacucho | Alto Montaro
1914b | PERU81B | MILLC 1, LLCMK2, Vero 1, C6/36#1 | Sentinel mouse | Cusco | Cusco
ARVO544 | PERU95A | SM1, Vero1, C6/36#2 | Human | San Martin | Tocache Huaquisha
HEB4224 | PERU95B | SM1, C6/36#1 | Human | San Martin | Tocache N.Progresso
HEB4236 (153) | PERU95C | C6/36#1 | Human | Pasco | Oxapampa Villa Rica
149 | PERU95D | SM1, C6/36#1 | Human | Pasco | No data
Cepa#2 | PERU95E | SM1, C6/36#1 | Human | Puno | No data
Cepa#1 | PERU95F | C6/36#2 | Human | Puno | No data
OBS 2240 | PERU95G | C6/36#2 | Human | Huanuco | Hermil
OBS 2250 | PERU95H | SM1, C6/36#1 | Human | Huanuco | Hermil
HEB 4240 | PERU95I | C6/36#1, SM1 | Human | Junin | Chachamayo
HEB 4245 | PERU95J | SM1, C6/36#1 | Human | Junin | Chachamayo
HEB 4246 | PERU95K | SM1, C6/36#1 | Human | Junin | Chachamayo
OBS 2243 | PERU95L | SM1, C6/36#1 | Human | Huanuco | No data
ARV 0548 | PERU95M | SM1, C6/36#1 | Human | San Martin | Tocache
OBS 6530 | PERU98A | SM1, C6/36#1 | Human | Cusco | Echarate
OBS 6745 | PERU98C | C6/36#2 | Human | Cusco | Santa Rita-Rio Nanay
OBS 7904 | PERU99 | Vero1, c6/36 3 | Human | San Martin | Tarapoto

Figure 1

Map of Peru and Brazil indicating geographic origins of YFV isolates. Note: enlargements not drawn to scale.

The 25 Brazilian isolates of this study were isolated at the Instituto Evandro Chagas in Belem, Brazil from 1954 to 2000. Seven of the isolates were obtained from human clinical cases; three were from monkeys, and 15 were mosquito isolates. The method of isolation and subsequent passage history for the virus seed stocks are provided in Table 1; the majority of isolates were prepared through one or two passages in suckling mouse brain followed by a single passage in C6/36 cells.

2.2 Reverse-transcription polymerase chain reaction (RT-PCR) and sequencing

Following transfer of the isolates from the World Arbovirus Reference Center, viruses were grown for a single additional passage in Vero cells to obtain sufficient quantities for RNA extraction. Methods for viral growth, genomic RNA extraction, and amplification of viral sequences by RT-PCR have been previously described [17]. The first set of studies involved amplification of a 670-bp fragment comprising the 3′ 108 nucleotides of the membrane (M) protein gene, and the 5′ 337 nucleotides of the envelope (E) protein-coding gene. The amplicons were generated using the genomic-sense primer (5′-CTGTCCCAATCTCAGTCC) and genomic-complementary primer (5′-AATGCTTCCTTTCCCAAAT). The second set of studies involved amplification of a 607-bp fragment comprising the 3′ 297 nucleotides of NS5 at the end of the genomic ORF, and the first 309 bp of the 3′ non-coding region. The primers used to amplify this region were the genomic-sense degenerate primer ‘EMF’ (5′-TGGATGACSACKGARGAYAT) and the genomic-complementary primer ‘VD8’ (5′-GGGTCTCCTCTAACCTCTAG). PCR products were screened by electrophoresis, recovered from gels using the Qiagen gel extraction kit, and sent for sequencing at the UTMB Protein Chemistry core facility. Sequences were obtained from both strands of each RT-PCR product for verification.
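The degenerate ‘EMF’ primer above uses IUPAC ambiguity codes (S, K, R, Y), so it stands for a small pool of unambiguous oligonucleotides. The following is a minimal illustrative sketch, not part of the original methods, that enumerates that pool; the function name expand_degenerate is hypothetical, and only the primer sequence itself is taken from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for this primer).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

EMF_PRIMER = "TGGATGACSACKGARGAYAT"  # sequence as given in section 2.2
variants = expand_degenerate(EMF_PRIMER)
print(len(variants), "unambiguous primer sequences")
```

Each of the four ambiguity codes in this primer has two possible bases, so the pool contains 2 × 2 × 2 × 2 = 16 sequences.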

2.3 Phylogenetic and statistical analyses

Initial sequence editing and alignments were performed using Vector NTI (Informax®), and manually edited using the GCG Wisconsin Package Version 10.3 (Accelrys, San Diego, CA, USA) and the DAMBE package (http://web.hku.hk/~xxia/software/software.htm). The PAUP* program [18] was used to infer maximum likelihood (ML) trees and estimate evolutionary rate parameters for each data set. The model of nucleotide substitution used was the general time-reversible (GTR) model with a different substitution rate for each codon position. For the purpose of rate comparisons, the among-site rate heterogeneity for Peruvian and Brazilian prM/E and EMF sequences was also estimated using the discretized gamma distribution; however, this parameter was not included in the ML tree construction. Support for individual clades was determined by non-parametric bootstrapping [19]. The PAML package Version 3.13 [20] was used to estimate rates of synonymous and non-synonymous substitution. ML search methods implemented in the PAML package use a model of codon evolution that accounts for the transition/transversion rate with codon usage bias modeled by the nucleotide frequencies at the three codon positions. The Peruvian and Brazilian data sets were evaluated separately under the assumption of a single ratio of non-synonymous to synonymous substitution rates for all lineages.
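As an illustration of the non-parametric bootstrap mentioned above, the sketch below shows only the resampling step: alignment columns are drawn with replacement to build pseudo-replicate alignments of the original length. It is a toy example under stated assumptions, not the PAUP* procedure itself; the alignment, the function name bootstrap_alignment, and the random seed are all hypothetical. In a real analysis a tree would be inferred from each replicate and clade support taken as the fraction of replicate trees containing that clade.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by resampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    # The same sampled column indices are applied to every sequence,
    # so each resampled column stays intact across taxa.
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in cols) for name in names}

# Hypothetical toy alignment (not data from this study).
toy = {"seqA": "ACGTACGT", "seqB": "ACGTACGA", "seqC": "ACCTACGA"}
rng = random.Random(0)  # fixed seed so the example is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
print(len(replicates), "pseudo-replicate alignments")
```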

3 Results

3.1 Sequence variation between Peruvian and Brazilian YFVs

Fig. 2 shows ML phylogenies for YFV isolates based on the prM/E and EMF sequence alignments. Both prM/E and EMF gene trees revealed a consistent pattern of divergence of the Peruvian and Brazilian clades, and the monophyly of these lineages was strongly supported by bootstrap analysis. Table 2 shows the average genetic distances within and between the two clades based on nucleotide and amino acid pairwise comparisons. Fig. 3 shows the amino acid alignment of prM/E sequences, and Fig. 4 shows the nucleotide alignment of the EMF sequences. The genetic diversity within Peru and Brazil was remarkably similar based on both the prM/E and EMF sequences. The prM/E and EMF sequences contained 189 (28.2%) and 135 (22.2%) variable nucleotides, respectively. There were a total of 44 variable amino acid sites within prM/E (19.7% of 223 codons), as compared to 11 variable sites in the NS5 fragment (12.9% of 85 codons). Divergence between the Peruvian and Brazilian prM/E sequences (average 9.6%) was slightly greater than divergence of the EMF sequences (average 8.6%). Pairwise amino acid divergence between Peruvian and Brazilian prM/E sequences averaged 2.8% (range 0.9–6.6%), which was slightly lower than the corresponding NS5 divergence (average 3.6%, range 2.3–6.9%).
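The percentages quoted above can be checked with simple arithmetic, and the pairwise divergences are of the kind produced by an uncorrected p-distance. The sketch below is illustrative only: the source does not state which distance measure was used, so treating it as a p-distance is an assumption, and the function names are hypothetical. The EMF percentage reproduces when the 607-bp amplicon length from section 2.2 is used as the denominator; that choice of denominator is an inference, not something the text states.

```python
def p_distance(seq1, seq2):
    """Percent of aligned positions that differ, skipping gap positions."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    diffs = sum(a != b for a, b in pairs)
    return 100.0 * diffs / len(pairs)

def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)

# Counts and lengths as reported in the text.
print(percent(189, 670))  # variable nucleotides in prM/E
print(percent(135, 607))  # variable nucleotides in EMF (607-bp denominator assumed)
print(percent(44, 223))   # variable amino acid sites in prM/E
print(percent(11, 85))    # variable amino acid sites in NS5

# Hypothetical toy sequences: one difference in ten aligned positions.
print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```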

Figure 2

ML trees of YFV isolates from Peru and Brazil based on (panel A) 670 nt of prM/E region; (panel B) 576 nt of EMF region. Trees are rooted with the Asibi reference strain (Ghana27). Horizontal branch lengths represent genetic divergence (numbers of nucleotide substitutions). Numbers above the branch lengths denote bootstrap support (500 replicates). Hu, human isolate; Mk, monkey isolate; Hg, Haemagogus sp. isolate; Hj, Hg. janthinomys isolate; Sa, Sabethes sp. isolate.

Table 2

Average genetic distances within and between Peruvian and Brazilian YFV based on prM/E (670 nt) and EMF (576 nt) regions

Comparison | prM/E nucleotide, % (st. dev.) | range | prM/E amino acid, % (st. dev.) | range | EMF nucleotide, % (st. dev.) | range | EMF amino acid, % (st. dev.) | range
Within Brazil | 4.2 (1.7) | 0.1–7.6 | 1.6 (1.2) | 0–6.6 | 3.9 (1.9) | 0–7.4 | 2.0 (1.2) | 0–4.6
Within Peru | 4.0 (1.5) | 0.1–7.3 | 1.7 (1.0) | 0–5.6 | 4.1 (0.3) | 0–10.1 | 0.75 (0.75) | 0–3.4
Between Brazil and Peru | 9.6 (0.7) | 7.6–11.4 | 2.8 (0.9) | 0.9–6.6 | 8.6 (1.3) | 7.3–10.9 | 3.6 (1.1) | 2.3–6.9

List of amino acid substitutions within the last 85 codons of NS5
13 | 832 | M | T | T | except Brazil62b and Brazil91a, which have M
20 | 839 | V | I | I | except Brazil73d, Brazil94a and 94b
49 | 868 | L | L | L | except Brazil92a and b, which have F
59 | 878 | Q | K | K | except for Peru 77bc, 95abfgm, and 98c, which have N
64 | 883 | T | T | T | except Brazil78a, which has A
77 | 896 | A | A | A | except Peru81A, which has V
  • Based on NS5 coding region, 255 nt.

Figure 3

Amino acid sequence alignment for the prM/E region of Peruvian and Brazilian YFVs. Dots indicate identity with the Asibi reference sequence shown at top.

Figure 4

Alignment of partial NS5 and 3′NCR nucleotide sequences of Peruvian and Brazilian YFVs. Dots indicate identity with the Asibi reference sequence shown at top. Dashes indicate gaps in the alignment. RYF, imperfect repeat elements.

Within the prM/E fragment there was one amino acid site that distinguished all the Peruvian from the Brazilian sequences (E67) (Fig. 3). Based on homology to the West African Asibi reference strain, asparagine is most likely the ancestral residue at E67; all the Brazilians with the exception of Brazil91b retained the ancestral residue, whereas the Peruvian sequences revealed an N→H substitution at this site. Within the last 85 codons of NS5, there were two amino acid sites separating the Peruvian and Brazilian clades: T→V at NS5-822 and K→R at NS5-881. The Peruvians retained the ancestral residue at NS5-881 (i.e. identity with Asibi at this site), whereas the Brazilians shared the ancestral residue at NS5–882. Neither of these conservative substitutions would be predicted to alter polymerase function, as they do not occur within conserved polymerase motifs [21].

3.2 Genetic diversity within Peruvian YFVs

We have previously reported that the Peruvian YFV genealogy based on prM/E sequences revealed six different subclades that corresponded very closely with the following geographic regions: Puno, Pasco, Junin, Cusco, Ayacucho, and San Martin/Huanuco. Numerous substitutions within the prM/E region delineated these subclades leading to high bootstrap values, and three of the clades shared signature amino acid substitutions (i.e. coding changes in nucleotide sequences shared by all members of the group). In this report we present the EMF sequences for 24 of the 25 Peruvian isolates. We were unable to amplify the EMF sequences of Peru95L (OBS2243) due to poor growth characteristics of this strain in cell culture. A total of 60 nucleotide positions were variable within the Peruvian EMF sequences; 43 of these were informative sites. The informative sites were almost equally divided between the NS5 coding (24 sites) and the 3′NCR portions (19 sites) of the sequence. There were only three variable amino acid sites among the Peruvian NS5 sequences, and only one of these sites was informative. Three of the geographic subclades (Pasco, Junin, and San Martin/Huanuco) identified by the prM/E tree showed corresponding relationships on the EMF tree with significant bootstrap support. These clades were delineated by silent substitutions; however, in contrast to the prM/E data set, there are no signature amino acid sites within the short NS5 fragment to suggest similar subclades. EMF sequences of the strains from Cusco and Ayacucho were not monophyletic, and the two isolates from Puno and one isolate from Huanuco shifted positions on the EMF tree. Interestingly, there is a common amino acid substitution at NS5 878 (K→N) shared by seven of the isolates that were not previously believed to be closely related (four from San Martin, two from Ayacucho, one from Cusco). Whether the discordance of the prM/E and EMF genealogies is indicative of intra-lineage recombination events is unclear and requires further examination.
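The distinction drawn above between variable and informative sites follows the usual parsimony definition: a site is variable if more than one state is present, and parsimony-informative if at least two states each occur in at least two sequences. The sketch below illustrates that counting on a hypothetical toy alignment; the function name classify_sites is an assumption, and the code is not the software used in the study.

```python
from collections import Counter

def classify_sites(sequences):
    """Count variable and parsimony-informative columns in an alignment."""
    variable = informative = 0
    for column in zip(*sequences):
        # Ignore gaps and ambiguity codes; count only unambiguous bases.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative

# Hypothetical toy alignment of four sequences (not data from this study).
toy_alignment = ["ACGTA", "ACGTC", "ATGAC", "ATGAG"]
variable_sites, informative_sites = classify_sites(toy_alignment)
print(variable_sites, informative_sites)
```

In the toy alignment, columns 2 and 4 split the sequences two against two and are informative, while the last column is variable but has only one state shared by two sequences, so it is uninformative.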

3.3 Genetic diversity within Brazilian YFVs

There were a total of 121 variable nucleotide positions among the Brazilian prM/E sequences, as compared to 131 variable positions within the corresponding EMF sequences. Sixty-eight of the nucleotide positions were parsimony-informative in the case of prM/E, whereas 54 were informative in the EMF region (27 falling within NS5). A total of 19 variable amino acid positions in the prM/E (8.5% of the 223 codons) were scattered throughout the prM, M and E proteins, as compared to eight positions in the partial NS5 fragment (9.4% of 85 codons). Although the overall genetic variability among the Brazil sequences exceeded that of Peru, it is important to note that the Brazilian isolates represented a much larger geographic area as well as a longer time frame (1954–2000). Amino acid pairwise divergence among the Brazil prM/E sequences ranged from 0 to 6.6% (mean of 1.8%) as compared to 0–4.6% (mean of 1.9%) within NS5.

Mosquito and vertebrate-derived sequences from Brazilian YFVs appeared to be distributed randomly in both the prM/E and EMF phylogenetic trees (Fig. 2). Phylogenetic trees of Brazilian prM/E and EMF sequences revealed a cluster of isolates from Para state (dating from 1954 to 1968) that differed significantly from all other Brazilian YFV strains. These strains originated from communities close to the Atlantic coast in the region surrounding Belem. The Para cluster shared two signature amino acids within prM/E (at M44 and E83) and is identical to the ancestral West African Asibi sequence at these sites [9]. Although there are no shared amino acid substitutions within NS5 that delineate the Para cluster, bootstrap support for the clade was equally high in the EMF and prM/E trees. Note that not all the isolates from Para were monophyletic; isolates collected from the same geographic region during later periods (in 1971, 1978, and 1991) fell within the lineage containing all other Brazilian strains. The long branch separating the Para cluster from the other Brazilian YFVs reflects the numerous substitutions in prM/E and EMF that separate this subclade from the other isolates.

With the exception of the Para subclade, there were no additional nodes on the Brazilian portion of the prM/E tree showing strong support. The branching pattern among the Brazilian EMF sequences, however, showed close correspondence to relationships indicated by the prM/E tree, and also revealed significant bootstrap values. Thus, two additional subclades within Brazil may be defined: (1) a group comprised of a human isolate collected in 1991 from Roraima, together with five mosquito isolates from Mato Grosso do Sul and Rondonia (1992–1996), and (2) a group comprised of isolates collected from 1978 to 1994 in the northern and central states of Maranhao, Minas Gerais, and Para. The phylogenetic position of the remaining isolates from Goias (three from 1973), Amapa (1984), and Tocantins (2000) was not easily resolved on either the prM/E or EMF trees. Note that the isolate from Tocantins collected in 2000 was the most contemporaneous isolate included in this study, and the first from this region to be sequenced. Interestingly, it was characterized by a very long branch on the EMF tree, as a result of an unusually large number of substitutions.

3.4 Estimation of evolutionary parameters from prM/E and EMF sequences

Table 3 presents summary statistics and evolutionary parameter estimates for the YFV isolates based on prM/E and EMF sequences. With the exception of the transition/transversion ratio (κ), and the among-site variability (Γ distribution), parameter estimates for both Brazil and Peru were remarkably similar. Transition/transversion ratios differed between Peru and Brazil for the prM/E sequences, but not for the EMF sequences. This is consistent with the observation that transition/transversion ratios closely reflect substitution rates at third codon positions and are an indication of selectional constraints in coding regions. The shape parameter of the Γ distribution, α, provides a measure of the among-site rate heterogeneity; with the exception of the Peruvian EMF sequences, estimates of α were typical of highly conserved sequences, in which only a few sites exhibit variability. The Peruvian EMF sequences were exceptional insofar as the high α value suggested equal distribution of mutations among sites.

Table 3

Summary statistics and ML parameter estimates for YFV isolates from Peru and Brazil

A: Comparisons based on the full-length prM/E and EMF fragments
Data set | −ln L | κ | %GC | Variable sites | No. site patterns | Γ, α
prM-E (670 nt)
EMF (576 nt)
B: Comparisons between prM/E and the coding region of the EMF fragment (NS5)
Data set | n | Relative site rates: 1st cp | 2nd cp | 3rd cp | ω | S
prM-E (223 codons)
NS5 (85 codons)
  • κ, transition/transversion rate ratio; PIS, parsimony informative sites; UIS, parsimony uninformative sites; Γ, α shape parameter of the gamma distribution. n, number of sequences; relative substitution rates for each codon position; ω(dn/ds), non-synonymous/synonymous rate ratio, averaged over sites; S, tree length, number of nucleotide substitutions along the tree per codon; [κ for Peru, based on NS5=38]; [κ for Brazil, based on NS5=26].

It is worth noting that all of the prM/E and EMF sequences exhibited a predominance of C–U substitutions. Sequences rich in C–U transitions have also been observed for the 3′NCR of the related flavivirus, West Nile virus [22,23]. Interestingly, the nucleotide base composition of the sequenced mitochondrial segments of many insect species has a high A+T content (as high as 75–78% for A. aegypti), and the most frequent transversions in these species are of the A–T type [24]. Differences in the observed transversion frequencies between Aedes and Culex mosquitoes, for instance, were used to infer relative genetic distances between the mosquito genera. Among the YFV sequences in this study, base frequencies did not appear to be A–U rich, and there was no noticeable trend among transversion frequencies.

4 Discussion

Our previous analysis of YF in Peru provided both epidemiological and phylogenetic evidence to suggest that YFVs in the high forests of the Andes circulate within discrete enzootic foci [1]. Our ability to discern separate subclades of YFV within Peru appeared to indicate low levels of population intermixing between viruses from adjacent river basins. Given the extremely complex topography of the Peruvian Andes, as well as the numerous centers of species endemism that have been observed for taxa of flora and fauna in the region [25], we hypothesized that genetic isolation by distance could explain the observed molecular diversity of the YFVs, and that biogeographic barriers had helped to shape the evolution of the virus. The current study was undertaken to confirm evidence for geographic subtyping of the Peruvian strains by examining a distal region of the genome (i.e. the EMF fragment). In addition, we wished to address whether YFV circulation in Brazil exhibited a similar pattern of population substructure. We reasoned that the molecular signature of virus populations circulating in enzootic foci would differ markedly from that of viruses transmitted via ‘wandering epizootics’, and thus might be discernible through a comparative study of virus phylogenies.

Our data revealed that Brazilian and Peruvian YFVs are significantly divergent virus lineages that can be differentiated on the basis of molecular markers in both the structural proteins and the 3′ non-coding regions. Analysis of EMF sequences from the Peruvian isolates failed to fully support the geographic subclades previously delineated by the prM/E tree. In particular, relationships among strains from Ayacucho, Cusco, and Puno were not confirmed by the EMF tree and further studies are necessary to fully understand the significance of these results.

Comparison of the Peruvian and Brazilian sequences revealed interesting differences in the spatial distribution of variants within the two regions. The Brazilian prM/E sequences revealed a branching pattern that suggested the possibility of widely dispersed epizootics; isolates collected over very large distances within Brazil appeared in some instances within the same subclade (e.g. Brazil91B and Brazil96A; Brazil78A and Brazil94A). It was also the case, however, that some closely related variants appeared to have persisted for as long as 20 years within the same locale (e.g. Brazil71 and Brazil91A). In contrast to the Peruvian gene trees, the Brazilian prM/E and EMF trees showed very close agreement with no discordance in the placement of individual isolates. Although bootstrap support for subclades varied between the trees, the genetic relationships among the strains were upheld by analysis of both genomic regions.

It is apparent that using the existing set of YFV sequences from Peru and Brazil, it is difficult to establish a clear and consistent explanation for the observed molecular diversity. The broad distribution of genetic variants across widely different ecological zones in Brazil, and the lack of clear temporal clustering of strains, suggests a very complex pattern of virus transmission that could be considered consistent with the hypothesis of wandering epizootics.

Discrepancies regarding geographic clustering, and evidence for enzootic foci in Peru may reflect the absence of sufficient phylogenetic signal in the EMF sequences, or ascertainment bias in the choice of genomic region for sequencing. Alternatively, these discrepancies could also result from differences in the relative population sizes (transmission intensities) of different subclades of virus. It is important to note that none of the 50 South American YFV isolates studied to date provided evidence to suggest recombination between the Peruvian and Brazilian lineages. However, rare recombination events or even a very small amount of migration between adjacent watersheds could easily obscure the phylogenetic signal of population subdivision. Sequencing complete genomes of representative strains may be required to resolve phylogenetic relationships among these strains.

In summary, we have described the considerable genetic variability among circulating YFVs in Peru and Brazil, and found that some virus variants appear able to persist within circumscribed foci whereas other variants appear to have dispersed over thousands of kilometers. Given the potential threat to public health from re-urbanization of the disease, it will be crucial to improve understanding of the processes controlling YFV evolution. Biological and phenotypic characterization of the YFV genetic variants would also represent an important step forward towards elucidating the ecological implications of the underlying genetic variation.


We thank Drs. Robert Tesh and Pedro F.C. Vasconcelos, who provided virus isolates from collections maintained at the World Arbovirus Reference Center at the University of Texas Medical Branch, in Galveston, Texas, and the Instituto Evandro Chagas, in Belém, Brazil, respectively. This work was supported by NIH grant AI 10986, by a Zelda Zinn Scholarship award to JEB, and the CDC training grant T01/CCT622892–01.


  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].