Phase variable restriction–modification systems in Moraxella catarrhalis

Kate L. Seib, Ian R.A. Peak, Michael P. Jennings
DOI: http://dx.doi.org/10.1111/j.1574-695X.2002.tb00548.x 159-165 First published online: 1 January 2002


A repetitive DNA motif was used as a marker to identify novel genes in the mucosal pathogen Moraxella catarrhalis. There is a high prevalence of such repetitive motifs in virulence genes that display phase variable expression. Two repeat-containing loci were identified using a digoxigenin-labelled 5′-(CAAC)6-3′ oligonucleotide probe. The repeats are located in the methylase components of two distinct type III restriction–modification (R–M) systems. We suggest that the phase variable nature of these R–M systems indicates that they have an important role in the biology of M. catarrhalis.

  • Moraxella catarrhalis
  • Type III restriction–modification system
  • Phase variation

1 Introduction

Moraxella catarrhalis is a causative agent of otitis media in children and can act as an opportunistic pathogen in adults with predisposing lung disease. M. catarrhalis is also carried asymptomatically in the respiratory tract of a subset of the human population [1]. The molecular mechanisms of M. catarrhalis that enable this carriage, and those which promote the transition between commensalism and virulence, are not yet fully understood.

Phase variation, the high frequency reversible on/off switching of phenotypic expression, is a feature of many virulence determinants [2]. Phase variation may be mediated by simple repetitive DNA motifs (e.g., mono-, di-, tetra-, penta-nucleotide repeats) that exhibit high mutation rates by loss or gain of repeat units during replication [3,4]. Oligonucleotide repeats are associated with numerous phase variable virulence determinants of pathogenic bacteria, such as adhesins (e.g., Opa proteins of Neisseria spp. [5,6]), iron acquisition systems (e.g., haemoglobin binding proteins of Haemophilus influenzae [7,8]) and lipopolysaccharide (LPS) biosynthesis genes (e.g., in Neisseria spp. and Haemophilus spp. [9–11]). Indeed, the presence of such oligonucleotide repeats has been used to identify novel genes encoding virulence determinants [7,12].

The tetranucleotide repeat motif 5′-(CAAC)n-3′ is associated with phase variable LPS biosynthesis genes in H. influenzae [7]. The same motif is present in M. catarrhalis, but the identity of the associated genes was not known [13]. On the basis that many virulence genes contain repetitive elements, we used the 5′-(CAAC)n-3′ motif as a marker for the identification of genes predicted to be involved in the virulence of M. catarrhalis. Here we present the identification and analysis of the genes containing these 5′-(CAAC)n-3′ repeats in M. catarrhalis.

2 Materials and methods

2.1 Bacterial strains and culture conditions

M. catarrhalis strains used in this study were ATCC strains 8913, 23246, 25238 and 25839, and clinical isolates IB28, IC23 and IDT5 (isolated from patients with middle ear infections from the Royal Brisbane Hospital, Qld, Australia). M. catarrhalis strains were grown overnight on brain heart infusion (Oxoid, Basingstoke, UK) 1% agar plates supplemented with 10% Levinthal's base [14] at 37°C in 5% CO2. Escherichia coli strain DH5α [15] was grown overnight in Luria–Bertani (LB) broth or on LB plates containing 1.5% bacteriological agar (Difco, Detroit, MI, USA) at 37°C. Ampicillin was used at a final concentration of 100 µg ml−1.

2.2 Southern blot analysis and hybridisation

Miniprep of bacterial genomic DNA was performed as described by Ausubel et al. [16]. Restriction endonuclease (Apo I, Cla I, Hin cII or Mfe I) digested genomic DNA was separated on 0.7% agarose gels and transferred to GeneScreen Plus® Hybridisation Transfer membrane (NEN™ Life Science Products, Boston, MA, USA) by capillary action essentially as described in Sambrook et al. [15]. Hybridisation with the digoxigenin (DIG)-labelled 5′-(CAAC)6-3′ oligonucleotide probe (Genset) was carried out for 16 h at 50°C. Washes and detection were carried out (using the DIG DNA Labelling and Detection Kit, Boehringer Mannheim, Indianapolis, IN, USA) as recommended by the manufacturer. All restriction endonucleases and ligases were obtained from New England Biolabs (NEB), Beverly, MA, USA.

2.3 Recombinant DNA library construction and screening

A plasmid library was constructed from Cla I/Mfe I digested M. catarrhalis 23246 genomic DNA cloned into Acc I/Eco RI digested pUC19 (NEB). The library was screened via colony hybridisation using the DIG-labelled 5′-(CAAC)6-3′ oligonucleotide probe. Colonies were transferred to autoclaved GeneScreen Plus® Hybridisation Transfer membrane (NEN™ Life Science) by placing the membrane disc onto the surface of the agar plate for 1 min. The membranes (colony side up) were placed on Whatman 3MM filter paper soaked with: denaturation solution (15 min), neutralisation solution (15 min), then 2×saline sodium citrate (SSC) (10 min). Cell debris was removed by gentle agitation in 2×SSC. Hybridisation, washes and detection were carried out as described above.

2.4 Recombinant DNA techniques and nucleotide sequence analysis

Most recombinant DNA techniques were as described in Ausubel et al. [16]. Plasmid DNA was isolated from clones using the Qiagen Plasmid Midi Kit (Qiagen, Chatsworth, CA, USA) and sequenced on both strands in full (Prism Dye Terminator Sequencing Kit with AmpliTaq DNA polymerase FS (Perkin Elmer, Norwalk, CT, USA) in conjunction with a model 377 automated sequencer (Applied Biosystems, Foster City, CA, USA)) using vector specific primers and by primer walking. Further sequence was obtained via inverse polymerase chain reaction (PCR) and using a degenerate primer designed from a conserved region in homologous genes. Oligonucleotide primers were synthesized by GeneWorks Pty Ltd. PCR was essentially carried out as described by Saiki et al. [17]. Nucleotide sequence analysis was carried out using MacVector (Oxford Molecular Ltd) and BLASTX [18]. For comparisons in Table 1 the entire amino acid sequence of each methylase protein was compared, and the first 475 (McaRI) and 430 (McaRII) amino acids of the restriction enzymes were compared, using the ClustalW program (Oxford Molecular Ltd).

Table 1

Comparison of (A) type III methylase (mod) genes and (B) type III restriction endonuclease (res) genes of several bacterial pathogens, shown as % similarity (identity)

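(A) Methylase (mod)
McaRI | 37 (26) | 35 (22) | 30 (19) | 41 (27) | 52 (36) | 30 (19)
McaRII | 27 (17) | 43 (28) | 51 (36) | 31 (19) | 29 (17)
Ph | 35 (21) | 31 (17) | 29 (16) | 36 (21)
NmI | 40 (25) | 32 (17) | 29 (16)
NmII | 26 (15) | 55 (48)
Hi | 29 (15)

(B) Restriction endonuclease (res)
McaRI | 46 (28) | 33 (17) | 33 (17) | 37 (20) | 68 (49) | 26 (12)
McaRII | 24 (12) | 36 (19) | 37 (20) | 21 (10) | 28 (11)
Ph | 44 (25) | 32 (14) | 32 (14) | 35 (20)
NmI | 37 (24) | 24 (13) | 22 (11)
NmII | 30 (13) | 92 (90)
Hi | 30 (13)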
  • McaRI, M. catarrhalis (AY049056); McaRII, M. catarrhalis (AY049057); Ph, Pasteurella haemolytica (AF060119) [48]; NmI, Neisseria meningitidis Z2491 (NMA1467, CAB84700) [49]; NmII, N. meningitidis Z2491 (CAB84817, CAB84818) [49]; Hi, H. influenzae Rd (AAC22721, AAC22720) [50]; Hp, Helicobacter pylori 26695 (AAD07659, AAD07657) [51].

The 5′-(CAAC)n-3′ associated loci of the seven M. catarrhalis strains were amplified from single colonies by PCR using the primers MCA3 (5′-AGCGGTGAAAGCAGTGCGTG-3′) and MCA4 (5′-CCTGTGGCAATTTCATAGTC-3′). Products of approximately 684 bp were sequenced to determine the number of repeat units present in each strain. Primers used for inverse PCR reactions include: MCA2 (5′-TAAGAGCCCATGTGATCGGC-3′), MCA7 (5′-TAATGGCGTTTATTGGCTAAC-3′) and MCA10 (5′-TCCTGTACAAGCATTTGAGC-3′) for Mca RI res, and MCB5 (5′-AGGTAGTGGAACAACAGCTC-3′) and MCB7 (5′-TGCCAACACTTGCTGTCAGG-3′) for Mca RII res. The degenerate primer RE1 (5′-TTGGGRTTRTCCCAGCCTTC-3′) for PCR was designed from a conserved region of type III restriction enzyme genes from Pasteurella haemolytica, Neisseria meningitidis and Helicobacter pylori.
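The degenerate primer RE1 contains two R positions (purine, A or G), so it stands for a small pool of unambiguous oligonucleotides. The following is a minimal sketch of how that pool can be enumerated; the function name `expand_degenerate` and the lookup table are illustrative and are not part of the original methods.

```python
from itertools import product

# Subset of the IUPAC nucleotide ambiguity codes (R = purine, Y = pyrimidine).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


RE1 = "TTGGGRTTRTCCCAGCCTTC"  # degenerate primer sequence quoted in the text
variants = expand_degenerate(RE1)
for seq in variants:
    print(seq)
```

With two twofold-degenerate positions, the pool contains four 20-mers.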

3 Results

We have previously reported the presence of two loci containing the 5′-(CAAC)n-3′ tetranucleotide repeat in M. catarrhalis ATCC strain 8913 by Southern blot analysis using 32P-labelled oligonucleotide probes [13]. These probes were designed from tracts of repetitive DNA identified in the H. influenzae genome [7]. To investigate the presence of 5′-(CAAC)n-3′ repeats in other M. catarrhalis strains, a collection of strains was surveyed by Southern blot analysis using a DIG-labelled 5′-(CAAC)6-3′ oligonucleotide probe (Fig. 1). This survey revealed the presence of repeats in all seven of the M. catarrhalis strains tested, with six of the seven strains containing at least two 5′-(CAAC)n-3′ repeat associated loci.

Figure 1

Distribution of 5′-(CAAC)n-3′ repeats in various M. catarrhalis strains. Southern blot of M. catarrhalis chromosomal DNA restriction digests hybridised with the DIG-labelled oligonucleotide 5′-(CAAC)6-3′ probe. Each M. catarrhalis ATCC strain is digested with Hin cII, Cla I, Mfe I, and Apo I respectively. Lanes 1–4, strain 8913; 5–8, strain 23246; 9–12, strain 25238; 13–16, strain 25839. Molecular mass standards are indicated in kb.

To further investigate the repeat associated loci, a plasmid library derived from genomic DNA of M. catarrhalis strain 23246 was screened using the 5′-(CAAC)6-3′ probe (see Section 2). Two hybridising clones were isolated from this library, pMcrepI and pMcrepII (see Fig. 2). The cloned regions were sequenced and homology searches were performed. Both clones contain 5′-(CAAC)n-3′ repeats within the 5′-end of open reading frames (ORFs) of two distinct genes encoding methylases of type III R–M systems (see below). Inverse PCR and PCR with degenerate primers were used to isolate additional flanking sequences required to determine the sequence of the methylase (mod) genes and associated downstream restriction endonuclease (res) genes. These ORFs were designated Mca RI mod and res, and Mca RII mod and res. The sequenced regions are displayed in Fig. 2 and are deposited in GenBank under the accession numbers AY049056 and AY049057.

Figure 2

Schematic representation of (A) the Mca RI and (B) the Mca RII type III R–M systems identified in M. catarrhalis. The lines labelled Mc 23246 represent the double stranded sequence obtained from the M. catarrhalis ATCC strain 23246 (GenBank entries AY049056, AY049057). The arrows above the line indicate the location, orientation and predicted function of the ORFs identified in the sequence. The location of the 5′-(CAAC)n-3′ repeats is shown above the methylase ORF. The pMcrepI and pMcrepII lines represent the clones obtained from the Cla I/Mfe I genomic library screened with the 5′-(CAAC)6-3′ oligonucleotide probe. The vector backbone for these plasmids (pUC19) is represented by a box. The sequence obtained by PCR and inverse PCR is indicated and the primers used are represented by arrowheads.

The Mca RI mod ORF was sequenced in full and is 1911 nucleotides (nucleotide position (ntp) 1755–3665 of GenBank entry AY049056). The predicted protein (M.Mca RI) (637 amino acids (aa), calculated molecular mass of 72.7 kDa and estimated pI of 5.56) displays a high degree of similarity to the P. haemolytica type III methylase (Table 1). The P. haemolytica methylase also contains a repetitive tract (5′-(CACAG)24-3′) in its 5′-end. Immediately downstream of Mca RI mod is a second ORF, Mca RI res, that was sequenced in part (1427 nucleotides starting at ntp 3681 of GenBank entry AY049056). The predicted 475 aa translation displays similarity to the P. haemolytica type III restriction endonuclease (879 aa) (Table 1). The G+C contents of the ORFs are 33% for Mca RI mod and res, and 50% for the predicted aminotransferase located upstream of the R–M system. These figures are significantly different for adjacent genes and vary from the genome averages cited for M. catarrhalis (40–43%) [19].
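The ORF lengths and G+C percentages quoted here follow from simple arithmetic on inclusive coordinates and base counts. The sketch below shows one way to reproduce such figures; the helper names are illustrative, and the toy sequence is hypothetical rather than taken from AY049056.

```python
def orf_length(start, end):
    """Length in nucleotides of an ORF given inclusive 1-based coordinates."""
    return end - start + 1


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


# Mca RI mod coordinates quoted in the text (ntp 1755-3665 of AY049056).
mod_len = orf_length(1755, 3665)
print(mod_len, mod_len // 3)  # nucleotides, number of codons

# Hypothetical toy sequence, used only to show the G+C calculation.
toy = "ATGCAACCAACCAACTTA"
print(round(gc_percent(toy), 1))
```

Applied to the deposited sequences, the same functions could be used to compare the G+C content of each ORF with the genome average.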

Nucleotide sequence analysis of Mca RI mod from various sources (the initial clone, various subclones, and a PCR amplified section of chromosomal DNA) revealed the presence of 20–22 copies of the 5′-(CAAC)-3′ repeat unit. The predicted start codon, the 5′-(CAAC)n-3′ repeats and the methylase gene are in-frame when there are 20 repeat units; however, addition or deletion of one or two repeat units is predicted to result in premature truncation of the putative protein.
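Because the repeat unit is four nucleotides long, each unit gained or lost shifts the downstream reading frame by one position, and the frame is restored only after a change of three units. A minimal sketch of this arithmetic follows; it assumes that only the repeat number changes and that the flanking sequence is otherwise identical, and the function names are illustrative.

```python
UNIT_LEN = 4  # length of the CAAC repeat unit in nucleotides


def frame_shift(n_repeats, in_frame_repeats):
    """Reading-frame offset (0, 1 or 2) relative to the in-frame repeat count."""
    return ((n_repeats - in_frame_repeats) * UNIT_LEN) % 3


def is_in_frame(n_repeats, in_frame_repeats):
    """True if the downstream ORF stays in frame with the start codon."""
    return frame_shift(n_repeats, in_frame_repeats) == 0


# 20 repeat units is the in-frame count reported for Mca RI mod.
for n in range(18, 25):
    state = "in frame" if is_in_frame(n, 20) else "frameshifted"
    print(n, state)
```

Under these assumptions, 20 and 23 units are in frame, whereas 18, 19, 21, 22 and 24 units are not.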

The Mca RI mod repeat region from each of the seven M. catarrhalis strains surveyed by Southern blot was amplified by PCR and sequenced (Table 2). The repeats are associated with a methylase in all strains and the number of repeats ranged from 19 to 44. Expression of the predicted type III methylase is possible in only four of the seven strains due to a premature stop codon in two reading frames after the repeats. Alteration in the number of repeat units will affect expression in all strains.
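Counting repeat units in a sequenced PCR product can also be done computationally. The sketch below uses a regular expression to find the longest perfect tandem tract of CAAC on the given strand; it is an illustration rather than the procedure used in this study, the function name is hypothetical, and the test sequence is invented.

```python
import re


def longest_repeat_tract(seq, unit="CAAC", min_units=2):
    """Return (start, n_units) for the longest perfect tandem tract of `unit`.

    `start` is a 0-based index into `seq`. Returns None if no tract with at
    least `min_units` copies is present. Only the given strand is searched.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_units))
    best = None
    for match in pattern.finditer(seq.upper()):
        n_units = (match.end() - match.start()) // len(unit)
        if best is None or n_units > best[1]:
            best = (match.start(), n_units)
    return best


# Hypothetical sequence: a start codon, a short spacer and 20 CAAC units.
toy = "ATGAAA" + "CAAC" * 20 + "GGTTAA"
print(longest_repeat_tract(toy))
```

To search the complementary strand as well, the same function could be called with the unit GTTG.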

Table 2

Nucleotide sequence and predicted translation of the Mca RI mod 5′-(CAAC)n-3′ repetitive tract from seven M. catarrhalis strains


The second 5′-(CAAC)n-3′ repetitive tract in M. catarrhalis 23246 is also located within the 5′-end of an ORF homologous to methylases of type III R–M systems (Table 1). The Mca RII mod ORF was sequenced in full and is 1632 nucleotides (ntp 371–2002 of GenBank entry AY049057) and the predicted translation is similar to a Mycoplasma pulmonis type III restriction endonuclease over 430 aa. The repetitive DNA tract of Mca RII mod contains 18 copies of the 5′-(CAAC)-3′ unit. Addition or deletion of one or two units is predicted to result in premature truncation of the putative protein. In addition, two homopolymeric tracts containing eight deoxyadenylate residues are present, with a frameshift mutation seen after the second polyA tract. The G+C content of Mca RII mod and res is 35%.

The putative proteins described above contain regions conserved in the family of type III R–M systems (see Fig. 3). Type III methylases contain a motif, I-Y-I-D-P-P-Y, involved in the transfer of a methyl group to the N-6 adenine (the amino group at the C-6 position of adenines) in the DNA recognition sequence [20]. The predicted amino acid sequences of M.Mca RI and M.Mca RII both contain this sequence. A slight variation of the F-x-G-x-G motif, involved in binding the methyl donor S-adenosylmethionine (AdoMet) [20,21], is also found in M.Mca RI (F-A-G-S-A) and M.Mca RII (H-A-G-S-G). Mca RI and Mca RII contain the first of seven motifs, T-G-T-G-K-T, conserved in type I and type III restriction endonucleases [22]. This motif is characteristic of the helicase superfamily II and is proposed to be an ATP binding motif responsible for helicase activity [22] and/or DNA translocation [20].
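Conserved motifs of this kind can be located in a predicted protein with simple pattern matching. The following sketch is illustrative only: the relaxed FxGxG pattern is an assumption chosen to admit the F-A-G-S-A and H-A-G-S-G variants described above, and the protein string is a hypothetical toy sequence, not a translation of either Mca ORF.

```python
import re

# Motif patterns written as regular expressions ('.' matches any residue).
MOTIFS = {
    "DPPY": r"DPPY",                   # methyl transfer
    "FxGxG_strict": r"F.G.G",          # AdoMet binding, canonical form
    "FxGxG_relaxed": r"[FH].G.[GA]",   # assumed relaxation for the variants
    "TGxGKT": r"TG.GKT",               # ATP binding
}


def find_motifs(protein):
    """Map each motif name to the 0-based start positions of its matches."""
    protein = protein.upper()
    return {
        name: [m.start() for m in re.finditer(pattern, protein)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy protein containing IYIDPPY, FAGSA and TGTGKT.
toy_protein = "MKIYIDPPYNFAGSAKLTGTGKTE"
hits = find_motifs(toy_protein)
for name, positions in hits.items():
    print(name, positions)
```

In this toy example the strict FxGxG pattern finds no match, whereas the relaxed pattern does, mirroring the slight variation noted for M.Mca RI.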

Figure 3

Schematic representation of repetitive DNA within potentially phase variable type III R–M systems. The methylase (mod) genes, restriction endonuclease (res) genes and the repeat regions are indicated. McaRI, M. catarrhalis (AY049056); McaRII, M. catarrhalis (AY049057); Ph, P. haemolytica (AF060119) [48]; NmI, N. meningitidis Z2491 (NMA1467, CAB84700) [49]; NmII, N. meningitidis Z2491 (CAB84817, CAB84818) [49]; Hi, H. influenzae Rd (AAC22721, AAC22720) [50]; Hp, H. pylori 26695 (AAD07659, AAD07657) [51]. *These genes are also found in N. meningitidis MC58 (NMB1261, AAF41638; NMB1375) [52] and Neisseria gonorrhoeae (University of Oklahoma's Advanced Centre for Genome Technology, OK, USA). Conserved regions within type III R–M systems are also shown. The DPPY motif is involved in catalysis of the methylation reaction [20], the FxGxG motif is involved in binding of the methyl donor (AdoMet) [20,21], and the TGxGKT motif is involved in ATP binding [20,53].

The sequence similarity (Table 1) and the presence of the tetranucleotide repeats within the predicted coding regions suggest that these loci represent two novel, phase variable type III R–M systems. This is supported by the fact that the reading frame of the methylase gene is influenced by the number of repeat units present, and that the number of repeats contained in Mca RI mod varies in number both within and between strains.

4 Discussion

We have identified and cloned two novel, potentially phase variable type III R–M systems in M. catarrhalis. This report adds to the growing list of repeat-containing, potentially phase variable methylases within type III R–M systems (Fig. 3). In addition, P. haemolytica contains a type I R–M system with 5′-(CACAG)n-3′ repeats [23,24], and M. pulmonis has a phase variable type I system mediated by site-specific inversion [25,26].

R–M systems are ubiquitous in bacteria and are traditionally described as simple immune systems that protect the host against infection by foreign DNA [27], including phage DNA (although R–M systems may protect incompletely, or have an ‘ephemeral’ role in protection) [28–30]. The distinct methylation pattern distinguishes ‘self’ from ‘non-self’ DNA, and incoming DNA is subjected to endonucleolytic cleavage. In this context, the advantage of phase variable R–M systems is not obvious. The prevalence of phase variable methylases and R–M systems in bacteria suggests that important, hitherto unrecognised functions are being fulfilled. There are possible implications for several cellular processes such as inter- and intra-species transformation and genetic regulation. A role in pathogenicity has also been suggested [29,31].

R–M systems are classified into three groups (type I, II and III) based on differences in the subunit structures of their enzymes, cofactor requirements, recognition sites and enzymatic mechanisms (see [27,32] for reviews of R–M systems). The components of type III R–M systems catalyse two distinct reactions: (1) the modification enzyme/methylase (Mod) is required for sequence recognition in both modification and restriction reactions, and catalyses the post-replicative addition of a methyl group to an adenine residue in a specific DNA sequence; (2) the cognate restriction endonuclease (Res) recognises the same sequence and catalyses double stranded cleavage of unmethylated foreign DNA in the presence of the mod gene product [33,34]. Type I R–M systems similarly require a complex of the hsd RMS gene products (the restriction, modification and specificity subunits) for endonuclease activity. In contrast, the restriction and modification proteins of type II R–M systems act independently of each other.

The requirement for protection against autodegradation and the consequent linkage of genes of R–M systems has led to type II R–M systems being considered examples of ‘selfish genes’ [35]. Several authors have suggested that phase variation of methylases may lead to autolytic self-DNA degradation by the cognate restriction enzyme and that such systems may be suicidal [12,29]. More specifically, it has been suggested “for bacterial species possessing natural transformation systems the induction of phase variable restriction activity may be part of an autolysis process that releases DNA into the environment for uptake by other cells” [29]. Saunders et al. [12] suggest that a population gains a selective advantage through “‘bacterial suicide’ by a proportion of that population”. These observations lead to the conclusion that such a system comprises “a remarkably ‘unselfish gene’” [12] and that “the phase variable nature of the system strongly argues against the selfish behaviour hypothesis” [29]. When the known biology of restriction systems is reviewed (see above), it is obvious that type I and type III restriction subunits are not active in isolation. If a type I or type III methylase gene is not expressed (switched off by phase variation), the resulting phenotype would be a non-functional R–M system. If during replication a type III R–M system is switched on again, then cellular conditions are such that methylation is favoured over restriction [36]. Thus for type I and type III R–M systems, phase variation of a methylase subunit will not lead inevitably to suicidal self-restriction: the repeat-containing R–M systems identified to date are of the type I and type III families.

Many of the organisms in which phase variable R–M systems have been identified are naturally transformable [37]. In the case of N. meningitidis, N. gonorrhoeae and H. influenzae, DNA uptake is predominantly confined to double stranded DNA containing genus-specific uptake sequences, and is the primary route of genetic exchange. N. meningitidis and N. gonorrhoeae utilise intergenomic exchange as a mechanism to generate antigenic variation of pilin (type 4 fimbriae) [38]. In addition, the mosaic structure of many genes in these bacteria indicates a history of recombination and implies integration of short DNA fragments. Methylation states have been shown to affect transformation efficiency [39,40] and the on/off switching of methylation would ensure that at any point a proportion of the population would have a different methylation pattern to the majority. The consequent uptake of this differentially methylated DNA would result in restriction and provide substrate for one or a few small insertions, without the obvious potential consequences of large-scale integration of ‘non-self’ DNA. Similarly, DNA from heterologous sources will be restricted. Conversely, if incoming DNA contains the same methylation pattern, this DNA will not be restricted and larger fragments may be incorporated, potentially contributing to larger scale genomic recombination, and possibly rearrangement. Such large-scale rearrangement has been noted for N. gonorrhoeae [41]. By analogy to the (not naturally competent) species E. coli, R–M system differences (and presumably differential methylation) have been implicated in size of DNA replacements in P1 transduction experiments ([42] and references therein). Hence, R–M systems may operate as a mechanism for producing many ‘recombinogenic ends’ [43].

Another possible role for phase variable R–M systems is in providing an additional layer of gene regulation. Regulation of virulence determinant expression by methylation has been well documented. Dam methylation affects Pap pili [44] and Ag43 [28] expression in E. coli, virulence gene expression in Salmonella typhimurium [45] and the rate of phase variation of capsular polysaccharide expression in N. meningitidis [46] (this is not seen in all phase variable systems in N. meningitidis and is not the result of mismatch repair deficiencies [47]). It has also been suggested that phase variation of R–M activity is associated with antigenic variation in M. pulmonis [25,26].


Work in M.P.J.'s laboratory is supported by the NHMRC. K.L.S. is supported by an Australian Postgraduate Award.

