Genomic sequencing of Orientia

Orientia tsutsugamushi is difficult to culture and sequence, and the first complete genomes, those of the Boryong and Ikeda strains, were not published until 2007 and 2008 respectively. These two genomes revealed an extraordinary genome with extensive repeat amplification. More recently, further complete genomes have been obtained by combining long-read and short-read next-generation sequencing. Long-read sequencing, on the PacBio and Oxford Nanopore Technologies platforms, allows a complete sequence to be assembled despite the extreme repeat content of O. tsutsugamushi. The short reads are then used to correct errors arising from the higher error rate of the long-read technologies.

Comparative genomic analysis of the ten complete genomes available confirms widespread amplification of repeat elements, with wide variation in the copy number of different highly amplified genes. It also shows extensive chromosomal rearrangement and loss of synteny: few of the conserved core genes are retained in the same order between strains.

Short-read sequencing has also been successfully used to investigate the genomics of O. tsutsugamushi. While it is difficult to assemble complete sequences using short reads alone, due to the heavy repeat content, the reads can be compared to a reference genome and used to call single nucleotide polymorphisms (SNPs) and short insertions and deletions, allowing for phylogenetic analysis of strains.
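The reference-based comparison described above can be illustrated with a minimal Python sketch. It assumes each strain has already been reduced to a consensus sequence aligned column-for-column with the reference, which skips the read mapping and quality filtering that real variant-calling tools perform. The function names `call_snps` and `snp_distance`, the strain labels, and the toy sequences are all hypothetical and are not taken from any published pipeline.

```python
from itertools import combinations

BASES = set("ACGT")


def call_snps(reference: str, sample: str):
    """Return (position, ref_base, sample_base) for each SNP.

    Both sequences must be pre-aligned and of equal length. Columns
    containing a gap ('-') or an ambiguous base are skipped, so only
    single-nucleotide substitutions are reported. Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for pos, (ref_base, alt_base) in enumerate(zip(reference.upper(), sample.upper()), start=1):
        if ref_base in BASES and alt_base in BASES and ref_base != alt_base:
            snps.append((pos, ref_base, alt_base))
    return snps


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count substitution differences between two aligned sequences."""
    return len(call_snps(seq_a, seq_b))


# Toy data: invented sequences, not real O. tsutsugamushi loci.
reference = "ATGCCGTTAGC"
strains = {
    "strain_A": "ATGCCGTTAGC",
    "strain_B": "ATGACGTTAGC",
    "strain_C": "ATGACGTTGGC",
}

snps_by_strain = {name: call_snps(reference, seq) for name, seq in strains.items()}

# Pairwise SNP distances are a common starting point for building a tree.
distances = {
    (a, b): snp_distance(strains[a], strains[b])
    for a, b in combinations(sorted(strains), 2)
}

for name, snps in snps_by_strain.items():
    print(name, snps)
for pair, dist in distances.items():
    print(pair, dist)
```

In this toy example the pairwise distances place strain_B between strain_A and strain_C, which is the kind of signal a phylogenetic method would then use.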

When whole genomes are not available, the relationships between strains of O. tsutsugamushi can be determined by multilocus sequence typing (MLST), which examines the sequences of seven housekeeping genes and systematically assigns a number to each new allele of a gene, and to each combination of alleles. Alternatively, the highly variable sequences of the 56 kDa and 47 kDa genes can be compared, as there is enough diversity in these genes to distinguish strains.
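The MLST numbering scheme can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: each previously unseen sequence at a locus receives the next allele number, and each previously unseen combination of allele numbers receives the next sequence type. The class `MLSTScheme`, the placeholder locus names `locus1` to `locus7`, and the toy sequences are assumptions made for the example. Real schemes use curated databases and the actual housekeeping gene names.

```python
class MLSTScheme:
    """Toy MLST bookkeeping: allele numbers per locus, plus sequence types."""

    def __init__(self, loci):
        self.loci = tuple(loci)
        # For each locus, map allele sequence -> allele number.
        self.alleles = {locus: {} for locus in self.loci}
        # Map allelic profile (tuple of allele numbers) -> sequence type.
        self.profiles = {}

    def allele_number(self, locus, sequence):
        """Return the allele number, assigning the next one if the sequence is new."""
        table = self.alleles[locus]
        key = sequence.upper()
        if key not in table:
            table[key] = len(table) + 1
        return table[key]

    def type_strain(self, sequences):
        """Return (profile, sequence_type) for a dict of locus -> sequence."""
        profile = tuple(self.allele_number(locus, sequences[locus]) for locus in self.loci)
        if profile not in self.profiles:
            self.profiles[profile] = len(self.profiles) + 1
        return profile, self.profiles[profile]


# Hypothetical locus names and invented sequences, for illustration only.
loci = [f"locus{i}" for i in range(1, 8)]
scheme = MLSTScheme(loci)

strain_1 = {locus: "ATGC" for locus in loci}
strain_2 = dict(strain_1, locus3="ATGA")  # differs at one locus
strain_3 = dict(strain_1)                 # identical to strain_1

result_1 = scheme.type_strain(strain_1)
result_2 = scheme.type_strain(strain_2)
result_3 = scheme.type_strain(strain_3)

print(result_1)
print(result_2)
print(result_3)
```

Two strains with identical sequences at all seven loci share a sequence type, while a single new allele at any locus produces a new profile and therefore a new sequence type.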