Sarah Auburn; Ulrike Böhme; Sascha Steinbiss; Hidayat Trimarsanto; Jessica Hostetler; Mandy Sanders; Qi Gao; Francois Nosten; Chris I. Newbold; Matthew Berriman; Ric N. Price; Thomas D. Otto
Menzies School of Health Research; Wellcome Trust Sanger Institute; Eijkman Institute for Molecular Biology; National Institute of Allergy and Infectious Diseases; Jiangsu Institute of Parasitic Diseases; Mahidol University; Nuffield Department of Clinical Medicine; Weatherall Institute of Molecular Medicine
2018-12-11; 2019-03-14; 2016-01-01
Wellcome Open Research. Vol. 1 (2016)
2398502X
2-s2.0-85014557853
https://repository.li.mahidol.ac.th/handle/20.500.14594/43272

© 2016 Auburn S et al. Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous ex vivo culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented, with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua, Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is greatly improved over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres.
An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and the PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria. PvP01 is maintained at GeneDB and ongoing curation will ensure continual improvements in assembly and annotation quality.

Mahidol University
Biochemistry, Genetics and Molecular Biology
A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes
Article
SCOPUS
10.12688/wellcomeopenres.9876.1