WormBase ParaSite | Version: WBPS19 (WS291) | Archive: WBPS18

Heligmosomoides polygyrus (Heligmosomoides bakeri)

BioProject PRJEB15396 | Data Source University of Edinburgh | Taxonomy ID 6339

About Heligmosomoides polygyrus (Heligmosomoides bakeri)

The nematode Heligmosomoides polygyrus (formerly known as Nematospiroides dubius) is a common parasite of the duodenum and small intestine of wood mice and other rodents. The sequenced laboratory strain was originally isolated from Peromyscus in California (Behnke and Harris, 2010). Natural parasites from European wood mice (Apodemus sp.) are somewhat distinct, which has led to debate about the correct name and taxonomy of this species (Cable et al., 2006; Behnke and Harris, 2010; Maizels et al., 2010): it has been suggested that the laboratory strain be named H. polygyrus bakeri or H. bakeri, and the European strain H. polygyrus polygyrus. The laboratory strain has been maintained as described by Camberis et al. (2003) and is often used to model human helminth infection because it can establish chronic infection in different strains of mice.

There is 1 alternative genome project for Heligmosomoides polygyrus available in WormBase ParaSite: PRJEB1203

Genome Assembly & Annotation


The genome of H. polygyrus was produced by the Blaxter Laboratory at the University of Edinburgh. According to Chow et al. (2019), the short-read Illumina data were assembled and gapfilled using Platanus, and scaffolded using transcriptome evidence with SCUBAT2. Long-read PacBio data were used to further scaffold and gapfill the assembly with PBJelly.


The gene predictions were produced by the Blaxter Laboratory at the University of Edinburgh. According to Chow et al. (2019), the BRAKER pipeline was used to predict protein-coding genes, with the RNA-Seq reads as evidence. The BRAKER general feature format (GFF) file and the transcriptome assembly were then combined within MAKER2 to predict untranslated regions of transcripts and to remove low-quality gene predictions.

Assembly Statistics

Assembly: nHp_v2.0, GCA_900096555.1
Database Version: WBPS19
Genome Size: 696,954,138 bp
Data Source: University of Edinburgh
Annotation Version: 2016-09-WormBase

Gene counts

Coding genes: 23,471
Gene transcripts: 25,215
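
The counts above can be combined into two simple derived figures: the mean number of transcripts per coding gene and the gene density per megabase. The following is a minimal sketch, not part of the WormBase ParaSite page. The function name `summarize_annotation` and the dictionary keys are hypothetical, and the genome size is assumed to be given in base pairs.

```python
# Illustrative only: derives two summary figures from the statistics listed above.
# Function name and output keys are hypothetical, not a WormBase ParaSite API.

def summarize_annotation(genome_size_bp: int, coding_genes: int, transcripts: int) -> dict:
    """Return transcripts per gene and coding genes per megabase."""
    if genome_size_bp <= 0 or coding_genes <= 0:
        raise ValueError("genome size and gene count must be positive")
    return {
        # Mean number of annotated transcripts per coding gene.
        "transcripts_per_gene": transcripts / coding_genes,
        # Coding genes per 1,000,000 bp of assembled sequence.
        "genes_per_mb": coding_genes / (genome_size_bp / 1_000_000),
    }


# Values copied from the Assembly Statistics and Gene counts sections.
stats = summarize_annotation(
    genome_size_bp=696_954_138,
    coding_genes=23_471,
    transcripts=25_215,
)

for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Both figures are descriptive only. They depend on the annotation version and should not be compared across assemblies without checking how each was annotated.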
