Globodera pallida

BioProject PRJNA702104 | Data Source University of St Andrews | Taxonomy ID 36090

About Globodera pallida

The nematode Globodera pallida, the pale potato cyst nematode (PCN), is one of the most economically important nematode pests of the U.K. potato industry. PCN imposes an annual cost in excess of 50 million on U.K. potato growers and threatens the future of the crop for many growers. Effective control of G. pallida is essential to maintain the competitiveness of U.K. potato production.

There are 2 alternative genome projects for Globodera pallida available in WormBase ParaSite: PRJEB123 and PRJNA764088

Genome Assembly & Annotation


Unpublished data

This is an unpublished genome project (publication pending). Before working with these data, please read our Policy for using unpublished data.

A genome assembly pipeline was used to generate a high-quality assembly from long sequencing reads. Canu was used for error correction, followed by assembly with wtdbg2. The assembly was assessed for contamination using BlobTools, and a BLASTn search was then used to identify remaining contaminant contigs, which were removed. After finishing (FinisherSC) and haplotig purging (Purge Haplotigs), the assembly was scaffolded with long reads using SSPACE-LongRead v1.1 and gaps were filled using GapFinisher. The assembly was then polished using PacBio reads with Arrow and using both short and long reads with Pilon. The assembly was also phased using FreeBayes, WhatsHap, and BCFtools. All scripts used to conduct the analysis can be found here. The assembly was submitted directly to WormBase ParaSite by Dr Peter Thorpe. Awaiting publication.
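The contaminant-removal step described above can be pictured as a simple filter over contigs. The following Python sketch is illustrative only and is not the project's own script. It assumes BLASTn hits in the standard 12-column tabular format (`-outfmt 6`) from a search against a contaminant-only database, and the identity and alignment-length thresholds, the function names, and the contig and subject names are all hypothetical.

```python
from typing import Dict, Iterable, Set

# Illustrative thresholds only; the values used for this assembly are not stated.
MIN_IDENTITY = 90.0   # percent identity (column 3 of BLAST tabular output)
MIN_ALIGN_LEN = 500   # alignment length in bp (column 4)


def flag_contaminants(blast_lines: Iterable[str]) -> Set[str]:
    """Return contig IDs with at least one strong hit to a contaminant database.

    Assumes 12-column tabular BLASTn output (-outfmt 6) in which the query is
    an assembly contig and the subject comes from a contaminant-only database.
    """
    flagged = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        contig, identity, align_len = fields[0], float(fields[2]), int(fields[3])
        if identity >= MIN_IDENTITY and align_len >= MIN_ALIGN_LEN:
            flagged.add(contig)
    return flagged


def drop_contigs(contigs: Dict[str, str], flagged: Set[str]) -> Dict[str, str]:
    """Keep only the contigs that were not flagged as contaminants."""
    return {name: seq for name, seq in contigs.items() if name not in flagged}


# Toy data with hypothetical contig and subject names.
blast = [
    "contig_1\tbacteria_A\t98.5\t1200\t10\t2\t1\t1200\t50\t1249\t0.0\t2100",
    "contig_2\tbacteria_B\t82.0\t300\t40\t5\t1\t300\t10\t309\t1e-50\t200",
]
contigs = {"contig_1": "ACGT" * 300, "contig_2": "GGCC" * 100, "contig_3": "TTAA" * 50}
clean = drop_contigs(contigs, flag_contaminants(blast))
print(sorted(clean))  # contig_1 is removed; contig_2 falls below both thresholds
```

In practice the thresholds and the choice of reference database determine how aggressive the filter is, so they would need to be tuned and checked against the BlobTools results rather than taken from this example.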


BRAKER2 was used for gene model prediction, assisted by short-read RNAseq data from Cotton et al. (2014). The resulting BRAKER-predicted GFF file and the GeneMark-ET GFF files were passed to Funannotate. A DIAMOND BLASTP search against Swiss-Prot was also provided as evidence. The genome-guided RNAseq assembly was also passed to Funannotate, which runs PASA. The gene models were further refined in the “update” stage of Funannotate using a combination of PASA and StringTie (Pertea et al., 2015). All scripts used to conduct the analysis can be found here. The gene models were submitted directly to WormBase ParaSite by Dr Peter Thorpe. Awaiting publication.

Assembly Statistics

Assembly: SA_Gpal_Newton, GCA_023343765.1
Database Version: WBPS19
Genome Size: 119,582,986 bp
Data Source: University of St Andrews
Annotation Version: 2023-10-WormBase

Gene counts

Coding genes: 7,762
Non coding genes: 108
Small non coding genes: 108
Gene transcripts: 8,217
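
A few derived figures can be computed from the statistics above. The short Python sketch below uses only the values listed on this page. It assumes that the small non coding genes are a subset of the non coding genes, and that the transcript count covers coding and non coding genes together. Neither assumption is stated on the page.

```python
# Values copied from the statistics listed above.
genome_size_bp = 119_582_986
coding_genes = 7_762
non_coding_genes = 108
transcripts = 8_217

# Assumption: "Gene transcripts" counts transcripts of coding and
# non coding genes together, so both are included in the denominator.
total_genes = coding_genes + non_coding_genes
coding_density_per_mb = coding_genes / (genome_size_bp / 1_000_000)
transcripts_per_gene = transcripts / total_genes

print(f"Total genes: {total_genes}")
print(f"Coding genes per Mb: {coding_density_per_mb:.1f}")
print(f"Mean transcripts per gene: {transcripts_per_gene:.2f}")
```

Under these assumptions the annotation contains 7,870 genes, roughly 64.9 coding genes per megabase, and about 1.04 transcripts per gene.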

