WormBase ParaSite | Version: WBPS19 (WS291) | Archive: WBPS18

Schmidtea mediterranea

BioProject: PRJNA885486 | Data Source: Max Planck Institute of Molecular Cell Biology and Genetics | Taxonomy ID: 79327

About Schmidtea mediterranea

The freshwater planarian Schmidtea mediterranea is commonly used as a model for regeneration and development of tissues, due to its ability to regenerate large portions of missing body parts.

There is 1 alternative genome project for Schmidtea mediterranea available in WormBase ParaSite: PRJNA12585

Genome Assembly & Annotation


This genome was produced by the Department Rink at the Max Planck Institute for Multidisciplinary Sciences and PlanMine.

Unpublished data

This is an unpublished genome project (publication pending). Before working with these data, please read our Policy for using unpublished data.

Circular consensus sequences were called from ~30x coverage PacBio reads using pbccs v6.0.0, and reads with quality > 0.99 (Q20) were taken forward as "HiFi" reads. Additionally, 1,000 million Hi-C reads were generated from extracted nuclei of whole animals using the Arima-HiC+ Kit. PacBio HiFi and Hi-C reads were used to assemble phased contigs with hifiasm v0.7. Next, Hi-C reads with a mapping quality of at least 10 (-q 10) were used to scaffold the contigs of each haplotype with SALSA v2 following the hic-pipeline, which includes filtering steps such as removal of experimental artifacts from Hi-C alignments, correction of Hi-C mate-pair information, and removal of PCR duplicates. Four chromosome-level scaffolds were observed in both haplotypes after scaffolding. However, the Hi-C heatmap revealed contigs that were misplaced in position or orientation. These errors were manually curated based on the interaction frequency indicated by the intensity of the Hi-C signal.
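The two thresholds quoted above (read quality > 0.99, described as Q20, and Hi-C mapping quality of at least 10) both use the Phred scale, where Q = -10 * log10(error probability). The following is a minimal Python sketch of that relationship only, not the pipeline used for this assembly; the function names and default arguments are hypothetical and chosen for illustration.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a predicted read accuracy in [0, 1) to a Phred-scaled quality."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def phred_to_error(q: float) -> float:
    """Convert a Phred-scaled quality (base, read, or mapping) to an error probability."""
    return 10.0 ** (-q / 10.0)


def is_hifi(accuracy: float, threshold: float = 0.99) -> bool:
    """Hypothetical helper: keep reads whose predicted accuracy exceeds the threshold."""
    return accuracy > threshold


def passes_mapq(mapq: int, min_mapq: int = 10) -> bool:
    """Hypothetical helper: keep alignments with mapping quality of at least min_mapq."""
    return mapq >= min_mapq


if __name__ == "__main__":
    # An accuracy of 0.99 corresponds to roughly Q20.
    print(round(accuracy_to_phred(0.99), 3))
    # A mapping quality of 10 corresponds to about a 1-in-10 chance of a wrong placement.
    print(phred_to_error(10))
```

Read this way, the -q 10 cut-off is fairly permissive: it discards only alignments whose estimated probability of being misplaced is above roughly 10%.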

This genome page corresponds to Haplotype 2 of this phased assembly. Haplotype 1 is available on a separate genome page.


The gene predictions were made by the Department Rink at the Max Planck Institute for Multidisciplinary Sciences.

This annotation is unpublished (publication pending). To work with these data, you must follow our Unpublished data Usage Policy.

The transcript annotation was generated by a hybrid genome-guided approach that relied on dedicated long-read Oxford Nanopore cDNA/dRNA sequencing runs, together with Illumina short-read and polyadenylation data obtained from publicly available datasets. After read quality trimming, deduplication, filtering, and mapping (using HISAT and minimap2 for short and long reads, respectively), a draft transcriptome was generated using StringTie2. It was then refined using FLAIR and a collection of custom scripts to retain high-confidence isoforms.

Assembly Statistics

Database Version: WBPS19
Genome Size: 819,865,861 bp
Data Source: Max Planck Institute of Molecular Cell Biology and Genetics
Annotation Version: 2022-10-WormBase

Gene counts

Coding genes: 21,310
Gene transcripts: 41,036
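
As a rough reading of these counts, the average number of transcripts per coding gene can be computed directly. The sketch below assumes that all listed transcripts belong to the listed coding genes, which this page does not state; the variable and function names are illustrative.

```python
# Counts taken from the "Gene counts" section of this page.
CODING_GENES = 21_310
GENE_TRANSCRIPTS = 41_036


def transcripts_per_gene(transcripts: int, genes: int) -> float:
    """Average number of transcripts per gene (assumes every transcript maps to a counted gene)."""
    if genes <= 0:
        raise ValueError("genes must be positive")
    return transcripts / genes


if __name__ == "__main__":
    ratio = transcripts_per_gene(GENE_TRANSCRIPTS, CODING_GENES)
    print(f"{ratio:.2f} transcripts per coding gene")
```

Under that assumption, the annotation averages a little under two isoforms per coding gene.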
