Ceutorhynchus assimilis


Species: Ceutorhynchus assimilis (Paykull) – Cabbage seed pod weevil

Order: Coleoptera

Family: Curculionidae

Subfamily: Ceutorhynchinae

Genus: Ceutorhynchus

Cabbage seed weevils are widespread throughout the UK, although they are generally considered to be less abundant than in previous years. Adult feeding on young flowers and pods has little impact on yield, so treatment is not necessary during migration into crops. Larvae feeding in the pods can damage up to a quarter of the developing seeds, and where a high percentage of pods are affected this can equate to an overall yield loss of 5-10%. Additional yield losses may result from brassica pod midge, which can exploit feeding damage and egg-laying scars to deposit its eggs.

The adult weevil usually lays a single egg per seed pod, and it has been estimated that one adult cabbage seed weevil can lay as many as 50 eggs in a season. After hatching, the larvae feed on the developing seeds. The damage done by the developing larvae is relatively minor; the secondary damage caused by pod midges, which lay their eggs through the puncture holes, is more economically damaging.

Source: https://cropscience.bayer.co.uk/threats/pest-and-slugs/cabbage-seed-weevil/

Sample collection

Collected from Rothamsted Farm and Cross Farm in Harpenden during Oilseed Rape harvest in July 2020. Cross Farm GPS: 51.803468, -0.331117.

Next Generation Sequencing

i) Illumina Hi-C sequencing, 150 bp paired-end data: 290,805,515 reads and 64x coverage.

ii) Illumina RNA-seq, 150 bp paired-end data, totalling 193,174,246 reads.

iii) PacBio HiFi data: mean read length 11,109 bp, total reads 1,215,508, read length N50 12,165 bp, and total bases 13,503,462,642. DNA was extracted using MagAttract at the University of Delaware (600 ng gDNA).
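The read statistics above can be cross-checked with simple arithmetic. The sketch below is an illustration only: it assumes the Hi-C read count refers to individual 150 bp reads, and it uses the total assembly size reported under Final Results (674,950,870 bp) as the genome size. The function and variable names are made up for this example.

```python
# Figures copied from this README; the genome size is the total assembly
# size reported under "Final Results".
GENOME_SIZE_BP = 674_950_870


def coverage(total_bases: int, genome_size: int = GENOME_SIZE_BP) -> float:
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Hi-C: assumption - the read count refers to individual 150 bp reads.
hic_bases = 290_805_515 * 150
hic_cov = coverage(hic_bases)

# PacBio HiFi: total bases are reported directly.
hifi_bases = 13_503_462_642
hifi_cov = coverage(hifi_bases)

# Cross-check: total bases / read count should match the stated mean length.
hifi_mean_len = hifi_bases / 1_215_508

print(f"Hi-C coverage:  {hic_cov:.1f}x")
print(f"HiFi coverage:  {hifi_cov:.1f}x")
print(f"HiFi mean read: {hifi_mean_len:,.0f} bp")
```

Under these assumptions the Hi-C figure comes out at roughly 64.6x, in line with the 64x quoted above, and the HiFi data at about 20x, with a mean read length matching the reported 11,109 bp.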


DNA from a single non-sexed individual was used for PacBio HiFi sequencing (University of Delaware, USA) and DNA from multiple individuals for Hi-C Illumina sequencing (Arima Genomics, USA). Hifiasm was used to assemble the PacBio HiFi reads, followed by Juicer and then 3d-dna with the Hi-C data for chromosome-level assembly. Haplotigs were removed (purge_haplotigs). Unmapped reads were mapped back to the original assembly to check for missing sequence, which was incorporated into the final assembly. Manual curation was done to bring the genome together and check for misassemblies.

PGI RNA-seq was assembled into a transcriptome (BUSCO: C:98.0%[S:95.0%,D:3.0%],F:0.7%,M:1.3%) and used in the Maker2 annotation pipeline with trained Augustus and GeneMark gene predictors. PASA was used to update the gene models by adding UTRs, correcting existing models and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
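BUSCO summary strings like the one above are easy to mistype, so a quick consistency check can be useful. The following is a minimal sketch, not part of the original pipeline: the helper names `parse_busco` and `is_consistent` are hypothetical, and the tolerance simply allows for rounding to one decimal place.

```python
import re


def parse_busco(summary: str) -> dict:
    """Parse a BUSCO short summary, e.g. 'C:98.0%[S:95.0%,D:3.0%],F:0.7%,M:1.3%'."""
    return {key: float(val) for key, val in re.findall(r"([CSDFM]):([\d.]+)%", summary)}


def is_consistent(scores: dict, tol: float = 0.11) -> bool:
    """Check C + F + M is about 100 and, when present, S + D is about C."""
    ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    if "S" in scores and "D" in scores:
        ok = ok and abs(scores["S"] + scores["D"] - scores["C"]) <= tol
    return ok


# Transcriptome BUSCO string quoted in this README.
transcriptome = parse_busco("C:98.0%[S:95.0%,D:3.0%],F:0.7%,M:1.3%")
print(transcriptome, is_consistent(transcriptome))
```

The same check works for the shorter form without the single-copy and duplicated breakdown, such as the genome-level string given under Final Results.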

A Pfam genomic track was created by translating the genome in all six reading frames and using hmmer to identify loci of interest, e.g. P450 Pfam domains on the genome. Using this information, loci of interest including UGT (UDP-glycosyltransferase), P450, ABC and IRAC gene models were found and curated using mapped RNA-seq and a Maker gene annotation.
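The six-frame conversion mentioned above can be illustrated with a short sketch. The tool actually used for this step is not stated here, so the code below is only an assumption-laden example of the idea: it translates a DNA string in the three forward and three reverse-complement frames using the standard genetic code, producing protein sequences that a profile search such as hmmer could then scan for Pfam domains. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translate(seq: str) -> dict:
    """Translate a DNA sequence in frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for name, protein in six_frame_translate("ATGGCCTAA").items():
    print(name, protein)
```

On a real genome each frame would be split at stop codons and the coordinates of any domain hit mapped back to the scaffold, which is what makes the result usable as a genomic track.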

Two endosymbiont genomes were also assembled: Rickettsia (4,063,491 bp) and Wolbachia (1,506,389 bp).

Final Results

A complete annotated 4 chromosome assembly deposited at NCBI under accession PRJEB47903 (incl. raw data).

BUSCO (Insecta odb10): C:98.1%,F:0.7%,M:1.2%

14,643 gene models - BUSCO C:95.8%[S:91.7%,D:4.1%],F:1.3%,M:2.9%

Scaffold No. (Incl Mt): 45

N50 (bp): 55,466,945

N bases (bp): 289,927

Repeat: 66.43%

Total size (bp) (chr no.): 674,950,870 (16)

Curated: 94 P450, 73 ABC transporter and 40 UGT gene models, and the majority (122/130) of defined IRAC gene models.
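For readers unfamiliar with the N50 statistic reported above, the sketch below shows how it is conventionally computed: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the assembly size. The lengths used here are hypothetical toy values, not the real scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy scaffold lengths (total 290, half 145).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

With the toy values the two longest scaffolds (80 + 70 = 150) are the first to pass the halfway mark of 145, so the function returns 70.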

Other files

These are files that were not submitted to NCBI but might be useful.

Genomic PFAM annotation track

Non-coding RNA annotation track

Repeat annotation track

Repeat library