Species: Spodoptera exigua (Beet armyworm)

Order: Lepidoptera

Family: Noctuidae

Genus: Spodoptera

Spodoptera exigua (beet armyworm moth) is one of the best-known agricultural pest insects. It is native to Asia but has been introduced worldwide and is now found wherever its host plants grow. In Britain it is an introduced species and is not known to breed. The beet armyworm does not tolerate cold: it can overwinter in warm areas, but in colder areas it dies off during the winter, and the region is reinvaded by adult moths as the weather warms and crop plants sprout.

The larvae feed on the foliage and fruits of plants and can completely defoliate small ones. Smaller larvae devour the parenchyma of leaves, so that only the thin epidermis and veins remain. Larger larvae tend to burrow holes through thick areas of plants. They will burrow straight into a head of lettuce rather than neatly removing tissue from one particular leaf, rendering the produce unmarketable. Larvae also attack buds and new growth, preventing flowers from opening, new leaves from sprouting, and vegetables from developing. The host range includes asparagus, beans and peas, sugar and table beets, celery, cole crops, lettuce, potato, tomato, cotton, cereals, oilseeds, tobacco, cannabis, many flowers, and a multitude of weed species.

Source: Wikipedia https://en.wikipedia.org/wiki/Beet_armyworm

Sample collection

An insecticide-susceptible individual from a culture maintained at Bayer since 2007, originally collected from an unknown host plant (Andermatt, Switzerland).

Next Generation Sequencing

i) Illumina 10X Genomics 150 bp paired-end data: 431,917,580 reads and 64,787,637,000 bp.

ii) PacBio CLR data: 938,328 reads totalling 21,230,928,495 bp, with a mean read length of 22,626 bp and a read length N50 of 35,348 bp. DNA was extracted using DNAzol (2000 ng) at Rothamsted.
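The read statistics above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the code used in the project: the function names `n50` and `depth` are hypothetical, the toy read lengths are made up for illustration, and the depth estimate assumes the final assembly size reported under Final Results (450,855,128 bp) is a reasonable proxy for the genome size.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


def depth(total_bases, genome_size):
    """Approximate sequencing depth as total bases / genome size."""
    return total_bases / genome_size


# Figures quoted in this document.
ILLUMINA_READS = 431_917_580
ILLUMINA_READ_LEN = 150
ILLUMINA_BASES = 64_787_637_000
PACBIO_READS = 938_328
PACBIO_BASES = 21_230_928_495
ASSEMBLY_SIZE = 450_855_128  # assumption: assembly size stands in for genome size

# The Illumina totals are self-consistent: reads x read length = bases.
assert ILLUMINA_READS * ILLUMINA_READ_LEN == ILLUMINA_BASES

illumina_depth = depth(ILLUMINA_BASES, ASSEMBLY_SIZE)
pacbio_depth = depth(PACBIO_BASES, ASSEMBLY_SIZE)
pacbio_mean_len = PACBIO_BASES / PACBIO_READS

# Toy example (made-up lengths, not real reads): total 20, half 10.
toy_n50 = n50([2, 3, 4, 5, 6])

print(f"Illumina depth: {illumina_depth:.1f}x")
print(f"PacBio depth:   {pacbio_depth:.1f}x")
print(f"PacBio mean read length: {pacbio_mean_len:.0f} bp")
print(f"Toy N50: {toy_n50}")
```

Under that assumption the Illumina data correspond to roughly 144x depth and the PacBio data to roughly 47x, and the PacBio mean read length recomputed from the totals agrees with the quoted 22,626 bp.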


DNA from a single individual was used for both PacBio CLR and 10X Genomics sequencing (Georgia genomics). Falcon was used to assemble the PacBio CLR reads, and the 10X Genomics reads were used for error correction (Pilon). Haplotigs were removed (Redundans). Unmapped reads were mapped back to the original assembly to check for missing sequence, and any missing sequence was incorporated into the final assembly. Manual curation was done to bring the genome together and check for misassemblies.

Public RNA-seq data from PRJNA171128 (egg, nymph, pupa, adult tissues) and PRJNA188757 (eggs, 1st to 5th instar larvae, pupae, male and female adults) were assembled into a transcriptome (BUSCO: C:97.0%[S:59.4%,D:37.6%],F:1.3%,M:1.7%) and used in the Maker2 annotation pipeline with trained Augustus and GeneMark gene predictors. PASA was used to update the gene models: adding UTRs, correcting existing models and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.

A Pfam genomic track was created by translating the genome in all six reading frames and using HMMER to identify loci of interest on the genome, e.g. P450 Pfam domains. Using this information, loci of interest including UDP-glycosyltransferase (UGT), P450, ABC transporter and IRAC gene models were found and curated using mapped RNA-seq.
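To illustrate what translating into six reading frames means, here is a minimal sketch in plain Python. It is not the tool used in the project (the text does not name one); the function names are hypothetical, the standard genetic code is assumed, and codons containing ambiguous bases are written as X. Each of the resulting protein strings could then be searched against Pfam profiles with HMMER.

```python
# Standard genetic code, built from the compact TCAG-ordered string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence from its first base; '*' marks stop codons,
    'X' marks codons with non-ACGT bases. Trailing partial codons are dropped."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict of the three forward (+1..+3) and three reverse
    (-1..-3) frame translations of seq."""
    frames = {}
    rc = reverse_complement(seq)
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Toy sequence, for illustration only.
for name, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
    print(name, protein)
```

Mapping a domain hit back to the genome requires converting protein coordinates to nucleotide coordinates using the frame offset and strand, which this sketch leaves out.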

Final Results

A complete, annotated 31-chromosome assembly was deposited at NCBI under accession PRJEB36598 (including raw data).

BUSCO (Insecta odb10): C:98.9%,F:0.0%,M:1.1%

13,002 gene models - BUSCO C:95.6%[S:90.5%,D:5.1%],F:0.7%,M:3.7%

Scaffold No. (incl Mt): 37

N50 (bp): 15,192,570

N bases (bp): 499

Repeat: 29.56%

Total size (bp) (chr no.): 450,855,128 (31)

Curated: 100 P450, 51 ABC transporter and 31 UGT gene models, plus the majority (117 of 130) of the defined IRAC gene models.

A public version of the genome (WH-S strain), assembled using Hi-C and released during the project, compares well with this assembly and shows good collinearity.
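The BUSCO summary strings quoted in this section use the short notation C (complete), S (single-copy), D (duplicated), F (fragmented) and M (missing). As a minimal sketch for readers who want to compare these values programmatically, the snippet below parses such a string and checks that the percentages add up. The helper names `parse_busco` and `is_consistent` and the rounding tolerance are assumptions made for illustration, not part of BUSCO.

```python
import re

# One "X:nn.n%" field, where X is one of the BUSCO category letters.
BUSCO_FIELD = re.compile(r"([CSDFM]):(\d+(?:\.\d+)?)%")


def parse_busco(summary):
    """Parse a BUSCO short summary string into {category: percentage}."""
    return {key: float(value) for key, value in BUSCO_FIELD.findall(summary)}


def is_consistent(busco, tol=0.11):
    """Check that C + F + M is about 100% and, when S and D are present,
    that S + D is about C. The tolerance allows for one-decimal rounding."""
    total_ok = abs(busco["C"] + busco["F"] + busco["M"] - 100.0) <= tol
    split_ok = True
    if "S" in busco and "D" in busco:
        split_ok = abs(busco["S"] + busco["D"] - busco["C"]) <= tol
    return total_ok and split_ok


# Strings copied from this document.
genome = parse_busco("C:98.9%,F:0.0%,M:1.1%")
genes = parse_busco("C:95.6%[S:90.5%,D:5.1%],F:0.7%,M:3.7%")
transcriptome = parse_busco("C:97.0%[S:59.4%,D:37.6%],F:1.3%,M:1.7%")

for label, busco in [("genome", genome), ("gene models", genes),
                     ("transcriptome", transcriptome)]:
    print(label, busco, "consistent:", is_consistent(busco))
```

All three summaries quoted in this document pass the check.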

Other files

These are files that were not submitted to NCBI but might be useful.

Genomic PFAM annotation track

Non-coding RNA annotation track

Repeat library

Repeat annotation track