Species: Nezara viridula (green stink bug)

Order: Hemiptera

Family: Pentatomidae

Genus: Nezara

N. viridula is native to Africa but is frequently imported to the UK in food produce. It is widespread in southern Europe and has been recorded annually from sites in southern England since 2003 on various foodplants, including tomato, beans, golden-rod, Lavatera, Viburnum and hollyhock. Many records are from allotments, where the bugs are associated with cultivated runner beans. Adults often overwinter indoors. N. viridula may attack all parts of a plant, including the stems and leaf veins, but the bugs feed mostly on fruiting structures and growing shoots.

In general, their piercing and sucking mouthparts puncture the plant tissues and form minute, hard, brownish or blackish spots. Feeding retards the growth of immature fruits, which the bugs prefer to over-ripe fruit, and distorts them, causing, for example, catfacing of peaches or premature drop. Flower drop in ornamental or cut flowers is sometimes a problem.

The feeding punctures also provide access for fungal and bacterial infections, some of which are toxic to vertebrates, for example those that invade nut or maize kernels. Some of the pathogens seem to be responsible for the fruit drop that follows feeding, for example in citrus. Even if the damage is not outwardly severe, the taste of the product may be badly affected, as with hazelnuts.

Sample collection

Insecticide susceptible. Originally collected in France in 2008 or earlier.

Next Generation Sequencing

i) Illumina 10X genomics 150 bp paired end data:

317,918,028 reads, 47,687,704,200 bp.

ii) PacBio CLR data: 64,063,006 reads, 424,109,366,461 bp, mean read length 6,645 bp, read length N50 7,580 bp. DNA was extracted at Rothamsted Research using a MagAttract kit (4,500 ng gDNA).
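As a rough cross-check of the read statistics above, the sketch below recomputes the Illumina base count from the read count and read length, and estimates mean sequencing depth for both datasets. It assumes the total assembly size reported under Final Results (1,185,128,795 bp) can stand in for the genome size; the variable and function names are illustrative only.

```python
# Figures copied from this page; names are illustrative, not part of any pipeline.
ILLUMINA_READS = 317_918_028
ILLUMINA_READ_LENGTH_BP = 150
ILLUMINA_BASES = 47_687_704_200
PACBIO_BASES = 424_109_366_461
ASSEMBLY_SIZE_BP = 1_185_128_795  # assumption: used as a proxy for genome size


def depth(total_bases: int, genome_size: int) -> float:
    """Return mean sequencing depth (total bases divided by genome size)."""
    return total_bases / genome_size


illumina_depth = depth(ILLUMINA_BASES, ASSEMBLY_SIZE_BP)
pacbio_depth = depth(PACBIO_BASES, ASSEMBLY_SIZE_BP)

if __name__ == "__main__":
    # The Illumina base count should equal reads x read length.
    print(ILLUMINA_READS * ILLUMINA_READ_LENGTH_BP == ILLUMINA_BASES)
    print(f"Illumina depth: {illumina_depth:.1f}x")
    print(f"PacBio depth: {pacbio_depth:.1f}x")
```

With the figures listed here, this gives roughly 40x for the Illumina data and roughly 358x for the PacBio data.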


DNA from a single individual was used for PacBio CLR (University of Georgia) and 10X genomics sequencing (Georgia genomics). Falcon was used to assemble the PacBio CLR reads, followed by Juicer and then 3d-dna with Hi-C data for chromosome-level assembly. Haplotigs were removed (purge_haplotigs). Manual curation was done to bring the genome together and check for misassemblies. Unmapped reads were mapped back to the original assembly to check for missing sequence, which was incorporated into the final assembly. Error correction was done with Illumina Hi-C data using freebayes.

Public RNA-seq data were assembled into a transcriptome (BUSCO: C:97.7%[S:70.6%,D:27.1%],F:0.9%,M:1.4%) using PRJNA472074 (adult), PRJNA494427 (abdominal sternites of adult male), PRJNA512480 (adult antennae and mouthparts), PRJNA533834 (salivary glands) and PRJNA557118 (4 midgut compartments and carcass). The transcriptome was used in the Maker2 annotation pipeline with trained Augustus and Genemark gene predictors. PASA was used to update the gene models to add UTRs, correct existing models and add isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
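BUSCO scores on this page are given in the compact short-summary notation (C = complete, S = single-copy, D = duplicated, F = fragmented, M = missing). The following is a minimal sketch of how such a string could be parsed and sanity-checked; the function names and the rounding tolerance are assumptions made for illustration and are not part of BUSCO itself.

```python
import re

# Matches one "X:NN.N%" field of a BUSCO short-summary string.
_FIELD = re.compile(r"([CSDFM]):(\d+(?:\.\d+)?)%")


def parse_busco(summary: str) -> dict[str, float]:
    """Return the percentage for each category present in the string."""
    return {key: float(value) for key, value in _FIELD.findall(summary)}


def is_consistent(scores: dict[str, float], tol: float = 0.15) -> bool:
    """Check that C+F+M is about 100 and, if S and D are given, that S+D is about C.

    The tolerance allows for rounding to one decimal place.
    """
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    if "S" in scores and "D" in scores:
        return total_ok and abs(scores["S"] + scores["D"] - scores["C"]) <= tol
    return total_ok


if __name__ == "__main__":
    scores = parse_busco("C:97.7%[S:70.6%,D:27.1%],F:0.9%,M:1.4%")
    print(scores, is_consistent(scores))
```

The same helper accepts the shorter form without S and D values, such as the genome-level score listed under Final Results.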

A Pfam genomic track was created by translating the genome in all six reading frames and using hmmer to identify loci of interest on the genome, e.g. P450 Pfam domains. Using this information, loci of interest including UDP-glycosyltransferase (UGT), P450 and ABC gene models were found and curated using mapped RNA-seq.
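The six-frame conversion mentioned above can be illustrated with a short, dependency-free sketch. This is not the code used for this project: it assumes the standard genetic code, translates unknown codons to X, and omits the hmmer search step, in which the resulting peptide sequences would be searched against Pfam profiles. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq: str) -> dict[str, str]:
    """Return translations for frames +1..+3 and -1..-3, keyed by frame label."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, peptide in six_frame_translate("ATGGCCTAA").items():
        print(frame, peptide)
```

Keeping the frame label with each translation makes it possible to map a domain hit back to genomic coordinates and strand, which is what a genome browser track needs.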

Two endosymbionts were assembled and submitted: a Pantoea sp. (1,386,240 bp) and Yokenella regensburgei as three contigs (4,360,201 bp).

Final Results

A complete, annotated 7-chromosome assembly was deposited at NCBI under accession PRJEB47893 (incl. raw data).

BUSCO (insecta_odb10): C:97.0%,F:1.0%,M:2.0%

15,972 gene models - BUSCO C:91.2%[S:75.8%,D:15.4%],F:1.5%,M:7.3%

Scaffold No. (incl Mt): 85

N50: 181,513,190

N bases (bp): 264,947

Repeat: 49.30%

Total size (bp) (chr no.): 1,185,128,795 (7)

Curated: 127x P450, 87x ABC transporter, 49x UGT, and 0/130 defined IRAC gene models.

Two endosymbionts were assembled, a Pantoea sp. (1,386,240 bp), and Yokenella regensburgei (4,360,201 bp).
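For readers unfamiliar with the N50 statistic reported above, the sketch below shows one common way to compute it from a list of scaffold lengths: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly. The function name and the toy lengths are hypothetical and are not this assembly's scaffold lengths.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # hypothetical example values
    print(n50(toy_lengths))
```

In a chromosome-level assembly where a handful of scaffolds hold nearly all of the sequence, the N50 is expected to be on the order of a chromosome length.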

Other files

These are files that were not submitted to NCBI but might be useful.

Repeat annotation track

Repeat library

Non-coding RNA annotation track