Sphaerophoria rueppellii


Species: Sphaerophoria rueppellii (European hoverfly)

Order: Diptera

Suborder: Cyclorrhapha

Family: Syrphidae

Genus: Sphaerophoria

Pest control relies mainly on synthetic pesticides, many of which are non-selective and therefore toxic both to their target pest species and to beneficial predators and parasitoids. Native to Europe and Mediterranean countries, Sphaerophoria rueppellii is an effective biological control agent against crop pests. Syrphid adults typically feed on nectar and pollen; however, the larvae of roughly one-third of syrphid species feed on crop pests such as aphids, thrips, and coleopteran and lepidopteran larvae. Predatory Syrphidae can consume up to around 500 aphids during their larval stage, a higher daily feeding rate than that of other aphid predators. For example, S. rueppellii reduced aphid (Myzus persicae) populations by 84% in a field experiment.

Specialised adaptations in adult female Syrphidae allow them to detect aphid pheromones, which increases their efficacy as biological control agents. Adult females often lay their eggs close to aphid colonies to ensure a plentiful food supply for emerging larvae. Syrphid adults also avoid laying their eggs close to parasitised aphids, which reduces intraguild predation between parasitoids and hoverflies and allows the two to be safely combined in integrated pest management (IPM) strategies. Such strategies can give more effective pest control than a single beneficial predator species, especially when several pest species must be controlled. Overall, it is unsurprising that Syrphidae are considered among the most important aphid predators and a key tool for biological control. Alongside pest control, adult hoverflies play a key role in pollination and are considered the second most important pollinators after bees of the family Apidae.

Source: Bailey, E., Field, L., Rawlings, C. et al. A near-chromosome level genome assembly of the European hoverfly, Sphaerophoria rueppellii (Diptera: Syrphidae), provides comparative insights into insecticide resistance-related gene family evolution. BMC Genomics 23, 198 (2022). https://doi.org/10.1186/s12864-022-08436-5.

Sample collection

S. rueppellii larvae were obtained from ‘biopestgroup.com’. CO2 was used for anaesthesia to allow the insects to be sorted from the substrate. The larvae were then flash frozen with liquid N2 and stored at -80°C. The whole process was completed within 48 hours of arrival.

Next Generation Sequencing

i) Illumina genomic sequencing 150 bp paired end data: 417,662,063 reads with a total length of 125.3 Gb.

ii) PacBio CLR data: 6,748,327 reads with a total length of 83.2 Gbp (277x) and a polymerase read length N50 of 63,285 bp.

iii) Illumina RNA sequencing 150 bp paired end data: 123,298,454 reads.

iv) Illumina genomic Hi-C sequencing 150 bp paired end data: 21.6 Gb.
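As a quick sanity check on the Illumina genomic figures, the sketch below multiplies read count by read length. It assumes every read is the full 150 bp (no trimming), and the function name and `paired` flag are illustrative only. Under that assumption, the stated 125.3 Gb total is reproduced when the 417,662,063 figure is counted as read pairs; this is an inference from the arithmetic, not something the source states.

```python
def total_bases(n_reads: int, read_length_bp: int, paired: bool = False) -> int:
    """Total sequenced bases; if paired is True, n_reads is treated as a count of read pairs."""
    mates = 2 if paired else 1
    return n_reads * read_length_bp * mates


# Figures taken from the Illumina genomic entry in the list above.
ILLUMINA_GENOMIC_READS = 417_662_063
READ_LENGTH_BP = 150  # assumes untrimmed, full-length reads

as_single = total_bases(ILLUMINA_GENOMIC_READS, READ_LENGTH_BP)
as_pairs = total_bases(ILLUMINA_GENOMIC_READS, READ_LENGTH_BP, paired=True)

print(f"counted as single reads: {as_single / 1e9:.1f} Gb")
print(f"counted as read pairs:   {as_pairs / 1e9:.1f} Gb")
```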


Several assemblers were trialled to generate the assembly (including Canu, DBG2OLC and wtdbg2); however, many struggled to produce a good-quality assembly, perhaps because of the high repeat content and heterozygosity of the genome. Flye and Platanus-Allee produced the best assemblies. Flye had the best assembly statistics in terms of scaffold N50 (100,207 bp, with 18 scaffolds >1 million bp) and BUSCO completeness score (99.2%). However, duplication was very high (48.3%) for this assembly, even after subsetting the longest reads to 150x coverage (duplication was 63.8% before subsetting). The total number of scaffolds was 50,164. Platanus-Allee had a lower scaffold N50 (42,845 bp, with no scaffolds >1 million bp) and a slightly lower BUSCO completeness score (97.6%), but duplication was much lower (3.6%). The total number of scaffolds was 67,142.
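For readers unfamiliar with the scaffold N50 statistic used to compare these assemblies, the following is a minimal sketch of the usual definition: the length of the scaffold at which the cumulative length of scaffolds, sorted longest first, reaches half of the total assembly length. The function name and the toy lengths are hypothetical and are not taken from any of the assemblies described here.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted longest first; the N50 is the length of the
    scaffold at which the running total reaches half the assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy example (hypothetical lengths, total = 30):
# running totals are 10, then 18, and 18 is the first to reach half of 30.
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, a highly duplicated assembly can still score well on it, which is why duplication and BUSCO completeness are reported alongside it.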

To retain the high contiguity of the Flye assembly while reducing its high duplication percentage, the Flye and Platanus-Allee assemblies were merged using QuickMerge. Some manual curation was also performed to bring back falsely removed contigs. This resulted in an assembly with a slightly lower completeness score of 96.5%; however, duplication was reduced to 15.5% while most of the long scaffolds produced by Flye were preserved. The assembly had a scaffold N50 of 67,653 bp and a total of 59,284 scaffolds, 16 of which were >1 million bp.

A subsequent round of Purge Haplotigs brought the duplication score down to 4.6% whilst still maintaining a completeness of 95.6%. Scaffold N50 increased to 126,450 bp and the total number of scaffolds was reduced to 15,009.

This draft assembly was then scaffolded with the Hi-C data using the 3D-DNA de novo genome assembly pipeline. This increased the scaffold N50 to 87,361,475 bp, with 5 scaffolds >10 million bp. The total number of scaffolds was reduced to 11,549, with 6 chromosome-level scaffolds, numbered by sequence length. There is currently no karyotypic information for S. rueppellii to confirm the correct number of chromosomes; however, this number matches a cytogenetic analysis of Eristalis tenax, which had 6 chromosomes. The BUSCO completeness score fell to 94.6%, but a round of Pilon error polishing brought it back up to 96.4% (subsequent rounds of Pilon worsened the BUSCO score). A final run of Purge Haplotigs reduced duplication from 4% to 3%.


Number of scaffolds: 8,476

BUSCO: C:96.0%[S:93.0%,D:3.0%], F:1.2%,M:2.8%

Total size of scaffolds: 537,631,316 bp

Longest scaffold: 125,413,692 bp

N50 scaffold length: 87,097,991 bp

Number of N’s: 56,988,920

Genes: 14,249

Gene BUSCO: 87.3%
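The summary figures above can be cross-checked with a little arithmetic. The sketch below parses the BUSCO string (C = complete, S = single-copy, D = duplicated, F = fragmented, M = missing), confirms that the categories add up, and expresses the number of N's as a fraction of the total scaffold size. The helper name `parse_busco` and the rounding tolerance are assumptions made for illustration; this is not part of any BUSCO tooling.

```python
import re


def parse_busco(summary: str) -> dict:
    """Extract the percentages from a BUSCO summary string, keyed by category letter."""
    pairs = re.findall(r"([CSDFM]):(\d+(?:\.\d+)?)%", summary)
    return {key: float(value) for key, value in pairs}


busco = parse_busco("C:96.0%[S:93.0%,D:3.0%], F:1.2%,M:2.8%")

# Complete = single-copy + duplicated; complete + fragmented + missing = 100%.
# A small tolerance allows for rounding in the reported percentages.
complete_ok = abs(busco["S"] + busco["D"] - busco["C"]) < 0.05
total_ok = abs(busco["C"] + busco["F"] + busco["M"] - 100.0) < 0.05

# Share of the assembly made up of N's (gap placeholders).
TOTAL_SCAFFOLD_BP = 537_631_316
N_COUNT = 56_988_920
gap_fraction = N_COUNT / TOTAL_SCAFFOLD_BP

print(busco)
print(f"BUSCO categories consistent: {complete_ok and total_ok}")
print(f"N's as share of assembly: {gap_fraction:.1%}")
```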


PhD project output focusing upon predators of agricultural pest species.