Single adult beetle (with parasitoid wasp) from a field sample collected at Rothamsted farm on 28 July 2017. GPS: 51.8093°, -0.3548°.
Next Generation Sequencing
i) P. chrysocephala Illumina Hi-C sequencing, 150 bp paired-end:
901,426,372 reads and 114x coverage.
M. brassicae Illumina Hi-C sequencing, 150 bp paired-end:
367,204,982 reads and 399x coverage.
ii) Unknown sex: Illumina 10X sequencing, 150 bp paired-end, totalling 668,730,898 reads and 85x coverage. Individual male: 150 bp paired-end, totalling 264,428,764 reads and 33x coverage. Individual female: 150 bp paired-end, totalling 230,696,660 reads and 29x coverage.
iii) PacBio HiFi data: mean read length 14,428 bp, total reads 2,074,955, read length N50 15,228 bp, and total bases 29,937,625,920. DNA was extracted using a Circulomics kit and quantified using a FEMTO Pulse (3000 ng gDNA).
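The read counts and coverage figures listed above can be cross-checked with simple arithmetic. The sketch below assumes that coverage was computed as reads × read length / genome size and that each count refers to individual 150 bp reads; neither assumption is stated in the source, and the function names are illustrative only.

```python
# Sketch: implied genome size from read count, read length and coverage.
# ASSUMPTION (not stated in the source): coverage = reads * read_length / genome_size,
# and the read counts are individual 150 bp reads rather than read pairs.

READ_LENGTH_BP = 150

# (label, number of reads, reported coverage), copied from the list above
LIBRARIES = [
    ("P. chrysocephala Hi-C", 901_426_372, 114),
    ("M. brassicae Hi-C", 367_204_982, 399),
    ("10X, unknown sex", 668_730_898, 85),
    ("Individual male", 264_428_764, 33),
    ("Individual female", 230_696_660, 29),
]


def implied_genome_size_bp(reads, coverage, read_length=READ_LENGTH_BP):
    """Genome size (bp) implied by a read count and a reported coverage."""
    return reads * read_length / coverage


def mean_read_length(total_bases, total_reads):
    """Mean read length (bp) from total bases and total reads."""
    return total_bases / total_reads


if __name__ == "__main__":
    for label, reads, coverage in LIBRARIES:
        size_gb = implied_genome_size_bp(reads, coverage) / 1e9
        print(f"{label}: ~{size_gb:.2f} Gb implied genome size")
    hifi_mean = mean_read_length(29_937_625_920, 2_074_955)
    print(f"PacBio HiFi mean read length: {hifi_mean:.0f} bp")
```

Under these assumptions, the P. chrysocephala Hi-C library implies a genome of roughly 1.19 Gb and the M. brassicae Hi-C library roughly 0.14 Gb. Because the reported coverage values are rounded, these figures are approximate.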
A non-sexed single CSFB individual (with parasitoid wasp, Microctonus brassicae) was used for PacBio HiFi sequencing (University of Delaware, USA); multi-individual Hi-C Illumina sequencing (Arima Genomics, USA) was performed for both species. Hifiasm was used to assemble the PacBio HiFi reads, followed by Juicer and then 3d-dna with the Hi-C data to produce chromosome-level assemblies for both species. Haplotigs were removed (purge_haplotigs). Manual curation was done to bring the genomes together and check for misassemblies. Unmapped reads were mapped back to the original assembly to check for missing sequence, and any missing sequence was incorporated into the final assembly. Error correction was done using freebayes, with Illumina 10X library data for P. chrysocephala and Hi-C data for M. brassicae.
PGI RNA-seq from adult CSFB was assembled into a transcriptome (BUSCO: C:96.7%[S:95.2%,D:1.5%],F:0.7%,M:2.6%) and used in the Maker2 annotation pipeline with trained Augustus and GeneMark gene predictors. PASA was used to update the gene models by adding UTRs, correcting existing models and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
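The BUSCO summary string quoted above follows the usual short-summary notation (C = complete, S = single-copy, D = duplicated, F = fragmented, M = missing). A minimal sketch for parsing such a string and checking that the percentages are internally consistent is given below; the helper names and the tolerance are assumptions made for illustration and are not part of BUSCO itself.

```python
import re

# Pattern for a BUSCO short-summary string such as
# "C:96.7%[S:95.2%,D:1.5%],F:0.7%,M:2.6%"
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%"
)


def parse_busco(summary):
    """Return the five BUSCO percentages as a dict of floats."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary string: {summary!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


def is_consistent(scores, tol=0.11):
    """Check S + D == C and C + F + M == 100, allowing for rounding.

    The tolerance (hypothetical choice) absorbs one-decimal rounding.
    """
    complete_ok = abs(scores["S"] + scores["D"] - scores["C"]) <= tol
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    return complete_ok and total_ok


if __name__ == "__main__":
    scores = parse_busco("C:96.7%[S:95.2%,D:1.5%],F:0.7%,M:2.6%")
    print(scores, "consistent:", is_consistent(scores))
```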
A Pfam genomic track was created by translating the genome in all six reading frames and using HMMER to identify loci of interest, e.g. P450 Pfam domains, on the genome. Using this information, loci of interest including UDP, P450, ABC and IRAC gene models were found and curated using mapped RNA-seq and the Maker gene annotation.
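The six-frame translation step can be illustrated with a short, dependency-free sketch. It uses the standard genetic code and hypothetical function names; the source does not say which tool performed the translation, so this is not the pipeline actually used. In practice the translated frames would then be searched against Pfam profile HMMs with HMMER.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations for frames +1..+3 and -1..-3 keyed by frame name."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons are kept as `*` so that each frame maps back to genome coordinates by simple multiplication, which is what a genome browser track needs.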
A P. chrysocephala Wolbachia endosymbiont (1,438,854 bp) was assembled.