Adult beetles were collected from oilseed rape on the Rothamsted farm in April 2019 (GPS 51.8093°, -0.3548°).
Next Generation Sequencing
i) Illumina Hi-C sequencing, 150 bp paired-end data: 709,616,444 reads and 182x coverage.
ii) PacBio HiFi data: mean read length 15,048 bp, total reads 1,046,296, read length N50 16,775 bp, and total bases 15,745,006,744. DNA was extracted using the MagAttract kit (3,000 ng gDNA).
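The HiFi figures above can be cross-checked with simple arithmetic: total bases divided by total reads should reproduce the reported mean read length. The sketch below does that check and also shows how a read length N50 is conventionally computed. The function names and the toy read lengths are illustrative assumptions, not part of the original pipeline.

```python
# Figures reported above for the PacBio HiFi data.
TOTAL_READS = 1_046_296
TOTAL_BASES = 15_745_006_744
REPORTED_MEAN = 15_048


def mean_read_length(total_bases: int, total_reads: int) -> float:
    """Mean read length in bp, from total bases and total reads."""
    return total_bases / total_reads


def n50(lengths: list[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("no read lengths given")


if __name__ == "__main__":
    mean = mean_read_length(TOTAL_BASES, TOTAL_READS)
    print(f"mean read length: {mean:.1f} bp (reported: {REPORTED_MEAN} bp)")
    # Toy read lengths (hypothetical), only to illustrate the N50 definition.
    print("toy N50:", n50([2, 3, 4, 5, 6]))
```

The reported N50 of 16,775 bp cannot be recomputed here because it requires the individual read lengths, which are not listed.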
A non-sexed single individual was used for PacBio HiFi sequencing (University of Delaware, USA) and multiple individuals for Hi-C Illumina sequencing (Arima Genomics, USA). Hifiasm was used to assemble the PacBio HiFi reads, and Juicer and 3d-dna were used with the Hi-C data for chromosome-level assembly. Haplotigs were removed with purge_haplotigs. Manual curation was done to bring the assembly together and check for misassemblies. Unmapped reads were mapped back to the original assembly to check for missing sequence, which was incorporated into the final assembly. Error correction was done with the Hi-C data using freebayes, applying only homozygous variants.
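The restriction to homozygous variants in the error-correction step can be illustrated with a small filter over VCF records. This is a minimal sketch assuming a single-sample VCF with a GT field, as freebayes writes by default; the exact filtering command used for this assembly is not stated, and the function names and example records below are hypothetical.

```python
def is_homozygous_alt(vcf_line: str) -> bool:
    """True if a single-sample VCF record has a homozygous non-reference genotype."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:  # need the FORMAT column plus one sample column
        return False
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    if "GT" not in format_keys:
        return False
    genotype = sample_values[format_keys.index("GT")]
    alleles = genotype.replace("|", "/").split("/")
    if len(alleles) < 2 or "." in alleles:
        return False
    # Homozygous: every allele identical; non-reference: allele index is not 0.
    return len(set(alleles)) == 1 and alleles[0] != "0"


def keep_homozygous(lines):
    """Yield header lines unchanged and only homozygous-alt records."""
    for line in lines:
        if line.startswith("#") or is_homozygous_alt(line):
            yield line


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    records = [
        "##fileformat=VCFv4.2",
        "chr1\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t1/1:30",
        "chr1\t200\t.\tC\tT\t50\t.\t.\tGT:DP\t0/1:28",
        "chr1\t300\t.\tG\tA\t50\t.\t.\tGT:DP\t0/0:25",
    ]
    for kept in keep_homozygous(records):
        print(kept)
```

Keeping only homozygous calls is a common way to avoid "correcting" genuine heterozygous sites, which matters here because the Hi-C reads came from multiple individuals.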
An RNA-seq transcriptome (BUSCO: C:98.8%[S:47.6%,D:51.2%],F:0.7%,M:0.5%), PRJNA223353 (adult), and PRJEB1765 (larval stages, as well as untreated and immunized beetles injected with different bacteria and yeast) were used in the Maker2 annotation pipeline with trained Augustus and GeneMark gene predictors. PASA was used to update the gene models by adding UTRs, correcting existing models, and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
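The BUSCO summary string quoted above follows the usual short notation: C (complete) is the sum of S (single-copy) and D (duplicated), and C, F (fragmented), and M (missing) together account for 100%. A minimal sketch for parsing the string and checking that internal consistency follows; the function names and the rounding tolerance are assumptions made for illustration.

```python
import re

# Matches the BUSCO short summary, e.g. C:98.8%[S:47.6%,D:51.2%],F:0.7%,M:0.5%
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%"
)


def parse_busco(summary: str) -> dict[str, float]:
    """Extract the five BUSCO percentages from a short summary string."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary: {summary!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


def is_consistent(scores: dict[str, float], tol: float = 0.11) -> bool:
    """Check S + D == C and C + F + M == 100, allowing for one-decimal rounding."""
    return (
        abs(scores["S"] + scores["D"] - scores["C"]) <= tol
        and abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    )


if __name__ == "__main__":
    scores = parse_busco("C:98.8%[S:47.6%,D:51.2%],F:0.7%,M:0.5%")
    print(scores)
    print("consistent:", is_consistent(scores))
```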
A Pfam genomic track was created by translating the genome in six reading frames and using hmmer to identify loci of interest, e.g. P450 Pfam domains, on the genome. Using this information, loci of interest including UDP, P450, and ABC gene models were found and curated using mapped RNA-seq and the Maker gene annotation.
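The six-frame conversion that precedes the hmmer search can be sketched as follows. This is only an illustration of the idea, assuming the standard genetic code; the tool actually used for the translation is not stated, and the function names and example sequence are hypothetical. In practice the translated frames would be written to FASTA and searched against Pfam profiles with a HMMER program such as hmmsearch.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict[str, str]:
    """Return translations for frames +1..+3 and -1..-3."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    # Short made-up sequence, for illustration only.
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Hits found in a translated frame then need their protein coordinates converted back to genomic coordinates (accounting for strand and frame offset) before they can be displayed as a genome track.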