Single individual of unknown sex, taken from a mixed sample collected together with Phyllotreta cruciferae in Canada in September 2020.
Next Generation Sequencing
i) Illumina Hi-C sequencing, 150 bp paired-end data:
852,080,958 reads and 478x coverage.
ii) PacBio HiFi data: 241,153 reads totaling 2,473,243,844 bases, with a mean read length of 10,255 bp and a read length N50 of 11,043 bp. Summarized in the figures below. DNA was extracted using the MagAttract kit at the University of Delaware (150 ng gDNA).
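The read statistics above (total reads, total bases, mean read length, read length N50) can be reproduced from a list of read lengths. The following is a minimal sketch, not the tool used to generate the reported numbers; the function name read_length_stats and the toy read lengths are illustrative assumptions.

```python
from typing import Dict, Iterable


def read_length_stats(lengths: Iterable[int]) -> Dict[str, float]:
    """Summarize read lengths: read count, total bases, mean length and N50.

    N50 is the length L such that reads of length >= L together contain
    at least half of all sequenced bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one read length is required")
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running * 2 >= total:  # reached half of the total bases
            n50 = length
            break
    return {
        "reads": len(ordered),
        "total_bases": total,
        "mean_length": total / len(ordered),
        "n50": n50,
    }


# Toy read lengths (hypothetical, for illustration only).
toy_stats = read_length_stats([2, 3, 4, 5, 6])

# Cross-check of the reported totals: total bases / total reads.
reported_mean = 2_473_243_844 / 241_153  # about 10,255.9 bp
```

The cross-check on the last line gives a mean of roughly 10,255.9 bp, which agrees with the reported mean read length of 10,255 if that figure was truncated rather than rounded.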
A single non-sexed individual was used for PacBio HiFi sequencing (University of Delaware, USA) and multiple individuals for Hi-C Illumina sequencing (Arima Genomics, USA). Hifiasm was used to assemble the PacBio HiFi reads, and Juicer and 3D-DNA were used with the Hi-C data to produce a chromosome-level assembly. Haplotigs were removed with purge_haplotigs. The assembly was then manually curated to join scaffolds and check for misassemblies. Unmapped reads were mapped back to the original assembly to check for missing sequence, which was incorporated into the final assembly. Error correction was done with the Hi-C data using FreeBayes.
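Hifiasm writes its contigs in GFA format, so a conversion to FASTA is typically needed before downstream steps such as haplotig purging and Hi-C scaffolding. The sketch below shows one way to do that conversion in plain Python; it is an illustration only, not the command used for this assembly, and the function names and example records are assumptions.

```python
from typing import Iterable, Iterator, Tuple


def gfa_segments(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) for every GFA segment ('S') line with a sequence."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # S lines are: S <name> <sequence> [optional tags]; '*' means no sequence.
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            yield fields[1], fields[2]


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (name, sequence) pairs as FASTA text wrapped at `width` columns."""
    out = []
    for name, seq in records:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n" if out else ""


# Hypothetical GFA content; real files come from the assembler output.
example_gfa = [
    "H\tVN:Z:1.0\n",
    "S\tptg000001l\tACGTACGT\tLN:i:8\n",
    "S\tptg000002l\tGGCC\n",
]
example_fasta = to_fasta(gfa_segments(example_gfa), width=4)
```

For a real file, the same functions can be applied to an open file handle, for example `to_fasta(gfa_segments(open(path)))`, where `path` points to the primary contig GFA.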
RNA-seq data were taken from PRJNA299458 (male and female antenna and terminal abdomen) and PRJNA679396 (no metadata available) and assembled into a transcriptome (BUSCO: C:98.2%[S:96.2%,D:2.0%],F:0.1%,M:1.7%). This transcriptome was used in the MAKER2 annotation pipeline together with trained AUGUSTUS and GeneMark gene predictors. PASA was used to update the gene models by adding UTRs, correcting existing models and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
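The BUSCO summary string for the transcriptome can be parsed and sanity-checked programmatically: single-copy plus duplicated should equal complete, and complete plus fragmented plus missing should equal 100%. The following is a minimal sketch; the function names and the rounding tolerance are assumptions, not part of BUSCO itself.

```python
import re
from typing import Dict

# Matches the one-line BUSCO summary notation, e.g. C:98.2%[S:96.2%,D:2.0%],F:0.1%,M:1.7%
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%"
)


def parse_busco(summary: str) -> Dict[str, float]:
    """Return the C, S, D, F and M percentages from a BUSCO summary string."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary: {summary!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


def is_consistent(scores: Dict[str, float], tol: float = 0.15) -> bool:
    """Check S + D == C and C + F + M == 100, allowing for rounding (assumed tol)."""
    return (
        abs(scores["S"] + scores["D"] - scores["C"]) <= tol
        and abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    )


busco_scores = parse_busco("C:98.2%[S:96.2%,D:2.0%],F:0.1%,M:1.7%")
busco_ok = is_consistent(busco_scores)
```

For the transcriptome reported here, 96.2% + 2.0% = 98.2% and 98.2% + 0.1% + 1.7% = 100.0%, so the summary is internally consistent.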
A Pfam genomic track was created by translating the genome in all six reading frames and using HMMER to identify loci of interest, such as P450 Pfam domains, on the genome. Using this information, gene models at loci of interest, including UDP, P450, ABC and IRAC genes, were found and curated using mapped RNA-seq data and the MAKER gene annotation.
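The six-frame translation step can be illustrated with a short pure-Python sketch: the sequence and its reverse complement are each translated at offsets 0, 1 and 2, and the resulting protein strings are what a profile search with HMMER would then scan. This is an assumption about the general approach, not the script used for this genome; the standard genetic code is assumed and all function names are illustrative.

```python
from itertools import product
from typing import Dict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; ambiguous codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> Dict[str, str]:
    """Translate both strands at offsets 0, 1 and 2; keys are +1..+3 and -1..-3."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Toy sequence (hypothetical) to show the six frames.
toy_frames = six_frame_translation("ATGGCC")
```

Each translated frame would be written out with its strand and offset so that any domain hit can be mapped back to genomic coordinates when building the track.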
An endosymbiont Wolbachia genome (1,546,488 bp) was also assembled.