Insecticide-susceptible; collected in Egypt in February 1963 from the Syngenta Kaha field station.
Next Generation Sequencing
i) Illumina 10X sequencing: 150 bp paired-end data, totalling 309,759,182 reads at 106x coverage.
ii) PacBio CLR data: mean read length 13,517 bp, 8,754,751 total reads, read length N50 16,730 bp, and 112,623,278,172 total bases. DNA was extracted using DNAzol at Rothamsted Research (4000 ng gDNA) and BluePippin-purified before library preparation.
iii) Nanopore cDNA sequencing of RNA: flow cell FLO-PRO002; cDNA synthesis and sequencing kit DCS109 (Direct-cDNA kit) plus EXP-NBD104 (Barcoding kit); barcodes NB07 to NB12; basecalling with Guppy 5.0.17 using the config file dna_r9.4.1_450bps_hac_prom.cfg and the option trim_barcodes="on".
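As a rough cross-check of the figures in items i) and ii), the read count, read length and coverage imply a genome size, which in turn implies a PacBio CLR depth. The sketch below is illustrative arithmetic only: it assumes that 309,759,182 counts individual 150 bp reads, that no bases were lost to trimming, and that coverage was computed as total bases divided by genome size. The function names are hypothetical.

```python
# Figures quoted in items i) and ii).
ILLUMINA_READS = 309_759_182
ILLUMINA_READ_LENGTH_BP = 150   # assumption: every read is full length
ILLUMINA_COVERAGE = 106
PACBIO_TOTAL_BASES = 112_623_278_172


def implied_genome_size(reads: int, read_length_bp: int, coverage: float) -> float:
    """Genome size (bp) implied by coverage = reads * read_length / genome_size."""
    return reads * read_length_bp / coverage


def depth(total_bases: int, genome_size_bp: float) -> float:
    """Mean sequencing depth for a given number of sequenced bases."""
    return total_bases / genome_size_bp


genome_size = implied_genome_size(
    ILLUMINA_READS, ILLUMINA_READ_LENGTH_BP, ILLUMINA_COVERAGE
)
pacbio_depth = depth(PACBIO_TOTAL_BASES, genome_size)

print(f"Implied genome size: {genome_size / 1e6:.0f} Mb")
print(f"Implied PacBio CLR depth: {pacbio_depth:.0f}x")
```

If the read count instead refers to read pairs, the implied genome size doubles and the implied PacBio depth halves, so the result is only as good as that assumption.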
A single unsexed adult was used for both PacBio CLR (University of Maryland, USA) and 10X Genomics Illumina sequencing. Falcon was used to assemble the PacBio CLR reads, and haplotigs were removed with Redundans. Manual curation was used to bring the assembly together and check for misassemblies. Errors were corrected with freebayes using the Illumina 10X library data.
A Pfam genomic track was created by translating the genome in six reading frames and using HMMER to identify domains of interest, e.g. P450 Pfam domains, on the genome. Using this information, loci of interest, including UDP, P450, ABC and IRAC gene models, were found and curated using mapped RNA-seq data and a Maker gene annotation. PASA was used to update the gene models by adding UTRs, correcting existing models and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
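The six-frame translation step can be illustrated with a minimal sketch. This is not the authors' script: it assumes the standard genetic code, translates unknown codons to "X", and uses hypothetical function names. In a workflow like the one described, the resulting protein sequences would be written to FASTA and searched against Pfam profile HMMs with HMMER.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq: str) -> dict[str, str]:
    """Translate a DNA sequence in frames +1..+3 and -1..-3."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translate("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons are kept as "*" so that hits can be mapped back to genomic coordinates frame by frame; splitting on stops before the search is an equally reasonable design choice.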
RNA-seq data from PRJEB17082 (Starvation effect of Fourth instar larvae: Antennae and maxillary palps), PRJNA312160 (carcass, antennae, brain, proboscis) and PRJNA336490 (gut, fat body and Malpighian tubules over time and with insecticide) were assembled into a transcriptome with a BUSCO of C:99.7%[S:51.6%,D:48.1%],F:0.1%,M:0.2% and used in the Maker2 annotation pipeline with trained Augustus and GeneMark gene predictors. RNA for Nanopore cDNA sequencing was taken from six life stages: egg, L1 larvae, L3 larvae, L5 larvae, pupae and adult. Direct RNA and cDNA Nanopore data were used to create an additional transcriptome from these six life stages, with a BUSCO of C:91.3%[S:51.1%,D:40.2%],F:4.0%,M:4.7%, but this transcriptome was not used to update the annotation.
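The BUSCO strings above use the compact one-line notation (C complete, S single-copy, D duplicated, F fragmented, M missing). A small sketch for parsing that notation and checking its internal consistency is shown here; the function names and the rounding tolerance are illustrative assumptions, not part of BUSCO itself.

```python
import re

# Matches the compact notation, e.g. "C:99.7%[S:51.6%,D:48.1%],F:0.1%,M:0.2%".
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%"
)


def parse_busco(summary: str) -> dict[str, float]:
    """Extract the five percentages from a one-line BUSCO summary."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"not a BUSCO summary string: {summary!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


def is_consistent(scores: dict[str, float], tolerance: float = 0.15) -> bool:
    """Check that S + D equals C and C + F + M equals 100, within rounding."""
    return (
        abs(scores["S"] + scores["D"] - scores["C"]) <= tolerance
        and abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tolerance
    )


if __name__ == "__main__":
    for text in (
        "C:99.7%[S:51.6%,D:48.1%],F:0.1%,M:0.2%",
        "C:91.3%[S:51.1%,D:40.2%],F:4.0%,M:4.7%",
    ):
        scores = parse_busco(text)
        print(scores, is_consistent(scores))
```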