De novo Genome Assembly
Please contact us to discuss how we can help you achieve your project goals or to request a quote for your project.
HPCBio has extensive experience performing de novo genome assembly, that is, reconstructing a complete genome sequence without using a reference genome. We have assembled genomes from a wide range of organisms (bacteria, fungi, plants, and animals) using a variety of software tools, methods, and sequencing technologies. We also have experience with assembly improvement (e.g., scaffolding, gap filling) and with haplotype phasing of assemblies when additional data (Hi-C sequencing libraries, trio data) are available.
Sequencing technologies
- PacBio Revio: Long reads with high accuracy. This is the technology of choice for assembly of eukaryotic and bacterial genomes.
- Illumina NovaSeq X: Used for scaffolding existing assemblies by sequencing Hi-C and/or TELL-Seq libraries.
- Oxford Nanopore: Reads of 5 kb to 30 kb, or longer if desired. Used for assembly of microbial and fungal genomes and for scaffolding of eukaryotic genomes, especially in problematic regions.
Library types
- Whole-genome shotgun: from PCR-free to ultralow-input libraries
- Hi-C libraries: chromosome conformation capture libraries that identify long-range, chromosome-level contacts; essential for chromosome-length scaffolding and for phasing
- Linked reads: TELL-Seq libraries are Illumina-based libraries used for genome assembly. Long DNA molecules are barcoded, shotgun libraries are prepared from them, and the long molecules are then reconstructed computationally using those barcodes. These libraries can be used to estimate genome size and complexity with GenomeScope2 and to scaffold PacBio assemblies, and they are potentially useful for phasing.
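The barcode-based regrouping described for linked reads can be illustrated with a minimal Python sketch. It only shows the core idea of collecting short reads that share a barcode; it is not the TELL-Seq pipeline. The function name, the barcode labels, and the toy reads are all hypothetical, and real data may need extra handling (for example, one barcode can tag more than one molecule).

```python
from collections import defaultdict


def group_reads_by_barcode(reads):
    """Collect read sequences that share the same barcode.

    `reads` is an iterable of (barcode, sequence) pairs. Reads carrying the
    same barcode are assumed here to come from the same long DNA molecule.
    """
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)
    return dict(groups)


# Hypothetical toy reads: (barcode, sequence). Not real data.
reads = [
    ("BC01", "ACGTAC"),
    ("BC02", "TTGACA"),
    ("BC01", "GTACCA"),
    ("BC03", "CCGATT"),
    ("BC02", "ACAGGT"),
]

groups = group_reads_by_barcode(reads)
for barcode in sorted(groups):
    print(barcode, len(groups[barcode]), "reads")
```

Each group can then be treated as evidence that its reads lie close together on the genome, which is what makes linked reads useful for scaffolding and, potentially, phasing.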
Assembly software
- hifiasm – PacBio, ONT, and Hi-C data
- Verkko – PacBio, ONT, and Hi-C data
- Canu and HiCanu – noisier long-read data
- Juicer and Juicebox tools (Hi-C)
- Pretext and PretextView (Hi-C)
- HapHiC (polyploid phasing)
- TELL-Seq utilities
- Quality control tools: GenomeScope2, BUSCO, BlobToolKit, Smudgeplot
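As background for k-mer-based quality control, the basic idea behind estimating genome size from a k-mer histogram can be sketched in a few lines of Python. This is a deliberately simplified illustration, not GenomeScope2's method: GenomeScope2 fits a statistical model to the histogram, whereas this sketch simply divides the total number of k-mers by the coverage at the main peak. The function name, the `min_coverage` cutoff, and the toy histogram are assumptions made for the example.

```python
def estimate_genome_size(histogram, min_coverage=5):
    """Rough genome size estimate from a k-mer histogram.

    `histogram` maps coverage -> number of distinct k-mers seen at that
    coverage. K-mers below `min_coverage` are treated as sequencing errors
    and ignored (an assumed, illustrative cutoff).
    """
    usable = {cov: n for cov, n in histogram.items() if cov >= min_coverage}
    if not usable:
        raise ValueError("no k-mers at or above min_coverage")
    # Coverage with the most distinct k-mers approximates sequencing depth.
    peak_coverage = max(usable, key=usable.get)
    # Total k-mer occurrences divided by depth approximates genome length.
    total_kmers = sum(cov * n for cov, n in usable.items())
    return total_kmers / peak_coverage


# Hypothetical toy histogram: a low-coverage error tail plus one peak at 20x.
toy_histogram = {1: 50000, 2: 8000, 18: 900, 19: 1500, 20: 2000, 21: 1500, 22: 900}

estimate = estimate_genome_size(toy_histogram)
print(f"Estimated genome size: {estimate:.0f} bp")
```

The single-peak assumption only holds for simple haploid or highly homozygous genomes; heterozygous, repetitive, or polyploid genomes produce multiple peaks, which is why model-based tools are used in practice.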