Whole Genome Data

This section is awaiting the arrival of genome sequence data from Cofactor Genomics.

Transcriptome sequencing gives biologists a quick but incomplete inventory of the protein-coding genes, because it captures only the genes that happen to be expressed under the restricted set of conditions sampled. A "whole genome" sequence, on the other hand, is in principle a complete inventory of the protein-coding genes, as well as non-coding elements such as promoters, tRNAs, small RNAs, and genomic repeats. The challenge with whole genome sequences lies in 1) the assembly (correctly piecing together the genome from millions or billions of small reads) and 2) the annotation (for example, correctly predicting the location and structure of the genes).

For Aiptasia, the first questions to address might be practical ones about how well the sequencing effort has covered the genome:

the depth at which the genome has been sequenced (the more reads that we have for each small region of the genome, the easier it is to reconstruct the entire genome sequence)
the extent of small DNA repeats that might complicate the assembly
the number of genes that might be represented by the genome sequence data
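
The first two questions can be roughed out with a few lines of Python. This is a minimal sketch, not the analysis planned for this page. The function names `expected_coverage` and `repeated_kmer_fraction` are made up for illustration, and the read count, read length, and genome size are placeholder values, not Aiptasia figures. Depth is estimated as total sequenced bases divided by genome size. The repeat measure is deliberately crude: the fraction of distinct k-mers that occur more than once in a single sequence such as an assembled contig.

```python
from collections import Counter


def expected_coverage(num_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def repeated_kmer_fraction(sequence, k):
    """Fraction of distinct k-mers that occur more than once in one sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    if not counts:
        return 0.0  # sequence shorter than k
    repeated = sum(1 for c in counts.values() if c > 1)
    return repeated / len(counts)


# Placeholder numbers for illustration only (not actual Aiptasia data).
depth = expected_coverage(num_reads=100_000_000, read_length=100,
                          genome_size=250_000_000)
print(f"Expected depth: {depth:.1f}x")  # prints "Expected depth: 40.0x"

# Toy sequence: the 4-mer ACGT occurs twice, the other five 4-mers once.
print(repeated_kmer_fraction("ACGTACGTTT", k=4))
```

Applied to raw reads rather than an assembled sequence, k-mer counts rise with sequencing depth as well as with repeats, so the usual approach there is to look at the full k-mer count distribution rather than a single fraction.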

CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes

Data
Dataset of core genes
