Bioinformatic Resources
Bioinformatic tools are computer programs that analyze one or more sequences. There is a dizzying array of bioinformatic tools: some analyze sequences to find protein domains (Pfam), some search through databases of millions of sequences to find ones that are similar (BLAST), and some find potential protein-coding regions (ORF-Finder). Many are freely available over the web. It can be overwhelming to find and use bioinformatic tools because you need to know 1) what type of analysis you want performed, 2) what type of tool to use, and 3) where to find the tool.
For this class, we have made a collection of common databases and tools that should be useful for your research. But you should also feel free to use Google to try and find other databases and tools. If you want information on a particular species, you can search for databases that contain DNA sequences for that species (try Aiptasia EST database). If you want to identify possible functional domains contained within a set of sequences, you can search for annotation tools.
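To make the idea of finding "potential protein-coding regions" more concrete, here is a minimal Python sketch of an open reading frame (ORF) search. It is only an illustration and not how ORF-Finder itself is implemented: it assumes the standard stop codons, scans only the forward strand, and treats an ORF as the stretch from an ATG to the first in-frame stop codon. The function name `find_orfs` and the `min_codons` parameter are made up for this example.

```python
# Illustrative sketch only; not the ORF-Finder algorithm.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code


def find_orfs(seq, min_codons=3):
    """Return a list of (start, end, frame) tuples for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon.
    Coordinates are 0-based and end-exclusive, and the stop codon is included.
    min_codons counts the codons before the stop codon (ATG included).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATTTGGGTAACC"  # made-up sequence
    for start, end, frame in find_orfs(example):
        print(frame, start, end, example[start:end])
```

A real tool would also scan the reverse complement and let you choose a different genetic code, so treat this as a way to see the logic rather than as a replacement for the web tools.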
Bioinformatics Training
To get some training on how to use some of these tools, try working through this set of exercises (Microsoft Word 94kB Apr4 13).
Sequence Databases (with some annotation tools):
Transcriptome datasets
Aiptasia Sequences
SymBioSys is a database of transcriptome sequences for corals and Symbiodinium hosted by the Medina Lab.
Coral Transcriptome Datasets are available through the Matz Lab website.
Whole Genome Data
JGI's Nematostella Genome Browser is a database of the genome sequence and annotations for Nematostella vectensis, a non-symbiotic sea anemone (Putnam et al., Science 2007).
Acropora digitifera Genome Browser is a web interface with access to whole-genome sequences for the first coral to have its genome published (Shinzato et al., Nature 2011).
JGI's Monosiga brevicollis Genome Browser is a database of the genome sequence and annotations for a choanoflagellate. Choanoflagellates are among the closest unicellular relatives of animals and provide important insights into the evolution of animals.
NCBI
NCBI houses sets of databases of sequences for everything under the sun, including a site for designing primers.
Annotation tools:
Pfam is a database of evolutionarily conserved protein families, and annotations about the functions of those families.
KAAS (KEGG Automatic Annotation Server) provides functional annotation of genes by BLAST comparisons against the manually curated KEGG GENES database and mapping them onto known biochemical/metabolic pathways.
Alignment, Phylogeny, and Evolutionary Analysis Tools:
BLAST : BLAST (Basic Local Alignment Search Tool) searches through large databases of sequences to identify any that are similar to a query sequence. It combines an alignment function with a search function. Almost any sequence database that you visit (including AiptasiaBase) will offer BLAST as a way to search for sequences in that database.
Puzzled? Not sure how to get started doing BLAST searches? Click here for help.
Muscle : Muscle will align multiple DNA or protein sequences. It is available from many different sites; in this case we are providing a link to EMBL's server.
MEGA is a program for constructing alignments and phylogenetic trees. Important: this software does NOT run from the web. You must download it onto a PC and run it locally. The SciVis computers have this software installed on them.
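If you are curious what "local alignment" means in a BLAST search, the short Python sketch below scores the best-matching stretch shared by two sequences. It uses the classic Smith-Waterman dynamic programming method, not BLAST's own algorithm; BLAST relies on faster heuristics so that it can search very large databases. The function name `local_alignment_score` and the scoring values (match +2, mismatch -1, gap -2) are arbitrary choices for this illustration.

```python
# Illustrative sketch only: Smith-Waterman scoring, not the BLAST heuristic.
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only one row of the scoring
    table at a time, so it reports the score but not the alignment itself.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for ca in a:
        curr = [0]  # the first column is always 0 in a local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Made-up sequences that share the short stretch "ACG"
    print(local_alignment_score("TTTACGTTTT", "GGACGGG"))
```

A higher score means the two sequences share a longer or better-matching region. Real BLAST output adds statistics such as E-values on top of a score like this, which this sketch does not attempt.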