Bioinformatics: An Interactive Introduction to NCBI

Created by Seth Bordenstein, Marine Biological Laboratory

"Understanding nature's mute but elegant language of living cells is the quest of modern molecular biology. From an alphabet of only four letters representing the chemical subunits of DNA, emerges a syntax of life processes whose most complex expression is man....The challenge is in finding new approaches to deal with the volume and complexity of data, and in providing researchers with better access to analysis and computing tools in order to advance understanding of our genetic legacy and its role in health and disease."

-From the National Center for Biotechnology Information (NCBI)

  • Module 1: To show the ways in which the NCBI online database classifies and organizes information on DNA sequences, evolutionary relationships, and scientific publications.
  • Module 2: To identify an unknown nucleotide sequence from an insect endosymbiont by using the NCBI search tool BLAST.

Teaching Time

45 minutes


This exercise consists of two interrelated modules designed to introduce the student to modern biological techniques in the area of bioinformatics. Bioinformatics is the application of computer technology to the management of biological information. The need for bioinformatics has arisen from the recent explosion of publicly available genomic information, such as that resulting from the Human Genome Project. To help manage this information, the National Center for Biotechnology Information (NCBI) was established in 1988 as a national resource for molecular biology information. The NCBI creates public-access databases, develops software tools for analyzing genome data, and disseminates biomedical information - all for the better understanding of molecular processes affecting human health and disease. The NCBI is a virtual goldmine, both of available resources and of treasures yet to be discovered. We will investigate the GenBank DNA sequence database, which organizes millions of nucleotide sequence records.


Download this exercise as a Word document (Microsoft Word, 65 kB, Jul 26 07).

A number of online educational resources are devoted to learning bioinformatics. For details that summarize what we will cover in this exercise and more, see:

Significance & Supplies Needed

By completing this project, you will be exposed to the tools and databases currently used by researchers in molecular and evolutionary biology, and you will gain a better understanding of gene analysis, taxonomy, and evolution. No computer programming skills are necessary to complete these modules, but prior experience with personal computers and the Internet is assumed. The main program that you will need is an Internet browser, such as Netscape Navigator or Internet Explorer.

Begin the Exercise

Module 1: Sequence Taxonomy: This module will introduce you to the number and diversity of nucleotide sequences in the NCBI database.

Module 2: Sequence Searching and BLAST: This module will teach you how to use the NCBI search tool BLAST to retrieve genetic sequence data from the NCBI database and identify a particular Wolbachia sequence.