Bioinformatics: An Interactive Introduction to NCBI

Created by Seth Bordenstein, Marine Biological Laboratory


Module 2: Sequence Searching and BLAST


Objective


Nasonia vitripennis female using her ovipositor to lay her eggs inside the host fly pupa

The goal of this module is to retrieve genetic sequence data from the NCBI database that identifies the 'Wolbachia Sequence' you generated. The Basic Local Alignment Search Tool (BLAST) is an essential tool for comparing a DNA or protein sequence to other sequences in various organisms. Two of the most common uses are to a) determine the identity of a particular sequence and b) identify closely related organisms that also contain this particular DNA sequence.


A Slide Show Introduction


Begin with the simple, easy-to-follow "BLAST for Beginners" slide show at http://digitalworldbiology.com/tutorial/blast-for-beginners. Let the slide show guide your learning by clicking the bright green arrow to move through the pages.


Using BLAST to Identify a Fake Sequence and Your Wolbachia Sequence


Go to the NCBI homepage and select BLAST. With your new knowledge of sequence searching and BLAST, start with a sequence you make up, then search with your Wolbachia sequence.

  • Select nucleotide BLAST under the basic BLAST category.
  • Type your own made-up nucleotides (A, T, G, C), enough to fill one complete line, into the Search box. This is referred to as the query sequence.
  • VERY IMPORTANT - Click on the circle for 'Others (nr etc.)' under Choose Search Set.
  • Select BLAST! at the end of the page. A new window appears.
  • Wait for the results page to automatically launch. The wait time depends on the type of search you are doing and how many other researchers are using the NCBI website at the same time you are!
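
If you would rather not type the made-up query by hand, a short Python sketch like the one below can generate one to paste into the Search box. The function name `make_fake_query`, the length of 60 letters (standing in for "one complete line"), and the seed value are illustrative assumptions, not part of the NCBI website.

```python
import random


def make_fake_query(length=60, seed=None):
    """Return a random string of A, T, G and C of the requested length."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    return "".join(rng.choice("ATGC") for _ in range(length))


# Assumption: 60 letters is roughly one line in the Search box.
fake_query = make_fake_query(60, seed=1)
print(fake_query)
```

Copy the printed letters into the Search box exactly as you would a hand-typed sequence.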

  1. Did your fake sequence produce a significant hit? (Probably not, since a significant hit usually has an E-value below E-10, that is, 1 × 10^-10.) If yes, how many?
  2. How many sequences did it search in the database?
  3. How many nucleotide letters did it search in the database?
    • Select Home at the top of the BLAST page
    • Select nucleotide BLAST under the Basic BLAST category
    • Enter the Wolbachia sequence below into the Search box. (If you generated your own Wolbachia sequences in the lab, you could BLAST your own sequence at this point. For this exercise, everyone will BLAST the same sequence, provided below.)
    • Your Wolbachia Sequence: GTTGCAGCAATGGTAGACTCAACGGTAGCAATAACTGCAGGACCTAGAGGAAAAACAGTAGGGATT AATAAGCCCTATGGAGCACCAGAAATTACAAAAGATGGTTATAAGGTGATGAAGGGTATCAAGCCT GAAAAACCATTAAACGCTGCGATAGCAAGCATCTTTGCACAGAGTTGTTCTCAATGTAACGATAAA GTTGGTGATGGTACAACAACGTGCTCAATACTAACTAGCAACATGATAATGGAAGCTTCAAAATCA ATTGCTGCTGGAAACGATCGTGTTGGTATTAAAAACGGAATACAGAAGGCAAAAGATGTAATATTA AAGGAAATTGCGTCAATGTCTCGTACAATTTCTCTAGAGAAAATAGACGAAGTGGCACAAGTTGCA ATAATCTCTGCAAATGGTGATAAGGATATAGGTAACAGTATCGCTGATTCCGTGAAAAAAGTTGGA AAAGAGGGTGTAATAACTGTTGAAGAGAGTAAAGGTTCAAAAGAGTTAGAAGTTGAGCTGACTACT GGCATGCAATTTGATCGCGGTTATCTCTCTCCGTATTTTATTACAAATAATGAAAAAATGATCGTG GAGCTTGATAATCCTTATCTATTAATTACAGAGAAAAAATTAAATATTATTCAACCTTTACTTCCT ATTCTTGAAGCTATTGTTAAATCTGGTAAACCTTTGGTTATTATTGCAGAGGATATCGAAGGTGAA GCATTAAGCACTTTAGTTATCAATAAATTGCGTGGTGGTTTAAAAGTTGCTGCAGTAAAAGCTCCA GGTTTTGGTGACAGAAGAAAGGAGATGCTCGAAGACATAGCAACTTTAACTGGTGCTAAGTACGTC ATAAAAGATGAACTT
    • Select BLAST! A new window appears.
    • Select Format! and wait for the results page to appear.
  4. How long (query length) is the Wolbachia sequence that you used to search the database?
  5. What are the E-value and bit score of the best hit (in this case, the first matching sequence)?
  6. What is the most likely identity of this sequence? (Click on the blue link to the left of the top hit.) What is the title of the scientific publication that reported this sequence? (Click on the PUBMED 16267140 link.)
    • Go back twice when you're done.
    • Select Home at the top of the BLAST page.
    • Select nucleotide BLAST under the Basic BLAST category.
    • Now enter only the first 135 base pairs of your Wolbachia sequence, shown below, into the Search box.

    • Your Wolbachia Sequence (first 135 base pairs): GTTGCAGCAATGGTAGACTCAACGGTAGCAATAACTGCAGGACCTAGAGGAAAAACAGTAGGGATT AATAAGCCCTATGGAGCACCAGAAATTACAAAAGATGGTTATAAGGTGATGAAGGGTATCAAGCCTGAA
  7. What do you observe about the E-values? What are the E-value and bit score of the best hit (the first matching sequence)?
  8. Is the identity of the best hit different from when you used the complete nucleotide sequence? Is it the same gene as identified before?
  9. From the two BLAST searches, what can you deduce about how the length of a query sequence affects your confidence in the sequence search?
    • Close all web windows. This exercise is now complete. You have used one of the state-of-the-art tools relied on by most molecular and evolutionary biology researchers today. The NCBI website holds a great deal of information. Feel free to explore it; more tutorials are available at: http://www.ncbi.nlm.nih.gov/guide/training-tutorials/
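
The two searches above can also be explored offline. The Python sketch below is a minimal illustration, not part of BLAST itself: it strips the spaces from a pasted sequence, takes the first 135 letters, and evaluates the standard relationship between bit score and E-value, E = m × n × 2^(-S'), where m is the query length, n is the database length, and S' is the bit score. The helper names, the database size of 10^11 letters, and both bit scores are hypothetical values chosen only for illustration; use the numbers from your own results pages instead.

```python
# First three lines of the Wolbachia sequence from the exercise, pasted with
# the spaces that separate the lines on this page.
PASTED = (
    "GTTGCAGCAATGGTAGACTCAACGGTAGCAATAACTGCAGGACCTAGAGGAAAAACAGTAGGGATT "
    "AATAAGCCCTATGGAGCACCAGAAATTACAAAAGATGGTTATAAGGTGATGAAGGGTATCAAGCCT "
    "GAAAAACCATTAAACGCTGCGATAGCAAGCATCTTTGCACAGAGTTGTTCTCAATGTAACGATAAA"
)


def clean_sequence(raw):
    """Remove all whitespace and upper-case the letters."""
    return "".join(raw.split()).upper()


def expect_value(bit_score, query_len, db_len):
    """E = m * n * 2**(-S'), with m = query length and n = database length."""
    return query_len * db_len * 2.0 ** (-bit_score)


query = clean_sequence(PASTED)
short_query = query[:135]  # the first 135 base pairs
print(len(short_query), short_query[-9:])

DB_LETTERS = 10**11  # hypothetical database size, in nucleotide letters

# Hypothetical bit scores: a longer matching query can earn a higher score.
for label, length, bits in [
    ("135-letter query", len(short_query), 250.0),
    ("198-letter query", len(query), 360.0),
]:
    print(label, expect_value(bits, length, DB_LETTERS))
```

Because the bit score sits in the exponent, a longer matching query, which can earn a higher bit score, lowers the E-value far more than its extra length raises it.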

More Information About Bioinformatics in the Classroom




Resources for Educators: This collection includes online resources and activities designed to incorporate bioinformatics into the K-12 and undergraduate classrooms.