Molecular evolution of gene families

Cara M. Constance, Hiram College


In this laboratory exercise, students use computational methods to study a gene family, learning how to identify conserved regions and to observe the sequence changes that have accumulated at the DNA, mRNA and amino acid levels. Students investigate how many members of the gene family are found in diverse species by searching available databases and tools such as ECR browser, Homologene and BLAST, and create a phylogenetic tree. This is a guided exercise that accompanies a wet lab project in which students amplify specific gene family members by RT-PCR to investigate in which tissues different family members are expressed. Students come to appreciate the functional significance of sequence conservation in diverse species, and the advantages of gene duplication events.

Learning Goals



  • gene families arising from duplication
  • orthologous vs. paralogous sequences
  • how the genomes of some organisms have undergone more extensive gene duplication events
  • relative conservation of sequence at the genomic DNA, mRNA, and amino acid levels, and how this conservation relates directly to the function of the protein product and the functional domains of the protein


  • proficiency in utilizing bioinformatics tools to perform homology searches (BLAST) from genomic databases, EST databases and GenBank
  • ability to compare sequence information using sequence alignments and tree-building programs

Wet lab

  • Laboratory techniques to distinguish expression of one gene family member: RNA extraction, primer design, RT-PCR, and DNA and RNA gel electrophoresis
  • Data analysis and experimental preparation and design
  • Critical reading of primary scientific literature, and relating what they are learning in lab to the literature
  • Oral communication of scientific results ("lab meeting" presentation)
  • Scientific writing skills (journal style lab report)

Context for Use

This activity is part of an Advanced Molecular Biology course (5-12 students) that is taught in an intensive 3 week session (meeting 4 days per week, 6 hours a day) and focuses on molecular genetics. The students in this course have completed upper-level biology coursework (Molecular and Cellular Biology and Genetics), and have already learned bioinformatics in the context of analyzing prokaryotic genomes in these prerequisite courses. They are familiar with Genbank and use of BLAST to perform homology searches at the start of the Advanced Molecular Biology course.

In my course, this lab was part of a series that incorporated both bioinformatics and molecular biology techniques. I use the casein kinase I (CKI) gene family and the period 1 and 2 genes (Period proteins are known substrates of two of the casein kinases). Our model organism for the lab is adult Xenopus tropicalis. See the description below for a more detailed account.

Other appropriate courses:

The computational exercise alone would also be appropriate for an Evolution course, or any course where genomics tools would be introduced (e.g. an intermediate-level Molecular and Cellular Biology or Genetics course, Developmental Biology or Neurobiology). This exercise is used to illustrate concepts such as gene families, gene and transcript structure, and functional conservation of sequence, and to distinguish orthologous from paralogous genes. The computational exercise could be completed in two lab sessions (1-2 hours per session would be sufficient time).

If the wet lab activity is included, the gene family could be chosen to reflect the course content. For example, the Hox gene cluster could be chosen in a Developmental Biology course; the Kv or Task potassium channel family could be used as an example in a Neurobiology course.

This lab is adaptable to any organism where different tissues can be readily dissected. This would entail an approved IACUC protocol for any vertebrate species.

Description and Teaching Materials

In Advanced Molecular Biology (Biol 415), the three-week intensive schedule allows integration of lecture, wet lab, and computational lab in each class meeting (see Advanced Mol Biol course syllabus (Microsoft Word 43kB Jun20 09)). The students consider these questions: 1) Are all CKI family members expressed in the same tissues of an adult animal? 2) How many members of the casein kinase I gene family are present in diverse species? 3) Are Period proteins, substrates of CKI family members, conserved across evolution to the same extent?

Students are assigned a CKI gene family member, and obtain the X. tropicalis cDNA sequences using GenBank. They perform a sequence alignment using ClustalW (using MegAlign, DNASTAR, INC.) and use this alignment to identify sequence divergence between the gene family members. This information is used to design primers (using Primer Select, DNASTAR, INC.). This can be completed in one lab period.
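The idea of using an alignment to find divergent regions for primer design can be illustrated with a minimal Python sketch. It is not what Primer Select does internally; it only scans an existing alignment for primer-length windows in which one family member differs from every other member, and estimates a rough melting temperature with the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C, suitable only for short oligos). The sequences, names, and window width below are hypothetical toy values, not real CKI sequences.

```python
def window_mismatches(target, others, width):
    """Score every ungapped window of the aligned target sequence.

    For each window start, the score is the minimum number of mismatches
    between the target and any other aligned sequence in that window,
    so a high score means the window differs from ALL other members.
    """
    scores = []
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        if "-" in window:  # skip windows that contain alignment gaps
            continue
        per_seq = [
            sum(a != b for a, b in zip(window, other[start:start + width]))
            for other in others
        ]
        scores.append((start, min(per_seq)))
    return scores


def most_divergent_window(target, others, width):
    """Return (start, score) of the best window, or None if none is ungapped."""
    scores = window_mismatches(target, others, width)
    if not scores:
        return None
    return max(scores, key=lambda item: item[1])  # earliest window wins ties


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical, already-aligned toy sequences of equal length.
    aligned = {
        "member_a": "ATGGCGAGTAGCAGCGGCTCCAAG",
        "member_b": "ATGGCGAGCAGCAGTGGATCAAAG",
        "member_c": "ATGGCTAGTAGCTCCGGCTCGAAG",
    }
    target = aligned["member_a"]
    others = [seq for name, seq in aligned.items() if name != "member_a"]
    best = most_divergent_window(target, others, width=10)
    if best is not None:
        start, score = best
        candidate = target[start:start + 10]
        print(start, score, candidate, wallace_tm(candidate))
```

A real primer would be longer (typically around 18-25 nt) and would also need checks for hairpins, primer dimers, and matched melting temperatures between the forward and reverse primers, which dedicated primer design software handles.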

Reading assignments: The articles include a review of the CKI multi-gene family, and several primary literature articles that discuss the function of CKIs in the circadian clock mechanism. These articles provide the necessary background about the genes we focus on, and introduce current questions about the evolutionary conservation and functional redundancy of CKI family members. The articles also introduce students to critical analysis of primary scientific literature, and to techniques that they employ in the wet lab. The background for the computational lab was conveyed through a lecture derived from the text Genomes 3 (see the attached lecture and lecture notes under the guided discovery sections below), and additional relevant information is given in the lab handout. The students are charged with doing independent primary literature searches in order to learn about their assigned CKI gene family member. This information is used in their "lab meeting" oral report and in their lab paper.

Guided Discovery-wet-lab: To initiate the wet-lab project, each student retrieves the sequence for their assigned gene and designs primers as described above. Each student then extracts RNA from X. tropicalis adult liver tissue using two different methods (Trizol and a Qiagen kit); this takes one lab period. In the next lab period, they test for the presence and integrity of the RNA by performing RNA gel electrophoresis (MOPS/formaldehyde gel) on the liver samples extracted using Trizol and the kit. The students interpret their data, and decide which method they will use to extract RNA from additional tissues (heart, skeletal muscle, brain, eye, lung, spleen, testes/ovaries; one additional lab period for extraction). Students then perform reverse transcription (with a minus-RT control) using a mixture of RNA from all tissues, and conduct PCR to determine whether the primers they designed work, running their PCR reactions on a TAE gel (two lab periods). The students complete the lab series by performing RT-PCR on individual tissues to ask the question "which tissues express each CKI family member?" (two lab periods). Based upon their data, we discuss whether functional redundancy is a possibility, and how that could be determined experimentally by knocking out one of the gene family members. In total, the wet lab, if conducted in a similar progression, requires 7 lab periods in addition to the first computational lab period to design the PCR primers.

Student learning and reflection: The students are given a lot of independence in conducting this lab series. They research and write their own procedures for the Trizol method of RNA extraction, make their own solutions for the gels, make their own decisions on how to proceed, and design their own experiments. Prior to each lab, I give a short lecture on how the method works, and why each component is present in the reaction. After each lab, we talk as a group about their individual results and their interpretation of those results. They are expected to use primary literature and other resources to reconcile their results with what might be expected based upon already published information.

Lab report description (Microsoft Word 28kB Jun20 09)

Guided discovery – computer lab: This lab is initiated by a lecture on molecular phylogenetics, tree reconstruction using neighbor-joining and maximum parsimony, and bootstrap analysis to assess the reliability of the tree. Using the accession numbers procured in the initial lab, the students progress through the computational lab exercise, answering embedded questions about gene duplication, orthologs and paralogs (using ClustalW), and conservation of sequence at the genomic DNA level (using ECR browser), the mRNA level and the amino acid level (using ClustalW). The students then procure sequences of CKI orthologs from 10 diverse organisms (using Homologene and BLAST) and construct a phylogenetic tree with amino acid sequences (using MEGA 4.0). Their final task is to independently find orthologs of Period 1 and 2 (substrates of CKI) and create a phylogenetic tree with these amino acid sequences. Their goal is to compare the sequence conservation across evolution of a multi-gene family with diverse functions in the cell to that of a gene family that has a more limited cellular role. This lab alone takes 1-2 lab periods; the second period would largely be devoted to allowing students to complete the homework to reinforce the use of the tools learned. The embedded task and independent project comprised their final quiz in the course.

Bioinformatics lab (Microsoft Word 113kB Jun27 09)
Molecular phylogenetics lecture notes (Microsoft Word 44kB Jun20 09)
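Two of the ideas from the phylogenetics lecture, pairwise distances as the input to neighbor-joining and bootstrap resampling of alignment columns, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (the uncorrected p-distance, gaps ignored pairwise) and is not how MEGA is implemented; the function names and toy peptide sequences are hypothetical.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are ignored.
    """
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Pairwise p-distances for a dict of name -> aligned sequence."""
    names = list(alignment)
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x in names
        for y in names
    }


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement.

    The same column indices are used for every sequence, so each
    resampled column is still a set of homologous sites.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    # Hypothetical toy protein alignment, not real CKI sequences.
    toy = {
        "species_1": "MELRVGNKYRLGRK",
        "species_2": "MELRVGNRYRLGRK",
        "species_3": "MDLKVGG-FRLGKK",
    }
    print(distance_matrix(toy)[("species_1", "species_3")])
    rng = random.Random(0)  # fixed seed so the replicate is reproducible
    print(bootstrap_replicate(toy, rng))
```

In a bootstrap analysis, a tree is rebuilt from each of many such replicates, and the support for a branch is the fraction of replicate trees in which that grouping reappears.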

Teaching Notes and Tips

Student audience

In this course it is absolutely essential that the students are motivated and come to class prepared. It is important that each student does the assigned reading and homework assignments, since the course is predominately driven by student-led discussion. Thus, doing both the wet lab and computational lab would require a small class size and an elective course where students who sign up are interested in the topic.

Course format

The format of my course is such that we are meeting each day for almost 6 hours, allowing for the students to focus solely on one course and to progress at a rate that allows them to continuously build their knowledge. In a course that meets several times a week for lecture, and once a week for lab, it would be important to reinforce in discussion what had been completed previously and relate it to what would be done next that day in lab. It would also be important for the lab and lecture to be integrated, if the entire lab series is to be done. The computational lab could stand alone to illustrate principles of molecular phylogenetics; in this case it would not be as important to closely integrate all aspects of the course.

Success of wet lab experiments

The molecular biology experiments worked well, because the techniques used were very familiar to me. I would suggest using an organism and a gene family that the instructor has an interest in. If the Trizol method is used to extract RNA, and a formaldehyde gel is used for the RNA gel, proper waste disposal must be set up in the lab. All methods used are readily found on the web or included in the manufacturer's instructions with the reagents ordered. Half of the students had PCR products. This was ideal, as we were able to discuss troubleshooting PCR and to use the literature to find that we would not have expected products for some of the genes in adult tissues. All of the students in the class did all of the techniques, allowing hands-on experience for all and a high likelihood that a large proportion of the experiments would succeed.

Success of bioinformatics lab

The computer exercise makes use of DNASTAR, INC. software; it is important to plan ahead in order to get the appropriate signatures for the license, and to allow for installation on the computers the students will use in the course. It is possible to design primers and run ClustalW using free software available on the web; I like to use this package because I am familiar with it, and like some of the features available. It cannot be used to generate the .fas files needed for MEGA, so one disadvantage is that the sequence files used in the first part of the computational lab exercise cannot be reused for the tree construction. It is important for the instructor to work through the software before the class, since the DNASTAR software is updated frequently, as are the web pages that offer primer design and sequence alignment software.

I conducted the computational lab last in my course. This was beneficial in that the students had spent a lot of time learning about the CKI gene family, and thus could relate the bioinformatic analysis to existing knowledge. One problem I encountered was that some of the students had trouble completing the final task, and did not seek my help since it was their final assignment in the course. The result was that a small fraction of the class did not demonstrate to me that they could use this software independently. I would encourage devoting a lab period to the final task, so that the instructor is available for questions as the students navigate through the software independently.


Student understanding of the concepts and techniques is assessed by:

Bioinformatics lab

Wet lab

Post-lab survey

References and Resources


Molecular Evolutionary Genetics Analysis (MEGA) (available for Windows, DOS, MAC, Linux)
- Can download software for creation and analysis of phylogenetic trees

DNASTAR INC. Educational Software (available for Windows or Mac)
1228 South Park St., Madison, WI 53715, USA
Tel 608-258-7420 Fax 608-258-7439
- This software is available free of charge for use for the duration of the course strictly for teaching purposes. In order to obtain the software package, which includes primer design and alignment programs, it is necessary to contact DNASTAR prior to the start of the course, sign a license agreement, and install the software on the computers to be used in the course.
- This site provided Xenopus genomic sequence, and UniGene expression data

Brown TA. Genomes 3. Garland Science.

Assigned reading (in chronological order)

Knippschild U, et al. 2005. The casein kinase I family: participation in multiple cellular processes in eukaryotes. Cellular Signalling 17: 675-689.
- This review is an overview of the different casein kinase I family members, their structure, substrates and what is currently known about their functions.

Ko CH, Takahashi JS. 2006. Molecular components of the mammalian circadian clock. Human Molecular Genetics 15: R271-277.
- This article was used to introduce the role of CKIδ and ε in the circadian clock mechanism.

Toh KL, et al. 2001. An hPer2 Phosphorylation Site Mutation in Familial Advanced Sleep Phase Syndrome. Science 291: 1040-1043.
- This article introduces the functional consequence of a sequence change in a substrate of CKI, and how this leads to a sleep disorder.

Fan JY, et al. 2009. Drosophila and Vertebrate Casein Kinase Iδ Exhibits Evolutionary Conservation of Circadian Function. Genetics 181: 139-152.
- This article is an example of functional conservation of a casein kinase I family member in diverse species.
