Writing a Wikipedia Genetic Disease Article

This page authored by Jeff Bell, California State University, Chico

This material was originally developed through Merlot
as part of its collaboration with the SERC Pedagogic Service.

Summary

For this assignment, students use online genetic databases such as GenBank and OMIM to research a particular genetic disorder and then write an article about the disease for the online Wikipedia encyclopedia project. The assignment requires students to synthesize much of what they have learned in a typical genetics course. It also has a service component: students use their genetics knowledge to produce an informative article that adds to the body of free information available in Wikipedia, which may aid individuals who have or may develop a genetic disease, as well as their friends and relatives.


Learning Goals

This assignment requires students to synthesize knowledge from several areas of genetics, physiology and development, as they must describe a genetic disorder, the normal function of the gene, the inheritance of mutant alleles, the type of mutation, the location and structure of the gene, and the prognosis and possible treatments for the genetic disease. Students must also consider their audience and attempt to write an article appropriate for a general audience.

Context for Use

This assignment is designed to be the capstone for a course or unit on genetics. The complete assignment as described is designed as the final assignment for a biology majors' course in genetics. It will take several hours of student time and should probably include submission of a draft that receives comments before the final version is due. Students need an understanding of a wide range of genetics topics, along with some practice using the different genetic databases; both should be built up through simpler assignments before this one is assigned.

The service component can be required or left voluntary. Because it is important to preserve the quality of Wikipedia, I keep submission voluntary and encourage only students with a good article to actually submit it.

This assignment can be tailored for introductory biology or non-majors courses by reducing the information required in the report: allele frequency, the amount of information about gene function and structure, and so on can be eliminated or reduced to make the assignment easier. Giving the students a shorter list of genes to choose from, focused on genes with plenty of readily available information, can also make the assignment easier.

Description and Teaching Materials

A Genetic Disease Article for Wikipedia

There is a tremendous amount of important genetic information available online; in fact, some information is available only online. However, the GenBank and OMIM databases are very technical and too difficult for most people to use. A potential solution would be entries written for a general audience in the open source Wikipedia encyclopedia project. I would like you to put your genetics knowledge to some use by researching a human genetic disease using GenBank, OMIM and other specialized databases, and then writing a report for Wikipedia that summarizes the key information about the gene.

Your report should include the gene's function (what does the gene do?); whether the mutant alleles (usually disease-causing alleles) are dominant or recessive; the consequences of inheriting the mutant allele (what are the symptoms, how is the disease treated, what is the prognosis for an affected individual, etc.); the frequency of the mutant allele in the population; the gene's location on the chromosomes; the gene's size and structure (DNA size, mRNA size and protein size); the number of exons and introns; and the mutations responsible for the most common alleles, at both the DNA level and the protein level (point mutations, insertions, missense, frameshift, etc.).
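As a rough guide to the DNA-level terms in the last item, the Python sketch below classifies a change in a coding sequence by comparing the lengths of the normal and mutant sequences. The function name and its labels are hypothetical and chosen only for illustration. It assumes the change lies entirely within the coding region, and it does not tell you whether a single-base change is missense, nonsense or silent, since that would require translating the affected codon.

```python
def classify_coding_change(ref: str, alt: str) -> str:
    """Classify a coding-sequence change by comparing allele lengths.

    ref and alt are the normal and mutant DNA strings for the changed
    region (hypothetical inputs; database records describe mutations
    in a variety of formats).
    """
    if not ref and not alt:
        raise ValueError("at least one allele must be non-empty")
    if ref == alt:
        return "no change"
    delta = len(alt) - len(ref)
    if delta == 0:
        # Same length: bases were substituted, so the reading frame is kept.
        return "point mutation" if len(ref) == 1 else "substitution"
    kind = "insertion" if delta > 0 else "deletion"
    # Codons are three bases long, so only a length change that is not a
    # multiple of three shifts the reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


print(classify_coding_change("G", "A"))    # point mutation
print(classify_coding_change("CTT", ""))   # in-frame deletion
print(classify_coding_change("A", "AT"))   # frameshift insertion
```

The point of the sketch is the link between the two levels of description: the same DNA-level event (an insertion or deletion) can have very different protein-level consequences depending on whether it preserves the reading frame.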

An example of a Wikipedia entry that has some of this is the entry for Ataxia telangiectasia. Wikipedia has a list of genetic disorders (http://en.wikipedia.org/wiki/List_of_genetic_disorders) that it would like to have articles on. Pick one of the entries in red, which means there is no article yet, or one of the entries in blue if the current article is weak, which is the case for many of them. The report should include the URLs of the OMIM, GenBank and other database records, and any other sources that you used. If your final report is acceptable, I hope some of you will then help out the Wikipedia project by adding your contribution to the encyclopedia (please inform your instructor if you do add yours).

Start your search at the Online Mendelian Inheritance in Man (OMIM) site. Type your search term into the search field near the top of the screen, in the gray bar after "Search OMIM for". The term can be either a known genetic disease from the Wikipedia list of genetic diseases, or just the name of a protein or phenotype you are interested in. In the box "MIM Number Prefix:", select the checkbox next to the asterisk (*), which limits the results to entries with a known DNA sequence, and then click on "go" next to your search term(s). You should get a long list of database entries that include your search term. Click on the number on the left next to the result that looks most promising; it leads to a page something like http://www.ncbi.nlm.nih.gov:80/entrez/dispomim.cgi?id=602421 with summaries of all currently known research on this gene. Look for the DNA and protein links in the frame on the left, under LocusLink, which will tell you whether the gene has been cloned. Do not try to use genes that have not been cloned, as there is usually not enough known about them (that is, do not use a record that does not have a link to the protein sequence). Read through the record to get all of the information asked for above and summarize it in your report. You may need to follow the GenBank link to get some of the information.

Clicking on the Nucleotide link (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide) in the sidebar of the OMIM entry will take you to the GenBank database and run a search for you using your gene as the search term. This will produce a long list of database entries, much like your first search, except that this is a list of cloned and sequenced genes. Look through the list for the entry with the name of the human gene you are looking for. It should look something like this:

http://www.ncbi.nlm.nih.gov:80/entrez/viewer.fcgi?cmd=Retrieve&db=protein&list_uids=306538&dopt=GenPept&term=&qty=1
cystic fibrosis transmembrane conductance regulator [Homo sapiens]
gi|306538|gb|AAC13657.1|[306538]

Click on the identification number for the correct entry (for the one above, http://www.ncbi.nlm.nih.gov:80/entrez/viewer.fcgi?cmd=Retrieve&db=protein&list_uids=306538&dopt=GenPept&term=&qty=1). Don't forget to try the "graph" view in addition to the normal "Genbank" view, as some information is easier to get from the graphic view. You can get information about the protein by following a similar procedure with the "Protein" link in the sidebar of the OMIM entry.
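If you want to record identifiers like the one in the example entry consistently in your list of sources, a short script can split the identifier line into its parts. The Python sketch below is only an illustration: the function name is hypothetical, and the field layout (gi, GI number, database tag, accession.version) is inferred from the single example line shown above, so other records may not follow it.

```python
def parse_identifier_line(line: str) -> dict:
    """Split a pipe-delimited identifier line into named fields.

    Assumed layout, based only on the example entry:
    gi|<GI number>|<database tag>|<accession.version>|...
    """
    fields = line.strip().split("|")
    if len(fields) < 4 or fields[0] != "gi":
        raise ValueError(f"unrecognized identifier line: {line!r}")
    # The accession and its version number are separated by a period.
    accession, _, version = fields[3].partition(".")
    return {
        "gi": fields[1],
        "database": fields[2],
        "accession": accession,
        "version": version or None,
    }


record = parse_identifier_line("gi|306538|gb|AAC13657.1|[306538]")
print(record["accession"], record["version"])  # AAC13657 1
```

Listing the accession together with its version in your sources makes it clear exactly which revision of the record your report is based on.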

The descriptions in the databases sometimes get technical, with many medical terms, so the following medical dictionaries might be helpful: Medterms Medical Dictionary Index and Webster's Medical Dictionary. You can also search several different genetics texts for explanations of any unfamiliar genetics terms at the NCBI books database (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=books).

Useful sites

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM&cmd=Limits
This is a searchable database with information about most known human genes and genetic diseases. It includes phenotypes, inheritance (dominant, recessive, etc.), chromosomal location, different alleles, information about the gene (if it has been cloned), and a bibliography.
http://www.geneclinics.org/servlet/access?id=8888890&key=F2ynXKmT3hdIV&fcn=y&fw=6jtT&filename=/reviewsearch/searchdz.html
This is another searchable database about many human genetic diseases. It is not as complete or up-to-date as OMIM, but is easier to read and has much of the same information, if there is an entry for the gene you are interested in.

Other sites of interest

http://www.ornl.gov/hgmis/publicat/primer/toc.html
A good tutorial on subjects like Southern blots, RFLP mapping, DNA sequencing, etc.
http://www.wzw.tum.de/idw/genglosindex.html
A large glossary of genetic terms (this one can take a while to download)

Grading Rubric

The grading rubric for this assignment is available at http://www.csuchico.edu/~jbell/Labs/wikipediaR.html

Teaching Notes and Tips

The two biggest problems students have with this assignment are choosing a disorder that is not well characterized, which leaves them unable to complete many of the requirements because the information is not available, and difficulty with the medical terminology used in some of the databases (OMIM in particular). These problems are especially acute for students who put the assignment off until the last minute. Students must be warned that not all disorders are well studied or understood, and that if they pick one of those they need to start over and find one that is better characterized. I have included links to dictionaries to help with the medical terminology problem, but students need to use them and not just blindly copy what they find in the database entries. Requiring submission of a draft can help, especially with the terminology problem.

The other problem is students who try to copy and paste from the databases. This is usually pretty easy to spot as the database entries use lots of jargon and include references to the primary literature. I require them to list their sources so it is easy to determine whether a student is copying and pasting, but it doesn't hurt to let them know how easy it is to discover and why this would be inappropriate for both the class and for the Wikipedia.

Assessment

I convert the rubric included with the assignment into a WebCT quiz that students fill out as a self-assessment when they submit their report. Grading with the rubric is fairly easy because each component is either present or not, although some accommodation must be made for information that is not always available in the database records (allele frequency, for instance, is frequently not known for rare disorders).
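One way to handle the present-or-not grading together with the accommodation for unavailable information is to drop unavailable components from the total rather than count them as missing. The Python sketch below shows that idea only; the function name and the component names are hypothetical stand-ins drawn from the assignment description, not the actual rubric, and equal weighting of components is an assumption.

```python
from typing import Dict, Optional


def score_report(components: Dict[str, Optional[bool]]) -> float:
    """Return the fraction of gradable components that are present.

    True  = component is present in the report
    False = component is missing
    None  = information is not available in the database records,
            so the component is excluded from the total
    """
    gradable = [present for present in components.values() if present is not None]
    if not gradable:
        raise ValueError("no gradable components")
    return sum(gradable) / len(gradable)


example = {
    "gene function": True,
    "inheritance pattern": True,
    "allele frequency": None,  # not known for this rare disorder
    "chromosomal location": True,
    "exon and intron count": False,
}
print(f"{score_report(example):.0%}")  # 75%
```

With this approach a student who picks a rare disorder is not penalized for a gap in the databases, but is still penalized for omitting information that was available.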

References and Resources