Design2Data

Ashley Vater, University of California, Davis
Laura Briggs. Department of Biology | Truckee Meadows Community College | 7000 Dandini Blvd | Reno, NV
Jason Labonte. Department of Chemistry | Franklin & Marshall College | 637 College Ave, Lancaster, PA. Department of Biochemical and Biomolecular Engineering | Johns Hopkins University | 3400 N Charles St | Baltimore, MD
Irina Makarevitch. Department of Biology | Hamline University | 1536 Hewitt Ave, Saint Paul, MN
Jaime Mayoral. Department of Biological Sciences | Florida International University | 12000 SW 8th St, Miami, FL
Janelle Nunez-Castilla. Department of Biological Sciences | Florida International University | 12000 SW 8th St, Miami, FL
Sharif Rumjahn. Department of Biology | Truckee Meadows Community College | 7000 Dandini Blvd | Reno, NV
Justin Siegel. Genome Center | University of California, Davis | One Shields Ave, Davis, CA. Department of Biochemistry and Molecular Medicine | University of California, Davis | One Shields Ave, Davis, CA. Department of Chemistry | University of California, Davis | One Shields Ave, Davis, CA.

Abstract

The Design2Data (D2D) program centers on an undergraduate-friendly protocol workflow that follows the design-build-test-learn engineering framework. This protocol has served as the scaffold for a successful undergraduate training program and has been further developed into courses that range from a 10-week freshman seminar to a year-long, upper-division molecular biology course. The overarching research goal of this CURE is to probe the current predictive limitations of protein-modeling software by functionally characterizing single amino acid mutants in a robust model system. The most interesting outcomes of this project depend on large datasets, so the project is well suited to multi-institutional collaborations.

Student Goals

  1. Articulate the project's relevance to the scientific community and describe the process of research
  2. Practice wet-lab biochemistry skills, collaboration, iteration, and data analysis
  3. Gain confidence in their capability to do research and identify as researchers

Research Goals

  1. Expand functional datasets for libraries of mutant enzymes that could be used by protein-engineering communities to design new molecular-modeling algorithms with more robust predictive power.

Context

D2D is a multi-institutional, networked CURE that aims to make joining and participating readily accessible. The curriculum can be tailored to fit a spectrum of courses from lower-division, introductory courses such as Intro to Cell & Molecular Biology to upper-division Biochemistry. The research sessions form a workflow that can be completed in as little as 10 weeks or expanded out to comprise a year-long project. The sessions are grouped into Design, Build, and Test modules and can run as standalone entities, each with a defined research milestone.

Target Audience: Introductory, Major, Non-major, Upper Division
CURE Duration: A full term, Multiple terms

CURE Design

Obtaining student-generated data has been the focus of the workflow design. The underlying scientific goal of developing the D2D-CURE network is to facilitate academic crowd-sourcing exercises to rapidly address questions that would normally take isolated labs decades to answer.

Faculty are able to integrate the lab activities to align with their course's specific learning goals.
Through the research tasks that comprise the D2D workflow, students engage with skills and topics that are common components of cellular and molecular biology courses and biochemistry courses. Students use Foldit (fold.it) to investigate structural variants of β-glucosidase B (BglB) and to design oligos from a codon-optimized nucleic acid sequence; these oligos are used to produce a mutant BglB gene coding for the desired structural variant. Students produce pET29 plasmids containing their mutant BglB gene through Kunkel mutagenesis and sequence verification protocols. Students induce protein expression from their constructs in an E. coli system, purify the mutant proteins with immobilized metal affinity chromatography, and use colorimetric spectrophotometry assays to characterize the catalytic efficiency and thermal stability of their novel enzymes. Students apply Michaelis–Menten enzyme kinetics to investigate their hypotheses and enter data into an early-stage web portal that will guide future designs.
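As an illustration of the kinetics step, the sketch below estimates Vmax and Km from initial-rate data using the Hanes–Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) and ordinary least squares. This is a minimal teaching sketch and not the D2D portal's analysis: the function names, units, and the noise-free example data are all hypothetical, and a nonlinear fit is generally preferable for real, noisy measurements.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s (Michaelis-Menten model)."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (vmax, km) by least squares on the Hanes-Woolf form:
    s / v = s / vmax + km / vmax  (slope = 1/vmax, intercept = km/vmax)."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def catalytic_efficiency(vmax, km, enzyme_conc):
    """kcat / Km, where kcat = vmax / [E]; units follow the inputs."""
    return (vmax / enzyme_conc) / km


# Hypothetical, noise-free example data (not real D2D measurements).
substrate_mM = [0.5, 1.0, 2.5, 5.0, 10.0, 25.0]
rates = [michaelis_menten(s, vmax=100.0, km=5.0) for s in substrate_mM]

vmax_fit, km_fit = fit_hanes_woolf(substrate_mM, rates)
efficiency = catalytic_efficiency(vmax_fit, km_fit, enzyme_conc=0.1)
```

Because the example rates are generated from the model itself, the fit recovers the input parameters; with measured data, the linearization weights points unevenly, which is one reason to treat it only as a first look.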

The primary stakeholder for the research generated by this CURE is the computational protein-modeling community, which will use the data to benchmark and improve functionally predictive enzyme design algorithms. The students' data are collected in an open-access database with built-in Michaelis–Menten enzyme kinetics analysis and thermal stability curve-fitting functions to help students assess their results.
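To give a sense of what a thermal stability curve fit can look like, the sketch below estimates T50 (the temperature at which half of the activity remains) under an assumed two-parameter logistic model, f(T) = 1 / (1 + exp(k (T - T50))). The model choice, the function names, and the example values are assumptions for illustration only; the source does not specify which model the database uses. The fit linearizes the model with a logit transform, ln(1/f - 1) = k (T - T50), and applies ordinary least squares.

```python
import math


def fraction_active(temp_c, t50, k):
    """Assumed logistic model: fraction of activity remaining at temp_c."""
    return 1.0 / (1.0 + math.exp(k * (temp_c - t50)))


def fit_t50(temps_c, fractions):
    """Estimate (t50, k) from the logit-linearized logistic model.
    Points with fraction <= 0 or >= 1 are skipped (logit is undefined)."""
    pairs = [(t, f) for t, f in zip(temps_c, fractions) if 0.0 < f < 1.0]
    x = [t for t, _ in pairs]
    y = [math.log(1.0 / f - 1.0) for _, f in pairs]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    k = sxy / sxx                    # slope of the linearized form
    intercept = mean_y - k * mean_x  # equals -k * t50
    t50 = -intercept / k
    return t50, k


# Hypothetical, noise-free example data (not real D2D measurements).
temps_c = [30.0, 34.0, 38.0, 42.0, 46.0, 50.0]
fractions = [fraction_active(t, t50=40.0, k=0.8) for t in temps_c]

t50_fit, k_fit = fit_t50(temps_c, fractions)
```

With real assay data, activities are usually normalized to the lowest-temperature measurement first, and values very close to 0 or 1 carry large uncertainty after the logit transform.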

Core Competencies: Analyzing and interpreting data, Asking questions (for science) and defining problems (for engineering), Constructing explanations (for science) and designing solutions (for engineering), Developing and using models, Planning and carrying out investigations, Using mathematics and computational thinking
Nature of Research: Applied Research, Basic Research, Wet Lab/Bench Research

Tasks that Align Student and Research Goals

Research Goals →
Student Goals ↓

Research Goal 1: Expand functional datasets for libraries of mutant enzymes that could be used by protein-engineering communities to design new molecular-modeling algorithms with more robust predictive power.



Student Goal 1: Articulate the project's relevance to the scientific community and describe the process of research

Produce a research poster and give an oral presentation on the experimental process, results, and interpretation of findings.



Student Goal 2: Practice wet-lab biochemistry skills, collaboration, iteration, and data analysis

Perform the following:

  • Enzyme cascade reactions in Kunkel Mutagenesis
  • E. coli transformations and overnight cultures
  • Gene sequence and chromatogram analysis
  • Protein expression by IPTG induction
  • Protein purification
  • Enzyme functional characterization assays



Student Goal 3: Gain confidence in their capability to do research and identify as researchers

  • Design novel enzyme mutations using Foldit
  • Contribute data to an open-access, stakeholder-relevant database


Instructional Materials

The instructional materials are in a continuous state of evolution. The most current version can be found on our website.

Assessment

Students in the CURE will have the option to participate in a short pre/post psychosocial assessment. De-identified data will be made available to instructors whose students complete the survey. The assessment aims to investigate changes in student attitudes that are predictive of persistence in STEM majors.

Instructional Staffing

There are no staffing guidelines; each CURE and institution will be different. Please contact the program coordinator, Ashley Vater (awvater@ucdavis.edu), for consultation on instructional staffing for your course and institution.

Author Experience

Our team is a group of highly motivated faculty who are invested in student learning and excited about pushing the boundaries of research (see names and institution details above). Ashley Vater is the point of contact for the D2D CURE and is a member of the Siegel Lab at UC Davis, where she serves as instructional designer. Her work supports the D2D Network, and she co-instructs the D2D CURE at UC Davis with Prof. Justin Siegel.


Advice for Implementation

The D2D CURE emerged from years of incubation as the training program for new undergraduate students in the Siegel Lab. The functional results for any student's mutant design are of interest; in other words, there are no "negative results," which makes this an ideal project to scale up and bring to the classroom. We are actively seeking to expand our network and welcome new collaborators. Please connect if you are interested in participating! Contact program coordinator Ashley Vater (awvater@ucdavis.edu) for implementation advice.

Iteration

Iteration, a central component of CUREs, is addressed at both the research goal level and the student task level. The database is structured to accept repeated and replicated measurements and to make them easy to analyze, and network faculty are encouraged to plan class sessions in which students repeat challenging experiments. Each of the three workflow modules (Design, Build, and Test) ends in its own milestone, so students can still achieve research success if progress is unexpectedly slowed, a normal feature of authentic research experiences.

Using CURE Data

Students will upload their data to the D2D Database at the end of the Test module of the workflow. Mutations that have sequence-verified plasmids will also be reported in the database, and all plasmids will be shipped to the Siegel Lab for long-term storage and multi-institutional sharing. Students will have opportunities to contribute to publications that report results from their work and to present at national conferences for protein-modeling developers. Further, the student-generated data will be immediately available to the protein-modeling community RosettaCommons, which, as a collective body, will provide feedback and guidance on future enzyme targets to explore. This completes the loop and realizes the hallmark CURE feature of creating stakeholder-relevant research products.

Resources

Resources can be found on the D2D website.