Course Syllabus

CSci 5481: Computational Techniques in Genomics (Fall 2021)

Prof. Rui Kuang (kuang@umn.edu)

Course Description

This course provides a comprehensive introduction to the fundamental algorithms and mathematical models in modern computational molecular biology. In recent years, genome projects have been greatly accelerated by the advent of large-scale DNA/RNA sequencing technologies. The genomes of thousands of species and tens of thousands of human subjects have now been completely or partially sequenced. Sequencing technology has also made it possible to sequence the genomes of many individuals in order to find the genetic characteristics of phenotypes. This tremendous amount of sequencing data promises to shed light on the biological principles carried by the genes encoded in the genomes and on the evolutionary histories of the species. Processing, annotating, and analyzing these large genome sequences is computational by nature, and many algorithms and computational tools play a major role. In this course we will cover algorithms and computational models in the following broad themes:

  • Fundamental sequence analysis algorithms and models for processing and analyzing genome, DNA and protein sequences
  • Short-read alignment and assembly algorithms for next-generation sequencing
  • Phylogenetic models and algorithms for inferring evolutionary relations and history
  • Gene prediction and genome annotation algorithms
  • Protein structure modeling

These topics span a variety of computational methods, such as local/global sequence alignment algorithms, hidden Markov models, motif finding algorithms, multiple sequence alignment algorithms, short-read alignment/mapping, phylogenetic trees and algorithms for building and analyzing them, gene prediction algorithms, and gene function prediction algorithms.

The specific objectives for the students are:

  • To learn how to develop computational thinking for solving biology problems.
  • To acquire a solid background in fundamental concepts in computational molecular biology.
  • To get familiar with the state-of-the-art computational methods in genome analysis.

The breakdown of the class grade is as follows:

  • Four homework assignments: 50%
  • Individual project and poster presentation: 35%
  • Three or four exercises and group presentations: 15%

Homework assignments must be completed in Python or MATLAB.

Textbooks:

  • M. Zvelebil and J. Baum, Understanding Bioinformatics, Garland Science, 2007
  • More recent articles on algorithms for short read assembly, mapping and analysis

Prerequisites: CSCI 4041 or instructor approval. Programming experience in C/C++, Java, MATLAB, or Python.

Instruction: The class will combine in-person sessions with some pre-recorded video lectures to provide the necessary flexibility. Presentations and discussions will be organized through Zoom meetings.

  • Online access to lectures: Pre-recorded lectures will be uploaded to Canvas the day before class. In-person lectures will be recorded by UNITE and made available online in real time to all students, whether or not they are enrolled through UNITE. This is to help reduce the risk of in-class exposure to the virus and to allow students who test positive to continue taking the course remotely.
  • Interaction: Other than in-person lectures, all interaction with the instructor (e.g., office hours, after-class discussions) will be online only (over Zoom). Please avoid milling around in groups before or after class.
  • Vaccination and masking: Everyone is required to wear a mask while inside a University building, regardless of vaccination status. (More information on these requirements is here.) Please ensure that you follow the requirements and, to the extent possible, practice appropriate social distancing in the classroom.

Intended Audience: This course is primarily for graduate and senior undergraduate students in computer science, biomedical informatics and biological sciences with an interest in computational biology.

Academic integrity policy: Students are encouraged to have discussions in the class forum, but all students must work independently on their assignments and exams. Any student who cheats on exams or homework will receive an F as the class grade, and the incident will be reported to the University office. More information on academic misconduct is available at Note on Academic Conduct for New Students and The Office for Community Standards for student academic integrity.
