Algorithms in Sequence Analysis

Course

URL study guide

https://studiegids.vu.nl/en/courses/2024-2025/X_405050

Course Objective

Have you ever wondered how we can track a gene across 3 billion years of evolution? Or how the genome information of a cancer patient can be used to find out what may be wrong? Sequence alignment can be used to compare genomes, genes or proteins from bacteria all the way to humans. Algorithms derived from it can be used to build a phylogeny (to find out about evolutionary relationships), or to find a functional motif in a protein sequence or a viral sequence in a genome. In this course we focus on the most important algorithms for biological sequence analysis that can be applied to real scientific problems in biology.

Objectives [Dublin descriptors]:
- Students will obtain in-depth knowledge about the theory of sequence analysis methods. They will become aware of the major issues, methodology and available algorithms in sequence analysis. Upon completion of the course they will be able to implement several of the most important algorithms in sequence analysis, including dynamic programming and Hidden Markov Models (HMMs) [Knowledge and understanding].
- Students will develop understanding and skills to apply sequence analysis algorithms to protein and DNA sequences. They will gain hands-on experience in tackling biological problems using sequence analysis algorithms, including the statistical framework of HMMs and algorithms used in genome sequencing and analysis [Applying knowledge and understanding].
- Upon completion of the course students will be able to decide which algorithm is best suited for a particular biological sequence analysis problem [Making judgements].
- During this course students are required to read through scientific and technical literature and learn to translate algorithms described in text and formal mathematical notation into computer code [Learning skills].

Course Content

Theory:
- Dynamic programming, database searching, pairwise and multiple alignment, probabilistic methods including HMMs, pattern matching, entropy measures, evolutionary models, and phylogeny.

Practical:
- Programming (in Python) an alignment algorithm based on dynamic programming;
- Aligning sequencing data from tumors to the human genome and analysing structural variants;
- Programming (in Python) an implementation of HMMs and using it to predict protein domain structure.
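
To give a flavour of the first practical, here is a minimal sketch of global alignment scoring with dynamic programming (Needleman-Wunsch with a linear gap penalty). It is an illustration only, not the course's assignment code: the function name `global_alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not course defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]

    # First column and row: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table: each cell extends the best of three neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + substitution  # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap                     # gap in b
            left = score[i][j - 1] + gap                   # gap in a
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]
```

For example, `global_alignment_score("ACGT", "AGT")` aligns three matching positions and one gap. Recovering the alignment itself requires an additional traceback through the table, which this sketch omits.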

Teaching Methods

- 13 lectures: two 2-hour lectures per week.
- 13 computer practicals with associated assignments: two 2-hour hands-on sessions per week.

Method of Assessment

The final grade for this course will consist of 50% practical work (see above) and 50% theoretical assessment. The theoretical assessment will be an oral and/or written exam, depending on the number of students. Further assessment and grading details, including resits and compensation rules, will be posted on Canvas.

Literature

Course material on Canvas.

Books:
- Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G. Biological Sequence Analysis. Cambridge University Press, 1998, 350 pp. ISBN 0521629713.

Recommended reading:
- Zvelebil, M., Baum, J.O. Understanding Bioinformatics. Garland Science, 2008. ISBN-10: 0-8153-4024-9.

Target Audience

- M Bioinformatics and Systems Biology
- M Biomolecular Sciences
- M Artificial Intelligence
- M Computational Science

Additional Information

BYOD policy (Bring Your Own Device)

We expect students in this course to use their own laptop. At the very least, this laptop should support an SSH client for remote shell access to the VU Linux servers. Ideally, it also supports a command-line shell, Python 3, and a text editor with syntax highlighting, either standalone (e.g. Atom or Sublime Text) or as part of a simple IDE (e.g. Spyder). We therefore recommend the Anaconda Python distribution regardless of operating system, along with PuTTY or PowerShell for Windows users specifically.

If you are considering purchasing new hardware, we recommend the following:
- Processor: Intel i5 / AMD Ryzen 5 or above
- Memory: at least 4 GB RAM
- Storage: at least 512 GB hard disk space
- Operating system: Ubuntu 16.04

The course is taught in English.

Entry Requirements

Bachelor's degree in any science discipline (including medicine). Basic programming skills (Python) and an interest in biological problems.

Academic year: 1/09/24 to 31/08/25
Course level: 6.00 EC

Language of Tuition

  • English

Study type

  • Master