# Main Page

This is the course web page for "BTRY 4840/6840 & CS 4775: Computational Genetics and Genomics" – Fall 2015

Please check this page frequently throughout the semester; it will be updated continually with information you will need. Keep in mind that the schedules for lectures and problem sets are provisional.

## Announcements

- You may submit one homework to be re-graded. Re-grade requests must be sent to Melissa by the end of the day on Friday, 12/4/2015.
- A slightly revised version of Problem Set 2 was posted around noon on 9/18/2015.
- Melissa's office hours will be 3:30-4:30 on Sept 16.
- Please bring your laptop to section on Friday, Sept 18.
- Extension/Late policy:
  - Beginning with problem set 2, you can request up to 2 days of extensions on the problem sets. You can either use one day for two different problem sets, or you can use two days on one problem set. Email the instructor and TA to request an extension.
  - Homework grades will be reduced 25% per day late. So, for example, if you request a 1 day extension and turn in your problem set 2 days late, your grade will be reduced by 25%.
  - A 1-2 day extension on problem set 1 will be granted upon request and will not count against the 2 days of extensions for the rest of the term.
  - Homeworks will be accepted without penalty up until 11:59pm on the due date.

- Melissa's office hours will be 2pm-3pm on Sep 2, for this week only.
- There will be no recitation the first week of classes.
- Welcome to the class!
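
The extension and late policy above comes down to a small piece of arithmetic. The following is a minimal sketch of one reading of it, not an official grading script. It assumes the 25% deduction is linear per penalized day, applies to the grade earned, and is capped at 100%. The function and parameter names are hypothetical.

```python
def late_penalty_fraction(days_late, extension_days_used=0, penalty_per_day=0.25):
    """Fraction of the grade deducted, under the assumptions stated above."""
    # Days covered by a granted extension are not penalized.
    penalized_days = max(0, days_late - extension_days_used)
    # Assumption: the deduction grows linearly and never exceeds 100%.
    return min(1.0, penalized_days * penalty_per_day)


def adjusted_grade(raw_grade, days_late, extension_days_used=0):
    """Grade after applying the late deduction to the earned grade."""
    return raw_grade * (1.0 - late_penalty_fraction(days_late, extension_days_used))
```

For the example in the policy (a 1 day extension, turned in 2 days late), `late_penalty_fraction(2, extension_days_used=1)` gives `0.25`, so a raw score of 80 would become `adjusted_grade(80, 2, 1)`, that is 80 * 0.75 = 60.0.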

## General Information

- Lectures: Tues/Thurs, 11:40am-12:55pm, Warren Hall B02
- Recitations: Fri, 1:25pm-2:15pm, Comstock B108
- Instructor: Amy Williams, 102G Weill Hall
- TA: Melissa Hubisz, 102D Weill Hall
- Instructor office hours: Tues, 4-5pm
- TA office hours: Wed, 3-4pm
- Course Syllabus

## Discussion Forum / Q&A

Questions and answers for this course will be handled through Piazza. You can sign up for the class forum here.

## Lectures

- 8/25/2015 Lecture 1 Introduction
- 8/27/2015 Lecture 2 Basics of genetics and inheritance, probability and statistics
- 9/01/2015 Lecture 3 Maximum likelihood, Bayes rule, Hypothesis testing
- 9/03/2015 Lecture 4 More hypothesis testing, Big O notation, Dynamic programming
- 9/08/2015 Lecture 5 Manhattan tourist, Dynamic programming, Sequence alignment
- 9/10/2015 Lecture 6 More sequence alignment (affine gaps, optimizations), intro to HMMs
- 9/15/2015 Lecture 7 (Melissa) Molecular evolution
- 9/17/2015 Lecture 8 More HMMs: CpG islands, Viterbi algorithm
- 9/22/2015 Lecture 9 (updated) More HMMs: Forward/Backward algorithm, pseudocounts, Baum-Welch, Viterbi training
- 9/24/2015 Lecture 10 (updated) More HMMs: Random sampling, sum of logs, genotyping background, haplotype inference
- 9/29/2015 Lecture 11 (updated) Haplotype inference (family-based and unrelated), Li and Stephens
- 10/1/2015 Lecture 12 Finish Li and Stephens, HAPI-UR, Population structure and Principal component analysis, local ancestry
- 10/6/2015 Lecture 13 Phylogenetic trees, parsimony, UPGMA, Jukes-Cantor, Felsenstein's algorithm
- 10/8/2015 Lecture 14 (updated) Phylogenetic inference continued, subtree pruning and regrafting, nearest-neighbor interchange, Bayesian inference
- 10/15/2015 Lecture 15 (Prof. Haiyuan Yu) Protein interactome networks
- 10/20/2015 Lecture 16 (updated) STRUCTURE and MCMC/Gibbs sampling
- 10/22/2015 Lecture 17 (updated) Motif finding, Information theory
- 10/27/2015 Lecture 18 MCMC convergence, HAPMIX, Chromopainter
- 10/29/2015 Lecture 19 (Melissa) ARGs and ARGweaver
- 11/3/2015 Lecture 20
- 11/5/2015 Guest lecture by Prof. Charles Danko
- 11/10/2015 Lecture 21
- 11/12/2015 Lecture 22
- 11/17/2015 Lecture 23
- 11/19/2015 Lecture 24 (updated)
- 11/24/2015 Lecture 25 (updated)
- 12/1/2015 Lecture 26
- 12/3/2015 Lecture 27

## Problem sets

- Problem set 1 is due Sept 10 at lecture.
- Problem set 2 is due Sept 24 by midnight. You will need the sequences in this file: sequences.fasta
- Problem set 3 is due Oct 8 by midnight. You will need the sequence in this file: pset3-sequence.fa
- Problem set 4 is due Tuesday, Oct 27 by midnight. You will need the sequence data in this file: apoe.fa
- Problem set 5 is due Tuesday, Nov 10 by midnight. You will need the following data files: motif1.fa, motif2.fa, dist10.txt

## Final project

Final project details and possible project ideas are available. Project proposals are due on November 3 by midnight.

## Other files

Here is the R script shown in section on 9/18/2015. It contains some basic R usage examples, and at the end shows a simulation/LRT on a weighted die: section3.R
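
For readers who want a feel for the weighted-die example before opening the script, here is a rough Python sketch of a simulation followed by a likelihood ratio test (LRT). It is not the contents of section3.R, and it is written in Python rather than R. The die weights, sample size, seed, and function names are all assumptions chosen for illustration. The null hypothesis is a fair die, and the alternative lets all six face probabilities vary freely, so the statistic is compared to a chi-square distribution with 5 degrees of freedom.

```python
import math
import random


def simulate_rolls(probs, n, rng):
    """Draw n rolls of a six-sided die and return the count of each face."""
    faces = rng.choices(range(6), weights=probs, k=n)
    counts = [0] * 6
    for face in faces:
        counts[face] += 1
    return counts


def lrt_statistic(counts):
    """2 * (log-likelihood at the MLE minus log-likelihood under a fair die)."""
    n = sum(counts)
    total = 0.0
    for c in counts:
        if c > 0:  # a face that never appears contributes 0 to the sum
            total += c * math.log((c / n) / (1.0 / 6.0))
    return 2.0 * total


# Hypothetical weighted die: face 1 comes up half the time.
rng = random.Random(0)
counts = simulate_rolls([0.5, 0.1, 0.1, 0.1, 0.1, 0.1], n=600, rng=rng)
stat = lrt_statistic(counts)

# 11.07 is the 0.95 quantile of a chi-square distribution with 5 degrees of freedom.
print("counts:", counts)
print("LRT statistic:", round(stat, 2), "reject fair die at 5%:", stat > 11.07)
```

The maximum likelihood estimate of each face probability is simply its observed frequency, which is why the statistic needs only the counts.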

Here is the Python script from 9/18/2015, showing example input/output of FASTA files using Biopython: section3.py

Here is the R script from 9/25/2015, showing the use of EM to estimate parameters of a mixture distribution: mixture.R
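
As a companion to the description of the EM script, the following is a minimal Python sketch of EM for a two-component mixture. It is not the contents of mixture.R. It assumes the mixture is made of two Gaussians (the page does not say which mixture the script uses), and the initialization, iteration count, simulated data, and names are hypothetical choices.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, n_iter=100):
    """Estimate (weight, mean1, mean2, sd1, sd2) of a two-Gaussian mixture."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)
    # Crude starting point: means at the data extremes, shared spread.
    w, mu1, mu2, sd1, sd2 = 0.5, min(data), max(data), overall_sd, overall_sd
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in data:
            a = w * normal_pdf(x, mu1, sd1)
            b = (1.0 - w) * normal_pdf(x, mu2, sd2)
            resp.append(a / (a + b))
        # M-step: re-estimate parameters as responsibility-weighted averages.
        n1 = sum(resp)
        n2 = n - n1
        w = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        sd1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        sd2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2)
    return w, mu1, mu2, sd1, sd2


# Simulated data: 200 draws from N(0, 1) and 300 draws from N(5, 1).
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(300)]
w, mu1, mu2, sd1, sd2 = em_two_gaussians(data)
print("weight:", round(w, 3), "means:", round(mu1, 3), round(mu2, 3))
```

This sketch has no safeguards against a component collapsing onto a single point, which a more careful implementation would need for poorly separated data.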