Applied section:

The Applied Statistics stream will meet on Wednesdays from 2:30-4:00PM EST in the Ontario room (when available) on the 6th floor of 438 University. The first class will be on Wednesday, Feb 10th.

Date & Room Schedule (may be updated as the course progresses)

  1. Wednesday, Feb 10th – Ontario (6th floor)

  2. Wednesday, Feb 17th – Manitoba (6th floor)

  3. Wednesday, Feb 24th – Ontario

  4. Wednesday, Mar 2nd – Ontario

  5. Wednesday, Mar 9th – Ontario

  6. Wednesday, Mar 16th – Ontario

  7. Wednesday, Mar 23rd – Ontario

  8. Wednesday, Mar 30th – Manitoba

  9. Wednesday, Apr 6th – PEI

  10. Wednesday, Apr 13th – British Columbia (6th floor)

Below is an outline of the course; lessons may be added or modified as the course progresses.

Lesson 1: Introduction to Python

  • Introduction to Anaconda and IPython
  • Python 3 versus 2.7
  • Scalar Data types
  • Data Structures/Sequences

Lesson 2: Control flow and Basic operations

  • if/else statements, while and for loops
  • Ternary expressions
  • List/Set/Dict comprehensions
  • “Pythonic” programming
  • Built-in functions and methods
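
As a rough preview of the Lesson 2 topics, the sketch below shows a ternary expression alongside list, set and dict comprehensions. The `scores` data and the pass mark of 70 are made up for illustration and are not course material.

```python
# Hypothetical sample data, for illustration only.
scores = [72, 88, 95, 61, 88]

# Ternary expression inside a list comprehension:
# choose one of two values in a single expression.
labels = ["pass" if s >= 70 else "fail" for s in scores]

# List comprehension with a filter condition.
high = [s for s in scores if s >= 80]

# Set comprehension: duplicate values collapse to one entry.
unique = {s for s in scores}

# Dict comprehension: map each position to its score.
by_index = {i: s for i, s in enumerate(scores)}

print(labels)
print(high)
print(sorted(unique))
print(by_index)
```

Each comprehension replaces an explicit loop that builds a container item by item, which is a large part of what is usually meant by "Pythonic" style.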

Lesson 3: Functions

  • Writing functions
  • Function scope and side effects
  • Lambda functions
  • Closures
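
To give a flavour of the Lesson 3 topics on scope, closures and lambda functions, here is a minimal sketch. The names `make_counter`, `next_id` and `words` are hypothetical and chosen only for this example.

```python
def make_counter(start=0):
    """Return a function that remembers and increments its own count."""
    count = start  # lives in the enclosing scope of `counter`

    def counter():
        nonlocal count  # rebind the enclosing variable, not a new local
        count += 1
        return count

    return counter  # the returned function "closes over" `count`


next_id = make_counter(10)
first = next_id()
second = next_id()

# A lambda is a small anonymous function, often used as a sort key.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))

print(first, second)
print(by_length)
```

The state kept in `count` is a side effect confined to the closure: each call to `make_counter` produces an independent counter.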

Lesson 4: Classes and Modules

  • Classes: Object-oriented programming in Python
  • Functions vs methods
  • Generating classes
  • Installing Modules
  • Creating modules
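
The difference between a function and a method, listed above, can be sketched as follows. The `Sample` class and both `mean` definitions are hypothetical examples, not code from the course.

```python
class Sample:
    """Hypothetical container for a list of numeric observations."""

    def __init__(self, values):
        self.values = list(values)

    def mean(self):
        # A method: called on an instance, which arrives as `self`.
        return sum(self.values) / len(self.values)


def mean(values):
    # A plain function: the data is passed in explicitly.
    return sum(values) / len(values)


s = Sample([2, 4, 6])
print(s.mean())         # method call on the object
print(mean([2, 4, 6]))  # function call with the data as an argument
```

Saving these definitions in a file such as `stats_tools.py` (a made-up name) would make them importable with `import stats_tools`, which is the simplest form of creating a module.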

Lesson 5: NumPy and Pandas 1

  • Modules in Python
  • Vectorization in NumPy
  • Expansion to Pandas
  • Reading, displaying and exporting csv files
  • Subsetting, functions and methods on dataframes

Lesson 6: NumPy and Pandas 2

  • Reshaping, pivoting and aggregating using Pandas
  • Advanced data munging
  • Tidy and untidy data

Lesson 7: Plotting and Regression

  • Matplotlib/Seaborn, Pandas plotting, ggplot for Python
  • Saving, embedding and using plots
  • Statsmodels module
  • Linear Regression and ANOVA

Lesson 8: Advanced Stats

  • Logistic regression and GLMs
  • Linear Optimization
  • Clustering

Lesson 9: Machine Learning and scikit-learn

  • Introduction to scikit-learn
  • Data formats and methods
  • Unsupervised clustering example – k-means

Lesson 10: Work Flow

  • Working on remote servers
  • Working with SQL
  • Script organization and modules
  • Jupyter Notebooks