
Data Analysis (Spring 2020)

Topics:

  • Statistical learning: (Supervised learning - Unsupervised learning)
  • Linear Regression
  • Classification: (Logistic Regression, Bayes Classifier)
  • Linear Model Selection and Regularization: (Ridge Regression, Lasso Regression)
  • Decision Trees: (Bagging, Random Forests, Boosting)
  • Clustering: (K-means, Hierarchical, Model-based: Mixture models)
  • Neural Networks (A brief introduction)

 

Textbook:

Most of the topics are based on:

  • An Introduction to Statistical Learning: with Applications in R (2013) (Springer Texts in Statistics) by G. James, D. Witten, T. Hastie and R. Tibshirani
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer Series in Statistics) (2001 & 2009) by T. Hastie, R. Tibshirani, J. H. Friedman.

     

Prerequisites: 

-

Grading Policy: 

Final exam: 11-12 points

Group assignments: 4-6 points

Group project: 2-4 points

Extra points: up to 1 point for remarkable questions/answers about your classmates' projects

Extra points: up to 1 point for an optional quiz on Python

--------------------------------------------------------------------------------------------------------------------------------------------------------

Assignments must be done with the R software. The project can be done with any software, including R or Python.

Students who score less than 40% on the final exam will receive half the group score.

---------------------------------------------------------------------------------------------------------------------------------------------------------


About projects:

The project must be about a real data set from Iran with many variables/observations. There are deadlines for reporting project progress and presenting your outputs; send both by email to a.mofidian@math.iut.ac.ir

Reports and outputs must be prepared in Word (Times New Roman, size 11 or Arial, size 10).

The first deadline: March 04 (14 Esfand) (Extended)

  • Send the data set along with a one-page explanation that includes the description of the variables, how the data were collected, and the reference.
  • Also, one page about the possible data analysis tasks for your data and the difficulties that may arise.
  • One or two pages visualizing your data (not necessarily with R).

---------------------------------------------------------------------------------------------------------------------------------------------------------

Teaching Assistants: 

  • Amir Abbas Mofidian (PhD student of Statistics): teaching R, evaluating homework, consulting on projects
  • Ghasemi nejad: teaching Python

Time: 

Class time: Sundays and Tuesdays, 9:30-11

Also, there are two classes to teach R (Tuesdays at 8:00) and Python (Mondays at 16:30) in the Statistics Lab.

Attendance in the R class is mandatory. (Cancelled)

------------------------------------------------------------------------------------------------------------------------------------------------------
