Essential Statistics for Data Science - 21M21CS123

Reference:

Date | Topic | Material / Assignment
18 Jan 2021
Course Description, Assessment Tools, Online Study Material, Books, Project-Based Learning Component
Course Description: Essential Statistics for Data Science EVEN 2021
19 Jan 2021
Generic About Data
Data, Data science, Types of Datasets, Attributes, Types of attributes.
Sources: [Reference 1] Chapter 2
20 Jan 2021
Descriptive Statistics
Data Science Engineer Role, Descriptive Data Summarization
Measures of Central Tendency: Mean, Median, Mode, Midrange
Measures of Dispersion: Quartiles, Interquartile Range (IQR), Variance, Standard Deviation
ESDS_Lecture3
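The measures listed for this lecture can be sketched with Python's standard `statistics` module. This is a minimal illustration, not the course's own code: the sample values are made up, and the choice of the "inclusive" quartile method and of population (rather than sample) variance is an assumption.

```python
import statistics as st

# Hypothetical sample values, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = st.mean(data)                      # arithmetic mean
median = st.median(data)                  # middle value of the sorted data
mode = st.mode(data)                      # most frequent value
mid_range = (min(data) + max(data)) / 2   # average of smallest and largest

# Measures of dispersion
# method="inclusive" treats the data as the full population (an assumption).
q1, q2, q3 = st.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                             # interquartile range
variance = st.pvariance(data)             # population variance
std_dev = st.pstdev(data)                 # population standard deviation

print(mean, median, mode, mid_range)
print(q1, q3, iqr, variance, std_dev)
```

For sample statistics, `st.variance` and `st.stdev` (which divide by n - 1) would be used instead.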
25 Jan 2021
Descriptive Statistics
Lecture + Tutorial:
1. Theoretical Assignment: ESDS_Tutorial01.pdf [1]
2. Excel Assignment: the data and questions are in the Excel file itself, ESDS_Excel_PracticeTutorial01.xls
3. Python Assignment: load the data given in part 2 (the Excel file) as a CSV and compute the statistical measures listed in the Excel file using Python.
ESDS_Tutorial01
ESDS_Excel_Practicetutorial01


27 Jan 2021
Inferential Statistics
Normal Distribution, Empirical Rule: 68%-95%-99.7%
Normal Distribution using Excel: Normal Distribution.xlsx
Normal Distribution using Python: NormalDistribution.ipynb
ESDS_Lecture4
Normal Distribution Excel file
Python File: Shared on Google Classroom*
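As a quick check of the empirical rule, the share of a normal distribution lying within 1, 2, and 3 standard deviations of the mean can be computed from the cumulative distribution function. The sketch below uses `statistics.NormalDist` from the standard library; it is not the shared NormalDistribution.ipynb notebook, and the helper name `within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

def within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within(k):.4f}")
```

The three probabilities come out at roughly 68%, 95%, and 99.7%.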
01 Feb 2021
Inferential Statistics
Lecture+Tutorial
Bernoulli Distribution, Binomial Distribution, Poisson Distribution
Tutorial: Binomial Distribution in Excel and Python
ESDS_Lecture_4-6
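The three discrete distributions named above have simple closed-form probability mass functions. The following is a minimal sketch using only the standard library; the function names and the example parameters are illustrative assumptions, not taken from the tutorial.

```python
from math import comb, exp, factorial

def bernoulli_pmf(x, p):
    """P(X = x) for one trial with success probability p; x must be 0 or 1."""
    return p if x == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events) when events occur at an average rate of lam per interval."""
    return exp(-lam) * lam**k / factorial(k)

# Example: probability of exactly 2 heads in 4 fair coin tosses.
print(binomial_pmf(2, 4, 0.5))
# Example: probability of no events when the average rate is 2 per interval.
print(poisson_pmf(0, 2.0))
```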
02 Feb 2021
Inferential Statistics
Descriptive vs. Inferential Statistics, Sampling Distribution, Central Limit Theorem
ESDS_7
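The Central Limit Theorem can be illustrated by simulation: means of repeated samples from a non-normal population cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size. The sketch below assumes a Uniform(0, 1) population, a sample size of 30, and 2000 repetitions; all three choices are arbitrary.

```python
import random
import statistics as st

def sample_means(sample_size, trials, rng):
    """Return the means of `trials` independent samples drawn from Uniform(0, 1)."""
    return [
        st.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(trials)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(sample_size=30, trials=2000, rng=rng)

# Uniform(0, 1) has mean 0.5 and variance 1/12.
population_sd = (1 / 12) ** 0.5
predicted_se = population_sd / 30 ** 0.5   # standard error predicted by the CLT

print("mean of sample means:", st.fmean(means))
print("sd of sample means:  ", st.stdev(means))
print("predicted by the CLT:", predicted_se)
```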
03 Feb 2021
08 Feb 2021
Python Code for Data Upload and Computing Statistical Measures
Python code was demonstrated, and students implemented the same to understand the following:
A. Implementing statistics concepts for data science. Opening a data file via three methods:
1. Use a dataset from an existing Python library
2. Read a CSV of your own data
3. Use data directly from the data source URL
B. Code to compute central tendency measures
C. Plotting of various graphs: scatter plot, bar chart, histogram
D. Distributed measure
Lecture 8 & Lecture 9
Shared Google Colab Link of Python code with ESDS class students
09 Feb 2021
10 Feb 2021
Proximity Measures: Nominal attribute, Binary Attribute, Numeric attribute.
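A minimal sketch of common dissimilarity measures for the three attribute types follows. It assumes the usual textbook definitions: the mismatch ratio for nominal attributes, the asymmetric binary dissimilarity (negative matches ignored), and the Minkowski distance for numeric attributes. The function names and example objects are hypothetical.

```python
def nominal_dissimilarity(a, b):
    """Fraction of nominal attributes on which the two objects differ: (p - m) / p."""
    matches = sum(x == y for x, y in zip(a, b))
    return (len(a) - matches) / len(a)

def asymmetric_binary_dissimilarity(a, b):
    """(r + s) / (q + r + s), ignoring attributes where both objects are 0."""
    q = sum(x == 1 and y == 1 for x, y in zip(a, b))  # both 1
    r = sum(x == 1 and y == 0 for x, y in zip(a, b))  # 1 in a, 0 in b
    s = sum(x == 0 and y == 1 for x, y in zip(a, b))  # 0 in a, 1 in b
    if q + r + s == 0:
        return 0.0  # no positive values at all: treat the objects as identical
    return (r + s) / (q + r + s)

def minkowski_distance(a, b, h):
    """Minkowski distance of order h: h=1 is Manhattan, h=2 is Euclidean."""
    return sum(abs(x - y) ** h for x, y in zip(a, b)) ** (1 / h)

print(nominal_dissimilarity(["red", "M", "yes"], ["red", "L", "yes"]))
print(asymmetric_binary_dissimilarity([1, 0, 1, 0, 0, 0], [1, 0, 1, 0, 1, 0]))
print(minkowski_distance((1, 2), (3, 5), 1))   # Manhattan
print(minkowski_distance((1, 2), (3, 5), 2))   # Euclidean
```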
15 Feb 2021
Parametric and Non-Parametric Tests, Hypothesis Testing, Chi-Square Test, Chi-Square Example Problems, Covariance, Pearson Correlation
ESDS_8-10
16 Feb 2021
Python code for implementation of chi-square test, correlation, and covariance
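Since the class code itself is not posted here, the following is only a rough sketch of how these three quantities can be computed with the standard library. The function names and the 2x2 counts are illustrative assumptions; sample covariance (dividing by n - 1) is assumed.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence of the row and column attributes.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    return chi2

def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    """Pearson correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(x, y) / (covariance(x, x) * covariance(y, y)) ** 0.5

# Illustrative 2x2 table of counts (rows and columns are two nominal attributes).
table = [[250, 200], [50, 1000]]
print(chi_square_statistic(table))

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(covariance(x, y), pearson_r(x, y))
```

The chi-square statistic is then compared against the critical value for (rows - 1) x (columns - 1) degrees of freedom at the chosen significance level.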
17 Feb 2021
Test-1 Revision Class & Exam Guidelines
Lectures taken by another faculty member between the Test-1 and Test-2 examinations
(Test-1 Schedule 22 – 26 Feb 2021; Test-2 Schedule 05 – 09 Apr 2021)
12 Apr 2021
PowerPoint covering LR, MLR, Polynomial, KNN, SVM, OLS, Ridge Regression, and Lasso Regression
Regression Powerpoint
12 Apr 2021
Simple Linear Regression Python Code
Simple Linear Regression Code
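As a stand-in for the linked code, simple linear regression can be sketched with the closed-form least-squares formulas: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The function name `fit_line` and the data points are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x for one feature."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                        # spread of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # co-variation of x and y
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data points, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
print("prediction at x = 6:", intercept + slope * 6)
```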
13 Apr 2021
Multiple Linear Regression
Regression Powerpoint
14 Apr 2021
Multiple Linear Regression Python Code
Multiple Linear Regression Code
19 Apr 2021
Polynomial Linear Regression, KNN Regression, and their Python Code
Polynomial Linear Regression Code
KNN Classification Code
20 Apr 2021
Support Vector Regression (SVR) and its Python Code
Support Vector Machine Code
21 Apr 2021
Ram Navami Holiday
26 Apr 2021
Multicollinearity, Underfitting & Overfitting, Regularization, OLS, Ridge Regression, Lasso Regression, and their Python Code
Ridge and Lasso Regression Code
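To show how regularization shrinks a coefficient, the single-feature case has closed-form solutions. The sketch below assumes an unpenalized intercept, the ridge objective sum((y - a - b*x)^2) + alpha * b^2, and the lasso objective 0.5 * sum((y - a - b*x)^2) + alpha * |b|. Libraries may scale these objectives differently, so the alpha values here are not directly comparable with theirs. All function names are hypothetical.

```python
from math import copysign

def centered_sums(x, y):
    """Return Sxx and Sxy computed around the means of x and y."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxx, sxy

def ols_slope(x, y):
    """Unregularized least-squares slope."""
    sxx, sxy = centered_sums(x, y)
    return sxy / sxx

def ridge_slope(x, y, alpha):
    """Ridge slope: the L2 penalty enlarges the denominator, shrinking the slope."""
    sxx, sxy = centered_sums(x, y)
    return sxy / (sxx + alpha)

def lasso_slope(x, y, alpha):
    """Lasso slope: soft-thresholding can set the slope exactly to zero."""
    sxx, sxy = centered_sums(x, y)
    shrunk = max(abs(sxy) - alpha, 0.0)
    return copysign(shrunk, sxy) / sxx

# Hypothetical data lying exactly on y = 2x.
x = [1, 2, 3]
y = [2, 4, 6]

print("OLS:  ", ols_slope(x, y))
print("Ridge:", ridge_slope(x, y, alpha=2))
print("Lasso:", lasso_slope(x, y, alpha=2), lasso_slope(x, y, alpha=5))
```

Ridge only ever shrinks the slope toward zero, while a large enough lasso penalty removes the feature altogether.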
10th-20th: University closed due to COVID-19

CLASSES OVER FOR EVEN 2021 FROM 20 May 2021
END SEMESTER EXAMINATION
22 – 31 May 2021

  1. Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques (3rd ed.). The Morgan Kaufmann Series in Data Management Systems, 5(4), 83-124.

* Code cannot be shared here due to security restrictions imposed by WordPress.