Data Science Master Program - Full Course
Statistics Essentials for Analytics
Each topic in the following section covers the basics: what it is, when to use it, the math behind it, how to implement it with an analytics tool, and what inferences you can draw from the final result.
Understanding the Data
Probability and its Uses
Statistical Inference
Data Clustering
Testing the Data
Regression Modelling
Data Science with Python
Module 1: Introduction to Data Science
What is Data Science?
What is Machine Learning?
What is Deep Learning?
What is AI?
Data Analytics & its types
Module 2: Introduction to Python
What is Python?
Why Python?
Installing Python
Python IDEs
Jupyter Notebook Overview
Module 3: Python Basics
Python Basic Data types
Lists
Slicing
IF statements
Loops
Dictionaries
Tuples
Functions
Array
Selection by position & Labels
Module 4: Python Packages
Pandas
NumPy
Scikit-learn
Matplotlib
Module 5: Importing data
Reading CSV files
Saving Python data objects
Loading Python data objects
Writing data to csv file
Module 6: Manipulating Data
Selecting rows/observations
Rounding numbers
Selecting columns/fields
Merging data
Data aggregation
Data munging techniques
Module 7: Statistics Basics
Central Tendency
Mean
Median
Mode
Skewness
Normal Distribution
Probability Basics
What is meant by probability?
Types of Probability
Odds Ratio
Standard Deviation
Data deviation & distribution
Variance
Bias Variance Tradeoff
Underfitting
Overfitting
Distance metrics
Euclidean Distance
Manhattan Distance
Outlier analysis
What is an Outlier?
Inter Quartile Range
Box & whisker plot
Upper Whisker
Lower Whisker
Scatter plot
Cook’s Distance
Missing Value treatments
What is an NA?
Central Imputation
KNN imputation
Dummification
Correlation
Pearson correlation
Positive & Negative correlation
Error Metrics
Classification
Confusion Matrix
Precision
Recall
Specificity
F1 Score
Regression
MSE
RMSE
MAPE
Module 8: Machine Learning
Module 9: Supervised Learning
Linear Regression
Linear Equation
Slope
Intercept
R square value
Logistic regression
Odds ratio
Probability of success
Probability of failure
ROC curve
Bias Variance Tradeoff
Module 10: Unsupervised Learning
K-Means
K-Means++
Hierarchical Clustering
Module 11: Other Machine Learning algorithms
K-Nearest Neighbour
Naïve Bayes Classifier
Decision Tree – CART
Decision Tree – C5.0
Random Forest
Data Science with R Language
Module 1: Introduction to Data Science Methodologies