Guys, I have listed all the statistics topics you need to get a quick start in Data Science and Machine Learning. I am in the process of writing an article on each of these topics, with examples, so stay tuned for updates. You can also bookmark this page and follow the blog to get an automatic notification when each article is completed.
Table of Contents
1. Basic Terminology
Descriptive Statistics
2. Understanding Data in terms of Statistics
3. Probability that is enough to get you started
- Probability Basics
- Conditional Probability
- Monty Hall Problem
- Bayes’ Theorem
- Probability Distributions
- Why we need Probability Distributions
- Discrete Distribution – Probability Mass Function (PMF)
- Continuous Distribution – Probability Density Function (PDF)
- Representation through Histogram
- Shape of PDF
- Negatively Skewed
- Normal Distribution
- Positively Skewed
- Kurtosis
- Some Common Probability Distributions
- Bernoulli
- Geometric
- Binomial
- Poisson
- Exponential
Inferential Statistics
4. Central Limit Theorem
- What is the Central Limit Theorem
- Expectation and Variance
- Sampling Distribution
5. Hypothesis Testing
- Confidence Interval
- Confidence Level
- Null Hypothesis (H_0)
- Alternative Hypothesis (H_1)
- Significance Level
- Critical Region
- P-value
- Q-value
- One-tailed test
- Two-tailed test
- Errors
- Type I
- Type II
6. Frequently Used Test Statistics for Inferential Techniques
- z test
- t test
- chi-squared test
- F test
7. Analysis of Variance (ANOVA)
- Test for One Way ANOVA
- Two Way ANOVA
- Multiple Comparisons
8. Regression Analysis
- Simple Linear Regression – SLR
- Concept behind the Simple Linear Regression.
- What is the “Best Fitting Line”?
- The Simple Linear Regression Model.
- Common Error Variances in SLR
- Goodness of Fit
- MSE (Mean square error)
- MAE (Mean absolute error)
- RMSE (Root mean square error)
- MAPE (Mean absolute percent error)
- NMSE (Normalized mean square error)
- NMAE (Normalized mean absolute error)
- NMAPE (Normalized mean absolute percent error)
- R-Squared, the Coefficient of Determination.
- Correlation Coefficient r.
- Covariance Coefficient
- R-squared Limitations.
- Adjusted R-Squared.
- More Examples.
- SLR Estimation and Prediction
- SLR Model Assumptions
- Residual Analysis
- Leverage Analysis
- Multiple Linear Regression – MLR
- Concept behind MLR
- Examples of MLR
- MLR Model
- Multicollinearity
- Variance Inflation Factor
- Matrix form of MLR
- MLR Model Evaluation
- MLR Model Assumptions
- Confusion Matrix
- Recall
- Precision
- Specificity
- Accuracy
- F1 Score
- Which one to use when
- Logistic Regression – Classification
- What is Logistic Regression
- Why Linear Models are not suitable for classification problems
- Sigmoid Function
- Transform Linear Model to Logistic Model
- What is Log Likelihood
- Model Building
- Stepwise Regression
- Best Subset Regression
- Cross Validation
- Feature Selection
- Principal Component Analysis
- Understanding output of Linear Model in R
- Understanding the output of Logistic Model
- Complete End to End Example using Linear Models
- Complete End to End Example using Logistic Model
9. Time Series Analysis
- What is Time Series Data
- Time Series Equation
- Components of Time Series
- Trend
- Seasonality
- Multiplicative
- Additive
- Random stationary
- Autoregressive Method
- Autocorrelation Function (ACF)
- Partial Autocorrelation Function (PACF)
- Stationary Model
- Moving Averages
- Simple Moving Averages (SMA)
- Weighted Moving Averages (WMA)
- Exponential Smoothing
- Adding Trend And Seasonality to Moving Averages Process
- Holt-Winters Method
- Moving Averages
- AR, MA and ARIMA Models
