Statistics For DS & ML

Guys, I have listed all the statistics topics you need to get started with Data Science and Machine Learning. I am writing an article on each of these topics, with examples, so stay tuned for updates. You can also bookmark this page and follow the blog to be notified automatically when each article is published.

Table of Contents

1. Basic Terminology

  • Population and Sample
  • Census and Survey
  • Parameter and Statistic
  • Descriptive and Inferential Statistics
  • Variables
    • Dependent
    • Independent
  • Data
    • Numerical – Quantitative
      • Discrete
      • Continuous
    • Categorical
      • Nominal
      • Ordinal
[All the above terms are described here]

Descriptive Statistics

2. Understanding Data in terms of Statistics

3. Probability that is enough to get you started

  • Probability Basics
    • Probability vs Statistics
    • Sample Space
    • Event
    • Mutually Exclusive events
    • Independent Events
    • Joint Probability
    • Union Probability
    • Marginal Probability
  • Conditional Probability
    • Visualizing using Probability Tree
  • Monty Hall Problem
  • Bayes’ Theorem
  • Confusion Matrix
    • Recall
    • Precision
    • Specificity
    • Accuracy
    • F1 Score
    • Which one to use when
  • Probability Distributions
    • Why we need the Probability Distribution
    • Discrete Distribution – Probability Mass Function (PMF)
    • Continuous Distribution – Probability Density function (PDF)
    • Representation through Histogram
    • Shape of PDF
      • Negatively Skewed
      • Normal Distribution
      • Positively Skewed
      • Kurtosis
  • Some Common Probability Distributions
    • Bernoulli
    • Geometric
    • Binomial
    • Poisson
    • Exponential

Inferential Statistics

4. Central Limit Theorem

  • What is Central Limit Theorem
  • Expectation and Variance
  • Sampling Distribution

5. Hypothesis Testing

  • Confidence Interval
  • Confidence Level
  • Null Hypothesis (H_0)
  • Alternative Hypothesis (H_1)
  • Significance Level
  • Critical Region
  • P-value
  • Q-value
  • One-tailed test
  • Two-tailed test
  • Errors
    • Type I
    • Type II

6. Frequently Used Test Statistics for Inferential Techniques

  • z-test
  • t-test
  • chi-squared test
  • F test

7. Analysis of Variance (ANOVA)

  • Test for One Way ANOVA
  • Two Way ANOVA
  • Multiple Comparisons

8. Regression Analysis

9. Time Series Analysis

  • What is Time Series Data
  • Time Series Equation
  • Components of Time Series
    • Trend
    • Seasonality
      • Multiplicative
      • Additive
    • Random stationary
  • Autoregressive Method
    • Autocorrelation Function (ACF)
    • Partial Autocorrelation Function (PACF)
  • Stationary Model
    • Moving Averages
      • Simple Moving Averages (SMA)
      • Weighted Moving Averages (WMA)
      • Exponential Smoothing
    • Adding Trend and Seasonality to the Moving Average Process
      • Holt-Winters Method
  • AR, MA and ARIMA Models
