Category Archives: Statistics

Bayes’ Theorem with Example for Data Science Professionals

Bayes’ theorem is an extension of conditional probability. Conditional probability gives us the probability of A given B, denoted by P(A|B). Bayes’ theorem says that if we know P(A|B), we can determine P(B|A), provided that P(A) and P(B) are known and P(A) is not zero.
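
As a minimal sketch of that relationship, the snippet below computes P(B|A) = P(A|B) · P(B) / P(A). The function name `bayes` and the input probabilities are assumptions chosen only for illustration; they do not come from the post.

```python
def bayes(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """Return P(B|A) using Bayes' theorem: P(A|B) * P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a


# Hypothetical inputs, chosen only to illustrate the formula.
p_b_given_a = bayes(p_a_given_b=0.8, p_b=0.1, p_a=0.2)
print(round(p_b_given_a, 4))  # 0.4
```

Here P(A|B) = 0.8 is reversed into P(B|A) = 0.4, which is only possible because both P(A) and P(B) are supplied.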


Conditional Probability with examples For Data Science

As the name suggests, conditional probability is the probability of an event under a given condition. The condition reduces the sample space to only those outcomes for which the condition holds.

For example, find the probability that a person subscribes to insurance given that they have taken a house loan. Here the sample space is restricted to the people who have taken a house loan.
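
The restriction of the sample space can be sketched in a few lines of Python. The records below are made up for illustration, and `conditional_probability` is a hypothetical helper name, not something taken from the post.

```python
# Hypothetical records: (has_house_loan, has_insurance)
customers = [
    (True, True), (True, False), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]


def conditional_probability(records):
    """Return P(insurance | house loan) from (has_loan, has_insurance) pairs."""
    # Restrict the sample space to people who have taken a house loan.
    with_loan = [record for record in records if record[0]]
    if not with_loan:
        raise ValueError("no records satisfy the condition")
    insured = sum(1 for _, has_insurance in with_loan if has_insurance)
    return insured / len(with_loan)


p_insurance_given_loan = conditional_probability(customers)
print(p_insurance_given_loan)  # 0.5
```

Four of the eight people have a house loan, and two of those four have insurance, so the conditional probability is 2/4 rather than the unconditional 3/8.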


Variance, Standard Deviation and Other Measures of Variability and Spread

Variance and standard deviation are the most commonly used measures of variability and spread. Variability and spread describe how far the data lies from the mean. Variance is the average squared distance of the data points from the mean, and standard deviation is simply the square root of the variance. Because variance is calculated in squared units (explained in the post), we take its square root to obtain a value in the same units as the data points; this value is called the standard deviation.
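
A short sketch of both quantities follows. It assumes the population form of variance (dividing by n); the sample form divides by n − 1 instead, and the excerpt does not say which one the post uses. The data values are made up.

```python
import math


def population_variance(values):
    """Average squared distance of the values from their mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = population_variance(data)  # in squared units of the data
std_dev = math.sqrt(variance)         # back in the units of the data
print(variance, std_dev)  # 4.0 2.0
```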


Basic Statistics for Data Science – Part 1

Statistics is the science of collecting, organizing, presenting, analyzing and interpreting data. It is one of the most important disciplines for gaining deeper insight into data. Statistical analysis is used to manipulate, summarize and investigate data so that useful information can be obtained.

Takeaways from this post:

  • Types of Statistics: Descriptive vs Inferential
  • Basic terminology like Population vs Sample
  • Types of Variables: Numerical vs Categorical
  • Measures of central tendency: Mean, Median and Mode and their specific use cases
  • Measures of dispersion/spread: Variance, standard deviation etc.
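
As a rough illustration of why the three measures of central tendency have different use cases, the sketch below uses Python's standard `statistics` module on a made-up data set containing one outlier.

```python
import statistics

# Hypothetical values; 100 is a deliberate outlier.
values = [1, 2, 2, 3, 100]

mean = statistics.mean(values)      # pulled upward by the outlier
median = statistics.median(values)  # middle value, robust to the outlier
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)  # 21.6 2 2
```

The mean is dragged far from the typical value by a single extreme point, while the median and mode stay at 2, which is one reason the median is often preferred for skewed data.
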