Statistics – Data Science Duniya

Practice Problems on Hypothesis Testing

April 17, 2022 Ashutosh Tripathi

Hypothesis testing helps us validate claims made by different people in different scenarios. For example, suppose we claim that there is no significant difference between the intelligence levels of boys and girls. Can we validate that claim statistically? Or can we validate that smoking causes cancer?
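As a rough illustration of validating such a claim, here is a minimal sketch of an exact permutation test on the difference in group means. This is one possible test, not necessarily the one used in the post. The function name permutation_test and the scores are hypothetical, and the sketch assumes integer scores and small groups so that every split can be enumerated.

```python
from fractions import Fraction
from itertools import combinations


def permutation_test(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Returns the fraction of all possible regroupings whose absolute
    mean difference is at least as large as the observed one.
    Assumes integer observations and small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        # Means are kept as exact fractions to avoid rounding issues.
        return abs(Fraction(sum_a, n_a) - Fraction(total - sum_a, n_b))

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        splits += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed:
            extreme += 1
    return Fraction(extreme, splits)


# Hypothetical test scores, invented purely for illustration.
boys = [98, 102, 105, 110, 95]
girls = [101, 99, 107, 104, 100]

p_value = permutation_test(boys, girls)
print(float(p_value))
```

A large p-value means the observed difference is easy to get by chance alone, so the data give no reason to reject the claim of "no difference". A small p-value would count as evidence against it.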

What is Feature Scaling in Machine Learning | Normalization vs Standardization

June 12, 2021 Ashutosh Tripathi

Let me start with a simple question. Can we compare a mango and an apple? They differ in taste, sweetness, health benefits and so on. A comparison is fair only between similar entities; otherwise it will be biased. The same logic applies to Machine Learning. Feature Scaling in Machine Learning brings features to the same scale before we compare them or build a model. Normalization and Standardization are the two frequently used techniques of Feature Scaling in Machine Learning.
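To make the two techniques concrete, here is a minimal sketch using only the Python standard library. It takes Normalization to mean min-max scaling to the range [0, 1] and Standardization to mean z-scoring with the population standard deviation. Those are common conventions, but the post may define them differently. The function names and the feature values are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical features measured on very different scales.
ages = [20, 30, 40, 50, 60]
incomes = [20000, 35000, 50000, 65000, 80000]

print(min_max_normalize(ages))
print(min_max_normalize(incomes))
print(standardize(ages))
```

After scaling, ages and incomes occupy the same range, so neither dominates a distance-based comparison simply because its raw numbers are larger.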

Bayes’ Theorem with Example for Data Science Professionals

August 20, 2019 Ashutosh Tripathi

Bayes’ Theorem is an extension of conditional probability. Conditional probability helps us determine the probability of A given B, denoted by P(A|B). Bayes’ theorem says that if we know P(A|B), then we can determine P(B|A), provided that P(A) and P(B) are known to us: P(B|A) = P(A|B) × P(B) / P(A).
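The relationship can be sketched in a few lines of Python. The function name bayes and the probabilities below are hypothetical and chosen only for illustration: B stands for "has the condition" and A for "tests positive".

```python
def bayes(p_a_given_b, p_b, p_a):
    """Return P(B|A) computed as P(A|B) * P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a


# Hypothetical numbers, not taken from any real study.
p_pos_given_cond = 0.90      # P(A|B): positive test given the condition
p_cond = 0.01                # P(B): prior probability of the condition
p_pos_given_no_cond = 0.05   # false positive rate

# P(A) from the law of total probability.
p_pos = p_pos_given_cond * p_cond + p_pos_given_no_cond * (1 - p_cond)

# P(B|A): probability of the condition given a positive test.
print(bayes(p_pos_given_cond, p_cond, p_pos))
```

With these made-up numbers, the probability of having the condition after a positive test comes out at roughly 0.15, far lower than the 0.90 one might guess by confusing P(A|B) with P(B|A).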

Conditional Probability with examples For Data Science

August 15, 2019 Ashutosh Tripathi

Conditional Probability helps Data Scientists get better results from a given data set, and it helps Machine Learning Engineers build more accurate predictive models.
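For readers who want a concrete feel for the idea, here is a minimal sketch that computes P(A|B) = P(A and B) / P(B) by counting equally likely outcomes. The helper conditional_probability and the die example are illustrative assumptions, not taken from the post.

```python
from fractions import Fraction


def conditional_probability(outcomes, event_a, event_b):
    """Return P(A|B) over equally likely outcomes, as an exact fraction."""
    in_b = [o for o in outcomes if event_b(o)]
    if not in_b:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    in_a_and_b = [o for o in in_b if event_a(o)]
    return Fraction(len(in_a_and_b), len(in_b))


# One roll of a fair six-sided die.
die = [1, 2, 3, 4, 5, 6]

# A: the roll is even. B: the roll is greater than 3.
p = conditional_probability(die, lambda o: o % 2 == 0, lambda o: o > 3)
print(p)  # 2/3
```

Knowing that B happened narrows the possible outcomes to {4, 5, 6}. Two of those three are even, so the probability of A rises from 1/2 to 2/3.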

Variance, Standard Deviation and Other Measures of Variability and Spread

August 9, 2019 Ashutosh Tripathi

Variance and Standard Deviation are the most commonly used measures of variability and spread. Measures of variability and spread tell us how much the data varies around the mean.
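A short sketch shows how these quantities are computed. It implements the population variance by hand and compares it with Python's statistics module. The helper population_variance and the data set are hypothetical examples.

```python
from math import sqrt
from statistics import pstdev, pvariance, variance


def population_variance(values):
    """Mean of the squared deviations from the mean."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))        # population variance
print(sqrt(population_variance(data)))  # population standard deviation
print(pvariance(data), pstdev(data))    # same quantities from the standard library
print(variance(data))                   # sample variance, which divides by n - 1
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the data. The sample variance divides by n - 1 instead of n and is therefore slightly larger.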

Basic Statistics for Data Science – Part 1

April 18, 2019 Chetna Tripathi

Types of Statistics: Descriptive vs Inferential
Basic terminology like Population vs Sample
Types of Variables: Numerical vs Categorical
Measures of central tendency: Mean, Median and Mode and their specific use cases
Measures of dispersion/spread: Variance, standard deviation etc.
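As a small taste of the central tendency measures listed above, the following sketch uses Python's statistics module on a hypothetical salary list with one extreme value.

```python
from statistics import mean, median, mode

# Hypothetical salaries (in thousands); one outlier is included on purpose.
salaries = [30, 32, 32, 35, 200]

print(mean(salaries))    # pulled upward by the outlier
print(median(salaries))  # middle value, robust to the outlier
print(mode(salaries))    # most frequent value
```

The mean is sensitive to extreme values, while the median and mode are not. That is one reason the median is often preferred for skewed numerical data, and the mode for categorical data.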