Tag: Data Science

What is Logistic Regression?

Logistic regression is one of the most widely used machine learning algorithms for classification problems. In its original form it is used for binary classification, where there are only two classes to predict. However, with a small extension, logistic regression can also be used for multi-class classification problems. In this post I will explain binary classification. I will also explain the reason behind maximizing the log-likelihood function.
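
To make the log-likelihood idea concrete, here is a minimal sketch in plain Python. The toy data, the parameter values, and the function names (`sigmoid`, `log_likelihood`) are illustrative assumptions, not taken from the full post. It only evaluates the log-likelihood for two candidate parameter settings; it does not perform the maximization itself.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(weights, bias, X, y):
    """Bernoulli log-likelihood of binary labels y under a logistic model."""
    total = 0.0
    for features, label in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, features))
        p = sigmoid(z)  # predicted probability of class 1
        total += label * math.log(p) + (1 - label) * math.log(1 - p)
    return total

# Hypothetical one-feature data set: small values are class 0, large values class 1.
X = [[0.5], [1.5], [3.0], [4.0]]
y = [0, 0, 1, 1]

# All-zero parameters predict p = 0.5 for every point.
ll_poor = log_likelihood([0.0], 0.0, X, y)
# Hand-picked parameters that separate the two classes reasonably well.
ll_better = log_likelihood([2.0], -4.5, X, y)

print(ll_poor, ll_better)
```

The second setting gives a larger (less negative) log-likelihood, which is the sense in which fitting the model means maximizing this quantity.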

Continue reading “What is Logistic Regression?”

Basic Statistics for Data Science – Part 1

Statistics is the science of collecting, organizing, presenting, analyzing and interpreting data. It is one of the most important disciplines for gaining deeper insight into data. Statistical analysis is used to manipulate, summarize and investigate data so that useful information can be obtained.

Takeaways from this post:

  • Types of Statistics: Descriptive vs Inferential
  • Basic terminology like Population vs Sample
  • Types of Variables: Numerical vs Categorical
  • Measures of central tendency: Mean, Median and Mode and their specific use cases
  • Measures of dispersion/spread: Variance, standard deviation etc.
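
As a quick illustration of the measures listed above, the following sketch uses Python's standard `statistics` module on a small made-up data set (the numbers are an assumption chosen for easy arithmetic). It also shows the sample versus population distinction for variance.

```python
import statistics

# Hypothetical observations, treated first as a sample and then as a full population.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

# Measures of dispersion
sample_var = statistics.variance(data)    # divides by n - 1 (sample)
sample_std = statistics.stdev(data)       # square root of the sample variance
pop_var = statistics.pvariance(data)      # divides by n (population)

print(mean, median, mode, sample_var, sample_std, pop_var)
```

The sample variance is larger than the population variance for the same numbers, because it divides the same sum of squared deviations by n - 1 instead of n.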

Continue reading “Basic Statistics for Data Science – Part 1”

What is the Coefficient of Determination | R Square

The coefficient of determination (R²) is a direct indicator of how well a regression model fits the data. It plays a role for regression similar to the one that accuracy, precision or recall play for classification. In more technical terms, the coefficient of determination is the proportion of the variance in the response variable ‘y’ that can be predicted using the predictor variable ‘x’. It is one of the most common ways to measure the strength of a model.
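
A minimal sketch of the calculation, assuming the common definition R² = 1 - SS_res / SS_tot (residual sum of squares over total sum of squares). The function name `r_squared` and the sample values are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Variation left unexplained by the predictions
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total variation of the response around its mean
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed responses and model predictions
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

score = r_squared(y_true, y_pred)
print(score)
```

Here SS_res is 0.5 and SS_tot is 20, so the score is about 0.975. Perfect predictions give exactly 1, and always predicting the mean of y gives 0.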

Continue reading “What is the Coefficient of Determination | R Square”

What is Linear Regression? Part:1

Linear regression is a statistical method that models the relationship between two continuous variables, known as the predictor and response variables. (Note: when there is more than one predictor variable, it becomes multiple linear regression.)

  • The predictor variable is most often denoted as x and is also known as the independent variable.
  • The response variable is most often denoted as y and is also known as the dependent variable.
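
To show how a predictor x and a response y fit together, here is a small sketch that fits a straight line by ordinary least squares. The excerpt does not name a fitting method, so least squares is an assumption here, as are the function name `fit_line` and the toy data.

```python
def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated exactly from y = 2x + 1
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(x, y)
print(slope, intercept)
```

Because the toy data lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1; with noisy data the same formulas give the line that minimizes the squared vertical errors.
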
Continue reading “What is Linear Regression? Part:1”

Covariance and Correlation

Covariance and correlation are very helpful in understanding the relationship between two continuous variables. Covariance tells whether the two variables vary in the same direction (positive covariance) or in opposite directions (negative covariance). The numerical value of the covariance is hard to interpret on its own because it depends on the units of the variables, so mostly its sign is useful. Correlation, in contrast, is a standardized measure of how strongly the two variables are linearly related. Correlation varies between -1 and +1. A correlation of 0 means there is no linear relationship between the variables; however, some other functional relationship may exist.
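
The sketch below illustrates these points with made-up numbers (the variable names and data are assumptions for illustration). It uses the sample covariance and the Pearson correlation, shows that changing units rescales covariance but not correlation, and ends with a case where the correlation is zero even though y is a function of x.

```python
import math

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical heights (in metres and in centimetres) and weights in kg
heights_m = [1.5, 1.6, 1.7, 1.8]
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [50.0, 58.0, 66.0, 74.0]

cov_m = covariance(heights_m, weights_kg)    # depends on the unit of height
cov_cm = covariance(heights_cm, weights_kg)  # 100 times larger, same sign
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)

# y = x^2 on a symmetric range: clearly related, but not linearly
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_sq = [4.0, 1.0, 0.0, 1.0, 4.0]
corr_sq = correlation(x, y_sq)

print(cov_m, cov_cm, corr_m, corr_cm, corr_sq)
```

The covariance changes by a factor of 100 when height is switched from metres to centimetres, while the correlation stays the same, which is why the correlation is the easier number to interpret.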

Continue reading “Covariance and Correlation”