## Covariance and Correlation

Covariance and correlation both describe the relationship between two continuous variables. Covariance tells us whether the two variables tend to vary in the same direction (positive covariance) or in opposite directions (negative covariance). Its numerical value depends on the units of the variables, so it is hard to interpret on its own; the sign is the useful part. Correlation, on the other hand, also measures how strong the linear relationship is, on a scale from -1 to +1. A correlation of 0 means there is no linear relationship between the variables; some other functional relationship may still exist.

Let’s understand these terms in detail:

**Covariance:**

In the study of covariance, only the sign is directly interpretable. A positive value shows that both variables tend to vary in the same direction, and a negative value shows that they tend to vary in opposite directions.

The covariance between two variables x and y can be calculated as follows:

$$S_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

Where:

- x̄ is the sample mean of x
- ȳ is the sample mean of y
- x_i and y_i are the values of x and y for the i-th record in the sample
- n is the number of records in the sample

**Significance of the formula:**

- Numerator: the sum, over all records, of the deviation of x from its mean multiplied by the deviation of y from its mean.
- Unit of covariance: unit of x multiplied by unit of y.
- Hence, if we change the units of the variables, the covariance takes a new value, but its sign remains the same.
- Therefore the numerical value of covariance is hard to interpret on its own. If it is positive, the variables tend to vary in the same direction; if it is negative, they tend to vary in opposite directions.
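The points above can be checked with a small sketch. It assumes the usual sample covariance with an n - 1 denominator; the function name `sample_covariance` and the height/weight figures are made up purely for illustration.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n  # sample mean of x
    y_bar = sum(y) / n  # sample mean of y
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return numerator / (n - 1)


# Hypothetical data: heights in metres, weights in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [50.0, 58.0, 66.0, 74.0, 82.0]

cov_m_kg = sample_covariance(heights_m, weights_kg)

# The same heights expressed in centimetres instead of metres.
heights_cm = [h * 100 for h in heights_m]
cov_cm_kg = sample_covariance(heights_cm, weights_kg)

print("covariance (m, kg): ", cov_m_kg)
print("covariance (cm, kg):", cov_cm_kg)
```

Switching from metres to centimetres multiplies the covariance by 100, yet it stays positive. This is why the magnitude alone says little while the sign is still meaningful.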

**Correlation:**

Covariance only tells us the direction of the relationship, which is not enough to understand it completely. So we divide the covariance by the product of the standard deviations of x and y to get the correlation coefficient, which varies between -1 and +1.

- Values of -1 and +1 mean that the variables have a perfect linear relationship.
- A negative value means the variables vary in opposite directions: as one increases, the other tends to decrease. The closer the value is to -1, the stronger the linear relationship.
- A positive value means the variables vary in the same direction: as one increases, the other tends to increase. The closer the value is to +1, the stronger the linear relationship.
- If the correlation coefficient is 0, there is no linear relationship between the variables; some other functional relationship could still exist.
- If two variables are completely unrelated (independent), their correlation coefficient is 0 (in a sample, close to 0). The reverse is not guaranteed: a coefficient of 0 only rules out a linear relationship.
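To make these cases concrete, here is a minimal sketch on made-up data. The helper name `pearson_r` is an assumption for illustration. It computes the coefficient directly from sums of deviations, which gives the same result as dividing the covariance by the two standard deviations because the n - 1 factors cancel.

```python
import math


def pearson_r(x, y):
    """Correlation coefficient computed from sums of deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]

y_increasing = [3 * xi + 1 for xi in x]   # perfect increasing line
y_decreasing = [-3 * xi + 1 for xi in x]  # perfect decreasing line
y_square = [xi ** 2 for xi in x]          # exact, but non-linear, relationship

r_increasing = pearson_r(x, y_increasing)
r_decreasing = pearson_r(x, y_decreasing)
r_square = pearson_r(x, y_square)

print(r_increasing, r_decreasing, r_square)
```

The two straight lines give coefficients of +1 and -1. For `y_square`, y is fully determined by x, yet the coefficient is 0 because the relationship is symmetric around the mean of x rather than linear.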

The correlation between x and y can be calculated as follows:

$$r_{xy} = \frac{S_{xy}}{S_x S_y}$$

Where:

- S_xy is the covariance between x and y.
- S_x and S_y are the standard deviations of x and y respectively.
- r_xy is the correlation coefficient.
- The correlation coefficient is a dimensionless quantity. Hence, even if we change the units of x and y, its value remains the same.
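As a rough check of the dimensionless property, the sketch below divides a sample covariance by the two sample standard deviations and repeats the calculation after a change of units. The data, the function names, and the kilogram-to-pound factor of 2.20462 are illustrative assumptions; `statistics.stdev` from the Python standard library computes the sample standard deviation.

```python
import statistics


def sample_covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def correlation(x, y):
    """r_xy = S_xy / (S_x * S_y), using sample standard deviations."""
    return sample_covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))


# Hypothetical data: heights in metres, weights in kilograms.
heights_m = [1.52, 1.60, 1.71, 1.79, 1.88]
weights_kg = [51.0, 60.0, 64.0, 77.0, 80.0]
r_m_kg = correlation(heights_m, weights_kg)

# Same measurements in centimetres and pounds.
heights_cm = [h * 100 for h in heights_m]
weights_lb = [w * 2.20462 for w in weights_kg]
r_cm_lb = correlation(heights_cm, weights_lb)

print("r (m, kg): ", r_m_kg)
print("r (cm, lb):", r_cm_lb)
```

Both calls print the same coefficient up to floating-point rounding, because the unit factors in the covariance are cancelled by the same factors in the standard deviations.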

Let’s understand the significance of the correlation coefficient with the help of the graph below:

