In machine learning, it is important to understand the different performance metrics, and even more important to know which one to use to explain model performance correctly. In classification problems, and binary classification in particular, an evaluation is rarely complete without plotting the Precision-Recall curve and the ROC curve. In this post, you will learn the main difference between the Precision-Recall curve and the ROC curve, and when to use which one.
The ID3 algorithm can be used to construct a decision tree for regression problems by replacing Information Gain with Standard Deviation Reduction (SDR).
A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar values (homogeneous data).
Here, standard deviation is used to measure the homogeneity of a numerical sample (the target variable).
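As a rough illustration of the idea, SDR for one candidate split can be computed as the standard deviation of the target before the split minus the size-weighted standard deviation of the target within each subset. The sketch below is a minimal version that assumes a categorical feature and the population standard deviation; the function name `sdr` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import pstdev


def sdr(feature_values, targets):
    """Standard deviation reduction from splitting `targets` on a categorical feature."""
    # Group the target values by the feature value they belong to.
    groups = defaultdict(list)
    for value, target in zip(feature_values, targets):
        groups[value].append(target)

    # Weighted average of the per-subset standard deviations.
    n = len(targets)
    weighted_std = sum(len(g) / n * pstdev(g) for g in groups.values())

    # Reduction = spread before the split minus spread after the split.
    return pstdev(targets) - weighted_std


# Toy data (made up for illustration): the split separates low and high targets.
outlook = ["sunny", "sunny", "rainy", "rainy"]
hours_played = [10, 12, 30, 32]
print(sdr(outlook, hours_played))
```

A larger SDR means the subsets are more homogeneous than the parent node, so a tree builder along these lines would pick the feature with the highest SDR at each step.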
What is the covariance coefficient?
Covariance tells you whether two random variables vary together and, if they do, whether they move in the same direction or in opposite directions. If both variables move in the same direction, the covariance is positive; if they move in opposite directions, the covariance is negative.
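The sign behaviour is easy to check numerically. The following is a small sketch that assumes the sample covariance (dividing by n - 1); the function name `covariance` and the data are illustrative only.

```python
def covariance(xs, ys):
    """Sample covariance of two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means, divided by n - 1.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


x = [1, 2, 3, 4]
same_direction = [2, 4, 6, 8]      # rises as x rises
opposite_direction = [8, 6, 4, 2]  # falls as x rises

print(covariance(x, same_direction))      # positive value
print(covariance(x, opposite_direction))  # negative value
```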
MLOps is the union of DevOps, machine learning, and data engineering. Building on existing DevOps practices, MLOps solutions aim to increase reusability, facilitate automation, manage data drift, support model versioning, experiment tracking, and continuous training, and extract richer, more consistent insights from a machine learning project.
The ROC curve helps you determine the threshold for binary classification problems in machine learning. Classifiers usually output a probability, and 0.5 is not always the right threshold; it depends on the type and domain of the problem. For example, in a legal case you want the number of false positives to be as low as possible, so the threshold would be very high. The term AUC, the area under the curve, summarizes how well the model separates the two classes across all thresholds. It is used to compare different classifiers and identify which one performs best.
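To make the threshold trade-off concrete, here is a minimal sketch in plain Python. It assumes labels are 0 or 1 and that a score at or above the threshold is predicted positive; the helper names `rates_at` and `auc` and the scores are made up for illustration. AUC is computed here through its ranking interpretation, the fraction of (positive, negative) pairs in which the positive example gets the higher score.

```python
def rates_at(threshold, y_true, y_score):
    """Return (true positive rate, false positive rate) at the given threshold."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, fp / negatives


def auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_score = [0.1, 0.3, 0.45, 0.5, 0.6, 0.7, 0.8, 0.95]

for threshold in (0.5, 0.65):
    tpr, fpr = rates_at(threshold, y_true, y_score)
    print(f"threshold={threshold}: TPR={tpr}, FPR={fpr}")
print("AUC:", auc(y_true, y_score))
```

On this toy data, raising the threshold from 0.5 to 0.65 removes the only false positive at the cost of missing one true positive, which is the kind of trade-off the ROC curve lets you read off.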
Let me start with a simple question: can we compare a mango and an apple? They differ in taste, sweetness, health benefits, and so on. A comparison is fair only between similar entities; otherwise it will be biased. The same logic applies to machine learning. Feature scaling brings features onto the same scale before we compare them or build a model. Normalization and standardization are the two most frequently used feature scaling techniques.
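As a quick sketch of the two techniques, assuming the usual definitions (min-max normalization to the range [0, 1], and standardization to zero mean and unit standard deviation), both can be written in a few lines. The function names and the sample values are hypothetical, and the sketch assumes the feature is not constant.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization (z-score): shift to mean 0 and scale to standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Two made-up features on very different scales.
ages = [20, 30, 40, 50, 60]
salaries = [20000, 35000, 50000, 80000, 120000]

print(normalize(ages))
print(normalize(salaries))
print(standardize(ages))
print(standardize(salaries))
```

After either transformation, the two features occupy comparable ranges, so neither dominates a distance computation or a gradient step simply because of its units.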