Feature Store in Machine Learning

Feature store in machine learning is the concept to store features in both online and offline stores for model training and serving purposes. Feature store make sure to provide the consistency between the data used for model training and the data used during online serving to models. In other words, it guarantees that you’re serving the same data to models during training and prediction, eliminating training-prediction skew. Feast is one of the open source tools used for feature store.

Continue reading

Homogeneous Data Nodes

A Complete Guide to Decision Tree Formation and Interpretation in Machine Learning

Decision Tree is supervised machine learning algorithm which is used for both types of problems regression (that is predicting the continuous value for future example house price, hours the match can be played given overcast condition etc…) and classification (that is classifying different objects into respective categories or classes for example given the overcast conditions match will be played or not, given image belongs to cat or dog etc…).

Precision-Recall vs ROC-AUC curve

What is the difference between Precision-Recall Curve vs ROC-AUC curve?

In Machine Learning, it is very important to have good understanding of different performance metrics. And it is even more important to know when to use which one to correctly explain the model performance. In classification problems more specific to binary classification, you can not conclude your model without plotting Precision-Recall curve and ROC-AUC curve. In this post, will learn what is the main difference between Precision-Recall curve and ROC-AUC curve and when to use which one.

Decision Tree for Regression

Decision Tree for Regression Models in Machine Learning

The ID3 algorithm can be used to construct a decision tree for regression type problems by replacing Information Gain with Standard Deviation Reduction – SDR
A decision tree is built top down from a root node and involves partitioning the data into subsets that contain instances with similar values mean homogeneous data.
Here, standard deviation is used to calculate the homogeneity of a numerical sample (target variable).

1 2 3 13