When working with regression analysis, you should be familiar with some very basic but very impactful concepts. In machine learning interviews, you can almost always expect questions on regression analysis. Regression analysis also develops a basic understanding of machine learning model building, since most of us start our machine learning journey with it.
A feature store in machine learning is a system that stores features in both online and offline stores for model training and serving. It keeps the data used for model training consistent with the data served to models online. In other words, it helps ensure that models see the same feature data during training and prediction, which reduces training-serving skew. Feast is one of the open source tools used as a feature store.
Experiment tracking is the process of recording all the important components of an experiment, such as hyperparameters, metrics, models, and artifacts like plots, PNG images, and other files. It makes it possible to reproduce old results from the stored parameters.
Statistics is a branch of mathematics concerned with the collection, analysis, interpretation, and visualization of empirical data. Its two major areas are descriptive statistics and inferential statistics. Descriptive statistics describe the characteristics of sample and population data (what has happened). Inferential statistics use these properties to test hypotheses, reach conclusions, and make predictions (what you can expect).
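To make the distinction concrete, here is a minimal sketch using only Python's standard library. The sample values are made up for illustration, and the confidence interval uses a normal approximation, which is an assumption chosen for brevity (for a sample this small, a t-interval would usually be preferred).

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 10 people drawn from a larger population.
sample = [162.0, 175.5, 168.2, 181.0, 159.4, 170.3, 177.8, 165.1, 172.6, 169.9]

# Descriptive statistics: summarize what was observed in the sample.
mean = statistics.mean(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: a rough 95% confidence interval for the population
# mean, using a normal approximation (assumption made for this sketch).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * stdev / math.sqrt(len(sample))
ci_low, ci_high = mean - margin, mean + margin

print(f"mean={mean:.2f}, stdev={stdev:.2f}")
print(f"approx. 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first two values only describe the ten observations; the interval is a statement about the unseen population, which is what makes it inferential.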
In machine learning, it is very important to have a good understanding of the different performance metrics, and even more important to know when to use which one to explain model performance correctly. In classification problems, and binary classification in particular, you should not draw conclusions about your model without plotting the Precision-Recall curve and the ROC curve (summarized by its AUC). In this post, we will learn the main difference between the Precision-Recall curve and the ROC curve, and when to use which one.
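As a rough illustration of how the two curves relate, the sketch below computes the points of both curves from the same confusion counts in plain Python. The labels and scores are invented, and the helper names `confusion_counts` and `curve_points` are hypothetical; it assumes both classes are present in `y_true`.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def curve_points(y_true, scores):
    """Return ROC points (FPR, TPR) and PR points (recall, precision)."""
    roc, pr = [], []
    # Use each distinct score as a threshold, from strictest to loosest.
    for t in sorted(set(scores), reverse=True):
        tp, fp, fn, tn = confusion_counts(y_true, scores, t)
        recall = tp / (tp + fn)      # also the true positive rate
        fpr = fp / (fp + tn)         # false positive rate
        precision = tp / (tp + fp)   # tp + fp >= 1 because t is an actual score
        roc.append((fpr, recall))
        pr.append((recall, precision))
    return roc, pr


# Hypothetical imbalanced data: 2 positives among 10 examples.
y_true = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

roc, pr = curve_points(y_true, scores)
for (fpr, tpr), (recall, precision) in zip(roc, pr):
    print(f"FPR={fpr:.3f} TPR={tpr:.3f} | recall={recall:.3f} precision={precision:.3f}")
```

Both curves share the recall axis (TPR); they differ in the second quantity. The false positive rate is divided by the number of negatives, while precision is divided by the number of predicted positives, so the two react differently when negatives heavily outnumber positives.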
The ID3 algorithm can be used to construct a decision tree for regression problems by replacing Information Gain with Standard Deviation Reduction (SDR).
A decision tree is built top-down from a root node by partitioning the data into subsets that contain instances with similar values, that is, homogeneous data.
Here, standard deviation is used to measure the homogeneity of a numerical sample (the target variable).
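The idea can be sketched in a few lines of Python. SDR is taken here as the standard deviation of the target before the split minus the size-weighted standard deviation of the subsets after it. The function name `sdr`, the toy data, and the choice of the population standard deviation (`statistics.pstdev`) are assumptions made for this illustration.

```python
import statistics


def sdr(targets, groups):
    """Standard deviation reduction from splitting `targets` by `groups` labels."""
    n = len(targets)
    sd_before = statistics.pstdev(targets)

    # Collect the target values that fall into each branch of the split.
    subsets = {}
    for label, y in zip(groups, targets):
        subsets.setdefault(label, []).append(y)

    # Weight each subset's standard deviation by its share of the data.
    sd_after = sum(len(ys) / n * statistics.pstdev(ys) for ys in subsets.values())
    return sd_before - sd_after


# Hypothetical target values and two candidate categorical features.
targets = [10, 12, 30, 32]
good_split = ["a", "a", "b", "b"]   # separates low values from high values
poor_split = ["a", "b", "a", "b"]   # mixes low and high values in each branch

print(f"SDR (good split): {sdr(targets, good_split):.3f}")
print(f"SDR (poor split): {sdr(targets, poor_split):.3f}")
```

The attribute with the larger SDR produces the more homogeneous subsets, so it would be chosen for the split at that node.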