Multicollinearity arises in a multiple linear regression model, that is, a model with more than one predictor variable. It exists when one predictor variable (note: not the target variable) can be linearly predicted from the other predictor variables with a substantial degree of accuracy. Highly correlated predictors therefore imply multicollinearity, but the converse does not hold: multicollinearity may exist even when the pairwise correlations among predictors are low.
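The point about low pairwise correlation can be illustrated with a small sketch. It uses the variance inflation factor (VIF), a common diagnostic that is not mentioned in the excerpt above, and assumes NumPy is available. The helper `vif`, the simulated data, and the thresholds in the comments are illustrative assumptions, not part of the original post.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features).

    Each predictor is regressed on all the others (plus an intercept);
    VIF_j = 1 / (1 - R_j^2), so a large value means predictor j is
    almost a linear combination of the remaining predictors.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Simulated predictors: x4 is nearly the sum of three independent variables.
rng = np.random.default_rng(0)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
x4 = x1 + x2 + x3 + rng.normal(scale=0.1, size=n)
X = np.column_stack([x1, x2, x3, x4])

pairwise = np.corrcoef(X, rowvar=False)  # no single pair is strongly correlated
vifs = vif(X)                            # yet every VIF is very large

print(np.round(pairwise, 2))
print(np.round(vifs, 1))
```

No pair of columns looks alarming in the correlation matrix, but the VIFs flag the problem because they consider all the other predictors jointly rather than one pair at a time.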

## What is stepAIC in R?

In R, stepAIC is one of the most commonly used search methods for feature selection. We keep minimizing the AIC value to arrive at the final set of features. stepAIC does not necessarily improve model performance; rather, it is used to simplify the model without much impact on performance. AIC quantifies the amount of information lost due to this simplification. AIC stands for Akaike Information Criterion.
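To make the quantity being minimized concrete, here is a minimal sketch in Python (not R, and not the stepAIC implementation itself). It assumes NumPy and uses the Gaussian least-squares form of AIC, `n * ln(RSS / n) + 2k`, which is correct only up to an additive constant; that is enough for comparing models fitted to the same data. The helpers `aic_ols` and `backward_step` and the simulated data are hypothetical.

```python
import numpy as np

def aic_ols(X, y):
    """AIC of a least-squares fit with intercept, up to an additive constant."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    k = A.shape[1]  # number of estimated coefficients, intercept included
    return n * np.log(rss / n) + 2 * k

def backward_step(X, y, names):
    """AIC obtained after dropping each feature in turn (one backward step)."""
    return {
        name: aic_ols(np.delete(X, j, axis=1), y)
        for j, name in enumerate(names)
    }

# Simulated data: x1 matters a lot, x2 a little, x3 is pure noise.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

names = ["x1", "x2", "x3"]
full_aic = aic_ols(X, y)
drop_aic = backward_step(X, y, names)

print("full model:", round(full_aic, 2))
for name, value in drop_aic.items():
    print("without", name, "->", round(value, 2))
```

A stepwise search would remove the feature whose removal gives the lowest AIC, provided that value is below the AIC of the current model, and then repeat. Dropping a feature can lower AIC by at most 2 here, because the residual sum of squares can only grow while the penalty term shrinks by exactly 2.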

## Feature Selection Techniques in Regression Model

Feature selection is a way to reduce the number of features and hence the computational complexity of the model. It is often very useful for overcoming the overfitting problem. Feature selection helps us determine the smallest set of features needed to predict the response variable with high accuracy. The question to ask of the model is: does adding new features necessarily increase performance significantly? If not, why add features that only increase model complexity?

## Important Use Cases of NLP

NLP can organize unstructured data and perform several automated tasks such as automatic summarization, sentiment analysis, and speech recognition.

## Basic Statistics for Data Science – Part 1

Types of Statistics: Descriptive vs Inferential

Basic terminology like Population vs Sample

Types of Variables: Numerical vs Categorical

Measures of central tendency: Mean, Median and Mode and their specific use cases

Measures of dispersion/spread: Variance, standard deviation etc.
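As a quick illustration of the measures listed above, the following sketch uses Python's standard `statistics` module on a small made-up sample. The data values are assumptions chosen only to keep the arithmetic easy to follow.

```python
import statistics as st

sample = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = st.mean(sample)
median = st.median(sample)
mode = st.mode(sample)

# Measures of dispersion
sample_var = st.variance(sample)   # divides by n - 1 (sample variance)
pop_var = st.pvariance(sample)     # divides by n (population variance)
sample_sd = st.stdev(sample)       # square root of the sample variance

# One extreme value pulls the mean far more than the median.
with_outlier = sample + [100]
mean_outlier = st.mean(with_outlier)
median_outlier = st.median(with_outlier)

print(mean, median, mode)
print(sample_var, round(pop_var, 3), round(sample_sd, 3))
print(mean_outlier, median_outlier)
```

The last two values show one typical use case: the median is usually preferred over the mean when the data contain outliers or are strongly skewed.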

## What is the Coefficient of Determination | R Square

The Coefficient of Determination measures the proportion of the variance in the response variable ‘y’ that can be predicted from the predictor variable ‘x’. It is the most common way to measure how well a regression model fits the data.
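A minimal sketch of the calculation, using the standard definition R² = 1 − SS_res / SS_tot. The function name `r_squared` and the sample values are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Variation left unexplained by the predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total variation of y around its mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y = [3, 5, 7, 9]

perfect = r_squared(y, [3, 5, 7, 9])          # predictions match exactly
baseline = r_squared(y, [6, 6, 6, 6])         # always predicting the mean of y
close = r_squared(y, [2.8, 5.1, 7.2, 8.9])    # small prediction errors

print(perfect, baseline, close)
```

A model that always predicts the mean of y scores 0, and a perfect model scores 1, which is why the value is read as the fraction of variance explained.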

## Employee Attrition Rate Analysis – Insights from IBM HR Data

Storytelling, or presenting insights, is the most important part of data analytics. It is the selling point of all your hard work. It doesn’t matter how much effort you have put into developing an analytic model unless you are able to get the attention of the target audience. In this article, my focus is on how we can use beautiful graphs to show insights about the employee attrition rate from the IBM HR Attrition data. After all, a picture is worth a thousand words.

## What is Linear Regression? Part:1

Linear Regression is a statistical method for modeling the relationship between two continuous variables, known as the Predictor and Response variables. The Predictor variable is most often denoted as x and is also known as the Independent variable. The Response variable is most often denoted as y and is also known as the Dependent variable.
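For a single predictor x and response y, the least-squares line can be sketched in a few lines of plain Python. The function name `fit_line` and the sample data are assumptions for illustration; the formulas are the standard ones (slope = S_xy / S_xx, intercept = mean(y) − slope · mean(x)).

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Predictor (independent variable) and response (dependent variable)
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)
predicted = intercept + slope * 6  # prediction for a new value x = 6

print(slope, intercept, predicted)
```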

## Covariance and Correlation

Covariance and Correlation are very helpful for understanding the relationship between two continuous variables. Covariance tells whether the two variables vary in the same direction (positive covariance) or in opposite directions (negative covariance). Correlation additionally measures how strongly a change in one variable is associated with a proportional change in the other, on a standardized scale from -1 to 1.
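The difference between the two can be seen in a short sketch: covariance changes with the units of measurement, while correlation does not. The function names and the sample data (hours studied versus test score) are illustrative assumptions.

```python
import math

def covariance(x, y):
    """Sample covariance (n - 1 denominator)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]
minutes = [h * 60 for h in hours]  # same quantity, different unit

cov_hours = covariance(hours, score)
cov_minutes = covariance(minutes, score)    # scales with the unit
corr_hours = correlation(hours, score)
corr_minutes = correlation(minutes, score)  # unaffected by the unit

# Reversing one variable flips the direction of the relationship.
cov_reversed = covariance(hours, score[::-1])

print(cov_hours, cov_minutes)
print(corr_hours, corr_minutes)
print(cov_reversed)
```

Because correlation is unit-free and bounded, it can be compared across pairs of variables, whereas the magnitude of a covariance is hard to interpret on its own.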

## What is Linear Regression? Part:2

In any business there are some easy-to-measure variables, like age, gender, income, education level etc., and there are some difficult to measure

## Necessary Privileges for Creating Database Links

A database link is a pointer in the local database that lets you access objects on a remote database. To create a private database link,

## What is Lost Update Problem in DBMS?

In a schedule, if the update performed by transaction T1 on data item ‘X’ gets overwritten by the update performed by transaction T2 on the same data

## Serial Schedules, Concurrent Schedules and Conflict Operations

A schedule is the representation of the execution sequence for all the instructions of the transactions. Schedules are categorized into two types: Serial Schedules and Concurrent Schedules.
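The two kinds of schedule, and the conflict operations named in the title, can be sketched as follows. The excerpt does not define a conflict, so this assumes the usual textbook rule: two operations conflict when they belong to different transactions, touch the same data item, and at least one of them is a write. The `Op` type and both helper functions are hypothetical.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str     # transaction name, e.g. "T1"
    action: str  # "R" for read, "W" for write
    item: str    # data item, e.g. "X"

def is_serial(schedule):
    """True if every transaction's operations run as one contiguous block."""
    order = []
    for op in schedule:
        if order and order[-1] == op.txn:
            continue          # still inside the current transaction
        if op.txn in order:
            return False      # a transaction resumed after another one ran
        order.append(op.txn)
    return True

def conflicts(a, b):
    """Assumed textbook rule for a pair of conflicting operations."""
    return a.txn != b.txn and a.item == b.item and "W" in (a.action, b.action)

r1, w1 = Op("T1", "R", "X"), Op("T1", "W", "X")
r2, w2 = Op("T2", "R", "X"), Op("T2", "W", "X")

serial = [r1, w1, r2, w2]      # T1 finishes before T2 starts
concurrent = [r1, r2, w1, w2]  # operations of T1 and T2 are interleaved

print(is_serial(serial), is_serial(concurrent))
print(conflicts(r1, w2), conflicts(r1, r2), conflicts(r1, w1))
```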

## Concurrent Execution in Transaction | DBMS

Transaction-processing systems usually allow multiple transactions to run concurrently. Allowing multiple transactions to update data concurrently causes several complications with consistency of the data. Ensuring

## Difference between Normalization and Normal Forms

Normalization: Normalization is the systematic process applied to relations to reduce the degree of redundancy. Normalization is described as systematic because it always gives

## Implementation of Atomicity and Durability using Shadow Copy

The recovery-management component of a database system can support atomicity and durability by a variety of schemes. Here we are going to learn about one of

## Convert local path to UNC (Universal) File Path using Java Script ActiveXObject

In the Razor view engine, add the following line to support ActiveXObject: @using System.Web.Script.Serialization

## DBMS Notes for GATE 2020

Relational Model Relational Algebra – Degree, Cardinality, Domain, Union Compatibility and Operators Cartesian Product Join Operators in DBMS Join Conditions – Natural Join, On Condition,

## Types of Functional Dependencies in Normalization

Functional Dependency: In a relational database, a functional dependency is denoted as X -> Y, where X is the determinant and Y is the dependent. So, as per the concept the value of Y