How to Use LinkedIn to Drive Traffic to Your Blog

LinkedIn is a professional networking platform where employers and employees can connect with each other. LinkedIn had 630 million registered members in 200 countries as of June 2019, and two new members join every second. These numbers are larger than the populations of some countries.

Read more

What is Logistic Regression?

Logistic regression is one of the most widely used machine learning algorithms for classification problems. In its original form it is used for binary classification problems, which have only two classes to predict. However, with a little extension and some thought, logistic regression can easily be used for multiclass classification problems. In this post I will explain binary classification. I will also explain the reason behind maximizing the log-likelihood function.
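
As a rough sketch of the idea in this excerpt, the snippet below computes the sigmoid and the log-likelihood for binary labels. The toy data, the single-feature model, and the parameter values are made up for illustration and do not come from the post.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Sum of the log-probabilities the model assigns to the observed 0/1 labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)  # predicted probability of class 1
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

# Hypothetical toy data: one feature, one binary label.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# Parameters that separate the classes better get a higher log-likelihood.
poor = log_likelihood(0.1, 0.0, xs, ys)
better = log_likelihood(2.0, 0.0, xs, ys)
```

Training a logistic regression amounts to searching for the w and b that make this quantity as large as possible.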

Read more

What is Multicollinearity?

Multicollinearity occurs in a multiple linear regression model, where we have more than one predictor variable. It exists when one predictor variable (note: not the target variable) can be linearly predicted from the other predictor variables with a significant degree of accuracy. It typically means two or more predictor variables are highly correlated. The converse does not hold, however: multicollinearity may exist even when pairwise correlations among the predictors are low.
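
One common way to put a number on this is the variance inflation factor (VIF), which is not named in the excerpt and is used here only as an illustration. The sketch below assumes just two predictors, in which case VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The data values are invented.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors: x2 is roughly 2 * x1, x3 is uncorrelated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
x3 = [1.0, -2.0, 2.0, -2.0, 1.0]

vif_collinear = vif_two_predictors(x1, x2)    # very large
vif_independent = vif_two_predictors(x1, x3)  # close to 1
```

A VIF near 1 means the predictor is not explained by the other one; a frequently quoted rule of thumb treats values above about 10 as a warning sign.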

Read more

What is stepAIC in R?

In R, stepAIC is one of the most commonly used search methods for feature selection. It keeps minimizing the AIC value to arrive at the final set of features. “stepAIC” does not necessarily improve model performance; rather, it is used to simplify the model without much impact on performance. AIC quantifies the amount of information lost due to this simplification. AIC stands for Akaike Information Criterion.
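
To make the criterion concrete, here is a small sketch (in Python rather than R) of the textbook definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The log-likelihood values and parameter counts below are hypothetical, and R's own stepAIC may report values that differ from this formula by a constant.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 * ln(L). Lower is better."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical models: the reduced model fits slightly worse
# (lower log-likelihood) but uses half as many parameters.
full_model = aic(log_likelihood=-120.0, num_params=6)
reduced_model = aic(log_likelihood=-121.0, num_params=3)

# A stepwise search would keep the candidate with the lower AIC.
preferred = "reduced" if reduced_model < full_model else "full"
```

Here the small loss in fit is outweighed by the simpler model, so the reduced model is preferred.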

Read more

Feature Selection Techniques in Regression Model

Feature selection is a way to reduce the number of features and hence the computational complexity of the model. It is often very useful for overcoming overfitting. It helps us determine the smallest set of features needed to predict the response variable with high accuracy. The question to ask of the model is: does adding new features significantly improve performance? If not, why add features that only increase model complexity?

Read more

Important Use Cases of NLP

In today’s world we generate a large amount of data every second. While tweeting, chatting, writing, or even speaking, we are creating a corpus of data. Most of this data is in textual, unstructured form. Hence, to make this data understandable by a computer, we need to process it. NLP techniques help us process the data and get useful insights from it.

Read more

Basic Statistics for Data Science – Part 1

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data. It is one of the most important disciplines for getting a deeper insight into data. Statistical analysis is used to manipulate, summarize, and investigate data so that useful information can be obtained.

Takeaways from this post:

  • Types of Statistics: Descriptive vs Inferential
  • Basic terminology like Population vs Sample
  • Types of Variables: Numerical vs Categorical
  • Measures of central tendency: Mean, Median and Mode and their specific use cases
  • Measures of dispersion/spread: Variance, standard deviation etc.
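
The measures listed above can be tried out with Python's built-in statistics module. This is only a quick sketch on an invented sample.

```python
import statistics

# Hypothetical sample of seven observations.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency.
mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

# Measures of dispersion.
sample_var = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)      # square root of the sample variance
pop_var = statistics.pvariance(data)    # divides by n
```

The sample variance divides by n - 1 and the population variance by n, which ties back to the population vs sample distinction in the list.
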
Read more

What is the Coefficient of Determination | R Square

The coefficient of determination is a direct indicator of how well a regression model fits the data. In more technical terms, it is the proportion of the variance in the response variable ‘y’ that can be predicted from the predictor variable ‘x’. It is one of the most common ways to measure the strength of a regression model.
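
A minimal sketch of the usual formula, R^2 = 1 - SS_res / SS_tot, is shown below. The observed and predicted values are invented for illustration.

```python
def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: variation around the mean of y.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed values and model predictions.
y = [3.0, 5.0, 7.0, 9.0]
predictions = [2.8, 5.1, 7.2, 8.9]

r2 = r_squared(y, predictions)

# A model that always predicts the mean explains none of the variance.
r2_baseline = r_squared(y, [6.0, 6.0, 6.0, 6.0])
```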

Read more

Employee Attrition Rate Analysis – Insights from IBM HR Data

Storytelling, or presenting insights, is the most important part of data analytics. This is the selling point of all your hard work. It doesn’t matter how much hard work you have put into developing an analytic model unless you are able to get the attention of the target audience. In this article, my focus is on how we can use beautiful graphs to show insights about the employee attrition rate from the IBM HR Attrition data. After all, a picture is worth a thousand words.

Read more

What is Linear Regression? Part:1

Linear regression is a statistical method that models the relationship between two continuous variables, known as the predictor and response variables. (Note: when there is more than one predictor variable, it becomes multiple linear regression.)

  • The predictor variable is most often denoted as x and is also known as the independent variable.
  • The response variable is most often denoted as y and is also known as the dependent variable.
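
With one predictor x and one response y, the best-fit line can be sketched with the ordinary least squares formulas. The data below is invented and lies exactly on y = 2x + 1 so the result is easy to check.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6.0 + intercept  # predicted response at x = 6
```
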
Read more

Covariance and Correlation

Covariance and correlation are very helpful in understanding the relationship between two continuous variables. Covariance tells us whether the two variables vary in the same direction (positive covariance) or in opposite directions (negative covariance). The numerical value of covariance is hard to interpret because it depends on the units of the variables; only the sign is useful. Correlation, in contrast, also measures how strongly a change in one variable is associated with a change in the other. Correlation varies between -1 and +1. A correlation of 0 means there is no linear relationship between the variables; however, another functional relationship may still exist.
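
The sketch below illustrates these points on made-up data: changing the units changes the covariance but not the correlation, and a perfect non-linear relationship can still have a correlation of 0.

```python
import math

def covariance(a, b):
    """Sample covariance (divides by n - 1)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def correlation(a, b):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))

# Hypothetical data: study time and exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 68.0]
minutes = [h * 60 for h in hours]  # same quantity, different unit

cov_hours = covariance(hours, score)
cov_minutes = covariance(minutes, score)  # 60 times larger
corr_hours = correlation(hours, score)
corr_minutes = correlation(minutes, score)  # unchanged by the rescaling

# y = x^2 is a perfect functional relationship with zero linear correlation.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]
corr_quadratic = correlation(x, y)
```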

Read more

What is Linear Regression? Part:2

In any business there are some easy-to-measure variables, like age, gender, income, and education level, and some difficult-to-measure variables, like the amount of loan to give, the number of days a patient will stay in the hospital, or the price of a house after 10 years. Regression is the technique that enables you to determine difficult-to-measure variables with the help of easy-to-measure variables.

Read more

Necessary Privileges for Creating Database Links

A database link is a pointer in the local database that lets you access objects on a remote database. To create a private database link, you must have been granted the proper privileges. The following table illustrates which privileges are required on which database for which type of link:

Privilege                     Database   Required For
CREATE DATABASE LINK          Local      Creation of a private database link.
CREATE PUBLIC DATABASE LINK   Local      Creation of a public database link.
CREATE SESSION                Remote     Creation of any type of database link.

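To see which privileges you currently have available, query ROLE_SYS_PRIVS. For example, you could create and execute the following privs.sql script:

SELECT DISTINCT PRIVILEGE AS "Database Link Privileges"
FROM ROLE_SYS_PRIVS
WHERE PRIVILEGE IN ('CREATE SESSION', 'CREATE DATABASE LINK',
                    'CREATE PUBLIC DATABASE LINK')
/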

Or just execute the following query to see all the permissions for the current user:

Source: Oracle Docs

What is Lost Update Problem in DBMS?

In a schedule, if the update performed by transaction T1 on data item ‘X’ gets overwritten by the update performed by transaction T2 on the same data item ‘X’, then we say that the update of T1 is lost to the update of T2.

This problem is known as the lost update problem in concurrent schedules.


With X = 50 and Y = 50 (initial values):

read(x) (T1I1)
x=x+10 (T1I2)
read(x) (T2I1)
x=x+20 (T2I2)
write(x) (T1I3)
read(y) (T1I4)
write(x) (T2I3)
commit (T2I4)
y=y+10 (T1I5)
write(y) (T1I6)
commit (T1I7)
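
The schedule above can be replayed step by step in a few lines of Python. This is only a simulation sketch: the dictionaries standing in for the database and for each transaction's local workspace are assumptions made for illustration, not how a real DBMS stores data.

```python
# Shared database state with the initial values from the schedule.
db = {"x": 50, "y": 50}

# Each transaction works on its own local copies of the items it reads.
t1, t2 = {}, {}

t1["x"] = db["x"]   # T1: read(x)   -> local x = 50
t1["x"] += 10       # T1: x = x+10  -> local x = 60
t2["x"] = db["x"]   # T2: read(x)   -> still 50, T1 has not written yet
t2["x"] += 20       # T2: x = x+20  -> local x = 70
db["x"] = t1["x"]   # T1: write(x)  -> database x = 60
t1["y"] = db["y"]   # T1: read(y)   -> local y = 50
db["x"] = t2["x"]   # T2: write(x)  -> database x = 70, T1's update is lost
                    # T2: commit
t1["y"] += 10       # T1: y = y+10  -> local y = 60
db["y"] = t1["y"]   # T1: write(y)  -> database y = 60
                    # T1: commit

# Running the two transactions one after the other applies both updates.
serial_x = 50 + 10 + 20
```

The concurrent schedule ends with x = 70, whereas any serial order would give x = 80: the +10 from T1 has disappeared.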


Read more

Serial Schedules, Concurrent Schedules and Conflict Operations

A schedule represents the execution sequence of all the instructions of the transactions. Schedules are categorized into two types:

  • Serial Schedules
  • Concurrent Schedules

Serial Schedules:

A schedule is said to be serial if and only if all the instructions of each transaction get executed non-preemptively as a unit. OR

Each serial schedule consists of a sequence of instructions from various transactions, where the instructions belonging to one single transaction appear together in that schedule.
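
The second definition translates directly into a check: a schedule is serial when, once a transaction's instructions stop, that transaction never appears again. The sketch below assumes a schedule is represented as a list of (transaction, operation) pairs, which is a representation chosen here for illustration.

```python
def is_serial(schedule):
    """Return True if each transaction's instructions appear together."""
    finished = set()  # transactions whose block of instructions has ended
    current = None    # transaction currently executing
    for txn, _operation in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn resumed after another one ran: interleaved
            if current is not None:
                finished.add(current)
            current = txn
    return True

# Hypothetical schedules over two transactions.
serial = [("T1", "read(x)"), ("T1", "write(x)"),
          ("T2", "read(x)"), ("T2", "write(x)")]
concurrent = [("T1", "read(x)"), ("T2", "read(x)"),
              ("T1", "write(x)"), ("T2", "write(x)")]

serial_result = is_serial(serial)
concurrent_result = is_serial(concurrent)
```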

Read more

Concurrent Execution in Transaction | DBMS

Transaction-processing systems usually allow multiple transactions to run concurrently. Allowing multiple transactions to update data concurrently causes several complications with consistency of the data.

Ensuring consistency in spite of concurrent execution of transactions requires extra work; it is far easier to insist that transactions run serially—that is, one at a time, each starting only after the previous one has completed.

Read more

Difference between Normalization and Normal Forms


Normalization is the systematic process applied to relations to reduce the degree of redundancy.

Normalization is called systematic because it always guarantees the following properties:

  • Lossless decomposition.
  • Dependency preservation.
Read more

Implementation of Atomicity and Durability using Shadow Copy

The recovery-management component of a database system can support atomicity and durability by a variety of schemes.

Here we are going to learn about one of the simplest schemes, called shadow copy.

Shadow copy:

Read more

Convert local path to UNC (Universal) File Path using JavaScript ActiveXObject


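function getUNCPath() {
    // Path selected in the file input
    var filePath = document.getElementById("uploadedFile").value;

    // ActiveXObject is only available in Internet Explorer
    var WshNetwork = new ActiveXObject("WScript.Network");
    // The collection alternates: drive letter, UNC share, drive letter, UNC share, ...
    var Drives = WshNetwork.EnumNetworkDrives();
    for (var i = 0; i < Drives.length; i += 2) {
        // Skip connections that have no drive letter mapped
        if (Drives.Item(i) != "") {
            filePath = filePath.replace(Drives.Item(i), Drives.Item(i + 1));
        }
    }
    return filePath;
}

<form onsubmit="getUNCPath()">
<input type="file" id="uploadedFile"/>
<input type="submit" value="Get the UNC Path!" />
</form>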

In the Razor view engine, add the following line to support ActiveXObject:

@using System.Web.Script.Serialization

DBMS Notes for GATE 2020
