Lost-Update-Problem in Concurrent schedules

In a schedule, if update performed by transaction T1 on data item ‘X’ gets overwritten by the update performed by transaction T2 on same data item ‘X’, then we say that update of T1 is lost to the update of T2.

This problem is known as Lost-Update-Problem in concurrent schedules.

Example:

with X = 50 and Y =50 (initial values)

T1 T2
read(x) (T1I1)
x=x+10 (T1I2)
read(x) (T2I1)
x=x+20 (T2I2)
write(x) (T1I3)
read(y) (T1I4)
write(x) (T2I2)
commit (T2I2)
y=y+10 (T1I5)
write(y) (T1I6)
commit (T1I7)

Explanation:

in concurrent execution,

  • T1 reads data item X(=50) from DB and increment its value by 10. (till now it has not performed any write and commit operation)
  • now T2 reads data item X(=50 because T1 has not performed any write and commit) from DB and increment it by 20.
  • T1 writes X. it will write X = 60 (50+10) in local cache. it has not perform any commit so value will not be updated in DB.
  • T1 read Y(=50).
  • T2 writes X. As T2 performed x=x+20 which x=70 will be written to local cache.
  • T2 performs commit which will write x=70 from local cache to DB. Now DB holds X value as 70.
  • Here in this step value of X which is updated as 60 in local cache but not committed to DB is lost to the update done by T2. it is defined as Lost Update Problem.
  • Now T1 will increment Y by 10. (Y = 60 after increment)
  • T1 will write Y = 60 to local cache.
  • T1 commit will update Y value as 60 to DB.
  • final X = 70 and Y = 60 values will be there in DB.

Serial Schedules, Concurrent Schedules and Conflict Operations

A schedule is the representation of execution sequence for all the instructions of the transactions.

Schedules are categorized in two types:

  • Serial Schedules
  • Concurrent Schedules

Serial Schedules:

A schedule is said to be serial if and only if all the instructions of all the transactions get executed non-preemptively as an unit.

OR

Each serial schedule consists of a sequence of instructions from various transactions, where the instructions belonging to one single transaction appear together in that schedule.

Thus, for a set of n transactions, there exist n! different valid serial schedules.

if there exist three transactions then all possible serial schedules are:

            here n = 3(no of transactions)

           3! = 3*2*1 = 6 serial schedules

T1->T2->T3, T1->T3->T2,

T2->T1->T3, T2->T3->T1,

T3->T2->T1, T3->T2->T1

Serial schedules gives guarantee for data consistency.

Concurrent Schedules:

A schedule is said to be concurrent in case the instructions of the transactions get executed preemptively.

When the database system executes several transactions concurrently, the corresponding schedule no longer needs to be serial.

If two transactions are running concurrently, the operating system may execute one transaction for a little while, then perform a context switch, execute the second transaction for some time, and then switch back to the first transaction for some time, and so on. With multiple transactions, the CPU time is shared among all the transactions.

Several execution sequences are possible, since the various instructions from both
transactions may now be interleaved.

In general, it is not possible to predict exactly how many instructions of a transaction will be executed before the CPU switches to another transaction.

Thus, the number of possible schedules for a set of n transactions is much larger than n!.

Concurrent schedules might get affected with conflicting operations and hence does not give guaranty for data consistency.

Conflicting Operations:

A pair of operations is said to be conflicting with each other if and only if:

  • They belong to different transactions
  • Both the operations are accessing the same data item.
  • At least one of these operations is write operation.

Possible conflict:

write-write, write-read, read-write.

Disadvantages of Serial Schedules:

  1. High average waiting time.
  2. Low response time and low throughput could be possible.

Advantage of Serial Schedules:

  1. It always gives guarantee for data consistency.

Disadvantages of Concurrent Schedules:

  1. Possible data inconsistency.
  2. some time too much context switching

Advantage of Concurrent Schedules:

  1. Reduce waiting time.
  2. improve response time and throughput.

Hence RDBMS has implemented Concurrency control mechanism to produce serializable schedules.

Serializable Schedules:

A schedule is said to be serializable if and only if the result of the given concurrent schedule is equivalent to one of the possible serial schedule.

 

 

 

 

How to check Conflict Serializability using Precedence Graph Algorithm

Precedence graph algorithm can be used to find out whether the given concurrent schedule is conflict serializable or not.

Algorithm:

  1. Create the number of node in the graph equal to the number of transactions in the given schedule.
  2. Starting with each and every transaction identify all the existing conflicting operations and represent them in the graph in the form of edges following the direction of the conflicting operation.
  3. Check if the precedence graph has either a cycle or a loop.
  4. If the cycle or loop does exist, then the given schedule is not conflict serializable.
  5. Else the schedule is conflict serializable.
  6. In case the schedule is conflict serializable then apply the Topological ordering in the graph to find out the equivalent serial schedule.

Continue reading