Wednesday, 29 March 2017

Transaction Mnagement System Questions



1.   Define transaction. What are the properties of a transaction? Explain with the help of an example.

Ans: A collection of operations that form a single logical unit of work is called a transaction. The operations that make up a transaction typically consist of requests to access existing data, modify existing data, add new data or any combination of these requests. The statements of a transaction are enclosed within the begin transaction and end transaction statements.
To ensure the integrity of the data, the database system must maintain some desirable properties of the transaction. These properties are known as ACID properties, the acronym derived from the first letter of the terms atomicity, consistency, isolation and durability.
images Atomicity implies that either all of the operations that make up a transaction should execute or none of them should occur.
images Consistency implies that if all the operations of a transaction are executed completely, the database is transformed from one consistent state to another.
images Isolation implies that each transaction appears to run in isolation with other concurrently running transactions.
images Durability (also known as permanence) implies that once a transaction is completed successfully, the changes made by the transaction persist in the database, even if the system fails.
For example, consider a transaction T1 that transfers $100 from account A to account B. Let the initial values of account A be $2000 and account B be $1500. The sum of the values of account A and B is $3500 before the execution of transaction T1. Since T1 is a fund-transferring transaction, the sum of the values of account A and B should be $3500 even after its execution. The transaction T1 can be written as follows:
T1:
         read(A);
         A:=A-100;
         write(A);
         read(B);
         B:= B+100;
         write(B);
If transaction T1 is executed, either $100 should be transferred from account A to B or neither of the accounts should be affected. If T1 fails after debiting $100 from account A, but before crediting $100 to account B, the effects of this failed transaction on account A must be undone. This is the atomicity property of transaction T1. The execution of transaction T1 should also preserve the consistency of the database, that is, the sum of the values of account A and B should be $3500 even after the execution of T1.
Now, let us consider the isolation property of T1. Suppose that during the execution of transaction T1 when $100 is debited from account A and not yet credited to account B, another concurrently running transaction, say T2, reads the values of account A and B. Since T1 has not yet completed, T2 will read inconsistent values. The isolation property ensures that the effects of the transaction T1 are not visible to other transaction T2 until T1 is completed. If the transaction T1 is successfully completed and the user who initiated the transaction T1 has been notified about successful transfer of funds, then the data regarding the transfer of funds should not be lost even if the system fails. This is the durability property of T1. The durability can be guaranteed by ensuring that either
images all the changes made by the transaction are written to the disk before the transaction is completed or
images the information about all the changes made by the transaction is written to the disk and is sufficient to enable the database system to reconstruct the changes when the database system is restarted after the failure.

2.   Explain state transition diagram. Explain, when a transaction is said to be failed.
Ans: State transition diagram is a diagram that describes how a transaction passes through various states during its execution. Whenever a transaction is submitted to a DBMS for execution, either it executes successfully or fails due to some reasons. During its execution, a transaction passes through various states that are active, partially committed, committed, failed and aborted as shown in Figure 9.1.
images
Figure 9.1 State Transition Diagram Showing Various States of a Transaction
A transaction enters into the active state with its commencement. At this point, the system marks BEGIN_TRANSACTION operation to specify the beginning of the transaction execution. During its execution, the transaction stays in the active state and executes several READ and WRITE operations on the database. The READ operation transfers a data item from the database to a local buffer of the transaction that has executed the read operation. The WRITE operation transfers the data item from the local buffer of the transaction back to the database.
Once the transaction executes its final operation, the system marks END_TRANSACTION operation to specify the end of the transaction execution. At this point, the transaction enters into the partially committed state. The actual output at this point may still be residing in the main memory and, thus, any kind of hardware failure might prevent its successful completion. In such a case, the transaction may have to be aborted.
Before actually updating the database on the disk, the system first writes the details of updates performed by the transaction in the log file. The log file is then written to the disk so that, even in case of failure, the system can re-construct the updates performed by the transaction when the system restarts after the failure. When this information is successfully written out in the log file, the system marks COMMIT_TRANSACTION operation to indicate the successful end of the transaction. Now, the transaction is said to be committed and all its changes must be reflected permanently in the database.
If the transaction is aborted during its active state or the system fails to write the changes in the log file, the transaction enters the failed state. The failed transaction must be rolled back to undo its effects on the database to maintain the consistency of the database. When the transaction leaves the system, it enters into the terminated state. At this point, the transaction information maintained in the log file during its execution is removed.
3.   Why is it desirable to have concurrent execution of multiple transactions? What are the various anomalies that can occur due to undesirable interleaving of transactions? Explain by giving suitable examples.
Ans: The database system allows concurrent execution of transactions due to two reasons as given below:
images To increase system throughput: A transaction performing read or write operation using I/O devices may not be using the CPU at a particular point of time. Thus, while one transaction is performing I/O operations, the CPU can process another transaction. This is possible because CPU and I/O system in the computer system are capable of operating in parallel. This overlapping of I/O and CPU activities reduces the amount of time for which the disks and processors are idle and, thus, increases the throughput of the system.
images To reduce average response time: An interleaved execution of a short and a long transaction allows short transaction to be completed quickly. If a short transaction is executed after the execution of a long transaction in a serial order, the short transaction may have to wait for a long time, leading to unpredictable delays in executing a transaction. On the other hand, if these transactions are executed concurrently, it reduces the unpredictable delays, and thereby reduces the average response time.
If the transactions are interleaved in an undesirable manner, it may lead to several anomalies such as lost update, dirty read and unrepeatable read. To understand these anomalies, consider two transactions T1 and T2, where T1 transfers $100 from account A to account B, and T2 adds two percent interest to account A. Suppose that the initial values of account A and B are $2000 and $1500, respectively. Then, after serial execution of transactions T1 and T2 the value of account A should be $1938 and that of account B should be $1600.
Suppose that the operations of T and T2 are interleaved in such a way that T2 reads the value of account A before T1 updates its value in the database. Now, when T2 updates the value of account A in the database, the value of account A updated by the transaction T1 is overwritten and hence is lost. This is known as lost update problem. The value of account A at the end of both the transactions is $2040 instead of $1938, which leads to data inconsistency. The interleaved execution of transactions T1 and T2 that leads to lost update problem is shown in Figure 9.2(a).
The second problem occurs when a transaction fails after updating a data item, and before this data item is changed back to its original value, another transaction reads this updated value. For example, assume that Tj fails after debiting $100 from account A, but before crediting this amount to account B. This will leave the database in an inconsistent state. The value of account A is now $1900, which must be changed back to the original one, that is, $2000. However, before the transaction Tj is rolled back, let another transaction T2 reads the incorrect value of account A. This incorrect value of account A that is read by transaction T2 is called dirty data, and the problem is called dirty read problem. The interleaved execution of transactions Tj and T2 that leads to dirty read problem is shown in Figure 9.2(b).
The third problem occurs when a transaction tries to read the value of data item twice, and another transaction updates the same data item in between the two read operations of the first transaction. As a result, the first transaction reads varied values of same data item during its execution. This is known as unrepeatable read. For example, consider a transaction T3 that reads the value of account A. At this point, let another transaction T4 updates the value of account A. Now if T3 again tries to read the value of account A, it will get a different value. As a result, the transaction T3 receives different values for two reads of account A. The interleaved schedule of transactions T3 and T4 that leads to a problem of unrepeatable read is shown in Figure 9.2(c).
images
Figure 9.2 Anomalies Due to Interleaved Execution

4.   Define the following terms in context of a transaction:
(a) Schedule
Ans: A list of operations (such as reading, writing, aborting or committing) from a set of transactions is known as a schedule (or history). A schedule should comprise all the instructions of the participating transactions and also preserve the order in which the instructions appear in each individual transaction.
(b) Non-serial schedule
Ans: In a multi-user database system, several transactions are executed concurrently for efficient use of system resources. When two transactions are executed concurrently, the operating system may execute one transaction for some time, then perform a context switch and execute the second transaction for some time, and then switch back to the first transaction, and so on. Thus, when multiple transactions are executed concurrently, the CPU time is shared among all the transactions. The schedule resulted from this type of interleaving of operations from various transactions is known as non-serial schedule.
(c) Cascading rollback
Ans: Cascading rollback is a situation in which failure of single transaction leads to a series of transaction rollbacks.
(d) Blind writes
Ans: Blind writes are those write operations which are performed without performing the read operation.
(e) Topological sorting
Ans: The process of ordering the nodes of an acyclic preceding graph is known as topological sorting. The topological ordering produces equivalent serial schedule.
(f) Observable external writes
Ans: There are some situations when a transaction needs to perform write operations on a terminal or a printer. These writes are known as observable external writes. Since the content once written on the printer or the terminal cannot be erased, the database system allows such writes to take place only after the transaction has entered the committed state.
(g) Compensating transaction
Ans : Compensating transaction is a transaction which reverses the effect of a committed transaction.

5.   Define serial schedule. Why is it always considered to be correct?
Ans: Serial schedule is a schedule consisting of a sequence of instructions from various transactions, where the operations of one single transaction appear together in that schedule. Each transaction in a serial schedule is executed independently without any interference from the operations of other transactions. As long as every transaction is executed from beginning to end without any interference from other transactions, it gives a correct end result on the database. Therefore, every serial schedule is considered to be correct. Hence, for a set of n transactions, n! different valid serial schedules are possible. The serial schedule of transactions T1 and T5 in the order T5 followed by T1 is shown in Figure 9.3.
images
Figure 9.3 Serial Schedule in the Order T5 Followed by T1

6.   Explain serializable schedule by giving an example.
Ans: The consistency of the database under concurrent execution can be ensured by interleaving the operations of transactions in such a way that the final output is the same as that of some serial schedule of those transactions. Such a schedule is referred to as serializable schedule. Thus, a schedule S of n transactions T1, T2, T3, …, Tn is serializable if it is equivalent to some serial schedule of the same n transactions.
Figure 9.4 shows a non-serial schedule of transactions T1 and T5, which is equivalent to serial schedule shown in Figure 9.3. After the execution of this schedule, the final values of accounts A, B and C are $1900, $1550 and $550. Thus, the sum A + B + C is preserved and, hence, it is a serializable schedule.
images
Figure 9.4 Serializable Schedule

7.   What are result equivalent schedules? Explain with example.
Ans: Two different schedules may produce the same final state of the database. Such schedules are known as result equivalent, since they have the same effect on the database. However, in some cases, two different schedules may accidentally produce the same final state of database. For example, consider two schedules Si and S. that are updating the data item Q, as shown in Figure 9.5. Suppose that the value of data item Q is $100 then the final state of database produced by schedules Si and Sj is the same, that is, they produce the same value of Q ($200).
images
Figure 9.5 Result Equivalent Schedules

8.   Discuss the two different forms of schedule equivalence.
Ans: Two different forms of schedule equivalence are conflict equivalence and view equivalence that lead to the notions of conflict serializability and view serializability.
Conflict equivalence and conflict serializability: Two operations in a schedule are said to conflict if they belong to different transactions, access the same data item, and at least one of them is the write operation. On the other hand, two operations belonging to different transactions in a schedule do not conflict if both of them are read operations or both of them are accessing different data items. To understand the concept of conflicting operations in a schedule, consider two transactions T6 and T7 that are updating the data items Q and R in the database. A non-serial schedule of these transactions is shown in Figure 9.6.
images
Figure 9.6 Non-serial Schedule Showing Conflicting Operations
In Figure 9.6, the write(Q) operation of T6 conflicts with the read(Q) operation of T7 because both the operations are accessing the same data item Q, and one of these operation is the write operation. On the other hand, the write(Q) operation of T7 is not conflicting with the read(R) operation of T6, because both are accessing different data items Q and R.
If a schedule S can be transformed into a schedule S‘ by performing a series of swaps of non-conflicting operations, then S and S’ are conflict equivalents. Note that while swapping the order of execution of two conflicting operations cannot be changed because if they are applied in different order, they can have different effect on the database or on the other transactions in the schedule. Thus, two schedules are said to be conflict equivalent if the order of any two conflicting operations is same in both the schedules. A schedule, which is conflict equivalent to some serial schedule, is known as conflict serializable.
View equivalence and view serializability: Two schedules S and S' are said to be view equivalent if the schedules satisfy these conditions.
images The same set of transactions participates in S and S‘, and S and S’ include the same set of operations of those transactions.
images If the transaction Ti reads the initial value of a data item say Q in schedule S, then Ti must read the initial value of same data item Q in schedule S' also.
images For each data item Q, if Ti executes read(Q) operation after the write(Q) operation of transaction Tj in schedule S, then Ti must execute the read(Q) operation after the write(Q) operation of Tj in schedule S' also.
images If the transaction Ti performs the final write operation on any data item say Q in schedule S, then it must perform final write operation on the same data item Q in schedule S' also.
A schedule S is said to be view serializable if it is view equivalent to some serial schedule.

9.   What is a precedence graph? How can it be used to test the conflict serializability of a schedule?
Ans: A precedence graph is a directed graph G=(N,E) where N={T1, T2, …, Tn} is a set of nodes and E = {e1, e2, …, en} is a set of directed edges. For each transaction Ti in a schedule, there exists a node in the graph. An edge ei in the graph is shown as (TiTj), where Ti is the starting node and Tj is the ending node of the edge ei. The edge ei is created if one of the operations in Ti appears in the schedule before some conflicting operation in Tj.
The algorithm to test the conflict serializability of a schedule S is given as follows:
1.   For each transaction Ti participating in the schedule S, create a node labelled Ti.
2.   If Tj executes a read operation after Ti executes a write operation on any data item say Q, create an edge (TiTj) in the precedence graph.
3.   If Tj executes a write operation after Ti executes a read operation on any data item say Q, create an edge (TiTj) in the precedence graph.
4.   If T executes a write operation after T>j executes a write operation on any data item say Q, create an edge (TiTj) in the precedence graph.
5.   If the resultant precedence graph is acyclic, that is, it does not contain any cycle, then the schedule S is considered to be conflict serializable. If a precedence graph contains a cycle, the schedule S is not conflict serializable.
An edge from Ti to Tj in the precedence graph implies that the transaction Ti must appear before Tj in any serial schedule S‘ that is equivalent to S. Thus, if there are no cycles in the precedence graph, a serial schedule S’ equivalent to S can be created by ordering the transactions in such a way that whenever an edge (TiTj) exists in the precedence graph, T must appear before T. in the equivalent serial schedule S'.
For example, consider the schedule shown in Figure 9.5. The precedence graph of this schedule is shown in Figure 9.7(a). It contains two nodes, one for each transaction—T6 and T7. An edge from T6 to T7 is created, which is labelled with the data items Q and R that led to the creation of this edge. Since there are no cycles in the precedence graph, this schedule is a conflict serializable schedule. The serial schedule equivalent to Schedule 5 is shown in Figure 9.6(b).
images
Figure 9.7 Testing Conflict Serializability

10. What are recoverable and cascadeless schedules? Explain.
Ans: If, in a schedule S, a transaction Tj reads a data item written by Ti, then the commit operation of Ti should appear before the commit operation of Tj. Such a schedule is known as recoverable schedule.
Sometimes, even if a schedule is recoverable, several transactions may have to be rolled back in order to recover completely from the failure of a transaction Ti It happens if transactions have read the value of a data item written by Ti. For example, consider the schedule shown in Figure 9.8. In this schedule, the transaction T8 reads the value of data item Q written by transaction T9, and transaction T10 reads the value of data item Q written by transaction T8.
Now suppose that the transaction T8 fails due to any reason, it has to be rolled back. Since transaction T9 has read the data written by T8, it also needs to be rolled back. The transaction T10 also needs to be rolled back because it has in turn read the data written by T9. Such a situation, in which failure of single transaction leads to a series of transaction rollbacks, is called cascading rollback.
Cascading rollback requires undoing of significant amount of work and, thus, is undesirable. Therefore, schedules should be restricted to avoid cascading rollbacks. The schedules that avoid cascading rollbacks are known as cascadeless schedule. In cascadeless schedules, transactions Ti and Tj are interleaved in such a way that Tj reads a data item previously written by Ti however, commit operation of Ti appears before the read operation of Tj.
images
Figure 9.8 Schedule Resulting Cascading Rollback

11. Discuss the four levels of isolation in SQL. What are the possible violations for these levels?
Ans: SQL:92 supports four levels of transaction isolation. The isolation levels reduce the isolation between concurrently executing transactions. The level providing the greatest isolation from other transactions is SERIALIZABLE. It is the default level of isolation. At this level, a transaction is fully isolated from the changes made by other transactions. The other isolation levels are READ UNCOMMITTED, READ COMMITTED and REPEATABLE READ. READ UNCOMMITTED is the lowest level of isolation. At this level, a transaction can read subsequent changes made by other transactions, either committed or uncommitted. With read uncommitted isolation level, one or more of the three problems, namely, dirty read, unrepeatable read and phantom read may occur.
In READ COMMITTED isolation level, dirty reads are not possible since a transaction is not allowed to read the updates made by uncommitted transactions. However, unrepeatable reads and phantoms are possible. Phantom read problem occurs when a transaction T reads a set of rows from a table based on some condition specified in WHERE clause, meanwhile another transaction T inserts a new row into that table that also satisfies the condition and commits. Now, if T is again executed, then it will see a tuple (or phantom tuple) that previously did not exist.
In REPEATABLE READ isolation level, dirty reads and unrepeatable reads are not possible but phantoms are possible. In SERIALIZABLE isolation level, dirty reads, unrepeatable reads, and phantoms are not possible. The possible violations for different transaction isolation levels are summarized in Table 9.1.
Table 9.1 Possible violations for different isolation levels
images


12. Discuss the commit and rollback statements in SQL.
Ans: The COMMIT statement terminates the current transaction and makes all the changes made by the transaction permanent in the database. That is, it commits the changes to the database. The COMMIT statement has the following general format:
COMMIT [WORK]
where WORK is an optional keyword that does not change the semantics of COMMIT.
The ROLLBACK statement, on the other hand, terminates the current transaction and undoes all the changes made by the transaction. That is, it rolls back the changes to the database. The ROLLBACK statement has the following general format:
ROLLBACK [WORK]
In this case also, WORK is an optional keyword that does not change the semantics of ROLLBACK.
13. Consider two transactions T1 and T2 given below:
images
Give a serial schedule for these transactions and an equivalent non-serial schedule for these transactions.
Ans: Serial schedule
images
Non-serial schedule
images

14. Consider the following precedence graph. Is the corresponding schedule conflict serializable? Give reasons also.
images
Ans: The corresponding schedule will be conflict serializable as the precedence graph is acyclic.

15. Consider three transactions T1, T2 and T3 and two schedules S1 and S2 given below. Draw the precedence graph for schedules S1 and S2 and test whether they are conflict serializable or not.
T1: r1 (P); r1 (R); w1 (P);
T2: r2 (R); r2(Q); w2 (R); w2 (Q);
T3: r3 (P); r3 (Q); w3 (Q);
S1: r1 (P); r2 (R); r1 (R); r3 (P); r3 (Q); w1 (P); w3 (Q); r2 (Q); w2 (R);
w2 (Q);
S2: r1 (P); r2 (R); r3 (P); r1 (R); r2 (Q); r3 (Q); w1 (P); w2 (R); w3 (Q);
w2(Q);
Ans: Schedule S1
images
The precedence graph for this schdeule is shown below:
images
This schedule is conflict serializable since the precedence graph is acyclic. Schedule S2
images
This schedule is not conflict serializable since the precedence graph is cyclic.
16. Consider schedule S3 given below. Determine whether it is cascadeless, recoverable or non-recoverable.
S3: r1(P); r2(R); r1(R); r3(P); r3(Q); w1(P); w3(Q); r2(Q); w3(R); w2(Q); c1 c2; c3
Ans:
images
Here, transaction T2 reads the data item Q written by T3. If transaction T3 fails after T2 commits, its effects need to be undone. Since T2 has read the values of Q written by transaction T3, it also has to be rolled back. However, T2 has already been committed, thus cannot be rolled back. Therefore, this schedule is a non-recoverable schedule.

17. Give an example of SQL transaction.

Ans: An example of an SQL transaction consisting of multiple SQL statements is given in Figure 9.9. In this transaction, two SQL statements are given that update the BOOK and PUBLISHER relations of Online Book database. The first statement updates the price of all the books published by publisher P001. The next statement updates the email-id of the publisher with P_ID="P002". If an error occurs on any SQL statement, the entire transaction is rolled back. That is, any updated value would be restored to its original value.
images
Figure 9.9 An Example of SQL Transaction

No comments:

Post a Comment

GATE 2018 NOTES

                    COMPUTER NETWORK : PROF.SUBODH R.NIKHALE ...

Amazing idea