UNIT - V Introduction to Transaction Processing Concepts and Theory

 UNIT - V Introduction to Transaction Processing Concepts and Theory

Topics Covered:

Ø  Introduction

Ø  Introduction to Transaction processing

*    Single user versus Multiuser systems

*    Transactions, Database Items, Read and Write Operations and DBMS Buffers.

*    Why concurrency control is needed

*    Why Recovery is needed

Ø  Transactions and System concepts

*    Transaction states and Additional operations

*    System log

   

Ø  Desirable properties of transactions*

Ø   Characterizing Schedules based on Serializability.


Introduction:

                     One of the main advantages of storing data in a database is to allow sharing of it among multiple users. Several users access the database or perform operations at the same time. What if a user tries to access a data item that is being used /modified by another user? So unit try to give you a detail idea on how concurrent transactions are executed under the control of DBMS. However, in order to explain the concurrent transactions, first we must describe the term transaction.

   Concurrent execution of user programs is essential for better performance of DBMS, as concurrent running of several user programs keeps utilizing CPU time efficiently. DBMS is only concerned about what data is being read /written from/ into the database.


Single-User versus Multiuser Systems:

                                                                        Database can be classified into various type based on several criteria, such as the data model, number of user it supports and database distribution.

Based on data model, database can be divided into various types. They are,

1.    Hierarchical Model

2.    Network Model

3.    Entity-Relationship Model

4.    Relational Model

5.    Object-Oriented Data Model

6.    Object-Relational Data Model Etc.

The most popular data model in use today is the relational data model. Well known DBMSs like Oracle, MS SQL Server, DB2 and MySQL support relational data model.

Other traditional models, such as hierarchical data models and network data model are still used in industry mainly on mainframe platforms. However, they are not commonly used due to their complexity. These are all referred to as traditional models because they preceded the relational model.

In recent years, the newer object-oriented data models were introduced. In this model information is represented in the form of objects as used in object-oriented programming. Object-oriented databases are different from relational databases, which are table-oriented. Object-oriented database management systems (OODBMS) combine database capabilities with object-oriented programming language capabilities.

The object-oriented models have not caught on as expected so are not in widespread use. Some examples of object-oriented DBMSs are O2, ObjectStore and Jasmine.

Based on Database Distribution, there are four types of database systems. They are, Centralized database systems, Distributed database system, Homogeneous distributed database systems and Heterogeneous distributed database systems.

Centralized systems: With a centralized database system, the DBMS software and database are stored at a single site that is used by several other systems too.

Distributed database system: In a distributed database system, the actual database and the DBMS software are distributed from various sites that are connected by a computer network.

Homogeneous distributed database systems: Homogeneous distributed database systems use the same DBMS software from multiple sites. Data exchange between these various sites can be handled easily.

Heterogeneous distributed database systems: In a heterogeneous distributed database system, different sites might use different DBMS software, but there is additional common software to support data exchange between these sites.

Based on the number of users it supports (number of users who can use the system concurrently), database can be classified into two types such as single user database system and multi user database system.

A single-user database system - supports one user at a time can use the system and a multiuser database system - supports multiple users can use system concurrently. In Single-user DBMSs are restricted to personal computer systems; most other DBMSs are multiuser. For example: An airline reservations system is used by hundreds of travel agents (reservation clerks) concurrently. Database systems used in banks, insurance agencies, stock exchanges, supermarkets, and many other applications are multiuser systems. In these systems, hundreds or thousands of users are typically operating on the database by submitting transactions concurrently to the system.

In multi user database system, multiple users can access databases and use computer systems simultaneously because of the concept of multiprogramming.

Multiprogramming is a simple form of parallel processing in which several programs are run at the same time on a single processor. Multiprogramming allows the operating system of the computer to execute multiple programs or processes at the same time.

Some of the characteristics of multiprogramming are,

·         Multiprogramming is the allocation of more than one concurrent program on a computer system and its resources.

·         Multiprogramming allows using the CPU effectively by allowing various users to use the CPU and I/O devices effectively.

·         Multiprogramming makes sure that the CPU always has something to execute, thus increases the CPU utilization.

Example of multiprogramming: Use a browser, play video, download apps and transfer data at the same time.

But, single central processing unit (CPU) can only execute at most one process at a time. However, multiprogramming, execute some commands from one process, then suspend that process and execute some commands from the next process, and so on. A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU again. Hence, concurrent execution of processes is called interleaved, as illustrated in Figure below.

In the above figure shows two processes, A and B, executing concurrently in an interleaved fashion. Interleaving keeps the CPU always busy, when a process requires an input or output (I/O) operation, such as reading a block from disk. The CPU is switched to execute another process rather than remaining idle during I/O time. Interleaving also prevents a long process from delaying other processes.


If the computer system has multiple hardware processors (CPUs), parallel processing of multiple processes is possible, as illustrated by processes C and D in above Figure 21.1.

Most of the theory concerning concurrency control in databases is developed in terms of interleaved concurrency, so for the remainder of this chapter we assume this model. In a multiuser DBMS, the stored data items are the primary resources that may be accessed concurrently by interactive users or application programs, which are constantly retrieving information from and modifying the database.

What is a Transaction?

                            A transaction is an action or series of action, which occurs on the database. It provides a mechanism for describing database processing operations such as read -retrieval, write - insert or update, delete (reads a value from the database or writes a value to the database).

For example, some of the transactions at a bank may be withdrawal or deposit of money; transfer of money from A’s account to B’s account etc. A transaction would involve manipulation of one or more data values in a database. Thus, it may require reading and writing of database value.

Simple Transaction Example:

1. Read account balance

2. Deduct the amount from balance

3. Write the remaining balance to account

4. Read account balance

5. Add the amount to account balance

6. Write the new updated balance to account

Operations

Descriptions

Retrieve

To retrieve data stored in a database.

Insert

To store new data in database.

Delete

To delete existing data from database.

Update

To modify existing data in database.

Commit

To save the work done permanently.

Rollback

To undo the work done.


Begin Transaction: it is a sign that indicates the start of a transaction execution.

Read operation: it is used to read the value from the database and stores it in a buffer in main memory.

Write Operation: it is used to write the value back to the database from the buffer.

End transaction: it is the symbol that indicates end of a transaction.

Commit: it makes sure that all the changes made by transactions are made permanent.

Rollback: it’s a Sign to specify that the transaction has been unsuccessful and terminates the transactions and rejects any change made by the transaction. A committed transaction cannot be rolled back.

Read (Retrieve) and Write (Insert or Update) operations are most common operations in each transaction.

The database operations of each transaction can either be embedded within an application program or they can be specified interactively via a high-level query language such as SQL.

If the database operations in a transaction do not update the database but only retrieve data, this type of transaction is called a read-only transaction; otherwise it is known as a read-write transaction.

Why Concurrency control needed? Next Blog

Comments

Popular posts from this blog