UNIT - V Introduction to Transaction Processing Concepts and Theory
UNIT - V Introduction to Transaction Processing Concepts and Theory
Topics Covered:
Ø
Introduction
Ø
Introduction to Transaction processing
Single user versus
Multiuser systems
Transactions,
Database Items, Read and Write Operations and DBMS Buffers.
Why concurrency
control is needed
Why Recovery is
needed
Ø
Transactions and System concepts
Transaction states
and Additional operations
System log
Ø
Desirable properties of transactions
Ø Characterizing Schedules based on Serializability.
Introduction:
One of
the main advantages of storing data in a database is to allow sharing of it
among multiple users. Several users access the database or perform operations
at the same time. What if a user tries to access a data item that is being used
/modified by another user? So unit try to give you a detail idea on how
concurrent transactions are executed under the control of DBMS. However, in
order to explain the concurrent transactions, first we must describe the term
transaction.
Concurrent execution of user programs is essential for better performance of DBMS, as concurrent running of several user programs keeps utilizing CPU time efficiently. DBMS is only concerned about what data is being read /written from/ into the database.
Single-User versus
Multiuser Systems:
Database can be classified into various type
based on several criteria, such as the data
model, number of user it supports
and database distribution.
Based
on data model, database can be
divided into various types. They are,
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model Etc.
The most popular data model in
use today is the relational data model.
Well known DBMSs like Oracle, MS SQL Server, DB2 and MySQL support relational data model.
Other traditional models, such as
hierarchical data models and network data model are still used in industry mainly on mainframe platforms.
However, they are not commonly used due
to their complexity. These are all referred to as traditional models because
they preceded the relational model.
In recent years, the newer object-oriented data models were
introduced. In this model information is represented in the form of objects as used in object-oriented
programming. Object-oriented databases are different from relational databases,
which are table-oriented. Object-oriented database management systems (OODBMS)
combine database capabilities with object-oriented programming language
capabilities.
The object-oriented models
have not caught on as expected so are not in widespread use. Some
examples of object-oriented DBMSs are O2, ObjectStore and Jasmine.
Based on Database Distribution,
there are four types of database systems. They are, Centralized database systems, Distributed
database system, Homogeneous
distributed database systems and Heterogeneous
distributed database systems.
Centralized
systems: With a centralized database system, the
DBMS software and database are
stored at a single site that is used
by several other systems too.
Distributed
database system: In a distributed database system, the
actual database and the DBMS software are distributed from
various sites that are connected by a computer network.
Homogeneous
distributed database systems: Homogeneous distributed database systems use the same DBMS software from multiple sites. Data
exchange between these various sites can be handled easily.
Heterogeneous
distributed database systems: In a heterogeneous distributed database system,
different sites might use different DBMS software, but there is additional
common software to support data exchange between these sites.
Based
on the number of users it supports
(number of users who can use the system concurrently),
database can be classified into two types such as single user database system and multi user database system.
A single-user
database system - supports one
user at a time can use the system and a multiuser database system -
supports multiple users can use
system concurrently. In
Single-user DBMSs are restricted to personal
computer systems; most other DBMSs are multiuser. For example: An airline reservations system is used by hundreds of
travel agents (reservation clerks) concurrently. Database systems used in
banks, insurance agencies, stock exchanges, supermarkets, and many other
applications are multiuser systems. In these systems, hundreds or thousands of
users are typically operating on the database by submitting transactions
concurrently to the system.
In
multi user database system, multiple users can access databases and use
computer systems simultaneously because of the concept of multiprogramming.
Multiprogramming
is a simple form of parallel processing in which several programs are
run at the same time on a single processor. Multiprogramming
allows the operating system of the computer to execute multiple programs or processes
at the same time.
Some of the characteristics
of multiprogramming are,
·
Multiprogramming
is the allocation of more than one concurrent program on a computer system and
its resources.
·
Multiprogramming
allows using the CPU effectively by allowing various users to use the CPU and
I/O devices effectively.
·
Multiprogramming
makes sure that the CPU always has something to execute, thus increases the CPU
utilization.
Example of multiprogramming: Use a browser, play video, download apps and transfer data at
the same time.
But,
single central processing unit (CPU) can only execute at most one process at a time. However, multiprogramming,
execute some commands from
one process, then suspend that process and execute some commands from the next
process, and so on. A process is resumed at the point where it was suspended
whenever it gets its turn to use the CPU again. Hence, concurrent execution of
processes is called interleaved,
as illustrated in Figure below.
In
the above figure shows two processes, A and B, executing concurrently in an
interleaved fashion. Interleaving keeps the CPU always busy, when a process
requires an input or output (I/O) operation, such as reading a block from disk.
The CPU is switched to execute another process rather than remaining idle
during I/O time. Interleaving also prevents a long process from delaying other
processes.
If the computer system has multiple hardware processors (CPUs), parallel processing of multiple processes is possible, as illustrated by processes C and D in above Figure 21.1.
Most of the theory concerning
concurrency control in databases is developed in terms of interleaved
concurrency, so for the remainder of this chapter we assume this model. In a
multiuser DBMS, the stored data items are the primary resources that may be
accessed concurrently by interactive users or application programs, which are
constantly retrieving information from and modifying the database.
What is a Transaction?
A
transaction is an action or series of action, which occurs on the database. It
provides a mechanism for describing database processing operations such as read -retrieval, write - insert
or update, delete (reads a value from the
database or writes a value to the database).
For
example, some of the transactions at a bank may
be withdrawal or deposit of money; transfer of money from A’s account to B’s
account etc. A transaction would involve manipulation of one or more data
values in a database. Thus, it may require reading and writing of database
value.
Simple Transaction Example:
1. Read account
balance
2. Deduct the
amount from balance
3. Write the
remaining balance to account
4. Read account
balance
5. Add the amount
to account balance
6. Write the new
updated balance to account
Operations |
Descriptions |
Retrieve |
To retrieve data stored in a database. |
Insert |
To store new data in database. |
Delete |
To delete existing data from database. |
Update |
To modify existing data in database. |
Commit |
To save the work done permanently. |
Rollback |
To undo the work done. |
Begin Transaction: it is a sign that
indicates the start of a transaction execution.
Read operation: it is used to read the
value from the database and stores it in a buffer in main memory.
Write Operation: it is used to write
the value back to the database from the buffer.
End transaction: it is the symbol that
indicates end of a transaction.
Commit: it makes sure that all
the changes made by transactions are made permanent.
Rollback: it’s
a Sign to specify that the transaction has been unsuccessful and terminates the
transactions and rejects any change made by the transaction. A committed
transaction cannot be rolled back.
Read (Retrieve) and Write (Insert or Update) operations are most
common operations in each transaction.
The
database operations of each transaction can either be embedded within an
application program or they can be specified interactively via a high-level
query language such as SQL.
If
the database operations in a transaction do not update the database but only
retrieve data, this type of transaction is called a read-only
transaction; otherwise it is known as a
read-write transaction.
Why Concurrency control needed? Next Blog
Comments
Post a Comment