Introduction to DBMS
1. What
is database?
Database
is a structure that stores information about multiple types of entities,
the attributes (or characteristics) of the entities, and the relationships
among the entities. It also contains data types for attributes and indexes.
2. Which
type of storage system is Oracle? What other storage systems are available?
Oracle
is a type of Relational Database Management System (RDBMS). Another
available storage system is Hierarchical Storage Management system,
such as Information Management System (IMS) from IBM and Integrated
Database Management System (IDMS) from Computer Associates.
3. Define
Database Management System (DBMS) and RDBMS.
DBMS is
a program that defines the rules for data storage and retrieval. RDBMS is a
special type of DBMS that stores the data in the relational format as described
in the relational theory by E.F. Codd.
4. How
is Oracle different from other programming languages?
Oracle
or any other DBMS is a program that handles user requests (data retrieval,
storage, or modification requests) without any requirement to specify
the algorithm to do so.
5. Give
some examples of join methods.
Some examples
of join methods are given as follows:
Merge
join —Sorts both the joining tables by using the join key and then merges
the sorted rows.
Nested
loop join —Applies filter conditions specific to the outer table and
gets a result set. After that, it joins the inner table with the result set
using either an index or a full table scan.
Hash
join —Uses hash algorithm for filter conditions on smaller table
first and then performs the same hashing algorithm on the other table for
joined columns. After that, it returns the matching rows.
6. Which
sorting algorithm does Oracle follow?
Oracle
used to follow a balanced binary tree sorting algorithm to effectively
build an in-memory index on the incoming data.
Binary tree search places a
huge memory and CPU overhead on the system for large searches; therefore,
Oracle introduced an algorithm based on heap sort, which is more efficient.
However, it also has a limitation of re-ordering incoming rows even
when the data is not arriving out of order.
7. What
is database designing?
Database
designing is a process, which follows the requirement analysis phase to
determine the required database structure to satisfy application, product,
or business requirements as per the given criteria.
8. What
is data modeling?
A data
model consists of the following two components:
Structure
—Refers to the way the system structures the data (or at least how the
user thinks it is structured)
Operations
—Refers to the facilities given to the users of the DBMS to manipulate
the data within the database
Following are the operations
that can be performed on the database:
Create
tables and queries using Data Definition Language
Add,
sort, and calculate the data using Data Manipulation Language
Retrieve
the data using Data Query Language
Control
the access of data using Data Control Language
9. What
is the difference between logical data model and physical data model?
Physical
data model depicts database in terms of physical objects, such as tables
and constraints; whereas, logical data model depicts database in terms of
logical objects, such as entities and relationships, and it is represented
by the entity-relationship model.
10. What are
the components of logical data model?
Following
are the components of the logical data model:
Entity
—Refers to an object of interest, that is, something that we want to
store information about. In a relational database structure, each entity
has its own table, that is, a structure of rows and columns.
Attribute
—Refers to the property or the information of the entity that we are
interested in. It is stored as a column of the table and has specific datatype
associated with it.
Record
—Refers to a collection of all the properties associated with an entity
for one specific condition. It is represented by a row in a table.
Domain
—Refers to the description of the legitimate values for an attribute.
It is the set of all the possible values for that attribute.
Tuple
—Refers to the design of a record.
Relation
—Represents a relation between two entities. Therefore, a relational
database is a collection of entities and relationships.
Relational
database —Refers to a set of related entities, which is used to store
required information as per the design.
11. Define
relationship.
A
relationship is an association between entities (or tables). It is based on
having common data values in the designated columns of each table. A
relationship is defined by linking the common columns in the two tables.
Relationships are actual
objects in the database.
Following are the three
types of relationships:
One-to-many
—Implies that a single record from the first table can be linked to multiple
records in the second table while a single record in the second table relates
to only a single record in the first table.
One-to-one
—Implies that a single record from the first table can be linked to only
a single record in the second table and vice versa.
Many-to-many
—Implies that multiple records from the first table can be linked to multiple
records in the second table and vice versa.
One-to-many is the most commonly
used relationship, where a parent table is the table on the one side of a
one-to-many relationship; whereas, a child table is the table on the many
side of a one-to-many relationship.
12. What is
a primary key?
A
primary key is an attribute (or a collection of attributes) that uniquely
identifies each row in the table. In other words, each entity instance in a
table must be unique; and therefore, primary key is a way of ensuring this.
13. What is
a foreign key?
A foreign
key is an attribute in a child table that matches the primary key value in
the parent table.
14. What is
normalization? What are the different forms of normalization?
Normalization
is a standard to model tables and relationships in a manner to avoid redundancy
and other undesirable dependencies or effects on the data. There are six
different normal forms, which are given as follows:
First
Normal Form (1NF) —A relation R is in 1NF if and only if all the underlying
domains contain atomic values only.
Second
Normal Form (2NF) —A relation R is in 2NF if and only if it is in 1NF
and every non-key attribute is
fully functionally dependent on the primary key[I1].
It means that all the columns
except the primary key column should depend on the primary key column
for a table to be in 2NF.
Third
Normal Form (3NF) —A relation R is in 3NF if and only if it is in 2NF
and every non-key attribute is non-transitively dependent on the primary
key.
It means that all the columns
except the primary key column should be directly dependent on the primary
key column and not through any other column.
Boyce
Codd Normal Form (BCNF) —A relation R is in BCNF if and only if every determinant
is a candidate key[I2].
In any table, there exists a
primary key, which is used to determine rest of the column values. If
there is more than one such set of columns that can determine rest of the column
values, then this set of columns is called a candidate key or an alternate
key.
Fourth
Normal Form (4NF) —A relation R is in 4NF if and only if there exists
multi-valued dependency in R, say A->> B, then, all the attributes of R
are also
functionally dependent on A.
Fifth
Normal Form (5NF) —A relation R is in 5NF or Project-Join Normal Form
(PJNF) if and only if every join dependency is implied by candidate keys.
15. What is
functional dependency?
Generally,
in any table or relation, an attribute or a set of attributes is used to determine
the value of rest of the attributes in the table or relation. This is possible
as the rest of the attributes are the functions of the key column. This situation
is called functional dependency and it is possible when there exists
one and only value for each non-key attributes that correspond to the key attribute.
Consider an employee
table, where employee_id is the key and rest of the attributes, such as name
of the employee, can be determined by employee_id as they are functionally
dependent on the employee_id. For example, if there is an employee
whose name is Ram with employee_id 1001, then the name corresponding to
the employee_id 1001 will always be Ram only. No other name can be associated
with this employee_id.
16. What is
trivial functional dependency?
A trivial
functional dependency is the situation, which shows a relation between
an attribute with a superset of itself such that the attribute is dependent
on the superset.
Consider an attribute of
the dress entity, say length. If the length attribute is dependent on the
dress code attribute of the dress entity, then it is a non-trivial dependency;
however, if the length attribute is dependent on the color and length attributes
of the dress entity, then it is a trivial dependency.
17. What is
full functional dependency?
Full
functional dependency is the situation when an attribute depends on a
group of attributes completely but not on the subset of the attributes.
For example, suppose the
price of a pizza depends on its type and location of the pizza store. Neither
of these alone can give the price of the pizza; therefore, price is
fully functionally dependent
on the pizza type and location of its store.
18. What is
multi-valued dependency?
A multi-valued
dependency is the situation when an attribute (assume B) is dependent
on another attribute (assume A). However, there are multiple rows in a
table representing the dependency. This situation can happen when
there is at least one more attribute which is dependent on A and it requires
to have multiple rows of A and B to represent all the possible combinations.
A multi-valued dependency is represented by A->> B.
19. What is
transitive dependency?
A transitive
dependency is the situation when an attribute indirectly depends on
another attribute through a third attribute.
20. What is
the standard normal form for most Online Transaction Processing (OLTP)
databases?
3NF is
the standard normal form for most OLTP databases.
21. What is
de-normalization? Why is it recommended?
OLTP
databases are intended more for DML operations; therefore, these databases
are designed to reduce redundancy; whereas, Online Analytical Processing
(OLAP) databases or warehousing applications are primarily used for reporting.
OLAP databases are not used
for day-to-day transactions in general and reporting performance is
more critical for such databases; therefore, these databases are designed
to store data to support easy access without requiring too many joins. Redundancy
of data is not of much concern in these databases; therefore, tables are designed
as a join of two or more tables from a typical OLTP application. This is
called de-normalization.
22. What is
join dependency?
A join
dependency is a situation where a table can be created by joining two or
more tables.
2
Oracle Architecture
1. What
are the components of an Oracle instance?
Background
processes and memory structure are the two components of an Oracle instance.
2. What
is the difference between database and instance?
Database
is the collection of datafiles that contain the information of interest;
whereas, instance is the combination of background processes and memory
structure.
3. What
are the different types of memory structure available in Oracle?
System
Global Area (SGA) and Program Global Area (PGA) are the two types of
memory structure available in
Oracle.
4. What
are the components of SGA?
SGA is
used to store shared information. It includes Buffer cache, Log buffer,
Shared pool, Large pool, and Java pool.
5. Why
do you need PGA?
PGA is
a memory area used by Oracle database. It is required to store session
specific information.
6. What
are the disk components?
Datafiles,
Redo Logs, Control files, Password files, and Parameter files are the disk
components.
7. What
happens when a user requests for some information from RDBMS?
The following
steps are performed when a user requests for some information:
1. RDBMS checks
if a copy of the parsed SQL statement exists in the library cache. If parsed
copy exists, then steps 2 to 6 are skipped.
2. RDBMS validates
the syntax of the statement.
3. RDBMS ensures
that all the columns and tables referenced in the statement exist.
4. RDBMS acquires
parse locks on objects referenced in the statement so that their definitions
do not change while statement is parsed.
5. RDBMS ensures
that the user has sufficient privileges.
6. Statement
is parsed and execution plan is created.
7. Statement
is executed.
8. Values are
fetched.
8. What
do you mean by data consistency? How does Oracle maintain
consistency of data?
In a
multi-user environment, there can be situations when one or more users
are reading certain set of data while other is modifying the same set of
data. Data consistency provides a stable or consistent data to the users
throughout the session.
Data consistency is maintained
through rollback segments. A rollback segment holds the data image before
change; therefore, one or more sessions will get the same image of the data
throughout the session that were reading the data while another session
is updating.
9. What
happens when multiple users try to update the same set of data?
Oracle
uses locking mechanism to ensure that only one session or user can update
a single set of data at given point in time.
10. How long
does the rollback segment hold data to maintain consistency?
Rollback
segment holds the previous version of data block until either it gets committed
or the space is not available for reuse. If a session is reading specific block
from rollback segment while the other session has committed the information,
then more rollback space is required to maintain more rollback information. In
that case, this specific block ages out of the rollback segment and the reading
or selecting session gets an error.
11. What is
the process of updating or inserting certain data in the database?
Following
is the process to update or insert data in the database:
i. RDBMS
searches for parsed statement in library cache or parses the statement to
generate execution plan.
ii. Server process
retrieves relevant data from disk to the buffer cache. In case of inserting
data in the database, the data block with sufficient free space will be retrieved.
iii. Lock is acquired.
The data block is updated in the buffer cache.
iv. A lock is
acquired on rollback segment as well to store old version of data in case
rollback is required.
v. User process
creates a redo entry in the redo log buffer to support recovery.
12. What is
System Change Number (SCN)?
SCN is
an ID that Oracle generates for every transaction. It is recorded with
the corresponding change in a redo entry.
13. What is
DataBase WRiter (DBWR)?
DBWR is
a background process that writes changed data blocks from buffer cache to the
datafile. The changed data blocks are called dirty blocks.
14. When
does DBWR write to the datafile?
Following
are the situations when DBWR writes to the datafile:
i. When
checkpoint occurs
ii. When number
of dirty blocks reaches a threshold
iii. Every
three seconds due to timeout
iv. When server process
needs free space in buffer cache to read new blocks
15. What is
LoG WRiter (LGWR)?
LGWR is
the background process that writes redo information from redo log buffers
to the log files.
16. When
does LGWR write to the log file?
Following
are the situations when LGWR writes to the log file:
i. When a
user process commits a transaction
ii. When redo
log buffer is one-third full
iii. When more
than a megabyte of changes are recorded into the redo log buffer
iv. Before DBWR
writes modified blocks to the datafiles
17. What is
the physical structure of the disk resources in Oracle?
Following
is the physical structure of the disk resources in Oracle:
Control
files —Stores the status of the physical structure of the database. It
contains different types of information, such as archive log history,
datafile records, redo threads, log records, and database name.
Datafiles
—Refers to the physical files of the operating system that store the
data of all logical structures in the database
Redo
log files —Refers to the files, which log a history of all the changes
made to the database.
18. What is
the logical structure of the disk resources?
Following
is the logical structure of the disk resources:
Data
block —Refers to the smallest logical storage unit. Size of a data
block is a multiple of operating system block size.
Extent
—Refers to the contiguous set of data blocks, which is allocated as a
unit to a segment.
Tablespace
—Refers to the final logical storage unit. It is mapped to physical
datafile.
Segment
—Allocates a logical structure, such as table. It is a set of extents,
which are stored in the same tablespace.
19. What is
the difference between a temporary tablespace and a permanent tablespace?
A temporary
tablespace provides temporary storage during the processing of database
function, such as sorting; whereas, a permanent tablespace is used to
store permanent database objects, such as tables, partitions, indexes,
and clusters.
You do not need to backup or
restore a temporary tablespace.
20. Name a
tablespace, which is automatically created when you create a database.
The SYSTEM
tablespace is created automatically during database creation. It contains
data dictionary objects. All data stored on behalf of stored PL/SQL program
units (procedures, functions, packages, and triggers) resides in the SYSTEM
tablespace, which is always online when the database is open.
21. Which
file is accessed first when you start an Oracle database?
Initialization
parameter file or SPFILE is always accessed first when an Oracle
database is started.
This file is used to determine
database level setting because those values are stored as parameters in
this file.
22. What is
the job of the System Monitor (SMON) and Process Monitor (PMON) processes?
SMON
helps in recovery at instance startup. It is responsible for cleaning
up temporary segments and coalescing free extents. SMON also performs
failed instance recovery for other failed Real Application Clusters instances.
PMON performs process recovery
when a user process fails.
No comments:
Post a Comment