Data Guard Broker
The broker is part of Oracle's
Database Enterprise Manager and an integral part of Data Guard, it is the
management framework. You can of course use SQL*Plus to manage Data Guard but
the broker when used with the Enterprise Manager becomes a powerful tool
offering the following
- Configuration
- Monitoring
- Alerting
- Performance Analysis
- Manual switchover's
- Automatic switchover's
Before I discuss the integration of
the broker into the Enterprise Manager lets first discuss the broker itself.
The broker can make changes to a Data Guard setup (physical or logical
database), the type of changes are configuration, transport modes, apply setup
and role change services and the protection mode. You can also monitor the
health of any configuration, it is also responsible for implementing and
managing the automatic failover capability also know as Fast-Start Failover
(FSFO).
The broker has three parts, a set of
background processes on each database, a set of configuration files and a
commandline interface (CLI) called DGMGRL.
The broker has a number of processes
running, the broker manages these processes automatically you have no control
over them
- Data Guard Monitor (DMON) - this is the main process and is responsible for coordinating all broker actions as well as maintaining the configuration files, this process is enabled or disabled with the DG_BROKER_START parameter
- Broker Resource Manager (RSM) - is responsible for handling any SQL commands used by the broker that need to be executed on one of the databases in the configuration. These SQL commands are made as a result of a change to the configuration made through DGMGRL or are the configuration commands executed by the broker during the database startup
- Data Guard Net Server (NSVn) - from 1 to n can exist, they are responsible for making contact with the remote databases and sending across any work items to the remote database.
- DRCn - these network receiver processes establish the connection from the source database NSVn process, this is a similar connection to the LogWriter Network Service (LNS) to a Remote File Server (RFS) connection for redo transport. When the broker needs to send something (data or SQL) between databases it uses this NSV to DRC connection, these connection are started as needed.
- Configuration files - the configuration is stored in two binary command files. The files contain the complete configuration setup, the database states, how to connect to each one and what parameters to setup when each database starts up.
In a broker configuration it is the
Data Guard Monitor (DMON) process on the primary database that is the owner of
the configuration, all orders will come from the primary even if use a DGMGRL
CLI from another server, the standby database will receive all configuration
changes via the primary.
When the DMON process communicates
with the standby database it uses one of the NSV processes to send the work to
the standby, this protects the DMON from a hang if the network should go down.
Whenever the DMON needs to execute some SQL it will enlist the aid of the RSM
process on the primary database, if the SQL is for the primary database it will
execute it directly, however if the SQL is for a standby database the RSM
process asks the NSV process to send it to the standby database again this
protects the RSM from a network hang.
Each NSV process has a DRC partner
process on the standby database, which will perform the work of the NSV
process, the DRC process on the standby will return the results or status back
to the NSV on the primary.
On startup the DMON will communicate
with each standby via the NSV-DRC connection process pair to establish
communication and configuration information so that the standby can start the
apply services, if there are any problems check the alert.log after the NVS
section to see if the are any TNS errors.
When using a RAC environment there
are another set of process called the internode servers (INSV) which maintain a
connection between the nodes in the cluster to ensure that the broker on each
node knows the state of the cluster. A INSV process will always be started on
the primary even if the primary is not in a RAC.
Lastly you may see one more process
called the Fast-Start-Failover Process (FSFP), which is used only when the
primary database is under the control of Data Guard automatic failover feature
Fast-Start Failover. We will be discussing FSFP in another section switchover and failover.
Each database has two copies of the
configuration, stored in the below parameter locations
- DG_BROKER_CONFIG_FILE1
- DG_BROKER_CONFIG_FILE2
By default they are stored in $ORACLE_HOME/dbs
directory with the filename of dr1<DB_UNIQUE_NAME>.dat and
dr2<DB_UNIQUE_NAME>.dat, if using a RAC these should be keep on a shared
location as only one set of copies can be exist for the entire RAC.
The primary has the master copies of
the configuration files, all changes to these files are done by the primary,
even if you are on the standby server the configuration changes will be passed
to the primary via NSV-DRC processes and the DMON process will make the change
which is then propagated back to the standby databases. The reason for the
multiple copies of the configuration file is if the primary has a problem then
each standby database has a copy, when a failover occurs the standby that
becomes the primary will then become the master of those configuration files.
You have two choices to interact
with Data Guard either Enterprise Manager (EM) or the broker CLI DGMGRL you can
swap between the two with a few simple configuration changes. To gain full full
functionality of Data Guard through Grid Control you must use the broker.
Broker CLI DGMGRL comes with
Enterprise Manager, you can run the broker on any platform you wish it does not
need to be the same platform as the primary or standby databases.
You can access the DGMGRL command line
using the below
DGMGRL CLI
|
[primargydg01] > dgmgrl
sys/oracle
[primargydg01] > dgmgrl
sys/oracle@PROD1
|
When you start the DGMGRL
commandline it does not connect you to a database, only the Data Guard
configuration. Before you start to use the broker there are a number of things
to perform first
- Configure the Broker parameters
- The broker and the listener
- RAC and the broker
- Connecting to the broker
Even if you use Enterprise Manager
you should really have a understanding on how the CLI works, as mentioned above
there are two configuration files, both the systems parameters must be set on
the primary and all standby databases, you put set this files anywhere you like
data Guard configuration files
location
|
# Directory
alter system set DG_BROKER_CONFIG_FILE1 = '<directory>/<file>'; alter system set DG_BROKER_CONFIG_FILE2 = '<directory>/<file>';
# ASM
alter system set DG_BROKER_CONFIG_FILE1 = '+DATA/PROD1/Broker/dr1PROD1.dat'; alter system set DG_BROKER_CONFIG_FILE1 = '+FLASH/PROD1/Broker/dr2PROD1.dat';
Note: if you are using ASM then
the directories must be created already, use the asmcmd command to do
this, the broker will create symlinks to the actual config files but the
directories must be there
|
In a RAC environment they should be
in a shared area and all nodes should point to the same files, only one set of
files must exist in a RAC.
Once the directory has been created
you are ready to startup the broker, do this on all databases (primary and
standby)
Start the Broker
|
alter system set
DG_BROKER_START=TRUE SCOPE=BOTH;
|
When you start the broker it does
not configure any files yet because you have to use the DGMGRL command, all it
does is start the necessary processes we mentioned earlier. If you need to
change the location of the files the steps are below
- stop the broker using the DG_BROKER_START parameter by setting this to false (on all nodes in the cluster)
- change the DG file destination parameters
- copy the files to the new location
- then re-start the broker, again set the DG_BROKER_START parameter to true (on all nodes in the cluster)
The broker uses the Oracle Net
Services to make connections to the databases, setup both redo transport and
archive gap resolution and perform role transitions. We need to create a
special static entry in the listener.ora file for each database in the broker
configuration, this entry makes it possible for the broker to connect to an
idle instance using a remote SYSDBA connection and perform the necessary
startup. Here is an example
Broker listener.ora static entry
|
## Primary and all standby
databases
SID_LIST_LISTENER =
(SID_LIST = (SID_DESC = (GLOBAL_DBNAME = PROD1DR_DGMGRL) (ORACLE_HOME = /scratch/OracleHomes/OraHome111) (SID_NAME = PROD1DR) ) ) SID_LIST_LISTENER = (SID_LIST = (SID_DESC = (GLOBAL_DBNAME = PROD1_DGMGRL) (ORACLE_HOME = /scratch/OracleHomes/OraHome111) (SID_NAME = PROD1) ) ) |
Once the above has been done we are
ready to create a broker configuration, there is no difference in a non-RAC or
RAC environment, to connect to the configuration you use the dgmgrl command
but one word of advise make sure that you use a specific user and password and
not the / as the user, this has know to cause problems at a later date
DGMGRL CLI
|
[primargydg01] > dgmgrl
sys/oracle
[primargydg01] > dgmgrl
sys/oracle@PROD1
|
There are four main area's to DGMGRL
CLI
- Connection and Help - connect, help and exit
- Creation and Editing - create, add, enable, edit and convert
- Monitoring - show
- Role Transition - switchover, failover and reinstate
If you used the Enterprise
Manager/Grid Control to setup your standby database then the next steps would
have already been performed for you, if however you used the Power User Method then you need to follow the next steps to configure the
broker.
The first step is to create the base
configuration
Create base configuration
|
# Primary Database server
DGMGRL> connect sys/password DGMGRL> create configuration prod1 as > primary database is prod1 > connect identifier is prod1; Configuration "prod1" created with primary database "prod1" |
The above command would have created
a base configuration, you can display the configuration using the below command
Display configuration
|
# Primary Database server
DGMGRL> connect sys/password DGMGRL> show configuration
Configuration - prod1
Protection Mode: MaxPerformance
Databases: prod1 - Primary database prod1dr - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
## Here is the same screen shot
after adding a logical server
DGMGRL> show configuration;
Configuration - prod1
Protection Mode: MaxPerformance
Databases: prod1 - Primary database prod1dr - Physical standby database prod1lr - Logical standby database
Fast-Start Failover: DISABLE
Configuration Status:
SUCCESS |
Next we add the standby database to
the configuration, if you receive an error stating that the properties could
not be imported from the database, you need to check DB_UNIQUE_NAME also the
TNSNAME and the transport parameter (LOG_ARCHIVE_DEST_n), so what is the proper
setup, you must have your redo transport parameter defined using the
DB_UNIQUE_NAME method, meaning that each redo transport parameter must contain
the DB_UNIQUE_NAME=<name> attribute. The broker will search all of your
LOG_ARCHIVE_DEST_n parameters looking for a database with a unique name that
matches the database name you entered for the command. If you have not done
this then you need to use the full set of arguments to add the database
Add the standby database
|
# Primary Database server - if you
have setup db_unique_name, tnsname and log_archive_dest_n
DGMGRL> connect sys/password DGMGRL> add database prod1dr; # Primary Database server - the full command set DGMGRL> connect sys/password DGMGRL> add database prod1dr > as connect identifier is prod1dr > maintained as physical;
Database "prod1dr" added
# if you want to add a logical
standby you can use the following
DGMGRL> add database prod1lr
> as connect identifier is prod1lr > maintained as logical; Database "prod1lr" added |
When you display the configuration
you should see that the standby database has been added
Display configuration
|
# Primary Database server
DGMGRL> connect sys/password DGMGRL> show configuration
Configuration - prod1
Protection Mode: MaxPerformance
Databases: prod1 - Primary database prod1dr - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
DISABLED
## Because we have not setup any
parameters for Broker all parameters defaults will be presumed,
## some parameters are set by examining the database others are Broker default values. DGMGRL> show database verbose prod1
Database - prod1
Role: PRIMARY
Intended State: OFFLINE Instance(s): PROD1
Properties:
DGConnectIdentifier = 'prod1' ObserverConnectIdentifier = '' LogXptMode = 'ASYNC' DelayMins = '0' Binding = 'optional' MaxFailure = '0' MaxConnections = '1' ReopenSecs = '300' NetTimeout = '30' RedoCompression = 'DISABLE' LogShipping = 'ON' PreferredApplyInstance = '' ApplyInstanceTimeout = '0' ApplyParallel = 'AUTO' StandbyFileManagement = 'AUTO' ArchiveLagTarget = '0' LogArchiveMaxProcesses = '4' LogArchiveMinSucceedDest = '1' DbFileNameConvert = '' LogFileNameConvert = '' FastStartFailoverTarget = '' StatusReport = '(monitor)' InconsistentProperties = '(monitor)' InconsistentLogXptProps = '(monitor)' SendQEntries = '(monitor)' LogXptStatus = '(monitor)' RecvQEntries = '(monitor)' HostName = 'primarydg01' SidName = 'PROD1' StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=primarydg01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PROD1_DGMGRL)(INSTANCE_NAME=PROD1)(SERVER=DEDICATED)))' StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST' AlternateLocation = '' LogArchiveTrace = '0' LogArchiveFormat = '%t_%s_%r.dbf' TopWaitEvents = '(monitor)'
Database Status:
DISABLED
DGMGRL> show database verbose
prod1dr
Database - prod1dr
Role: PHYSICAL STANDBY
Intended State: OFFLINE Transport Lag: (unknown) Apply Lag: (unknown) Real Time Query: OFF Instance(s): PROD1DR
Properties:
DGConnectIdentifier = 'prod1dr' ObserverConnectIdentifier = '' LogXptMode = 'ASYNC' DelayMins = '0' Binding = 'OPTIONAL' MaxFailure = '0' MaxConnections = '1' ReopenSecs = '300' NetTimeout = '30' RedoCompression = 'DISABLE' LogShipping = 'ON' PreferredApplyInstance = '' ApplyInstanceTimeout = '0' ApplyParallel = 'AUTO' StandbyFileManagement = 'AUTO' ArchiveLagTarget = '0' LogArchiveMaxProcesses = '4' LogArchiveMinSucceedDest = '1' DbFileNameConvert = '' LogFileNameConvert = '' FastStartFailoverTarget = '' StatusReport = '(monitor)' InconsistentProperties = '(monitor)' InconsistentLogXptProps = '(monitor)' SendQEntries = '(monitor)' LogXptStatus = '(monitor)' RecvQEntries = '(monitor)' HostName = 'physicaldg01' SidName = 'PROD1DR' StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=physicaldg01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PROD1DR_DGMGRL)(INSTANCE_NAME=PROD1DR)(SERVER=DEDICATED)))' StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST' AlternateLocation = '' LogArchiveTrace = '0' LogArchiveFormat = '%t_%s_%r.dbf' TopWaitEvents = '(monitor)'
Database Status:
DISABLED |
One parameter to watch for is the
local archiving parameter on the primary and standby databases, the broker will
modify this parameter and if necessary add the VALID_FOR attribute in
preparation for the archival of your standby redo log files.
One area to watch is if you add an
additional standby database, the broker could presume the wrong parameter
values, for instance you may want one standby to use SYNC and another to use
ASYNC, but if you configure the second standby database using the short method
it may default to SYNC, so when you do add any additional standby databases
always double check the parameters are set correctly for the type of
configuration that you want, any changes should be done before enabling the
additional standby database.
Once we are happy that all standby
databases have been added and their parameters have been check we are ready to
enable the configuration, the command will issue alter system commands
on both the primary and standby databases, start the redo transport to the
standby databases and the apply services, before examining the configuration
give it sometime to complete as it does take a while, otherwise you see some
ORA-16610 errors.
Enabling the configuration
|
# Primary Database server
DGMGRL> connect sys/password DGMGRL> enable configuration Enabled. |
You can tail the alert log file and
see what action is taking place, hopefully after a few minutes you should see a
successful configuration running, you can also check the redo transport and
apply services by viewing the v$managed_standby view.
Display configuration
|
# Primary Database server
DGMGRL> connect sys/password DGMGRL> show configuration;
Configuration - prod1
Protection Mode: MaxPerformance
Databases: prod1 - Primary database prod1dr - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
# Standby database - check the
LGWR - RFS connection process and the MRP0 is in the applying state
sql> select client_process, process, thread#, sequence#, status from v$managed_standby; |
As I mentioned above the broker has
a number of properties which can be changed, to change the property we use the
EDIT command in the DGMGRL CLI
Edit configuration
|
EDIT CONFIGURATION SET PROPERTY
<name>=<value>
EDIT DATABASE <db_name> SET PROPERTY <name>=<value> EDIT INSTANCE <in_name> SET PROPERTY <name>=value> |
Configuration
Properties
|
|
BystandersFollowRoleChange
|
|
FastStartFailoverAutoReinstate
|
|
FastStartFailoverLagLimit
|
|
FastStartFailoverPmyShutdown
|
|
FastStartFailoverThreshold
|
|
CommunicationTimeout
|
the amount of time the broker will
wait for a response before giving up (default 180 seconds), you can remove
any timeout by setting this to 0
|
Database
Properties
|
|
FastStartFailoverTarget
|
|
ObserverConnectIdentifier
|
|
ApplyInstanceTimeout
|
defines how long the broker should
wait until moving the apply process to another instance in a standby RAC,
default is 0 when means immediately
|
PreferredApplyInstance
|
allows you to tell the broker
where you would like the apply to run when you have a RAC, by default it is
empty which means it can choose any instance
|
ArchiveLagTarget
|
alters database parameter:
ARCHIVE_LAG_TARGET
|
DbFileNameConvert
|
alters database parameter:
DB_FILE_NAME_CONVERT
|
LogArchiveMaxProcesses
|
alters database parameter: LOG_ARCHIVE_MAX_PROCESSES
|
LogArchiveMinSuccessDest
|
alters database parameter:
LOG_ARCHIVE_IN_SUCCEED_DEST
|
LogFileNameConvert
|
alters database parameter:
LOG_FILE_NAME_CONVERT
|
LogShipping
(standby role only) |
enables or defers redo transport
to that standby database.
alters database parameter:
LOG_ARCHIVE_DEST_STATE_n
|
StandbyFileManagement
(standby role only) |
alters database parameter:
STANDBY_FILE_MANAGEMENT
|
Instance
Properties
|
|
HostName
|
No explaination needed here
|
SidName
|
No explaination needed here
|
LogArchiveTrace
|
alters database parameter:
LOG_ARCHIVE_TRACE
|
LogArchiveFormat
|
alters database parameter:
LOG_ARCHIVE_FORMAT
|
StandbyArchiveLocation
|
alters database parameter:
LOG_ARCHIVE_DEST_n
|
AlternateLocation
|
alters database parameter:
LOG_ARCHIVE_DEST_n
|
LsbyMaxSga
|
alters database parameter: MAX_SGA
|
LsbyMaxServers
|
alters database parameter:
MAX_SERVERS
|
There are a number of commands that
you can use to change the state of the database
turn off/on the redo transport
service for all standby databases
|
Primary
|
DGMGRL> edit database prod1 set
state=transport-off;
DGMGRL> edit database prod1 set state=transport-on; |
turn off/on the apply state
|
Standby
|
DGMGRL> edit database prod1dr
set state=apply-off;
DGMGRL> edit database prod1dr set state=apply-on; |
put a database into a real-time
query mode
|
Standby
|
DGMGRL> edit database prod1dr
set state=apply-off;
sql> alter database open read only; DGMGRL> edit database prod1dr set state=apply-on; |
change the protection mode
|
Primary
|
# Choose what level of protection
you require
sql> alter database set standby to maximize performance; sql> alter database set standby to maximize availability; sql> alter database set standby to maximize protection;
# display the configuration
DGMGRL> show configuration |
There are a number of useful
monitoring commands and log files that can help with diagnosing problems
configuration
|
DGMGRL> show configuration;
|
database
|
DGMGRL> show database prod1;
DGMGRL> show database prod1dr; # There are a number of specific information commands, here are the most used DGMGRL> show database prod1 statusreport; DGMGRL> show database prod1 inconsistentProperties; DGMGRL> show database prod1 inconsistentlogxptProps; DGMGRL> show database prod1 logxptstatus; DGMGRL> show database prod1 latestlog; |
Logfiles
|
# change the instance name to
reflect the one you have choosen
prod1 (alert log): /u01/app/oracle/diag/rdbms/prod1/PROD1/trace/alert_PROD1.log prod1 (DG log): /u01/app/oracle/diag/rdbms/prod1/PROD1/trace/drcPROD1.log
prod1dr (alert log):
/u01/app/oracle/diag/rdbms/prod1dr/PROD1DR/trace/alert_PROD1DR.log
prod1dr (DG log): /u01/app/oracle/diag/rdbms/prod1dr/PROD1DR/trace/drcPROD1DR.log |
Lastly there is a utility called the
"DG Menu Utility", this can be downloaded from the Oracle Data Guard
11g handbook web site http://www.dataguardbook.com, I will be covering this in my monitoring
section but below is a screen shot of the menu
DG Menu Utility
|
DG Menu
======================================== # ------------------------------------------------------------------------- # # # # Data Guard Check List - primarydg01 # # # 0. Review database information and status # # 1. Check for password file # # 2. Check for forced logging and unrecoverable activities # # ------------------------------------------------------------------------- # # 3. Check for archive log mode # # 4. Check for standby redo logs # # 5. Check current SCN on primary and standby databases # # 6. Check archive log destinations # # ------------------------------------------------------------------------- # # 7. Check Data Guard Status View for errors and fatal messages # # 8. Check Managed Recovery Process Status # # 9. Check for missing archive logs # # 10. Check archive log gaps on the standby database # # 11. Check average apply rate / active apply rate # # 12. Check transport / apply lag # # 13. How far behind is my Data Guard in terms of time? # # # # ------------------------------------------------------------------------- # # 20. Launch the Logical Standby Data Guard Submenu # # ------------------------------------------------------------------------- # # 21. Generate init.ora entries for primary database # # 22. Generate init.ora entries for standby database # # 23. Generate tnsnames.ora entries for primary and standby databases # # 24. Generate SQL syntax to create standby redo logs # # # # ------------------------------------------------------------------------- # # 30. Generate syntax to duplicate standby database from active database # # # # x. Exit # # ------------------------------------------------------------------------- # # Enter Task Number:
#
------------------------------------------------------------------------- #
# # # Logical Standby Data Guard Check List - primarydg01 # # # 1. Check Logical Progress - View Overall Progress Of SQL Apply # # 2. Check Logical Events - History on Logical Standby Apply Activity # # 3. Check Logical Events - Detailed View # # 4. Check Logical Stats - Logical Standby Stats # # 5. Check Logical Parameters - Logical Standby Parameters # # 6. Look At What The Logical Standby Processes Are Doing # # Coordinator, Reader, Builder, Preparer, Analyzer, Applier ... # # 7. Look At The Status Codes For The Logical Standby Processes # # 8. Look At Events The Applier Process Is Stuck On # # ------------------------------------------------------------------------- # # 10. Check the LCR - Look At Bytes Paged Out # # 11. Generate Syntax To Skip Transactions # # Based On MAX(EVENT_TIME) FROM DBA_LOGSTDBY_EVENTS # # DO NOT SKIP DML STATEMENTS # # 12. Diagnostic Script Per Metalink Note ID: 241512.1 # # Look for output in logical_diag_[ORACLE_SID_MONDD_HHMM.out] format # # ------------------------------------------------------------------------- # # 20. Review What Is NOT Supported In Your Logical Standby Database # # 21. Review Tables That Do NOT have Unique Identifiers # # 22. Check Primary Database For Supplemental Logging # # # # ------------------------------------------------------------------------- # # 30. Start Logical Standby Database # # 40. Stop Logical Standby Database - PLEASE BE CAREFUL !!!!! # # THIS WILL STOP THE LOGICAL STANDBY APPLY PROCESS # # ------------------------------------------------------------------------- # # # # x. Exit # # ------------------------------------------------------------------------- # # Enter Task Number: |
Remove the broker
|
|
No comments:
Post a Comment