.: Physical Standby Database

Why we need Data Guard:

High Availability

Disaster Recovery

Data Protection

Workload Offloading & Testing

How Data Guard Works:

In Simple Terms, Dataguard Track the Changes on Primary Database and Replicate the same to Standby Databases through Redo Logs/Archive Logs.

Evolution of Oracle Dataguard:

Oracle Dataguard initially called as Standby database which introduced in version 7.3

In that version, The Redo files has to be copied manually and the database has to be recovered manually.

In 8i Version, Oracle Introduced Automated Archivelog Shipping and Managed Recovery Mode. But there are no option for log Delay.

In 9i Version, Oracle Renamed it as Oracle Dataguard and Introduced Dataguard Broker with Switchover and Delay options. But there are no failover option and after the switchover, the other database has to be rebuild manually.

In 10g version, Oracle Introduced Real time Apply with Fast Start Failover option. Oracle also introduced Database Flashback option along with Guaranteed Restore points.

In 11g version, Oracle Introduced Active Standby and Snapshot Standby Options. Also, the Block CHange Tracking enabled at Standby through which the RMAN incremental Backups can be offloaded.

In 12c version, Oracle Introduced Multi Instance Redo Apply and Parallel Multi instance REcovery which enabled all Nodes of RAC Instance participated in Recovery Operation incase if the standby database is RAC Configured. THe Support for Diagnostic, tuning packs enabled to create awr reports and analyze them with SQL plan analyzer.

In 18c, Oracle Introduced Recovery of standby database using Service. through this we will be able to refresh the standby database including controlfile, datafile and newly added datafiles Just with a single command. We can also create Global Temporary Tables for Testing purpose.

In 19c, Oracle Introduced 2 exciting features..

First is Active DG DML Redirect, through which even the DML Statements can be executed directly on Standby Database which will be applied to Primary Database as well which reduces the workload of Primary Database. Also we have automatic Standby recovery through Flashback feature by which we dont need to flashback/rebuild the standby database after Primary Flashback..

Types of Standby databases:

Classified as Physical and Logical Standby databases.

Physical Standby, the Redo data gets applied, to make the standby in sync.

Logical Standby, The corresponding SQL Statment gets applied to make standby database sync with Primary.

Usually the Physical Standby is in Mount Mode. But the Active Standby Option enables it open with Read Only Mode along with the changes on Primary gets applied on Standby database. This Active Dataguard feature useful for Offloading Reporting queries along with Backup jobs etc.

The Snapshot Standby Features enables the database open in Read-Write Mode with a Guaranteed Restrore point created along with Flashback. This enables the standby database to perform Code Testing and Load Testing Activities. Once the activity completed, the Database put back in physical standby and all the database changes are flashed back.

What are the services required on the primary and standby database ?

The Redo Buffer on Primary keep track of all changes on the database which then Written to Online Redo Logs using LGWR Process.

The Online Redo Log content written to Archived Redo Log files using ARChiver Process.

Initially, the Archivelog Were manually copied to Standby database and applied using Recover Standby database Option.

Later, Oracle Introduced Managed standby Recovery which introduced RFS (Remote File System) and MRP (Managed Recovery Process). After this, the Archivelogs are shipped automatically using Remote Archival method and applied on Standby database which is in Managed Standby Mode.

But, Incase of Primary Database Crash, the Committed data on Online Redo Logs couldnt be recovered. This result in Data Loss. To Avoid this scenario, Oracle Introduced Standby Redo Logs.

The services required on the primary database are:

• Log Writer Process (LGWR) - Collects redo information and updates the online redo logs. It can also create local archived redo logs and transmit online redo to standby databases.

• Archiver Process (ARCn) - One or more archiver processes make copies of online redo logs either locally or remotely for standby databases.

• Fetch Archive Log (FAL) Server - Services requests for archive redo logs from FAL clients running on multiple standby databases. Multiple FAL servers can be run on a primary database, one for each FAL request. .

The services required on the standby database are:

• Fetch Archive Log (FAL) Client - Pulls archived redo log files from the primary site. Initiates transfer of archived redo logs when it detects a gap sequence.

• Remote File Server (RFS) - Receives archived and/or standby redo logs from the primary database.

• Archiver (ARCn) Processes - Archives the standby redo logs applied by the managed recovery process (MRP).

• Managed Recovery Process (MRP) - Applies archive redo log information to the standby database.

On the Primary Database, Oracle Data Guard uses the Log Writer Process (LGWR) or Archiver Process (ARCH) or Log Writer Network Server Process (LNSn) to collect transactions redo data and ship this data to the standby.

On the standby database Oracle Data Guard uses the Remote File Server (RFS) process to receive the redo records from the primary database, the Managed Recover Process (MRP) to apply redo information/recovery to the physical standby database.

Fetch archive log (FAL) process (physical standby databases only) – This process provides a client/server mechanism for resolving gaps detected in the range of archived redo logs generated at the primary database and received at the standby database. The FAL client requests the transfer of archived redo log files automatically when it detects a gap in the redo logs received by the standby database. The FAL server typically runs on the primary database and services the FAL requests coming from the FAL client. The FAL client and server are configured using the FAL_CLIENT and FAL_SERVER initialization parameters which are set on the standby location.

DB_UNIQUE_NAME:
DB_UNIQUE_NAME specifies a globally unique name for the database. Databases with the same DB_NAME within the same DB_DOMAIN(for example, copies of a database created for reporting or a physical standby) must have a unique DB_UNIQUE_NAME. Every database's DB_UNIQUE_NAME must be unique within the enterprise .The DB_UNIQUE_NAME values of the primary and standby database should be used in the DG_CONFIG setting of the LOG_ARCHIVE_CONFIG parameter.

Specifies a unique name for this database. This name does not change even if the primary and standby databases reverse roles.DB_UNIQUE_NAME is required if you specify the LOG_ARCHIVE_CONFIG parameter. Specify the DG_CONFIG attribute to identify the DB_UNIQUE_NAME for the primary database and each standby database in the Data Guard configuration. The default value of this parameter enables the primary database to send redo data to remote destinations and enables standby databases to receive redo data.

ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(DB11G,DB11G_STBY)';
LOG_ARCHIVE_CONFIG='DG_CONFIG=(db_unique_name, db_unique_name, ...)'
this parameter enables the primary database to send redo data to remote destinations and enables standby databases to receive redo data.Specify the DG_CONFIG attribute to identify the DB_UNIQUE_NAME for the primary database and each standby database in the Data Guard configuration.

LOG_FILE_NAME_CONVERT:

This parameter converts the path names of the primary database online redo log file to path names on the standby database.

DB_FILE_NAME_CONVERT:

This parameter must specify paired strings. The first string is a sequence of characters to be looked for in a primary database filename.If that sequence of characters is matched, it is replaced by the second string to construct the standby database filename.

STANDBY_FILE_MANAGEMENT:

Set the STANDBY_FILE_MANAGEMENT parameter to AUTO so that when datafiles are added to or dropped from the primary database,x`corresponding changes are made in the standby database without manual intervention.

Synchronous transport (SYNC) is also referred to as "zero data loss" method because the LGWR is not allowed to acknowledge a commit has succeeded until the LNS can confirm that the redo needed to recover the transaction has been written at the standby site.

Asynchronous transport (ASYNC) is different from SYNC in that it eliminates the requirement that the LGWR waits for a acknowledgment from the LNS, creating a "near zero" performance on the primary database regardless of distance between the primary and the standby locations.

AFFIRM - the LNS process waits for the RFS to perform a direct I/O on the SRL file before returning a success message
NOAFFIRM - the LNS process will not wait and sends a success message immediately

Modes of Standby databases:

A protection mode is only set on the primary database.It defines the way Oracle Data Guard will maximize a Data Guard configuration for performance, availability, or protection in order to achieve the maximum amount of allowed data loss when the primary database or site fails.
We can between three protection modes, with different logging options. Each mode for your environment has a different impact on availability, costs, data loss, performance, and scalability.

Maximum Protection mode:
Maximum Protection mode with LGWR SYNC AFFIRM option for an environment that requires no data loss and no divergence.
This mode has highest level of data protection and ensures no-data-loss.Data is synchronously transmitted to the standby database from the primary database.Transactions are not committed on the primary database unless the redo data is available on at least one standby database configured in this mode.If the last standby database configured in this mode becomes unavailable, processing stops on the primary database.

Maximum Protection Mode which ensure transaction on primary commits only after it is applied on Standby. There are no Dataloss in Max Protection mode.Here the redo transport type is SYNC & AFFIRM.
Maximum Availability:
Maximum Availability mode with LGWR SYNC AFFIRM option for an environment that needs no data loss but tolerates divergence when sites are temporarily inaccessible.
similar to the maximum protection mode, including zero data loss. Transactions are not allowed to commit until all redo data are written to the online redo logs and propagated to at least one synchronized secondary database.If for any reason, the primary database cannot propagate its redo stream to one secondary database, the primary will NOT shutdown and operates as it it were in maximum performance mode until issues are fixed.

Maximum availability Mode. In This also, the transaction on Primary gets commited only after it is applied on standby database. But If Standby not available due to outage/network issue, the transaction on primary will continue without any impact.. Usually there will not be any data loss. Here also the redo transport is sync & Affirm.

Maximum Performance:
Maximum Performance mode with ARCH/ LGWR ASYNC (AFFIRM or NOAFFIRM) option for an environment that tolerates minimal data loss and divergence when sites are temporarily inaccessible.
In this mode, as the primary database processes transactions, redo data is asynchronously shipped to the standby database.The commit operation of the primary database does not wait for the standby database to acknowledge receipt of redo data before completing write operations on the primary database.If any standby destination becomes unavailable, processing continues on the primary database and there is little effect on primary database performance. This is the default mode.

Maximum performance mode in which the primary database transaction will not wait for Standby replication which improve the performance of the primary database. But there are significant chances of Data Loss.

Important related query:

Gap between DR and Primary:
++++++++++++++++++++++++++
select to_char(sysdate,'YYYY-MM-DD HH24:MI:SS') WHEN,': standby is ' || trim(to_char(1440 * (sysdate - max (next_time)),99999.99) ||' minutes behind') LAG from v$archived_log where applied ='YES';
select to_char(sysdate,'YYYY-MM-DD HH24:MI:SS') WHEN,': standby is ' || trim(to_char(1440 * (sysdate - max (next_time)),99999.99) ||' minutes behind') LAG from $archived_log where applied ='YES';

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

recover managed standby database nodelay parallel 20 disconnect from session;

recover managed standby database parallel 20 delay 240 disconnect from session;
alter database mount standby database;
alter database open read only
recover standby database until time '2015-10-08 05:00:00' using backup controlfile parallel 20;
select distinct to_char(CHECKPOINT_TIME,'DD-MON-YYYY HH24:MI:SS') from v$datafile_header;

set lines 500;
select PROCESS,CLIENT_PROCESS,STATUS,THREAD#,SEQUENCE#,BLOCK#,BLOCKS from v$managed_standby where process like '%MRP%';
select THREAD#,MAX(SEQUENCE#),ROUND((SYSDATE-MAX(FIRST_TIME))*24,2) "Hours Bejind" from v$archived_log where applied='YES' group by THREAD#;

ALTER SYSTEM SET local_listener='DB_NAME_LOCAL' SCOPE=spfile;

SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#)
"Difference" FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#))
ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#))
APPL WHERE ARCH.THREAD# = APPL.THREAD# ORDER BY 1;

alter system set local_listener=TEST_LOCAL scope=both sid='TEST1';
alter system set remote_listener=TESTD_REMOTE scope=both sid='TEST';
set linesize 150
set pagesize 500
col name for a70
col "COMPLETION_TIME" for a25
select NAME,ARCHIVED,APPLIED,DELETED,STATUS,REGISTRAR,to_char(COMPLETION_TIME,'DD-MON-YYYY HH24:MI:SS')"COMPLETION_TIME" from v$archived_log where APPLIED='NO' order by name; 12:04 PM
alter database register logfile '/oracle/archive/EPICPROD/fs01/EPICPROD.arch2_184958_809040320.dbf'; 12:05 PM
Select thread#, sequence#, applied, to_char(first_time, 'mm/dd/yy hh24:mi:ss') first, to_char(next_time, 'mm/dd/yy hh24:mi:ss')
next, to_char(completion_time, 'mm/dd/yy hh24:mi:ss') completion from v$archived_log order by first_time;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DELAY 240 DISCONNECT FROM SESSION;----------------start MRP---in 1g

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING ARCHIVED LOGFILE DISCONNECT;-------start MRP in 12c

Command to check throughput:::::::::::::::::::::::::
alter session set nls_date_format = 'dd-mm-yyyy hh24:mi:ss';
Select To_char(START_TIME) "Recover_start",To_char(Item)||' = '||To_char(Sofar)||' '||To_char(Units)||' '||
To_char(TIMESTAMP) "Values" From v$Recovery_progress Where Start_time=(Select Max(Start_time) From v$Recovery_progress);

Check MRP Process and Lag for DR:::::::::
select process,client_process,status,thread#,sequence#,block#,blocks from v$managed_standby where PROCESS like '%MRP%';

select thread#,max(sequence#),round((sysdate-max(first_time))*24,2) "Hours Behind" from v$archived_log where Applied='YES' group by thread#;

DG Details:
++++++++++++
show parameter dg;
dgmgrl sys/<password>
show configuration verbose

Standby Redo Logs (SRL):

After the Introduction of Standby Redo Log files, The Dataguard read the content of Online Redo Logs and write them directly to Standby Redo Logs which was used for Managed Recovery.

Here comes the 2 new Processes. that is Network Server Async Process (NSA) and Network Server Sync Process (NSS). Prior to 12c, the Log-Write Network Server Process (LNS) Process which is was used instead of NSA and NSS.

If the database configured with max performance mode which is Async replication, then NSA Process will read the change data from Online Redo Logs which gets written to Standby Redo Log files.

If the Database in Max Protection or Max Availability Mode which is Sync replication, then NSS Process will read the change data directly from LGWR Process which will be writtent to Standby Redo Log files.

if you configured Real time apply for the dataguard, then the changes applied to standby database immediately.

Incase, if Dataguard configured with Log Delay, then Standby redo Log content written to Remote Archive Files and applied after a certain time.

In case if there is a Lag, RFS process will connect with Primary Archivelog files and get the required data to make the standby database in sync with primary.

What & Why on Standby Redo Logs (SRL):

The standby redo log file is similar to an online redo log, exception is that a standby redo log is used to store redo data that has been received from another database

If standby database is in maximum protection or maximum availability modes, then Standby Redo Logs are mandatory. In Max Protection and Max Availability Modes, the Standby Redo Logs reduce the Transaction Wait time on Standby Commit. But Even if your database is in Max Performance mode, then the Standby Redo Logs can be used to minimize the Data Loss.

Standby Redo Logs are needed, If you are using Real-Time Apply on Standby Database. Also, Standby Redo Logs used in Cascaded Standby databases where the Redo data read from another standby database.

Standby Redo Log Creation:

Just add Standby Clause in Alter database add Logfile command. so the command will be “alter database add Standby Logfile size <>.

Best Practices of Standby Redo Logs

Create the Standby Redo Logs with the same exact byte size as the Online Redo Log files. If they can’t be the same exact size, make sure they are bigger than the Online Redo Log files.

Standby Redo Log Count Should be Equal or Higher than Online Redo Logs. The Best practice is to have 1 Standby Redo Log file higher than Online Redo Log files.

When you create SRLs in the standby, create SRLs in the primary as well. They will normally never be used. But When you do switchover, These files will be used after the Primary database converted to Standby Database.

For an Oracle RAC primary database, create the number of SRLs equal or higher than the number of ORLs in all primary instances. For example, if you have a 3-node RAC database with 4 ORLs in each thread, create 12 SRLs (3×4) in your standby.

Standby redo logs can be created even after the standby has been created. In this case create the SRL’s on the primary before the creation of SRL on the standby database.

Other Related details can be availed from below topic:

create standby
switch-over

Monday, June 12, 2017

Physical Standby Database

No comments:

Post a Comment