
Saturday, April 30, 2022

Long running concurrent request in R12.2:

Before proceeding further, collect the following basic information from the development team:

1. Is it an Oracle seeded program or a custom program?

2. How long did it take in earlier runs?

3. Has there been any recent code change in the concurrent program?

4. Is the program fetching more data compared to the last run?

5. Does the job run at a specific time, or can it be run at any time?

6. Does the job fetch data over a DB link?

7. Does the problem happen on both the test and production instances?

Meanwhile, we need to start basic sanity checks in our environment.

Verify the status of the concurrent program:

select REQUEST_ID, phase_code, status_code, ORACLE_SESSION_ID from apps.fnd_concurrent_requests where request_id='1234567';

Find which concurrent manager ran a specific concurrent request:

select b.USER_CONCURRENT_QUEUE_NAME from fnd_concurrent_processes a,fnd_concurrent_queues_vl b, fnd_concurrent_requests c where a.CONCURRENT_QUEUE_ID = b.CONCURRENT_QUEUE_ID and a.CONCURRENT_PROCESS_ID = c.controlling_manager and c.request_id ='856272604';

Verify the actual and target of the CM:

Verify the concurrent manager status
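The actual and target process counts, and whether the managers are up, can be checked with a query along these lines (a sketch against the seeded FND_CONCURRENT_QUEUES table; filter on the manager name as needed):

```sql
-- Compare target (MAX_PROCESSES) vs actual (RUNNING_PROCESSES) per manager
select concurrent_queue_name,
       max_processes     target_processes,
       running_processes actual_processes
  from apps.fnd_concurrent_queues
 where enabled_flag = 'Y';
```

If actual is below target for the manager that runs the request, investigate the manager before the program itself.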

To find the SID for the concurrent request, use the query below:

set lines 200

set pages 200

col USERNAME for a10

col MODULE for a25

col ACTION for a25

col PROGRAM for a18

SELECT a.request_id, d.sid, d.serial# ,d.osuser,d.process , c.SPID,d.EVENT FROM apps.fnd_concurrent_requests a, apps.fnd_concurrent_processes b, v$process c, v$session d WHERE a.controlling_manager = b.concurrent_process_id AND c.pid = b.oracle_process_id AND b.session_id=d.audsid AND a.request_id = &Request_ID AND a.phase_code = 'R';

To find any blocking session for the concurrent request, use the query below and check it against the concurrent program's SID:

select (select username from v$session where sid=a.sid) blocker, a.sid, ' is blocking ' "IS BLOCKING", (select username from v$session where sid=b.sid) blockee,b.sid from v$lock a, v$lock b where a.block = 1 and b.request > 0 and a.id1 = b.id1 and a.id2 = b.id2;

If there is a blocking session, use the query below to find its concurrent request ID:

SELECT s.inst_id,a.request_id,s.sid,s.serial#,c.spid FROM apps.fnd_concurrent_requests a, gv$process c, gv$session s WHERE s.sid in ('&sid') AND s.paddr = c.addr AND a.oracle_process_id = c.spid AND a.phase_code = UPPER ('R');

Find the client identifier, module, and action for the concurrent request.

Then check with the module owner about the concurrent request:

select sid,serial#,USERNAME,CLIENT_IDENTIFIER,status,LAST_CALL_ET/60,SQL_HASH_VALUE,module,action,program from v$session where sid='&n';

Find which SQL query is running:

select sql_text from v$sql where hash_value='12345678';

Find tables with stale statistics that are used by the concurrent program:

select table_name, stale_stats, last_analyzed from dba_tab_statistics where stale_stats='YES';

You may have to gather statistics on the tables that have stale stats.
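Statistics can be re-gathered with DBMS_STATS; a minimal sketch (the owner and table name are placeholders):

```sql
-- Re-gather optimizer statistics on one table reported as stale
begin
  dbms_stats.gather_table_stats(
    ownname => 'APPS',
    tabname => 'SOME_TABLE',  -- placeholder: a table used by the program
    cascade => TRUE);         -- also gather index statistics
end;
/
```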

Use OEM to monitor the session ID.

If the development team asks for a TKPROF, AWR, or ASH report, generate it and analyze the findings.
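As a sketch, the AWR and ASH reports come from the standard scripts under $ORACLE_HOME/rdbms/admin, and TKPROF is run against the raw trace file (the trace file names below are placeholders):

```shell
# AWR report (the script prompts for snapshot range and output file)
sqlplus / as sysdba @$ORACLE_HOME/rdbms/admin/awrrpt.sql

# ASH report
sqlplus / as sysdba @$ORACLE_HOME/rdbms/admin/ashrpt.sql

# TKPROF the session's raw SQL trace, sorted by elapsed execution time
tkprof ORCL_ora_12345.trc ORCL_ora_12345.txt sys=no sort=exeela
```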

Tuesday, April 19, 2022

Users with sysdba can log in to the database without a password!

 connect / as sysdba


They logged into the OS didn't they - they provided a password then!

"as sysdba" is extremely powerful, it uses OS AUTHENTICATION, it does not require database authentication (it is used to login before there is a "database" or even an "instance"!)

/ as sysdba - connect to LOCAL instance as specified in the environment variables ORACLE_HOME + ORACLE_SID

sys/a@fstest as sysdba - connect OVER A NETWORK to a remote database, using the credentials sys/a - eg: resolve (using the tnsnames.ora or whatever naming you are using) fstest to a REMOTE MACHINE and connect to it.

A local "/ as sysdba" connection won't use the password file; it uses OS authentication.

orapwd - the password file used for remote authentication - won't help you here at all, it would be used over the network, but not for local connections.


remove them from the DBA group (the ability to connect / as sysdba goes away)

grant them sysdba (this will put them into the password file - you need to have remote login password file set up - you might have to use orapwd to create an empty one if you don't have one yet).
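Creating the password file with orapwd looks like this sketch (the password and entry count are placeholders):

```shell
# Create a password file so SYSDBA can be granted to non-OS-authenticated users
orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=MySecret123 entries=5
```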

When Oracle is installed on a Windows OS, it creates a local group named 'ORA_DBA' (Oracle DBA Group) with the local user 'Administrator' as a member by default. Membership of this local group grants its members the privilege to connect as sysdba without a password, as they are authenticated at the OS level.

When you connect as sysdba by issuing 'CONNECT / AS SYSDBA', Oracle checks whether your account is a member of the OS group 'ORA_sid_DBA' or 'ORA_DBA'.

Operating System Group    UNIX User Group    Windows User Group

OSDBA                     dba                ORA_DBA

OSOPER                    oper               ORA_OPER

This is why REMOTE_OS_AUTHENT must always be set to FALSE; otherwise anyone with a laptop can connect as a privileged user.

Password file:

Set the initialization parameter REMOTE_LOGIN_PASSWORDFILE to exclusive, shared, or none.

exclusive specifies that only one instance can use the password file and that the password file can contain users other than SYS.

shared specifies that multiple instances (for example, an Oracle RAC environment) can use the password file.

none specifies that Oracle Database ignores the password file.


=========

Let’s say we want to create a database user who logs in with their OS account. To do so, we will use the “external authentication” option offered by oracle.

Check the value of the os_authent_prefix parameter

SQL> show parameter os_authent_prefix

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

os_authent_prefix                    string      ops$

The ops$ value is the prefix that Oracle will use for each OS account. As we can see, our user is prefixed with ops$. Oracle uses the prefix to recognize OS users when they attempt to connect; it is also a way for Oracle to separate a database account from an OS account.

Let’s create the external user account in the database

SQL> create user ops$xxxx identified externally;

Give the necessary privileges

SQL> grant connect, resource, create session to ops$xxxx;

Let's create the OS user named xxxx and set his password.

[root@dbserver ~]# useradd -g oinstall -G dba,oper,asmdba xxxx

[root@dbserver ~]# passwd xxxx

Open a new session and connect with the newly created operating system (OS) user "xxxx":

login as: xxxx

xxxx@192.168.54.10's password

Set the database environment:

[xxxx@dbserver ~]$ . oraenv

ORACLE_SID = [xxxx] ? achats

ORACLE_BASE environment variable is not being set since this

information is not available for the current user ID xxxx.

You can set ORACLE_BASE manually if it is required.

Resetting ORACLE_BASE to its previous value or ORACLE_HOME

The Oracle base has been set to /u01/app/oracle/product/12.1.0/db

Here, our target database is called achats

Connect to database

[xxxx@dbserver ~]$ sqlplus /

SQL*Plus: Release 12.1.0.2.0 Production on Mon Dec 23 09:47:48 2019

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics

and Real Application Testing options

SQL>

Now the OS user named "xxxx" can connect to the database without being asked for a password.

Monday, April 18, 2022

Interview questions - Upgrade

 What happens when you give "STARTUP UPGRADE"?

$sqlplus "/as sysdba"

SQL> STARTUP UPGRADE

Note: The UPGRADE keyword enables you to open a database based on an earlier Oracle Database release. It also restricts logons to AS SYSDBA sessions, disables system triggers, and performs additional operations that prepare the environment for the upgrade.

You might be required to use the PFILE option to specify the location of your initialization parameter file.
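For example, as a sketch (the pfile path is a placeholder):

```sql
SQL> STARTUP UPGRADE PFILE=/u01/app/oracle/product/19.0.0/dbhome_1/dbs/initORCL.ora
```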

Once the database is started in upgrade mode, only queries on fixed views execute without errors until the catupgrd.sql script is run. Before catupgrd.sql is run, queries on any other view or any use of PL/SQL return an error.



Friday, April 15, 2022

Interview Questions and Answers - RAC

 What is RAC?

Oracle RAC allows multiple instances to access a single database.
A cluster is a high-availability feature that eliminates single points of failure.

Why do we have to create odd number of voting disk?

As far as voting disks are concerned, a node must be able to access strictly more than half of the voting disks at any time. So if you want to be able to tolerate a failure of n voting disks, you must have at least 2n+1 configured. (n=1 means 3 voting disks). You can configure up to 32 voting disks, providing protection against 15 simultaneous disk failures.

Oracle recommends that customers use 3 or more voting disks (Oracle RAC 10g Release 2 onward). Note: for best availability, the 3 voting files should be on physically separate disks. An odd number is recommended because 4 disks are no more highly available than 3: a node must access more than half of the disks, so once 2 disks are lost the cluster fails whether there are 3 or 4 voting disks.

Does the cluster actually check the vote count before node eviction? If yes, could you explain this process briefly?

Yes. If you lose half or more of all of your voting disks, nodes get evicted from the cluster, or nodes kick themselves out of the cluster.

How does OCSSD start if the voting disk & OCR reside in ASM diskgroups?

You might wonder how CSSD, which is required to start the clustered ASM instance, can be started if voting disks are stored in ASM?

This sounds like a chicken-and-egg problem:

without access to the voting disks there is no CSS, hence the node cannot join the cluster.

But without being part of the cluster, CSSD cannot start the ASM instance.

To solve this problem the ASM disk headers have new metadata in 11.2:

you can use kfed to read the header of an ASM disk containing a voting disk.

The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file. This does not require the ASM instance to be up.

Once the voting disks are located, CSS can access them and joins the cluster.
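For example, as a sketch (the disk path is a placeholder):

```shell
# Read the ASM disk header and show where the voting file lives on the disk
kfed read /dev/oracleasm/disks/DATA1 | grep -E 'kfdhdb.vfstart|kfdhdb.vfend'
```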

What is gsdctl in RAC? list gsdctl commands in Oracle RAC?

GSDCTL stands for Global Service Daemon Control, we can use gsdctl commands to start, stop, and obtain the status of the GSD service on any platform.

The options for gsdctl are:-

$ gsdctl start -- To start the GSD service

$ gsdctl stop  -- To stop the GSD service

$ gsdctl stat  -- To obtain the status of the GSD service

Log file location for gsdctl:

$ ORACLE_HOME/srvm/log/gsdaemon_node_name.log

What is Oracle RAC One Node?

Oracle RAC One Node is a single instance running on one node of the cluster while the 2nd node is in cold standby mode. If the instance fails, RAC One Node detects it and restarts the instance on the same node, or relocates the instance to the 2nd node if there is a failure or fault on the 1st node. The benefit of this feature is that it provides a cold failover solution and automates instance relocation without downtime or manual intervention. Oracle introduced this feature with 11gR2 (available with Enterprise Edition).

What is RAC and how is it different from non RAC databases?

Oracle Real Application clusters allows multiple instances to access a single database, the instances will be running on multiple nodes.

In Real Application Clusters environments, all nodes concurrently execute transactions against the same database.

Real Application Clusters coordinates each node's access to the shared data to provide consistency and integrity.

What are the advantages of RAC (Real Application Clusters)?

Reliability - if one node fails, the database won't fail

Availability - nodes can be added or replaced without having to shutdown the database

Scalability - more nodes can be added to the cluster as the workload increases

What is Cache Fusion?

Oracle RAC is composed of two or more instances. When a block of data is read from datafile by an instance within the cluster and another instance is in need of the same block, it is easy to get the block image from the instance which has the block in its SGA rather than reading from the disk. To enable inter instance communication Oracle RAC makes use of interconnects. The Global Enqueue Service (GES) monitors and Instance enqueue process manages the cache fusion.

What command would you use to check the availability of the RAC system?

crs_stat -t -v (-t -v are optional)
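crs_stat is the 10g-era tool; from 11gR2 onward it is deprecated and the equivalent check (a sketch) is:

```shell
# Tabular status of all cluster resources
crsctl stat res -t
```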

How do we verify that RAC instances are running?

SQL>select * from V$ACTIVE_INSTANCES;

The query gives the instance number under the INST_NUMBER column and host_name:instance_name under the INST_NAME column.

How can you connect to a specific node in a RAC environment?

In tnsnames.ora, ensure that you have INSTANCE_NAME specified in the CONNECT_DATA section.
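A TNS entry pinned to a specific instance might look like this sketch (host, port, and names are placeholders):

```text
# tnsnames.ora entry that always connects to instance RACDB1
RACDB_NODE1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = RACDB)
      (INSTANCE_NAME = RACDB1)
    )
  )
```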

Which is the "MASTER NODE" in RAC?

The node with the lowest node number will become master node and dynamic remastering of the resources will take place.

To find out the master node for particular resource, you can query v$ges_resource for MASTER_NODE column.

To find out which node is the master, check the ocssd.log file and search for "master node number". When the current master node fails, the lowest-numbered surviving node becomes the master.
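For example, a sketch of the v$ges_resource query mentioned above:

```sql
-- Which node masters each GES resource (sample of rows)
select resource_name, master_node
  from v$ges_resource
 where rownum <= 10;
```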

What components in RAC must reside in shared storage?

All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared storage.

Give few examples for solutions that support cluster storage?

·ASM (automatic storage management),

·Raw disk devices,

·Network file system (NFS),

·OCFS2 and

·OCFS (Oracle Cluster File System).

What are Oracle Cluster Components?

1.Cluster Interconnect (HAIP)

2.Shared Storage (OCR/Voting Disk)

3.Clusterware software

4.Oracle Kernel Components

What are Oracle RAC Components?

VIP, Node apps etc.

What are Oracle Kernel Components?

Basically, the Oracle kernel needs to be relinked with the RAC ON option when you convert to RAC; that is the difference, as it enables a few RAC background processes like LMON, LCK, LMD, and LMS.

How to turn on RAC?

# link the oracle libraries

$ cd $ORACLE_HOME/rdbms/lib

$ make -f ins_rdbms.mk rac_on

# rebuild oracle

$ cd $ORACLE_HOME/bin

$ relink oracle

Disk architecture in RAC?

SAN (Storage Area Networks) - generally using fibre to connect to the SAN

NAS (Network Attached Storage) - generally using a network to connect to the NAS using either NFS or iSCSI

What is Oracle Clusterware?

The Clusterware software allows nodes to communicate with each other and forms the cluster that makes the nodes work as a single logical server.

The software is run by the Cluster Ready Services (CRS) using the Oracle Cluster Registry (OCR) that records and maintains the cluster and node membership information and the voting disk which acts as a tiebreaker during communication failures. Consistent heartbeat information travels across the interconnect to the voting disk when the cluster is running.

Real Application Clusters

Oracle RAC is a cluster database with a shared cache architecture that overcomes the limitations of traditional shared-nothing and shared-disk approaches to provide a highly scalable and available database solution for all your business applications. Oracle RAC provides the foundation for enterprise grid computing.

Oracle’s Real Application Clusters (RAC) option supports the transparent deployment of a single database across a cluster of servers, providing fault tolerance from hardware failures or planned outages. Oracle RAC running on clusters provides Oracle’s highest level of capability in terms of availability, scalability, and low-cost computing.

One database opened by multiple instances, so the database will be highly available if an instance crashes.

Cluster Software. Oracle Clusterware, or third-party products such as Veritas, is required to provide the cluster support that lets each node know which nodes belong to the cluster and are available, and, with Oracle Clusterware, which nodes have failed, so they can be evicted from the cluster and errors on those nodes cleared.

Oracle Clusterware has two key components: the Cluster Registry (OCR) and the Voting Disk.

The cluster registry holds all information about nodes, instances, services, and ASM storage if used; it also contains state information, i.e. whether they are available and up.

The voting disk is used to determine if a node has failed, i.e. become separated from the majority. If a node is deemed to no longer belong to the majority, it is forcibly rebooted and will, after the reboot, add itself again to the surviving cluster nodes.

What are the Oracle Clusterware key components?

Oracle Clusterware has two key components: the Cluster Registry (OCR) and the Voting Disk.

What is Voting Disk and OCR?

Voting Disk:

Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.

A node must be able to access more than half of the voting disks at any time.

For example, if you have 3 voting disks configured, then a node must be able to access at least two of the voting disks at any time. If a node cannot access the minimum required number of voting disks it is evicted, or removed, from the cluster.

Oracle Cluster Registry (OCR) 

The cluster registry holds all information about nodes, instances, services, and ASM storage if used; it also contains state information, i.e. whether they are available and up.

The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

What are the administrative tasks involved with voting disk?

Following administrative tasks are performed with the voting disk :

1) Backing up voting disks

2) Recovering Voting disks

3) Adding voting disks

4) Deleting voting disks

5) Moving voting disks

Can you add voting disk online? Do you need voting disk backup?

Yes. As per the documentation, if you have multiple voting disks you can add one online; but if you have only one voting disk, the cluster will be down once it is lost, so you need to start CRS in exclusive mode and add the voting disk using:

crsctl add votedisk <path>

What is the Oracle Recommendation for backing up voting disk?

Oracle recommends us to use the dd command to backup the voting disk with a minimum block size of 4KB.

How do we backup voting disks?

1) Oracle recommends that you back up your voting disk after the initial cluster creation and after we complete any node addition or deletion procedures.

2) First, as root user, stop Oracle Clusterware (with the crsctl stop crs command) on all nodes. Then, determine the current voting disk by issuing the following command:

crsctl query css votedisk

3) Then, issue the dd or ocopy command to back up a voting disk, as appropriate.

Give the syntax of backing up voting disks:-

On Linux or UNIX systems:

dd if=voting_disk_name of=backup_file_name

where,

voting_disk_name is the name of the active voting disk

backup_file_name is the name of the file to which we want to back up the voting disk contents

On Windows systems, use the ocopy command:

ocopy voting_disk_name backup_file_name

How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command : ocrconfig -showbackup

You have lost OCR disk, what is your next step?

The cluster stack will be down because CSSD is unable to maintain integrity; this is true in 10g. From 11gR2 onwards the CRSD stack will be down, while OHASD is still up and running. You can add the OCR back by restoring an automatic backup or importing a manual backup.

What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.

GC CR request :the time it takes to retrieve the data from the remote cache

Reason: RAC Traffic Using Slow Connection or Inefficient queries (poorly tuned queries will increase the amount of data blocks requested by an Oracle session. The more blocks requested typically means the more often a block will need to be read from a remote instance via the interconnect.)

GC BUFFER BUSY: the time the remote instance spends locally accessing the requested data block.

What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR Report? 

This is most likely due to a fault in interconnect network.

Check netstat -s

If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.

How do you troubleshoot node reboot?

Please check metalink ...

Note 265769.1 Troubleshooting CRS Reboots

Note.559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions.

Srvctl cannot start an instance and you get the error PRKP-1001 CRS-0215; however, sqlplus can start it on both nodes. How do you identify the problem?

Set the environment variable SRVM_TRACE to TRUE and start the instance with srvctl. You will then get a detailed error stack.
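For example, as a sketch (database and instance names are placeholders):

```shell
# Turn on srvctl tracing, then retry the failing start to get the full error stack
export SRVM_TRACE=TRUE
srvctl start instance -d RACDB -i RACDB1
```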

What are Oracle Clusterware processes for 10g on Unix and Linux?

Cluster Synchronization Services (ocssd) - Manages cluster node membership and runs as the oracle user; failure of this process results in a node restart.

Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user

Event manager daemon (evmd) —A background process that publishes events that crs creates.

Process Monitor Daemon (OPROCD) —This process monitor the cluster and provide I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.

RACG (racgmain, racgimon) —Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

What are Oracle database background processes specific to RAC?

Oracle RAC is composed of two or more database instances. They are composed of memory structures and background processes, same as a single-instance database. Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances have the following additional background processes:

ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process

LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the statuses of each datafile and each cached block using the Global Resource Directory. This process is referred to as cache fusion and helps in data integrity.

What is ACMS?

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures distributed SGA memory updates are globally committed on success or globally aborted in the event of a failure.

What is SCAN listener?

A SCAN listener is additional to the node listeners: it receives incoming database connection requests from clients arriving via the SCAN IP and, using its configured endpoints, routes each request to a particular node listener.

The SCAN IP can be disabled if not required; however, it is mandatory during RAC installation. Enabling/disabling the SCAN IP is mostly done in Oracle Apps environments for the concurrent manager (a kind of job scheduler in Oracle Apps).

Steps to disable the SCAN IP,

i.  Do not use SCAN IP at the client end.

ii. Stop scan listener

    srvctl stop scan_listener

iii.Stop scan

    srvctl stop scan (this will stop the SCAN VIPs)

iv. Disable scan and disable scan listener

    srvctl disable scan
    srvctl disable scan_listener

What are the different network components in 10g RAC?

public, private, and vip components

Private interfaces is for intra node communication.

VIP is all about availability of the application. When a node fails, the VIP component fails over to another node; this is why all applications should be based on VIP components, meaning TNS entries should have the VIP entry in the host list.

What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch/multiple switches that only the nodes in the cluster can access.

What is the use of cluster interconnect?

Cluster interconnect is used by the Cache fusion for inter instance communication.

How can we configure the cluster interconnect?

· Configure User Datagram Protocol (UDP) on Gigabit Ethernet for cluster interconnects.

· On UNIX and Linux systems, UDP and RDS (Reliable Datagram Sockets) are the protocols used by Oracle Clusterware.

· Windows clusters use the TCP protocol.

What is the purpose of Private Interconnect?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the the clustered nodes. This communication is based on the TCP protocol.

RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead of the standard public IP address. To configure VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node on which the VIP address can accept TCP connections but it cannot accept Oracle connections.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 minutes) before getting an error. As a result, you don't really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to some other node and new node re-arps the world indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

Give situations under which VIP address failover happens?

VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid connection refused error. They don't have to wait for TCP connection timeout messages.

What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services enable us to define rules and characteristics to control how users and applications connect to database instances.

What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing, failover options, and high availability.

What enables the load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.

What are the types of connection load-balancing?

Connection workload management is one of the key aspects when you have RAC instances, as you want to distribute connections to specific nodes/instances or to those that have less load.

There are two types of connection load-balancing:

1.Client Side load balancing (also called as connect time load balancing)

2.Server side load balancing (also called as Listener connection load balancing)

What is the difference between server-side and client-side connection load balancing?

Client-side load balancing is done by the client, which randomizes connection requests across the address list. In server-side load balancing, the listener uses the load-balancing advisory to redirect connections to the instance providing the best service.

Client Side load balancing:- Oracle client side load balancing feature enables clients to randomize the connection requests among all the available listeners based on their load.

A TNS entry that contains all node entries and uses LOAD_BALANCE=ON (on by default) will use connect-time, i.e. client-side, load balancing.

Sample Client Side TNS Entry:-

    finance =

    (DESCRIPTION =

         (ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))

         (ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))

         (ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))

    (LOAD_BALANCE = yes)

    (CONNECT_DATA =

         (SERVER = DEDICATED)

         (SERVICE_NAME = FINANCE) (FAILOVER=ON)

    (FAILOVER_MODE =  (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5))

    )

    )

Server side load balancing:- This improves connection performance by balancing the number of active connections among multiple instances and dispatchers. In a single-instance shared-server environment, the listener selects the least-loaded dispatcher to handle incoming client requests. In a RAC environment, PMON on each instance registers load information with the listeners, and based on that information the listener redirects each connection to the least-loaded node.

In a RAC environment, the *.remote_listener parameter, a TNS entry containing the addresses of all nodes, needs to be set so that PMON can send load information to the listeners on every node.

Sample Tns entry should be in an instances of RAC cluster,

    local_listener=LISTENER_MYRAC1

    remote_listener = LISTENERS_MYRACDB
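The parameter itself is set once for all instances, for example (a sketch using the alias above):

```sql
-- Register every instance's PMON with the listeners on all nodes
alter system set remote_listener='LISTENERS_MYRACDB' scope=both sid='*';
```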

What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single image using the tools below:

·       OEM (Enterprise Manager),

·       SQL*PLUS,

·       Server control (SRVCTL),

·       Cluster Verification Utility (CLUVFY),

·       DBCA,

·       NETCA

Name some Oracle Clusterware tools and their uses?

·OIFCFG - allocating and deallocating network interfaces; also identifies the interconnect being used.

·OCRCONFIG - command-line tool for managing the Oracle Cluster Registry.

·OCRDUMP - dumps the contents of the OCR to a file for inspection.

·CVU - Cluster Verification Utility, used to verify cluster setup and prerequisites.


What is the difference between CRSCTL and SRVCTL?

crsctl manages clusterware-related operations:

    Starting and stopping Oracle Clusterware

    Enabling and disabling Oracle Clusterware daemons

    Registering cluster resources

srvctl manages Oracle resource–related operations:

    Starting and stopping database instances and services

    From 11gR2 onwards it also manages cluster resources such as the network, VIPs and disk groups
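The split described above can be summarized with a small helper (illustrative only; the mapping function is hypothetical, while the tool names crsctl and srvctl are real):

```shell
# Illustrative only: map an administration task to the tool that owns it,
# mirroring the crsctl (clusterware stack) vs srvctl (managed resources) split.
tool_for() {
  case "$1" in
    clusterware-stack|crs-daemons|register-resource) echo "crsctl" ;;
    database|instance|service|vip|network|asm)       echo "srvctl" ;;
    *)                                               echo "unknown" ;;
  esac
}
# Example: tool_for database  -> srvctl
```

As a rule of thumb: crsctl is run as root against the stack itself, srvctl as the oracle/grid owner against the resources the stack manages.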

How do we remove ASM from an Oracle RAC environment?

We first need to stop and delete the database instance on the node, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:

srvctl stop asm -n node_name

srvctl remove asm -n node_name

We can verify if ASM has been removed by issuing the following command:

srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?

Issue the following commands and verify that the instance no longer appears in their output:

srvctl config database -d database_name

cd CRS_HOME/bin

./crs_stat

What are the modes of deleting instances from Oracle Real Application Clusters databases?

We can delete instances using silent mode or interactive mode using DBCA(Database Configuration Assistant).

What are the background processes that exist in 11gR2, and what are their functions?

Process Name     Functionality

crsd     •The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.

cssd     •Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interfaces with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service formerly was provided by Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node.

diskmon     •Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started.

evmd     •Event Manager (EVM): Is a background process that publishes Oracle Clusterware events

mdnsd     •Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows.

gnsd     •Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster.

ons     •Oracle Notification Service (ONS): Is a publish-and-subscribe service for communicating Fast Application Notification (FAN) events

oraagent     •oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).

orarootagent     •Oracle root agent (orarootagent): Is a specialized oraagent process that helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP address

oclskd     •Cluster kill daemon (oclskd): Handles instance/node evictions requests that have been escalated to CSS

gipcd     •Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure

ctssd     •Cluster Time Synchronization Service daemon (ctssd): manages time synchronization between cluster nodes, removing the dependency on NTP

Under which user or owner does each process start?

Component                     Name of the Process         Owner

Oracle High Availability Service         ohasd                 init, root

Cluster Ready Service (CRS)             crsd                 root

Cluster Synchronization Service (CSS)         ocssd, cssdmonitor, cssdagent     grid owner

Event Manager (EVM)                 evmd, evmlogger         grid owner

Cluster Time Synchronization Service (CTSS)     octssd                 root

Oracle Notification Service (ONS)         ons, eons             grid owner

Oracle Agent                     oraagent             grid owner

Oracle Root Agent                 orarootagent             root

Grid Naming Service (GNS)             gnsd                 root

Grid Plug and Play (GPnP)             gpnpd                 grid owner

Multicast domain name service (mDNS)         mdnsd                 grid owner

What is the major difference between 10g and 11g RAC?

There is not much difference between 10g and 11gR1 RAC, but 11gR2 differs significantly.

Up to and including 11gR1 (and 10g), the following were managed by Oracle CRS:

    Databases

    Instances

    Applications

    Node Monitoring

    Event Services

    High Availability

From 11gR2 onwards it is a complete HA stack, managing and providing the following resources much like other cluster software such as VCS:

    Databases

    Instances

    Applications

    Cluster Management

    Node Management

    Event Services

    High Availability

    Network Management (provides DNS/GNS/mDNS services on behalf of other traditional services), SCAN (Single Client Access Name) and HAIP

    Storage Management (with the help of ASM and the new ACFS file system)

    Time synchronization (instead of depending on traditional NTP)

    Removal of the OS-dependent hangcheck timer; hangs are handled by its own additional monitor processes

What is hangcheck timer? 

The hangcheck timer regularly checks the health of the system. If the system hangs or stops responding, the node is restarted automatically.

There are 2 key parameters for this module:

-> hangcheck-tick: this parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.

-> hangcheck-margin: this defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node.

State the initialization parameters that must have the same value for every instance in an Oracle RAC database.

Some initialization parameters are critical at database creation time and must have the same values. Their value must be specified in the SPFILE or PFILE for every instance. The parameters that must be identical on every instance are listed below:

ACTIVE_INSTANCE_COUNT

ARCHIVE_LAG_TARGET

COMPATIBLE

CLUSTER_DATABASE

CLUSTER_DATABASE_INSTANCES

CONTROL_FILES

DB_BLOCK_SIZE

DB_DOMAIN

DB_FILES

DB_NAME

DB_RECOVERY_FILE_DEST

DB_RECOVERY_FILE_DEST_SIZE

DB_UNIQUE_NAME

INSTANCE_TYPE (RDBMS or ASM)

PARALLEL_MAX_SERVERS

REMOTE_LOGIN_PASSWORDFILE

UNDO_MANAGEMENT

What is RAC? What is the benefit of RAC over single instance database?

In Real Application Clusters environments, all nodes concurrently execute transactions against the same database. Real Application Clusters coordinates each node's access to the shared data to provide consistency and integrity.

Benefits:

Improve response time

Improve throughput

High availability

Transparency

Advantages of RAC (Real Application Clusters)

Reliability - if one node fails, the database won't fail

Availability - nodes can be added or replaced without having to shutdown the database

Scalability - more nodes can be added to the cluster as the workload increases

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead of the standard public IP address. To configure VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?

If a node fails, then the node's VIP address fails over to another node, where it can accept TCP connections but cannot accept Oracle database connections; clients therefore get an immediate error rather than waiting for a timeout.

Give situations under which VIP address failover happens:-

VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

Using a virtual IP avoids the TCP/IP timeout problem, because the Oracle Notification Service maintains communication between the nodes and the listeners.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid "connection refused" error; they do not have to wait for TCP connection timeout messages.

What is voting disk?

The voting disk is a file that sits in the shared storage area and must be accessible by all nodes in the cluster. All nodes in the cluster register their heartbeat information in the voting disk, to confirm that they are all operational. If a node's heartbeat information is missing from the voting disk, that node is evicted from the cluster. The CSS (Cluster Synchronization Service) daemon in the clusterware maintains the heartbeat of all nodes to the voting disk. When a node is unable to send its heartbeat to the voting disk, it reboots itself, which helps avoid split-brain syndrome.

For high availability, Oracle recommends that you have an odd number (3 or greater) of voting disks.

The voting disk resides on shared storage and manages cluster membership; it reassigns cluster ownership between the nodes in case of failure.

The voting disk files are used by Oracle Clusterware to determine which nodes are currently members of the cluster. They are also used in concert with other cluster components, such as CRS, to maintain the cluster's integrity.

Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along with the OCR. Oracle Clusterware can access the OCR and the voting disks present in ASM even if the ASM instance is down. As a result CSS can continue to maintain the Oracle cluster even if the ASM instance has failed.

How many voting disks are you maintaining ?

http://www.toadworld.com/KNOWLEDGE/KnowledgeXpertforOracle/tabid/648/TopicID/RACR2ARC6/Default.aspx

By default Oracle will create 3 voting disk files in ASM.

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You should always configure an odd number of voting disks >= 3. This is because loss of more than half your voting disks will cause the entire cluster to fail.

You should plan on allocating 280MB for each voting disk file. For example, if you are using ASM and external redundancy then you will need to allocate 280MB of disk for the voting disk. If you are using ASM and normal redundancy you will need 560MB.

Why we need to keep odd number of voting disks ?

Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You should always configure an odd number of voting disks >= 3. This is because loss of more than half your voting disks will cause the entire cluster to fail.
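The arithmetic behind the recommendation can be sketched in a few lines of shell (illustrative only, not an Oracle utility): the cluster needs a strict majority of voting disks, so an even count tolerates no more failures than the next-lower odd count.

```shell
# Majority rule for voting disks: more than half must stay accessible.
quorum() { echo $(( $1 / 2 + 1 )); }               # disks that must survive
tolerated_failures() { echo $(( ($1 - 1) / 2 )); } # disks you can lose
# 3 disks: quorum 2, tolerates 1 failure
# 4 disks: quorum 3, still tolerates only 1 failure (no gain over 3)
# 5 disks: quorum 3, tolerates 2 failures
```

This is why you add voting disks in odd increments: going from 3 to 4 costs storage without improving fault tolerance.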

What are Oracle RAC software components?

Oracle RAC is composed of two or more database instances, each made up of the same memory structures and background processes as a single-instance database. Oracle RAC instances additionally use two services, the Global Enqueue Service (GES) and the Global Cache Service (GCS), that enable Cache Fusion. Oracle RAC instances include the following background processes:

ACMS—Atomic Controlfile to Memory Service (ACMS)

GTX0-j—Global Transaction Process

LMON—Global Enqueue Service Monitor

LMD—Global Enqueue Service Daemon

LMS—Global Cache Service Process

LCK0—Instance Enqueue Process

RMSn—Oracle RAC Management Processes (RMSn)

RSMN—Remote Slave Monitor

What is TAF?

TAF (Transparent Application Failover) is a configuration that allows session fail-over between different nodes of a RAC database cluster.

With Transparent Application Failover (TAF), if a communication link failure occurs after a connection is established, the connection fails over to another active node. Any disrupted transactions are rolled back, and session properties and server-side program variables are lost. In some cases, if the statement executing at the time of the failover is a SELECT statement, it may be automatically re-executed on the new connection with the cursor positioned on the row on which it was positioned prior to the failover.

After an Oracle RAC node crashes—usually from a hardware failure—all new application transactions are automatically rerouted to a specified backup node. The challenge in rerouting is to not lose transactions that were "in flight" at the exact moment of the crash. One of the requirements of continuous availability is the ability to restart in-flight application transactions, allowing a failed node to resume processing on another server without interruption. Oracle's answer to application failover is a new Oracle Net mechanism dubbed Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.

TAF architecture offers the ability to restart transactions at either the transaction (SELECT) or session level.

What are the requirements for Oracle Clusterware?

1. External shared disk to store the Oracle Clusterware files (voting disk and Oracle Cluster Registry - OCR)

2. Two network cards on each clusterware node (and three sets of IP addresses) -

Network Card 1 (with IP address set 1) for public network

Network Card 2 (with IP address set 2) for private network (for inter node communication between rac nodes used by clusterware and rac database)

IP address set 3 for Virtual IP (VIP) (used as Virtual IP address for client connection and for connection failover)

3. Storage Option for OCR and Voting Disk - RAW, OCFS2 (Oracle Cluster File System), NFS, …..

Which component enables the load balancing of applications in RAC?

Oracle Net Services enables the load balancing of application connections across all of the instances in an Oracle RAC database.

How to find location of OCR file when CRS is down?

You may need to find the location of the OCR (Oracle Cluster Registry) while CRS is down.

When the CRS is down:

Look in the ocr.loc file; the location of this file depends on the OS:

On Linux: /etc/oracle/ocr.loc

On Solaris: /var/opt/oracle/ocr.loc

When CRS is UP:

Set ASM environment or CRS environment then run the below command:

ocrcheck
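The CRS-down lookup can also be scripted. A minimal sketch (the helper function is hypothetical; the ocrconfig_loc key is the standard key stored in ocr.loc):

```shell
# Read the OCR location from ocr.loc without any clusterware process running.
# Pass the path for your OS (Linux: /etc/oracle/ocr.loc,
# Solaris: /var/opt/oracle/ocr.loc).
get_ocr_location() {
  # ocr.loc holds key=value pairs such as: ocrconfig_loc=+DATA
  awk -F= '$1 == "ocrconfig_loc" {print $2}' "$1"
}
```

Usage: get_ocr_location /etc/oracle/ocr.loc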

In a 2-node RAC, how many NICs are used?

2 network cards on each clusterware node

Network Card 1 (with IP address set 1) for public network

Network Card 2 (with IP address set 2) for private network (for inter node communication between rac nodes used by clusterware and rac database)

In a 2-node RAC, how many IPs are used?

Six, i.e. three sets of IP addresses:

## eth1-Public:  2

## eth0-Private: 2

## VIP: 2
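The count generalizes: before SCAN (pre-11gR2), each node needs one public, one private and one virtual IP. A trivial sketch of that arithmetic (11gR2 adds 3 SCAN IPs per cluster on top, not counted here):

```shell
# Pre-11gR2 address count: public + private + VIP, i.e. 3 addresses per node.
rac_ip_count() { echo $(( $1 * 3 )); }
# rac_ip_count 2  -> 6
```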

How to find IP information in RAC?

The /etc/hosts file is set up as shown below:

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1               localhost.localdomain localhost

## Public Node names

 192.168.10.11          node1-pub.hingu.net     node1-pub

192.168.10.22          node2-pub.hingu.net     node2-pub

## Private Network (Interconnect)

 192.168.0.11            node1-prv               node1-prv

192.168.0.22            node2-prv               node2-prv

## Private Network (Network Area storage)

 192.168.1.11            node1-nas               node1-nas

192.168.1.22            node2-nas               node2-nas

192.168.1.33            nas-server              nas-server

## Virtual IPs

 192.168.10.111          node1-vip.hingu.net     node1-vip

192.168.10.222          node2-vip.hingu.net     node2-vip
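Given a hosts file laid out like the one above, the VIP names can be pulled out mechanically. A small illustrative sketch (the -vip naming convention is an assumption carried over from the sample):

```shell
# List the VIP hostnames from a hosts-format file: skip comment lines and
# print the short name (3rd column) of entries ending in -vip.
list_vips() {
  awk '!/^#/ && $3 ~ /-vip$/ {print $3}' "$1"
}
```

Usage: list_vips /etc/hosts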

What is the difference between the RAC IP addresses?

The public IP address is the normal IP address, typically used by the DBA and SA to manage storage, the system and the database. Public IP addresses are reserved for the Internet.

The private IP address is used only for internal cluster processing (Cache Fusion), also known as the interconnect. Private IP addresses are reserved for private networks.

The VIP is used by database applications to enable failover when one cluster node fails: client connections can fail over to the surviving nodes in case of a failure.

Can an application developer access the private IP?

No. The private IP address is used only for internal cluster processing (Cache Fusion), i.e. the interconnect.

How to stop the whole cluster with a single command?

crsctl stop cluster -all (available from 11gR2 onwards) stops the clusterware stack on every node with one command. Without -all the command acts on the local node only, and a single node can be targeted explicitly:

crsctl stop cluster (stops the stack on the local node)

crsctl stop cluster -n <nodename> (stops the stack only on the specified node)

(crsctl stop crs also stops the full stack, including ohasd, but only on the node where it is run.)

CRS is not starting automatically after a node reboot; what do you do to make that happen?

Check the autostart setting with crsctl config crs, and enable it as root with crsctl enable crs.

Can you add a voting disk online? Do you need a voting disk backup?

Yes: as per the documentation, if you have multiple voting disks you can add another online. If you had only one voting disk, the cluster will be down once it is lost; start CRS in exclusive mode and add the voting disk using:

crsctl add votedisk <path>

From 11gR2 onwards the voting disk data is backed up automatically as part of the OCR backup, so a separate manual backup is no longer required.

You have lost the OCR disk; what is your next step?

In 10g the whole cluster stack will be down, because cssd is unable to maintain integrity. From 11gR2 onwards only the crsd stack goes down, while ohasd stays up and running. You can get the OCR back by restoring an automatic backup or by importing a manual export.

What happens when ocssd fails? What is node eviction, and how does node eviction happen? The answer to all of these is the same:

http://db.geeksinsight.com/2012/12/27/oracle-rac-node-evictions-11gr2-node-eviction-means-restart-of-cluster-stack-not-reboot-of-node/

Why are we required to maintain an odd number of voting disks?

An odd number of disks avoids a split-brain tie. When the nodes in a cluster cannot talk to each other, they race to lock the voting disks, and whichever node locks more disks survives. If the number of disks is even, a node might lock exactly half of them (2 out of 4), leaving no way to decide which node to evict.

With an odd number, one side always holds more disks than the other, so the cluster can evict the node holding fewer.


A good reference for RAC: https://dbaclass.com/article/oracle-rac-interview-questions/


Friday, April 8, 2022

Parallel concurrent Processing (PCP) in EBS R12

 What is PCP?

PCP allows the concurrent managers to run across multiple nodes.

How Parallel Concurrent Processing Works?

How to Configure Parallel Concurrent Processing (PCP)

Backup all .ora files

Take a backup of all .ora files, such as the tnsnames.ora, listener.ora and sqlnet.ora files, where they exist under the 10.1.2 and 10.1.3 ORACLE_HOME locations on each node.

Edit Context file

Stop the application services, then edit the applications context file as follows:

Set the value of the variable 

> APPLDCP to ON

> s_applcsf to <shared location between apps servers>

> s_appltmp to <shared location between apps servers>

Edit parameters in spfile (for the transaction manager)

By user having sysdba privilege

alter system set "_lm_global_posts"=true scope=spfile;

alter system set "_immediate_commit_propagation"=true scope=spfile;

Edit utl_file_dir variable

From database node as sysdba

ALTER SYSTEM SET utl_file_dir='<shared temp location>' SCOPE=BOTH sid='*';

Bounce database to reflect the changes made.

Execute AutoConfig

Execute AutoConfig on all concurrent processing nodes:

Check the tnsnames.ora and listener.ora configuration files

Check the tnsnames.ora and listener.ora configuration files, located in $INST_TOP/ora/10.1.2/network/admin. Ensure that the required FNDSM and FNDFS entries are present for all other concurrent nodes.

Start the Applications

Log on to Oracle E-Business Suite Release 12 using the SYSADMIN account, and choose the System Administrator Responsibility.

• Navigate to Install > Nodes screen, and ensure that each node in the cluster is registered.

•Set up the primary and secondary node names

Navigate to Concurrent > Manager > Define, and set up the primary and secondary node names for all the concurrent managers according to the desired configuration for each node workload.

Verify that the Internal Monitor for each node is defined properly, with correct primary node specification, and work shift details. For example, Internal Monitor: Host1 must have primary node as host1. Also ensure that the Internal Monitor manager is activated: this can be done from Concurrent > Manager > Administrator. 

Set profile option 'Concurrent: PCP Instance Check' to OFF if database instance-sensitive failover is not required (In case of Non RAC Database). By setting it to 'ON', a concurrent manager will fail over to a secondary Application tier node if the database instance to which it is connected becomes unavailable for some reason.

Set Up Transaction Managers  (Only R12)

If you are already using the transaction managers and wish to have them fail over, perform the steps below:

  - Shut down the application services (servers) on all nodes

  - Shut down all the database instances cleanly in the Oracle RAC environment, using the command: 

 - SQL>shutdown immediate;

 - Edit the $ORACLE_HOME/dbs/<context_name>_ifile.ora and add the following parameters:

        _lm_global_posts=TRUE

        _immediate_commit_propagation=TRUE

 - Start the instances on all database nodes.

 - Start up the application services (servers) on all nodes.

 - Log on to Oracle E-Business Suite Release 12 using the SYSADMIN account, and choose the System Administrator responsibility. Navigate to Profile > System, change the profile option ‘Concurrent: TM Transport Type' to ‘QUEUE', and verify that the transaction manager works across the Oracle RAC instance.

- Navigate to Concurrent > Manager > Define screen, and set up the primary and secondary node names for transaction managers.

- Restart the concurrent managers.

- If any of the transaction managers are in a deactivated status, activate them from Concurrent > Manager > Administrator.

 Set Up Load Balancing on Concurrent Processing Nodes (Only Applicable in case of RAC)

If you wish PCP to use the load-balancing capability of RAC, perform the steps below. Connections will be load balanced via the load-balancing alias and will connect to all the RAC nodes.

   - Edit the applications context file through the Oracle Applications Manager interface, and set the value of Concurrent Manager TWO_TASK (s_cp_twotask) to the load-balancing alias (<service_name>_balance).

- Execute AutoConfig by running $INST_TOP/admin/scripts/adautocfg.sh on all concurrent nodes.

Is RAC Mandatory to Implement PCP?

  - No, RAC is not mandatory for PCP. If you have two or more application nodes you can enable PCP, but PCP works best in conjunction with RAC to handle all the failover scenarios.

How PCP Works with RAC?

 - In a RAC-enabled environment, PCP uses the cp_two_task variable to connect to a DB RAC node. This can map one CM node to one RAC node, or it can be set so that each CM node connects to all the RAC nodes in the cluster.

What happens when one of the RAC node goes down when PCP enabled?

 - When Concurrent: PCP Instance Check is set to ON and cp_two_task is set to a SID (i.e. one CM node always connects to exactly one RAC node), then if one DB node goes down, PCP identifies the DB failure and shifts all the concurrent managers to the other application node, where the database is available.

What happen when one of the PCP node goes down?

 - IMON identifies the failure and, through FNDSM (the Service Manager), initiates the ICM on a surviving node (if the ICM was running on the failed node). The ICM then starts all the managers.

What is primary and Secondary Nodes in PCP?

 - It is a requirement to define the primary and secondary nodes to distribute load across the servers. If this is not defined, all the managers start by default on the node where the ICM is running.

How Fail Back happens in PCP?

 - Once the failed node comes back online, IMON detects it and the ICM fails back all the managers defined on that node.

What happens to requests running during failover in PCP?

 - It is important to note that failover of in-flight DML is not supported: TAF and FAN are not supported with E-Business Suite.

 - If a CM goes down while a request is running, the request retains the status Running/Normal but has no associated process ID. When the ICM starts on another node, it checks all Running/Normal requests and verifies their OS process IDs; if it does not find the process ID, it resubmits the request.

 -  This behavior is normal even in non-PCP environments.

 - The Internal Concurrent Manager (ICM) will only restart a request if the following conditions are met

 The ICM got the manager's database lock for the manager that was running the request

 The phase of the request is "running" (phase_code = 'R')

 The program for this request is set to "restart on failure"

 All of the above requirements have been met AND at least one of the following:

           a.  The ICM is just starting up (i.e. it has just spawned on a given node and is going through initialization code before the main loop)

           b.  The node of the concurrent manager for which we got the lock is down

           c.  The database instance (TWO_TASK) defined for the node of that concurrent  manager is down (this is not applicable if one is using some "balance" @ TWO_TASK on that node)

How PCP identifies when node goes down?

  - There are two types of failures that PCP recognizes.

 Is the node pingable ? 

 Issues an operating system ping on the machine name - timeout or available.


 Is the database available? 

 Query V$THREAD and V$INSTANCE for a value of open or close.

 - When any of the two above failures occur, the following example will illustrate the failover and failback of managers.

 Primary node = HOST1 - Managers assigned to primary node are ICM (FNDLIBR-cpmgr) , FNDCRM

 Secondary node = HOST2 - Manager assigned to secondary node is Standard Manager (FNDLIBR)

 When HOST1 becomes unavailable, both ICM and FNDCRM are migrated over to HOST2.

 This is viewable from Administer Concurrent Manager form in System Administrator Responsibility.

 The $APPLCSF/log/.mgr logfile will also reflect that HOST1 is being added to unavailable list.

 On HOST2, after pmon cycle, FNDICM, FNDCRM, and FNDLIBR are now migrated and running.

 (Note: FNDIMON and FNDSM run independently on each concurrent processing node. FNDSM

 is not a persistent process, and FNDIMON is a persistent process local to each node)

 Once HOST1 becomes available, FNDICM and FNDCRM are migrated back to the original primary 

 node for successful failback.

In summary, in a successful fail over and failback scenario, all managers should failover to their secondary node, and once node or instance becomes available; then all managers should failback to primary node.

How  PCP works  internally

1) The ICM contacts the TNS listener. The TNS listener must be started on all the CM nodes.

2) TNS Listener spawns Service Manager( FNDSM).Each CM node will have Service Manager( FNDSM) started

3) ICM communicates with Service Manager(FNDSM)

4) Service Manager spawns various Manager and Service processes

5) If ICM crashes due to node failures

6) Internal Monitor will spawn ICM locally when it detects ICM is down

Thursday, April 7, 2022

Oracle DBA Interview Questions - Backup and Recovery

What is difference between Restoring and Recovery of database?

Restoring means copying the database files from the backup media to the destination where they are required, whereas recovery means applying redo to the restored files (rolling forward) in order to bring the database to a consistent state.

What is the difference between complete and incomplete recovery?

An incomplete database recovery does not reach the point of failure: the database is recovered to a point in time, a particular SCN, or a particular archived log, typically because an archived log is missing or the redo logs were lost. A complete recovery recovers all the way to the point of failure, which is possible when all archived log backups are available.

How would you decide your backup strategy and timing for backup?

A backup strategy depends purely on your organization's business needs.

If no downtime is acceptable, the database must run in archivelog mode and you have to take frequent (e.g. daily) backups.

If sufficient downtime is available and loss of data would not affect your business, you can run the database in noarchivelog mode and take backups infrequently, e.g. weekly or monthly.

In most organizations with no downtime window, this means frequent inconsistent (hot) backups (daily), multiplexed online redo log files with copies in different locations, the database running in archivelog mode, and optionally Data Guard for an extra bit of protection.

What is the benefit of running the DB in archivelog mode over no archivelog mode?

When a database is in noarchivelog mode, some redo log information is overwritten (and lost) whenever a log switch happens. To avoid this, the redo logs must be archived, which is achieved by configuring the database in archivelog mode.

If an oracle database is crashed? How would you recover that transaction which is not in backup?

If the database is in archivelog mode we can recover that transaction from the archived redo; otherwise, a transaction that is not in the backup cannot be recovered.

What is the difference between HOTBACKUP and RMAN backup?

For a user-managed hot backup we have to place the database (or tablespace) in begin-backup mode and then copy the files, whereas RMAN does not put the database in begin-backup mode. RMAN is faster, can perform incremental (changed-blocks-only) backups, and never places a tablespace in hot-backup mode.

Can we use Same target database as Catalog database?

No. The recovery catalog should not reside in the target database (the database to be backed up), because the database cannot use a catalog stored inside itself while it is only mounted, and losing the target would also mean losing the catalog.

Incremental backup levels:

Level 0 – full backup that can be used for subsequent incrementals

RMAN> backup incremental level 0 database;

Differential level 1 – only the blocks that have changed since the last backup (whether it was level 0 or level 1)

RMAN> backup incremental level 1 differential database;

Cumulative Level 1 – all changes since the last level 0 incremental backup

RMAN> backup incremental level 1 cumulative database;

A full backup cannot be used for a cumulative level 1 backup.

A cumulative level 1 backup must be done on top of an incremental level 0 backup.

Why RMAN incremental backup fails even though full backup exists?

A level 0 backup is physically identical to a full backup taken with the command 'backup database'. The only difference is that the level 0 backup is recorded as an incremental backup in the RMAN repository, so it can be used as the parent for a level 1 backup. A plain full backup (without level 0) cannot serve as the parent from which a level 1 backup is taken.

Can we perform RMAN level 1 backup without level 0?

If no level 0 is available, then the behavior depends upon the compatibility mode setting (oracle version).

If the compatibility setting is less than 10.0.0, RMAN generates a level 0 backup of the file contents at the time of the backup.

If the compatibility setting is 10.0.0 or greater, RMAN copies all blocks changed since the file was created, and stores the result as a level 1 backup.

How to put Manual/User managed backup in RMAN?

In case of recovery catalog, you can put by using catalog command:

RMAN> CATALOG START WITH ‘/oracle/backup.ctl’;
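
The CATALOG command can also register individual user-managed copies; the file names below are only examples:

RMAN> CATALOG DATAFILECOPY '/backup/users01.dbf';

RMAN> CATALOG ARCHIVELOG '/backup/arch_1_100.arc';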

How to check RMAN version in oracle?

If you want to check the RMAN catalog version, use the below query from SQL*Plus:

SQL> Select * from rcver;

What happens actually in case of instance Recovery?

When an Oracle instance fails, Oracle performs instance recovery as the associated database is restarted. Instance recovery occurs in two steps:

Cache recovery: Changes being made to a database are recorded in the database buffer cache and in the redo log files simultaneously. When enough data has accumulated in the database buffer cache, it is written to the data files. If an Oracle instance fails before this data is written to the data files, Oracle uses the online redo log files to recover the lost data when the associated database is restarted. This process is called cache recovery.

Transaction recovery: When a transaction modifies data in a database, the before image of the modified data is stored in an undo segment, which is used to restore the original values if the transaction is rolled back. At the time of an instance failure, the database may have uncommitted transactions, and it is possible that changes made by these uncommitted transactions were already written to the data files. To maintain read consistency, Oracle rolls back all uncommitted transactions when the associated database is restarted, using the undo data stored in undo segments. This process is called transaction recovery.

What is RMAN?

Recovery Manager (RMAN) is a utility that can manage your entire Oracle backup and recovery activities.

What is the difference between using recovery catalog and control file?

When new incarnation happens, the old backup information in control file will be lost. It will be preserved in recovery catalog.

In recovery catalog we can store scripts.

Recovery catalog is central and can have information of many databases.

Can we use same target database as catalog?

No. The recovery catalog should not reside in the target database (the database being backed up); if the target database is lost, the catalog would be lost with it.

How do you know how much of an RMAN task has been completed?

By querying v$rman_status or v$session_longops
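
For example, a commonly used progress query (an approximation based on the SOFAR and TOTALWORK columns) is:

SQL> SELECT sid, serial#, opname, round(sofar/totalwork*100,2) "% COMPLETE" FROM v$session_longops WHERE opname LIKE 'RMAN%' AND totalwork != 0 AND sofar <> totalwork;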

From where do the LIST and REPORT commands get their input?

Both commands query V$ views and recovery catalog views, for example V$BACKUP_FILES, or recovery catalog views such as RC_DATAFILE_COPY and RC_ARCHIVED_LOG.

Command to delete archive logs older than 7days?

RMAN> delete archivelog all completed before 'sysdate-7';

How many times does Oracle ask before dropping a catalog?

Twice: once for the actual DROP CATALOG command, and once more for confirmation.

How to view the current RMAN configuration defaults for the database?

RMAN> show all;

What is the use of crosscheck command in RMAN?

Crosscheck will be useful to check whether the catalog information is intact with OS level information. This command only updates repository records with the status of the backups.

e.g. If user removes archived logs from disk with an operating system command, the repository still indicates that the logs are on disk, when in fact they are not.
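
A typical sequence to bring the repository back in sync and clean up such records might be:

RMAN> crosscheck backup;

RMAN> crosscheck archivelog all;

RMAN> delete expired archivelog all;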

 What are the differences between crosscheck and validate commands?

Validate command is to examine a backup set and report whether it can be restored. RMAN scans all of the backup pieces in the specified backup sets and looks at the checksum to verify that the contents are intact so that backup can be successfully restored if necessary.

Crosscheck command is to verify the status of backups and copies recorded in the RMAN repository against media such as disk or tape. The crosscheck command only processes files created on the same device type as the channel running crosscheck.
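
For example (the backup set key 5 below is only illustrative; use LIST BACKUP to find real keys):

RMAN> validate backupset 5;

RMAN> restore database validate;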

Which one is good, differential (incremental) backup or cumulative (incremental) backup?

A differential backup, which backs up all blocks changed after the most recent incremental backup at level 1 or 0

RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;

A cumulative backup, which backs up all blocks changed after the most recent incremental backup at level 0

RMAN> BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;

Cumulative backups are preferable to differential backups when recovery time is more important than disk space, because during recovery each differential backup must be applied in succession. Use cumulative incremental backups instead of differential, if enough disk space is available to store cumulative incremental backups.

This is command for taking Level 0 backup.

RMAN> BACKUP INCREMENTAL LEVEL 0 DATABASE;

What is the difference between backup set and backup piece?

Backup set is logical and backup piece is physical.

RMAN command to take a backup for creating a standby database:

RMAN> duplicate target database for standby;

You lose one datafile and the DB is running in ARCHIVELOG mode. You have a full database backup that is a week/day old and no backup of this (newly created) datafile. How do you restore/recover the file?

Create the datafile and then recover it:

SQL> alter database create datafile ‘/u01/app/oracle/oradata/xyz.dbf’;

RMAN> recover datafile file_id;

What is obsolete backup & expired backup?

A status of “expired” means that the backup piece or backup set is not found in the backup destination.

A status of “obsolete” means the backup piece is still available, but it is no longer needed under the configured retention policy, for example because a set number of days has elapsed or a set number of newer backups has been taken.
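
Obsolete and expired backups can be reviewed and cleaned up, for example, with:

RMAN> report obsolete;

RMAN> delete obsolete;

RMAN> list expired backup;

RMAN> delete expired backup;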

What is the difference between hot backup & RMAN backup?

For hot backup, we have to put database in begin backup mode, then take backup.

RMAN won’t put database in backup mode.

How to put manual/user-managed backup in RMAN (recovery catalog)?

By using catalog command.

RMAN> CATALOG START WITH ‘/tmp/backup.ctl’;

What are the Architectural components of RMAN?

RMAN Executables

Server processes

Channels

Target database

Recovery catalog database (optional)

Media management Layer (optional)

Backups, backup sets and backup pieces

What are channels?

A channel is an RMAN server process started when there is a need to communicate with an I/O device, such as a disk or a tape. A channel is what reads and writes RMAN backup files. It is through the allocation of channels that you govern I/O characteristics:

Type of I/O device being read or written to, either a disk or an sbt_tape

Number of processes simultaneously accessing an I/O device

Maximum size of files created on I/O devices

Maximum rate at which database files are read

Maximum number of files open at a time

Why is the catalog optional?

Because RMAN manages backup and recovery operations, it requires a place to store necessary information about the database. RMAN always stores this information in the target database control file. You can also store RMAN metadata in a recovery catalog schema contained in a separate database. The recovery catalog schema must be stored in a database other than the target database.
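
For example, the two connection styles look like this (rco and rcatdb are example catalog owner and service names):

$ rman target /                      # metadata kept in the target control file only

$ rman target / catalog rco@rcatdb   # metadata also recorded in the recovery catalog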

What is a Backup set?

A logical grouping of backup files — the backup pieces — that are created when you issue an RMAN backup command. A backup set is RMAN’s name for a collection of files associated with a backup. A backup set is composed of one or more backup pieces.

What are the benefits of using RMAN?

Incremental backups that only copy data blocks that have changed since the last backup.

Tablespaces are not put in backup mode, thus there is no extra redo log generation during online backups.

Detection of corrupt blocks during backups.

Parallelization of I/O operations.

Automatic logging of all backup and recovery operations.

Built-in reporting and listing commands.

What are the various reports available with RMAN?

RMAN>list backup;

RMAN> list archivelog all;

In the catalog database, if some of the blocks are corrupted due to a system crash, how will you recover?

Using the RMAN BLOCKRECOVER command

How do you enable the autobackup for the controlfile using RMAN?

Issue command at RMAN prompt.

RMAN> configure controlfile autobackup on;

Also we can configure controlfile backup format.

RMAN> configure controlfile autobackup format for device type disk to

2> ‘$HOME/BACKUP/RMAN/%F.bkp’;

How do you identify all the target databases that are being backed up with RMAN?

There is no single view that lists every target database backed up with RMAN. The only option is to connect to each target database and run LIST BACKUP; this shows the backup information with date and time.

How do you identify the block corruption in RMAN database? How do you fix it?

Using the v$database_block_corruption view you can find which blocks are corrupted:

SQL> select file#, block# from v$database_block_corruption;

file#  block#

2      507

Here block 507 of datafile 2 is corrupted. Connect to RMAN and recover the block:

RMAN> blockrecover datafile 2 block 507;

The above command recovers block 507. Now verify it:

RMAN> blockrecover corruption list;

How do you clone the database using RMAN software? Give brief steps? When do you use crosscheck command?

Check whether backup pieces, proxy copies, or disk copies still exist.

Two commands available in RMAN to clone database:

1) Duplicate

2) Restore.

List some of the RMAN catalog view names which contain the catalog information?

RC_DATABASE_INCARNATION

RC_BACKUP_COPY_DETAILS

RC_BACKUP_CORRUPTION

RC_BACKUP_DATAFILE_SUMMARY

How do you install the RMAN recovery catalog?

Steps to be followed:

1) Create connection string at catalog database.

2) At catalog database create one new user or use existing user and give that user a recovery_catalog_owner privilege.

3) Login into RMAN with connection string

a) export ORACLE_SID

b) rman target / catalog <user>/<password>@<connect_string>

4) rman> create catalog;

5) register database;

What is the difference between physical and logical backups?

In Oracle, a logical backup is one taken using either the traditional Export/Import utilities or the newer Data Pump, whereas a physical backup is a backup of the physical operating-system files that make up the database.

What is RAID? What is RAID0? What is RAID1? What is RAID 10?

RAID: Redundant Array of Independent Disks

RAID 0: Striping (concatenation), with no redundancy

RAID 1: Mirroring

RAID 10: A stripe of mirrors; data is mirrored (RAID 1) and the mirrored sets are striped (RAID 0)

How to enable Fast Incremental Backup to backup only those data blocks that have changed?

SQL> ALTER DATABASE enable BLOCK CHANGE TRACKING;
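
Optionally, the tracking file location can be specified explicitly and the feature verified afterwards (the path below is just an example):

SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE '/u01/oradata/bct.f';

SQL> SELECT status, filename FROM v$block_change_tracking;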

How do you set the flash recovery area?

SQL> ALTER SYSTEM SET db_recovery_file_dest_size = 100G;

SQL> ALTER SYSTEM SET db_recovery_file_dest = ‘/u10/oradata/school’;

What is auxiliary channel in RMAN? When do you need this?

An auxiliary channel is a link to the auxiliary instance. If you do not have automatic channels configured, then before issuing the DUPLICATE command, manually allocate at least one auxiliary channel within the same RUN command.

How do you use the V$RECOVERY_FILE_DEST view to display information regarding the flash recovery area?

SQL> SELECT name, space_limit, space_used, space_reclaimable, number_of_files FROM v$recovery_file_dest;

How can you display warning messages?

SQL> SELECT object_type, message_type, message_level, reason, suggested_action FROM dba_outstanding_alerts;

How do you back up the entire database?

RMAN> BACKUP DATABASE;

How do you back up an individual tablespace?

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO DISK;

RMAN> BACKUP TABLESPACE system;

How do you back up datafiles and control files?

RMAN> BACKUP DATAFILE 3;

RMAN> BACKUP CURRENT CONTROLFILE;

Use a fast recovery without restoring all backups from their backup location to the location specified in the controlfile.

RMAN> SWITCH DATABASE TO COPY;

My database has a Level 1 backup; what exactly is backed up? Explain with an example.

The database is up and a Level 0 backup has been taken. Is that backup consistent or inconsistent?

How do you say a backup is consistent or inconsistent, in Oracle terminology?

Can we take a backup when the database is down?

If I have an RMAN full Level 0 backup from Sunday at 9 PM and a Level 1 incremental backup taken Monday at 9 PM, what type of backup do I get and what is actually backed up?

If I have an RMAN full backup from Sunday at 9 PM and a Level 1 incremental backup taken Monday at 9 PM, and on Tuesday the database crashed, what type of backup do I get and what is actually backed up?

There is no backup available; can we take a Level 1 backup?

A table got dropped between 9 AM and 11 AM. How do you get the table back using RMAN,

if the DB size is 500 GB and the mount point space available for table recovery is only 15 GB?

The sysadmin changed the time from 10:00 AM to 9:30 AM and a table was dropped. How do you recover the table?

A datafile is corrupted and there is no backup. How do you recover the datafile?

All controlfiles are corrupted. How do you recover the controlfile?

Tuesday, April 5, 2022

ORACLE INTERVIEW QUESTIONS - DATA GUARD

What is data guard?

Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions. Data Guard maintains these standby databases as copies of the production database. Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability.

What are the advantages in using Oracle Data Guard?

Following are the different benefits in using Oracle Data Guard feature in your environment.

High Availability.

Data Protection.

Off loading Backup operation to standby database.

Automatic Gap detection and Resolution in standby database.

Automatic Role Transition using Data Guard Broker.

What are the Protection Modes in Dataguard?

Data Guard Protection Modes

This section describes the Data Guard protection modes.

In these descriptions, a synchronized standby database is meant to be one that meets the minimum requirements of the configured data protection mode and that does not have a redo gap.

Maximum Availability

This protection mode provides the highest level of data protection that is possible without compromising the availability of a primary database. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one synchronized standby database. If the primary database cannot write its redo stream to at least one synchronized standby database, it operates as if it were in maximum performance mode to preserve primary database availability until it is again able to write its redo stream to a synchronized standby database.

This mode ensures that no data loss will occur if the primary database fails, but only if a second fault does not prevent a complete set of redo data from being sent from the primary database to at least one standby database.

Maximum Performance

This protection mode provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online log. Redo data is also written to one or more standby databases, but this is done asynchronously with respect to transaction commitment, so primary database performance is unaffected by delays in writing redo data to the standby database(s).

This protection mode offers slightly less data protection than maximum availability mode and has minimal impact on primary database performance.

This is the default protection mode.

Maximum Protection

This protection mode ensures that zero data loss occurs if a primary database fails. To provide this level of protection, the redo data needed to recover a transaction must be written to both the online redo log and to at least one synchronized standby database before the transaction commits. To ensure that data loss cannot occur, the primary database will shut down, rather than continue processing transactions, if it cannot write its redo stream to at least one synchronized standby database.

Because this data protection mode prioritizes data protection over primary database availability, Oracle recommends that a minimum of two standby databases be used to protect a primary database that runs in maximum protection mode to prevent a single standby database failure from causing the primary database to shut down.
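
As an illustration, the protection mode can be changed and verified from the primary, for example:

SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

SQL> SELECT protection_mode, protection_level FROM v$database;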

What is the difference between Physical standby and Logical standby database?

Data Guard Apply process in standby database can apply redo information directly and in that case it will be called physical standby.

OR It can apply SQL and in that case it will be called Logical standby.

Physical Standby:

In this case standby database is an exact, block-by-block, physical replica of the primary database.

The change vectors received by the RFS process are applied directly to the standby database using media recovery. Here the apply process reads data blocks, assembles redo changes from mappings, and then applies the redo changes to the data blocks directly.

Physical Standby is the best choice for disaster recovery (DR) based upon their simplicity, transparency, high performance, and good data protection.

Logical Standby:

In this case the standby database uses the SQL Apply method to “mine” the redo by converting it to logical change records, and then building SQL transactions and applying the SQL to the standby database.

As this process of replaying the workload is more complex than the physical standby’s process, it requires more memory, CPU, and I/O.

One good advantage here is that a logical standby database can be opened read-write while SQL Apply is active, which means you can update (create/insert/delete, etc.) local tables and schemas in the logical standby database.

Explain the Dataguard Architecture

Data Guard architecture incorporates the following items:

• Primary Database - A production database that is used to create standby databases. The archive logs from the primary database are transferred to and applied on standby databases. Each standby can only be associated with a single primary database, but a single primary database can be associated with multiple standby databases.

• Standby Database - A replica of the primary database.

• Log Transport Services - Control the automatic transfer of archive redo log files from the primary database to one or more standby destinations.

• Network Configuration - The primary database is connected to one or more standby databases using Oracle Net.

• Log Apply Services - Apply the archived redo logs to the standby database. The Managed Recovery Process (MRP) actually does the work of maintaining and applying the archived redo logs.

• Role Management Services - Control the changing of database roles from primary to standby. The services include switchover, switchback and failover.

• Data Guard Broker - Controls the creation and monitoring of Data Guard. It comes with a GUI and command line interface.

Primary Database:

A Data Guard configuration contains one production database, also referred to as the primary database, that functions in the primary role. This is the database that is accessed by most of your applications.

Standby Database:

A standby database is a transactionally consistent copy of the primary database. Using a backup copy of the primary database, you can create up to nine standby databases and incorporate them in a Data Guard configuration. Once created, Data Guard automatically maintains each standby database by transmitting redo data from the primary database and then applying the redo to the standby database.

The types of standby databases are as follows:

Physical standby database:

Provides a physically identical copy of the primary database, with on disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, are the same. A physical standby database is kept synchronized with the primary database, through Redo Apply, which recovers the redo data received from the primary database and applies the redo to the physical standby database.

Logical standby database:

Contains the same logical information as the production database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes the SQL statements on the standby database.

What are the Steps to create Physical Standby database?

1. Take a full hot backup of the primary database.

2. Enable force logging on the database.

3. Prepare the parameter file for the primary database.

4. Enable archiving.

5. Create a standby control file.

6. Transfer the full backup, init.ora, and standby control file to the standby node.

7. Modify the init.ora file on the standby node.

8. Restore the database.

9. Recover the standby database.

10. Put the standby database in managed recovery mode.
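
From 11g onward, several of these steps can be condensed into a single RMAN active duplication; the chicago/boston names below follow the example used elsewhere in this post:

$ rman target sys@chicago auxiliary sys@boston

RMAN> duplicate target database for standby from active database nofilenamecheck;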

What are the DATAGUARD PARAMETERS in Oracle?

Set Primary Database Initialization Parameters

----------------------------------------------

On the primary database, you define initialization parameters that control redo transport services while the database is in the primary role. There are additional parameters you need to add that control the receipt of the redo data and log apply services when the primary database is transitioned to the standby role.

DB_NAME=chicago

DB_UNIQUE_NAME=chicago

LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'

CONTROL_FILES='/arch1/chicago/control1.ctl', '/arch2/chicago/control2.ctl'

LOG_ARCHIVE_DEST_1=

 'LOCATION=/arch1/chicago/

  VALID_FOR=(ALL_LOGFILES,ALL_ROLES)

  DB_UNIQUE_NAME=chicago'

LOG_ARCHIVE_DEST_2=

 'SERVICE=boston LGWR ASYNC

  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)

  DB_UNIQUE_NAME=boston'

LOG_ARCHIVE_DEST_STATE_1=ENABLE

LOG_ARCHIVE_DEST_STATE_2=ENABLE

REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE

LOG_ARCHIVE_FORMAT=%t_%s_%r.arc

LOG_ARCHIVE_MAX_PROCESSES=30

Primary Database: Standby Role Initialization Parameters

FAL_SERVER=boston

FAL_CLIENT=chicago

DB_FILE_NAME_CONVERT='boston','chicago'

LOG_FILE_NAME_CONVERT=  '/arch1/boston/','/arch1/chicago/','/arch2/boston/','/arch2/chicago/'

STANDBY_FILE_MANAGEMENT=AUTO

Prepare an Initialization Parameter File for the Standby Database

-----------------------------------------------------------------

Create a text initialization parameter file (PFILE) from the server parameter file (SPFILE) used by the primary database; a text initialization parameter file can be copied to the standby location and modified. For example:

CREATE PFILE='/tmp/initboston.ora' FROM SPFILE;

Modifying Initialization Parameters for a Physical Standby Database.

DB_NAME=chicago

DB_UNIQUE_NAME=boston

LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'

CONTROL_FILES='/arch1/boston/control1.ctl', '/arch2/boston/control2.ctl'

DB_FILE_NAME_CONVERT='chicago','boston'

LOG_FILE_NAME_CONVERT= '/arch1/chicago/','/arch1/boston/','/arch2/chicago/','/arch2/boston/'

LOG_ARCHIVE_FORMAT=log%t_%s_%r.arc

LOG_ARCHIVE_DEST_1=  'LOCATION=/arch1/boston/

VALID_FOR=(ALL_LOGFILES,ALL_ROLES)

DB_UNIQUE_NAME=boston'

LOG_ARCHIVE_DEST_2=  'SERVICE=chicago LGWR ASYNC  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)   DB_UNIQUE_NAME=chicago'

LOG_ARCHIVE_DEST_STATE_1=ENABLE

LOG_ARCHIVE_DEST_STATE_2=ENABLE

REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE

STANDBY_FILE_MANAGEMENT=AUTO

FAL_SERVER=chicago

FAL_CLIENT=boston

What are the services required on the primary and standby database ?


The services required on the primary database are:


• Log Writer Process (LGWR) - Collects redo information and updates the online redo logs. It can also create local archived redo logs and transmit online redo to standby databases.


• Archiver Process (ARCn) - One or more archiver processes make copies of online redo logs either locally or remotely for standby databases.


• Fetch Archive Log (FAL) Server - Services requests for archived redo logs from FAL clients running on multiple standby databases. Multiple FAL servers can be run on a primary database, one for each FAL request.


The services required on the standby database are:


• Fetch Archive Log (FAL) Client - Pulls archived redo log files from the primary site. Initiates transfer of archived redo logs when it detects a gap sequence.


• Remote File Server (RFS) - Receives archived and/or standby redo logs from the primary database.


• Archiver (ARCn) Processes - Archives the standby redo logs applied by the managed recovery process (MRP).


• Managed Recovery Process (MRP) - Applies archive redo log information to the standby database.



What is RTS (Redo Transport Services) in Dataguard?


It controls the automated transfer of redo data from the production database to one or more archival destinations. The redo transport services perform the following tasks:


a) Transmit redo data from the primary system to the standby systems in the configuration.


b) Manage the process of resolving any gaps in the archived redo log files due to a network failure.


c) Automatically detect missing or corrupted archived redo log files on a standby system and automatically retrieve replacement archived redo log files from the primary database or another standby database.


What is a Snapshot Standby Database?


Oracle 11g introduces the Snapshot Standby database which essentially is an updateable standby database which has been created from a physical standby database.


We can convert a physical standby database to a snapshot standby database, do some kind of testing on a database that is a read-write copy of the current primary or production database, and then finally revert it to its earlier state as a physical standby database.


While the snapshot standby database is open in read-write mode, redo is being received from the primary database, but is not applied.


After converting it back to a physical standby database, it is resynchronized with the primary by applying the accumulated redo data that was shipped from the primary database but not applied.


Using a snapshot standby, we are able to do real-time application testing using near real-time production data. Very often we are required to make production clones for the purpose of testing, but using snapshot standby databases we can meet the same requirement while sparing the effort, time, resources, and disk space.


A snapshot standby database is a fully updatable standby database that is created by converting a physical standby database into a snapshot standby database.


Like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, a snapshot standby database does not apply the redo data that it receives. The redo data received by a snapshot standby database is not applied until the snapshot standby is converted back into a physical standby database, after first discarding any local updates made to the snapshot standby database.


How to delay the application of logs to a physical standby? 


A standby database automatically applies redo logs when they arrive from the primary database. But in some cases, we want to create a time lag between the archiving of a redo log at the primary site, and the application of the log at the standby site.


Modify the LOG_ARCHIVE_DEST_n initialization parameter on the primary database to set a delay for the standby database.


Example: For 60min Delay:

ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=stdby_srvc DELAY=60';

The DELAY attribute is expressed in minutes.

The archived redo logs are still automatically copied from the primary site to the standby site, but the logs are not immediately applied to the standby database. The logs are applied when the specified time interval expires.


What is the usage of DB_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?


The DB_FILE_NAME_CONVERT parameter is used in an Oracle Data Guard setup, specifically on standby databases, to convert the locations of the data files on the standby. This parameter is needed when the standby database uses a different directory structure for its data files than the primary database.


What is the usage of LOG_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?


The LOG_FILE_NAME_CONVERT parameter is used in an Oracle Data Guard setup, specifically on standby databases, to convert the locations of the redo log files on the standby. This parameter is needed when the standby database uses a different directory structure for its redo log files than the primary database.


Your standby database was out of reach because of network issue. How will you synchronize it with primary database again?


Data Guard automatically resynchronizes the standby following network or standby outages using redo data that has been archived at the primary.
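
Any remaining gap can be checked on the standby, for example with:

SQL> SELECT thread#, low_sequence#, high_sequence# FROM v$archive_gap;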



What is the difference between SYNC and ASYNC redo transport method?


Synchronous transport (SYNC)


Also known as a “zero data loss” redo transport method.


Below is how it works:


1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database


2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database


3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database


4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database


5) Remote File Server (RFS) transmits an acknowledgement back to the LNS process on the primary database


6) Log Network Server (LNS) notifies the LGWR that transmission is complete on the primary database.


7) Log Writer (LGWR) acknowledges the commit to the user.


Asynchronous transport (ASYNC)


Unlike SYNC, asynchronous transport (ASYNC) eliminates the requirement that the LGWR wait for acknowledgement from the LNS. This removes the performance impact on the primary database irrespective of the distance between primary and standby locations. If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the online redo logs. Once the LNS has caught up, it automatically transitions back to reading and sending directly from the log buffer.


Below is how it works:


1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database


2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database


3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database


4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database


Steps 5, 6, and 7 discussed above for SYNC are not applicable here.


The only drawback of ASYNC is the increased potential for data loss. If a failure destroys the primary database before the transport lag is reduced to zero, any committed transactions that were part of the transport lag will be lost. So it is highly advisable to have enough network bandwidth to handle peak redo generation rates when using the ASYNC method.


How Synchronous transport (SYNC) can impact the primary database performance?


SYNC guarantees protection for every transaction that the database acknowledges as having been committed, but at the same time LGWR must wait for confirmation that the data is protected at the standby before it can proceed with the next transaction. This can impact primary database performance, and the impact depends on factors like:

the amount of redo information to be written

available network bandwidth

round-trip network latency (RTT)

standby I/O performance when writing to the SRL

distance between the primary and standby databases, as network RTT increases with distance



What is Data Guard’s Automatic Gap Resolution?


Suppose your database is using the ASYNC transport method and the instance load is at its peak. If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the online redo logs. Once the LNS has caught up, it automatically transitions back to reading and sending directly from the log buffer.


In some cases there can be two or more log switches before the LNS has finished sending the redo from the online redo log files. If any of the required online redo log files are archived in the meantime, that redo is transmitted via Data Guard’s gap resolution process, “Automatic Gap Resolution”.


OR


In another case, when the network or the standby database is down while the primary system is busy, a large log file gap can form before the connection between the primary and standby is restored.

Automatic Gap Resolution handles such scenarios as follows:


1) The ARCH process on the primary database continuously pings the standby database during the outage to determine its status.

2) As soon as the standby is restored, the ARCH ping process queries the standby control file (via its RFS process) to determine the last complete log file that the standby received from the primary database.

3) Data Guard determines which log files are required to resynchronize the standby database and immediately begins transmitting them using additional ARCH processes.

4) The LNS process on the primary database will also attempt to connect to the standby database and, once successful, will begin transmitting current redo. So first all the archived log files are applied, and then the current redo.


The Data Guard architecture enables gaps to be resolved quickly using multiple background ARCH processes.
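You can see any unresolved gap from the standby itself; a minimal check:

```sql
-- Run on the standby database: lists any detected archive log gap per redo thread.
-- No rows returned means no gap is currently detected.
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;
```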



How does the Data Guard apply process work when the primary or standby database is Oracle RAC?


If Primary database is RAC but standby is Non-RAC:


Each primary Oracle RAC instance ships its own thread of redo that is merged by the Data Guard apply process at the standby and applied in SCN order to the standby database.


If both Primary and standby databases are RAC:


If the standby is also an Oracle RAC database, only one instance (the apply instance) merges and applies changes to the standby database. If the apply instance fails for any reason, the apply process automatically fails over to a surviving instance in the Oracle RAC standby database when using the Data Guard broker.


What is Active Data Guard Option (Oracle Database 11g Enterprise Edition)?


Prior to 11g, a physical standby database had to be in the mount state while media recovery was active, which meant you could not query the standby database during media recovery because there was no read-consistent view.


Active Data Guard 11g solves the read-consistency problem through the use of a “query” SCN. The media recovery process on the standby database advances the query SCN after all the changes in a transaction have been applied. The query SCN appears to users as the CURRENT_SCN column in the V$DATABASE view on the standby database. Read-only users can only see data up to the query SCN, which guarantees the same read consistency as the primary database.

This enables a physical standby database to be open as read-only while media recovery is active, making it useful for doing read-only workloads.


Also, if you need read-write access to the standby database, you can use SQL Apply method of dataguard.
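As a quick sanity check on an Active Data Guard standby (a sketch, assuming an 11g standby opened read-only with the managed recovery process running), you can verify the open mode and the query SCN:

```sql
-- On the standby: OPEN_MODE and the query SCN exposed as CURRENT_SCN.
SELECT open_mode, current_scn FROM v$database;

-- Confirm the MRP (apply) process is active while the database is open.
SELECT process, status, sequence#
FROM   v$managed_standby
WHERE  process LIKE 'MRP%';
```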


What are the important database parameters related to Data Guard corruption prevention?


On the primary database:


a) DB_ULTRA_SAFE


Values can be DATA_AND_INDEX or DATA_ONLY. Setting DB_ULTRA_SAFE at the primary also automatically sets DB_LOST_WRITE_PROTECT=TYPICAL on the primary database.

In Oracle Database 11g Release 2 (11.2), the primary database automatically attempts to repair the corrupted block in real time by fetching a good version of the same block from a physical standby database.


On the standby database:


a) DB_BLOCK_CHECKSUM=FULL

DB_BLOCK_CHECKSUM detects redo and data block corruptions on the primary database and protects the standby database. This parameter requires minimal CPU resources.


b) DB_LOST_WRITE_PROTECT=TYPICAL

A lost write can occur when an I/O subsystem acknowledges the completion of a write, while in fact the write did not occur in persistent storage.

This will create a stale version of the data block. When the DB_LOST_WRITE_PROTECT initialization parameter is set, the database records buffer cache block reads in the redo log, and this information is used to detect lost writes.

Set DB_LOST_WRITE_PROTECT to TYPICAL on both the primary and standby databases.
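As a sketch, the settings above map to the following commands (note that DB_ULTRA_SAFE is a static parameter, so it needs SCOPE=SPFILE and an instance restart):

```sql
-- On the primary: enable extensive corruption checking; this implicitly
-- sets DB_LOST_WRITE_PROTECT=TYPICAL as described above.
ALTER SYSTEM SET db_ultra_safe = DATA_AND_INDEX SCOPE = SPFILE;

-- On the standby: both parameters are dynamic.
ALTER SYSTEM SET db_block_checksum = FULL;
ALTER SYSTEM SET db_lost_write_protect = TYPICAL;
```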


What is Switchover event?


Switchover is useful for minimizing downtime during planned maintenance. It is a planned event in which Data Guard reverses the roles of the primary and a standby database.


The primary database runs unaffected while we are making the required changes on our standby database (e.g. patchset upgrades, full Oracle version upgrades, etc).


Once changes are complete, production is switched over to the standby site running at the new release.


This means that regardless of how much time is required to perform planned maintenance, the only production database downtime is the time required to execute a switchover, which can be less than 60 seconds.


The following operations happen when the switchover command is executed:

1. The primary database is notified that a switchover is about to occur.

2. All users are disconnected from the primary.

3. A special redo record is generated that signals the End Of Redo (EOR).

4. The primary database is converted into a standby database.

5. The final EOR record is applied to the standby database; this guarantees that no data has been lost, and it converts the standby to the primary role.
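The SQL*Plus form of the steps above can be sketched as follows (an illustration of the traditional commands; the broker accomplishes the same with a single SWITCHOVER command):

```sql
-- On the current primary: write the EOR record and convert to standby.
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;

-- On the old standby, once it has applied the EOR record:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
ALTER DATABASE OPEN;
```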


What is Failover event?


The failover process is similar to a switchover, except that the primary database never has the chance to write an EOR record because this is an unplanned event.

Whether or not a failover results in data loss depends upon the Data Guard protection mode:


a) Maximum Protection >> No Data Loss

b) Maximum Availability >> No Data Loss (except when there was a previous failure, e.g. a network failure, that had interrupted redo transport and allowed the primary database to move ahead of the standby)


c) Maximum Performance (ASYNC) >> may lose any committed transactions that were not transmitted to the standby database before the primary database failed.


Failover event can be of two types:

1) Manual

The administrator has complete control of primary-standby role transitions. This can lengthen the outage by the amount of time required for the administrator to be notified and to execute the command manually.

2) Automatic

It uses Data Guard’s Fast-Start Failover feature which automatically detects the failure, evaluates the status of the Data Guard configuration, and, if appropriate, executes the failover to a previously chosen standby database.
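With the broker enabled, a manual failover can be issued from DGMGRL in a couple of commands; the connection and database name dg2 below are illustrative:

```text
$ dgmgrl sys@dg2
DGMGRL> FAILOVER TO 'dg2';
```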


Which tools can be used for Data Guard Management?


1) SQL*Plus – traditional method, can prove most tedious to use


2) Data Guard broker – automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. Simplifies and automates many administrative tasks. It has its own command line (DGMGRL) and syntax.


3) Enterprise Manager – requires that the Data Guard broker be enabled; provides a GUI to the Data Guard broker, replacing the DGMGRL command line and interfacing directly with the broker’s monitor processes.
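For example, a typical broker monitoring session might look like this (connect string and database names are illustrative):

```text
$ dgmgrl sys@dg1
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SHOW DATABASE VERBOSE 'dg2';
```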


What is the difference between Recovery Point Objective(RPO) and Recovery Time Objective (RTO)?


A) Recovery Point Objective(RPO)

RPO is concerned with data: it is the amount of data you are willing to lose when a failure occurs in your database system. Data loss is usually defined in terms of time, so possible values are 5 seconds of data loss, 2 hours of data loss, and so on.


Remember that each standby database has its own set of attributes and parameters. This means you can mix zero data loss standby databases with minimal data loss standby databases in the same Data Guard configuration.

If you have decided to implement a zero data loss strategy, then you should focus closely on your network, since synchronous transport makes commit times dependent on it.


B) Recovery Time Objective (RTO)

RTO is defined as how fast you can get back up and running (whereas RPO is concerned with data loss).


With your RPO strategy you may have lost, say, only about 6 seconds of data, as committed to your client; with RTO you need to determine how fast clients can reconnect to the database system after the data loss has occurred.


What are Standby Redo Log (SRL) files?


The SRL files are where the Remote File Server (RFS) process at your standby database writes the incoming redo so that it is persistent on disk for recovery. SRL files are important for better redo transport performance and data protection.


SRLs are a MUST in Maximum Availability or Maximum Protection mode and OPTIONAL (but recommended) in Maximum Performance mode.


If there are no Standby Redo Log (SRL) files, then at each log switch on the primary database, the RFS process on the standby that is serving an asynchronous standby destination has to create an archive log of the right size. While the RFS is busy creating the archive log file, the LNS process at the primary database has to wait, falling further and further behind the LGWR (in the case of Maximum Performance mode). That is why it is recommended to have Standby Redo Log (SRL) files in Maximum Performance mode as well.


We generally configure them on the primary database as well, in preparation for a role transition between primary and standby.


Also, do not multiplex SRLs. Since Data Guard will immediately request a new copy of the archive log if an SRL file fails, there is no real need to have more than one copy of each.
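Creating SRLs is a one-time ALTER DATABASE task. A sketch, assuming three online redo log groups of 512M per thread, hence four SRL groups (the usual n+1 rule); group numbers, paths, and sizes are illustrative:

```sql
-- SRLs should be the same size as the online redo logs, one more group
-- per thread than the online groups, and not multiplexed.
ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 ('/u01/oradata/dg1/srl04.log') SIZE 512M;
ALTER DATABASE ADD STANDBY LOGFILE GROUP 5 ('/u01/oradata/dg1/srl05.log') SIZE 512M;
ALTER DATABASE ADD STANDBY LOGFILE GROUP 6 ('/u01/oradata/dg1/srl06.log') SIZE 512M;
ALTER DATABASE ADD STANDBY LOGFILE GROUP 7 ('/u01/oradata/dg1/srl07.log') SIZE 512M;
```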


===========================

Q1. What is meant by Data Guard?

Ans:

The software which Oracle Corporation markets as Oracle Data Guard forms an extension to the Oracle relational database management system (RDBMS). It aids in establishing and maintaining secondary standby databases as alternative/supplementary repositories to production primary databases. Data Guard provides both graphical user interface (GUI) and command-line (CLI) tools for managing Data Guard configurations, and supports both physical and logical standby sites. Oracle Corporation makes Data Guard available only as a bundled feature of the Enterprise Edition of the Oracle RDBMS. With appropriately set-up Data Guard operations, DBAs can facilitate failovers or switchovers to alternative hosts in the same or alternative locations.


Q2. What are the types of Oracle Data Guard?

Ans: Oracle Data Guard standby databases are classified based on the way they are created and the method used to apply redo. They are as follows:

Physical standby (Redo Apply technology)

Logical standby (SQL Apply technology)

Snapshot standby (redo is not applied until the snapshot standby is converted back)

Q3. What is the use of standby redo log?

Ans: The advantage of having standby redo logs is that every record written to the online redo logs of the primary database is transferred to the standby database and written into the standby redo logs at the same time, which minimizes the probability of data loss on the standby database. Standby redo logs are mandatory for real-time apply. Without standby redo logs, an archived redo log is created by the RFS process and, once complete, is applied to the standby database by the MRP process. An incomplete archive log file cannot be applied on the standby database and will not be used in a failover situation, which causes a certain amount of data loss. With standby redo logs, the RFS process writes into the standby redo log as records are entered in the online redo log; when a log switch occurs, the archiver process of the standby database archives the standby redo log to an archived redo log, while the MRP process applies the information to the standby database. In a failover situation, you also have access to the information already written in the standby redo logs, so that information is not lost.


Q4. What are the advantages in using Oracle Data Guard?

Ans:

Following are the different benefits in using Oracle Data Guard feature in your environment:

High Availability.

Data Protection.

Off loading Backup operation to standby database.

Automatic Gap detection and Resolution in standby database.

Automatic Role Transition using Data Guard Broker.

Q5. What is Active Data Guard? Does it need additional licensing?

Ans:

Active Data Guard means the standby database is open in read-only mode while redo is being applied in real time. It is a separately licensed option of Oracle Database Enterprise Edition. Below are the benefits of using Active Data Guard:

Reporting queries can be offloaded to standby database.

Physical block corruptions are repaired automatically either at primary or physical standby database.

RMAN backups can be initiated from the standby instead of the primary, which reduces CPU load on the primary.

Q6. What are the different services available in Oracle Data Guard?

Ans:

Following are the different Services available in Oracle Data Guard of Oracle database:

Redo Transport Services.

Log Apply Services.

Role Transitions.

Redo Transport Services

Control the automated transfer of redo data from the production database to one or more archival destinations.

Apply Services

Apply redo data on the standby database to maintain transactional synchronization with the primary database. Redo data can be applied either from archived redo log files, or, if real-time apply is enabled, directly from the standby redo log files as they are being filled, without requiring the redo data to be archived first at the standby database.

Role Transitions:Change the role of a database from a standby database to a primary database, or from a primary database to a standby database using either a switchover or a failover operation.

Q7. What are the different types of redo transport services in dataguard?

Ans:

SYNC and ASYNC.

SYNC(SYNCHRONOUS):

This mode is used for the Maximum Protection and Maximum Availability protection modes. The synchronous redo transport mode transmits redo data synchronously with respect to transaction commitment. A transaction cannot commit until all redo generated by that transaction has been successfully sent to every standby destination that uses synchronous transport.

ASYNC(ASYNCHRONOUS):

This mode is used for the Maximum Performance protection mode. A transaction can commit without waiting for the redo generated by that transaction to be successfully sent to any redo transport destination that uses the asynchronous redo transport mode.

Q8. What are the different Protection modes available in Oracle Data Guard?

Ans:

Following are the different protection modes available in Data Guard of Oracle database you can use any one based on your application requirement.

Maximum Protection

Maximum Availability

Maximum Performance (Default Mode)

Q9. What are the new features for Data Guard in 11g?

Ans:

Active database duplicate

snapshot stand by

Active data guard

Redo compression

32 standby database configuration support (earlier only 9)

FAL_CLIENT no longer required

Q10. How to check what protection mode of primary database in your Oracle Data Guard?

Ans:

By using following query you can check protection mode of primary database in your Oracle Data Guard setup:

SQL> select protection_mode from v$database;

PROTECTION_MODE

MAXIMUM PERFORMANCE

Q11. What is the difference between Real-time apply and redo apply?

Ans:

In real-time apply, the log apply service does not wait for the current SRL to be archived; the redo data is applied as soon as it is received from the primary. In normal apply mode, the log apply service waits for the SRL to be archived first.

Q12. How to change protection mode in Oracle Data Guard setup?

Ans: You can change the protection mode with an ALTER DATABASE command on the primary database, after setting the required values in the corresponding LOG_ARCHIVE_DEST_n parameter on the primary database for the corresponding standby database.
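A sketch of raising the mode to Maximum Availability; the service and DB_UNIQUE_NAME dg2 are illustrative:

```sql
-- First ensure synchronous transport to the standby destination ...
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=dg2 SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg2';

-- ... then raise the protection mode on the primary.
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
```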

Q13. Which process is responsible for Redo log information  transmitting?

Ans:

LGWR or ARCH.

Q14. What are the advantages of using Physical standby database in Oracle Data Guard?

Ans:

Advantages of using Physical standby database in Oracle Data Guard are as follows:

High Availability.

Load balancing (Backup and Reporting).

Data Protection.

Disaster Recovery.

Q15. Which parameters are used for gap resolution?

Ans:

FAL_SERVER and FAL_CLIENT

When the MRP finds that an archive log is missing during media recovery, it sends the FAL_CLIENT information to the server identified by FAL_SERVER and requests that FAL_SERVER resend the file. The FAL_CLIENT and FAL_SERVER init.ora (spfile) parameters are set on the standby instance.
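On the standby, the two parameters can be set dynamically; dg1/dg2 are illustrative TNS service names:

```sql
-- FAL_SERVER points at the database that can resend missing archive logs,
-- FAL_CLIENT identifies this standby to that server.
ALTER SYSTEM SET fal_server = 'dg1';
ALTER SYSTEM SET fal_client = 'dg2';
```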

Q16. What is physical standby database in Oracle Data Guard?

Ans:

Provides a physically identical copy of the primary database, with on-disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, is the same. A physical standby database is kept synchronized with the primary database through Redo Apply, which recovers the redo data received from the primary database and applies it to the physical standby database. As of Oracle Database 11g Release 1 (11.1), a physical standby database can receive and apply redo while it is open for read-only access, and can therefore be used concurrently for data protection and reporting.

Q17. What is difference between Physical stand by and snapshot standby?

Ans:

A snapshot standby receives redo but does not apply it. A normal (physical) standby receives and applies redo.

Q18. What is Logical standby database in Oracle Data Guard?

Ans:

Contains the same logical information as the production database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes the SQL statements on the standby database.

A logical standby database can be used for other business purposes in addition to disaster recovery requirements. This allows users to access a logical standby database for queries and reporting purposes at any time. Also, using a logical standby database, you can upgrade Oracle Database software and patch sets with almost no downtime. Thus, a logical standby database can be used concurrently for data protection, reporting, and database upgrades.

Q19. What is active dataguard  duplicate?

Ans: Starting from 11g we can duplicate a database in two ways:

Active DB duplicate.

Backup-based duplicate.

Active DB duplicate copies the live TARGET database over the network to the AUXILIARY destination and then creates the duplicate database. In an active duplication process, target database online image copies and archived redo log files are copied through the auxiliary instance service name, so there is no need for a backup of the target database.

Q20. What is Snapshot standby database in Oracle Data Guard?

Ans:A snapshot standby database is a fully updatable standby database. Like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, a snapshot standby database does not apply the redo data that it receives. The redo data received by a snapshot standby database is not applied until the snapshot standby is converted back into a physical standby database, after first discarding any local updates made to the snapshot standby database.

A snapshot standby database is best used in scenarios that require a temporary, updatable snapshot of a physical standby database. Note that because redo data received by a snapshot standby database is not applied until it is converted back into a physical standby, the time needed to recover from a primary database failure is directly proportional to the amount of redo data that needs to be applied.


Q21.What are the advantages of Logical standby database in Oracle Data Guard?


Ans:


Better usage of resource

Data Protection

High Availability

Disaster Recovery

Q22.What is the usage of DB_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?


Ans:


The DB_FILE_NAME_CONVERT parameter is used on the standby database in an Oracle Data Guard setup. It translates the locations of the primary database’s data files into the corresponding standby locations, and is needed when the standby database uses a different directory structure for data files than the primary database.


Q23.What is the usage of LOG_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?


Ans:


The LOG_FILE_NAME_CONVERT parameter is used on the standby database in an Oracle Data Guard setup. It translates the locations of the primary database’s redo log files into the corresponding standby locations, and is needed when the standby database uses a different directory structure for redo log files than the primary database.


Q24. What are the steps for creating a physical standby?


Ans:


These are the steps to follow:


Enable forced logging

Create a password file

Configure a standby redo log

Enable archiving

Set up the primary database initialization parameters

Configure the listener and tnsnames to support the database on both nodes

To verify the archived redo logs afterwards, you can run the following in SQL*Plus:

col name format a20

col thread# format 999

col sequence# format 999

col first_change# format 999999

col next_change# format 999999

SELECT thread#, sequence# AS "SEQ#", name, first_change# AS "FIRSTSCN",

next_change# AS "NEXTSCN", archived, deleted, completion_time AS "TIME"

FROM v$archived_log;

You can also check v$log_history.

Q25.Tell me about parameter which is used for standby database?


Ans:


Log_Archive_Dest_n

Log_Archive_Dest_State_n

Log_Archive_Config

Log_File_Name_Convert

Standby_File_Management

DB_File_Name_Convert

DB_Unique_Name

Control_Files

Fal_Client

Fal_Server

The LOG_ARCHIVE_CONFIG parameter enables or disables the sending of redo streams to the standby sites. The DB_UNIQUE_NAME of the primary database is dg1 and the DB_UNIQUE_NAME of the standby database is dg2. The primary database is configured to ship redo log stream to the standby database. In this example, the standby database service is dg2.


Next, STANDBY_FILE_MANAGEMENT is set to AUTO so that when Oracle files are added or dropped from the primary database, these changes are made to the standby databases automatically. The STANDBY_FILE_MANAGEMENT is only applicable to the physical standby databases.


Setting the STANDBY_FILE_MANAGEMENT parameter to AUTO is recommended when using Oracle Managed Files (OMF) on the primary database. Next, the primary database must be running in ARCHIVELOG mode.
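Pulling the parameters above together, a primary-side sketch using the dg1/dg2 names from the text (destination number and attributes are illustrative):

```sql
-- Declare both unique names in the configuration ...
ALTER SYSTEM SET log_archive_config = 'DG_CONFIG=(dg1,dg2)';

-- ... define and enable the redo transport destination for the standby ...
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=dg2 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg2';
ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE;

-- ... and let file additions/drops propagate automatically.
ALTER SYSTEM SET standby_file_management = AUTO;
```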


Q26.What are the uses of Oracle Data Guard?


Ans:


Oracle Data Guard ensures high availability, data protection, and disaster recovery for enterprise data.

Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions.

With Data Guard, administrators can optionally improve production database performance by offloading resource-intensive backup and reporting operations to standby systems.

Q27. What is Redo Transport Services?


Ans:


It controls the automated transfer of redo data from the production database to one or more archival destinations.


Q28. What is apply services?


Ans:


Apply redo data on the standby database to maintain transactional synchronization with the primary database. Redo data can be applied either from archived redo log files, or, if real-time apply is enabled, directly from the standby redo log files as they are being filled, without requiring the redo data to be archived first at the standby database. It also allows read-only access to the data.


Q29. What is Data Guard Broker?


Ans:


The Data Guard broker manages primary and standby databases using either the SQL command-line interface or the Data Guard broker interfaces, which include a command-line interface (DGMGRL) and a graphical user interface integrated into Oracle Enterprise Manager.


Q30. What actions can be performed by Data Guard Broker?


Ans:


Create and enable Data Guard configurations, including setting up redo transport services and apply services

Manage an entire Data Guard configuration from any system in the configuration

Manage and monitor Data Guard configurations that contain Oracle RAC primary or standby databases

Simplify switchovers and failovers by allowing you to invoke them using either a single key click in Oracle Enterprise Manager or a single command in the DGMGRL command-line interface.

Enable fast-start failover to fail over automatically when the primary database becomes unavailable. When fast-start failover is enabled, the Data Guard broker determines if a failover is necessary and initiates the failover to the specified target standby database automatically, with no need for DBA intervention.


Q31. What are the Data guard Protection modes and summarize each?


Ans:


Maximum availability :

This protection mode provides the highest level of data protection that is possible without compromising the availability of a primary database. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one standby database.


Maximum performance :

This is the default protection mode. It provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online redo log.


Maximum protection :

This protection mode ensures that no data loss will occur if the primary database fails. To provide this level of protection, the redo data needed to recover a transaction must be written to both the online redo log and to at least one standby database before the transaction commits. To ensure that data loss cannot occur, the primary database will shut down, rather than continue processing transactions.


Q32. If you didn’t have access to the standby database and you wanted to find out what error has occurred in a data guard configuration, what view would you check in the primary database to check the error message?


Ans:


You can check the v$dataguard_status view: SELECT message FROM v$dataguard_status;


Q33.  In Oracle 11g, what command in RMAN can you use to create the standby database while the target database is active?


Ans:


Oracle 11g has made it extremely simple to set up a standby database environment because Recovery Manager (RMAN) now supports the ability to clone the existing primary database directly to the intended standby database site over the network via the DUPLICATE DATABASE command set while the target database is active. RMAN automatically generates a conversion script in memory on the primary site and uses that script to manage the cloning operation on the standby site with virtually no DBA intervention required. You can execute this in a run block in RMAN: DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE DORECOVER;
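A fuller sketch of the active duplication, with illustrative connect strings, channel names, and DB_UNIQUE_NAME:

```text
$ rman TARGET sys@dg1 AUXILIARY sys@dg2

RUN {
  ALLOCATE CHANNEL prmy1 TYPE DISK;
  ALLOCATE AUXILIARY CHANNEL stby1 TYPE DISK;
  DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE DORECOVER
    SPFILE SET DB_UNIQUE_NAME='dg2'
    NOFILENAMECHECK;
}
```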


Q34.  What additional standby database mode does Oracle 11g offer?


Ans:


Oracle 11g has introduced the Oracle Snapshot Standby Database. With a snapshot standby, a physical standby database can easily be opened in read-write mode and later converted back to a physical standby. This is suitable for test and development environments, and it also maintains protection by continuing to receive data from the production database and archiving it for later use.


Q35. In Oracle 11g, how can you speed up backups on the standby database?


Ans:


In Oracle 11g, block change tracking is now supported in the standby database.


Q36.  With the availability of Active Data Guard, what role does SQL Apply (logical standby) continue to play?


Ans:


Use SQL Apply for the following requirements: (a) when you require read-write access to a synchronized standby database but do not modify primary data, (b) when you wish to add local tables to the standby database that can also be updated, or (c) when you wish to create additional indexes to optimize read performance.  The ability to handle local writes makes SQL Apply better suited to packaged reporting applications that often require write access to local tables that exist only at the target database. SQL Apply also provides rolling upgrade capability for patchsets and major database releases.  This rolling upgrade functionality can also be used by physical standby databases beginning with Oracle 11g using Transient Logical Standby.


Q37.  Why would I use Active Data Guard and not simply use SQL Apply (logical standby) that is included with Data Guard 11g?


Ans:


If read-only access satisfies the requirement – Active Data Guard is a closer fit for the requirement, and therefore is much easier to implement than any other approach.  Active Data Guard supports all datatypes and is very simple to implement. An Active Data Guard replica can also easily support additional uses – offloading backups from the primary database, serve as an open read-write test system during off-peak hours (Snapshot Standby), and provide an exact copy of the production database for disaster recovery – fully utilizing standby servers, storage and software while in standby role.


Q38.  Why do I need the Oracle 11g Active Data Guard Option?


Ans:


Previous capabilities did not allow Redo Apply to be active while a physical standby database was open read-only, and did not enable RMAN block change tracking on the standby database.  This resulted in (a) read-only access to data that was frozen as of the time that the standby database was opened read-only, (b) failover and switchover operations that could take longer to complete due to the backlog of redo data that would need to be applied, and (c) incremental backups that could take up to 20x longer to complete – even on a database with a moderate rate of change. Previous capabilities are still included with Oracle Data Guard 11g, no additional license is required to use previous capabilities.


Q39.  If you wanted to upgrade your current 10g physical standby data guard configuration to 11g, can you upgrade the standby to 11g first then upgrade the primary ?


Ans:


Yes, in Oracle 11g, you can temporarily convert the physical standby database to a logical standby database to perform a rolling upgrade. When you issue the convert command you need to keep the identity: ALTER DATABASE RECOVER TO LOGICAL STANDBY KEEP IDENTITY;


Q40.  If you have a low-bandwidth WAN network, what can you do to improve the Oracle 11g data guard configuration in a GAP detected situation?


Ans:


Oracle 11g introduces the capability to compress redo log data as it transports over the network to the standby database.  It can be enabled using the compression parameter.  Compression becomes enabled only when a gap exists and the standby database needs to catch up to the primary database.


alter system set log_archive_dest_1='SERVICE=DBA11GDR COMPRESSION=ENABLE';


Q41.  In an Oracle 11g Logical Standby Data Guard configuration, how can you tell the dbms_scheduler to only run jobs in primary database?


Ans:


In Oracle 11g, logical standby provides support for DBMS_SCHEDULER.  It is capable of running jobs in both the primary and the logical standby database.  You can use the DBMS_SCHEDULER.SET_ATTRIBUTE procedure to set the database_role attribute, so that a job runs only when the database operates in that particular role.
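As a sketch (the job name is a placeholder), the role restriction can be set like this:

```sql
-- Run 'SCOTT.MY_JOB' (hypothetical name) only when this database
-- is operating in the PRIMARY role.
BEGIN
  DBMS_SCHEDULER.SET_ATTRIBUTE(
    name      => 'SCOTT.MY_JOB',
    attribute => 'database_role',
    value     => 'PRIMARY');
END;
/
```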


Q42.  How can you control when an archive log can be deleted in the standby database in oracle 11g ?


Ans:


In Oracle 11g, you can control it by using the log_auto_delete initialization parameter.  The log_auto_delete parameter must be coupled with the log_auto_del_retention_target parameter, which specifies the number of minutes an archived log is retained until it is purged (the default is 24 hours).  For archived log retention to be effective, the log_auto_delete parameter must be set to TRUE.
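On a logical standby these values are set through the DBMS_LOGSTDBY.APPLY_SET procedure; a minimal sketch (SQL Apply must be stopped first):

```sql
-- Stop SQL Apply before changing APPLY_SET parameters.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
EXECUTE DBMS_LOGSTDBY.APPLY_SET('LOG_AUTO_DELETE', 'TRUE');
-- Retain archived logs for 24 hours (1440 minutes) before purging.
EXECUTE DBMS_LOGSTDBY.APPLY_SET('LOG_AUTO_DEL_RETENTION_TARGET', '1440');
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
```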


Q43.  Can Oracle Data Guard be used with Standard Edition of Oracle ?


Ans:


Yes and no.  The automated features of Data Guard are not available in the Standard Edition of Oracle.  You can still, however, perform log shipping manually and write scripts to perform the steps.  If you are on a Unix platform, you can write shell scripts that identify the logs and then use the scp or sftp command to ship them to the standby server.  Then on the standby server, identify which logs have not been applied, apply/recover them manually, and remove them once applied.


Q44. What is the difference between Active Dataguard, and the Logical Standby implementation of 10g dataguard?


Ans:


Active Data Guard is mostly about the physical standby.


Use a physical standby for testing without compromising protection of the production system. You can open the physical standby read-write, do some destructive things in it (drop tables, change data, and so on), and run a test, perhaps with Real Application Testing. While this is happening, redo is still streaming from production; if production fails, you are covered. Use a physical standby for reporting while in managed recovery mode. Since a physical standby supports all of the datatypes and a logical standby does not (11g added broader support, but not 100%), there are times when a logical standby isn't sufficient. A physical standby also permits fast incremental backups when offloading backups to the standby database.


Q45. Can Oracle’s Data Guard be used on Standard Edition, and if so how? How can you test that the standby database is in sync?


Ans:


Oracle’s Data Guard technology is a layer of software and automation built on top of the standby database facility. In Oracle Standard Edition it is possible to be a standby database, and update it *manually*. Roughly, put your production database in archivelog mode. Create a hotbackup of the database and move it to the standby machine. Then create a standby controlfile on the production machine, and ship that file, along with all the archived redolog files to the standby server. Once you have all these files assembled, place them in their proper locations, recover the standby database, and you’re ready to roll. From this point on, you must manually ship, and manually apply those archived redologs to stay in sync with production.


To test your standby database, make a change to a table on the production server, and commit the change. Then manually switch a logfile so those changes are archived. Manually ship the newest archived redolog file, and manually apply it on the standby database. Then open your standby database in read-only mode, and select from your changed table to verify those changes are available. Once you’re done, shutdown your standby and startup again in standby mode.


Q46.What is data guard in simple language?


Ans:


Your primary database is running and you want to reduce downtime because of unplanned outages. You create a replica of this primary database (termed as standby database).


You regularly ship redo generated in the primary database to standby database and apply it there. So that is our ‘Data Guard’ standby database and it is in a continuous state of recovery, validating and applying redo to remain in sync with the primary database.


Q47.Your standby database was out of reach because of network issue. How will you synchronize it with primary database again?


Ans:


Data Guard automatically resynchronizes the standby following network or standby outages using redo data that has been archived at the primary.


Q48.What is Redo Transport Services (RTS)?


Ans:


This process takes care of the transmission of redo from a primary database to the standby database.


Q49. How does Redo Transport Services (RTS) work?


Ans:


Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database
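The RFS and apply processes described above can be observed on the standby with a quick query:

```sql
-- On the standby: which Data Guard processes are running,
-- their state, and the redo sequence they are working on.
SELECT process, status, thread#, sequence#, block#
FROM   v$managed_standby;
```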

Q50. How does synchronous transport (SYNC) work?


Ans:


Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

Remote File Server (RFS) transmits an acknowledgement back to the LNS process on the primary database

Log Network Server (LNS) notifies the LGWR that transmission is complete on the primary database.

Log Writer (LGWR) acknowledges the commit to the user.
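A SYNC destination of this kind is typically configured on the primary roughly as follows (the service name and DB_UNIQUE_NAME values are placeholders):

```sql
-- SYNC AFFIRM: LGWR waits until the standby confirms the redo
-- is on disk in the SRL; NET_TIMEOUT bounds the wait in seconds.
ALTER SYSTEM SET log_archive_dest_2=
  'SERVICE=stby SYNC AFFIRM NET_TIMEOUT=30
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stby'
  SCOPE=BOTH;
```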


Q51. How does asynchronous transport (ASYNC) work?


Ans:


Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

Q52.How Synchronous transport (SYNC) can impact the primary database performance?


Ans:


SYNC guarantees protection for every transaction that the database acknowledges as committed, but at the same time LGWR must wait for confirmation that data is protected at the standby before it can proceed with the next transaction. This can impact primary database performance, depending on factors such as:


the amount of redo information to be written

available network bandwidth

round-trip network latency (RTT)

standby I/O performance writing to the SRL.

distance between primary and standby databases, as network RTT increases with distance.
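The resulting transport and apply lag can be checked on the standby:

```sql
-- On the standby: how far redo transport and redo apply
-- are behind the primary.
SELECT name, value, time_computed
FROM   v$dataguard_stats
WHERE  name IN ('transport lag', 'apply lag');
```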

Q53.What is Data Guard’s Automatic Gap Resolution?


Ans:


Suppose your database is using the ASYNC transport method and the instance load is at its peak. If the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the online redo logs. Once the LNS has caught up, it automatically transitions back to reading and sending directly from the log buffer.


Q54.What is the difference between Physical standby and Logical standby database?


Ans:


The Data Guard apply process on the standby database can apply redo information directly, in which case it is called a physical standby.


Physical Standby: In this case the standby database is an exact, block-by-block, physical replica of the primary database. The change vectors received by the RFS process are directly applied to the standby database by using media recovery. Here the apply process reads data blocks, assembles redo changes from mappings, and then applies the redo changes to data blocks directly. Physical standby is the best choice for disaster recovery (DR) based upon its simplicity, transparency, high performance, and good data protection.

Logical Standby: In this case the standby database uses the SQL Apply method to "mine" the redo by converting it to logical change records, and then building SQL transactions and applying the SQL to the standby database. As this process of replaying the workload is more complex than the physical standby's process, it requires more memory, CPU, and I/O. One good advantage here is that a logical standby database can be opened read-write while SQL Apply is active, which means you can update (create/insert/delete etc.) local tables and schemas in the logical standby database.

Q55. How does the Data Guard apply process work if the primary and standby databases involve Oracle RAC?


Ans:


If the primary database is RAC but the standby is non-RAC: Each primary Oracle RAC instance ships its own thread of redo that is merged by the Data Guard apply process at the standby and applied in SCN order to the standby database.

If both primary and standby databases are RAC: If the standby is also an Oracle RAC database, only one instance (the apply instance) will merge and apply changes to the standby database. If the apply instance fails for any reason, the apply process will automatically fail over to a surviving instance in the Oracle RAC standby database when using the Data Guard broker.

Q56.What is Active Data Guard Option (Oracle Database 11g Enterprise Edition)?


Ans:


For a physical standby database prior to 11g, the database had to be in the mount state while media recovery was active, which means you were not able to query the standby database during media recovery because there was no read-consistent view.


Active Data Guard 11g solves the read consistency problem by use of a "query" SCN. The media recovery process on the standby database advances the query SCN after all the changes in a transaction have been applied. The query SCN appears to users as the CURRENT_SCN column in the V$DATABASE view on the standby database. Read-only users will only be able to see data up to the query SCN, guaranteeing the same read consistency as the primary database.


This enables a physical standby database to be open read-only while media recovery is active, making it useful for read-only workloads. Also, if you need read-write access to the standby database, you can use the SQL Apply method (logical standby).
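Enabling this on an 11g physical standby is, in outline, the following sequence on the standby:

```sql
-- Stop managed recovery, open the database read-only,
-- then restart real-time apply while it stays open.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;
```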


Q57.What are the important database parameters related to Data Guard corruption prevention?


Ans:


On the primary database:


DB_ULTRA_SAFE

Values can be DATA_AND_INDEX or DATA_ONLY. Setting DB_ULTRA_SAFE at the primary will also automatically set DB_LOST_WRITE_PROTECT=TYPICAL on the primary database.


In Oracle Database 11g Release 2 (11.2), the primary database automatically attempts to repair the corrupted block in real time by fetching a good version of the same block from a physical standby database.


On the standby database:


DB_BLOCK_CHECKSUM=FULL

DB_BLOCK_CHECKSUM detects redo and data block corruption on the primary database and protects the standby database. This parameter requires minimal CPU resources.


DB_LOST_WRITE_PROTECT=TYPICAL

A lost write can occur when an I/O subsystem acknowledges the completion of a write, while in fact the write did not occur in persistent storage.


This will create a stale version of the data block. When the DB_LOST_WRITE_PROTECT initialization parameter is set, the database records buffer cache block reads in the redo log, and this information is used to detect lost writes.
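A sketch of setting these parameters (DB_ULTRA_SAFE is a static parameter, so it takes effect after a restart):

```sql
-- On the primary (requires restart):
ALTER SYSTEM SET db_ultra_safe=DATA_AND_INDEX SCOPE=SPFILE;
-- On the standby (dynamic):
ALTER SYSTEM SET db_block_checksum=FULL;
ALTER SYSTEM SET db_lost_write_protect=TYPICAL;
```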


Q58.What are different Data Guard protection modes?


Ans:


Data Guard protection modes implement rules that control how the configuration responds to failures, enabling you to achieve specific objectives for data protection, availability, and performance.


Maximum Performance

– emphasis is on primary database performance over data protection.


– requires ASYNC (the default method) redo transport so that the LGWR process never waits for acknowledgment from the standby database.


– network connection between primary and standby OR the availability of the standby database DO NOT IMPACT the primary database performance.


Maximum Availability

– first emphasis is on availability and second priority is zero data loss protection.


– requires SYNC redo transport so primary database performance may be impacted in waiting for acknowledgment from the standby (it doesn’t mean indefinite wait in case standby database fails, maximum wait will be equal to parameter NET_TIMEOUT seconds).


Maximum Protection

– utmost priority is on data protection.


– also requires SYNC redo transport.


– unlike 'Maximum Availability' it does not consider the NET_TIMEOUT parameter, which means that if the primary does not receive acknowledgment from a SYNC standby database, it will stall the primary and eventually abort it, preventing any unprotected commits from occurring.


– highly recommended to use a minimum of two SYNC standby databases at different locations if using ‘Maximum Protection’ to have high availability of primary database.
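The protection mode is changed on the primary with a single statement; for example (SYNC redo transport must already be configured for the two higher modes):

```sql
-- One of: MAXIMIZE PERFORMANCE | MAXIMIZE AVAILABILITY | MAXIMIZE PROTECTION
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
-- Verify the configured mode and the level currently in effect:
SELECT protection_mode, protection_level FROM v$database;
```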


Q59.What is Switchover event?


Ans:


Switchover is useful for minimizing downtime during planned maintenance. It is a planned event in which Data Guard reverses the roles of the primary and a standby database.


Q60. What are the operations happens when switchover command is executed?


Ans:


primary database is notified that a switchover is about to occur.

all users are disconnected from the primary.

a special redo record is generated that signals the End Of Redo (EOR).

primary database is converted into a standby database.

the final EOR record is applied to standby database, this guarantees that no data has been lost, and it converts the standby to the primary role.

Q61. What is a Failover event?


Ans:


The failover process is similar to a switchover, except that the primary database never has the chance to write an EOR record, as this is an unplanned event.


Q62. What are the different types of Failover events?


Ans:


Manual: The administrator has complete control of primary/standby role transitions. It can lengthen the outage by the amount of time required for the administrator to be notified plus the manual execution of the command.

Automatic:It uses Data Guard’s Fast-Start Failover feature which automatically detects the failure, evaluates the status of the Data Guard configuration, and, if appropriate, executes the failover to a previously chosen standby database.

Q63.Which tools can be used for Data Guard Management?


Ans:


SQL*Plus – traditional method, can prove most tedious to use

Data Guard broker – automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. Simplifies and automates many administrative tasks. It has its own command line (DGMGRL) and syntax.

Enterprise Manager – requires that the Data Guard broker be enabled. It is a GUI to the Data Guard broker, replacing the DGMGRL command line and interfacing directly with the broker's monitor processes.

Q64.What is Data Guard 11g snapshot standby?


Ans:


With 11g, you can thoroughly test your changes on a true replica of your production system and database using actual production workload.


Data Guard 11g physical standby can now be converted to a snapshot standby, independent of the primary database, that is open read-write and able to be used for preproduction testing. It uses Flashback Database and sets a guaranteed restore point (GRP) at the SCN before the standby was open read-write.


Q65.What is the difference between Recovery Point Objective(RPO) and Recovery Time Objective (RTO)?


Ans:


Recovery Point Objective(RPO)

RPO concerns data. It is the amount of data you are willing to lose when a failure occurs in your database system. Usually people define data loss in terms of time, so possible values can be 5 seconds of data loss, 2 hours of data loss, etc. Remember that each standby database has its own set of attributes and parameters. It means you can mix zero data loss standby databases with minimal data loss standby databases in the same Data Guard configuration.


Recovery Time Objective (RTO)

RTO is defined as how fast you can get back up and running (whereas RPO is concerned with data loss). So with your RPO strategy you lost, say, only about 6 seconds of data as you committed to your client, but with RTO you need to formulate how fast clients can connect back to the database system after the data loss has occurred.


Q66.What are Standby Redo Log (SRL) files?


Ans:


The SRL files are where the Remote File Server (RFS) process at your standby database writes the incoming redo so that it is persistent on disk for recovery. SRL files are important for better redo transport performance and data protection. SRLs are a MUST in Maximum Availability or Maximum Protection mode and OPTIONAL (but recommended) in Maximum Performance mode. If there are no SRL files, then at each log switch in the primary database, the RFS process on the standby database that is serving an asynchronous standby destination has to create an archive log of the right size. While the RFS is busy creating the archive log file, the LNS process at the primary database has to wait, getting further and further behind the LGWR (in the case of Maximum Performance mode). That is why it is recommended to have SRL files in Maximum Performance mode as well.
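SRLs are created with ADD STANDBY LOGFILE; the file path and size below are placeholders (the size should match the online redo logs, with one more SRL group per thread than ORL groups):

```sql
-- On the standby (and ideally also on the primary, for role reversal):
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1
  GROUP 10 ('/u01/oradata/stby/srl10.log') SIZE 200M;
-- Verify:
SELECT group#, thread#, bytes/1024/1024 AS mb FROM v$standby_log;
```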


Q67.What is Fast Start Fail Over (FSFO)?


Ans:


The main criticism of Oracle standby databases has always been that too much manual interaction is required in a disaster situation. FSFO fills this gap. FSFO quickly and reliably fails over the target standby database to the primary database role, without requiring you to perform any manual steps to invoke the failover.  Please keep in mind that you need to have a broker configuration in place to be able to use the FSFO feature.


Q68.What is the concept of OBSERVER in Fast Start Fail Over ( FSFO)?


Ans:


In a normal scenario, if you have to perform a switchover in your standby setup, you keep some kind of monitoring on the setup so that you know when the primary database is not available and you need to switch over to the standby database. Oracle automated this manual observation activity by providing an OBSERVER process which constantly monitors the availability of the primary database. Now, if we run the OBSERVER on the primary or standby database server itself, there is a risk that the observer itself goes down when that server is lost in a disaster. That is why the observer is a separate OCI client-side component that runs on a different computer from the primary and standby databases.


Once the observer is started, no further user interaction is required. If both the observer and the designated standby database lose connectivity with the primary database for longer than the number of seconds specified by the FastStartFailoverThreshold configuration property, the observer will initiate a fast-start failover to the standby database.


Q69.What are the high level steps for configuring Fast Start Fail Over (FSFO)?


Ans:


To configure FSFO in your Standby setup, broad level steps will be:


STEP 1: Determine Which of the Available Standby Databases is the Best Target for the Failover. For example, you may prefer a physical standby over a logical one.


STEP 2: Specify the Target Standby Database with the FastStartFailoverTarget Configuration Property. This may not be required if you have only one standby database in configuration.


STEP 3: Determine the Protection Mode You Want . You will have to choose from either maximum performance or maximum availability. This is more of business decision and you will have to take consensus from all stakeholders on which mode will prove right for you.


STEP 4: Set the FastStartFailoverThreshold Configuration Property. This parameter tells how long (in seconds) the OBSERVER process should wait before starting a failover.


STEP 5: Set Other Properties Related to Fast-Start Failover (Optional). There are some other parameters like FastStartFailoverAutoReinstate,  ObserverOverride etc which can also be applicable to meet your specific requirements.


STEP 6: Enable Additional Fast-Start Failover Conditions (Optional). This step can give you some more options to define when your primary database is unusable, for example: stuck archiver, corrupt control file, etc.


STEP 7: Enable FSFO Using DGMGRL or Cloud Control. This is the main step, in which you enable FSFO.


STEP 8: Start the Observer. You can use Cloud Control or DGMGRL to start the observer process.


STEP 9: Verify the Fast-Start Failover Environment. DGMGRL command “SHOW FAST_START FAILOVER” can show you easily the status of FSFO.
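The verification in Step 9 can also be done from SQL on the primary; V$DATABASE exposes the fast-start failover state:

```sql
-- Fast-start failover status as seen from the database itself.
SELECT fs_failover_status,
       fs_failover_current_target,
       fs_failover_threshold,
       fs_failover_observer_present
FROM   v$database;
```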


Q70. What is Maximum protection mode in dataguard?


Ans:


MAXIMUM PROTECTION


This mode provides maximum protection and guarantees zero data loss. In this mode the redo/transaction data must be written to both the primary redo log and the standby redo log. If for any reason (mostly network issues) it is unable to write to the standby, the primary will shut down.



Q71. What is a snapshot standby database?


Ans:


Snapshot standby is a feature in Oracle 11g that allows read-write operations on the standby database, i.e. we can convert the physical standby database to a snapshot standby for testing purposes.  On it, we can do all types of testing (both read and write), or it can be used as a development database (an exact replica of production).  Once the testing is over, we can convert the snapshot standby back to a physical standby.  Once it is converted back to a physical standby database, whatever changes were done to the snapshot standby are reverted.
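The conversion in both directions is a short sequence; a sketch for 11g:

```sql
-- Physical standby -> snapshot standby (open read-write for testing):
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
ALTER DATABASE OPEN;
-- After testing: discard the changes and resume the physical standby role.
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
```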


Q72. What is the difference between switchover and failover?


Ans:


A switchover means just switching roles between the primary database and standby db.


In a switchover, the primary database changes to the standby role, and the standby database changes to the primary role. This is typically done for planned maintenance of the primary database server. A failover is when the primary database fails and one of the standby databases is transitioned to take over the primary role. Failover is performed only in the event of a catastrophic failure of the primary database, when there is no possibility of recovering the primary database in a timely manner. Failover may or may not result in data loss depending on the protection mode in effect at the time of the failover.
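Without the broker, a switchover is driven by SQL on each side; in outline (11g syntax, old primary must then be restarted as a standby):

```sql
-- On the primary:
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
-- (then SHUTDOWN IMMEDIATE; STARTUP MOUNT on the old primary)
-- On the standby, once it receives the End Of Redo:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
ALTER DATABASE OPEN;
```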


Q73. What is the process to apply a psu patch in dataguard setup?


Ans:


Make sure lag between primary and standby is zero.

Cancel the recovery (MRP) on standby.

Shutdown standby db and listener.

Apply patch to binary using opatch apply command.

Once the patch is applied to the binary, start the listener and the standby in mount stage or OPEN (if Active Data Guard).

Now shutdown primary db and listener.

Apply patch to binary using opatch apply command.

Once the patch is applied to the binary, start the listener and open the primary database.

Start the MRP recovery process on standby .
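The MRP stop/start steps above correspond to these commands on the standby:

```sql
-- Before patching: stop managed recovery.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
-- After both homes are patched and the databases are up:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```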

Q74. What is fal_client and fal_server parameter?


Ans:


FAL Means – Fetch Archive log. FAL_CLIENT and FAL_SERVER parameters are used on standby database for archive gap resolution.


FAL_SERVER and FAL_CLIENT parameters are required on standby database only .


FAL_SERVER: Specify an Oracle Net Service Name (TNS-Alias or Connect Descriptor) that points to the Database from where the missing ArchiveLog(s) should be requested i.e the tns alias of the primary db.


FAL_CLIENT: Specify an Oracle Net Service Name (TNS-Alias or Connect Descriptor) that points from the FAL_SERVER to the standby database, i.e. the TNS alias of the standby database.


fal_server = 'primdb'


fal_client = 'stdbydb'


Q75. What are different types of dataguard and their difference.


Ans:


There are two types of dataguard setups: PHYSICAL and LOGICAL.


PHYSICAL STANDBY:

A physical standby database is an exact, block-for-block copy of a primary database. A physical standby is maintained as an exact copy through a process called REDO APPLY, in which redo data received from a primary database is continuously applied to the physical standby database using the database recovery mechanisms, so it is always in sync with the primary.


This standby database can be opened in read-only mode (known as ACTIVE DATA GUARD) for reporting purposes. Most corporations use a physical standby in their Data Guard configuration.


LOGICAL STANDBY:

The logical standby database is kept synchronized with the primary database through SQL APPLY, which transforms the data in the redo received  from the primary database into SQL statements and then executes the SQL statements on the standby database. So it contains same logical information as that of production , but physical structure of data can be different.


Q76. What is MAXIMUM AVAILABILITY mode in dataguard?


Ans:


It provides the highest level of data protection that is possible without affecting the availability of the primary database. Transactions are not allowed to commit until all redo data is written to the online redo logs and propagated to at least one synchronized standby database. If for any reason the primary database cannot propagate its redo stream to one standby database, the primary will NOT shut down; it operates as if it were in maximum performance mode until the issue is fixed.


Q77.  What is MAXIMUM PERFORMANCE mode in dataguard?


Ans:


This is the default protection mode. With this protection mode, a transaction is committed as soon as the redo data needed to recover the transaction is written to the local (online) redo log.


Q78. Let's say a few archive logs are missing from the primary database server before they are shipped to the standby. In this case, how will you resolve the gap and keep the standby in sync with the primary?


Ans:


Recover the standby database by taking an incremental backup of the primary database from the SCN of the standby database.
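A hedged outline of this roll-forward (the SCN and paths are placeholders; the BACKUP/CATALOG/RECOVER steps run in RMAN, not SQL*Plus):

```sql
-- On the standby (SQL*Plus): find the SCN to roll forward from.
SELECT current_scn FROM v$database;
-- On the primary (RMAN):
--   BACKUP INCREMENTAL FROM SCN 1234567 DATABASE FORMAT '/tmp/stby_%U';
-- Copy the backup pieces to the standby, then on the standby (RMAN):
--   CATALOG START WITH '/tmp/stby_';
--   RECOVER DATABASE NOREDO;
-- Finally restart managed recovery on the standby.
```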


Q79. The support DBA added a tempfile on the primary database, but the tempfile is not reflected on the standby database even though standby_file_management is set to AUTO. Why?


Ans:


Adding tempfiles to TEMP tablespaces in the primary database will not automatically create them on the standby database, because no redo is generated while adding a tempfile. So the DBA has to add the tempfile manually.
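The manual fix is a one-liner on the standby (the path and size are placeholders; on a physical standby this is typically done while it is open read-only):

```sql
-- Add the missing tempfile to the standby's TEMP tablespace.
ALTER TABLESPACE temp
  ADD TEMPFILE '/u01/oradata/stby/temp01.dbf' SIZE 2G;
```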


Q80. How many standby databases we can create (in 10g/11g)?


Ans:


Up to Oracle 10g, 9 standby databases are supported.


From Oracle 11g R2, we can create 30 standby databases.


Q81. What are differences between physical, logical, snapshot standby and ADG (or) what are different types of standby databases?


Ans:


Physical standby – in mount state, MRP will apply archives

ADG (Active Data Guard) – in READ ONLY state, MRP will apply archives

Logical standby – in READ ONLY state, LSP will run

Snapshot standby databases – Physical standby database can be converted to snapshot standby database, which will be in READ WRITE mode, can do any kind of testing, then we can convert back snapshot standby database to physical standby database and start MRP which will apply all pending archives.

Q82. How to find out backlog of standby?


Ans:


select round((sysdate - a.next_time)*24*60) as "Backlog", m.sequence#-1 "Seq Applied", m.process, m.status from v$archived_log a, (select process, sequence#, status from v$managed_standby where process like '%MRP%') m where a.sequence# = (m.sequence#-1);


Q83. If you didn’t have access to the standby database and you want to find out what error has occurred in a Data Guard configuration, what view would you check in the primary database to check the error message?


Ans:


You can check the v$dataguard_status view. 


select message from v$dataguard_status;


Q84. How can you recover a standby which is far behind the primary (or) how can we bring the standby in sync without archive logs?


Ans:


By using RMAN incremental backup.


Q85. What is snapshot standby (or) How can we give a physical standby to user in READ WRITE mode and let him do updates and revert back to standby?


Ans:


Up to Oracle 10g: create a guaranteed restore point, open the standby read-write, let the user do DMLs, flash back to the restore point, and start MRP.


From Oracle 11g: convert the physical standby to a snapshot standby, let the user do DMLs on the snapshot standby, convert it back to a physical standby, and start MRP.


Q86. What are the uses of standby redo log files?


Ans:


A standby redo log resides on the standby database site. A standby redo log file is similar to an online redo log, except that it is used to store redo data received from another (primary) database.


Q87. What is failover/switchover (or) what is the difference between failover and switchover?


Ans:


Switchover – This is done when both primary and standby databases are available. It is pre-planned.

Failover – This is done when the primary database is NO longer available (i.e. in a Disaster). It is not pre-planned.

Q88. What are the background processes involved in Data Guard?


Ans:


RFS, MRP, LSP


Q89.What are the services required on the primary and standby data-base?


Ans:


The services required on the primary database are:


Log Writer Process (LGWR): Collects redo information and updates the online redo logs. It can also create local archived redo logs and transmit online redo to standby databases.

Archiver Process (ARCn): One or more archiver processes make copies of online redo logs either locally or remotely for standby databases.

Fetch Archive Log (FAL) Server: Services requests for archive redo logs from FAL clients running on multiple standby databases. Multiple FAL servers can be run on a primary database, one for each FAL request.

Log network server (LNS): LNS is used on the primary to initiate a connection with the standby database.

The services required on the standby database are:


Fetch Archive Log (FAL) Client: Pulls archived redo log files from the primary site. Initiates transfer of archived redo logs when it detects a gap sequence.

Remote File Server (RFS): Receives archived and/or standby redo logs from the primary database.

Archiver (ARCn) Processes: Archives the standby redo logs applied by the managed recovery process (MRP).

Managed Recovery Process (MRP): Applies archive redo log information to the standby database.

Redo transport services control the automated transfer of redo data from the production database to one or more archival destinations.

Q90.How to delay the application of logs to a physical standby?


Ans:


A standby database automatically applies redo logs when they arrive from the primary database. But in some cases, we want to create a time lag between the archiving of a redo log at the primary site, and the application of the log at the standby site.


Modify the Log_Archive_Dest_n initialization parameter on the primary database to set a delay for the standby database.


Example: For 60min Delay:


ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=stdby_srvc DELAY=60';


The DELAY attribute is expressed in minutes.


The archived redo logs are still automatically copied from the primary site to the standby site, but the logs are not immediately applied to the standby database. The logs are applied when the specified time interval expires.
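If you later need the standby to catch up immediately, the configured delay can be overridden on the standby:

```sql
-- Ignore the DELAY attribute and apply redo as it arrives.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY
  DISCONNECT FROM SESSION;
```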


Q91.What are the parameters we’ve to set in primary/standby for Data Guard?


Ans:


DB_UNIQUE_NAME

LOG_ARCHIVE_CONFIG

LOG_ARCHIVE_MAX_PROCESSES

DB_CREATE_FILE_DEST

DB_FILE_NAME_CONVERT

LOG_FILE_NAME_CONVERT

LOG_ARCHIVE_DEST_n

LOG_ARCHIVE_DEST_STATE_n

FAL_SERVER

FAL_CLIENT

STANDBY_FILE_MANAGEMENT
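A hedged example of setting the main parameters on the primary (the db_unique_names 'prim'/'stby' and the service name are placeholders):

```sql
ALTER SYSTEM SET log_archive_config='DG_CONFIG=(prim,stby)' SCOPE=BOTH;
ALTER SYSTEM SET log_archive_dest_2=
  'SERVICE=stby ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stby';
ALTER SYSTEM SET log_archive_dest_state_2=ENABLE;
ALTER SYSTEM SET fal_server='stby';   -- on the standby this points to 'prim'
ALTER SYSTEM SET standby_file_management=AUTO;
```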

Q92.What is active data guard? Does it needs additional licensing?


Ans:


Active Data Guard means the standby database is open in read-only mode while redo logs are being applied in real time.


Below are the benefits of using Active Data Guard:

Reporting queries can be offloaded to standby database.

Physical block corruptions are repaired automatically either at primary or physical standby database.

RMAN backups can be initiated from the standby instead of the primary, which reduces CPU load on the primary.

Q93.What is active dataguard duplicate?


Ans:


Starting from 11g we can duplicate a database in two ways:


Active DB duplicate 

Backup-based duplicate.

Active DB duplicate copies the live TARGET database over the network to the AUXILIARY destination and then creates the duplicate database. In an active duplication process, target database online image copies and archived redo log files are copied through the auxiliary instance service name, so there is no need for a target database backup.