
Wednesday, June 28, 2017

RAC (Real Application Clusters) Details

RAC:
RAC allows multiple database instances, running on different servers, to access a single database.
A cluster provides high availability by eliminating single points of failure.
11gR2 RAC features

  • We can store everything on ASM; the OCR and voting files can also be stored on ASM.
  • Single Client Access Name (SCAN) - eliminates the need to change the TNS entry when nodes are added to or removed from the cluster. RAC instances register with the SCAN listeners as remote listeners. SCAN is a fully qualified name. Oracle recommends assigning 3 addresses to the SCAN, which creates three SCAN listeners.
  • By default, LOAD_BALANCE is ON.
  • GSD (Global Service Daemon); gsdctl introduced.
  • Grid Naming Service (GNS) is a new service introduced in Oracle RAC 11gR2. With GNS, Oracle Clusterware (CRS) can manage Dynamic Host Configuration Protocol (DHCP) and DNS services for dynamic node registration and configuration.
  • Oracle Local Registry (OLR) - introduced in Oracle 11gR2 as part of Oracle Clusterware. The OLR is similar to the Oracle Cluster Registry, but it stores information only about the local node. The OLR is not shared by other nodes in the cluster and is used by OHASd while starting or joining the cluster. The OLR stores information typically required by OHASd, such as the version of Oracle Clusterware, the configuration, and so on. Oracle stores the location of the OLR in a text file named /etc/oracle/olr.loc, which points to the OLR configuration file $GRID_HOME/cdata/<hostname>.olr.
  • What is the OLR and why is it required?
While starting the clusterware, it needs to access the OCR to know which resources it needs to start. However, the OCR file is stored inside ASM, which is not accessible at this point (because the ASM resource itself is defined in the OCR file).
    To avoid this chicken-and-egg problem, the resources which need to be started on the node are stored on the operating system file system in a file called the OLR (Oracle Local Registry). Each node has its own OLR file.
      Where is the OLR stored? When is the OLR backup created?

      By default, OLR is located at Grid_home/cdata/host_name.olr

      The OLR is backed up after an installation or an upgrade. After that time, you can only manually back up the OLR. Automatic backups are not supported for the OLR.
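      A manual OLR backup can be taken and listed with ocrconfig using the -local flag. A minimal illustration (run as root):

      # ocrconfig -local -manualbackup     -- takes a manual backup of the OLR
      # ocrconfig -local -showbackup       -- lists the available OLR backups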

      If the OLR file is missing, how can you restore it from backup?

      # crsctl stop crs -f
      # touch $GRID_HOME/cdata/<node>.olr                  -- recreate an empty OLR file
      # chown root:oinstall $GRID_HOME/cdata/<node>.olr    -- set the expected ownership
      # ocrconfig -local -restore $GRID_HOME/cdata/<node>/backup_<date>_<num>.olr
      # crsctl start crs

      Someone deleted the OLR file by mistake and no backups are currently available. What will be the impact and how can you fix it?

      If the OLR is missing and the cluster is already running, the cluster will keep running fine. But if you try to restart it, it will fail.

      So you need to perform the activities below.

      On the failed node:

      # $GRID_HOME/crs/install/rootcrs.pl -deconfig -force
      # $GRID_HOME/root.sh
      • Multicasting is introduced in 11gR2 for private interconnect traffic.
      • In Oracle 10g RAC and 11gR1 RAC, Oracle Clusterware and ASM are installed in different Oracle homes, and the Clusterware has to be up before the ASM instance can be started, because the ASM instance uses the clusterware to access the shared storage. Oracle 11gR2 introduced the Grid Infrastructure home, which combines Oracle Clusterware and ASM. The OCR and voting disk of the 11gR2 clusterware can be stored in ASM.
      Oracle Clusterware:
      Clusterware is the software that lets the servers operate as a single cluster. It is run by Cluster Ready Services (CRS) using two key components: the voting disk, which records node membership information, and the Oracle Cluster Registry (OCR). Clusterware monitors all cluster components, such as instances and listeners.

      Voting Disk –
      Voting disks are an important component of Oracle Clusterware. A voting disk is a file that resides on shared storage; its primary function is to manage node membership and prevent split-brain syndrome. The voting disk reassigns cluster ownership between the nodes in case of failure. Each voting disk must be accessible by all nodes in the cluster. If a node is not passing heartbeats to the other nodes or to the voting disk, that node will be evicted. We must have an odd number of voting disks. Oracle recommends a minimum of 3 and a maximum of 5. In 10g, Clusterware supports up to 32 voting disks, while 11gR2 supports up to 15.
      In 11g Release 2, voting disk data is automatically backed up in the OCR whenever there is a configuration change. Oracle recommends NOT using the dd command to back up or restore voting disks, as this can lead to loss of the voting disk.
      Also, the data is automatically restored to any voting disk that is added.
      What information is stored in the voting disk/file?
      It contains 2 types of data:
      Static data: information about the nodes in the cluster
      Dynamic data: disk heartbeat logging
      It contains the important details of cluster node membership, such as:
      a. which nodes are part of the cluster,
      b. which node is leaving the cluster, and
      c. which node is joining the cluster.

      What is the purpose of the voting disk?

      The voting disk stores information about the nodes in the cluster and their heartbeats, as well as cluster membership information.

      Why do we need the voting disk?

      Oracle Clusterware uses the VD to determine which nodes are members of a cluster. Oracle Cluster Synchronization Service daemon (OCSSD) on each cluster node updates the VD with the current status of the node every second. The VD is used to determine which RAC nodes are still in the cluster should the interconnect heartbeat between the RAC nodes fail.

      To find current location of Voting disk:
      [oracle@rsingle ~]$ crsctl query css votedisk

      Voting disk backup  (In 10g)
      dd if=<voting-disk-path> of=<backup/path>

      Add/delete vote disk
      crsctl add css votedisk <path> -adds a new voting disk
      crsctl delete css votedisk <path> -- deletes the voting disk
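      Note that add/delete are used when the voting files are on non-ASM shared storage. In 11gR2, when the voting files live in an ASM disk group they are managed as a set with the replace command; for example (the disk group name +DATA is only an assumption):
      crsctl replace votedisk +DATA     -- places/moves all voting files into the +DATA disk group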

      OCR:
      The OCR is the central repository for CRS and stores details about the services and the status of the resources. It is a binary file which resides on shared storage and is accessible by all nodes. It is created at the time of Grid installation. It stores the information needed to manage Oracle Clusterware and its components, such as the RAC database, listeners, VIPs, SCAN IPs and services. The OCR contains information such as which database instance runs on which node and which services run on which database. Oracle Clusterware automatically creates OCR backups every 4 hours. At any one time, Oracle Clusterware always retains the latest 3 backup copies of the OCR that are 4 hours, 1 day and 1 week old.

      Oracle stores the location of the OCR file in a text file called ocr.loc, which is located in different places depending on the operating system. For example, on Linux-based systems the ocr.loc file is placed under the /etc/oracle directory, and for UNIX-based systems the ocr.loc is placed in /var/opt/oracle. Windows systems use the registry key Hkey_Local_Machine\software\Oracle\ocr to store the location of the ocr.loc file.

      What is the OCR and what does it contain?
      The OCR is the central repository for CRS; it stores the metadata, configuration and state information for all cluster resources defined in the clusterware. It is a binary file which resides on shared storage, is accessible by all nodes, and is created at the time of Grid installation.
      • node membership information
      • status of cluster resources like databases, instances, listeners and services
      • ASM disk group information
      • OCR and voting disk information, their locations and backups
      • VIP and SCAN VIP details
      Who updates the OCR and how/when does it get updated?

      The OCR is updated by client applications and utilities through the CRSd process:

      1. tools like DBCA, DBUA, NETCA, ASMCA, CRSCTL and SRVCTL, through the CRSd process,

      2. CSSd during cluster setup,

      3. CSS during node addition/deletion.

      Each node maintains a copy of the OCR in memory. Only one CRSd (the master) performs reads and writes to the OCR file. Whenever some configuration is changed, the CRSd process refreshes the local OCR cache and the remote OCR caches, and updates the OCR file on disk.

      So whenever we fetch cluster information using srvctl or crsctl, the local OCR cache is used. But when we modify something, the CRSd process updates the physical OCR file.

      The OCR file has been corrupted and there is no valid backup of the OCR. What will be the action plan?

      In this case, we need to deconfigure and reconfigure the cluster.

      Deconfig can be done using the rootcrs.sh -deconfig option,

      and reconfig can be done using the gridsetup.sh script.

      There is no way to customize the backup frequency or the number of files that Oracle Grid Infrastructure retains while automatically backing up the OCR.
      To find the location of the current OCR:
      [oracle@rsingle ~]$ ocrcheck
      ./ocrconfig -manualbackup
      ./ocrconfig -showbackup
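      If the OCR ever has to be restored from one of these automatic backups, the general approach is the following (a hedged outline; the backup file name is a placeholder):
      # crsctl stop crs -f                       -- stop clusterware on all nodes
      # ocrconfig -restore <ocr_backup_file>     -- restore the OCR from a backup, run as root on one node
      # crsctl start crs                         -- restart clusterware on all nodes
      # ocrcheck                                 -- verify OCR integrity after the restore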

      Important daemons:

      The CRS stack has four main components that run as daemons or processes.
      OPROCd - Process Monitor Daemon
      CRSd - CRS daemon; maintains the OCR and manages cluster resources. If it fails, the daemon is restarted automatically without a node restart.
      OCSSd - Oracle Cluster Synchronization Service Daemon; manages node membership. The failure of this daemon results in a node reboot to avoid data corruption.

      [ocssd, crsd, evmd, oprocd, racgmain, racgimon]

      EVMd - Event Manager Daemon.

      Oracle High Availability Services Daemon (OHASD):
      OHAS is implemented via a new daemon process which is called ohasd.

      Oracle High Availability Services Daemon (OHASD) anchors the lower part of the Oracle Clusterware stack, which consists of processes that facilitate cluster operations in RAC databases.  This includes the GPNPD, GIPC, MDNS and GNS background processes.

      To enable OHAS: crsctl enable crs; this will cause OHAS to autostart when each node reboots. To verify that OHAS is running, check for the CRS-4123 message in the clusterware alert log.
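      For example (illustrative commands; the exact output messages vary by version):
      # crsctl enable crs     -- configure OHAS/CRS to autostart at boot
      # crsctl check has      -- reports whether Oracle High Availability Services is online
      # crsctl check crs      -- checks the full CRS stack on the local node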

      CRS:Cluster Ready Services:
      ps -ef | grep -i crs | grep -v grep
      root 25863 1 1 Oct27 ? 11:37:32 /opt/oracle/grid/product/11.2.0/bin/crsd.bin
      crsd.bin:
      The above process is responsible for the start, stop, monitoring and failover of resources. It maintains the OCR and restarts resources when failures occur. The daemon is restarted automatically if it fails (no node restart) and runs as the root user.

      CSS:Cluster Synchronization Service
      $ ps -ef | grep -v grep | grep css
      root 19541 1 0 Oct27 ? 00:05:55 /opt/oracle/grid/product/11.2.0/bin/cssdmonitor
      root 19558 1 0 Oct27 ? 00:05:45 /opt/oracle/grid/product/11.2.0/bin/cssdagent
      oragrid 19576 1 6 Oct27 ? 2-19:13:56 /opt/oracle/grid/product/11.2.0/bin/ocssd.bin

      CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor).
      CSS Monitor (cssdmonitor):
      Monitors node hangs (via oprocd functionality), monitors ocssd process hangs (via oclsomon functionality) and monitors vendor clusterware (via vmon functionality).

      CSS Agent (cssdagent):
      Spawned by the OHASD process. Replaces the 10g oprocd and is responsible for I/O fencing. Killing this process causes a node reboot. It stops, starts and checks the status of the ocssd.bin daemon.

      CSS daemon (ocssd.bin):
      Manages cluster node membership and runs as the oragrid user. Failure of this process results in a node restart.


      EVM: Event Manager:
      A background process that publishes Oracle Clusterware events. It monitors the message flow between the nodes and logs the relevant event information to the log files.

      Diskmon :
      Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started.

      ONS/eONS:
      ONS is Oracle Notification Service. eONS is a Java Process.

      OPROCD:
      Runs as root and provides node fencing instead of hangcheck timer kernel module

      RACG (racgmain, racgimon):
      Extends clusterware to support Oracle-specific requirements and complex resources.
      CTSS:
      Cluster Time Synchronization Service daemon (ctssd): manages time synchronization between the nodes, rather than depending on NTP.
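      A quick way to confirm which mode CTSS is running in:
      crsctl check ctss     -- reports whether CTSS is in observer mode or active mode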
      Gipcd :
      Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure
      Oracle Agent:
      Oracle Root Agent (orarootagent):
      A specialized oraagent process that helps CRSD manage resources owned by root, such as the network and the Grid virtual IP address.
      Oclskd :
      Cluster kill daemon (oclskd): Handles instance/node evictions requests that have been escalated to CSS .
      Oracle High Availability Service:
      Gnsd :
      Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster.
      Mdnsd :
      Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows.

      Oracle RAC instances run the following additional background processes:
      LMON    — Global Enqueue Service Monitor
      LMD     — Global Enqueue Service Daemon
      LMS     — Global Cache Service Process
      LCK0    — Instance Enqueue Process
      DIAG    — Diagnosability Daemon
      RMSn    — Oracle RAC Management Processes (RMSn)
      RSMN    — Remote Slave Monitor
      DBRM    — Database Resource Manager (from 11g R2)
      PING    — Response Time Agent (from 11g R2)
      ACMS    — Atomic Control file to Memory Service (ACMS)(from Oracle 11g)
      GTX0-j  — Global Transaction Process (from Oracle 11g)
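      A simple way to see which of these background processes are actually running on an instance (a sketch; the paddr filter just excludes processes that are not started):
      select name, description from v$bgprocess where paddr <> '00' and name like 'LM%';
      ps -ef | grep -E 'lmon|lmd|lms' | grep -v grep     -- the same processes at the OS level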

      What is Split Brain?
      In an Oracle RAC environment all the instances/servers communicate with each other using high-speed interconnects on the private network. This private network interface or interconnect is redundant and is only used for inter-instance Oracle data block transfers. In an Oracle RAC system, split brain occurs when the instance members in a RAC fail to ping/connect to each other via this private interconnect, but the servers are all physically up and running and the database instance on each of these servers is also running. These individual nodes are running fine and can conceptually accept user connections and work independently. So basically, due to lack of communication, each instance thinks that the other instance it cannot reach is down, and it needs to do something about the situation. The problem is that if we leave these instances running, the same block might get read and updated in these individual instances, and there would be data integrity issues, as blocks changed in one instance will not be locked and could be overwritten by another instance. Oracle has efficiently implemented checks for the split-brain syndrome.

      In a cluster, a private interconnect is used by cluster nodes to monitor each node’s status and communicate with each other. When two or more nodes fail to ping or connect to each other via this private interconnect, the cluster gets partitioned into two or more smaller sub-clusters each of which cannot talk to others over the interconnect. Oblivious of the existence of other cluster fragments, each sub-cluster continues to operate independently of the others. This is called “Split Brain”. In such a scenario, integrity of the cluster and its data might be compromised due to uncoordinated writes to shared data by independently operating nodes. Hence, to protect the integrity of the cluster and its data, the split-brain must be resolved.

      Split-brain syndrome occurs when the instances in a RAC fail to connect or ping to each other via the private interconnect. So, in a two-node situation both the instances will think that the other instance is down because of the lack of connection. The problem which can arise out of this situation is that the same block might get read and updated in these individual instances, which causes data integrity issues, because a block changed in one instance will not be locked and could be overwritten by another instance. Both instances start working independently.


      Interconnect –
      is the private network that connects all the servers in the cluster.
      The interconnect uses a switch that only the nodes in the cluster can access. Instances in the cluster communicate with each other over the interconnect.

      Global Resource Directory (GRD): It records and stores the current status of data blocks; whenever a block is transferred from a local cache to another instance, the GRD is updated.
      It has two parts:
      1. GCS (Global Cache Service)
      2. GES (Global Enqueue Service)

      Global Cache Service (GCS): management of data sharing and exchange is done by GCS. It holds information about current locks and about instances waiting to acquire a lock on a block. Implemented by the LMS background process.
      Global Enqueue Service (GES): handles non-data-block resources and controls dictionary and library cache locks and transactions. Implemented by LMD.

      SCAN:Single Client Access Name
      Oracle RAC 11g Release 2 introduces the Single Client Access Name (SCAN), which provides a single name for clients to access Oracle databases running in a cluster and simplifies the database connection strings that an Oracle client uses to connect. It eliminates the need to change the TNSNAMES entry when nodes are added to or removed from the cluster.
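      For illustration, a typical client TNS entry pointing at the SCAN might look like the one below (the SCAN name myrac-scan.example.com and the service name orcl are assumptions, not values from this post):

      ORCL =
        (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = myrac-scan.example.com)(PORT = 1521))
          (CONNECT_DATA =
            (SERVER = DEDICATED)
            (SERVICE_NAME = orcl)
          )
        )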

      Difference between CRSCTL and SRVCTL?

      The crsctl command is used to manage the elements of the clusterware (CRS, CSSd, OCR, voting disk, etc.), while srvctl is used to manage the elements of the cluster (databases, instances, listeners, services, etc.).
      Both commands were introduced with Oracle 10g and have been improved since.
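      A few illustrative commands showing the split in responsibilities (the database name ORCL is an assumption):
      crsctl check cluster -all         -- clusterware stack status on all nodes
      crsctl stat res -t                -- status of all clusterware-managed resources
      srvctl status database -d ORCL    -- status of the RAC database and its instances
      srvctl status scan_listener       -- status of the SCAN listeners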


      What is cache fusion?
      The transfer of data between RAC instances over the private network. Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of the participating nodes in the cluster. When a block of data has been read from a datafile by one instance in the cluster and another instance needs the same block, it is cheaper to ship the block image from the instance which already has it in its SGA than to read it again from disk.

      With Cache Fusion, Oracle RAC transfers the data block from the buffer cache of one instance to the buffer cache of another instance using the cluster's high-speed interconnect.

      What is FAN?
      Ans:
      Applications can use Fast Application Notification (FAN) to enable rapid failure detection, balancing of connection pools after failures, and re-balancing of connection pools when failed components are repaired. The FAN process uses system events that Oracle publishes when cluster servers become unreachable or if network interfaces fail.

      What is FCF?
      Ans:
      Fast Connection Failover provides high availability to FAN integrated clients, such as clients that use JDBC, OCI, or ODP.NET. If you configure the client to use fast connection failover, then the client automatically subscribes to FAN events and can react to database UP and DOWN events. In response, Oracle gives the client a connection to an active instance that provides the requested database service.

      What is TAF and what are TAF policies?
      Ans:
      Transparent Application Failover (TAF) - A runtime failover for high availability environments, such as Real Application Clusters and Oracle Real Application Clusters Guard, TAF refers to the failover and re-establishment of application-to-service connections. It enables client applications to automatically reconnect to the database if the connection fails, and optionally resume a SELECT statement that was in progress. This reconnect happens automatically from within the Oracle Call Interface (OCI) library.
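      A TAF-enabled TNS entry is typically configured with the FAILOVER_MODE clause; a sketch only (host and service names are assumptions):

      ORCL_TAF =
        (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = myrac-scan.example.com)(PORT = 1521))
          (CONNECT_DATA =
            (SERVICE_NAME = orcl)
            (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 30)(DELAY = 5))
          )
        )

      TYPE=SELECT allows an in-progress SELECT to resume on the surviving instance, while METHOD=BASIC establishes the failover connection only at the time of failure.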

      Why is Clusterware installed as root (and not as oracle)?

      What is the difference between a CR block and a CUR (current) block?

      What is fencing?
      Ans:
      I/O fencing prevents updates by failed instances by detecting failure and preventing split brain in the cluster. When a cluster node fails, the failed node needs to be fenced off from all the shared disk devices or disk groups. This methodology is called I/O fencing, sometimes called disk fencing or failure fencing.

      Grid Naming Service (GNS):It is another new service introduced in Oracle RAC 11g R2. With GNS, Oracle Cluster Software (CRS) can manage DHCP and DNS Services for the dynamic node registration and configuration.

      Major RAC wait events?
      In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.

      gc cr request: the time it takes to retrieve the data from the remote cache.

      Reason: RAC traffic using a slow connection, or inefficient queries (poorly tuned queries increase the number of data blocks
      requested by an Oracle session; the more blocks requested, the more often a block needs to be read from a remote instance via the interconnect).
      gc buffer busy: the time the remote instance spends locally accessing the requested data block.

      To verify that RAC instances are running:

      select * from V$ACTIVE_INSTANCES;
      select inst_id,username,failover_method,failover_type,failed_over from gv$session where username='&username';
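      Another common check is gv$instance, which shows every instance in the cluster from any node:
      select inst_id, instance_name, host_name, status from gv$instance;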

      ACMS?
      ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures a distributed SGA memory update is either globally committed on success or globally aborted in the event of a failure.

      How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
      You might wonder how CSSD, which is required to start the clustered ASM instance, can be started if the voting disks are stored in ASM.

      Without access to the voting disks there is no CSS, hence the node cannot join the cluster.
      But without being part of the cluster, CSSD cannot start the ASM instance.
      To solve this problem the ASM disk headers have new metadata in 11.2:
      you can use kfed to read the header of an ASM disk containing a voting disk.
      The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file. This does not require the ASM instance to be up.
      Once the voting disks are located, CSS can access them and join the cluster.
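      For example, kfed can be pointed directly at a candidate disk device (the device path below is only an assumption for illustration):
      kfed read /dev/oracleasm/disks/DATA01 | grep -E 'vfstart|vfend'     -- non-zero kfdhdb.vfstart/kfdhdb.vfend values mean this disk holds a voting file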

      In Oracle RAC clusters, we see three types of IP addresses:

      Public IP: The public IP address is for the server. This is the same as any server IP address, a unique address which exists in /etc/hosts.
      Private IP: Oracle RAC requires "private IP" addresses to manage the CRS, the clusterware heartbeat process and the cache fusion layer.
      Virtual IP: Oracle uses a Virtual IP (VIP) for database access. The VIP must be on the same subnet as the public IP address. The VIP is used for RAC failover (TAF).

      A VIP is simply another IP which runs on the same interface (for example eth0) as your public IP.
      A VIP exists on every node, just like each node's own public IP. Your listener is aware of both the public IP and the VIP,
      and it listens on both. In case of a failover, the VIP of node 1 is shifted to node 2.
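      The VIP configuration and its current placement can be checked with srvctl (the node name node1 is an assumption):
      srvctl status nodeapps        -- shows VIP, network and ONS status on each node
      srvctl config vip -n node1    -- shows the VIP configured for node1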

      roothas.sh vs rootcrs.sh

      Both reside in $GRID_HOME/crs/install.
      roothas.pl (roothas.sh) is used when you run the Grid Infrastructure in standalone mode (Oracle Restart, i.e. Grid Infrastructure for a standalone server).
      rootcrs.pl (rootcrs.sh) is used when you run the Grid Infrastructure in cluster mode (a normal cluster comprising one or more nodes).

      Refer: https://www.dbaplus.ca/2020/05/12201-initohasd-does-not-start.html

      What is node eviction?

      Word Meaning: to force(someone) to leave the place.

      The process of removing a failed node (failed due to various reasons) from the cluster is known as eviction. Prior to 11gR2, Oracle tried to prevent a split-brain situation by quickly rebooting the failed node. From 11gR2 onwards, the Clusterware will first attempt to clean up the failed resources. If the clusterware is able to clean up the failed resources, OHASD will try to restart the CRS stack, and once this is done all the cluster resources on that node are started automatically. This is called rebootless fencing (or eviction). If the clusterware cannot stop or clean up the failed resources, it will reboot the node.

      Causes of node eviction :

      -Missing network heartbeat
      -Missing disk heartbeat
      -CPU starvation issues
      -Hanging cluster processes
      -May have more...
      How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation [ID 942166.1]
      http://oracle-help.com/oracle-rac/node-eviction-oracle-rac/
      https://db.geeksinsight.com/2012/12/27/oracle-rac-node-evictions-11gr2-node-eviction-means-restart-of-cluster-stack-not-reboot-of-node/

      After an Oracle RAC node crashes, rerouting transactions to the surviving node:

      https://community.oracle.com/tech/developers/discussion/1079972/shutdown-one-node-in-rac

      Transparent Application Fail-over in Oracle RAC
      http://oracle-help.com/oracle-rac/transparent-application-fail-over-in-oracle-rac/

      What is GPNP profile?

      The GPnP profile is a small XML file located in GRID_HOME/gpnp/<hostname>/profiles/peer under the name profile.xml. It is used to establish the correct global personality of a node. Each node maintains a local copy of the GPnP profile, and it is maintained by the GPnP daemon (GPnPD). The GPnP profile is used to store information required for the startup of Oracle Clusterware, like the SPFILE location, ASM disk string, etc. It contains various attributes defining the node personality:
      Cluster name
      Network classifications (public/private)
      Storage to be used for ASM: SPFILE location, ASM disk string, etc.
      Digital signature information: the profile is security sensitive. It might identify the storage to be used as the root partition of a machine. Hence, it contains the digital signature information of the provisioning authority.

      What is GPNP profile?

      The Grid Plug and Play (GPnP) profile is a small XML file present on the local OS file system; each node has its own GPnP file.
      The GPnP file is managed by the GPnP daemon.
      It stores information like the ASM disk string and the ASM spfile location, which are required to start the cluster:

      – Storage to be used for CSS
      – Storage to be used for ASM: SPFILE location, ASM disk string
      – public/private network details.
      When the clusterware is started, it needs the voting disk (which is inside ASM). So first it checks the GPnP profile to get the voting disk location (asm_diskstring is defined inside the GPnP profile). As ASM is not up at this point, the voting file is read using the kfed utility (kfed can be run even when the ASM instance is down).
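      The current profile can be dumped with the gpnptool utility from the Grid home, for example:
      $GRID_HOME/bin/gpnptool get     -- prints profile.xml (cluster name, network classifications, ASM discovery string, SPFILE location)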

      https://rafik-dba.blogspot.com/2019/02/oracle-rac-startup-sequence.html#:~:text=ORACLE%20RAC%20STARTUP%20SEQUENCE%201%20ONS%3A-%20Oracle%20notification,for%20cluster.%203%20SCAN%20Listener%3A-%204%20Node%20Listener%3A-

      ==========
      What are the software stacks in Oracle Clusterware?

      From 11g onwards, there are two stacks in Oracle Clusterware (CRS):

      the lower stack is the High Availability Services stack (managed by the ohasd daemon), and
      the upper stack is the CRSD stack (managed by the CRSd daemon).

      What are the roles of CRSD, CSSD, CTSSD, EVMD and GPNPD?

      CRSD – Cluster Ready Services daemon – manages the cluster resources based on the OCR information. This includes the start, stop and failover of resources. It monitors database instances, ASM instances, listeners, services, etc., and automatically restarts them when a failure occurs.

      CSSD – Cluster Synchronization Services daemon – manages the cluster configuration, such as which nodes are part of the cluster. When a node is added or deleted, it informs the other nodes. It is also responsible for node eviction when the situation requires it.

      CSSD has 3 processes:

      the CSS daemon (ocssd),

      the CSS agent (cssdagent) – monitors the cluster and provides input/output fencing,

      the CSS monitor (cssdmonitor) – monitors internode cluster health.

      CTSSD – provides time management for the cluster. If NTP is running on the server, CTSS runs in observer mode.

      EVMD – Event Manager – a background process that publishes Oracle Clusterware events, manages message flow between the nodes and logs relevant information to the log files.

      oclskd – Cluster Kill daemon – used by CSS to reboot a node based on requests from other nodes in the cluster.

      Grid IPC daemon (gipcd): a helper daemon for the communications infrastructure.

      Grid Plug and Play daemon (GPNPD): provides access to the Grid Plug and Play profile and coordinates updates to the profile among the nodes of the cluster, ensuring that all of the nodes have the most recent profile.

      Multicast Domain Name Service (mDNS): Grid Plug and Play uses the mDNS process to locate profiles in the cluster; it is also used by GNS to perform name resolution.
      Oracle Grid Naming Service (GNS): handles requests sent by external DNS servers, performing name resolution for names defined by the cluster.

      The ASM spfile is stored inside an ASM disk group, so how does clusterware start the ASM instance (given that the ASM instance needs the ASM spfile to start up)?

      Here is the sequence of cluster startup:

      ohasd is started by init.ohasd.

      ohasd accesses the OLR file (stored on the local file system) to initialize the ohasd process.

      ohasd starts gpnpd and cssd.

      The cssd process reads the GPnP profile to get information like the asm_diskstring and the ASM spfile location.

      cssd scans all the ASM disk headers, finds the voting disk location, reads it using kfed and joins the cluster.

      To read the spfile, it is not necessary to mount the disk group. All the information necessary for this is stored in the ASM disk header. OHASD reads the header of the ASM disk containing the spfile (the spfile location is retrieved from the GPnP profile), and the contents of the spfile are read using kfed. Using this ASM spfile, the ASM instance is started.

      Now that the ASM instance is up, the OCR can be accessed, as it is inside an ASM disk group. So OHASD will start CRSD.

      So below are the 5 important files it accesses:

      FILE 1: OLR (Oracle Local Registry) -> OHASD process
      FILE 2: GPnP profile (Grid Plug and Play) -> GPNPD process
      FILE 3: Voting disk -> CSSD process
      FILE 4: ASM spfile -> OHASD process
      FILE 5: OCR (Oracle Cluster Registry) -> CRSD process
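      The lower-stack resources that take part in this startup sequence can be viewed with the -init flag, for example:
      crsctl stat res -init -t     -- lists ohasd-managed resources such as ora.cssd, ora.ctssd, ora.asm, ora.evmd and ora.crsd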

      In RAC, where do we define the SCAN?

      We can define the SCAN with either of the below 2 options:

      Using corporate DNS
      Using Oracle GNS( Grid naming service)
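      To check how the SCAN is currently defined and how it resolves (the SCAN name is an assumption):
      srvctl config scan                 -- shows the SCAN name and its IP addresses
      srvctl config scan_listener        -- shows the SCAN listeners and their ports
      nslookup myrac-scan.example.com    -- the DNS/GNS should return the SCAN IPs in round-robin order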

      What is rebootless node fencing?

      Prior to 11.2.0.2, if failures happened with RAC components like the private interconnect or voting disk accessibility, then to avoid split brain, Oracle Clusterware performed a fast reboot of the node. The problem with that node reboot was that any non-cluster-related processes running on the node were also aborted. Also, with a reboot, the resources need to be remastered, which can be expensive.

      In addition, if there is a temporary issue or blockage in the I/O, the clusterware could misjudge the situation and initiate a reboot.

      To avoid this, from 11.2.0.2 onwards this method has been improved and is known as rebootless node fencing:

      First the clusterware decides which node is to be evicted.
      Then the I/O-generating processes are killed on the problematic node.
      Clusterware resources are stopped on the problematic node.
      The OHASD process keeps running and tries continuously to start CRS, until the issue is resolved.


      But if, due to any issue, it is unable to stop the processes on the problematic node (i.e. rebootless fencing fails), then a fast reboot is initiated by cssd.

      Explain the steps for node addition in Oracle RAC.

      Run gridsetup.sh from any of the existing nodes, select the add node option and then proceed with the rest of the steps.
      Then extend the oracle_home to the new node using the addnode.sh script from an existing node (see the sketch below the reference link).
      Finally, run dbca from an existing node and add the new instance.

      https://dbaclass.com/article/how-to-add-a-node-in-oracle-rac-19c/
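      A hedged sketch of the addnode.sh step for extending the database home from an existing node (the node name rac3 is an assumption; the Grid home is extended in a similar way via its own add-node procedure):
      $ORACLE_HOME/addnode.sh -silent "CLUSTER_NEW_NODES={rac3}"     -- extends the database home to the new node
      (root.sh is then run on the new node when prompted, and dbca adds the new instance.)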

      Explain the steps for node deletion.

      Delete the instance using dbca.
      Deinstall the ORACLE_HOME using $ORACLE_HOME/deinstall.
      Run gridsetup.sh and select the delete node option.

      https://dbaclass.com/article/how-to-delete-a-node-from-oracle-rac-19c/

      The ASM spfile location is missing inside the GPnP profile. How will the ASM instance start up?

      For this, we need to understand the search order for the ASM spfile:

      First it checks for the ASM spfile location inside the GPnP profile.
      If no entry is found inside the GPnP profile, then it checks the default path $ORACLE_HOME/dbs/spfile+ASM.ora, or a pfile.
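      Once ASM is reachable, the spfile location recorded in the GPnP profile can be confirmed with asmcmd:
      asmcmd spget     -- prints the ASM spfile path registered in the GPnP profile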

      How do you troubleshoot if a cluster node gets rebooted?

      http://www.dbaref.com/troubleshooting-rac-issues/howtotroubleshootgridinfrastructurestartupissues

      In a 12c two-node RAC, what will happen if I unplug the network cable for the private interconnect?

      Rebootless node fencing will happen, i.e. on the node which is going to be evicted, all cluster services will be brought down, and the services will be moved to the surviving node. CRS will keep attempting to restart until the private interconnect issue is fixed. Please note – the node will not be rebooted; only the cluster services will go down.

      However, prior to 11.2, in this situation a node reboot would occur.

      Suppose someone has changed the permissions of files inside the GRID_HOME. How will you fix it?

      You can run the rootcrs.sh -init command to revert the permissions.

      # cd <GRID_HOME>/crs/install/
      # ./rootcrs.sh -init

      Alternatively, you can check the below files under $GRID_HOME/crs/utl/<hostname>/:

      – crsconfig_dirs, which has all directories listed in <GRID_HOME> and their permissions

      – crsconfig_fileperms, which has the list of files, their permissions and their locations in <GRID_HOME>.

      CSSD is not coming up? What will you check and where will you check?

      1. The voting disk is not accessible.
      2. There is an issue with the private interconnect.
      3. The auto_start parameter is set to NEVER in the ora.cssd resource (to fix the issue, change it to ALWAYS using crsctl modify resource).

      Oracle RAC Interview Questions>>>https://dbaclass.com/article/oracle-rac-interview-questions/

      Related doc:

      Frequently Asked Questions (RAC FAQ) (Doc ID 220970.1)
      https://oracle-patches.com/en/databases/oracle/oracle-rac-components
      http://oracle-help.com/oracle-rac/rac-11gr2-clusterware-startup-sequence/
