ORACLE RAC Basics

Oracle Clusterware (Cluster Ready Services in 10g / Cluster Manager in 9i) – provides the infrastructure that binds multiple nodes together so that they operate as a single server. Clusterware monitors all components, such as instances and listeners. There are two important components in Oracle Clusterware: the Voting Disk and the OCR (Oracle Cluster Registry).

Voting Disk – is a file that resides on shared storage and manages cluster membership. The voting disk is used to arbitrate cluster ownership between the nodes in case of failure.

OCR (Oracle Cluster Registry) – resides on shared storage and maintains information about the cluster configuration and the cluster database, for example which database instances run on which nodes and which services run on which database.

CRS Resource – anything that Oracle Clusterware manages is classified as a CRS resource, such as a database, instance, service, listener, VIP address, and so on.

Cluster-Aware Storage – is a storage solution for Oracle RAC, such as raw devices, OCFS, or ASM.

Interconnect – is the private network that connects all the servers in the cluster. The interconnect uses a switch that only the cluster nodes can access. Instances in the cluster communicate with each other over the interconnect.

Cache Fusion – is the diskless cache-coherency mechanism in Oracle RAC that ships copies of data blocks directly from the memory cache of the instance that holds the block to the instance that requests it. Cache Fusion effectively provides a single buffer cache for all instances in the cluster, built over the interconnect.

In a single-node Oracle database, an instance looking for a data block first checks its buffer cache; if the block is not in the cache, it reads the block from disk into the cache and returns it to the client.

In a RAC database there are also remote caches, so an instance must look not only in its local cache (the cache local to the instance) but also in the remote caches (the caches of the other instances). If the block is in the local cache, it is returned from there; if it is not, then instead of going to disk the instance first checks, via the interconnect, whether the block is available in a remote instance's cache.

This is because accessing a data block from a remote cache is faster than reading it from disk.


Cache Fusion Model
The Cache Fusion model depends on three services:
– Global Resource Directory (GRD)
– Global Cache Service (GCS)
– Global Enqueue Service (GES)
SSH User Equivalency – means configuring the operating system user that installs and owns the RAC software with the same properties (username, user ID, group, group ID, and password) across all nodes in the cluster, so that the user can run commands on any node without being prompted for a password.

CVU (Cluster Verification Utility) – is a utility that verifies the system meets all the criteria for Oracle Clusterware installation.

 

Storage Options for RAC

  • CFS (Cluster File System) – Easy to manage but only available on some platforms.  Does not address striping and mirroring.
  • RAW – Available on all platforms but difficult to manage. Does not address striping and mirroring.
  • NFS – Easy to manage but only available on some platforms.  Does not address striping and mirroring.
  • ASM (Automatic Storage Management) – Easy to manage, available on ALL platforms, and DOES address striping and mirroring.

CFS (Cluster Filesystems)

  • The idea of a CFS is to share filesystems between the nodes.
  • Easy to manage since you are dealing with regular files.
  • CFS is configured on shared storage.  Each node must have access to the storage in which the CFS is mounted.
  • NOT available on all platforms.  Supported CFS solutions currently:
  • OCFS on Linux and Windows (Oracle)
  • DBE/AC CFS (Veritas)
  • GPFS (AIX)
  • Tru64 CFS (HP Tru64)
  • Solaris QFS

RAW (character devices)

  • Hard to manage since you are dealing with character devices and not regular files.
  • Adding and resizing datafiles is not trivial.
  • On some operating systems, volume groups need to be deactivated before logical volumes (LVs) can be manipulated or added.

NFS (Network Filesystem)

  • NOT available on all platforms.  Supported NFS solutions currently:
  • Network Appliance
  • Redhat Linux
  • Fujitsu Primecluster
  • Solaris Suncluster

ASM

  • Stripes files rather than logical volumes.
  • Enables online disk reconfiguration and dynamic rebalancing.
  • Provides adjustable rebalancing speed.
  • Provides redundancy on a file basis.
  • Supports only Oracle files.
  • Is cluster aware.
  • Is automatically installed as part of the base code set.

Source : Internet


Real Applications Cluster (RAC) FAQs

Some of the good Real Applications Cluster (RAC) interview questions:

Q 1.  How to setup SCAN in 11gr2 and how it works?

SCAN is a single name defined in DNS that resolves to three IP addresses using a round-robin algorithm. The three IP addresses are on the same subnet as RAC’s default public network in the cluster.

How to setup SCAN:

To install Grid Infrastructure successfully, you need to configure DNS before the installation so that the SCAN name resolves correctly. Oracle requires at least one IP address (three are recommended) to be configured for the SCAN name.

You have two options to define the SCAN name:

1) Define it in your DNS (Domain Name Service)
2) Use GNS (Grid Naming Service)

How SCAN works:

1) The client sends a connection request to the SCAN name; the SCAN name is resolved to a SCAN IP address returned by DNS. The SCAN IPs are returned in round-robin fashion. [If GNS is used for SCAN IP management, DNS delegates the request to the GNS service, and GNS in turn returns a SCAN IP address, again in round-robin fashion.]

2) The TNS request is then forwarded by the SCAN IP to the SCAN listeners. Remember that the remote_listener parameter already points to the SCAN tnsnames.ora entry, while local_listener uses the VIP listener entry.

3) The SCAN listeners in turn forward the request to the least loaded local listener. The remote_listener, which points to the SCAN listeners, handles the load balancing, while the local listeners take care of spawning new server processes and connecting them to the database.

4) The local listener then services the client request.

PMON registers with the SCAN listeners as defined by the remote_listener parameter, and also with the node listener according to the local_listener setting. Based on the load information provided by PMON, the SCAN listener chooses the least loaded node to forward the client request to.
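
For illustration, a client tnsnames.ora entry pointing at the SCAN might look like the sketch below; the SCAN host name, port, and service name here are placeholders, not values from this article.

 TESTDB =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.example.com)(PORT = 1521))
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = testdb.example.com)
     )
   )

The same connection can also be written as an Easy Connect string: rac-scan.example.com:1521/testdb.example.com.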


Q 2.  What are benefits of SCAN?

A) NO NEED TO RECONFIGURE CLIENT: the SCAN makes it possible to add or remove nodes from the cluster without needing to reconfigure clients (Because the SCAN is associated with the cluster as a whole, rather than to a particular node). Before SCAN, if the cluster changed, then the client TNSNAMES.ORA files (or other tns connect strings like Easy Connect strings) would need to change.

B) LOCATION INDEPENDENCE: SCAN allows clients to connect to any node in the cluster. This provides location independence for the databases, so that client configuration does not depend on which nodes are running a particular database.

C) LOAD BALANCING:  Round-robin on DNS level allows for a connection request load balancing across SCAN listeners floating in the cluster.

New features of SCAN in database 12c:

1. SCAN and Oracle Clusterware managed VIPs now support IPv6 based IP addresses

2. SCAN is by default restricted to only accept service registration from nodes in the cluster

3. SCAN supports multiple subnets in the cluster (one SCAN per subnet)


Q 3. Describe basic steps to do RAC to NON-RAC cloning

1) Create an initialization parameter file from the current database.
2) Remove the instance parameters concerning the cluster, ASM, and database name (cluster_database, cluster_database_instances, instance_number, thread, etc.).
3) Start up the target database in NOMOUNT mode.
4) Back up the source RAC database and copy the backup pieces to the target server.
5) Duplicate the database with the RMAN "duplicate database" command.

 $rman auxiliary /
 RMAN> duplicate database to "TEST" backup location '/BACKUPS/TEST' nofilenamecheck;

6) Post-clone steps: remove thread 2, remove undo tablespace 2, change passwords, etc.
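
As a minimal sketch of those post-clone steps, assuming the source was a two-node RAC, the second undo tablespace is named UNDOTBS2, and group 3 belongs to thread 2 (names and group numbers are assumptions):

 SQL> ALTER DATABASE DISABLE THREAD 2;
 SQL> ALTER DATABASE DROP LOGFILE GROUP 3;   -- repeat for every redo log group that belonged to thread 2
 SQL> DROP TABLESPACE UNDOTBS2 INCLUDING CONTENTS AND DATAFILES;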


Q 4.  Describe basic steps to do NON-RAC to RAC cloning/conversion

Need to migrate from a version older than 10g? First upgrade your existing single-instance database, test the upgraded database, and then migrate to RAC.

Assuming you already have the servers ready, with the OS user and groups already set up:

1) Install and configure Grid Infrastructure (Clusterware+ ASM)
2) Add storage to Automatic Storage Management (ASM)
3) Install Oracle Database Software
4) Duplicate single instance Non-ASM database to ASM using RMAN

> Backup source NON-RAC database and copy the backup pieces to target RAC NODE 1.
> Create password file and init.ora file for RAC NODE 1 (don’t add cluster parameters yet)
> Start the auxiliary database in NOMOUNT mode on RAC NODE1
> Duplicate the database to RAC server

 $rman auxiliary /
 RMAN> duplicate database to "TEST" backup location '/BACKUPS/TEST' nofilenamecheck;

5) Manually convert the single-instance database to RAC.

> Create redo thread 2 and enable it (see the sketch at the end of this answer).
> Add undo tablespace 2 for the second instance.
> Add the cluster-related parameters now:

*.cluster_database_instances=2
*.cluster_database=true
*.remote_listener='LISTENERS_ORADB'
ORADB1.instance_number=1
ORADB2.instance_number=2
ORADB1.thread=1
ORADB2.thread=2
ORADB1.undo_tablespace='UNDOTBS1'
ORADB2.undo_tablespace='UNDOTBS2'

> Copy the updated init.ora file to RAC NODE2 and rename the files as per the instance names.
> Update the environment and start the database on both nodes.
> Register the RAC instances with CRS:

$ srvctl add database -d ORADB -o /home/oracle/product/v10204
$ srvctl add instance -d ORADB -i ORADB1 -n orarac1
$ srvctl add instance -d ORADB -i ORADB2 -n orarac2

> Create the spfile on ASM shared storage.
> Run the Cluster Verification Utility.
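
The sketch below shows one way to create and enable the second redo thread and undo tablespace mentioned in step 5; the disk group name, file sizes, and group numbers are assumptions, not values from this article.

 SQL> ALTER DATABASE ADD LOGFILE THREAD 2
      GROUP 3 ('+DATA') SIZE 100M,
      GROUP 4 ('+DATA') SIZE 100M;
 SQL> ALTER DATABASE ENABLE PUBLIC THREAD 2;
 SQL> CREATE UNDO TABLESPACE UNDOTBS2 DATAFILE '+DATA' SIZE 500M AUTOEXTEND ON;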


Q 5.  How do you start/stop RAC services ?

Check out this post


Q 6.  What is node eviction?  What causes Node eviction?

NODE EVICTION:

The Oracle Clusterware is designed to perform a node eviction by removing one or more nodes from the
cluster if some critical problem is detected. A critical problem could be a node not responding via a
network heartbeat, a node not responding via a disk heartbeat, a hung or severely degraded machine,
or a hung ocssd.bin process. The purpose of this node eviction is to maintain the overall health of the
cluster by removing bad members.

Starting in 11.2.0.2 RAC (or if you are on Exadata), a node eviction may not actually reboot the machine.
This is called a rebootless restart. In this case we restart most of the clusterware stack to see if that fixes the
unhealthy node.

CAUSES OF NODE EVICTION:

a) Network failure or latency between the nodes. By default it takes 30 consecutive missed check-ins (determined by the CSS misscount) to cause a node eviction; the current values can be checked with crsctl, as shown after this list.
b) Problems writing to or reading from the CSS voting disk. If the node cannot perform a disk heartbeat to the majority of its voting files, then the node will be evicted.

c) A member kill escalation. For example, database LMON process may request CSS to remove an instance from the cluster via the instance eviction mechanism. If this times out it could escalate to a node kill.

d) An unexpected failure or hang of the OCSSD process, this can be caused by any of the above issues or something else.

e) An Oracle bug.

f) High load on the database server: in my experience, high load is the cause in roughly 70 to 80 percent of node evictions. A common scenario is that, under heavy load, the RAM and swap space of the database node are exhausted, the system stops responding, and the node finally reboots.

g) Database or ASM instance hang: sometimes a hung database or ASM instance can cause a node reboot. In these cases the instance hangs and is terminated afterwards, which causes either a cluster reboot or a node eviction.
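
For reference, the CSS misscount and disk timeout values mentioned above can be checked with crsctl; a quick sketch (run from the Grid Infrastructure home):

 $ crsctl get css misscount
 $ crsctl get css disktimeout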


Q 7. What happens to data block/resources of the evicted node?

Those are recreated from PI (Past Image) by the surviving instances.

Below is how it works:

– At Node eviction, instance Failure is detected by Cluster Manager and GCS
– Reconfiguration of GES resources (enqueues); global resource directory is frozen
– Reconfiguration of GCS resources; involves redistribution among surviving instances
– One of the surviving instances becomes the “recovering instance”
– SMON process of recovering instance starts first pass of redo log read of the failed instance’s redo log thread
– SMON finds BWR (block written records) in the redo and removes them as their PI is already written to disk
– SMON prepares recovery set of the blocks modified by the failed instance but not written to disk
– Entries in the recovery list are sorted by first dirty SCN
– SMON informs each block’s master node to take ownership of the block for recovery
– Second pass of log read begins.
– Redo is applied to the data files.
– Global Resource Directory is unfrozen


Q 8. How do you troubleshoot Node Eviction?

Look for error messages in below log files:

Clusterware alert log in <GRID_HOME>/log/<node_name>/

The cssdagent log(s) in <GRID_HOME>/log/<node_name>/agent/ohasd/oracssdagent_root
The cssdmonitor log(s) in <GRID_HOME>/log/<node_name>/agent/ohasd/oracssdmonitor_root
The ocssd log(s) in <GRID_HOME>/log/<node_name>/cssd
The lastgasp log(s) in /etc/oracle/lastgasp or /var/opt/oracle/lastgasp
IPD/OS or OS Watcher data

‘opatch lsinventory -detail’ output for the GRID home

* Messages file locations:
Linux: /var/log/messages
Sun: /var/adm/messages
HP-UX: /var/adm/syslog/syslog.log
IBM: /bin/errpt -a > messages.out


Q 9. What is split brain situation in RAC?

A ‘split brain’ situation occurs when the instances in a RAC cluster fail to ping/connect to each other via the private interconnect, while the servers are all physically up and running and the database instance on each of them is also running. These individual nodes are running fine and could conceptually accept user connections and work independently. Because of the lack of communication, each instance thinks that the other instance it cannot reach is down, and it needs to do something about the situation. The problem is that if these instances are left running, the same block might be read and updated in each of them, causing data integrity issues: blocks changed in one instance would not be locked and could be overwritten by another instance. Oracle has implemented an efficient check for the split brain syndrome.

In such a scenario, if any node becomes inactive, or if other nodes are unable to ping/connect to a node in the RAC cluster, then the node that first detects that one of the nodes is not accessible will evict that node from the RAC group. For example, if there are 4 nodes in a RAC cluster and node 3 becomes unavailable, and node 1 tries to connect to node 3 and finds it not responding, then node 1 will evict node 3 out of the RAC group, leaving only node 1, node 2, and node 4 in the group to continue functioning.


 Q 10. What is voting disk?

The voting disk records node membership information. The CSSD process on every node makes entries in the voting disk to ascertain the membership of that node.
While marking their own presence, all the nodes also register information in the voting disk about their ability to communicate with the other nodes. This is called the network heartbeat.
Healthy nodes have continuous network and disk heartbeats exchanged between them. A break in the heartbeat indicates a possible error scenario. If a node's disk block is not updated within a short timeout period, that node is considered unhealthy and may be rebooted to protect the database information.

During a reconfiguration (a node joining or leaving), CSSD monitors all nodes and determines whether a node has a disk heartbeat, including nodes with no network heartbeat. If no disk heartbeat is detected, the node is declared dead.

So, Voting disks contain static and dynamic data.

Static data : Info about nodes in the cluster
Dynamic data : Disk heartbeat logging

It maintains important details about cluster node membership, such as:

– which node is part of the cluster,
– who (node) is joining the cluster, and
– who (node) is leaving the cluster.
The voting disk is not striped but is placed as a whole on an ASM disk.

With external redundancy, one copy of the voting file is stored on one disk in the diskgroup. If we store the voting disk on a diskgroup with normal redundancy, we should be able to tolerate the loss of one disk, i.e., even if we lose one disk we should still have a sufficient number of voting disks for the clusterware to continue. If the diskgroup has 2 disks (the minimum required for normal redundancy), we can store 2 copies of the voting disk on it.

If we lose one disk, only one copy of the voting disk is left and the clusterware cannot continue, because to continue it must be able to access more than half of the voting disks, i.e., more than 2 × 1/2 = 1, which means at least 2.
Hence, to be able to tolerate the loss of one disk, we should have 3 copies of the voting disk in a normal-redundancy diskgroup. So a normal-redundancy diskgroup holding the voting disk should have a minimum of 3 disks in it.


Q 11. How do you back up voting disk?

In previous versions of Oracle Clusterware you needed to back up the voting disks with the dd command. Starting with Oracle Clusterware 11g Release 2 you no longer need to back up the voting disks: they are automatically backed up as part of the OCR.

In older releases you may need to manually back up the voting disk file every time

– you add or remove a node from the cluster or
– immediately after you configure or upgrade a cluster.
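
For pre-11.2 clusters, a minimal sketch of the dd-based backup; the raw device path and destination file below are placeholders:

 $ crsctl query css votedisk                       # list the voting disk paths first
 # dd if=/dev/raw/raw1 of=/backup/votedisk_raw1.bak bs=1M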


Q 12. What are various key RAC background processs and what they do?

Oracle RAC instances are composed of following background processes:

ACMS — Atomic Control file to Memory Service (ACMS)
GTX0-j — Global Transaction Process
LMON — Global Enqueue Service Monitor
LMD — Global Enqueue Service Daemon
LMS — Global Cache Service Process
LCK0 — Instance Enqueue Process
DIAG — Diagnosability Daemon
RMSn — Oracle RAC Management Processes (RMSn)
RSMN — Remote Slave Monitor
DBRM — Database Resource Manager (from 11g R2)
PING — Response Time Agent (from 11g R2)


Q 13. 


Q 14.  What are the RAC specific wait events?

The special use of a global buffer cache in RAC makes it imperative to monitor inter-instance communication via the cluster-specific wait events such as gc cr request and gc buffer busy.

a) gc cr request wait event

The gc cr request wait event measures the time it takes to retrieve a data block from a remote cache. High wait times for this event are often caused by slow interconnect traffic or by inefficient queries. Poorly tuned queries increase the number of data blocks requested by an Oracle session; the more blocks requested, the more often a block must be read from a remote instance via the interconnect.

b) gc buffer busy acquire and gc buffer busy release wait event

In Oracle 11g you will see the gc buffer busy acquire wait event when the global cache open request originated from the local instance, and gc buffer busy release when the open request originated from a remote instance. These wait events are very similar to the buffer busy wait events in a single-instance database and are often the result of hot blocks or inefficient queries.

c) GC Current Block 2-Way/3-Way

For a current block requested in read mode, a KJUSERPR (protected read) lock is requested. Excessive waits for gc current block are related either to inefficient query execution plans leading to numerous block visits, or to application affinity not being in play.

d) GC CR Block Congested/GC Current Block Congested

If the LMS process does not process a request within 1 ms, LMS marks the response to that block with a congestion wait event. Root cause: LMS is suffering from CPU scheduling delays or from a shortage of resources such as memory (paging).


Q 15. How can you measure the RAC interconnect bandwidth?

From OS tools – MeasureWare, MRTG, etc.
From the database – from the Statspack/AWR report, and from the average time to receive a data block, derived from v$sysstat.
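
As a rough sketch of the v$sysstat approach, the query below estimates the average time to receive a consistent-read block over the interconnect; the statistic names are those used in recent releases, and the time statistic is assumed to be in centiseconds (hence the * 10 to convert to milliseconds).

 SQL> SELECT b.value AS cr_blocks_received,
             t.value AS cr_receive_time_cs,
             ROUND(t.value * 10 / NULLIF(b.value, 0), 2) AS avg_receive_ms
      FROM   v$sysstat b, v$sysstat t
      WHERE  b.name = 'gc cr blocks received'
      AND    t.name = 'gc cr block receive time';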


Q 16.  How is SCAN listener different from VIP listener in RAC? How many scan listeners are required for a 10 node RAC? Explain in detail.

The SCAN name came into the picture with Oracle RAC 11gR2. The SCAN uses IP addresses that are not assigned to any fixed interface; Clusterware is in charge of them and directs requests to the appropriate servers in the cluster. The main purpose of SCAN is ease of management and connection: for instance, you can add new nodes to the cluster without changing your clients' tnsnames.ora, because Oracle automatically distributes requests based on the SCAN IPs, which point to the underlying VIPs. SCAN listeners act as the bridge between clients and the underlying local listeners, which are VIP-dependent.

VIPs are the IP addresses that CRS maintains for each node. A VIP is a virtual IP for a specific server node in the cluster. Should that server node fail, its VIP is transferred to another server node in order to still provide network connectivity for clients using the address of the failed node. In other words, the VIP provides high availability: despite a server node failing, network communication to its address is still supported by another node via the failed node's VIP.

On a 10-node cluster there are 10 virtual IP addresses with 10 virtual hostnames, which means clients would need to know and use all 10 VIPs in order to make load-balanced, high-availability, or TAF connections. SCAN replaces this on the client side by providing the client with a Single Client Access Name to use instead of 10 VIPs. Oracle recommends 3 SCAN IPs/listeners to serve a multi-node RAC system. You can have a 10-node system, but you still need only 3 SCAN listeners; they can run on any node of the cluster, and having three is plenty.
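
A quick way to inspect an existing cluster's SCAN setup from the Grid home (a sketch; the output of course depends on your environment):

 $ srvctl config scan
 $ srvctl config scan_listener
 $ srvctl status scan_listener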


Q 17.  Difference between RAC and Non-RAC database. Difference in terms of SGA/shared pool?


Q 18.  What is cache fusion in RAC?


Q 19. In RAC How do you do patching?


Q 20. How do you check for conflicts with database patching?


Q 21. What is OCR File?

OCR contains information about all Oracle resources in the cluster.

The purpose of the Oracle Cluster Registry (OCR) is to hold cluster and database configuration information for RAC and Cluster Ready Services (CRS), such as the cluster node list, the cluster database instance-to-node mapping, and CRS application resource profiles.

Oracle recommends that you configure:

At least three OCR locations, if OCR is configured on non-mirrored or non-redundant storage. Oracle strongly recommends that you mirror OCR if the underlying storage is not RAID. Mirroring can help prevent OCR from becoming a single point of failure.

At least two OCR locations if OCR is configured on an Oracle ASM disk group. You should configure OCR in two independent disk groups. Typically this is the work area and the recovery area.


Q 22. What is OLR?

The Oracle Local Registry (OLR) is a registry similar to the OCR, located on each node in a cluster, but it contains information specific to each node.
It contains manageability information about Oracle Clusterware, including dependencies between various services. Oracle High Availability Services uses this information. The OLR is located on local storage on each node in a cluster.
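
To inspect the OLR location and integrity, a brief sketch (run as root from the Grid home):

 # ocrcheck -local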


Q 23. How will you take ocr Backup?

There are two methods of copying OCR content and using the content for recovery. The first method uses automatically generated physical OCR file copies and the second method uses manually created logical OCR export files.

Because of the importance of the OCR information, the ocrconfig tool should be used to make daily copies of the automatically generated backup files.
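
A brief sketch of the commonly used ocrconfig commands (the export path is a placeholder; run as root):

 # ocrconfig -showbackup                          # list automatic backups
 # ocrconfig -manualbackup                        # take an on-demand physical backup
 # ocrconfig -export /backup/ocr_export.dmp       # logical export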

RAC Startup and Shutdown Steps


The RAC database running on my laptop has two nodes RAC1 and RAC2.
The instance names are brij1 and brij2.

The binaries used to start and stop the stack (srvctl, crsctl) are located under the $ORA_CRS_HOME/bin directory.

SHUTDOWN PROCESS
@@@@@@@@@@@

First take down your database

1) STOP DATABASE ON ALL NODES
==============================
$ ./srvctl stop database -d brij

OR

1) STOP ALL INSTANCES
=======================
./srvctl stop instance -d brij -i brij1
./srvctl stop instance -d brij -i brij2

Next come the ASM instances.

2) Shut down all ASM instances on all nodes.
=========================================
./srvctl stop asm -n rac1
./srvctl stop asm -n rac2

Then shut down all node applications.

3) Stop all node applications on all nodes.
========================================
$ ./srvctl stop nodeapps -n rac1
$ ./srvctl stop nodeapps -n rac2

Verification >>>> Run ./crs_stat -t  and it should show ‘target’ and ‘state’ for  all components as “OFFLINE”

And the last step is …

4) Shut down the Oracle Clusterware or CRS process by entering the following command on all nodes as the root user
=======================================================
/etc/init.d/init.crs stop

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

STARTUP PROCESS
@@@@@@@@@@

1) Start up the Oracle Clusterware or CRS process by entering the following command on all nodes as the root user
=======================================================
/etc/init.d/init.crs start

2) Start all node applications on all nodes
===========================================
./srvctl start nodeapps -n rac1
./srvctl start nodeapps -n rac2

3) Start up all ASM instances on all nodes.
=========================================
./srvctl start asm -n rac1
./srvctl start asm -n rac2

4)  START ALL INSTANCES
=======================
./srvctl start instance -d brij -i brij1
./srvctl start instance -d brij -i brij2

OR

4) START DATABASE ON ALL NODES
==============================

$ ./srvctl start database -d brij

Verification >>>> Run ./crs_stat -t  and it should show ‘target’ and ‘state’ for  all components as “ONLINE”

11gR2 RAC – Start/Stop RAC Node(s)

TIP: Some points about setting up RMAN on RAC Environment

In this post I will share some tips for setting up RMAN in a RAC environment.

I will not cover RMAN topics that are configured the same way in a standalone environment (e.g. incremental backup, use of the FRA, etc.).

First question: Is there a difference of setting up RMAN between  standalone and RAC environments?

The answer is YES; not too much changes, but some points must be observed.

RMAN Catalog

First of all, in my point of view using an RMAN catalog is mandatory, because RAC is an HA environment and restoring a database without an RMAN catalog can take a long time.

To protect and keep backup metadata for longer retention times than can be accommodated by the control file, you can create a recovery catalog. You should create the recovery catalog schema in a dedicated standalone database. Do not locate the recovery catalog with other production data. If you use Oracle Enterprise Manager, you can create the recovery catalog schema in the Oracle Enterprise Manager repository database.
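
As a minimal sketch of setting up the catalog schema in that dedicated database and registering a target (user, password, tablespace, and TNS alias below are placeholders):

 SQL> CREATE USER rcat_owner IDENTIFIED BY rcat_pwd
      DEFAULT TABLESPACE rcat_ts QUOTA UNLIMITED ON rcat_ts;
 SQL> GRANT recovery_catalog_owner TO rcat_owner;

 $ rman catalog rcat_owner/rcat_pwd@rcatdb
 RMAN> CREATE CATALOG;

 $ rman target / catalog rcat_owner/rcat_pwd@rcatdb
 RMAN> REGISTER DATABASE;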

About HA of RMAN Catalog?

I always recommend placing the host that holds the RMAN catalog on a virtual machine, because it is a machine that requires few resources and little disk space, and it has low activity.

In case of failure of the RMAN catalog host, it is easy to move that host to another physical host or to recover the whole virtual machine.

If the option of using a VM is not available, use another cluster environment (e.g. a test cluster) if available.

The database hosting the RMAN catalog must be in ARCHIVELOG mode. Why?

It is a production environment (that alone is reason enough), it generates a very small amount of archivelogs, and in case of corruption or user errors (e.g. a user generated a new incarnation of the production database during a test validation of the backup in a test environment) a point-in-time recovery may be necessary.

Because it is a small database with low activity, I see some customers not giving importance to this database. That should not happen.

High availability of backup execution using RMAN:

We have some challenges:

  1. The backup must not affect the availability of the cluster, but it must be executed daily.
  2. The backup cannot depend on specific nodes (i.e. it must be able to run regardless of which nodes are active).
  3. Where should the backup scripts be stored? Where should the logs be stored?

I don't recommend using any of the cluster nodes to start or store the backup scripts, because if that node fails the backup will not be executed.
Use the host where the RMAN catalog is stored to hold your backup scripts as well, and start the scripts from that host; the RMAN utility works as a client only, and the backup is always performed on the server side.

Doing this, you centralize all backup scripts and logs for your environment, which eases the management of backups.

Configuring the RMAN Snapshot Control File Location in a RAC 11.2

RMAN creates a copy of the control file for read consistency; this is the snapshot controlfile. Due to the changes made to the controlfile backup mechanism in 11gR2, any instance in the cluster may write to the snapshot controlfile. Therefore, the snapshot controlfile needs to be visible to all instances.
The same happens when a backup of the controlfile is created directly from SQL*Plus: any instance in the cluster may write to the backup controlfile.
From 11gR2 onwards, the controlfile backup happens without holding the controlfile enqueue. For a non-RAC database this changes nothing.
But for a RAC database, the snapshot controlfile location must be on shared storage that is accessible from all the nodes.
The snapshot controlfile MUST be accessible by all nodes of a RAC database.

See how to do that:

https://forums.oracle.com/forums/thread.jspa?messageID=9997615
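
For reference, a minimal sketch of the RMAN commands involved; the +FRA disk group and file name are assumptions, not values from the linked thread.

 RMAN> SHOW SNAPSHOT CONTROLFILE NAME;
 RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO '+FRA/snapcf_mydb.f';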

Since version 11.1 : Node Affinity Awareness of Fast Connections

In some cluster database configurations, some nodes of the cluster have faster access to certain data files than to other data files. RMAN automatically detects this, which is known as node affinity awareness. When deciding which channel to use to back up a particular data file, RMAN gives preference to the nodes with faster access to the data files that you want to back up. For example, if you have a three-node cluster, and if node 1 has faster read/write access to data files 7, 8, and 9 than the other nodes, then node 1 has greater node affinity to those files than nodes 2 and 3.

Channel Connections to Cluster Instances with RMAN

Channel connections to the instances are determined using the connect string defined by channel configurations. For example, in the following configuration, three channels are allocated using dbauser/pwd@service_name. If you configure the SQL Net service name with load balancing turned on, then the channels are allocated at a node as decided by the load balancing algorithm.

However, if the service name used in the connect string is not load balanced, then you can control at which instance the channels are allocated by using separate connect strings for each channel configuration. In that case, your backup scripts will fail if that node/instance is down.
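
A hedged sketch of pinning channels to specific instances with per-channel connect strings (the credentials and service names below are placeholders):

 RMAN> CONFIGURE CHANNEL 1 DEVICE TYPE DISK CONNECT 'rman_user/pwd@node1_service';
 RMAN> CONFIGURE CHANNEL 2 DEVICE TYPE DISK CONNECT 'rman_user/pwd@node2_service';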

So, my recommendation in an admin-managed database environment is to define a set of nodes to perform the backup.

E.g.: if you have 3 nodes, you should use one or two nodes to perform the backup, so that the remaining node stays less loaded. If you are using load balancing in your connection, new connections will be directed to the least loaded node.

Autolocation for Backup and Restore Commands

RMAN automatically performs autolocation of all files that it must back up or restore. If you use the noncluster file system local archiving scheme, then a node can only read the archived redo logs that were generated by an instance on that node. RMAN never attempts to back up archived redo logs on a channel it cannot read.

During a restore operation, RMAN automatically performs the autolocation of backups. A channel connected to a specific node only attempts to restore files that were backed up to the node. For example, assume that log sequence 1001 is backed up to the drive attached to node1, while log 1002 is backed up to the drive attached to node2. If you then allocate channels that connect to each node, then the channel connected to node1 can restore log 1001 (but not 1002), and the channel connected to node2 can restore log 1002 (but not 1001).

Configuring Channels to Use Automatic Load Balancing

To configure channels to use automatic load balancing, use the following syntax:

CONFIGURE DEVICE TYPE [disk | sbt] PARALLELISM number_of_channels;

Where number_of_channels is the number of channels that you want to use for the operation. After you complete this one-time configuration, you can issue BACKUP or RESTORE commands.
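
For example, a concrete invocation might look like the line below (the channel count of 2 is an assumption):

 RMAN> CONFIGURE DEVICE TYPE DISK PARALLELISM 2 BACKUP TYPE TO BACKUPSET;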

Setting up parallelism in RMAN alone is not enough to keep the load balanced: if you start the backup from a remote host using the default SERVICE_NAME and you are using parallelism, RMAN can start a session on each node and the backup will be performed by all nodes at the same time. That is not a problem in itself, but it can cause a performance issue in your environment due to the high load.

Even at night the backup can cause performance problems, due to database maintenance tasks (statistics gathering, verification of new SQL plans by the automatic SQL tuning task, etc.).

The bottleneck is usually the LAN or the SAN, so using all nodes to perform the backup can be a waste. If the backup runs over the LAN you can gain by using more than one node, but the server receiving the backup data will become the bottleneck.

I really don't like to use more than 50% of the RAC nodes to execute the backup, because it increases the workload on all nodes of the cluster and this can be a problem for the application or the database.

So, to prevent this, we can configure a database service to control where the backup will be performed.

Creating a Database Service to perform Backup

Before starting, I should explain some limitations of database services.

Some points about Oracle Services.

When a user or application connects to a database, Oracle recommends that you use a service for the connection. Oracle Database automatically creates one database service (default service is always the database  name)  when the database is created.   For more flexibility in the management of the workload using the database, Oracle Database enables you to create multiple services and specify which database instances offer the services.

You can define services for both policy-managed and administrator-managed databases.

  • Policy-managed database: When you define services for a policy-managed database, you assign the service to a server pool where the database is running. You can define the service as either uniform (running on all instances in the server pool) or singleton (running on only one instance in the server pool).
  • Administrator-managed database: When you define a service for an administrator-managed database, you define which instances normally support that service. These are known as the PREFERRED instances. You can also define other instances to support a service if the preferred instance fails. These are known as AVAILABLE instances.
About Service Failover in Administrator-Managed Databases

When you specify a preferred instance for a service, the service runs on that instance during normal operation. Oracle Clusterware attempts to ensure that the service always runs on all the preferred instances that have been configured for a service. If the instance fails, then the service is relocated to an available instance. You can also manually relocate the service to an available instance.

About Service Failover in Policy-Managed Databases

When you specify that a service is UNIFORM, Oracle Clusterware attempts to ensure that the service always runs on all the available instances for the specified server pool. If an instance fails, then the service is no longer available on that instance. If the cardinality of the server pool increases and an instance is added to the database, then the service is started on the new instance. You cannot manually relocate the service to a specific instance.

When you specify that a service is SINGLETON, Oracle Clusterware attempts to ensure that the service always runs on only one of the available instances for the specified server pool. If that instance fails, then the service fails over to a different instance in the server pool. You cannot specify which instance in the server pool the service should run on.

For SINGLETON services, if a service fails over to an new instance, then the service is not moved back to its original instance when that instance becomes available again.

Summarizing the use of services

If your database is administrator-managed, we can create a service and define where the backup will be executed, and how many nodes we can use, with preferred and available nodes.

If your database is policy-managed, we cannot define where the backup will be executed, but we can configure a SINGLETON service, which guarantees that the backup will be executed on only one node; if that node fails, the service is moved to another available node, but we cannot choose on which node the backup will be performed.

Note:

For connections to the target and auxiliary databases, the following rules apply:

Starting with 10gR2, these connections can use a connect string that does not bind to any particular instance. This means you can use load balancing.

Once a connection is established, however, it must be a dedicated connection  that cannot migrate to any other process or instance. This means that you still can’t use MTS or TAF.

Example creating service for Administrator-Managed Database

The backup will be executed on db11g2 and db11g3, but can be executed on db11g1 if db11g2 and db11g3 fail.

Set ORACLE_HOME to the same one used by the database.

$ srvctl add service -d db_unique_name -s service_name -r preferred_list [-a available_list] [-P TAF_policy]
$ srvctl add service -d db11g -s srv_rman -r db11g2,db11g3 -a db11g1 -P NONE -j LONG
$ srvctl start service -d db11g -s srv_rman
Example creating service for Policy-Managed Database

Using a SINGLETON service, the backup will be executed on the node on which the service was started/assigned. The service is moved to another host only if that node fails.

Set ORACLE_HOME to the same one used by the database.

$ srvctl config database -d db11g |grep "Server pools"
Server pools: racdb11gsp
$ srvctl add service -d db11g -s srv_rman -g racdb11gsp -c SINGLETON
$ srvctl start service -d db11g -s srv_rman

If you have more than 2 nodes in the cluster (with a policy-managed database) and you want to use only 2 or more nodes to perform the backup, you can choose from the options below.

Configure a UNIFORM service (the service will be available on all nodes). You can control how many instances will be used to perform the backup, but you cannot choose on which nodes the backup will be performed. In fact the service does not control anything; you simply set the RMAN PARALLELISM equal to the number of nodes you want to use (a sketch of creating such a service follows below).

Ex: I have 4 nodes but I want to start the backup on 2 nodes, so I must choose parallelism 2. Remember that Oracle can start 2 channels on the same host; this depends on the workload of each node.

Using a policy-managed database, you should be aware that you do not control where (on which node) each instance is running; you have a pool with many nodes and Oracle manages all resources inside that pool. For this reason it is not possible to control where you will place a heavier load.

This will only work if you are performing online backup or are using Parallel Backup.
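
A minimal sketch of creating such a UNIFORM service, reusing the server pool from the earlier example; the service name srv_rman_uni is hypothetical, not from the original post:

 $ srvctl add service -d db11g -s srv_rman_uni -g racdb11gsp -c UNIFORM
 $ srvctl start service -d db11g -s srv_rman_uni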

Configuring RMAN to Automatically Backup the Control File and SPFILE

If you set CONFIGURE CONTROLFILE AUTOBACKUP to ON, then RMAN automatically creates a control file and an SPFILE backup after you run the BACKUP or COPY commands. RMAN can also automatically restore an SPFILE, if this is required to start an instance to perform recovery, because the default location for the SPFILE must be available to all nodes in your Oracle RAC database.

These features are important in disaster recovery because RMAN can restore the control file even without a recovery catalog. RMAN can restore an autobackup of the control file even after the loss of both the recovery catalog and the current control file. You can change the default name that RMAN gives to this file with the CONFIGURE CONTROLFILE AUTOBACKUP FORMAT command. Note that if you specify an absolute path name in this command, then this path must exist identically on all nodes that participate in backups.

RMAN performs the control file autobackup on the first allocated channel. Therefore, when you allocate multiple channels with different parameters, especially when you allocate a channel with the CONNECT command, determine which channel will perform the control file autobackup. Always allocate the channel for this node first.
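
A minimal sketch of that configuration; the +FRA destination in the FORMAT string is an assumption:

 RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
 RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '+FRA/%F';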

Enjoy..

Explaining: How to store OCR, Voting disks and ASM SPFILE on ASM Diskgroup (RAC or RAC Extended)

In 2011 I saw many doubts and concerns about how to store the voting disks, OCR, and ASM SPFILE on an ASM diskgroup. In this post I'll show you how to set up your environment by applying best practices based on my experience.

To start, I need to explain some concepts:

The voting disks are like a "database instance" and the OCR is like a "database". During cluster startup, Oracle first reads and opens all voting disks, and only after ASM is started does it read and open the OCR.

So Oracle does not need the ASM instance to be started, or the diskgroup to be mounted, to read and open the voting disks.

Voting disks:

Voting disk, also known as voting file: a file that manages information about node membership.

How are they stored in ASM?
Voting disks are placed directly on ASM disks. Oracle Clusterware stores the voting disk on a disk within the disk group chosen to hold the voting files. Oracle Clusterware does not rely on ASM to access the voting files, which means that Oracle Clusterware does not need the diskgroup to be mounted to read and write the ASM disk.
You cannot find/list voting files using SQL*Plus (v$asm_diskgroup, v$asm_files, v$asm_alias), the ASMCMD utility, or the ASMCA GUI.
You only know that a voting file exists on an ASM disk via v$asm_disk (column VOTING_FILE). So voting files do not depend on the diskgroup to be accessed; this does not mean we don't need the diskgroup, because the diskgroup and the voting files are linked by their settings.

Oracle Clusterware takes the configuration of the diskgroup to configure its own voting files.
Because voting disks are placed directly on the ASM disks of a diskgroup, we cannot use more than one (1) diskgroup.
The redundancy of the voting files depends on the ASM disks, not on the diskgroup: if you lose one ASM disk, you lose one voting file. This differs from files managed by the diskgroup.

  • When the votedisk is on an ASM diskgroup, no crsctl add option is available. The number of votedisks is determined by the diskgroup redundancy. If more copies of the votedisk are desired, you can move the votedisk to a diskgroup with higher redundancy.
  • When the votedisk is on ASM, no delete option is available; you can only replace the existing votedisk group with another ASM diskgroup.

You cannot place voting files in different diskgroups. Using a quorum failgroup is required if you are using extended RAC or if you are using more than one storage array in your cluster.
The COMPATIBLE.ASM disk group compatibility attribute must be set to 11.2 or greater to store OCR or voting disk data in a disk group.

Oracle Cluster Registry (OCR) and ASM Server Parameter File (ASM SPFILE):

OCR: The Oracle RAC configuration information repository that manages information about the cluster node list and instance-to-node mapping information. OCR also manages information about Oracle Clusterware resource profiles for customized applications.

The OCR is totally different from the voting disk. Oracle Clusterware relies on ASM to access the OCR and SPFILE. The OCR and SPFILE are stored similarly to how Oracle database files are stored: the extents are spread across all the disks in the diskgroup, and the redundancy (which is at the extent level) is based on the redundancy of the disk group. For this reason you can have only one OCR in a diskgroup.

So, if the diskgroup where the OCR is stored becomes unavailable, you lose your OCR and SPFILE. Therefore we need to put an OCR mirror in another disk group to survive the failure of a disk group.

The interesting discussion is what happens if you have the OCR mirrored and one of the copies gets corrupted. You would expect that everything will continue to work seamlessly. Well... almost. The real answer depends on when the corruption takes place.

If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and Oracle Clusterware will continue to function without interruptions, despite the corrupt copy. The DBA is advised to repair the hardware/software problem that prevents the OCR from accessing the device as soon as possible; alternatively, the DBA can replace the failed device with another healthy device using the ocrconfig utility with the -replace flag.

If, however, the corruption happens while the Oracle Clusterware stack is down, then it will not be possible to start it up until the failed device becomes online again or some administrative action is taken using the ocrconfig utility with the -overwrite flag.

Basic rules: you cannot create more than one (1) OCR or SPFILE in the same diskgroup.
The COMPATIBLE.ASM disk group compatibility attribute must be set to 11.2 or greater to store OCR or voting disk data in a disk group.

Best Practice for ASM is to have 2 diskgroups to store OCR.

Oracle Recommend: With Oracle Grid Infrastructure 11g Release 2, it is recommended to put the OCR and Voting Disks in Oracle ASM, using the same disk group you use for your database data.
I don’t agree!!!
I really don't recommend putting database files and clusterware files together. This can disrupt the management of the environment and cause downtime (e.g. you can never stop this diskgroup).
Example: the voting files are not stored in the diskgroup (+DATA); they are placed directly on an ASM disk. So in case of maintenance on the diskgroup, for example to increase the size of the LUNs, you cannot just remove the ASM disk: you must first move the voting files somewhere else and then perform the maintenance on the diskgroup.

Downtime? Yes. You can move the voting files and OCR without downtime, but to move the ASM SPFILE you need downtime. This is required for ASM to use the new SPFILE and release the old diskgroup. See: ORA-15027: active use of diskgroup precludes its dismount (With no database clients connected) [ID 1082876.1]

Voting files can be stored in only one diskgroup.
We can have any number of disk groups; during maintenance operations on them (replicate, drop, move, resize, clone, etc.) the clusterware files would otherwise be unnecessarily involved.

So I always recommend creating two small diskgroups:
+VOTE – Storing Voting files and OCR mirror
+CRS – Storing OCR and ASM Spfile.

Keep in mind: you must design the LUNs of these diskgroups (+VOTE, +CRS) before the clusterware becomes available to RAC databases (i.e. before installing RAC).

Recommendation for the LUN design:
Each voting disk needs about 300 MB.
The OCR and ASM SPFILE need about 300 MB.

Even when using hardware (storage) mirroring, I recommend creating ASM mirroring for the diskgroup that will store the voting files, because these files are configured as multiplexed copies.

Diskgroup VOTE:
Create 3 LUNs of 500 MB each. If possible, put each LUN on a different controller, array, or storage.

Diskgroup CRS:
If you are using storage mirroring, or you are using only one storage array, it is recommended to create one (1) LUN using external redundancy.

If you are not using storage mirroring, or you are using more than one storage array:

Using more than one storage array:
1 LUN (500 MB) on each storage array, creating a diskgroup with normal redundancy.

If you are using one storage array but not using storage mirroring:
Create 2 LUNs of 500 MB each, creating a diskgroup with normal redundancy.
Place each LUN on a different controller or array.

These LUNs are exclusive to the cluster.

Returning to the old days…

So we return to the old days (10.2), when we created separate LUNs for clusterware files using raw devices.

It may seem like a setback, but it is a process that facilitates management of the environment and makes it safer, by separating the clusterware files, which are extremely important for keeping the cluster highly available.

When you perform maintenance on the clusterware files, you change only the diskgroups (+CRS and +VOTE); when you perform maintenance on the diskgroups holding database files or ACFS, the clusterware files are not involved.

Now, let's do some tests:

During the Grid install, what can I do to accomplish this?
We cannot achieve the desired result during setup, but we can reconfigure it at the end of the installation. So, during the install I always create a temporary diskgroup named +CRSTMP with external redundancy and an ASM disk (LUN) of 1 GB.

The diskgroup +CRSTMP will hold one voting file, the OCR, and the ASM SPFILE.

Checking whether the nodes are active:

$ olsnodes -s
lnxora01        Active
lnxora02        Active
lnxora03        Active

Use ocrcheck to see where your OCR files are stored.

$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3848
         Available space (kbytes) :     258272
         ID                       : 1997055112
         Device/File Name         :    +CRSTMP
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check bypassed due to non-privileged user

Use crsctl to see where the voting file is stored.

$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a0d6ea8dfb944fe7bfb799a451195a18 (ORCL:CRSTMP01) [CRSTMP]
Located 1 voting disk(s).

Use asmcmd to see where the ASM SPFILE is stored.

$ asmcmd spget
+CRSTMP/testcluster/ASMPARAMETERFILE/REGISTRY.253.772133609

Getting info about the voting disk on ASM. We cannot see the voting file through the ASM file views; we only know which ASM disk it is stored on.

SQL>
SET LINESIZE 150
COL PATH FOR A30
COL NAME FOR A10
COL HEADER_STATUS FOR A20
COL FAILGROUP FOR A20
COL FAILGROUP_TYPE FOR A20
COL VOTING_FILE FOR A20
SELECT NAME,PATH,HEADER_STATUS,FAILGROUP, FAILGROUP_TYPE, VOTING_FILE
FROM V$ASM_DISK
WHERE GROUP_NUMBER = ( SELECT GROUP_NUMBER
             FROM V$ASM_DISKGROUP
             WHERE NAME='CRSTMP');
NAME       PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
---------- ------------------------------ -------------------- -------------------- -------------------- --------------------
CRSTMP01   ORCL:CRSTMP01                  MEMBER               CRSTMP01             REGULAR              Y

Getting full name of OCR and ASM SPFILE on ASM

olsnodes -c : shows the name of the cluster

$ olsnodes -c
tstcluster
set linesize 100
col FILES_OF_CLUSTER for a60
select concat('+'||gname, sys_connect_by_path(aname, '/')) FILES_OF_CLUSTER
     from ( select b.name gname, a.parent_index pindex, a.name aname,
              a.reference_index rindex , a.system_created, a.alias_directory,
              c.type file_type
       from v$asm_alias a, v$asm_diskgroup b, v$asm_file c
       where a.group_number = b.group_number
             and a.group_number = c.group_number(+)
             and a.file_number = c.file_number(+)
             and a.file_incarnation = c.incarnation(+)
     ) WHERE file_type in ( 'ASMPARAMETERFILE','OCRFILE')
start with (mod(pindex, power(2, 24))) = 0
            and rindex in
                ( select a.reference_index
                  from v$asm_alias a, v$asm_diskgroup b
                  where a.group_number = b.group_number
                        and (mod(a.parent_index, power(2, 24))) = 0
                        and a.name = LOWER('&CLUSTERNAME')
                )
connect by prior rindex = pindex;
Enter value for clustername: tstcluster
old  17:                         and a.name = LOWER('&CLUSTERNAME')
new  17:                         and a.name = LOWER('tstcluster')
FILES_OF_CLUSTER
---------------------------------------------------------
+CRSTMP/tstcluster/OCRFILE/REGISTRY.255.772133361
+CRSTMP/tstclsuter/ASMPARAMETERFILE/REGISTRY.253.772133609

After the disks are available on all hosts, we can start.

CRS01 and CRS02 will be used for diskgroup CRS.

VOTE01, VOTE02, and VOTE03 will be used for diskgroup VOTE.

col path for a30
 col name for a20
 col header_status for a20
 select path,name,header_status from v$asm_disk
 where path like '%CRS%' or path like '%VOTE%';
PATH                           NAME                 HEADER_STATUS
------------------------------ -------------------- --------------------
ORCL:CRS01                                          PROVISIONED
ORCL:CRS02                                          PROVISIONED
ORCL:VOTE01                                         PROVISIONED
ORCL:VOTE02                                         PROVISIONED
ORCL:VOTE03                                         PROVISIONED
ORCL:CRSTMP01                  CRSTMP01             MEMBER

Creating diskgroup VOTE: each disk must be in a different failgroup. I don't add a QUORUM failgroup here because these LUNs are on the storage array. I recommend using a QUORUM failgroup when you are placing a disk outside your storage environment (e.g. using an NFS file-disk for quorum purposes), because such disks cannot contain data.

SQL>
CREATE DISKGROUP VOTE NORMAL REDUNDANCY
     FAILGROUP STG1_C1 DISK 'ORCL:VOTE01'
     FAILGROUP STG1_C2 DISK 'ORCL:VOTE02'
     FAILGROUP STG1_C1_1 DISK 'ORCL:VOTE03'
     ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';
Diskgroup created.
# starting diskgroup on others nodes
SQL> ! srvctl start diskgroup -g vote -n lnxora02,lnxora03
# checking if diskgroup is active on all nodes
SQL> ! srvctl status diskgroup -g vote
Disk Group vote is running on lnxora01,lnxora02,lnxora03

Creating DISKGROUP CRS:

SQL>
CREATE DISKGROUP CRS NORMAL REDUNDANCY
     FAILGROUP STG1_C1 DISK 'ORCL:CRS01'
     FAILGROUP STG1_C2 DISK 'ORCL:CRS02'
     ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';
Diskgroup created.
# starting diskgroup on others nodes
SQL> ! srvctl start diskgroup -g crs -n lnxora02,lnxora03
# checking if diskgroup is active on all nodes
SQL> ! srvctl status diskgroup -g crs
Disk Group crs is running on lnxora01,lnxora02,lnxora03
SQL>
SET LINESIZE 150
COL PATH FOR A30
COL NAME FOR A10
COL HEADER_STATUS FOR A20
COL FAILGROUP FOR A20
COL FAILGROUP_TYPE FOR A20
COL VOTING_FILE FOR A20
SELECT NAME,PATH,HEADER_STATUS,FAILGROUP, FAILGROUP_TYPE, VOTING_FILE
FROM V$ASM_DISK
WHERE GROUP_NUMBER IN ( SELECT GROUP_NUMBER
             FROM V$ASM_DISKGROUP
             WHERE NAME IN ('CRS','VOTE'));
NAME       PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
---------- ------------------------------ -------------------- -------------------- -------------------- --------------------
VOTE03     ORCL:VOTE03                    MEMBER               STG1_C1_1            REGULAR              N
VOTE02     ORCL:VOTE02                    MEMBER               STG1_C2              REGULAR              N
VOTE01     ORCL:VOTE01                    MEMBER               STG1_C1              REGULAR              N
CRS01      ORCL:CRS01                     MEMBER               STG1_C1              REGULAR              N
CRS02      ORCL:CRS02                     MEMBER               STG1_C2              REGULAR              N

Moving Voting Files from +CRSTMP to +VOTE

$ crsctl replace votedisk +VOTE
Successful addition of voting disk aaa75b9e7ce24f39bfd9eecb3e3c0e38.
Successful addition of voting disk 873d51346cd34fc2bf9caa94999c4cd8.
Successful addition of voting disk acda8619b74c4fe8bf886ee6c9fe8d1a.
Successful deletion of voting disk a0d6ea8dfb944fe7bfb799a451195a18.
Successfully replaced voting disk group with +VOTE.
CRS-4266: Voting file(s) successfully replaced
$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   aaa75b9e7ce24f39bfd9eecb3e3c0e38 (ORCL:VOTE01) [VOTE]
 2. ONLINE   873d51346cd34fc2bf9caa94999c4cd8 (ORCL:VOTE02) [VOTE]
 3. ONLINE   acda8619b74c4fe8bf886ee6c9fe8d1a (ORCL:VOTE03) [VOTE]
Located 3 voting disk(s).
SET LINESIZE 150
COL PATH FOR A30
COL NAME FOR A10
COL HEADER_STATUS FOR A20
COL FAILGROUP FOR A20
COL FAILGROUP_TYPE FOR A20
COL VOTING_FILE FOR A20
SELECT NAME,PATH,HEADER_STATUS,FAILGROUP, FAILGROUP_TYPE, VOTING_FILE
FROM V$ASM_DISK
WHERE GROUP_NUMBER = ( SELECT GROUP_NUMBER
             FROM V$ASM_DISKGROUP
             WHERE NAME='VOTE');
NAME       PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
---------- ------------------------------ -------------------- -------------------- -------------------- --------------------
VOTE03     ORCL:VOTE03                    MEMBER               STG1_C1_1            REGULAR              Y
VOTE02     ORCL:VOTE02                    MEMBER               STG1_C2              REGULAR              Y
VOTE01     ORCL:VOTE01                    MEMBER               STG1_C1              REGULAR              Y

Moving the OCR to diskgroups +CRS and +VOTE and removing it from diskgroup +CRSTMP

Whether an OCR location is the principal or the mirror is determined by the order in which the locations are added.
Therefore we add the location on diskgroup +CRS first and the one on diskgroup +VOTE second; when the OCR on diskgroup +CRSTMP is removed, the OCR on diskgroup +CRS becomes the principal.
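
A quick way to see which location has ended up as the principal after the steps below is to check /etc/oracle/ocr.loc as root on each node (the same file is used again in the troubleshooting section at the end of this post); once the move is complete it should show ocrconfig_loc=+CRS and ocrmirrorconfig_loc=+VOTE:

# cat /etc/oracle/ocr.loc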

# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3868
         Available space (kbytes) :     258252
         ID                       : 1997055112
         Device/File Name         :    +CRSTMP
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
# /u01/app/11.2.0/grid/bin/ocrconfig -add +CRS
# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3836
         Available space (kbytes) :     258284
         ID                       : 1997055112
         Device/File Name         :    +CRSTMP
                                    Device/File integrity check succeeded
         Device/File Name         :       +CRS
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
# /u01/app/11.2.0/grid/bin/ocrconfig -add +VOTE
# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3836
         Available space (kbytes) :     258284
         ID                       : 1997055112
         Device/File Name         :    +CRSTMP
                                    Device/File integrity check succeeded
         Device/File Name         :       +CRS
                                    Device/File integrity check succeeded
         Device/File Name         :      +VOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
# /u01/app/11.2.0/grid/bin/ocrconfig -delete +CRSTMP
# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3836
         Available space (kbytes) :     258284
         ID                       : 1997055112
         Device/File Name         :       +CRS
                                    Device/File integrity check succeeded
         Device/File Name         :      +VOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

Moving ASM SPFILE to diskgroup +CRS

You will get an error saying that the file is still being accessed, but the file is actually copied to the new diskgroup and the profile is updated.

$ asmcmd spget
+CRSTMP/tstcluster/ASMPARAMETERFILE/REGISTRY.253.772133609
$ asmcmd spmove '+CRSTMP/tstcluster/ASMPARAMETERFILE/REGISTRY.253.772133609' '+CRS/tstcluster/spfileASM.ora'
ORA-15032: not all alterations performed
ORA-15028: ASM file '+CRSTMP/tstcluster/ASMPARAMETERFILE/REGISTRY.253.772133609' not dropped; currently being accessed (DBD ERROR: OCIStmtExecute)
# checking that the file was copied and the profile updated
$ asmcmd spget
+CRS/tstcluster/spfileASM.ora

Checking the cluster files on ASM

set linesize 100
col FILES_OF_CLUSTER for a60
select concat('+'||gname, sys_connect_by_path(aname, '/')) FILES_OF_CLUSTER
     from ( select b.name gname, a.parent_index pindex, a.name aname,
              a.reference_index rindex , a.system_created, a.alias_directory,
              c.type file_type
       from v$asm_alias a, v$asm_diskgroup b, v$asm_file c
       where a.group_number = b.group_number
             and a.group_number = c.group_number(+)
             and a.file_number = c.file_number(+)
             and a.file_incarnation = c.incarnation(+)
     ) WHERE file_type in ( 'ASMPARAMETERFILE','OCRFILE')
start with (mod(pindex, power(2, 24))) = 0
            and rindex in
                ( select a.reference_index
                  from v$asm_alias a, v$asm_diskgroup b
                  where a.group_number = b.group_number
                        and (mod(a.parent_index, power(2, 24))) = 0
                        and a.name = LOWER('&CLUSTERNAME')
                )
connect by prior rindex = pindex;
Enter value for clustername: tstcluster
old  17:                         and a.name = LOWER('&CLUSTERNAME')
new  17:                         and a.name = LOWER('tstcluster')
FILES_OF_CLUSTER
------------------------------------------------------------
+CRSTMP/tstcluster/OCRFILE/REGISTRY.255.772133361
+CRSTMP/tstcluster/ASMPARAMETERFILE/REGISTRY.253.772133609
+VOTE/tstcluster/OCRFILE/REGISTRY.255.772207785
+CRS/tstcluster/OCRFILE/REGISTRY.255.772207425
+CRS/tstcluster/ASMPARAMETERFILE/REGISTRY.253.772208263
+CRS/tstcluster/spfileASM.ora

So that ASM can use the new SPFILE and release the diskgroup +CRSTMP, we need to restart the cluster.
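
Before restarting, you can double-check that only the GPnP profile has been updated so far; the running ASM instance keeps using the SPFILE it was started with on +CRSTMP until the restart. A minimal check, assuming a SQL*Plus session on the local +ASM instance:

# the profile already points to the new SPFILE location
$ asmcmd spget
# the in-memory parameter of the running ASM instance still reflects the old location
SQL> show parameter spfile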

# /u01/app/11.2.0/grid/bin/crsctl stop cluster -all
CRS-2673: Attempting to stop 'ora.crsd' on 'lnxora01'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'lnxora01'
.
.
.
CRS-2673: Attempting to stop 'ora.crsd' on 'lnxora02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'lnxora02'
.
.
.
CRS-2673: Attempting to stop 'ora.crsd' on 'lnxora03'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'lnxora03'
.
.
.
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'lnxora01' has completed
.
.
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'lnxora02' has completed
.
.
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'lnxora03' has completed
# /u01/app/11.2.0/grid/bin/crsctl start cluster -all
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'lnxora01'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'lnxora02'
.
$ asmcmd spget
+CRS/tstcluster/spfileASM.ora

Now we can drop diskgroup +CRSTMP

SQL> ! srvctl stop diskgroup -g crstmp -n lnxora02,lnxora03
SQL> drop diskgroup crstmp including contents;
Diskgroup dropped.
SQL>
FILES_OF_CLUSTER
------------------------------------------------------------
+CRS/tstcluster/OCRFILE/REGISTRY.255.772207425
+CRS/tstcluster/ASMPARAMETERFILE/REGISTRY.253.772211229
+CRS/tstcluster/spfileASM.ora
+VOTE/tstcluster/OCRFILE/REGISTRY.255.772207785
Adding a 3rd Voting File on NFS to a Cluster using Oracle ASM

In this post I’ll show how to configure it on Linux in more detail; for other platforms you can follow this Oracle white paper (http://www.oracle.com/technetwork/database/clusterware/overview/grid-infra-thirdvoteonnfs-131158.pdf).

Based on the settings above, I’ll show you how easy it is to add a voting disk using ASM (Linux only).

Preparing the NFS server (Oracle recommends using a dedicated host for the 3rd voting disk):

# mkdir /votedisk
# vi /etc/exports
/votedisk *(rw,sync,all_squash,anonuid=54321,anongid=54325)
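
After editing /etc/exports, the export table usually has to be reloaded before the clients can mount the share; a minimal sketch, assuming a standard Linux NFS server with the nfs service already running:

# exportfs -ra
# showmount -e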

Setting Up NFS Clients
The configuration below must be present on all nodes of the cluster:

# cat /etc/fstab
lnxnfs:/votedisk      /voting_disk    nfs     rw,bg,hard,intr,rsize=32768,wsize=32768,tcp,noac,vers=3,timeo=600       0       0

Mount /voting_disk on all nodes of the cluster and check that it is mounted with the right options:

# mount /voting_disk
$ mount |grep voting_disk
lnxnfs:/votedisk on /voting_disk type nfs (rw,bg,hard,intr,rsize=32768,wsize=32768,tcp,nfsvers=3,timeo=600,noac,addr=192.168.217.45)

Create a Disk-File to be used by ASM

$ dd if=/dev/zero of=/voting_disk/asm_vote_quorum bs=10M count=58
58+0 records in
58+0 records out
608174080 bytes (608 MB) copied, 3.68873 seconds, 165 MB/s
# chmod 660 /voting_disk/asm_vote_quorum
# chown oracle.asmadmin /voting_disk/asm_vote_quorum
# ls -ltr /voting_disk/asm_vote_quorum
-rw-rw---- 1 oracle asmadmin 608174080 Jan 10 20:00 /voting_disk/asm_vote_quorum

Adding the new Diskstring on ASM

SQL> show parameter asm_diskstring
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      ORCL:*
SQL> ALTER SYSTEM SET asm_diskstring ='ORCL:*','/voting_disk/asm_vote_quorum' SCOPE=both SID='*';
SQL> show parameter asm_diskstring
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------------
asm_diskstring                       string      ORCL:*, /voting_disk/asm_vote_quorum
$ asmcmd dsget
parameter:ORCL:*, /voting_disk/asm_vote_quorum
profile:ORCL:*,/voting_disk/asm_vote_quorum

Checking if this new disk is available on ASM

$ kfod disks=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:        580 Mb /voting_disk/asm_vote_quorum             oracle   asmadmin
   2:        486 Mb ORCL:CRS01
   3:        486 Mb ORCL:CRS02
   .
   .
   .
  9:         580 Mb ORCL:VOTE01
  10:        580 Mb ORCL:VOTE02
  11:        580 Mb ORCL:VOTE03
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
     +ASM3 /u01/app/11.2.0/grid
     +ASM1 /u01/app/11.2.0/grid
     +ASM2 /u01/app/11.2.0/grid
SQL> col path for a30
SQL>
select path,header_status
from v$asm_disk
 where path like '%vote_quorum%';
PATH                           HEADER_STATUS
------------------------------ --------------------
/voting_disk/asm_vote_quorum   CANDIDATE
SQL>  ALTER DISKGROUP VOTE
      ADD
      QUORUM FAILGROUP STG_NFS DISK '/voting_disk/asm_vote_quorum';
Diskgroup altered.
$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   aaa75b9e7ce24f39bfd9eecb3e3c0e38 (ORCL:VOTE01) [VOTE]
 2. ONLINE   873d51346cd34fc2bf9caa94999c4cd8 (ORCL:VOTE02) [VOTE]
 3. ONLINE   51f29389684e4f60bfb4b1683db8bd09 (/voting_disk/asm_vote_quorum) [VOTE]
Located 3 voting disk(s).
SQL>
SET LINESIZE 150
COL PATH FOR A30
COL NAME FOR A10
COL HEADER_STATUS FOR A20
COL FAILGROUP FOR A20
COL FAILGROUP_TYPE FOR A20
COL VOTING_FILE FOR A20
SELECT NAME,PATH,HEADER_STATUS,FAILGROUP, FAILGROUP_TYPE, VOTING_FILE
FROM V$ASM_DISK
WHERE GROUP_NUMBER = ( SELECT GROUP_NUMBER
             FROM V$ASM_DISKGROUP
             WHERE NAME='VOTE');
NAME       PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
---------- ------------------------------ -------------------- -------------------- -------------------- --------------------
VOTE01     ORCL:VOTE01                    MEMBER               STG1_C1              REGULAR              Y
VOTE02     ORCL:VOTE02                    MEMBER               STG1_C2              REGULAR              Y
VOTE03     ORCL:VOTE03                    MEMBER               STG1_C1_1            REGULAR              N
VOTE_0003  /voting_disk/asm_vote_quorum   MEMBER               STG_NFS              QUORUM               Y
### Use the WAIT option when dropping an ASM disk so you know when it has really been removed: the prompt is not released until the rebalance operation has completed.
SQL>  ALTER DISKGROUP VOTE
     DROP DISK 'VOTE03'
     REBALANCE POWER 3 WAIT;
Diskgroup altered.
NAME       PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
---------- ------------------------------ -------------------- -------------------- -------------------- --------------------
VOTE01     ORCL:VOTE01                    MEMBER               STG1_C1              REGULAR              Y
VOTE02     ORCL:VOTE02                    MEMBER               STG1_C2              REGULAR              Y
VOTE_0003  /voting_disk/asm_vote_quorum   MEMBER               STG_NFS              QUORUM               Y
Can we have 15 voting disks on ASM?

No. Up to 15 voting files are allowed only when they are not stored on ASM; when the voting files are on ASM, the maximum is 5, because Oracle derives the voting-file placement from the diskgroup configuration.
Using a high number of voting disks can be useful in a big cluster environment, e.g. 5 storage subsystems and 20 hosts in a single cluster, where you place one voting file on each storage subsystem; but if you are using only one storage subsystem, 3 voting files are enough.

https://forums.oracle.com/forums/thread.jspa?messageID=10070225

Oracle documentation: You should have at least three voting disks, unless you have a storage device, such as a disk array, that provides external redundancy. Oracle recommends that you do not use more than 5 voting disks. The maximum number of voting disks that is supported is 15.
http://docs.oracle.com/cd/E11882_01/rac.112/e16794/crsref.htm#CHEJDHFH

See this example:

I configured 7 ASM disks, but Oracle used only 5 of them.

SQL> CREATE DISKGROUP DG_VOTE HIGH REDUNDANCY
     FAILGROUP STG1 DISK 'ORCL:DG_VOTE01'
     FAILGROUP STG2 DISK 'ORCL:DG_VOTE02'
     FAILGROUP STG3 DISK 'ORCL:DG_VOTE03'
     FAILGROUP STG4 DISK 'ORCL:DG_VOTE04'
     FAILGROUP STG5 DISK 'ORCL:DG_VOTE05'
     FAILGROUP STG6 DISK 'ORCL:DG_VOTE06'
     FAILGROUP STG7 DISK 'ORCL:DG_VOTE07'
   ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';
Diskgroup created.
SQL> ! srvctl start diskgroup -g DG_VOTE -n lnxora02,lnxora03
$  crsctl replace votedisk +DG_VOTE
CRS-4256: Updating the profile
Successful addition of voting disk 427f38b47ff24f52bf1228978354f1b2.
Successful addition of voting disk 891c4a40caed4f05bfac445b2fef2e14.
Successful addition of voting disk 5421865636524f5abf008becb19efe0e.
Successful addition of voting disk a803232576a44f1bbff65ab626f51c9e.
Successful addition of voting disk 346142ea30574f93bf870a117bea1a39.
Successful deletion of voting disk 2166953a27a14fcbbf38dae2c4049fa2.
Successfully replaced voting disk group with +DG_VOTE.
$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   427f38b47ff24f52bf1228978354f1b2 (ORCL:DG_VOTE01) [DG_VOTE]
 2. ONLINE   891c4a40caed4f05bfac445b2fef2e14 (ORCL:DG_VOTE02) [DG_VOTE]
 3. ONLINE   5421865636524f5abf008becb19efe0e (ORCL:DG_VOTE03) [DG_VOTE]
 4. ONLINE   a803232576a44f1bbff65ab626f51c9e (ORCL:DG_VOTE04) [DG_VOTE]
 5. ONLINE   346142ea30574f93bf870a117bea1a39 (ORCL:DG_VOTE05) [DG_VOTE]
SQL >
SET LINESIZE 150
COL PATH FOR A30
COL NAME FOR A10
COL HEADER_STATUS FOR A20
COL FAILGROUP FOR A20
COL FAILGROUP_TYPE FOR A20
COL VOTING_FILE FOR A20
SELECT NAME,PATH,HEADER_STATUS,FAILGROUP, FAILGROUP_TYPE, VOTING_FILE
FROM V$ASM_DISK
WHERE GROUP_NUMBER = ( SELECT GROUP_NUMBER
             FROM V$ASM_DISKGROUP
             WHERE NAME='DG_VOTE');
NAME       PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
---------- ------------------------------ -------------------- -------------------- -------------------- --------------------
DG_VOTE01  ORCL:DG_VOTE01                 MEMBER               STG1                 REGULAR              Y
DG_VOTE02  ORCL:DG_VOTE02                 MEMBER               STG2                 REGULAR              Y
DG_VOTE03  ORCL:DG_VOTE03                 MEMBER               STG3                 REGULAR              Y
DG_VOTE04  ORCL:DG_VOTE04                 MEMBER               STG4                 REGULAR              Y
DG_VOTE05  ORCL:DG_VOTE05                 MEMBER               STG5                 REGULAR              Y
DG_VOTE06  ORCL:DG_VOTE06                 MEMBER               STG6                 REGULAR              N
DG_VOTE07  ORCL:DG_VOTE07                 MEMBER               STG7                 REGULAR              N
Errors and Workarounds

ASM removed the voting file from the wrong ASM disk (failgroup)… how do I fix it?

You cannot choose which ASM disk the voting file will be removed from, and this can be a problem.
Fortunately it is easy to solve.

Follow these steps:

## As you have configured an NFS mount, you can move the voting files to NFS.
$ crsctl replace votedisk '/voting_disk/vote_temp'
#### Now you can drop the desired ASM disk and add the new ASM disk with the QUORUM option.
#### It is recommended to have 3 failgroups (one failgroup per storage subsystem), with the 3rd failgroup as a quorum on NFS.
#### After reconfiguring the ASM diskgroup VOTE, you can move the voting files from NFS back to ASM.
$ crsctl replace votedisk +VOTE
### Everything will work.
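
For reference, the intermediate reconfiguration mentioned in the comments above could look like the following sketch, reusing the disk and failgroup names from earlier in this post (drop the ASM disk that wrongly kept a voting file, then add the NFS file-disk as a QUORUM failgroup before moving the voting files back):

SQL> ALTER DISKGROUP VOTE DROP DISK 'VOTE03' REBALANCE POWER 3 WAIT;
SQL> ALTER DISKGROUP VOTE
     ADD QUORUM FAILGROUP STG_NFS DISK '/voting_disk/asm_vote_quorum';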

After restarting the cluster, CRS does not start: how do I fix it?

Problem: after restarting the cluster on all nodes, CRS does not start on some nodes following the change of OCR location.
The problem node was not updated with the new OCR location, so it keeps trying to find the old diskgroup.

I had changed the OCR from +CRSTMP to +CRS, +VOTE.

You can solve it manually.
The error in crsd.log looks like this:

2012-01-10 14:39:26.144: [ CRSMAIN][4039143920] Initializing OCR
2012-01-10 14:39:26.145: [ CRSMAIN][1089243456] Policy Engine is not initialized yet!
[   CLWAL][4039143920]clsw_Initialize: OLR initlevel [70000]
2012-01-10 14:39:32.712: [  OCRRAW][4039143920]proprioo: for disk 0 (+CRSTMP), id match (0), total id sets, (0) need recover (0), my votes (0), total votes (0), commit_lsn (0), lsn (0)
2012-01-10 14:39:32.712: [  OCRRAW][4039143920]proprioo: my id set: (723563391, 1028247821, 0, 0, 0)
2012-01-10 14:39:32.712: [  OCRRAW][4039143920]proprioo: 1st set: (0, 0, 0, 0, 0)
2012-01-10 14:39:32.712: [  OCRRAW][4039143920]proprioo: 2nd set: (0, 0, 0, 0, 0)
2012-01-10 14:39:32.838: [  OCRRAW][4039143920]utiid:problem validating header for owner db phy_addr=0
2012-01-10 14:39:32.838: [  OCRRAW][4039143920]proprinit:problem reading the bootblock or superbloc 26
2012-01-10 14:39:33.565: [  OCRAPI][4039143920]a_init:16!: Backend init unsuccessful : [26]
2012-01-10 14:39:33.570: [  CRSOCR][4039143920] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage
2012-01-10 14:39:33.570: [  CRSOCR][4039143920][PANIC] OCR Context is NULL(File: caaocr.cpp, line: 145)
2012-01-10 14:39:33.570: [    CRSD][4039143920][PANIC] CRSD Exiting. OCR Failed
2012-01-10 14:39:33.571: [    CRSD][4039143920] Done.
The error can be spotted in two lines:
2012-01-10 14:39:32.712: [  OCRRAW][4039143920]proprioo: for disk 0 (+CRSTMP), id match (0), total id sets, (0) need recover (0), my votes (0), total votes (0), commit_lsn (0), lsn (0)
2012-01-10 14:39:33.570: [  CRSOCR][4039143920] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage

To solve it:

Connect to a node where CRS is working and check the content of the file /etc/oracle/ocr.loc (cat /etc/oracle/ocr.loc).

In my case:

On the node where CRS is working:

host: lnxora01
$ cat /etc/oracle/ocr.loc
#Device/file +CRSTMP getting replaced by device +CRS
ocrconfig_loc=+CRS
ocrmirrorconfig_loc=+VOTE
local_only=false

On the node where CRS is not working:

host: lnxora02
$ cat /etc/oracle/ocr.loc
ocrconfig_loc=+CRSTMP
local_only=false

The file /etc/oracle/ocr.loc must be identical on all nodes, so I updated ocr.loc on the problem server and CRS then started on it without errors.
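
A small sanity check that can save time here is to compare the file across all nodes before restarting anything; a minimal sketch, assuming password-less ssh between the nodes used in this post:

$ for node in lnxora01 lnxora02 lnxora03; do echo "== $node =="; ssh $node cat /etc/oracle/ocr.loc; done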

Enjoy…

Ref: https://levipereira.wordpress.com/2012/01/11/explaining-how-to-store-ocr-voting-disks-and-asm-spfile-on-asm-diskgroup-rac-or-rac-extended/

Restoring OCR and Voting Disk on 11gR2

OCR and Voting Disk are critical components of Oracle RAC. There are scenarios in which they can be lost, e.g. the volume containing them is overwritten or somebody accidentally deletes the files. Fortunately there are steps to restore them. Starting with 11gR2, Oracle automatically takes a backup of the voting disk as part of the OCR backups. The steps below describe the exact procedure for performing the restore.

Step 1: Identify the OCR backup location. ocrconfig -showbackup provides the location of the backups. Depending on which node is the master, you will find the backups on one of the nodes of the cluster. The default location is grid_home/cdata/cluster_name.

Step 2: Identify which backup to restore. A weekly backup, a daily backup, and the three most recent 4-hourly backups are preserved. You should identify the file that contains the latest OCR changes, i.e. whether you modified a database service or instance using srvctl, or added a node. We use the weekly backup here, as no changes were made to the system.
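
A minimal sketch of steps 1 and 2, assuming the Grid home path used earlier in this post and the default backup location (the week.ocr file used below is one of the automatically generated backups):

# /u01/app/11.2.0/grid/bin/ocrconfig -showbackup
# ls -ltr /u01/app/11.2.0/grid/cdata/<cluster_name>/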

Step 3: Ensure that CRS is stopped on all nodes and restore the backup
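
A minimal sketch of the stop part, run as root on every node before the restore command below (an assumption; the exact state of a node whose OCR is lost may vary):

# crsctl stop crs -f
# crsctl check crs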

[root@proddb-001 ]# ocrconfig -restore week.ocr 

Step 4: Start the CRS in exclusive mode

[root@proddb-001 ]#  crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'proddb-001'
CRS-2676: Start of 'ora.mdnsd' on 'proddb-001' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'proddb-001'
CRS-2676: Start of 'ora.gpnpd' on 'proddb-001' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'proddb-001'
CRS-2672: Attempting to start 'ora.gipcd' on 'proddb-001'
CRS-2676: Start of 'ora.cssdmonitor' on 'proddb-001' succeeded
CRS-2676: Start of 'ora.gipcd' on 'proddb-001' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'proddb-001'
CRS-2672: Attempting to start 'ora.diskmon' on 'proddb-001'
CRS-2676: Start of 'ora.diskmon' on 'proddb-001' succeeded
CRS-2676: Start of 'ora.cssd' on 'proddb-001' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'proddb-001'
CRS-2676: Start of 'ora.ctssd' on 'proddb-001' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'proddb-001'
CRS-2676: Start of 'ora.crsd' on 'proddb-001' succeeded

We can verify that no voting disk is present:

[root@proddb-001 ]# crsctl query css votedisk
Located 0 voting disk(s).

Step 5: Add the voting disk and provide the location

[root@proddb-001 cludata]# crsctl add css votedisk /u05/cludata/cssfile
Now formatting voting disk: /u05/cludata/cssfile.
CRS-4603: Successful addition of voting disk /u05/cludata/cssfile.
[root@proddb-001 cludata]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   77de89dc89fe4fb3bfb87e5a33237312 (/u05/cludata/cssfile) []
Located 1 voting disk(s).

Step 6: Stop CRS using the -f option

[root@proddb-001 cludata]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'proddb-001'
CRS-2673: Attempting to stop 'ora.crsd' on 'proddb-001'
CRS-2677: Stop of 'ora.crsd' on 'proddb-001' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'proddb-001'
CRS-2673: Attempting to stop 'ora.ctssd' on 'proddb-001'
CRS-2677: Stop of 'ora.mdnsd' on 'proddb-001' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'proddb-001' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'proddb-001'
CRS-2677: Stop of 'ora.cssd' on 'proddb-001' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'proddb-001'
CRS-2677: Stop of 'ora.gipcd' on 'proddb-001' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'proddb-001'
CRS-2677: Stop of 'ora.gpnpd' on 'proddb-001' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'proddb-001' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Step 7: Start CRS normally on all nodes:

crsctl start crs
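
Once the stack is up on every node, a quick verification sketch using standard clusterware commands (run as root or the grid owner):

# crsctl check cluster -all
# crsctl stat res -t
# crsctl query css votedisk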