In Oracle 11gR2 we can use ASM as the storage for the OCR and voting disks.
There are a few things we need to understand:
1. A manual or automatic OCR backup cannot be restored directly if the OCR is stored in an ASM disk group.
2. For ASM to start successfully, the CRS stack must be up.
3. At the time of an OCR restore, the OCR must not be in use, i.e. no CRS daemon may be running.
So there is a kind of cyclic dependency between ASM and CRS. There is a way to overcome this, but to keep the setup simple we can always follow the traditional approach of using a CFS (Clustered File System) to store the OCR and voting disks.
In this post, I describe how to move the OCR and voting disks from ASM to CFS.
Environment Details
SunOS node1.mydomain.com 5.10 Generic_141445-09 i86pc i386 i86pc
SunOS node2.mydomain.com 5.10 Generic_141445-09 i86pc i386 i86pc
# cat /etc/release
Solaris 10 10/09 s10x_u8wos_08a X86
Copyright 2009 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 16 September 2009
I have already created a CFS mounted at /ocrvote, with two directories (ocr and vote) in it for storing these files.
First we’ll add the OCR to CFS
1. Check the existing OCR details
# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2332
Available space (kbytes) : 259788
ID : 775547019
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
As we can see, the current OCR is stored in the ASM disk group +DATA.
2. Add new OCR on CFS
# ocrconfig -add /ocrvote/ocr/ocr.dat
PROT-30: The Oracle Cluster Registry location to be added is not accessible
The workaround is to create an empty file at that location first:
# touch /ocrvote/ocr/ocr.dat
Now retry the same operation
# ocrconfig -add /ocrvote/ocr/ocr.dat
#
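The workaround above can be wrapped in a small helper. This is only a sketch: the function name prepare_ocr_file is my own and not part of any Oracle tool, and it does nothing more than the touch shown above plus a check that the target directory exists. It does not call ocrconfig itself.

```shell
# Hypothetical helper (not an Oracle tool): create an empty placeholder
# file so that a later "ocrconfig -add <path>" finds the location accessible.
prepare_ocr_file() {
    target=$1
    dir=$(dirname "$target")
    # The CFS directory must already exist (and be mounted) on this node.
    if [ ! -d "$dir" ]; then
        echo "directory $dir does not exist" >&2
        return 1
    fi
    # Leave an existing file untouched instead of creating it again.
    if [ -e "$target" ]; then
        echo "$target already exists, leaving it alone" >&2
        return 0
    fi
    touch "$target"
}
```

Run as root, it would be used as: prepare_ocr_file /ocrvote/ocr/ocr.dat && ocrconfig -add /ocrvote/ocr/ocr.dat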
Check the OCR details again
# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2184
Available space (kbytes) : 259936
ID : 1340063014
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /ocrvote/ocr/ocr.dat
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
The second OCR location, on CFS, is now listed.
Add a third OCR location on CFS
# touch /ocrvote/ocr/ocr2.dat
# ocrconfig -add /ocrvote/ocr/ocr2.dat
# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2152
Available space (kbytes) : 259968
ID : 1340063014
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /ocrvote/ocr/ocr.dat
Device/File integrity check succeeded
Device/File Name : /ocrvote/ocr/ocr2.dat
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
# ls -al /ocrvote/ocr/
total 25368
drwxr-xr-x 2 oragrid oinstall 96 Sep 3 17:56 .
drwxr-xr-x 5 oragrid oinstall 96 Sep 2 20:51 ..
-rw-r--r-- 1 root root 272756736 Sep 3 17:58 ocr.dat
-rw-r--r-- 1 root root 272756736 Sep 3 17:58 ocr2.dat
3. Delete OCR file from ASM DG +DATA
# ocrconfig -delete +DATA
Check the OCR details again
# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2152
Available space (kbytes) : 259968
ID : 1340063014
Device/File Name : /ocrvote/ocr/ocr.dat
Device/File integrity check succeeded
Device/File Name : /ocrvote/ocr/ocr2.dat
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
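Before moving on, it is worth confirming that no OCR location points at ASM any more. The following is a minimal sketch that reads saved ocrcheck output on standard input. The function names ocr_locations and ocr_on_asm are hypothetical, and the parsing assumes the "Device/File Name : value" layout shown in the listings above.

```shell
# Hypothetical helpers: both read saved "ocrcheck" output on stdin.

# Print every configured OCR location, one per line.
ocr_locations() {
    awk -F: '/Device\/File Name/ {
        loc = $2
        gsub(/^[ \t]+|[ \t]+$/, "", loc)   # trim surrounding blanks
        print loc
    }'
}

# Succeed (exit status 0) if any location is an ASM disk group ("+NAME").
ocr_on_asm() {
    ocr_locations | grep -q '^[+]'
}
```

For example: ocrcheck > /tmp/ocrcheck.out, then ocr_on_asm < /tmp/ocrcheck.out && echo "OCR still has a location on ASM".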
Second part: move the voting disk
1. Start CRS in exclusive mode.
This mode allows ASM to start and stay up without a voting disk. Note that on this release ora.crsd is still started in exclusive mode, as the output below shows.
# $CRS_HOME/bin/crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'node1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'node1'
CRS-2676: Start of 'ora.gipcd' on 'node1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'node1'
CRS-2676: Start of 'ora.gpnpd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'node1'
CRS-2676: Start of 'ora.cssdmonitor' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'node1'
CRS-2679: Attempting to clean 'ora.diskmon' on 'node1'
CRS-2681: Clean of 'ora.diskmon' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'node1'
CRS-2676: Start of 'ora.diskmon' on 'node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'node1'
CRS-2676: Start of 'ora.ctssd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'node1'
CRS-2676: Start of 'ora.asm' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'node1'
CRS-2676: Start of 'ora.crsd' on 'node1' succeeded
2. Check the existing voting disk
# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a8115f266b524f4abfe02d173f0ac3d2 (/dev/rdsk/c0t60060E800564F700000064F700000561d0s6) [DATA]
Located 1 voting disk(s).
One voting file exists on ASM diskgroup +DATA.
3. Remove the ASM based voting disk
# crsctl delete css votedisk a8115f266b524f4abfe02d173f0ac3d2
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM
The voting disk cannot be removed because it is stored in ASM. This is similar to bug 9294664, described in Metalink note 1060146.1.
According to this note, we cannot delete a voting disk that is on ASM; we can only replace it.
# crsctl replace votedisk /ocrvote/vote/vote.dat
Now formatting voting disk: /ocrvote/vote/vote.dat.
CRS-4256: Updating the profile
Successful addition of voting disk 8be0f971ea394f2fbfbd4c9056c42656.
Successful deletion of voting disk a8115f266b524f4abfe02d173f0ac3d2.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
Add another voting disk
# /oragrid/11.2/bin/crsctl add css votedisk /ocrvote/vote/vote2.dat
Now formatting voting disk: /ocrvote/vote/vote2.dat.
CRS-4603: Successful addition of voting disk /ocrvote/vote/vote2.dat.
Check newly added voting disk
# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   8be0f971ea394f2fbfbd4c9056c42656 (/ocrvote/vote/vote.dat) []
 2. ONLINE   239fa2fbbfb54f12bf519485cd10821e (/ocrvote/vote/vote2.dat) []
Located 2 voting disk(s).
# ls -al /ocrvote/vote/
total 82048
drwxr-xr-x 2 oragrid oinstall 96 Sep 3 18:25 .
drwxr-xr-x 5 oragrid oinstall 96 Sep 2 20:51 ..
-rw-r----- 1 oragrid root 21004288 Sep 3 18:25 vote.dat
-rw-r----- 1 oragrid root 21004288 Sep 3 18:25 vote2.dat
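The same kind of check can be applied to the voting files. This is a rough sketch that reads saved "crsctl query css votedisk" output on standard input; the function names online_votedisks and votedisk_on_asm are hypothetical, and the parsing assumes the column layout shown in the listings above, where a file outside ASM has an empty "[]" disk group.

```shell
# Hypothetical helpers: both read saved "crsctl query css votedisk"
# output on stdin.

# Print the file name of every ONLINE voting file, without the parentheses.
online_votedisks() {
    awk '$2 == "ONLINE" {
        name = $4
        gsub(/[()]/, "", name)
        print name
    }'
}

# Succeed (exit status 0) if any listed voting file still reports a
# disk group such as "[DATA]" instead of the empty "[]".
votedisk_on_asm() {
    awk '$1 ~ /^[0-9]+\.$/ && $5 != "[]" { found = 1 } END { exit !found }'
}
```

For example: crsctl query css votedisk > /tmp/vote.out, then online_votedisks < /tmp/vote.out lists the files and votedisk_on_asm < /tmp/vote.out tells whether any of them is still in ASM.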
4. Stop the CRS
# crsctl stop crs
5. Cross-verify the /var/opt/oracle/ocr.loc file on all cluster nodes (the path is platform specific)
# cat /var/opt/oracle/ocr.loc
#Device/file +DATA getting replaced by device /ocrvote/ocr/ocr.dat
ocrconfig_loc=/ocrvote/ocr/ocr.dat
ocrmirrorconfig_loc=/ocrvote/ocr/ocr2.dat
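Comparing ocr.loc between nodes by eye is error prone, so here is a minimal sketch of an automated comparison. The function names are hypothetical, and it assumes you have already copied the other node's ocr.loc to a local path by whatever means you normally use. Only the location keys are compared; comment lines are ignored. The optional numeric suffix in the pattern is an assumption meant to cover additional OCR locations.

```shell
# Hypothetical helper: print the OCR location settings of an ocr.loc
# file in a stable order, ignoring comment lines and any other keys.
# The keys ocrconfig_loc and ocrmirrorconfig_loc are the ones shown in
# the listing above; the [0-9]* suffix is an assumption.
ocr_loc_settings() {
    grep -E '^ocr[a-z]*config_loc[0-9]*=' "$1" | sort
}

# Succeed (exit status 0) if two ocr.loc copies name the same locations.
same_ocr_loc() {
    [ "$(ocr_loc_settings "$1")" = "$(ocr_loc_settings "$2")" ]
}
```

For example, with node2's file copied to /tmp/ocr.loc.node2 (a path chosen only for illustration): same_ocr_loc /var/opt/oracle/ocr.loc /tmp/ocr.loc.node2 || echo "ocr.loc differs between nodes".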
6. Start the CRS
# crsctl start crs
So we now have our OCR and voting files on CFS.
Just adding:
crsctl start crs -excl -nocrs
starts ohasd, cssd and ctssd but does not start crsd.
This allows you to work with the registry files, as well as other manipulations such as ocrconfig -import or ocrconfig -restore.
Thanks for the note Tom.
Hi Yogesh,
I have a scenario regarding repairing the OCR in Oracle 11g R2.
I am getting a problem while starting CRS. Below are the necessary details:
1) I am using Oracle 11g R2 RAC, with 2 nodes named RAC1 and RAC2
2) I have 3 disk groups named DATA, DATA1, DATA2
3) OCR and voting disks are stored in ASM disk groups
On node RAC1, CRS is started and /etc/oracle/ocr.loc has the entries below:
ocrconfig_loc=+DATA
ocrmirrorconfig_loc=+DATA1
On node RAC2, CRS is stopped and /etc/oracle/ocr.loc has the entries below:
ocrconfig_loc=+DATA
ocrmirrorconfig_loc=+DATA2
Please note the difference in “ocrmirrorconfig_loc”. The difference occurred because I updated the OCR location on RAC1 while RAC2 was down.
Now I am trying to repair the OCR on RAC2 using the command below. This command must be run as the root user while the HA services are started but CRS is stopped.
ocrconfig -repair -replace +DAT2 -replacement +DATA1
But this shows the error “PROT-21: Invalid parameter”, because the disk group must be mounted to run this command, and the disk group cannot be mounted until CRS starts. So this is a conflicting situation.
Please suggest how to overcome this problem.
Please let me know if you require any other details.
Does the command line contain a wrong disk group name, +DAT2? I don’t think you need to repair the OCR in this case. I think the mirror disk group location is already registered in the OCR from the first node. If this is *NOT* a production system, change /etc/oracle/ocr.loc on the second node to reflect the correct location, or copy this file from node1, and try starting CRS.
Hi Yogesh,
Thanks a lot for your valuable suggestion. It worked perfectly for me.
I just want to know: is it good practice to update ocr.loc manually?
I also have some questions regarding RAC concepts:
1) As per my understanding, ocrconfig commands only update the ocr.loc file on all the available nodes. Do they update anything else as well?
2) As per my understanding, if we lose all OCR copies while clusterware is running, crsd.bin stops and is not restarted automatically because no OCR is available. In this case we cannot use commands like srvctl, crsctl, and ocrconfig, which interact with the OCR through the crsd.bin process.
But how does this affect other resources like database instances, listeners, SCAN, and VIPs? These all keep running after all OCR copies are lost and crsd.bin has stopped.
Q. Is it good practice to update ocr.loc manually?
A. No. I would not recommend this in a production environment. I suggested this option because the OCR location had already been changed, but the change was not reflected on one of the nodes (which was down at the time of the change). Since the OCR is on shared storage, correcting the paths worked.
Q. As per my understanding, ocrconfig commands only update the ocr.loc file on all the available nodes. Do they update anything else as well?
A. Check the olr.loc file (Oracle Local Registry, new in 11g). This is local to every node.
Q. As per my understanding, if we lose all OCR copies while clusterware is running, crsd.bin stops and is not restarted automatically because no OCR is available. In this case we cannot use commands like srvctl, crsctl, and ocrconfig, which interact with the OCR through the crsd.bin process.
But how does this affect other resources like database instances, listeners, SCAN, and VIPs? These all keep running after all OCR copies are lost and crsd.bin has stopped.
A. CRS is down, which means ASM is down, which means the DB and all other resources are down. CRS is the *KEY* component, without which nothing will function. There is a cascading dependency between these.
Hi Yogesh,
Thanks for your great explanation.
Hi Yogesh,
I have one more question regarding the voting disk.
Could you please tell me in which file Oracle 11g stores the voting disk location?
I know that we can use “crsctl query css votedisk” to see the location of the voting disk, but from which file or place does this command pick up the information?
Well, you are querying CSS (Cluster Synchronization Services) there to get the details. I’m not aware of any file where these details are stored.
Hi Yogesh,
This is regarding Oracle 11gR2 RAC. I have a 2 node RAC on RHEL5. My voting file is on an ASM disk group named “+DATA1”, which is created on the device “/dev/sdc1”. Below are the details:
[root@rac1 ~]# oracleasm querydisk -p ASMDISK2
Disk "ASMDISK2" is a valid ASM disk
/dev/sdc1: LABEL="ASMDISK2" TYPE="oracleasm"
I just wanted to test what actually happens when I lose all my voting disks, so I zeroed out the partition using the command below:
[root@rac1 ~]# dd if=/dev/zero of=/dev/sdc1
dd: writing to `/dev/sdc1': No space left on device
10474318+0 records in
10474317+0 records out
5362850304 bytes (5.4 GB) copied, 161.053 seconds, 33.3 MB/s
My disktimeout was 200 seconds, but I waited for approximately 8-10 minutes. Then I queried the voting disk, and it still shows as ONLINE in the DATA1 disk group.
[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   06d8300bb6af4fd3bfa2e1e5cda9c440 (ORCL:ASMDISK2) [DATA1]
Located 1 voting disk(s).
Below are some further findings in this situation:
1) ASMCA no longer shows the disk group DATA1; it is deleted.
2) The cluster alert log does not display any message regarding the loss of the voting disk.
3) cssd.log does not display any message regarding the loss of the voting disk either.
4) The cluster is still up and running, and it is working properly.
As per the documentation, Oracle 11gR2 provides “rebootless node fencing” in case all voting disks fail. But it should at least write some message to the log files. And how can “crsctl query css votedisk” still report the voting disk as ONLINE?
I am not able to understand this situation. Could you please explain the reason?
Thanks in advance.