Moving OCR / VOTING disk from ASM to filesystem in 11gr2

In Oracle 11gR2, ASM can be used as the storage for the OCR and voting disks.

There are a few things we need to understand:

1. A manual or automatic OCR backup cannot be restored directly if the OCR is stored in an ASM disk group.
2. For ASM to start successfully, the CRS stack must be up.
3. At the time of an OCR restore, the OCR must not be in use, i.e. no CRS daemon may be running.

So there is a kind of cyclic dependency between ASM and CRS. There is a way to overcome this, but to keep the setup simple we can always follow the traditional approach of using a CFS (Clustered File System) to store the OCR and voting disks.

In this post, I describe how to move the OCR and voting disks from ASM to CFS.

Environment Details

SunOS node1.mydomain.com 5.10 Generic_141445-09 i86pc i386 i86pc
SunOS node2.mydomain.com 5.10 Generic_141445-09 i86pc i386 i86pc

# cat /etc/release
       Solaris 10 10/09 s10x_u8wos_08a X86
Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
 Use is subject to license terms.
    Assembled 16 September 2009

I have already created a CFS mounted at /ocrvote, with two directories in it (ocr and vote) for storing these files.

First, we’ll add the OCR to CFS.

1. Check the existing OCR details

# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2332
         Available space (kbytes) :     259788
         ID                       :  775547019
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

As we can see, the current OCR is stored in the ASM disk group +DATA.
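
When this check has to be repeated on several nodes, the configured locations can be pulled out of the ocrcheck output with a small filter. The following is only a minimal sketch: the function name ocr_locations is made up for illustration, and it assumes the output layout shown above, where each location appears on a "Device/File Name" line.

```shell
# ocr_locations (hypothetical helper): read `ocrcheck` output on stdin and
# print one configured OCR location per line.
ocr_locations() {
    awk -F' : ' '/Device\/File Name/ {
        loc = $2
        gsub(/^[ \t]+|[ \t]+$/, "", loc)   # trim the padding around the value
        print loc
    }'
}
```

It would be used as `ocrcheck | ocr_locations`, which for the output above prints the single line +DATA.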

2. Add new OCR on CFS

# ocrconfig -add /ocrvote/ocr/ocr.dat
PROT-30: The Oracle Cluster Registry location to be added is not accessible

The location has to exist before it can be added, so the workaround is to create an empty file first:
# touch /ocrvote/ocr/ocr.dat

Now retry the same operation:

# ocrconfig -add /ocrvote/ocr/ocr.dat
#
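
The workaround can be wrapped in a small helper so that the file is only created when it is missing. This is a sketch under assumptions: prepare_ocr_location is a made-up name, it expects the parent directory to exist already, and it does not touch ownership or permissions, which should be checked against the Oracle documentation for your platform.

```shell
# prepare_ocr_location (hypothetical helper): make sure an empty placeholder
# file exists so that a following `ocrconfig -add` can access the location.
prepare_ocr_location() {
    ocr_file=$1
    ocr_dir=$(dirname "$ocr_file")
    if [ ! -d "$ocr_dir" ]; then
        echo "directory $ocr_dir does not exist" >&2
        return 1
    fi
    # Only create the file when it is missing; an existing file is left alone.
    if [ ! -e "$ocr_file" ]; then
        touch "$ocr_file" || return 1
    fi
    echo "ready: $ocr_file"
}
```

A possible call, run as root, would be `prepare_ocr_location /ocrvote/ocr/ocr.dat && ocrconfig -add /ocrvote/ocr/ocr.dat`.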

Check the OCR details again

# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2184
         Available space (kbytes) :     259936
         ID                       : 1340063014
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         : /ocrvote/ocr/ocr.dat
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

The second OCR location, on CFS, is now listed.

Add a third OCR on CFS

# touch /ocrvote/ocr/ocr2.dat
# ocrconfig -add /ocrvote/ocr/ocr2.dat
# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2152
         Available space (kbytes) :     259968
         ID                       : 1340063014
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Device/File Name         : /ocrvote/ocr/ocr.dat
                                    Device/File integrity check succeeded
         Device/File Name         : /ocrvote/ocr/ocr2.dat
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

# ls -al /ocrvote/ocr/
total 25368
drwxr-xr-x   2 oragrid  oinstall      96 Sep  3 17:56 .
drwxr-xr-x   5 oragrid  oinstall      96 Sep  2 20:51 ..
-rw-r--r--   1 root     root     272756736 Sep  3 17:58 ocr.dat
-rw-r--r--   1 root     root     272756736 Sep  3 17:58 ocr2.dat
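
Before the ASM copy is removed in the next step, it may be worth confirming that the file system copies pass the integrity check. A minimal sketch, assuming the ocrcheck layout shown above; the function name count_fs_ocr is hypothetical, and any location starting with + is treated as an ASM disk group.

```shell
# count_fs_ocr (hypothetical helper): read `ocrcheck` output on stdin and print
# how many non-ASM OCR locations are followed by a successful integrity check.
count_fs_ocr() {
    awk -F' : ' '
        /Device\/File Name/ {
            loc = $2
            gsub(/^[ \t]+|[ \t]+$/, "", loc)
            next
        }
        /Device\/File integrity check succeeded/ {
            # Locations starting with "+" are ASM disk groups; skip them.
            if (loc != "" && substr(loc, 1, 1) != "+") n++
            loc = ""
            next
        }
        { loc = "" }
        END { print n + 0 }
    '
}
```

For the ocrcheck output above, `ocrcheck | count_fs_ocr` prints 2, one for each of the two files on /ocrvote.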

3. Delete OCR file from ASM DG +DATA

# ocrconfig -delete +DATA

Check the OCR details again

# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2152
         Available space (kbytes) :     259968
         ID                       : 1340063014
         Device/File Name         : /ocrvote/ocr/ocr.dat
                                    Device/File integrity check succeeded
         Device/File Name         : /ocrvote/ocr/ocr2.dat
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

The second part is to move the voting disk.

1. Start CRS in exclusive mode

This mode allows ASM to start and stay up without a voting disk and without the CRS daemon process running.
# $CRS_HOME/bin/crsctl start crs -excl

CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'node1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'node1'
CRS-2676: Start of 'ora.gipcd' on 'node1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'node1'
CRS-2676: Start of 'ora.gpnpd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'node1'
CRS-2676: Start of 'ora.cssdmonitor' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'node1'
CRS-2679: Attempting to clean 'ora.diskmon' on 'node1'
CRS-2681: Clean of 'ora.diskmon' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'node1'
CRS-2676: Start of 'ora.diskmon' on 'node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'node1'
CRS-2676: Start of 'ora.ctssd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'node1'
CRS-2676: Start of 'ora.asm' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'node1'
CRS-2676: Start of 'ora.crsd' on 'node1' succeeded

2. Check the existing voting disk

# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a8115f266b524f4abfe02d173f0ac3d2 (/dev/rdsk/c0t60060E800564F700000064F700000561d0s6) [DATA]
Located 1 voting disk(s).

One voting file exists on ASM diskgroup +DATA.
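
The File Universal Id from this listing is what the later crsctl commands refer to, so it can be handy to extract it rather than copy it by hand. A rough sketch, assuming the column layout shown above; votedisks is a hypothetical function name.

```shell
# votedisks (hypothetical helper): read `crsctl query css votedisk` output on
# stdin and print "STATE FUID PATH" for each voting file entry.
votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ {
        path = $4
        gsub(/[()]/, "", path)   # drop the parentheses around the file name
        print $2, $3, path
    }'
}
```

Used as `crsctl query css votedisk | votedisks`, it skips the header and the "Located ..." summary and keeps only the numbered entries.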

3. Remove the ASM based voting disk

# crsctl delete css votedisk a8115f266b524f4abfe02d173f0ac3d2
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM

The voting disk cannot be removed because it is stored in ASM. This is similar to bug 9294664, described in Metalink note 1060146.1.

According to this note, a voting disk stored in ASM cannot be deleted; it can only be replaced.

# crsctl replace votedisk /ocrvote/vote/vote.dat
Now formatting voting disk: /ocrvote/vote/vote.dat.
CRS-4256: Updating the profile
Successful addition of voting disk 8be0f971ea394f2fbfbd4c9056c42656.
Successful deletion of voting disk a8115f266b524f4abfe02d173f0ac3d2.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

Add another voting disk

# /oragrid/11.2/bin/crsctl add css votedisk /ocrvote/vote/vote2.dat
Now formatting voting disk: /ocrvote/vote/vote2.dat.
CRS-4603: Successful addition of voting disk /ocrvote/vote/vote2.dat.

Check the newly added voting disk

# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   8be0f971ea394f2fbfbd4c9056c42656 (/ocrvote/vote/vote.dat) []
 2. ONLINE   239fa2fbbfb54f12bf519485cd10821e (/ocrvote/vote/vote2.dat) []
Located 2 voting disk(s).

# ls -al /ocrvote/vote/
total 82048
drwxr-xr-x   2 oragrid  oinstall      96 Sep  3 18:25 .
drwxr-xr-x   5 oragrid  oinstall      96 Sep  2 20:51 ..
-rw-r-----   1 oragrid  root     21004288 Sep  3 18:25 vote.dat
-rw-r-----   1 oragrid  root     21004288 Sep  3 18:25 vote2.dat
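
One thing to keep in mind with two voting files: a node has to be able to access more than half of the configured voting files, so an odd number (three, for example) is usually recommended. The sketch below counts the ONLINE entries and prints the majority that is needed; votedisk_majority is a hypothetical name, and the parsing assumes the listing format shown above.

```shell
# votedisk_majority (hypothetical helper): read `crsctl query css votedisk`
# output on stdin, count the ONLINE voting files and print how many of them
# a node must be able to access (strictly more than half).
votedisk_majority() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ }
         END { printf "online=%d majority=%d\n", n, int(n / 2) + 1 }'
}
```

For the two files above, `crsctl query css votedisk | votedisk_majority` prints online=2 majority=2, which means losing either file already breaks the majority; with three files the majority would still be 2.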

4. Stop the CRS
# crsctl stop crs

5. Cross-verify the /var/opt/oracle/ocr.loc file on all cluster nodes (the path is platform specific)

# cat /var/opt/oracle/ocr.loc
#Device/file +DATA getting replaced by device /ocrvote/ocr/ocr.dat
ocrconfig_loc=/ocrvote/ocr/ocr.dat
ocrmirrorconfig_loc=/ocrvote/ocr/ocr2.dat
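
To cross-verify the file, one option is to collect a copy of ocr.loc from each node and compare them while ignoring comment lines. A minimal sketch; the function names, and the idea of first copying the files to a local directory (for example with scp), are assumptions and not part of the procedure above.

```shell
# ocr_loc_entries (hypothetical helper): print the settings of an ocr.loc copy,
# without comments and blank lines, in a stable order.
ocr_loc_entries() {
    grep -v -e '^#' -e '^[[:space:]]*$' "$1" | sort
}

# same_ocr_loc (hypothetical helper): succeed only if both copies hold the
# same settings.
same_ocr_loc() {
    [ "$(ocr_loc_entries "$1")" = "$(ocr_loc_entries "$2")" ]
}
```

With the copies saved under hypothetical names such as /tmp/ocr.loc.node1 and /tmp/ocr.loc.node2, a check could look like `same_ocr_loc /tmp/ocr.loc.node1 /tmp/ocr.loc.node2 || echo "ocr.loc differs between nodes"`.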

6. Start the CRS
# crsctl start crs

Now our OCR and voting files are on CFS.


11 Responses to Moving OCR / VOTING disk from ASM to filesystem in 11gr2

  1. Tom says:

    Just adding:
    crsctl start crs -excl -nocrs
    starts ohasd, cssd and ctssd but does not start crsd.
    This allows you to work with the registry files, as well as other manipulations,

    such as ocrconfig -import or ocrconfig -restore.

  2. pankaj sharma says:

    Hi Yogesh,
    I have a scenario regarding repairing the OCR in Oracle 11g R2.
    I am getting a problem while starting CRS. Below are the necessary details:

    1) I am using oracle 11g R2 RAC, with 2 node named RAC1 and RAC2
    2) I have 3 diskgroup named DATA, DATA1, DATA2
    3) OCR and voting disks are stored on ASM diskgroup

    On node RAC1, CRS is started and /etc/oracle/ocr.loc has the following entries:

    ocrconfig_loc=+DATA
    ocrmirrorconfig_loc=+DATA1

    On node RAC2, CRS is stopped and /etc/oracle/ocr.loc has the following entries:

    ocrconfig_loc=+DATA
    ocrmirrorconfig_loc=+DATA2

    Please note the difference in “ocrmirrorconfig_loc”. The difference occurred because I updated the OCR location on RAC1 while RAC2 was down.

    Now I am trying to repair the OCR on RAC2 using the command below. This command must be run as the root user while the HA service is started but CRS is stopped.

    ocrconfig -repair -replace +DAT2 -replacement +DATA1

    but this shows the error “PROT-21: Invalid parameter”, because the diskgroup must be mounted to run this command. But the diskgroup cannot be mounted until CRS starts, so this is a conflicting situation.

    Please suggest how to overcome this problem.

    Please let me know if you require any other details.

    • The command line contains the wrong DG name, +DAT2? I don’t think you need to repair the OCR in this case. I think the mirror DG location is already registered in the OCR from the first node. If this is *NOT* a production system, change /etc/oracle/ocr.loc on the second node to reflect the correct location, or copy this file from node1, and try starting CRS.

      • pankaj sharma says:

        Hi Yogesh,
        Thanks a lot for your valuable suggestion. It worked perfectly for me.
        I just want to know: is it good practice to update ocr.loc manually?

        I also have some questions regarding RAC concepts:
        1) As per my understanding, ocrconfig commands only update the ocr.loc file on all the available nodes. Do they update anything else?
        2) As per my understanding, if we lose all OCR copies while the clusterware is running, crsd.bin stops and will not restart automatically because no OCR is available. In this case we won’t be able to use commands like srvctl, crsctl, and ocrconfig, which interact with the OCR through the crsd.bin process.
        But how does this affect other resources like database instances, listeners, SCAN and VIPs? These all keep running after all OCR copies are lost and crsd.bin has stopped.

  3. Q. Is it good practice to update ocr.loc manually?
    A. No, I wouldn’t recommend this in a production environment. I suggested this option because the OCR had already been changed, but the change was not reflected on one of the nodes (which was down at the time of the change). Since the OCR is on shared storage, correcting the paths worked.

    Q. As per my understanding, ocrconfig commands only update the ocr.loc file on all the available nodes. Do they update anything else?
    A. Check the olr.loc file (Oracle Local Registry, new in 11g). This is local to every node.

    Q. As per my understanding, if we lose all OCR copies while the clusterware is running, crsd.bin stops and will not restart automatically because no OCR is available. In this case we won’t be able to use commands like srvctl, crsctl, and ocrconfig, which interact with the OCR through the crsd.bin process.
    But how does this affect other resources like database instances, listeners, SCAN and VIPs? These all keep running after all OCR copies are lost and crsd.bin has stopped.
    A. CRS is down, which means ASM is down, which means the DB and all other resources are down. CRS is the *KEY* component, without which nothing will function. There is a cascading dependency between these.

  4. pankaj sharma says:

    Hi Yogesh,
    I have one more question regarding the voting disk.
    Could you please tell me in which file Oracle 11g stores the voting disk location?
    I know that we can use “crsctl query css votedisk” to see the location of the voting disk, but from which file or place does this command pick up the information?

  5. pankaj sharma says:

    Hi Yogesh,

    This is regarding Oracle 11gR2 RAC. I have a 2-node RAC on RHEL5. My voting file is on an ASM diskgroup named “+DATA1”, which is created on the device “/dev/sdc1”. Below are the details:

    [root@rac1 ~]# oracleasm querydisk -p ASMDISK2
    Disk "ASMDISK2" is a valid ASM disk
    /dev/sdc1: LABEL="ASMDISK2" TYPE="oracleasm"

    I just wanted to test what actually happens when I lose all my voting disks. So I zeroed out the partition using the command below:

    [root@rac1 ~]# dd if=/dev/zero of=/dev/sdc1
    dd: writing to `/dev/sdc1': No space left on device
    10474318+0 records in
    10474317+0 records out
    5362850304 bytes (5.4 GB) copied, 161.053 seconds, 33.3 MB/s

    My disktimeout was 200 seconds, but I waited for approximately 8-10 minutes. Then I queried the voting disk, and it was still shown as ONLINE in the DATA1 diskgroup.

    [root@rac1 ~]# crsctl query css votedisk
    ##  STATE    File Universal Id                File Name Disk group
    --  -----    -----------------                --------- ---------
     1. ONLINE   06d8300bb6af4fd3bfa2e1e5cda9c440 (ORCL:ASMDISK2) [DATA1]
    Located 1 voting disk(s).

    Also, below are some findings in this situation:
    1) ASMCA no longer shows the diskgroup DATA1; it has been deleted.
    2) The cluster alert log does not show any message regarding the loss of the voting disk.
    3) cssd.log also does not show any message regarding the loss of the voting disk.
    4) The cluster is still up and running, and is working properly.

    As per the documentation, Oracle 11gR2 provides “rebootless node fencing” in case all voting disks fail. But it should at least log a message, so how can “crsctl query css votedisk” still report that voting disk as ONLINE?

    I am not able to understand this situation. Could you please explain the reason?

    Thanks in advance.

