Step By Step: Upgrade 11gR1 RAC to 11gR2 RAC on Oracle Enterprise Linux 5 (32 bit) Platform.

By Bhavin Hingu

 


 

This document walks through upgrading a 3-node 11gR1 RAC to 11gR2 RAC, step by step. The upgrade path I chose moves the 11gR1 Clusterware and ASM to 11gR2 Grid Infrastructure. Although a rolling upgrade of ASM and CRS is possible from 11gR1, I prefer to perform this activity in a scheduled outage window. Upgrading the 11gR1 RAC database requires an outage; the total downtime can be minimized or avoided by using a standby database in the upgrade process (not covered here).

 

 

 

 

 

| Component | Existing 11gR1 RAC setup (before upgrade) | Target 11gR2 RAC setup (after upgrade) |
|---|---|---|
| Clusterware | Oracle 11gR1 Clusterware 11.1.0.6 | Oracle 11gR2 Grid Infrastructure 11.2.0.1 |
| ASM Binaries | 11gR1 RAC 11.1.0.6 | Oracle 11gR2 Grid Infrastructure 11.2.0.1 |
| Cluster Name | Lab | lab |
| Cluster Nodes | node1, node2, node3 | node1, node2, node3 |
| Clusterware Home | /u01/app/oracle/crs (CRS_HOME) | /u01/app/grid11201 (GRID_HOME) |
| Clusterware Owner | oracle:(oinstall, dba) | oracle:(oinstall, dba) |
| VIPs | node1-vip, node2-vip, node3-vip | node1-vip, node2-vip, node3-vip |
| SCAN | N/A | lab-scan.hingu.net |
| SCAN_LISTENER Host/port | N/A | Scan VIPs Endpoint: (TCP:1525) |
| OCR and Voting Disks Storage Type | Raw Devices | Raw Devices |
| OCR Disks | /dev/raw/raw1, /dev/raw/raw2 | /dev/raw/raw1, /dev/raw/raw2 |
| Voting Disks | /dev/raw/raw3, /dev/raw/raw4, /dev/raw/raw5 | /dev/raw/raw3, /dev/raw/raw4, /dev/raw/raw5 |
| ASM_HOME | /u01/app/oracle/asm11gr1 | /u01/app/grid11201 |
| ASM_HOME Owner | oracle:(oinstall, dba) | oracle:(oinstall, dba) |
| ASMLib user:group | oracle:oinstall | oracle:oinstall |
| ASM LISTENER | LISTENER (TCP:1521) | LISTENER (TCP:1521) |
| DB Binaries | Oracle 11gR1 RAC (11.1.0.6) | Oracle 11gR2 RAC (11.2.0.1) |
| DB_HOME | /u01/app/oracle/db11gr1 | /u01/app/oracle/db11201 |
| DB_HOME Owner | oracle:(oinstall, dba) | oracle:(oinstall, dba) |
| DB LISTENER | LAB_LISTENER | LAB_LISTENER |
| DB Listener Host/port | node1-vip, node2-vip, node3-vip (port 1530) | node1-vip, node2-vip, node3-vip (port 1530) |
| DB Storage Type, File Management | ASM with OMFs | ASM with OMFs |
| ASM diskgroups for DB and FRA | DATA, FRA | DATA, FRA |
| OS Platform | Oracle Enterprise Linux 5.5 (32 bit) | Oracle Enterprise Linux 5.5 (32 bit) |

 

NOTE: The Grid Infrastructure owner must be the same as the 11gR1 CRS owner. Role Separation is not possible in upgrades.

 

HERE’s an existing 11gR1 RAC Setup in detail

 

The upgrade process is composed of the four stages below:

 

·         Pre-Upgrade Tasks

·         Upgrade 11gR1 Clusterware to the 11gR2 Grid Infrastructure (11.2.0.1).

·         Upgrade 11gR1 ASM to 11gR2 Grid Infrastructure.

·         Upgrade Database from 11gR1 RAC to 11gR2 RAC.

 

Pre-Upgrade tasks:

 

Install/Upgrade RPMs required for 11gR2 RAC Installation

Add SCAN VIPs to the DNS

Setup of Network Time Protocol

Start the nscd on all the RAC nodes

Backing up 11gR1 existing HOMEs and database

 

Minimum Required RPMs for 11gR2 RAC on OEL 5.5 (All the 3 RAC Nodes):

 

The command below checks whether the required RPMs are installed. Any missing RPMs can be installed from the OEL media pack.

 

For 11gR2:

 

rpm -q binutils compat-libstdc++-33 elfutils-libelf elfutils-libelf-devel elfutils-libelf-devel-static \
              gcc gcc-c++ glibc glibc-common glibc-devel glibc-headers kernel-headers ksh libaio libaio-devel \
              libgcc libgomp libstdc++ libstdc++-devel make numactl-devel sysstat unixODBC unixODBC-devel
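The output of the rpm query is long, so it can help to filter it down to the packages that are actually missing. The following is a minimal sketch, assuming rpm prints its usual "package NAME is not installed" message for an absent package; the helper name missing_rpms is made up for this example.

```shell
# missing_rpms: read `rpm -q` output on stdin and print only the names of
# packages that rpm reports as not installed (hypothetical helper name).
missing_rpms() {
    awk '/^package .* is not installed$/ { print $2 }'
}

# Example on a RAC node (needs rpm, so it is left commented out here):
#   rpm -q binutils gcc numactl-devel sysstat | missing_rpms
```

An empty result means every queried package is present on that node.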

 

Comparing the requirements of both releases, I only had to install the RPM below.

 

numactl-devel -> Located on the 3rd CD of the OEL 5.5 media pack.

 

[root@node1 ~]# rpm -ivh numactl-devel-0.9.8-11.el5.i386.rpm

warning: numactl-devel-0.9.8-11.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 1e5e0159

Preparing...                ########################################### [100%]

   1:numactl-devel          ########################################### [100%]

[root@node1 ~]#

 

I had to upgrade the cvuqdisk RPM by removing the old version and installing the newer one. The rootupgrade.sh script also takes care of this step.

 

cvuqdisk -> Available on the Grid Infrastructure media (under the rpm folder)

 

rpm -e cvuqdisk

export CVUQDISK_GRP=oinstall

echo $CVUQDISK_GRP

rpm -ivh cvuqdisk-1.0.7-1.rpm

 

 

Add SCAN VIPs to DNS:

 

SCAN VIPs to configure in DNS; lab-scan.hingu.net resolves to all three:

192.168.2.151

192.168.2.152

192.168.2.153

 

HERE is the existing DNS setup. In that setup, the two files below were modified to add the SCAN VIPs to the DNS; the new entries are the lab-scan records under the SCAN VIP comments.

 

/var/named/chroot/var/named/hingu.net.zone

/var/named/chroot/var/named/2.168.192.in-addr.arpa.zone

 

/var/named/chroot/var/named/hingu.net.zone

 

$TTL    1d

hingu.net.  IN    SOA   lab-dns.hingu.net. root.hingu.net. (

    100        ; se = serial number

    8h         ; ref = refresh

    5m         ; ret = update retry

    3w         ; ex = expiry

    3h         ; min = minimum

    )

 

    IN    NS    lab-dns.hingu.net.

 

; DNS server

 

lab-dns    IN    A    192.168.2.200

 

; RAC Nodes Public name

 

node1          IN    A    192.168.2.1

node2          IN    A    192.168.2.2

node3          IN    A    192.168.2.3

 

; RAC Nodes Public VIPs

 

node1-vip      IN    A    192.168.2.51

node2-vip      IN    A    192.168.2.52

node3-vip      IN    A    192.168.2.53

 

; 3 SCAN VIPs

 

lab-scan       IN    A    192.168.2.151

lab-scan       IN    A    192.168.2.152

lab-scan       IN    A    192.168.2.153

 

; Storage Network

 

nas-server IN    A    192.168.1.101

node1-nas  IN    A    192.168.1.1

node2-nas  IN    A    192.168.1.2

node3-nas  IN    A    192.168.1.3

 

 

/var/named/chroot/var/named/2.168.192.in-addr.arpa.zone

 

$TTL    1d

@   IN    SOA   lab-dns.hingu.net. root.hingu.net. (

    100        ; se = serial number

    8h         ; ref = refresh

    5m         ; ret = update retry

    3w         ; ex = expiry

    3h         ; min = minimum

    )

 

    IN    NS    lab-dns.hingu.net.

 

; DNS machine name in reverse

200        IN    PTR lab-dns.hingu.net.

 

; RAC Nodes Public Name in Reverse

 

1          IN    PTR node1.hingu.net.

2          IN    PTR node2.hingu.net.

3          IN    PTR node3.hingu.net.

 

; RAC Nodes Public VIPs in Reverse

 

51          IN    PTR      node1-vip.hingu.net.

52          IN    PTR      node2-vip.hingu.net.

53          IN    PTR      node3-vip.hingu.net.

 

; RAC Nodes SCAN VIPs in Reverse

 

151         IN    PTR      lab-scan.hingu.net.

152         IN    PTR      lab-scan.hingu.net.

153         IN    PTR      lab-scan.hingu.net.
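The reverse zone has to mirror the forward zone, and typing the PTR records by hand is an easy place to make a mistake. As a rough sketch, the PTR lines can be derived from the A records. The helper name a_to_ptr is hypothetical, and it only uses the last octet of each address, so it assumes a single /24 reverse zone like the one above.

```shell
# a_to_ptr <domain>: read forward-zone lines ("name IN A a.b.c.d") on stdin
# and print matching reverse-zone lines ("d IN PTR name.domain.").
# Hypothetical helper; comments and non-A records are ignored.
a_to_ptr() {
    awk -v domain="$1" '
        $2 == "IN" && $3 == "A" {
            n = split($4, octet, ".")
            if (n == 4) printf "%s\tIN\tPTR\t%s.%s.\n", octet[4], $1, domain
        }'
}

# Example on the DNS server (left commented out):
#   grep lab-scan /var/named/chroot/var/named/hingu.net.zone | a_to_ptr hingu.net
```

The generated lines still need to be reviewed before they are pasted into the reverse zone file.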

 

Restart the DNS Service (named):

 

service named restart

 

NOTE: nslookup for lab-scan should return all three SCAN addresses, with their order changing between lookups.
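A quick way to confirm the SCAN name resolves to three addresses is to pull the answer addresses out of the nslookup output and count them. This is only a sketch: the helper name scan_ips is made up, and it assumes the usual nslookup layout where the answer section starts at the first "Name:" line.

```shell
# scan_ips: read nslookup output on stdin and print the answer addresses,
# skipping the "Address:" line that belongs to the DNS server itself.
scan_ips() {
    awk '/^Name:/ { in_answer = 1 } in_answer && /^Address/ { print $2 }'
}

# Example on a RAC node (left commented out):
#   nslookup lab-scan.hingu.net | scan_ips | wc -l    # should print 3
```

Running the lookup a few times and comparing the first address printed is a simple check that the order rotates.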

 

Network Time Protocol Setting (On all the RAC Nodes):

      

The Oracle Cluster Time Synchronization Service (ctss) was chosen over the Linux-provided ntpd. So ntpd needs to be deactivated and its configuration removed to avoid any conflict with ctss.

 

# /sbin/service ntpd stop

# chkconfig ntpd off

# mv /etc/ntp.conf /etc/ntp.conf.org

 

Also remove the following file:

 

/var/run/ntpd.pid
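Before starting the upgrade it is worth confirming that no NTP files are left on any node, since, as far as I understand it, ctss only takes over time synchronization when it finds no NTP configuration. The check below is a small sketch; the helper name ntp_leftovers and its optional root-prefix argument (used only so it can be tried against a scratch directory) are my own additions.

```shell
# ntp_leftovers [root]: print the NTP files that still exist under the given
# root directory (defaults to the real filesystem). Hypothetical helper.
ntp_leftovers() {
    root="${1:-}"
    for f in /etc/ntp.conf /var/run/ntpd.pid; do
        if [ -e "$root$f" ]; then
            echo "$f"
        fi
    done
    return 0
}

# Example on a RAC node (left commented out):
#   ntp_leftovers            # no output means both files are gone
#   rm -f /var/run/ntpd.pid  # removes the pid file mentioned above
```

After the upgrade, crsctl check ctss from the Grid home should report which mode ctss is running in.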

 

 

Network Service Cache Daemon (all the RAC nodes)

 

The Network Service Cache Daemon was started on all the RAC nodes.

 

service nscd start

 

 

Backing Up ORACLE_HOMEs/database:

 

Steps I followed to take the Backup of ORACLE_HOMEs before the upgrade: (This can be applied to 11gR1 and 10g HOMEs)

 

On node1:

 

mkdir backup

cd backup

 

dd if=/dev/raw/raw1  of=ocr_disk_11gr1.bkp
dd if=/dev/raw/raw3  of=voting_disk_11gr1.bkp

tar cvf node1_crs_11gr1.tar /u01/app/oracle/crs/*
tar cvf node1_asm_11gr1.tar /u01/app/oracle/asm11gr1/*
tar cvf node1_db_11gr1.tar /u01/app/oracle/db11gr1/*
tar cvf node1_etc_oracle.tar /etc/oracle/*

 

cp /etc/inittab etc_inittab

 

mkdir etc_init_d

cd etc_init_d

cp /etc/init.d/init* .

 

On node2:

 

mkdir backup

cd backup

 

tar cvf node2_crs_11gr1.tar /u01/app/oracle/crs/*
tar cvf node2_asm_11gr1.tar /u01/app/oracle/asm11gr1/*
tar cvf node2_db_11gr1.tar /u01/app/oracle/db11gr1/*
tar cvf node2_etc_oracle.tar /etc/oracle/*

 

cp /etc/inittab etc_inittab

 

mkdir etc_init_d

cd etc_init_d

cp /etc/init.d/init* .

 

On node3:

 

mkdir backup

cd backup

 

tar cvf node3_crs_11gr1.tar /u01/app/oracle/crs/*
tar cvf node3_asm_11gr1.tar /u01/app/oracle/asm11gr1/*
tar cvf node3_db_11gr1.tar /u01/app/oracle/db11gr1/*
tar cvf node3_etc_oracle.tar /etc/oracle/*

 

cp /etc/inittab etc_inittab

 

mkdir etc_init_d

cd etc_init_d

cp /etc/init.d/init* .
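A backup is only useful if it can be read back, and the tar commands above use a trailing /* that the shell expands without hidden (dot) files at the top level of each home. As a sketch of an alternative, the helper below archives the directory itself and then lists the archive to confirm it is readable. The name backup_dir is hypothetical and this is not the procedure I actually ran.

```shell
# backup_dir <archive.tar> <directory>: archive the whole directory
# (including dot files) relative to its parent, then verify the archive
# by listing it. Hypothetical helper.
backup_dir() {
    archive="$1"
    src="$2"
    tar -cf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Listing fails if the archive is truncated or corrupt.
    tar -tf "$archive" > /dev/null
}

# Example on node1 (left commented out):
#   backup_dir node1_crs_11gr1.tar /u01/app/oracle/crs
```

Because the paths are stored relative to the parent directory, the archive can be extracted to a scratch location for inspection without overwriting the live home.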

 

 

 

RMAN full database backup was taken.

 

With this, the pre-upgrade steps are complete and the cluster is ready to be upgraded to 11gR2 Grid Infrastructure.

 

 

 

Step By Step: Upgrade Clusterware, ASM and Database from 11.1.0.6 to 11.2.0.1.

 

Upgrade 11gR1 CRS to 11gR2 Grid Infrastructure:

 

Oracle documentation recommends leaving all the RAC instances up and running during the upgrade because the rootupgrade.sh script brings down the CRS stack itself. I prefer to at least shut down the database cleanly before starting the upgrade.

 

·         Stop the labdb database.

·         Start the runInstaller from the 11gR2 Grid Infrastructure software stage.

 

Grid Infrastructure Upgrade process:

 

| Installer screen | Value entered |
|---|---|
| Installation Option | Upgrade Grid Infrastructure |
| Product Language | English |
| Node Selection | Select all the nodes |
| SCAN Information | SCAN name: lab-scan.hingu.net; SCAN port: 1525 |
| ASM Monitor Password | Password entered |
| Prerequisite Checks | Verify all the minimum prerequisites are satisfied successfully |
| Privileged Operating System Groups | ASM Database Administrator (OSDBA) Group: dba; ASM Instance Administrator Operator (OSOPER) Group: dba; ASM Instance Administrator (OSASM) Group: oinstall |
| Installation Location | Oracle Base: /u01/app/oracle; Software Location: /u01/app/grid11201 |
| Summary Screen | Verified the information here and pressed “Finish” to start installation. |

 

At the end of the installation, the rootupgrade.sh script needs to be executed as root user on all the nodes one by one.

 

/u01/app/grid11201/rootupgrade.sh
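I ran the script by hand on each node, but the "one node at a time, stop on failure" rule can also be written down as a small loop. The sketch below is an illustration only: run_sequentially and ssh_root are made-up helper names, and root ssh access between the nodes is an assumption that is not part of this setup.

```shell
# run_sequentially <runner> <command> <node>...
# Calls "<runner> <node> <command>" for each node in order and stops at the
# first failure, so a broken node is not followed by further upgrades.
run_sequentially() {
    runner="$1"
    cmd="$2"
    shift 2
    for node in "$@"; do
        echo "== $node: $cmd"
        if ! "$runner" "$node" "$cmd"; then
            echo "stopped: command failed on $node" >&2
            return 1
        fi
    done
}

# Assumed runner: root ssh access to each node (not stated in this article).
ssh_root() {
    ssh "root@$1" "$2"
}

# Real use, left commented out:
#   run_sequentially ssh_root /u01/app/grid11201/rootupgrade.sh node1 node2 node3
```

Passing the runner as an argument keeps the loop easy to dry-run with a function that only echoes the command.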

 

The rootupgrade.sh script failed on the last node (node3) with the error below; it appeared that CRS died after the OCR had been upgraded successfully.

 

04.jpg

 

The alertnode3.log showed the error below. It seemed that both OCR disks, which are raw devices, became inaccessible after the successful upgrade of the OCR. I tried this upgrade 2-3 times and it errored out at exactly the same place every time. Because the OCR was upgraded successfully, I decided to reboot all the nodes at this stage to see whether the HA stack would come back up after the reboots. I also wanted to confirm the OCR integrity via ocrcheck and make sure there was no logical corruption at the block level.

 

/u01/app/grid11201/log/node3/alertnode3.log:

 

[ctssd(22505)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2011-10-13 15:30:58.341

[ohasd(21071)]CRS-2765:Resource 'ora.crsd' has failed on server 'node3'.

2011-10-13 15:30:58.830

[client(25091)]CRS-1006:The OCR location /dev/raw/raw2 is inaccessible. Details in /u01/app/grid11201/log/node3/client/ocrconfig_25091.log.

2011-10-13 15:30:58.845

[client(25091)]CRS-1006:The OCR location /dev/raw/raw1 is inaccessible. Details in /u01/app/grid11201/log/node3/client/ocrconfig_25091.log.

2011-10-13 15:33:44.000

[crsd(25138)]CRS-1012:The OCR service started on node node3.

2011-10-13 15:36:45.355

 

/u01/app/grid11201/log/node3/client/ocrconfig_25091.log:

 

Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.

2011-10-13 15:27:59.695: [ OCRCONF][3047016128]ocrconfig starts...

2011-10-13 15:27:59.722: [ OCRCONF][3047016128]Exporting OCR data to [/u01/app/grid11201/cdata/lab/ocr11.2.0.1.0_upg_node3.ocr]

2011-10-13 15:30:58.830: [  OCRRAW][3047016128]proprior: Header check from OCR device 1 offset 0 failed (26).

2011-10-13 15:30:58.845: [  OCRRAW][3047016128]proprior: Header check from OCR device 0 offset 0 failed (22).

2011-10-13 15:30:58.845: [  OCRRAW][3047016128]ibctx: Failed to read the whole bootblock.  Assumes invalid format.

2011-10-13 15:30:58.845: [  OCRRAW][3047016128]rtnode:2: Problem [26] reading the tnode 553. Returning [123]

2011-10-13 15:30:58.846: [  OCRRAW][3047016128]prgval: problem reading the tnode

2011-10-13 15:30:58.846: [ OCRCONF][3047016128]Error[104]: Failed to get key value for key CRS.CUR.ora!node2!ons.USR_ORA_PRECONNECT

2011-10-13 15:30:58.847: [ OCRCONF][3047016128]Exiting [status=failed]...

 

I rebooted all the RAC nodes at this point. After the reboot the HA stack came back up successfully using the new 11gR2 Grid Infrastructure, but it did not have all the resources configured. I had to manually configure the SCAN, SCAN_LISTENER, OC4J and ACFS as shown. The database, DB services and GSD were down when the 11gR2 CRS came back up on all the nodes after the reboot. GSD was expected to remain down as it is disabled by default in 11gR2. I noticed that srvctl no longer worked to start the DB service oltp; I had to use crs_start from the Grid home, which worked fine.

 

07.jpg

 

Manual Tasks that were performed to complete the 11gR2 Grid Infrastructure configuration:

 

 

As oracle:

 

/u01/app/grid11201/bin/srvctl enable nodeapps -g
/u01/app/grid11201/bin/srvctl start nodeapps -n node1
/u01/app/grid11201/bin/srvctl start nodeapps -n node2
/u01/app/grid11201/bin/srvctl start nodeapps -n node3

/u01/app/oracle/db11gr1/bin/srvctl start database -d labdb

 

/u01/app/grid11201/bin/crs_start ora.labdb.oltp.labdb1.srv

/u01/app/grid11201/bin/crs_start ora.labdb.oltp.labdb2.srv

/u01/app/grid11201/bin/crs_start ora.labdb.oltp.labdb3.srv

 

As root:

 

/u01/app/grid11201/bin/srvctl add scan -n lab-scan.hingu.net

/u01/app/grid11201/bin/crsctl add type ora.registry.acfs.type -basetype ora.local_resource.type -file /u01/app/grid11201/crs/template/registry.acfs.type

/u01/app/grid11201/bin/crsctl add resource ora.registry.acfs -type ora.registry.acfs.type

 

As oracle:

 

/u01/app/grid11201/bin/srvctl add scan_listener -l listener -s -p TCP:1525

/u01/app/grid11201/bin/srvctl start scan

/u01/app/grid11201/bin/srvctl start scan_listener

/u01/app/grid11201/bin/srvctl add oc4j

/u01/app/grid11201/bin/srvctl start oc4j

/u01/app/grid11201/bin/crs_start ora.registry.acfs

 

 

As root: (verify the OCR integrity and logical corruption after the upgrade)

 

/u01/app/grid11201/bin/ocrcheck

/u01/app/grid11201/bin/crsctl query css votedisk
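Instead of reading the ocrcheck output by eye on every node, the two results that matter here can be checked with a small filter. This is a sketch under assumptions: the helper name ocr_ok is made up, and the matched status lines are what I would expect 11.2 ocrcheck to print when run as root, so adjust them if your output is worded differently.

```shell
# ocr_ok: read ocrcheck output on stdin; succeed only if both the registry
# integrity check and the logical corruption check report success and no
# check reports failure. The matched strings are assumptions.
ocr_ok() {
    awk '
        /Cluster registry integrity check succeeded/ { integrity = 1 }
        /Logical corruption check succeeded/         { logical = 1 }
        /check failed/                               { failed = 1 }
        END { exit !(integrity && logical && !failed) }'
}

# Example as root on a RAC node (left commented out):
#   /u01/app/grid11201/bin/ocrcheck | ocr_ok && echo "OCR looks healthy"
```

The filter deliberately fails when the logical corruption line is missing, which is one reason to run ocrcheck as root as noted above.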

 

 

26.JPG

 

After configuring the CRS resources manually, the final CRS stack looked like below:

 

27.JPG

 

The OCR integrity check and the logical corruption check both passed. Both disks looked fine.

 

28.JPG

 

 

HERE’s the detailed Screen Shots of Upgrading 11gR1 CRS to 11gR2 Grid Infrastructure

 

Upgrade 11gR1 ASM to 11gR2 Grid Infrastructure:

 

·         Stopped the labdb database.

·         Invoked the asmca from the 11gR2 Grid Infrastructure HOME (/u01/app/grid11201).

·         Moved the listener LISTENER from the 11gR1 ASM_HOME to the 11gR2 Grid Infrastructure HOME.

·         Started the labdb database using 11gR1 srvctl

·         Started the DB service oltp using /u01/app/grid11201/bin/crs_start

 

 

/u01/app/oracle/db11gr1/bin/srvctl stop database -d labdb

/u01/app/grid11201/bin/asmca

 

04.JPG

 

Move the Listener “LISTENER” from 11gR1 ASM Home to 11gR2 Grid Infrastructure:

 

/u01/app/oracle/db11gr1/bin/srvctl stop listener -l LISTENER_NODE1 -n node1

/u01/app/oracle/db11gr1/bin/srvctl stop listener -l LISTENER_NODE2 -n node2

/u01/app/oracle/db11gr1/bin/srvctl stop listener -l LISTENER_NODE3 -n node3

 

/u01/app/oracle/db11gr1/bin/srvctl remove listener -l LISTENER_NODE1 -n node1

/u01/app/oracle/db11gr1/bin/srvctl remove listener -l LISTENER_NODE2 -n node2

/u01/app/oracle/db11gr1/bin/srvctl remove listener -l LISTENER_NODE3 -n node3

 

Add the listener “LISTENER” using netca from 11gR2 Grid Infrastructure Home (TCP:1521)

 

/u01/app/grid11201/bin/netca

 

/u01/app/oracle/db11gr1/bin/srvctl start database -d labdb

/u01/app/grid11201/bin/crs_start ora.labdb.oltp.labdb1.srv

/u01/app/grid11201/bin/crs_start ora.labdb.oltp.labdb2.srv

/u01/app/grid11201/bin/crs_start ora.labdb.oltp.labdb3.srv

 

HERE’s the detailed Screen Shots of Upgrading 11gR1 ASM to 11gR2 Grid Infrastructure

 

 

Upgrade 11gR1 RAC Database to 11gR2 RAC:

 

Start the runInstaller from 11g R2 Real Application Cluster (RAC) Software Location:

 

/home/oracle/db11201/database/runInstaller

 

Real Application Cluster installation process:

 

| Installer screen | Value entered |
|---|---|
| Configure Security Updates | Email: bhavin@oracledba.org; ignore the “Connection Failed” alert. |
| Installation Option | Install database software only |
| Node Selection | Select all the nodes (node1, node2 and node3) |
| Product Language | English |
| Database Edition | Enterprise Edition |
| Installation Location | Oracle Base: /u01/app/oracle; Software Location: /u01/app/oracle/db11201 |
| Operating System Groups | Database Administrator (OSDBA) Group: dba; Database Operator (OSOPER) Group: oinstall |
| Summary Screen | Verified the information here and pressed “Finish” to start installation. |

 

At the end of the installation, the script below needs to be executed on all the nodes as the root user.

 

/u01/app/oracle/db11201/root.sh

 

Upgrade the Database labdb using dbua:

 

·         Invoked the dbua from the 11gR2 RAC HOME (/u01/app/oracle/db11201).

·         Fixed any Critical Warnings returned from pre-Upgrade Utility by DBUA.

·         After the Successful Upgrade of Database to 11.2.0.1, moved the Listener LAB_LISTENER to 11gR2 HOME

·         Updated the REMOTE_LISTENER parameter to lab-scan.hingu.net:1525

·         Stopped the database labdb

·         Rebooted all the nodes and verified that ASM, database, listeners and other resources came back up without any issue.

 

/u01/app/oracle/db11201/bin/dbua

 

 

03.JPG

 

The upgrade of 11gR1 RAC database labdb finished without any error and here is the upgrade result.

 

07.JPG

 

 

Move the Listener “LAB_LISTENER” from 11gR1 RAC DB Home to 11gR2 RAC database Home:

 

 

Copy tnsnames.ora from the old 11gR1 HOME to the 11gR2 HOME on each node.

 

ssh node3 cp /u01/app/oracle/db11gr1/network/admin/tnsnames.ora /u01/app/oracle/db11201/network/admin/

ssh node2 cp /u01/app/oracle/db11gr1/network/admin/tnsnames.ora /u01/app/oracle/db11201/network/admin/

ssh node1 cp /u01/app/oracle/db11gr1/network/admin/tnsnames.ora /u01/app/oracle/db11201/network/admin/

 

Invoke netca from the 11gR1 HOME to remove the listener LAB_LISTENER

 

/u01/app/oracle/db11gr1/bin/netca

 

Invoke netca from the 11gR2 HOME to add the listener LAB_LISTENER on the same port 1530

 

/u01/app/oracle/db11201/bin/netca

 

Select the same end point TCP:1530.

 

 

Modified the REMOTE_LISTENER parameter:

 

alter system set remote_listener='lab-scan.hingu.net:1525' scope=both sid='*';

 

Restarted the database to verify that the database instances are appropriately registered with their respective listeners.

 

srvctl stop database -d labdb

srvctl start database -d labdb

 

Rebooted all 3 RAC nodes and verified that all the resources came up without any issues or errors.

 

reboot
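After the reboot, the resource listing can be filtered down to anything that should be running but is not. The sketch below assumes the column layout of crs_stat -t (Name, Type, Target, State, Host); the helper name not_online is made up for this example.

```shell
# not_online: read `crs_stat -t` output on stdin and print the resources
# whose Target is ONLINE but whose State is something else.
# The header and separator lines never have ONLINE in column 3, so they
# are skipped without special handling.
not_online() {
    awk '$3 == "ONLINE" && $4 != "ONLINE" { print $1 }'
}

# Example on a RAC node (left commented out):
#   /u01/app/grid11201/bin/crs_stat -t | not_online
```

Resources with an OFFLINE target, such as GSD in 11gR2, are intentionally not reported.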

 

               

09.JPG

 

HERE’s the detailed Screen Shots of Upgrading database from 11gR1 RAC to 11gR2 RAC

 

 


 

 
