
Wednesday, July 24, 2019

Huge Archive/Redo data generated

The redo log stores all changes made to the database. Every instance of an Oracle database has an associated redo log to protect the database in case of an instance failure.

Archive logs are generated by log switches, so the volume of archive is directly dependent on the amount of redo generated. Redo is generated by any change to the database, such as an INSERT, UPDATE, or DELETE.

If you commit frequently, the amount of redo you generate goes up as well (and the workload runs slower overall).
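As a quick check, the redo generated so far by your own session can be read from v$mystat joined to v$statname (a minimal sketch; requires SELECT access to the V$ views):

```sql
-- Redo (in bytes) generated by the current session since logon
SELECT sn.name, ss.value AS redo_bytes
  FROM v$mystat ss, v$statname sn
 WHERE ss.statistic# = sn.statistic#
   AND sn.name = 'redo size';
```

Running this before and after a batch of commits shows how much extra redo frequent committing produces.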

If archive logs are suddenly being generated in huge volumes, the immediate remedy is to create enough room in the mount point to accommodate them by compressing or moving existing archive log files.
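As an illustration of that stop-gap, archived logs can be compressed in place at the OS level. This is a sketch with hypothetical paths: in a real system, point ARCH_DEST at your actual archive destination, and never touch the log currently being written.

```shell
# Demo only: stand-in files under /tmp represent archived redo logs.
ARCH_DEST=/tmp/arch_demo                      # hypothetical archive destination
mkdir -p "$ARCH_DEST"
touch "$ARCH_DEST/redo_1001.arc" "$ARCH_DEST/redo_1002.arc"

# Compress in place to create room quickly; gzip replaces each .arc with .arc.gz
gzip -f "$ARCH_DEST"/*.arc
ls -l "$ARCH_DEST"
```

If RMAN may need these files again, moving them aside (rather than deleting) and uncompressing before any restore is the safer pattern.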

In parallel, we can search for the sessions generating the most redo.

Queries to find the top redo-generating sessions while they are still connected:

select s.username, s.osuser, s.status,s.sql_id,  sr.*  from
  (select sid, round(value/1024/1024) as "RedoSize(MB)"
          from v$statname sn, v$sesstat ss
          where sn.name = 'redo size'
                and ss.statistic# = sn.statistic#
          order by value desc) sr,
   v$session s
where sr.sid = s.sid
and   rownum <= 10;

select sid,value from v$sesstat natural join v$statname where name='redo size' order by 2;

If the session is no longer connected and we want to identify the sessions that generated the most redo, we can proceed as follows:

1. Take an AWR report and check the section "Segments by DB Block Changes" for the segments with the most db_block_changes, then search the section "Complete List of SQL Text" for the SQL statements touching those segments.
or
2. The queries below can be helpful:

SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') snap_time,
        dhsso.object_name,
        SUM(db_block_changes_delta)
  FROM dba_hist_seg_stat dhss,
         dba_hist_seg_stat_obj dhsso,
         dba_hist_snapshot dhs
  WHERE dhs.snap_id = dhss.snap_id
    AND dhs.instance_number = dhss.instance_number
    AND dhss.obj# = dhsso.obj#
    AND dhss.dataobj# = dhsso.dataobj#
    AND begin_interval_time BETWEEN to_date('2018_04_04 12','YYYY_MM_DD HH24')
                                           AND to_date('2018_06_04 12','YYYY_MM_DD HH24')
  GROUP BY to_char(begin_interval_time,'YYYY_MM_DD HH24:MI'),
           dhsso.object_name

 
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') snap_time,
        SUM(db_block_changes_delta)
  FROM dba_hist_seg_stat dhss,
         dba_hist_seg_stat_obj dhsso,
         dba_hist_snapshot dhs
  WHERE dhs.snap_id = dhss.snap_id
    AND dhs.instance_number = dhss.instance_number
    AND dhss.obj# = dhsso.obj#
    AND dhss.dataobj# = dhsso.dataobj#
    AND dhsso.object_name = 'OWN_OBJECT'
  GROUP BY to_char(begin_interval_time,'YYYY_MM_DD HH24:MI')
 
 
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24:MI'),
         dbms_lob.substr(sql_text,4000,1),
         dhss.instance_number,
         dhss.sql_id,executions_delta,rows_processed_delta
  FROM dba_hist_sqlstat dhss,
         dba_hist_snapshot dhs,
         dba_hist_sqltext dhst
  WHERE UPPER(dhst.sql_text) LIKE '%OWN_OBJECT%'
    AND dhss.snap_id=dhs.snap_id
    AND dhss.instance_Number=dhs.instance_number
    AND dhss.sql_id = dhst.sql_id

SELECT instance_number, to_char(sample_time,'yyyy_mm_dd hh24:mi:ss'),
         user_id,
         program
  FROM dba_hist_active_sess_history
  WHERE sql_id = 'jy5i4p6t05v'
    AND snap_id BETWEEN 53234 AND 53245

Related doc ids:

Diagnosing excessive redo generation [ID 199298.1]
Excessive Archives / Redo Logs Generation Troubleshooting [ID 832504.1]
Troubleshooting High Redo Generation Issues [ID 782935.1]
How to Disable (Temporary) Generation of Archive Redo Log Files [ID 177218.1]
Example of How To Resize the Online Redo Logfiles [ID 1035935.6]
How to Setup LogMiner [ID 111886.1]
SQL: How to Find Sessions Generating Lots of Redo or Archive logs [ID 167492.1]
How to Avoid Generation of Redolog Entries [ID 188691.1]
Using LogMiner, How to determine the cause of lots of redo generation [ID 300395.1]
Archiver Best Practices [ID 45042.1]
STARTUP ERRORS ora-00824 cannot set sga_target with statistics_level=BASIC [ID 461100.1]
Master Note: Troubleshooting Redo Logs and Archiving [ID 1507157.1]
=========================================================================

High redo generation is always a consequence of certain activity in the database. The main cause is usually heavy DML activity during a certain period of time, so it is good practice to first examine changes at both the database level (parameters, maintenance operations, ...) and the application level (deployment of a new application, code modifications, an increase in users, ...).

What we need to examine:
Is supplemental logging enabled? The amount of redo generated with supplemental logging enabled is considerably higher than with it disabled.
What Causes High Redo When Supplemental Logging is Enabled (Doc ID 1349037.1)
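Whether supplemental logging is currently enabled can be checked in v$database (a sketch):

```sql
-- YES (or IMPLICIT for the first column) means supplemental logging is on
SELECT supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_ui
  FROM v$database;
```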

Are a lot of indexes being used? Reducing the number of indexes, or using the NOLOGGING attribute where appropriate, will reduce the redo considerably.

Do all operations really need LOGGING? On the application side we can reduce redo by making use of the NOLOGGING clause. Note that only the following operations can make use of NOLOGGING mode:
- direct load (SQL*Loader)
- direct-load INSERT
- CREATE TABLE ... AS SELECT
- CREATE INDEX
- ALTER TABLE ... MOVE PARTITION
- ALTER TABLE ... SPLIT PARTITION
- ALTER INDEX ... SPLIT PARTITION
- ALTER INDEX ... REBUILD
- ALTER INDEX ... REBUILD PARTITION
- INSERT, UPDATE, and DELETE on LOBs in NOCACHE NOLOGGING mode stored out of line
To confirm whether a table or index has NOLOGGING set, issue one of the following statements:
select table_name, logging from all_tables where table_name = '<TABLE_NAME>';
-or-
select index_name, logging from all_indexes where index_name = '<INDEX_NAME>';

Do tables have triggers that might cause some indirect DML on other tables?

Is auditing enabled, and is it contributing to the excessive redo generation?
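A quick way to see whether auditing could be a contributor (a sketch):

```sql
-- NONE means standard auditing is off; DB means audit records go to SYS.AUD$,
-- and inserts into AUD$ themselves generate redo
SELECT value FROM v$parameter WHERE name = 'audit_trail';
```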

Are tablespaces in hot backup mode?

select a.tablespace_name
from sys.dba_data_files a, sys.v_$backup b
where b.status = 'ACTIVE'
and b.file# = a.file_id
group by a.tablespace_name;

Examine the log switches:

select lg.group#,lg.bytes/1024/1024 mb, lg.status, lg.archived,lf.member
from v$logfile lf, v$log lg where lg.group# = lf.group# order by 1, 2;
select to_char(first_time,'YYYY-MON-DD') "Date", to_char(first_time,'DY') day,
to_char(sum(decode(to_char(first_time,'HH24'),'00',1,0)),'999') "00",
to_char(sum(decode(to_char(first_time,'HH24'),'01',1,0)),'999') "01",
to_char(sum(decode(to_char(first_time,'HH24'),'02',1,0)),'999') "02",
to_char(sum(decode(to_char(first_time,'HH24'),'03',1,0)),'999') "03",
to_char(sum(decode(to_char(first_time,'HH24'),'04',1,0)),'999') "04",
to_char(sum(decode(to_char(first_time,'HH24'),'05',1,0)),'999') "05",
to_char(sum(decode(to_char(first_time,'HH24'),'06',1,0)),'999') "06",
to_char(sum(decode(to_char(first_time,'HH24'),'07',1,0)),'999') "07",
to_char(sum(decode(to_char(first_time,'HH24'),'08',1,0)),'999') "08",
to_char(sum(decode(to_char(first_time,'HH24'),'09',1,0)),'999') "09",
to_char(sum(decode(to_char(first_time,'HH24'),'10',1,0)),'999') "10",
to_char(sum(decode(to_char(first_time,'HH24'),'11',1,0)),'999') "11",
to_char(sum(decode(to_char(first_time,'HH24'),'12',1,0)),'999') "12",
to_char(sum(decode(to_char(first_time,'HH24'),'13',1,0)),'999') "13",
to_char(sum(decode(to_char(first_time,'HH24'),'14',1,0)),'999') "14",
to_char(sum(decode(to_char(first_time,'HH24'),'15',1,0)),'999') "15",
to_char(sum(decode(to_char(first_time,'HH24'),'16',1,0)),'999') "16",
to_char(sum(decode(to_char(first_time,'HH24'),'17',1,0)),'999') "17",
to_char(sum(decode(to_char(first_time,'HH24'),'18',1,0)),'999') "18",
to_char(sum(decode(to_char(first_time,'HH24'),'19',1,0)),'999') "19",
to_char(sum(decode(to_char(first_time,'HH24'),'20',1,0)),'999') "20",
to_char(sum(decode(to_char(first_time,'HH24'),'21',1,0)),'999') "21",
to_char(sum(decode(to_char(first_time,'HH24'),'22',1,0)),'999') "22",
to_char(sum(decode(to_char(first_time,'HH24'),'23',1,0)),'999') "23" ,
count(*) Total from v$log_history group by to_char(first_time,'YYYY-MON-DD'), to_char(first_time,'DY')
order by to_date(to_char(first_time,'YYYY-MON-DD'),'YYYY-MON-DD');

This will give us an idea of the times when the peaks of redo generation happen.
Examine the AWR report:

The next step is to examine the AWR report for the hour with the highest number of log switches, and to confirm from the redo size that those log switches were actually caused by heavy redo generation. In the AWR report we can also see the SQL with the most gets/executions, to get an idea of the activity happening in the database, as well as the segments with the largest number of block changes and the sessions performing those changes. Another way to find these sessions is described in SQL: How to Find Sessions Generating Lots of Redo or Archive logs (Doc ID 167492.1).

To find these segments we can also use the following queries:
SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time,
dhso.object_name,
sum(db_block_changes_delta) BLOCK_CHANGED
FROM dba_hist_seg_stat dhss,
dba_hist_seg_stat_obj dhso,
dba_hist_snapshot dhs
WHERE dhs.snap_id = dhss.snap_id
AND dhs.instance_number = dhss.instance_number
AND dhss.obj# = dhso.obj#
AND dhss.dataobj# = dhso.dataobj#
AND begin_interval_time BETWEEN to_date('11-01-28 13:00','YY-MM-DD HH24:MI')  -- modify to the hour where most log switches happened
AND to_date('11-01-28 14:00','YY-MM-DD HH24:MI')                              -- keep the interval to one hour
GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'),
dhso.object_name
HAVING sum(db_block_changes_delta) > 0
ORDER BY sum(db_block_changes_delta) desc ;
-- Then : What SQL was causing redo log generation :
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24') WHEN,
dbms_lob.substr(sql_text,4000,1) SQL,
dhss.instance_number INST_ID,
dhss.sql_id,
executions_delta exec_delta,
rows_processed_delta rows_proc_delta
FROM dba_hist_sqlstat dhss,
dba_hist_snapshot dhs,
dba_hist_sqltext dhst
WHERE upper(dhst.sql_text) LIKE '%<segment_name>%'  -- update the segment name as per the previous query's result
AND ltrim(upper(dhst.sql_text)) NOT LIKE 'SELECT%'
AND dhss.snap_id=dhs.snap_id
AND dhss.instance_number=dhs.instance_number
AND dhss.sql_id=dhst.sql_id
AND begin_interval_time BETWEEN to_date('11-01-28 13:00','YY-MM-DD HH24:MI')  -- update the time frame as required
AND to_date('11-01-28 14:00','YY-MM-DD HH24:MI')                              -- update the time frame as required
Finally, to troubleshoot further and see exactly which commands were recorded in that time frame, we can use LogMiner to mine the archived logs from the period concerned. Query v$archived_log to find the archived logs generated during that time frame.
How To Determine The Cause Of Lots Of Redo Generation Using LogMiner (Doc ID 300395.1)
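Along the lines of Doc ID 300395.1, a minimal LogMiner session might look like this (the archived log path is hypothetical; run as a privileged user with the database open):

```sql
-- First pick the archived logs for the time frame:
--   SELECT name FROM v$archived_log WHERE first_time BETWEEN :start AND :end;

BEGIN
  DBMS_LOGMNR.ADD_LOGFILE(
    logfilename => '/arch/redo_1001.arc',               -- hypothetical path
    options     => DBMS_LOGMNR.NEW);
  DBMS_LOGMNR.START_LOGMNR(
    options => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);   -- use the online dictionary
END;
/

-- Which segments and operations produced the redo in that log
SELECT seg_owner, seg_name, operation, COUNT(*) cnt
  FROM v$logmnr_contents
 GROUP BY seg_owner, seg_name, operation
 ORDER BY cnt DESC;

EXEC DBMS_LOGMNR.END_LOGMNR;
```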

1) Query V$SESS_IO

This view contains the column BLOCK_CHANGES, which indicates how many blocks have been changed by the session. High values indicate a session generating lots of redo. The query you can use is:

select s.sid, s.serial#, s.username, s.program, i.block_changes
from v$session s, v$sess_io i
where s.sid = i.sid
order by 5 desc, 1,2,3,4;

Run the query multiple times and examine the delta between each occurrence of BLOCK_CHANGES.
Large deltas indicate high redo generation by the session.

2) Query V$TRANSACTION
This view contains information about the amount of undo blocks and undo records accessed by the transaction
(as found in the USED_UBLK and USED_UREC columns). The query you can use is:

SELECT s.sid, s.serial#, s.username, s.program, t.used_ublk, t.used_urec
FROM v$session s, v$transaction t
WHERE s.taddr = t.addr
ORDER BY 5 desc, 6 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence of USED_UBLK and USED_UREC.
Large deltas indicate high redo generation by the session.


Thursday, July 18, 2019

MAX_STRING_SIZE Parameter In Oracle 12c:

The MAX_STRING_SIZE parameter was introduced in Oracle 12c. It controls the maximum size of string columns in the database. The value can be set to either STANDARD or EXTENDED; the default is STANDARD.

MAX_STRING_SIZE = STANDARD means the maximum size of strings is 4000 bytes for VARCHAR2 and NVARCHAR2.

MAX_STRING_SIZE = EXTENDED means the maximum size can be up to 32767 bytes.

We can change the parameter value from STANDARD to EXTENDED, but not from EXTENDED back to STANDARD.


If MAX_STRING_SIZE is set to STANDARD and you try to define a column length of more than 4000, Oracle returns ORA-00910.

SQL> show parameter MAX_STRING_SIZE;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_string_size                      string      STANDARD


SQL> create table country ( state_name VARCHAR2(7000));
create table country ( state_name VARCHAR2(7000))
                                                *
ERROR at line 1:
ORA-00910: specified length too long for its datatype

This is how we can convert MAX_STRING_SIZE to EXTENDED:

1. Start database in upgrade mode:

SQL> show parameter MAX_STRING_SIZE;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_string_size                      string      STANDARD

SQL> SHUTDOWN IMMEDIATE;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL>STARTUP UPGRADE

ORACLE instance started.
Total System Global Area 900000 bytes
Fixed Size 666 bytes
Variable Size 271790 bytes
Database Buffers 271790 bytes
Redo Buffers 4140 bytes
Database mounted.
Database opened.

2. Change the value to EXTENDED

SQL>  alter system set MAX_STRING_SIZE='EXTENDED' SCOPE=BOTH;

System altered.

SQL> show parameter MAX_STRING_SIZE

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_string_size                      string      EXTENDED

3. Run the utl32k.sql script:

@?/rdbms/admin/utl32k.sql

4. Restart the database:

shutdown immediate;
startup
SQL> show parameter MAX_STRING_SIZE;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_string_size                      string      EXTENDED

Known Issues:

1. If you try to restart the database without running the utl32k.sql script, you get the error below:

Database mounted.
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-14695: MAX_STRING_SIZE migration is incomplete
Process ID: 18669
Session ID: 401 Serial number: 16419

So start the database again in UPGRADE mode and execute the utl32k.sql script.

2. If you try to set the value to EXTENDED when the database is not in UPGRADE mode:


SQL> alter system set MAX_STRING_SIZE=EXTENDED;
alter system set MAX_STRING_SIZE=EXTENDED
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-14694: database must be in UPGRADE mode to begin MAX_STRING_SIZE migration

Before running this ALTER statement, the database must be started in UPGRADE mode.

Friday, July 5, 2019

ADOP Phase: Abort, Restart and Abandon

To restart a patch from the very beginning (abandoning the previous attempt), use restart=no abandon=yes:

adop phase=apply patches=11111 abandon=yes          (also used when applying a new/replacement patch)

To resume a failed patch from where it left off, use restart=yes:

adop phase=apply patches=11111 restart=yes          (abandon=no is implied and not required)
If adop is unable to restart, check the AD_ADOP_SESSIONS and AD_ADOP_SESSION_PATCHES tables and analyze the output:
---------------------------------------------------------------------------------------

column id format 99
column nn format a10
column nt format a6
select adop_session_id id, prepare_status pr, apply_status ap,
finalize_status fi, cutover_status cu, cleanup_status cl, abort_status ab, status st, node_name nn, node_type nt
from ad_adop_sessions
order by adop_session_id desc;


column id format 99
column bn format a12
column pr format 99999999
column afs format a40
column pfs format a40
column cs format a9
column nn format a10
column ed format a28
column drv format a28
column pt format a35

select adop_session_id id, bug_number bn, patchrun_id pr, status st,
node_name nn, cast(end_date as timestamp) ed, driver_file_name drv, patch_top  pt
from ad_adop_session_patches
order by end_date desc;

To diagnose further the root cause of why the patch application failed, check the adop log for the failed patch:
cd  $APPL_TOP_NE/../log/adop/<session id>


If we want to start from a clean slate, given the state of these tables, we can manually update the AD_ADOP_SESSIONS table so that an incomplete hotpatch session does not hold up adop execution:

update ad_adop_sessions set status='C' where adop_session_id=<from above o/p>;

Restart adop and retest.

=================================
Aborting an Online Patching Cycle:

If a patching cycle is failing and the issue cannot be resolved quickly, it is possible to abort the patching cycle and return to normal runtime operation. The patch edition will be dropped. You can abandon a patching cycle (without applying any patches) by running the command:

$ adop phase=abort

Important: This abort command can only be used before successful completion of the cutover phase. After cutover, the system is running on the new edition, and abort is no longer possible for that patching cycle. Aborting a patching cycle will drop the patch edition, but you must then run the cleanup and fs_clone phases before starting a new patching cycle. The cleanup must be a full cleanup.

For example:
$ adop phase=prepare
$ adop phase=apply patches=123456
[Patch application encounters problems and you want to abort]
$ adop phase=abort
$ adop phase=cleanup cleanup_mode=full
$ adop phase=fs_clone
Optionally, you can combine the abort and cleanup commands as follows:
$ adop phase=abort,cleanup cleanup_mode=full

Note: You cannot abort application of a patch applied in hotpatch mode or downtime mode.

ADOP apply phase options ABANDON and RESTART
Restarting a failed worker :

When adop is using parallel processing and an error occurs, the job fails. Review the main adop log file and the adworkxxx.log file to determine the source of the error, resolve the issues and continue. Restart adop using the adctrl command.

Restarting adop :

If you have shut down the workers, or if adop quits while performing processing actions, it saves all the actions completed up to that point in restart files. Investigate and resolve the problem that caused the failure, then restart adop.

restart options,

restart=(yes|no)  [default: no]

Use restart=yes to resume the previous failed apply command from where processing terminated.  If an apply command fails, check the log files for further information.  If the problem can be corrected, you can then restart the apply command where it left off using the restart parameter.

Example - Restart an apply command

adop phase=apply patches=123456 restart=yes

When restarting a failed apply it is important to use the same parameters as the failed command, with only the addition of the restart=yes parameter.

abandon=(yes|no)  [default: no]

Use abandon=yes to abandon the previous failed apply command and start a new apply command.  Note that any changes made to the system by the failed command will remain in effect. The abandon flag is most useful when applying a replacement patch for the failing patch.

Example - Abandon previous apply command and apply replacement

adop phase=apply patches=223456 abandon=yes

If a patch fails to apply and there is no replacement patch, you may also abort the complete online patching cycle. 

You have several options when restarting (or abandoning) application of individual patches, as follows :

• If you want to restart a failed patch from where it left off, you only need to specify restart=yes on the command line:

adop phase=apply patches=1234 restart=yes

• If you want to restart a failed patch from the very beginning, you need to specify abandon=yes on the command line:

adop phase=apply patches=1234 abandon=yes

• If you want to ignore a previously failed patch and apply a different one instead, you need to specify the new patch number and abandon=yes on the command line:

adop phase=apply patches=5678 abandon=yes
================================================================================
ADOP Useful Options 12.2
To make adop wait instead of exiting when workers fail while applying a patch using adop in 12.2:

wait_on_failed_job=(yes|no)  [default: no]
        Controls whether adop apply command exits when all workers have
        failed.  Instead of exiting, you can force adop to wait, and use
        the "adctrl" to retry failed jobs.

adop phase=apply patches=123456 wait_on_failed_job=yes

NB: This works only for worker jobs, not for object compilation failures (e.g. forms, PLLs).
For those you have to use "flags=autoskip":
adop phase=apply patches=123456 flags=autoskip

 skipsyncerror=(yes|no)  [default: no]
        Specifies whether to ignore errors that may occur during incremental
        file system synchronization.  This might happen if you applied
        a patch in the previous patching cycle that had errors but decided
        to continue with the cutover.  When the patch is synchronized on
        the next patching cycle, the apply errors may occur again, but
        can be ignored.

        Example:

            adop phase=prepare skipsyncerror=yes
Aborting an online patching cycle:
        If an online patching cycle encounters problems that cannot be
        fixed immediately you can abort the patching cycle and return
        to normal runtime operation.

        Example - Aborting a failed online patching cycle:

            adop phase=prepare
            adop phase=apply patches=123456
            ### serious unfixable error reported
            adop phase=abort
            adop phase=cleanup cleanup_mode=full
            adop phase=fs_clone

        The abort command drops the database patch edition and
        returns the system to normal runtime state.  Immediately
        following abort, you must also run a full cleanup and
        fs_clone operation to fully remove effects of the failed
        online patching cycle.

adop cutover hang
If an adop cutover hangs, or a server crashes or reboots in the middle of the adop cutover phase, execute the following steps to fix the issue and then proceed with the patching process.

SOLUTION
Make sure no services or processes are running from the PATCH file system.

Ensure that the Weblogic Admin Server and Node Manager are running on the run file system. Execute the following commands to check the status:

 $ adadminsrvctl.sh status
 $ adnodemgrctl.sh status
Execute the following commands:

 $ adop phase=abort
 $ adop phase=cleanup cleanup_mode=full
 $ adop phase=fs_clone force=yes
Run an empty adop cycle to make sure there is no issue in the adop cutover by executing the following command:

 $ adop phase=prepare,finalize,cutover,cleanup cleanup_mode=full
Start a fresh adop prepare and apply patches.

After the apply step, complete the rest of the adop phases including finalize, cutover, and cleanup.

Trying to resume a failed cutover session gives 'Invalid Credentials' in Oracle Apps R12.2

If you attempt to resume a failed session after cutover exits with cutover_status=3, you may receive an 'Invalid Credentials' error. To check the status you can use the queries posted earlier.

This will be because the database patch edition has already been promoted to be the new run edition. To resume and complete cutover successfully, run the command:

$ adop phase=cutover action=nodb

http://soban-dba.blogspot.com/2017/04/adop-options-availble-in-r122.html

Thursday, July 4, 2019

Upgrade Oracle 11gR2 (11.2.0.4) Grid Infrastructure to Oracle 12cR2 (12.2.0.1):

Grid Infrastructure Management Repository (GIMR): in 12.1.0.1 we had the option of installing the GIMR database (MGMTDB), but in 12.1.0.2 it is mandatory, and the MGMTDB database is automatically created as part of the upgrade or initial installation of 12.1.0.2 Grid Infrastructure. The GIMR primarily stores historical Cluster Health Monitor metric data. It runs as a container database on a single node of the RAC cluster. The datafiles for the MGMTDB database are created on the same ASM disk group that holds the OCR and voting disk, so that disk group needs at least 4 GB of free space, or error INS-43100 will be returned as shown in the figure below.


1.To check GI active version and upgrade state:

crsctl query crs activeversion
crsctl query crs releaseversion
crsctl query crs softwareversion
crsctl query crs activeversion -f


2. Collect the CRS status by running the commands below [as the grid owner]:

/u01/oracle/11.2.0/grid/crsctl check crs
/u01/oracle/11.2.0/grid/crsctl stat res -t
/u01/oracle/11.2.0/grid/crsctl stat res -p
/u01/oracle/11.2.0/grid/oifcfg iflist -p -n
/u01/oracle/11.2.0/grid/oifcfg getif
/u01/oracle/11.2.0/grid/ocrcheck

3. Take an OCR backup manually before the GI upgrade. This could be used to downgrade the GI from 12.2.0.1 back to 11.2.0.4 later on, if required.

./ocrconfig -export /softdump/BACKUP/pre_upgrade_backup/ocr_file


4.Download the Oracle Grid Infrastructure image file , create the Grid home directory, and extract the image files in this Grid home directory.

$ mkdir -p /u01/oracle/12.2.0/grid
$ chown oracle:oinstall /u01/oracle/12.2.0/grid
$ cd /u01/oracle/12.2.0/grid
$ unzip -q linuxx64_12201_grid_home.zip

5. Run the cluvfy script for prechecks and fix any missing prerequisites, verifying that everything required for the 12c GI upgrade is in place:
runcluvfy.sh stage -pre crsinst -upgrade -rolling -src_crshome /u01/oracle/11.2.0/grid -dest_crshome /u01/oracle/12.2.0/grid -dest_version 12.2.0.1.0 -fixup -verbose

6. Stop the databases/services running from the grid:

srvctl stop database -d PROD

7. Take a backup of the Grid binaries:

tar -cvf grid_home_preupg.tar $GRID_HOME

8. Now start the upgrade using runInstaller.

The GI Management Repository is a new feature introduced with 12c. When it is selected during the GI installation, a database (named -MGMTDB) is created, and its datafiles are stored in the same disk group as the OCR and voting disks.

9. Run the upgrade script on the local node [as root]:
root #/crsapp/app/oracle/product/grid12c/rootupgrade.sh

10. Now run the script on the other nodes.

Single Client Access Name (SCAN):

  • Introduced in Oracle 11g Release 2.
  • It provides a single name for clients to access any Oracle database running in a cluster.
  • Clients' connect information does not need to be changed if you add or remove nodes from the cluster.
  • SCAN also provides connect-time load balancing, forwarding each request to the least loaded node in the cluster.
  • A SCAN resolves to a minimum of one and a maximum of three IP addresses.

SCAN Concepts:
Single client access name (SCAN) is the virtual hostname provided for all clients connecting to the cluster (as opposed to the VIP hostnames in 10g and 11gR1).

SCAN is a domain name registered to at least one and up to three IP addresses, either in the domain name service (DNS) or the Grid Naming Service (GNS).

By default, the name used as the SCAN is also the name of the cluster and must be globally unique throughout your enterprise. The default value for the SCAN is based on the local node name. SCAN name must be at least one character long and no more than 15 characters in length, must be alphanumeric - cannot begin with a numeral and may contain hyphens (-). If you require a SCAN that is longer than 15 characters, then select an Advanced installation.

For installation to succeed, the SCAN must resolve to at least one address.

SCAN VIP addresses must be on the same subnet as virtual IP addresses and public IP addresses.
Oracle strongly recommends that you do not configure SCAN VIP addresses in the hosts file. But if you use the hosts file to resolve SCAN name, you can have only one SCAN IP address.

If hosts file is used to resolve SCAN hostname, you will receive Cluster Verification Utility failure at end of installation (see Note: 887471.1 for more details).

For high availability and scalability, Oracle recommends that you configure the SCAN to use DNS Round Robin resolution to three addresses.

Because the SCAN is associated with the cluster as a whole, rather than to a particular node, the SCAN makes it possible to add or remove nodes from the cluster without needing to reconfigure clients. It also adds location independence for the databases, so that client configuration does not have to depend on which nodes are running a particular database.

Clients can continue to access the cluster in the same way as with previous releases, but Oracle recommends that clients accessing the cluster use the SCAN. Clients using the SCAN can also access the cluster using EZCONNECT.
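For illustration, a client tnsnames.ora entry pointing at the SCAN (the hostname and service name here are hypothetical) would look like:

```
SALES =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myscan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sales.example.com)
    )
  )
```

With EZCONNECT the same connection needs no tnsnames.ora entry at all: sqlplus user@//myscan.example.com:1521/sales.example.com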

Grid Infrastructure will start local listener LISTENER on all nodes to listen on local VIP, and SCAN listener LISTENER_SCAN1 (up to three cluster wide) to listen on SCAN VIP(s); 11gR2 database by default will set local_listener to local LISTENER, and remote_listener to SCAN listener.

SCAN listener will be running off GRID_HOME, and by default, in 11gR2 local listener will be running off GRID_HOME as well.
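How an instance is wired to these listeners can be checked from its parameters (a sketch):

```sql
-- In 11gR2 RAC, remote_listener is typically set to scan_name:port
SELECT name, value
  FROM v$parameter
 WHERE name IN ('local_listener', 'remote_listener');
```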

In 12c, SCAN supports multiple subnets in the cluster (one SCAN per subnet)

How does SCAN work ?

SCAN is a cluster resource managed by CRS, so SCAN is well aware of what is going on in the cluster. Though the Oracle documentation presents SCAN as a recommendation, it is effectively mandatory, as the Oracle 11gR2 OUI will not proceed without it. Clients use the SCAN name in tnsnames.ora to make database connections. The SCAN listener forwards each request to a local listener running on a node VIP.

So, the SCAN needs to resolve to between one and three IP addresses under the same name. Oracle recommends using three IP addresses for the SCAN in DNS. There are at most three SCAN listeners, even if the cluster has dozens of nodes. SCAN listeners are started from the Grid Oracle Home, not the database/RDBMS home. Since it is part of the grid, the SCAN can be used for all the databases in the cluster, so we no longer need to run netca to create listeners in the DB homes. If the default port, 1521, is used, Oracle instances (PMON) automatically register with the SCAN listeners.

The PMON process of each instance registers the database services with the default listener on the local node and with each SCAN listener, as specified by the REMOTE_LISTENER database parameter. A typical connection then works as follows:

1. The Oracle client connects using the SCAN name: myscan:1521/sales.example.com.
2. The client queries DNS to resolve the SCAN name.
3. The SCAN listener selects the least-loaded node (node2 in this example) and redirects the client to that node's local listener.
4. The client connects to the local listener on node2, which starts a dedicated server process for the connection to the database.
5. The client connects directly to the dedicated server process on node2 and accesses the sales2 database instance.
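The connection flow above can also be exercised without a tnsnames.ora entry by using EZConnect; the connect string below is a sketch with a hypothetical SCAN name and service:

```
-- EZConnect syntax: user@host:port/service_name
sqlplus scott@myscan.example.com:1521/sales.example.com
```

The host portion is the SCAN name, so the same string works no matter which node ends up servicing the connection.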

We need to understand these terms:

SCAN NAME, SCAN VIP, SCAN Listener, Node VIP, Local_Listener and Remote_Listener

SCAN NAME:
It must be unique.
This name is used in the tnsnames.ora file of the client system; all database connections are made through this name.
The following commands show the SCAN name details:

[oracle@rac1 bin]$ srvctl config scan
[oracle@rac1 bin]$ nslookup orarac-scan
[grid@rac1 ~]$crsctl stat res -w "TYPE = ora.scan_listener.type"

SCAN VIP:
The SCAN name resolves to one or more IP addresses; these IP addresses are called SCAN VIPs or SCAN IPs.
Each SCAN VIP has a corresponding SCAN listener.
A SCAN listener can run on any node, and there can be two SCAN listeners on one node if there are three SCAN listeners but only two nodes. A SCAN VIP and its SCAN listener work as a pair.

Commands to find a SCAN listener and its corresponding SCAN VIP:

[oracle@rac1  ~]$ ps -ef | grep tnslsnr
grid     29211     1  0 Jul28 ?        00:00:27 /u01/app/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN3 -inherit

[oracle@rac1 ]$ lsnrctl status LISTENER_SCAN3

** SCAN listener LISTENER_SCAN3 is running.

SCAN Listener:
SCAN listeners run on the database nodes. There can be at most three SCAN listeners in the whole cluster.

SCAN Listeners:

[oracle@database ~]$ srvctl status scan_listener

Node VIP: In a RAC environment, each node has one node IP and one node VIP address. The difference is that a node VIP can be moved to another system if the owning system fails, while a node IP cannot.
When a new connection comes in, it contacts a SCAN listener listening on a SCAN IP address and the SCAN port. Because every service in the cluster is registered with the SCAN listeners, the SCAN listener responds with the local listener address (the node VIP) of the least-loaded node (the SCAN listener is kept up to date with cluster load statistics), and the connection is routed to that node.

Local_Listener: This database parameter provides the details of the local listener running on the database node. In RAC, its value is the node VIP address. The local listener is registered with the node VIP and node IP addresses.

The difference between a SCAN listener and a local listener is that a SCAN listener runs against a SCAN VIP, while a local listener runs against the node VIP or node IP address. A SCAN listener can move to another database node if its node fails; a local listener does not have this kind of behavior.

SQL> show parameter local_listener

Remote_Listener: This parameter is set to the SCAN name. The SCAN name resolves to one or more IP addresses called SCAN VIPs, and each SCAN VIP has a SCAN listener running, so each database instance registers itself with all the SCAN listeners.

The remote_listener parameter essentially registers each database instance with all SCAN listeners. This registration also provides information such as the services offered by the instance and its current load, and lets the SCAN listeners decide how many incoming connections should be directed to each instance.

SQL> show parameter remote_listener
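As an illustration, on a RAC node the two parameters typically look like the following; the host names and SCAN name here are hypothetical, and remote_listener can be set with ALTER SYSTEM:

```
-- Hypothetical values for a node rac1 with SCAN myscan.example.com
SQL> show parameter local_listener
-- local_listener = (ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1521))

SQL> alter system set remote_listener='myscan.example.com:1521' scope=both sid='*';
```

Setting remote_listener to the SCAN name (rather than a list of node addresses) is what allows PMON on every instance to register with all SCAN listeners automatically.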


Difference between local and remote listener: local_listener identifies the listener on the instance's own node (the node VIP), so the instance registers with its own node's listener, while remote_listener is set to the SCAN name, so the instance registers with all SCAN listeners in the cluster.

===================

What is SCAN:

Single Client Access Name (SCAN) is a GSD resource managed by CRS. It was introduced in Oracle 11g R2. The main purpose of SCAN is to provide ease of management and connection.

How to rename SCAN in Cluster:

1. Stop the SCAN

[grid@rac1 ~]$ srvctl stop scan_listener
[grid@rac1 ~]$ srvctl stop scan

2. Configure the new SCAN in your DNS, /etc/hosts, or GNS, and make sure that lookups work for the new name.

[root@rac2 ~]# nslookup newscan.domain.com


3. Configure the cluster to use the new VIPs.

As root user on one of the cluster nodes:

[root@rac1 ~]# /u01/product/11.2.0/grid/bin/srvctl modify scan -n newscan.domain.com

4. As the grid user on one of the cluster nodes:
[grid@rac1 ~]$ srvctl modify scan_listener -u
[grid@rac1 ~]$ srvctl start scan_listener

5. Verify the configuration:

[grid@rac1 ~]$ srvctl config scan

Change the SCAN Listener Port in the Cluster:

Login as grid user, source the grid environment.
Make sure $GRID_HOME/bin is in PATH.

1. Modify SCAN listener port:

[grid@rac1 ~]$ srvctl modify scan_listener -p 1523

2. Restart the SCAN listener for the new port to take effect:

[grid@rac1 ~]$ srvctl stop scan_listener
[grid@rac1 ~]$ srvctl start scan_listener

3. Verify the change:

[grid@rac1 ~]$ srvctl config scan_listener
SCAN Listener LISTENER_SCAN3 exists. Port: TCP:1523