
aodu (At Oracle Database Utility): rac (Part 1)

The rac function of aodu covers the following: how to change the archive log mode, common terminology, commonly used diagnostic scripts, how to enable debug mode, how to use CVU (Cluster Verification Utility), common crsctl/srvctl commands, the ocrconfig command, the olsnodes command, and more.


How do you use the rac function?

[oracle@db1 ~]$ ./aodu

AT Oracle Database Utility,Release 1.1.0 on Tue Jun 14 14:28:01 2016
Copyright (c) 2014, 2015, Robin.Han.  All rights reserved.
http://ohsdba.cn
E-Mail:375349564@qq.com

AODU>

AODU> rac ohsdba
         rac archivelog|general|abbr|diag|cvu|diaginstall|debug|eviction|diagrac|perf|srvctl|crsctl|ocrconfig|olsnodes|debug|note
AODU> rac oracle
         Currently it's for internal use only
AODU>
Note: only "rac ohsdba" displays the commonly used commands inside the function, and of course only those who have read this article will know how to use it.


AODU> rac archivelog

        ****Change archivelog mode****
        The following steps need to be taken to enable archive logging in a RAC database environment:
        $srvctl stop database -d <db_unique_name>
        $srvctl start database -d <db_unique_name> -o mount
        $sqlplus / as sysdba
        sql> alter database archivelog;
        sql> exit;
        $ srvctl stop database -d <db_unique_name>
        $ srvctl start database -d <db_unique_name>
        sql> archive log list;
        Note: from 10g, you do not need to change parameter cluster_database
AODU>
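
For reference, here is a minimal end-to-end sketch of the steps above, using a hypothetical db_unique_name of orcl:

$ srvctl stop database -d orcl
$ srvctl start database -d orcl -o mount
$ sqlplus / as sysdba
SQL> alter database archivelog;
SQL> exit
$ srvctl stop database -d orcl
$ srvctl start database -d orcl
$ sqlplus / as sysdba
SQL> archive log list

Once the change has taken effect, "archive log list" should report the database log mode as Archive Mode.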

AODU> rac general     ---11gR2 key facts and the RAC startup sequence
        ****11gR2 Clusterware Key Facts****
        11gR2 Clusterware is required to be up and running prior to installing a 11gR2 Real Application Clusters database.
        The GRID home consists of the Oracle Clusterware and ASM.  ASM should not be in a separate home.
        The 11gR2 Clusterware can be installed in "Standalone" mode for ASM and/or "Oracle Restart" single node support.
        This clusterware is a subset of the full clusterware described in this document.
        The 11gR2 Clusterware can be run by itself or on top of vendor clusterware. See the certification matrix for certified
        combinations. Ref: Note: 184875.1 "How To Check The Certification Matrix for Real Application Clusters"
        The GRID Home and the RAC/DB Home must be installed in different locations.
        The 11gR2 Clusterware requires shared OCR and voting files.  These can be stored on ASM or a cluster filesystem.
        The OCR is backed up automatically every 4 hours to <GRID_HOME>/cdata/<clustername>/ and can be restored via ocrconfig.
        The voting file is backed up into the OCR at every configuration change and can be restored via crsctl.
        The 11gR2 Clusterware requires at least one private network for inter-node communication and at least one
        public network for external communication. Several virtual IPs need to be registered with DNS. This includes the node VIPs (one per node) and
        SCAN VIPs (three). This can be done manually via your network administrator, or optionally you could configure the "GNS" (Grid Naming Service)
        in the Oracle clusterware to handle this for you (note that GNS requires its own VIP).
        A SCAN (Single Client Access Name) is provided to clients to connect to.  For more information on SCAN see Note: 887522.1
        The root.sh script at the end of the clusterware installation starts the clusterware stack. For information on troubleshooting
        root.sh issues see Note: 1053970.1
        Only one set of clusterware daemons can be running per node.
        On Unix, the clusterware stack is started via the init.ohasd script referenced in /etc/inittab with "respawn".
        A node can be evicted (rebooted) if a node is deemed to be unhealthy.  This is done so that the health of the entire cluster can be maintained.
        For more information on this see: Note: 1050693.1 "Troubleshooting 11.2 Clusterware Node Evictions (Reboots)"
        Either have vendor time synchronization software (like NTP) fully configured and running or have it not configured at all and
        let CTSS handle time synchronization. See Note: 1054006.1 for more information.
        If installing DB homes for a lower version, you will need to pin the nodes in the clusterware or you will see ORA-29702 errors.
        See Note 946332.1 and Note:948456.1 for more information.
        The clusterware stack can be started by either booting the machine, running "crsctl start crs" to start the clusterware stack,
        or by running "crsctl start cluster" to start the clusterware on all nodes. Note that crsctl is in the <GRID_HOME>/bin directory.
        Note that "crsctl start cluster" will only work if ohasd is running.
        The clusterware stack can be stopped by either shutting down the machine, running "crsctl stop crs" to stop the clusterware stack,
        or by running "crsctl stop cluster" to stop the clusterware on all nodes. Note that crsctl is in the <GRID_HOME>/bin directory.
        Killing clusterware daemons is not supported.
        Instance is now part of .db resources in "crsctl stat res -t" output, there is no separate .inst resource for 11gR2 instance.
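
        A minimal sketch of how a few of the facts above can be verified on a running cluster (commands are in <GRID_HOME>/bin; run as root or the grid user):
        # Overall health of the local clusterware stack
        crsctl check crs
        # Cluster resources; note the .db resources (there is no separate .inst resource in 11gR2)
        crsctl stat res -t
        # Automatic OCR backups (taken every 4 hours)
        ocrconfig -showbackup
        # Current voting file locations
        crsctl query css votedisk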


        ****Cluster Start Sequence****
        The clusterware stack starts in levels; at the first level, the OHASD daemon spawns 4 agent processes.
        Level 1: OHASD Spawns:
            cssdagent - Agent responsible for spawning CSSD.
            orarootagent - Agent responsible for managing all root owned ohasd resources.
            oraagent - Agent responsible for managing all oracle owned ohasd resources.
            cssdmonitor - Monitors CSSD and node health (along with the cssdagent).
        Level 2: OHASD rootagent spawns:
            CRSD - Primary daemon responsible for managing cluster resources.
            CTSSD - Cluster Time Synchronization Services Daemon
            Diskmon
            ACFS (ASM Cluster File System) Drivers
        Level 2: OHASD oraagent spawns:
            MDNSD - Used for DNS lookup
            GIPCD - Used for inter-process and inter-node communication
            GPNPD - Grid Plug & Play Profile Daemon
            EVMD - Event Monitor Daemon
            ASM - Resource for monitoring ASM instances
        Level 3: CRSD spawns:
            orarootagent - Agent responsible for managing all root owned crsd resources.
            oraagent - Agent responsible for managing all oracle owned crsd resources.
        Level 4: CRSD rootagent spawns:
            Network resource - To monitor the public network
            SCAN VIP(s) - Single Client Access Name Virtual IPs
            Node VIPs - One per node
            ACFS Registry - For mounting ASM Cluster File System
            GNS VIP (optional) - VIP for GNS
        Level 4: CRSD oraagent spawns:
            ASM Resource - ASM Instance(s) resource
            Diskgroup - Used for managing/monitoring ASM diskgroups.
            DB Resource - Used for monitoring and managing the DB and instances
            SCAN Listener - Listener for single client access name, listening on SCAN VIP
            Listener - Node listener listening on the Node VIP
            Services - Used for monitoring and managing services
            ONS - Oracle Notification Service
            eONS - Enhanced Oracle Notification Service
            GSD - For 9i backward compatibility
            GNS (optional) - Grid Naming Service - Performs name resolution
AODU>
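
To see the layers described above on a live 11gR2 node, a minimal sketch (run as root from <GRID_HOME>/bin):

# Lower-stack (ohasd-managed) resources: cssd, ctssd, crsd, evmd, ASM, etc.
crsctl stat res -t -init
# Upper-stack (crsd-managed) resources: databases, VIPs, SCAN, listeners, services
crsctl stat res -t
# The agent processes spawned by ohasd and crsd
ps -ef | grep -E 'ohasd|oraagent|orarootagent|cssdagent|cssdmonitor' | grep -v grep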

AODU> rac abbr    ---RAC terminology

        ****Abbreviations, Acronyms****
        This note lists commonly used Oracle Clusterware (Cluster Ready Services or Grid Infrastructure) related abbreviations, acronyms, terms and procedures.
        nodename: short hostname for local node. For example, racnode1 for node racnode1.us.oracle.com
        CRS: Cluster Ready Service, name for pre-11gR2 Oracle clusterware
        GI: Grid Infrastructure, name for 11gR2 Oracle clusterware
        GI cluster: Grid Infrastructure in cluster mode
        Oracle Restart: GI Standalone, Grid Infrastructure in standalone mode
        ASM user: the OS user who installs/owns ASM. For 11gR2, ASM and grid user is the same as ASM and GI share the same ORACLE_HOME.
        For pre-11gR2 CRS cluster, ASM and CRS user can be different as ASM and CRS will be in different ORACLE_HOME.
        For pre-11gR2 single-instance ASM, ASM and local CRS user is the same as ASM and local CRS share the same home.
        CRS user: the OS user who installs/owns pre-11gR2 Oracle clusterware
        grid user: the OS user who installs/owns 11gR2 Oracle clusterware
        clusterware user: CRS or grid user which must be the same in upgrade environment
        Oracle Clusterware software owner: same as clusterware user
        clusterware home: CRS or GI home
        ORACLE_BASE:ORACLE_BASE for grid or CRS user.
        root script checkpoint file: the file that records root script (root.sh or rootupgrade.sh) progress so root script
        can be re-executed, it's located in $ORACLE_BASE/Clusterware/ckptGridHA_${nodename}.xml
        OCR: Oracle Cluster Registry. To find out OCR location, execute: ocrcheck
        VD: Voting Disk. To find out voting file location, execute: crsctl query css votedisk
        Automatic OCR Backup: OCR is backed up automatically every four hours in cluster environment on OCR Master node,
        the default location is <clusterware-home>/cdata/<clustername>. To find out backup location, execute: ocrconfig -showbackup
        SCR Base: the directory where ocr.loc and olr.loc are located.
            Linux:         /etc/oracle
            Solaris:     /var/opt/oracle
            hp-ux:         /var/opt/oracle
            AIX:             /etc/oracle
        INITD Location: the directory where ohasd and init.ohasd are located.
            Linux:         /etc/init.d
            Solaris:     /etc/init.d
            hp-ux:         /sbin/init.d
            AIX:             /etc
        oratab Location: the directory where oratab is located.
            Linux:         /etc
            Solaris:     /var/opt/oracle
            hp-ux:         /etc
            AIX:             /etc
        CIL: Central Inventory Location. The location is defined by the parameter inventory_loc in /etc/oraInst.loc or
        /var/opt/oracle/oraInst.loc, depending on the platform.
            Example on Linux:
            cat /etc/oraInst.loc | grep inventory_loc
            inventory_loc=/home/ogrid/app/oraInventory
        Disable CRS/GI: To disable clusterware from auto startup when node reboots, as root execute "crsctl disable crs".
        Replace it with "crsctl stop has" for Oracle Restart.
        DG Compatible: ASM Disk Group's compatible.asm setting. To store OCR/VD on ASM, the compatible setting must be at least 11.2.0.0.0,
        but on the other hand a lower GI version won't work with a higher compatible setting. For example, 11.2.0.1 GI will have issues
        accessing a DG if compatible.asm is set to 11.2.0.2.0. When downgrading from a higher GI version to a lower GI version,
        if the DG for OCR/VD has a higher compatible setting, relocating OCR/VD to a DG with a lower compatible setting is necessary.
        To find out compatible setting, log on to ASM and query:
        SQL> select name||' => '||compatibility from v$asm_diskgroup where name='GI';
        NAME||'=>'||COMPATIBILITY
        --------------------------------------------------------------------------------
        GI => 11.2.0.0.0


        In the above example, GI is the name of the disk group of interest.
        To relocate OCR from higher compatible DG to lower one:
        ocrconfig -add <diskgroup>
        ocrconfig -delete <disk group>
        To relocate VD from higher compatible DG to lower one:
        crsctl replace votedisk <diskgroup>
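
        For example, a sketch of relocating OCR and the voting files from a hypothetical disk group +GI (higher compatible.asm)
        to a hypothetical disk group +GILOW with a lower compatible.asm setting (run as root):
        # Add an OCR copy on the lower-compatible disk group, then drop the old copy
        ocrconfig -add +GILOW
        ocrconfig -delete +GI
        # Move the voting files in one step
        crsctl replace votedisk +GILOW
        # Verify the new locations
        ocrcheck
        crsctl query css votedisk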


        When upgrading Oracle Clusterware:
        OLD_HOME: pre-upgrade Oracle clusterware home - the home where existing clusterware is running off. For Oracle Restart,
        the OLD_HOME is pre-upgrade ASM home.
        OLD_VERSION: pre-upgrade Oracle clusterware version.
        NEW_HOME: new Oracle clusterware home.
        NEW_VERSION: new Oracle clusterware version.
        OCR Node: The node where rootupgrade.sh backs up the pre-upgrade OCR to $NEW_HOME/cdata/ocr$OLD_VERSION. In most cases
        it's the first node where rootupgrade.sh was executed.
        For example, when upgrading from 11.2.0.1 to 11.2.0.2, after execution of rootupgrade.sh:

        ls -l $NEW_HOME/cdata/ocr*
        -rw-r--r-- 1 root root 78220 Feb 16 10:21 /ocw/b202/cdata/ocr11.2.0.1.0
AODU>

AODU> rac diag   --how to gather RAC diagnostic information

        ****Data Gathering for All Oracle Clusterware Issues****
        TFA Collector is installed in the GI HOME and comes with 11.2.0.4 GI and higher. For GI 11.2.0.3 or lower,
        install the TFA Collector by referring to Document 1513912.1 for instructions on downloading and installing it.
        $GI_HOME/tfa/bin/tfactl diagcollect -from "MMM/dd/yyyy hh:mm:ss" -to "MMM/dd/yyyy hh:mm:ss"
        Format example: "Jul/1/2014 21:00:00"
        Specify the "from time" to be 4 hours before and the "to time" to be 4 hours after the time of error.
        ****Linux/Unix Platform****
        a.Linux/UNIX 11gR2/12cR1
        1. Execute the following as root user:
        # script /tmp/diag.log
        # id
        # env
        # cd <temp-directory-with-plenty-free-space>
        # $GRID_HOME/bin/diagcollection.sh
        # exit
        For more information about diagcollection, check out "diagcollection.sh -help"
        The following .gz files will be generated in the current directory and need to be uploaded along with /tmp/diag.log:
        Linux/UNIX 10gR2/11gR1
        1. Execute the following as root user:
        # script /tmp/diag.log
        # id
        # env
        # cd <temp-directory-with-plenty-free-space>
        # export OCH=<CRS_HOME>
        # export ORACLE_HOME=<DB_HOME>
        # export HOSTNAME=<host>
        # $OCH/bin/diagcollection.pl -crshome=$OCH --collect
        # exit
        The following .gz files will be generated in the current directory and need to be uploaded along with /tmp/diag.log:
        2. For 10gR2 and 11gR1, if getting an error while running root.sh, please collect /tmp/crsctl.*
        Please ensure all of the above information is provided from all nodes.


        ****Windows Platform****
        b.Windows 11gR2/12cR1:
        set ORACLE_HOME=<GRID_HOME>    for example: set ORACLE_HOME=D:\app\11.2.0\grid
        set PATH=%PATH%;%ORACLE_HOME%\perl\bin
        perl %ORACLE_HOME%\bin\diagcollection.pl --collect --crshome %ORACLE_HOME%
        The following .zip files will be generated in the current directory and need to be uploaded:
        crsData_<timestamp>.zip,
        ocrData_<timestamp>.zip,
        oraData_<timestamp>.zip,
        coreData_<timestamp>.zip (only --core option specified)
        For chmosdata*:
        perl %ORACLE_HOME%\bin\diagcollection.pl --collect --crshome %ORACLE_HOME%
        Windows 10gR2/11gR1
        set ORACLE_HOME=<DB_HOME>
        set OCH=<CRS_HOME>
        set ORACLE_BASE=<oracle-base>
        %OCH%\perl\bin\perl %OCH%\bin\diagcollection.pl --collect
AODU>

AODU> rac cvu    --how to use CVU
        ****Cluster verify utility****
        How to Debug CVU / Collect CVU Trace Generated by RUNCLUVFY.SH (Doc ID 986822.1)
        (a) GI/CRS has been installed
        $ script /tmp/cluvfy.log
        $ $GRID_HOME/bin/cluvfy stage -pre crsinst -n <node1, node2...> -verbose
        $ $GRID_HOME/bin/cluvfy stage -post crsinst -n all -verbose
        $ exit
        (b) GI/CRS has not been installed
        run runcluvfy.sh from the installation media or download from OTN http://www.oracle.com/technetwork/database/options/clustering/downloads/index.html
        set the environment variables CV_HOME to point to the cvu home, CV_JDKHOME to point to the JDK home and an optional
        CV_DESTLOC pointing to a writeable area on all nodes (e.g /tmp/cluvfy)
        $ cd $CV_HOME/bin
        $ script cluvfy.log
        $ cluvfy stage -pre crsinst -n <node1, node2...>
        $ exit

        ****Diagcollection options****
        To collect only a subset of logs, --adr together with --beforetime and --aftertime can be used, i.e.:
        # mkdir /tmp/collect
        # $GRID_HOME/bin/diagcollection.sh --adr /tmp/collect -beforetime 20120218100000 -aftertime 20120218050000
        This command will copy all logs containing timestamps between 2012-02-18 05:00 and 10:00 to the /tmp/collect directory.
        Time is specified with YYYYMMDDHHMISS24 format. --adr points to a directory where the logs are copied to.
        From 11.2.0.2 onwards, Cluster Health Monitor(CHM/OS) note 1328466.1 data can also be collected, i.e.:
        # $GRID_HOME/bin/diagcollection.sh --chmos --incidenttime 02/18/201205:00:00 --incidentduration 05:00
        This command will collect data from 2012-02-18 05:00 to 10:00 for 5 hours. incidenttime is specified as MM/DD/YYYY24HH:MM:SS,
        incidentduration is specified as HH:MM.

        ****For 11gR2/12c:****
        Set environment variable CV_TRACELOC to a directory that's writable by grid user; trace should be generated in there once
        runcluvfy.sh starts. For example, as grid user:
        rm -rf /tmp/cvutrace
        mkdir /tmp/cvutrace
        export CV_TRACELOC=/tmp/cvutrace
        export SRVM_TRACE=true
        export SRVM_TRACE_LEVEL=1
        <STAGE_AREA>/runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose

        Note:
        A. STAGE_AREA refers to the location where Oracle Clusterware is unzipped.
        B. Replace the above "runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose"
        if another stage/comp needs to be traced, e.g. "./runcluvfy.sh comp ocr -verbose"
        C. For 12.1.0.2, the following can be set for additional tracing from exectask command:
        export EXECTASK_TRACE=true
        The trace will be in <TMP>/CVU_<version>_<user>/exectask.trc, i.e. /tmp/CVU_12.1.0.2.0_grid/exectask.trc

        ****For 10gR2, 11gR1 or 11gR2:****
        1. As crs/grid user, backup runcluvfy.sh. For 10gR2, it's located in <STAGE_AREA>/cluvfy/runcluvfy.sh;
        and for 11gR1 and 11gR2, <STAGE_AREA>/runcluvfy.sh
        cd <STAGE_AREA>
        cp runcluvfy.sh runcluvfy.debug

        2. Locate the following lines in runcluvfy.debug:
        # Cleanup the home for cluster verification software
        $RM -rf $CV_HOME
        Comment out the remove command so runtime files including trace won't be removed once CVU finishes.
        # Cleanup the home for cluster verification software
        # $RM -rf $CV_HOME


        3. As crs/grid user,set environment variable CV_HOME to anywhere as long as the location is writable by crs/grid user and has 400MB of free space:
        mkdir /tmp/cvdebug
        CV_HOME=/tmp/cvdebug
        export CV_HOME
        This step is optional, if CV_HOME is unset, CVU files will be generated in /tmp.

        4. As crs/grid user, execute runcluvfy.debug:
        export SRVM_TRACE=true
        export SRVM_TRACE_LEVEL=1
        cd <STAGE_AREA>
        ./runcluvfy.debug stage -pre crsinst -n <node1>,<node2> -verbose
        Note:
        A. SRVM_TRACE_LEVEL is effective for 11gR2 only.
        B. Replace the above "runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose" if another stage/comp needs to be
        traced, e.g. "./runcluvfy.sh comp ocr -verbose"

        5. Regardless of whether the above command finishes, CVU trace should be generated in:
        10gR2: $CV_HOME/<pid>/cv/log
        11gR1: $CV_HOME/bootstrap/cv/log
        11gR2: $CV_HOME/bootstrap/cv/log
        If CV_HOME is unset, the trace will be in /tmp/<pid>/cv/log or /tmp/bootstrap/cv/log, depending on the CVU version.

        6. Clean up temporary files generated by above runcluvfy.debug:
        rm -rf $CV_HOME/bootstrap
        ****How to upgrade cluvfy in CRS_HOME (Doc ID 969282.1)****
        Grid Control, for example, uses the cluvfy installed in CRS_HOME, so if Grid Control shows errors when checking the cluster,
        you may want to update the cluvfy it uses.
        As cluvfy consists of many files, it is best to install the newest version outside of CRS_HOME so there
        will be no conflict between the jar files and libraries used by cluvfy and CRS.
        To do so follow these steps:
        1. Download the newest version of cluvfy (e.g. from the OTN link above) and extract the files into the target directory you want to use.
        The following assumptions are made here:
        CRS_HOME=/u01/oracle/crs
        new cluvfy-home CVU_HOME = /u01/oracle/cluvfy
        2. go to CRS_HOME/bin and copy the existing file CRS_HOME/bin/cluvfy to CRS_HOME/bin/cluvfy.org
        3. copy CVU_HOME/bin/cluvfy to CRS_HOME/bin/cluvfy
        4. edit that file and search for the line
        CRSHOME=
        and make the following corrections:
        ORACLE_HOME=$ORA_CRS_HOME
        CRSHOME=$ORACLE_HOME
        CV_HOME=/u01/oracle/cluvfy           <---- check for Your environment
        JREDIR=$CV_HOME/jdk/jre
        DESTLOC=/tmp
        ./runcluvfy.sh stage -pre dbinst -n <node1>,<node2> -verbose | tee /tmp/cluvfy_dbinst.log
        runcluvfy.sh stage -pre  hacfg -verbose
        ./cluvfy stage -pre dbinst -n <node1>,<node2> -verbose | tee /tmp/cluvfy_dbinst.log
        ./runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose | tee /tmp/cluvfy_preinst.log
        ./cluvfy stage -post crsinst -n all -verbose | tee /tmp/cluvfy_postinst.log


        $ ./cluvfy comp -list
        USAGE:
        cluvfy comp  <component-name> <component-specific options>  [-verbose]
        Valid components are:
                nodereach : checks reachability between nodes
                nodecon   : checks node connectivity
                cfs       : checks CFS integrity
                ssa       : checks shared storage accessibility
                space     : checks space availability
                sys       : checks minimum system requirements
                clu       : checks cluster integrity
                clumgr    : checks cluster manager integrity
                ocr       : checks OCR integrity
                olr       : checks OLR integrity
                ha        : checks HA integrity
                crs       : checks CRS integrity
                nodeapp   : checks node applications existence
                admprv    : checks administrative privileges
                peer      : compares properties with peers
                software  : checks software distribution
                asm       : checks ASM integrity
                acfs       : checks ACFS integrity
                gpnp      : checks GPnP integrity
                gns       : checks GNS integrity
                scan      : checks SCAN configuration
                ohasd     : checks OHASD integrity
                clocksync      : checks Clock Synchronization
                vdisk      : check Voting Disk Udev settings


        $ ./cluvfy stage -list
        USAGE:
        cluvfy stage {-pre|-post} <stage-name> <stage-specific options>  [-verbose]
        Valid stage options and stage names are:
                -post hwos    :  post-check for hardware and operating system
                -pre  cfs     :  pre-check for CFS setup
                -post cfs     :  post-check for CFS setup
                -pre  crsinst :  pre-check for CRS installation
                -post crsinst :  post-check for CRS installation
                -pre  hacfg   :  pre-check for HA configuration
                -post hacfg   :  post-check for HA configuration
                -pre  dbinst  :  pre-check for database installation
                -pre  acfscfg  :  pre-check for ACFS Configuration.
                -post acfscfg  :  post-check for ACFS Configuration.
                -pre  dbcfg   :  pre-check for database configuration
                -pre  nodeadd :  pre-check for node addition.
                -post nodeadd :  post-check for node addition.
                -post nodedel :  post-check for node deletion.

        cluvfy comp ssa -n dbnode1,dbnode2 -s

        Database logs & trace files:
        cd $(orabase)/diag/rdbms
        tar cf - $(find . -name '*.trc' -exec egrep -l '<date_time_search_string>' {} \; | grep -v bucket) | gzip >  /tmp/database_trace_files.tar.gz
        ASM logs & trace files:
        cd $(orabase)/diag/asm/+asm/
        tar cf - $(find . -name '*.trc' -exec egrep -l '<date_time_search_string>' {} \; | grep -v bucket) | gzip >  /tmp/asm_trace_files.tar.gz
        Clusteware logs:
        <GI home>/bin/diagcollection.sh --collect --crs --crshome <GI home>
        OS logs:
        /var/adm/messages* or /var/log/messages* or 'errpt -a' or Windows System Event Viewer log (saved as .TXT file)
AODU>

AODU> rac diaginstall   --gathering diagnostics for RAC installation issues
       ****Data Gathering for Oracle Clusterware Installation Issues****
         Failure before executing root script:
        For 11gR2: note 1056322.1 - Troubleshoot 11gR2 Grid Infrastructure/RAC Database runInstaller Issues
        For pre-11.2: note 406231.1 - Diagnosing RAC/RDBMS Installation Problems
         Failure while or after executing root script
        Provide files in Section "Data Gathering for All Oracle Clusterware Issues" and the following:
            root script (root.sh or rootupgrade.sh) screen output
            For 11gR2: provide zip of <$ORACLE_BASE>/cfgtoollogs and <$ORACLE_BASE>/diag for grid user.
            For pre-11.2: Note 240001.1 - Troubleshooting 10g or 11.1 Oracle Clusterware Root.sh Problems
        Before deconfiguring, collect the following as grid user if possible to generate a list of user resources to be
        added back to the cluster after reconfigure finishes:
        $GRID_HOME/bin/crsctl stat res -t
        $GRID_HOME/bin/crsctl stat res -p
        $GRID_HOME/bin/crsctl query css votedisk
        $GRID_HOME/bin/crsctl query crs activeversion
        $GRID_HOME/bin/crsctl check crs
        cat /etc/oracle/ocr.loc /var/opt/oracle/ocr.loc
        $GRID_HOME/bin/crsctl get css diagwait
        $GRID_HOME/bin/ocrcheck
        $GRID_HOME/bin/oifcfg iflist -p -n
        $GRID_HOME/bin/oifcfg getif
        $GRID_HOME/bin/crsctl query css votedisk
        $GRID_HOME/bin/srvctl config nodeapps -a
        $GRID_HOME/bin/srvctl config scan
        $GRID_HOME/bin/srvctl config asm -a
        $GRID_HOME/bin/srvctl config listener -l <listener-name> -a
        $DB_HOME/bin/srvctl config database -d <dbname> -a
        $DB_HOME/bin/srvctl config service -d <dbname> -s <service-name> -v
        # <$GRID_HOME>/crs/install/roothas.pl -deconfig -force -verbose
        OHASD Agents do not start
        Troubleshoot Grid Infrastructure Startup Issues (Doc ID 1050908.1)
        In a nutshell, the operating system starts ohasd, ohasd starts agents to start up daemons (gipcd, mdnsd, gpnpd, ctssd, ocssd, crsd, evmd, asm, etc.),
        and crsd starts agents that start user resources (database, SCAN, listener, etc.).
        OHASD.BIN will spawn four agents/monitors to start resources:
          oraagent: responsible for ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd etc
          orarootagent: responsible for ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs etc
          cssdagent / cssdmonitor: responsible for ora.cssd(for ocssd.bin) and ora.cssdmonitor(for cssdmonitor itself)
        $GRID_HOME/bin/crsctl start res ora.crsd -init
        $GRID_HOME/bin/crsctl start res ora.evmd -init
        $GRID_HOME/bin/crsctl stop res ora.evmd -init
        ps -ef | grep <keyword> | grep -v grep | awk '{print $2}' | xargs kill -9
        If ohasd.bin cannot start any of the above agents properly, the clusterware will not come to a healthy state.
AODU>
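
When the OHASD agents do not start, a rough sketch for narrowing down which layer is failing (run as root from <GRID_HOME>/bin; resource names as listed above):

# Is the OHASD stack itself up?
ps -ef | grep init.ohasd | grep -v grep
crsctl check has
# Which ohasd-managed (init) resources are offline?
crsctl stat res -t -init
# Try starting an individual init resource, for example crsd
crsctl start res ora.crsd -init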

AODU> rac debug   --enabling debug/trace mode for RAC
        ****Troubleshooting Steps****
        The following are the new features in 10.2/11gR1/11gR2:
        Using crsctl, debugging can be turned on and off for CRS/EVM/CSS and their subcomponents. Debug levels can also be dynamically changed
        using crsctl. The debug information is persisted in the OCR for use during the next startup. Debugging can be turned on for CRS-managed
        resources such as VIPs and instances as well.
        * Note: in the following examples, commands with a "#" prompt are executed as the root user; commands with a "$" prompt can be executed as the clusterware owner.
        1. ****Component level logging****
        10.2/11gR1:
        # crsctl debug log css [module:level]{,module:level} ...
          - Turns on debugging for CSS
        # crsctl debug log crs [module:level]{,module:level} ...
          - Turns on debugging for CRS
        # crsctl debug log evm [module:level]{,module:level} ...
          - Turns on debugging for EVM
        For example:
        # crsctl debug log crs "CRSRTI:1,CRSCOMM:2"
        # crsctl debug log evm "EVMD:1"

        11gR2:
        # crsctl set {log|trace} {mdns|gpnp|css|crf|crs|ctss|evm|gipc} "<name1>=<lvl1>,..."
        Set the log/trace levels for specific modules within daemons
        For example:
        # crsctl set log crs "CRSRTI=2,CRSCOMM=2"

        To list all modules:
        10.2/11gR1:
        # crsctl lsmodules {css | crs | evm} - lists the CSS modules that can be used for debugging
        11gR2:
        # crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
        where
          mdns multicast Domain Name Server
          gpnp Grid Plug-n-Play Service
          css Cluster Synchronization Services
          crf Cluster Health Monitor
          crs Cluster Ready Services
          ctss Cluster Time Synchronization Service
          evm EventManager
          gipc Grid Interprocess Communications

        Logging level definition:
        level 0 = turn off
        level 2 = default
        level 3 = verbose
        level 4 = super verbose


        To check current logging level:
        10.2/11gR1:
        For CSS:
        $ grep clssscSetDebugLevel <ocssd logs>
        For CRS / EVMD:
        $ grep "ENV Logging level for Module" <crsd / evmd logs>


        11gR2:
        $ crsctl get log <modules> ALL
        For example:
        $ crsctl get log css ALL
        Get CSSD Module: CLSF  Log Level: 0
        Get CSSD Module: CSSD  Log Level: 2
        Get CSSD Module: GIPCCM  Log Level: 2
        Get CSSD Module: GIPCGM  Log Level: 2
        Get CSSD Module: GIPCNM  Log Level: 2
        Get CSSD Module: GPNP  Log Level: 1
        Get CSSD Module: OLR  Log Level: 0
        Get CSSD Module: SKGFD  Log Level: 0


        2. ****Component level debugging****
           Debugging can be turned on for CRS and EVM and their specific modules by setting environment variables or through crsctl.
        To turn on tracing for all modules:
           ORA_CRSDEBUG_ALL
        To turn on tracing for a specific sub module:
           ORA_CRSDEBUG_<modulename>


        3. ****CRS stack startup and shutdown****
           Using crsctl, the entire CRS stack and the resources can be started and stopped.
        # crsctl start crs
        # crsctl stop crs


        4. ****Diagnostics collection script - diagcollection.pl****
        This script is for collecting diagnostic information from a CRS installation. The diagnostics are necessary for development
        to be able to help with SRs, bugs and other problems that may arise in the field.
        10.2
        # <$CRS_HOME>/bin/diagcollection.pl
        11.1
        # <$CRS_HOME>/bin/diagcollection.pl -crshome=$ORA_CRS_HOME --collect
        11.2
        # <$GRID_HOME>/bin/diagcollection.sh
        For more details, please refer to Document 330358.1 CRS 10gR2/ 11gR1/ 11gR2 Diagnostic Collection Guide

        5. ****Unified log directory structure****
           The following log directory structure is in place for 10.2 onwards in an effort to consolidate the log files of different Clusterware
        components for easy diag information retrieval and problem analysis.
        Objectives
           1. Need one place where all the CRS related log files can be located.
           2. The directory structure needs to be intuitive for any user
           3. Permission, ownership issues need to be addressed
           4. Disk space issues need to be considered
           10.2/11gR1
            Alert file
               CRS_HOME/log/<host>/alert<hostname>.log
            CRS component directories
               CRS_HOME/log/
               CRS_HOME/log/<host>
               CRS_HOME/log/<host>/crsd
               CRS_HOME/log/<host>/cssd
               CRS_HOME/log/<host>/evmd
               CRS_HOME/log/<host>/client
               CRS_HOME/log/<host>/racg
           11gR2:
            GI components directories:
              <GRID_HOME>/log/<host>/crsd
              <GRID_HOME>/log/<host>/cssd
              <GRID_HOME>/log/<host>/admin
              <GRID_HOME>/log/<host>/agent
              <GRID_HOME>/log/<host>/evmd
              <GRID_HOME>/log/<host>/client
              <GRID_HOME>/log/<host>/ohasd
              <GRID_HOME>/log/<host>/mdnsd
              <GRID_HOME>/log/<host>/gipcd
              <GRID_HOME>/log/<host>/gnsd
              <GRID_HOME>/log/<host>/ctssd
              <GRID_HOME>/log/<host>/racg
        Log files of the respective components go to the above directories. Core files are dropped in the same directory as the logs.
        Old core files will be backed up.
        6. ****CRS and EVM Alerts****
        CRS and EVM post alert messages on the occurrence of important events.
        CRSD
        [NORMAL] CLSD-1201: CRSD started on host %s.
        [ERROR] CLSD-1202: CRSD aborted on host %s. Error [%s]. Details in %s.
        [ERROR] CLSD-1203: Failover failed for the CRS resource %s. Details in %s.
        [NORMAL] CLSD-1204: Recovering CRS resources for host %s
        [ERROR] CLSD-1205: Auto-start failed for the CRS resource %s. Details in %s.
        EVMD
        [NORMAL] CLSD-1401: EVMD started on node %s
        [ERROR] CLSD-1402: EVMD aborted on node %s. Error [%s]. Details in %s.

        7. ****Resource debugging****
           Using crsctl, debugging can be turned on for resources.
           10.2/11gR1:
        # crsctl debug log res <resname:level>
        - Turns on debugging for resources
        For example:
        # crsctl debug log res ora.racnode.vip:2
            11gR2:
        # crsctl set log res <resname>=<lvl> [-init]
        Set the log levels for resources including init resource, check crsctl stat res -t and crsctl stat res -t -init for resource name.
        For example:
        # crsctl set log res ora.racnode.vip=2
        # crsctl set log res ora.cssdmonitor=2 -init
         This sets debugging on for the resource in the form of an OCR key.

        8. ****Health Check****
        To determine the health of the CRS stack:
        $ crsctl check crs
        To determine health of individual daemons:
        $ crsctl check css
        $ crsctl check evm
        or 11gR2:
        $ crsctl stat res -t -init
        9. ****OCR debugging****
        In 10.1, to debug OCR, the $ORA_CRS_HOME/srvm/ocrlog.ini was updated to a level higher than 0. Starting with 10.2 it is possible to
        debug the OCR at the component level using the following commands
        a)   Edit $ORA_CRS_HOME/srvm/admin/ocrlog.ini to the component level
        Eg: comploglvl="OCRAPI:5;OCRCLI:5;OCRSRV:5;OCRMAS:5;OCRCAC:5"

        b) Use the dynamic feature to update the logging into the OCR itself using the command crsctl
        As in step 1, the CRS module names associated with OCR are:
        10.2/11gR1:
        CRSOCR

         11gR2:
        CRSOCR
        OCRAPI
        OCRASM
        OCRCAC
        OCRCLI
        OCRMAS
        OCRMSG
        OCROSD
        OCRRAW
        OCRSRV
        OCRUTL
AODU>
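
As an example for 11gR2, a sketch of raising the log level for the OCR-related modules of the crsd daemon using the module names listed above (level 5 is assumed here as the desired trace level; run as root):

# Raise logging for the OCR-related modules within crsd
crsctl set log crs "OCRAPI=5,OCRCLI=5,OCRSRV=5"
# Check the level currently in effect for one of the modules
crsctl get log crs OCRAPI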

AODU> rac eviction  --gathering diagnostics for RAC node reboots and evictions
        ****Data Gathering for Node Reboot/Eviction****
        Provide the files in Section "Data Gathering for All Oracle Clusterware Issues" and the following:
        Approximate date and time of the reboot, and the hostname of the rebooted node
        OSWatcher archives which cover the reboot time at an interval of 20 seconds with private network monitoring configured.
        Note 301137.1 - OS Watcher User Guide
        Note.433472.1 - OS Watcher For Windows (OSWFW) User Guide
        For pre-11.2, zip of /var/opt/oracle/oprocd/* or /etc/oracle/oprocd/*
        For pre-11.2, OS logs - refer to Section Appendix B
        For 11gR2+, zip of /etc/oracle/lastgasp/* or /var/opt/oracle/lastgasp/*
        CHM/OS data that covers the reboot time for platforms where it is available,
        refer to Note 1328466.1 for section "How do I collect the Cluster Health Monitor data"
        The Cluster Health Monitor is integrated part of 11.2.0.2 Oracle Grid Infrastructure for Linux (not on Linux Itanium) and
        Solaris (Sparc 64 and x86-64 only), so installing 11.2.0.2 Oracle Grid Infrastructure on those platforms will automatically
        install the Cluster Health Monitor. AIX will have the Cluster Health Monitor starting from 11.2.0.3.
        The Cluster Health Monitor is also enabled for Windows (except Windows Itanium) in 11.2.0.3.
        ora.crf is the Cluster Health Monitor resource name that ohasd manages. Issue "crsctl stat res -t -init" to check the current status
        of the Cluster Health Monitor.
        For example, issue "<GI_HOME>/bin/diagcollection.pl --collect --crshome $ORA_CRS_HOME --chmos
        --incidenttime <start time of interesting time period> --incidentduration 05:00"


        What logs and data should I gather before logging a SR for the Cluster Health Monitor error?
        1) provide 3-4 pstack outputs over a minute for osysmond.bin
        2) output of strace -v for osysmond.bin for about 2 minutes.
        3) strace -cp <osysmond.bin pid> for about 2 min
        4) oclumon dumpnodeview -v output for that node for 2 min.
        5) output of "uname -a"
        6) output of "ps -eLf|grep osysmond.bin"
        7) The ologgerd and sysmond log files in the CRS_HOME/log/<host name> directory from all nodes
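        A rough sketch of gathering items 1) to 6) above on the affected node, assuming a Linux system with pgrep, pstack, strace
        and the coreutils timeout command available (output file names are only placeholders):
        # Locate the osysmond.bin process
        SYSMOND_PID=$(pgrep -f osysmond.bin)
        # 1) 3-4 pstack snapshots spread over a minute
        for i in 1 2 3 4; do pstack $SYSMOND_PID > /tmp/osysmond_pstack_$i.txt; sleep 15; done
        # 2) and 3) strace output for about 2 minutes: a verbose trace, then a syscall summary
        timeout 120 strace -v -p $SYSMOND_PID -o /tmp/osysmond_strace.txt
        timeout 120 strace -c -p $SYSMOND_PID -o /tmp/osysmond_strace_summary.txt
        # 4) oclumon node view covering roughly the same window
        $GRID_HOME/bin/oclumon dumpnodeview -v -last "00:02:00" > /tmp/oclumon_nodeview.txt
        # 5) and 6) platform and process/thread details
        uname -a > /tmp/uname.txt
        ps -eLf | grep osysmond.bin | grep -v grep > /tmp/osysmond_ps.txt
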
        How to start and stop CHM that is installed as a part of GI in 11.2 and higher?
        The ora.crf resource in 11.2 GI (and higher) is the resource for CHM, and the ora.crf resource is managed by ohasd.
        Starting and stopping ora.crf resource starts and stops CHM.
        To stop CHM (or ora.crf resource managed by ohasd)
        $GRID_HOME/bin/crsctl stop res ora.crf -init
        To start CHM (or ora.crf resource managed by ohasd)
        $GRID_HOME/bin/crsctl start res ora.crf -init
        If vendor clusterware is being used, upload the vendor clusterware logs
AODU>



