The rac function in aodu covers the following: how to change archivelog mode, common terminology, commonly used troubleshooting scripts, setting debug modes, how to use CVU (Cluster Verify Utility), common crsctl/srvctl commands, the ocrconfig command, the olsnodes command, and more.
How do you use the rac function?
[oracle@db1 ~]$ ./aodu
AT Oracle Database Utility, Release 1.1.0 on Tue Jun 14 14:28:01 2016
Copyright (c) 2014, 2015, Robin.Han. All rights reserved.
http://ohsdba.cn
E-Mail:375349564@qq.com
AODU>
AODU> rac ohsdba
rac archivelog|general|abbr|diag|cvu|diaginstall|debug|eviction|diagrac|perf|srvctl|crsctl|ocrconfig|olsnodes|debug|note
AODU> rac oracle
Currently it's for internal use only
AODU>
Note: only "rac ohsdba" lists the commonly used commands inside the function, and of course only readers of this article will know how to use it.
AODU> rac archivelog
****Change archivelog mode****
The following steps need to be taken to enable archive logging in a RAC database environment:
$ srvctl stop database -d <db_unique_name>
$ srvctl start database -d <db_unique_name> -o mount
$ sqlplus / as sysdba
sql> alter database archivelog;
sql> exit;
$ srvctl stop database -d <db_unique_name>
$ srvctl start database -d <db_unique_name>
sql> archive log list;
Note: from 10g onwards you do not need to change the parameter cluster_database
AODU>
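As a consolidated sketch of the sequence above, assuming a hypothetical db_unique_name of orcl (the name and the here-document wrapper are illustrative, not part of aodu's output); run as the oracle software owner on any one node:

srvctl stop database -d orcl                  # stop all instances cluster-wide
srvctl start database -d orcl -o mount        # mount every instance without opening
sqlplus -s / as sysdba <<'EOF'
-- switch the mounted database to ARCHIVELOG mode
alter database archivelog;
exit
EOF
srvctl stop database -d orcl                  # restart so all instances open normally
srvctl start database -d orcl
sqlplus -s / as sysdba <<'EOF'
-- expect "Database log mode: Archive Mode"
archive log list
exit
EOF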
AODU> rac general ---11gR2 key facts and the RAC startup sequence
****11gR2 Clusterware Key Facts****
11gR2 Clusterware is required to be up and running prior to installing a 11gR2 Real Application Clusters database.
The GRID home consists of the Oracle Clusterware and ASM. ASM should not be in a separate home.
The 11gR2 Clusterware can be installed in "Standalone" mode for ASM and/or "Oracle Restart" single node support. This clusterware is a subset of the full clusterware described in this document.
The 11gR2 Clusterware can be run by itself or on top of vendor clusterware. See the certification matrix for certified combinations. Ref: Note: 184875.1 "How To Check The Certification Matrix for Real Application Clusters"
The GRID Home and the RAC/DB Home must be installed in different locations.
The 11gR2 Clusterware requires shared OCR files and voting files. These can be stored on ASM or a cluster filesystem.
The OCR is backed up automatically every 4 hours to <GRID_HOME>/cdata/<clustername>/ and can be restored via ocrconfig.
The voting file is backed up into the OCR at every configuration change and can be restored via crsctl.
The 11gR2 Clusterware requires at least one private network for inter-node communication and at least one public network for external communication. Several virtual IPs need to be registered with DNS. This includes the node VIPs (one per node) and the SCAN VIPs (three). This can be done manually via your network administrator, or optionally you could configure the "GNS" (Grid Naming Service) in the Oracle clusterware to handle this for you (note that GNS requires its own VIP).
A SCAN (Single Client Access Name) is provided to clients to connect to. For more information on SCAN see Note: 887522.1
The root.sh script at the end of the clusterware installation starts the clusterware stack. For information on troubleshooting root.sh issues see Note: 1053970.1
Only one set of clusterware daemons can be running per node.
On Unix, the clusterware stack is started via the init.ohasd script referenced in /etc/inittab with "respawn".
A node can be evicted (rebooted) if it is deemed to be unhealthy. This is done so that the health of the entire cluster can be maintained. For more information on this see: Note: 1050693.1 "Troubleshooting 11.2 Clusterware Node Evictions (Reboots)"
Either have vendor time synchronization software (like NTP) fully configured and running, or have it not configured at all and let CTSS handle time synchronization. See Note: 1054006.1 for more information.
If installing DB homes for a lower version, you will need to pin the nodes in the clusterware or you will see ORA-29702 errors. See Note 946332.1 and Note 948456.1 for more information.
The clusterware stack can be started by either booting the machine, running "crsctl start crs" to start the clusterware stack, or running "crsctl start cluster" to start the clusterware on all nodes. Note that crsctl is in the <GRID_HOME>/bin directory and that "crsctl start cluster" will only work if ohasd is running.
The clusterware stack can be stopped by either shutting down the machine, running "crsctl stop crs" to stop the clusterware stack, or running "crsctl stop cluster" to stop the clusterware on all nodes. Killing clusterware daemons is not supported.
The instance is now part of the .db resources in "crsctl stat res -t" output; there is no separate .inst resource for an 11gR2 instance.
****Cluster Start Sequence****
Level 1: OHASD spawns 4 processes:
  cssdagent - Agent responsible for spawning CSSD.
  orarootagent - Agent responsible for managing all root owned ohasd resources.
  oraagent - Agent responsible for managing all oracle owned ohasd resources.
  cssdmonitor - Monitors CSSD and node health (along with the cssdagent).
Level 2: OHASD rootagent spawns:
  CRSD - Primary daemon responsible for managing cluster resources.
  CTSSD - Cluster Time Synchronization Services Daemon
  Diskmon
  ACFS (ASM Cluster File System) Drivers
Level 2: OHASD oraagent spawns:
  MDNSD - Used for DNS lookup
  GIPCD - Used for inter-process and inter-node communication
  GPNPD - Grid Plug & Play Profile Daemon
  EVMD - Event Monitor Daemon
  ASM - Resource for monitoring ASM instances
Level 3: CRSD spawns:
  orarootagent - Agent responsible for managing all root owned crsd resources.
  oraagent - Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
  Network resource - To monitor the public network
  SCAN VIP(s) - Single Client Access Name Virtual IPs
  Node VIPs - One per node
  ACFS Registry - For mounting ASM Cluster File System
  GNS VIP (optional) - VIP for GNS
Level 4: CRSD oraagent spawns:
  ASM Resource - ASM Instance(s) resource
  Diskgroup - Used for managing/monitoring ASM diskgroups.
  DB Resource - Used for monitoring and managing the DB and instances
  SCAN Listener - Listener for single client access name, listening on SCAN VIP
  Listener - Node listener listening on the Node VIP
  Services - Used for monitoring and managing services
  ONS - Oracle Notification Service
  eONS - Enhanced Oracle Notification Service
  GSD - For 9i backward compatibility
  GNS (optional) - Grid Naming Service - Performs name resolution
AODU>
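A quick way to see the daemons from the start sequence above on a running node (a hedged sketch; the GRID_HOME path is an assumption for illustration):

export GRID_HOME=/u01/app/11.2.0/grid              # assumed location, adjust to your install
$GRID_HOME/bin/crsctl check cluster -all           # stack status on every node
$GRID_HOME/bin/crsctl stat res -t -init            # ohasd-managed (level 1/2) resources
$GRID_HOME/bin/crsctl stat res -t                  # crsd-managed (level 3/4) resources
ps -ef | egrep 'ohasd|cssdagent|cssdmonitor|orarootagent|oraagent' | grep -v grep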
AODU> rac abbr ---RAC terminology
****Abbreviations, Acronyms****
This note lists commonly used Oracle Clusterware (Cluster Ready Service or Grid Infrastructure) related abbreviations, acronyms, terms and procedures.
nodename: short hostname for the local node. For example, racnode1 for node racnode1.us.oracle.com
CRS: Cluster Ready Service, name for pre-11gR2 Oracle clusterware
GI: Grid Infrastructure, name for 11gR2 Oracle clusterware
GI cluster: Grid Infrastructure in cluster mode
Oracle Restart: GI Standalone, Grid Infrastructure in standalone mode
ASM user: the OS user who installs/owns ASM. For 11gR2, the ASM and grid user is the same as ASM and GI share the same ORACLE_HOME. For a pre-11gR2 CRS cluster, the ASM and CRS user can be different as ASM and CRS will be in different ORACLE_HOMEs. For pre-11gR2 single-instance ASM, the ASM and local CRS user is the same as ASM and local CRS share the same home.
CRS user: the OS user who installs/owns pre-11gR2 Oracle clusterware
grid user: the OS user who installs/owns 11gR2 Oracle clusterware
clusterware user: CRS or grid user, which must be the same in an upgrade environment
Oracle Clusterware software owner: same as clusterware user
clusterware home: CRS or GI home
ORACLE_BASE: ORACLE_BASE for the grid or CRS user
root script checkpoint file: the file that records root script (root.sh or rootupgrade.sh) progress so the root script can be re-executed; it is located in $ORACLE_BASE/Clusterware/ckptGridHA_${nodename}.xml
OCR: Oracle Cluster Registry. To find out the OCR location, execute: ocrcheck
VD: Voting Disk. To find out the voting file location, execute: crsctl query css votedisk
Automatic OCR Backup: OCR is backed up automatically every four hours in a cluster environment on the OCR Master node; the default location is <clusterware-home>/cdata/<clustername>. To find out the backup location, execute: ocrconfig -showbackup
SCR Base: the directory where ocr.loc and olr.loc are located. Linux: /etc/oracle  Solaris: /var/opt/oracle  hp-ux: /var/opt/oracle  AIX: /etc/oracle
INITD Location: the directory where ohasd and init.ohasd are located. Linux: /etc/init.d  Solaris: /etc/init.d  hp-ux: /sbin/init.d  AIX: /etc
oratab Location: the directory where oratab is located. Linux: /etc  Solaris: /var/opt/oracle  hp-ux: /etc  AIX: /etc
CIL: Central Inventory Location. The location is defined by the parameter inventory_loc in /etc/oraInst.loc or /var/opt/oracle/oraInst.loc, depending on platform. Example on Linux:
  cat /etc/oraInst.loc | grep inventory_loc
  inventory_loc=/home/ogrid/app/oraInventory
Disable CRS/GI: to disable clusterware from auto startup when the node reboots, as root execute "crsctl disable crs". Replace it with "crsctl stop has" for Oracle Restart.
DG Compatible: ASM Disk Group compatible.asm setting. To store OCR/VD on ASM, the compatible setting must be at least 11.2.0.0.0, but on the other hand a lower GI version won't work with a higher compatible setting. For example, 11.2.0.1 GI will have issues accessing a DG if compatible.asm is set to 11.2.0.2.0. When downgrading from a higher GI version to a lower GI version, if the DG for OCR/VD has a higher compatible setting, relocating OCR/VD to a DG with a lower compatible setting is necessary. To find out the compatible setting, log on to ASM and query:
  SQL> select name||' => '||compatibility from v$asm_diskgroup where name='GI';
  NAME||'=>'||COMPATIBILITY
  --------------------------------------------------------------------------------
  GI => 11.2.0.0.0
In the above example, GI is the name of the disk group of interest.
To relocate OCR from a higher compatible DG to a lower one:
  ocrconfig -add <diskgroup>
  ocrconfig -delete <diskgroup>
To relocate VD from a higher compatible DG to a lower one:
  crsctl replace votedisk <diskgroup>
When upgrading Oracle Clusterware:
OLD_HOME: pre-upgrade Oracle clusterware home - the home where the existing clusterware is running from. For Oracle Restart, the OLD_HOME is the pre-upgrade ASM home.
OLD_VERSION: pre-upgrade Oracle clusterware version.
NEW_HOME: new Oracle clusterware home.
NEW_VERSION: new Oracle clusterware version.
OCR Node: the node where rootupgrade.sh backs up the pre-upgrade OCR to $NEW_HOME/cdata/ocr$OLD_VERSION. In most cases it is the first node where rootupgrade.sh was executed. Example when upgrading from 11.2.0.1 to 11.2.0.2, after execution of rootupgrade.sh:
  ls -l $NEW_HOME/cdata/ocr*
  -rw-r--r-- 1 root root 78220 Feb 16 10:21 /ocw/b202/cdata/ocr11.2.0.1.0
AODU>
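A hedged sketch of the OCR/voting-file relocation described above, assuming hypothetical disk groups +GI (the current, higher-compatible group) and +DATA (the target); the OCR commands must run as root:

ocrconfig -add +DATA            # add an OCR copy on the target disk group
ocrconfig -delete +GI           # remove the copy from the old disk group
crsctl replace votedisk +DATA   # move all voting files to the target disk group
ocrcheck                        # confirm the new OCR location
crsctl query css votedisk       # confirm the new voting file location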
AODU> rac diag --how to collect RAC diagnostic information
****Data Gathering for All Oracle Clusterware Issues****
TFA Collector is installed in the GI HOME and comes with 11.2.0.4 GI and higher. For GI 11.2.0.3 or lower, install the TFA Collector by referring to Document 1513912.1 for instructions on downloading and installing it.
$GI_HOME/tfa/bin/tfactl diagcollect -from "MMM/dd/yyyy hh:mm:ss" -to "MMM/dd/yyyy hh:mm:ss"
Format example: "Jul/1/2014 21:00:00"
Specify the "from time" to be 4 hours before and the "to time" to be 4 hours after the time of the error.
****Linux/Unix Platform****
a. Linux/UNIX 11gR2/12cR1
1. Execute the following as root user:
# script /tmp/diag.log
# id
# env
# cd <temp-directory-with-plenty-free-space>
# $GRID_HOME/bin/diagcollection.sh
# exit
For more information about diagcollection, check out "diagcollection.sh -help"
The following .gz files will be generated in the current directory and need to be uploaded along with /tmp/diag.log:
Linux/UNIX 10gR2/11gR1
1. Execute the following as root user:
# script /tmp/diag.log
# id
# env
# cd <temp-directory-with-plenty-free-space>
# export OCH=<CRS_HOME>
# export ORACLE_HOME=<DB_HOME>
# export HOSTNAME=<host>
# $OCH/bin/diagcollection.pl -crshome=$OCH --collect
# exit
The following .gz files will be generated in the current directory and need to be uploaded along with /tmp/diag.log:
2. For 10gR2 and 11gR1, if getting an error while running root.sh, please collect /tmp/crsctl.*
Please ensure all of the above information is provided from all the nodes.
****Windows Platform****
b. Windows 11gR2/12cR1:
set ORACLE_HOME=<GRID_HOME>
for example: set ORACLE_HOME=D:\app\11.2.0\grid
set PATH=%PATH%;%ORACLE_HOME%\perl\bin
perl %ORACLE_HOME%\bin\diagcollection.pl --collect --crshome %ORACLE_HOME%
The following .zip files will be generated in the current directory and need to be uploaded: crsData_<timestamp>.zip, ocrData_<timestamp>.zip, oraData_<timestamp>.zip, coreData_<timestamp>.zip (only if the --core option is specified)
For chmosdata*: perl %ORACLE_HOME%\bin\diagcollection.pl --collect --crshome %ORACLE_HOME%
Windows 10gR2/11gR1
set ORACLE_HOME=<DB_HOME>
set OCH=<CRS_HOME>
set ORACLE_BASE=<oracle-base>
%OCH%\perl\bin\perl %OCH%\bin\diagcollection.pl --collect
AODU>
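For example (a hedged sketch: the error time and scratch directory are hypothetical, giving the suggested +/- 4 hour window around the error):

$GI_HOME/tfa/bin/tfactl diagcollect -from "Jul/1/2014 17:00:00" -to "Jul/2/2014 01:00:00"
# 11gR2/12cR1 clusterware collection, run as root and captured with script(1) as above
script /tmp/diag.log
cd /u01/stage/diag                # assumed scratch directory with plenty of free space
$GRID_HOME/bin/diagcollection.sh
exit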
AODU> rac cvu --how to use CVU
****Cluster verify utility****
How to Debug CVU / Collect CVU Trace Generated by RUNCLUVFY.SH (Doc ID 986822.1)
(a) GI/CRS has been installed
$ script /tmp/cluvfy.log
$ $GRID_HOME/bin/cluvfy stage -pre crsinst -n <node1, node2...> -verbose
$ $GRID_HOME/bin/cluvfy stage -post crsinst -n all -verbose
$ exit
(b) GI/CRS has not been installed
Run runcluvfy.sh from the installation media or download it from OTN: http://www.oracle.com/technetwork/database/options/clustering/downloads/index.html
Set the environment variable CV_HOME to point to the cvu home, CV_JDKHOME to point to the JDK home and, optionally, CV_DESTLOC to point to a writeable area on all nodes (e.g. /tmp/cluvfy)
$ cd $CV_HOME/bin
$ script cluvfy.log
$ cluvfy stage -pre crsinst -n <node1, node2...>
$ exit
****Diagcollection options****
To collect only a subset of logs, --adr together with --beforetime and --aftertime can be used, i.e.:
# mkdir /tmp/collect
# $GRID_HOME/bin/diagcollection.sh --adr /tmp/collect -beforetime 20120218100000 -aftertime 20120218050000
This command will copy all logs containing a timestamp between 2012-02-18 05:00 and 10:00 to the /tmp/collect directory. Time is specified in YYYYMMDDHHMISS24 format. --adr points to a directory where the logs are copied to.
From 11.2.0.2 onwards, Cluster Health Monitor (CHM/OS, note 1328466.1) data can also be collected, i.e.:
# $GRID_HOME/bin/diagcollection.sh --chmos --incidenttime 02/18/201205:00:00 --incidentduration 05:00
This command will collect data from 2012-02-18 05:00 to 10:00, i.e. for 5 hours. incidenttime is specified as MM/DD/YYYY24HH:MM:SS, incidentduration is specified as HH:MM.
****For 11gR2/12c:****
Set the environment variable CV_TRACELOC to a directory that is writable by the grid user; the trace should be generated there once runcluvfy.sh starts. For example, as grid user:
rm -rf /tmp/cvutrace
mkdir /tmp/cvutrace
export CV_TRACELOC=/tmp/cvutrace
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=1
<STAGE_AREA>/runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose
Note:
A. STAGE_AREA refers to the location where Oracle Clusterware is unzipped.
B. Replace the above "runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose" if another stage/comp needs to be traced, i.e. "./runcluvfy.sh comp ocr -verbose"
C. For 12.1.0.2, the following can be set for additional tracing from the exectask command: export EXECTASK_TRACE=true
The trace will be in <TMP>/CVU_<version>_<user>/exectask.trc, i.e. /tmp/CVU_12.1.0.2.0_grid/exectask.trc
****For 10gR2, 11gR1 or 11gR2:****
1. As the crs/grid user, back up runcluvfy.sh. For 10gR2 it is located in <STAGE_AREA>/cluvfy/runcluvfy.sh; for 11gR1 and 11gR2, <STAGE_AREA>/runcluvfy.sh
cd <STAGE_AREA>
cp runcluvfy.sh runcluvfy.debug
2. Locate the following lines in runcluvfy.debug:
# Cleanup the home for cluster verification software
$RM -rf $CV_HOME
Comment out the remove command so runtime files including the trace won't be removed once CVU finishes:
# Cleanup the home for cluster verification software
# $RM -rf $CV_HOME
3. As the crs/grid user, set the environment variable CV_HOME to any location that is writable by the crs/grid user and has 400MB of free space:
mkdir /tmp/cvdebug
CV_HOME=/tmp/cvdebug
export CV_HOME
This step is optional; if CV_HOME is unset, CVU files will be generated in /tmp.
4. As the crs/grid user, execute runcluvfy.debug:
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=1
cd <STAGE_AREA>
./runcluvfy.debug stage -pre crsinst -n <node1>,<node2> -verbose
Note:
A. SRVM_TRACE_LEVEL is effective for 11gR2 only.
B. Replace the above "runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose" if another stage/comp needs to be traced, i.e. "./runcluvfy.sh comp ocr -verbose"
5. Regardless of whether the above command finishes or not, the CVU trace should be generated in:
10gR2: $CV_HOME/<pid>/cv/log
11gR1: $CV_HOME/bootstrap/cv/log
11gR2: $CV_HOME/bootstrap/cv/log
If CV_HOME is unset, the trace will be in /tmp/<pid>/cv/log or /tmp/bootstrap/cv/log, depending on the CVU version.
6. Clean up temporary files generated by the above runcluvfy.debug:
rm -rf $CV_HOME/bootstrap
HOW TO UPGRADE CLUVFY IN CRS_HOME (Doc ID 969282.1)
Grid Control, for example, uses the installed cluvfy from CRS_HOME, so if Grid Control shows errors when checking the cluster you may want to update the cluvfy it uses. As cluvfy consists of many files, it is best to install the newest version outside of CRS_HOME so there will be no conflict between the jar files and libraries used by cluvfy and CRS. To do so follow these steps:
1. Download the newest version of cluvfy and extract the files into the target directory you want to use. The following assumptions are made here: CRS_HOME=/u01/oracle/crs, new cluvfy home CVU_HOME=/u01/oracle/cluvfy
2. Go to CRS_HOME/bin and copy the existing file CRS_HOME/bin/cluvfy to CRS_HOME/bin/cluvfy.org
3. Copy CVU_HOME/bin/cluvfy to CRS_HOME/bin/cluvfy
4. Edit that file, search for the line starting with CRSHOME=, and make the following corrections:
ORACLE_HOME=$ORA_CRS_HOME
CRSHOME=$ORACLE_HOME
CV_HOME=/u01/oracle/cluvfy  <---- check for your environment
JREDIR=$CV_HOME/jdk/jre
DESTLOC=/tmp
Examples:
./runcluvfy.sh stage -pre dbinst -n <node1>,<node2> -verbose | tee /tmp/cluvfy_dbinst.log
runcluvfy.sh stage -pre hacfg -verbose
./cluvfy stage -pre dbinst -n <node1>,<node2> -verbose | tee /tmp/cluvfy_dbinst.log
./runcluvfy.sh stage -pre crsinst -n <node1>,<node2> -verbose | tee /tmp/cluvfy_preinst.log
./cluvfy stage -post crsinst -n all -verbose | tee /tmp/cluvfy_postinst.log
$ ./cluvfy comp -list
USAGE:
cluvfy comp <component-name> <component-specific options> [-verbose]
Valid components are:
nodereach : checks reachability between nodes
nodecon : checks node connectivity
cfs : checks CFS integrity
ssa : checks shared storage accessibility
space : checks space availability
sys : checks minimum system requirements
clu : checks cluster integrity
clumgr : checks cluster manager integrity
ocr : checks OCR integrity
olr : checks OLR integrity
ha : checks HA integrity
crs : checks CRS integrity
nodeapp : checks node applications existence
admprv : checks administrative privileges
peer : compares properties with peers
software : checks software distribution
asm : checks ASM integrity
acfs : checks ACFS integrity
gpnp : checks GPnP integrity
gns : checks GNS integrity
scan : checks SCAN configuration
ohasd : checks OHASD integrity
clocksync : checks Clock Synchronization
vdisk : checks Voting Disk Udev settings
$ ./cluvfy stage -list
USAGE:
cluvfy stage {-pre|-post} <stage-name> <stage-specific options> [-verbose]
Valid stage options and stage names are:
-post hwos : post-check for hardware and operating system
-pre cfs : pre-check for CFS setup
-post cfs : post-check for CFS setup
-pre crsinst : pre-check for CRS installation
-post crsinst : post-check for CRS installation
-pre hacfg : pre-check for HA configuration
-post hacfg : post-check for HA configuration
-pre dbinst : pre-check for database installation
-pre acfscfg : pre-check for ACFS Configuration.
-post acfscfg : post-check for ACFS Configuration.
-pre dbcfg : pre-check for database configuration
-pre nodeadd : pre-check for node addition.
-post nodeadd : post-check for node addition.
-post nodedel : post-check for node deletion.
cluvfy comp ssa -n dbnode1,dbnode2 -s
Database logs & trace files:
cd $(orabase)/diag/rdbms
tar cf - $(find . -name '*.trc' -exec egrep -l '<date_time_search_string>' {} \; | grep -v bucket) | gzip > /tmp/database_trace_files.tar.gz
ASM logs & trace files:
cd $(orabase)/diag/asm/+asm/
tar cf - $(find . -name '*.trc' -exec egrep -l '<date_time_search_string>' {} \; | grep -v bucket) | gzip > /tmp/asm_trace_files.tar.gz
Clusterware logs:
<GI home>/bin/diagcollection.sh --collect --crs --crshome <GI home>
OS logs:
/var/adm/messages* or /var/log/messages* or 'errpt -a' or Windows System Event Viewer log (saved as .TXT file)
AODU>
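A hedged sketch tying the 11gR2 tracing steps above together for a pre-install check (the node names racnode1/racnode2 and the stage area path are assumptions for illustration):

rm -rf /tmp/cvutrace && mkdir /tmp/cvutrace
export CV_TRACELOC=/tmp/cvutrace        # 11gR2: CVU traces land here
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=1
cd /u01/stage/grid                      # assumed location of the unzipped clusterware
./runcluvfy.sh stage -pre crsinst -n racnode1,racnode2 -verbose | tee /tmp/cluvfy_preinst.log
ls /tmp/cvutrace                        # traces appear once runcluvfy.sh starts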
AODU> rac diaginstall --collecting diagnostics for RAC installation issues
****Data Gathering for Oracle Clusterware Installation Issues****
Failure before executing the root script:
For 11gR2: note 1056322.1 - Troubleshoot 11gR2 Grid Infrastructure/RAC Database runInstaller Issues
For pre-11.2: note 406231.1 - Diagnosing RAC/RDBMS Installation Problems
Failure while or after executing the root script:
Provide the files in Section "Data Gathering for All Oracle Clusterware Issues" and the following:
root script (root.sh or rootupgrade.sh) screen output
For 11gR2: provide a zip of <$ORACLE_BASE>/cfgtoollogs and <$ORACLE_BASE>/diag for the grid user.
For pre-11.2: Note 240001.1 - Troubleshooting 10g or 11.1 Oracle Clusterware Root.sh Problems
Before deconfiguring, collect the following as the grid user if possible to generate a list of user resources to be added back to the cluster after the reconfigure finishes (see the sketch after this output):
$GRID_HOME/bin/crsctl stat res -t
$GRID_HOME/bin/crsctl stat res -p
$GRID_HOME/bin/crsctl query css votedisk
$GRID_HOME/bin/crsctl query crs activeversion
$GRID_HOME/bin/crsctl check crs
cat /etc/oracle/ocr.loc /var/opt/oracle/ocr.loc
$GRID_HOME/bin/crsctl get css diagwait
$GRID_HOME/bin/ocrcheck
$GRID_HOME/bin/oifcfg iflist -p -n
$GRID_HOME/bin/oifcfg getif
$GRID_HOME/bin/crsctl query css votedisk
$GRID_HOME/bin/srvctl config nodeapps -a
$GRID_HOME/bin/srvctl config scan
$GRID_HOME/bin/srvctl config asm -a
$GRID_HOME/bin/srvctl config listener -l <listener-name> -a
$DB_HOME/bin/srvctl config database -d <dbname> -a
$DB_HOME/bin/srvctl config service -d <dbname> -s <service-name> -v
# <$GRID_HOME>/crs/install/roothas.pl -deconfig -force -verbose
OHASD Agents do not start
Troubleshoot Grid Infrastructure Startup Issues (Doc ID 1050908.1)
In a nutshell, the operating system starts ohasd, ohasd starts agents to start up daemons (gipcd, mdnsd, gpnpd, ctssd, ocssd, crsd, evmd, asm etc), and crsd starts agents that start user resources (database, SCAN, listener etc).
OHASD.BIN will spawn four agents/monitors to start resources:
oraagent: responsible for ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd etc
orarootagent: responsible for ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs etc
cssdagent / cssdmonitor: responsible for ora.cssd (for ocssd.bin) and ora.cssdmonitor (for cssdmonitor itself)
$GRID_HOME/bin/crsctl start res ora.crsd -init
$GRID_HOME/bin/crsctl start res ora.evmd -init
$GRID_HOME/bin/crsctl stop res ora.evmd -init
ps -ef | grep <keyword> | grep -v grep | awk '{print $2}' | xargs kill -9
If ohasd.bin cannot start any of the above agents properly, the clusterware will not come to a healthy state.
AODU>
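A hedged sketch that captures the pre-deconfiguration output listed above into a single file (the database name orcl and the output path are assumptions; run as the grid user with $GRID_HOME and $DB_HOME set):

OUT=/tmp/cluster_config_$(date +%Y%m%d).txt
{
  $GRID_HOME/bin/crsctl stat res -t
  $GRID_HOME/bin/crsctl query css votedisk
  $GRID_HOME/bin/crsctl query crs activeversion
  $GRID_HOME/bin/ocrcheck
  $GRID_HOME/bin/oifcfg getif
  $GRID_HOME/bin/srvctl config nodeapps -a
  $GRID_HOME/bin/srvctl config scan
  $GRID_HOME/bin/srvctl config asm -a
  $DB_HOME/bin/srvctl config database -d orcl -a    # orcl is a placeholder dbname
} > "$OUT" 2>&1
echo "Configuration saved to $OUT"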
AODU> rac debug --setting debug and trace modes for RAC
****Troubleshooting Steps****
The following are the new features in 10.2/11gR1/11gR2:
Using crsctl, debugging can be turned on and off for CRS/EVM/CSS and their subcomponents. Debug levels can also be dynamically changed using crsctl. The debug information is persisted in the OCR for use during the next startup. Debugging can be turned on for CRS managed resources like VIPs and instances as well.
* Note: in the following examples, commands with a "#" prompt are executed as the root user; commands with a "$" prompt can be executed as the clusterware owner.
1. ****Component level logging****
10.2/11gR1:
# crsctl debug log css [module:level]{,module:level} ... - Turns on debugging for CSS
# crsctl debug log crs [module:level]{,module:level} ... - Turns on debugging for CRS
# crsctl debug log evm [module:level]{,module:level} ... - Turns on debugging for EVM
For example:
# crsctl debug log crs "CRSRTI:1,CRSCOMM:2"
# crsctl debug log evm "EVMD:1"
11gR2:
# crsctl set {log|trace} {mdns|gpnp|css|crf|crs|ctss|evm|gipc} "<name1>=<lvl1>,..."
Sets the log/trace levels for specific modules within daemons. For example:
# crsctl set log crs "CRSRTI=2,CRSCOMM=2"
To list all modules:
10.2/11gR1:
# crsctl lsmodules {css | crs | evm} - lists the CSS modules that can be used for debugging
11gR2:
# crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
where
mdns  multicast Domain Name Server
gpnp  Grid Plug-n-Play Service
css   Cluster Synchronization Services
crf   Cluster Health Monitor
crs   Cluster Ready Services
ctss  Cluster Time Synchronization Service
evm   Event Manager
gipc  Grid Interprocess Communications
Logging level definition:
level 0 = turn off
level 2 = default
level 3 = verbose
level 4 = super verbose
To check the current logging level:
10.2/11gR1:
For CSS: $ grep clssscSetDebugLevel <ocssd logs>
For CRS / EVMD: $ grep "ENV Logging level for Module" <crsd / evmd logs>
11gR2:
$ crsctl get log <module> ALL
For example:
$ crsctl get log css ALL
Get CSSD Module: CLSF  Log Level: 0
Get CSSD Module: CSSD  Log Level: 2
Get CSSD Module: GIPCCM  Log Level: 2
Get CSSD Module: GIPCGM  Log Level: 2
Get CSSD Module: GIPCNM  Log Level: 2
Get CSSD Module: GPNP  Log Level: 1
Get CSSD Module: OLR  Log Level: 0
Get CSSD Module: SKGFD  Log Level: 0
2. ****Component level debugging****
Debugging can be turned on for CRS and EVM and their specific modules by setting environment variables or through crsctl.
To turn on tracing for all modules: ORA_CRSDEBUG_ALL
To turn on tracing for a specific sub module: ORA_CRSDEBUG_<modulename>
3. ****CRS stack startup and shutdown****
Using crsctl, the entire CRS stack and the resources can be started and stopped.
# crsctl start crs
# crsctl stop crs
4. ****Diagnostics collection script - diagcollection.pl****
This script is for collecting diagnostic information from a CRS installation. The diagnostics are necessary for development to be able to help with SRs, bugs and other problems that may arise in the field.
10.2: # <$CRS_HOME>/bin/diagcollection.pl
11.1: # <$CRS_HOME>/bin/diagcollection.pl -crshome=$ORA_CRS_HOME --collect
11.2: # <$GRID_HOME>/bin/diagcollection.sh
For more details, please refer to Document 330358.1 CRS 10gR2/11gR1/11gR2 Diagnostic Collection Guide
5. ****Unified log directory structure****
The following log directory structure is in place from 10.2 onwards in an effort to consolidate the log files of the different Clusterware components for easy diagnostic information retrieval and problem analysis.
Objectives:
1. Need one place where all the CRS related log files can be located.
2. The directory structure needs to be intuitive for any user.
3. Permission and ownership issues need to be addressed.
4. Disk space issues need to be considered.
10.2/11gR1:
Alert file: CRS_HOME/log/<host>/alert<hostname>.log
CRS component directories:
CRS_HOME/log/
CRS_HOME/log/<host>
CRS_HOME/log/<host>/crsd
CRS_HOME/log/<host>/cssd
CRS_HOME/log/<host>/evmd
CRS_HOME/log/<host>/client
CRS_HOME/log/<host>/racg
11gR2:
GI component directories:
<GRID_HOME>/log/<host>/crsd
<GRID_HOME>/log/<host>/cssd
<GRID_HOME>/log/<host>/admin
<GRID_HOME>/log/<host>/agent
<GRID_HOME>/log/<host>/evmd
<GRID_HOME>/log/<host>/client
<GRID_HOME>/log/<host>/ohasd
<GRID_HOME>/log/<host>/mdnsd
<GRID_HOME>/log/<host>/gipcd
<GRID_HOME>/log/<host>/gnsd
<GRID_HOME>/log/<host>/ctssd
<GRID_HOME>/log/<host>/racg
Log files of the respective components go to the above directories. Core files are dropped in the same directory as the logs. Old core files will be backed up.
6. ****CRS and EVM Alerts****
CRS and EVM post alert messages on the occurrence of important events.
CRSD:
[NORMAL] CLSD-1201: CRSD started on host %s.
[ERROR] CLSD-1202: CRSD aborted on host %s. Error [%s]. Details in %s.
[ERROR] CLSD-1203: Failover failed for the CRS resource %s. Details in %s.
[NORMAL] CLSD-1204: Recovering CRS resources for host %s
[ERROR] CLSD-1205: Auto-start failed for the CRS resource %s. Details in %s.
EVMD:
[NORMAL] CLSD-1401: EVMD started on node %s
[ERROR] CLSD-1402: EVMD aborted on node %s. Error [%s]. Details in %s.
7. ****Resource debugging****
Using crsctl, debugging can be turned on for resources.
10.2/11gR1:
# crsctl debug log res <resname:level> - Turns on debugging for resources
For example: # crsctl debug log res ora.racnode.vip:2
11gR2:
# crsctl set log res <resname>=<lvl> [-init]
Sets the log levels for resources including init resources; check crsctl stat res -t and crsctl stat res -t -init for the resource name.
For example:
# crsctl set log res ora.racnode.vip=2
# crsctl set log res ora.cssdmonitor=2 -init
This sets debugging on for the resource in the form of an OCR key.
8. ****Health Check****
To determine the health of the CRS stack: $ crsctl check crs
To determine the health of individual daemons:
$ crsctl check css
$ crsctl check evm
or for 11gR2: $ crsctl stat res -t -init
9. ****OCR debugging****
In 10.1, to debug the OCR, $ORA_CRS_HOME/srvm/ocrlog.ini was updated to a level higher than 0. Starting with 10.2 it is possible to debug the OCR at the component level using the following commands:
a) Edit $ORA_CRS_HOME/srvm/admin/ocrlog.ini to the component level, e.g.: comploglvl="OCRAPI:5;OCRCLI:5;OCRSRV:5;OCRMAS:5;OCRCAC:5"
b) Use the dynamic feature to update the logging into the OCR itself using crsctl. As in step one, the CRS module names associated with the OCR are:
10.2/11gR1: CRSOCR
11gR2: CRSOCR OCRAPI OCRASM OCRCAC OCRCLI OCRMAS OCRMSG OCROSD OCRRAW OCRSRV OCRUTL
AODU>
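A hedged sketch of an 11gR2 debugging session based on the commands above (the resource name ora.racnode1.vip is a hypothetical example; run as root):

crsctl set log crs "CRSRTI=3,CRSCOMM=3"       # raise two CRSD modules to verbose
crsctl get log crs ALL                        # verify the current levels
crsctl set log res ora.racnode1.vip=3         # debug a crsd-managed resource
crsctl set log res ora.cssdmonitor=3 -init    # debug an ohasd-managed (-init) resource
crsctl set log crs "CRSRTI=2,CRSCOMM=2"       # restore the default level (2) when done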
AODU> rac eviction --collecting diagnostics for RAC reboots and node evictions
****Data Gathering for Node Reboot/Eviction****
Provide the files in Section "Data Gathering for All Oracle Clusterware Issues" and the following:
Approximate date and time of the reboot, and the hostname of the rebooted node
OSWatcher archives which cover the reboot time at an interval of 20 seconds, with private network monitoring configured. Note 301137.1 - OS Watcher User Guide, Note 433472.1 - OS Watcher For Windows (OSWFW) User Guide
For pre-11.2, zip of /var/opt/oracle/oprocd/* or /etc/oracle/oprocd/*
For pre-11.2, OS logs - refer to Section Appendix B
For 11gR2+, zip of /etc/oracle/lastgasp/* or /var/opt/oracle/lastgasp/*
CHM/OS data that covers the reboot time for platforms where it is available; refer to Note 1328466.1, section "How do I collect the Cluster Health Monitor data"
The Cluster Health Monitor is an integrated part of 11.2.0.2 Oracle Grid Infrastructure for Linux (not on Linux Itanium) and Solaris (Sparc 64 and x86-64 only), so installing 11.2.0.2 Oracle Grid Infrastructure on those platforms will automatically install the Cluster Health Monitor. AIX will have the Cluster Health Monitor starting from 11.2.0.3. The Cluster Health Monitor is also enabled for Windows (except Windows Itanium) in 11.2.0.3.
ora.crf is the Cluster Health Monitor resource name that ohasd manages. Issue "crsctl stat res -t -init" to check the current status of the Cluster Health Monitor.
For example, issue "<GI_HOME>/bin/diagcollection.pl --collect --crshome $ORA_CRS_HOME --chmos --incidenttime <start time of interesting time period> --incidentduration 05:00"
What logs and data should I gather before logging an SR for a Cluster Health Monitor error?
1) 3-4 pstack outputs over a minute for osysmond.bin
2) output of strace -v for osysmond.bin for about 2 minutes
3) strace -cp <osysmond.bin pid> for about 2 minutes
4) oclumon dumpnodeview -v output for that node for 2 minutes
5) output of "uname -a"
6) output of "ps -eLf|grep osysmond.bin"
7) the ologgerd and osysmond log files in the CRS_HOME/log/<host name> directory from all nodes
How to start and stop CHM that is installed as part of GI in 11.2 and higher?
The ora.crf resource in 11.2 GI (and higher) is the resource for CHM, and the ora.crf resource is managed by ohasd. Starting and stopping the ora.crf resource starts and stops CHM.
To stop CHM (or the ora.crf resource managed by ohasd): $GRID_HOME/bin/crsctl stop res ora.crf -init
To start CHM (or the ora.crf resource managed by ohasd): $GRID_HOME/bin/crsctl start res ora.crf -init
If vendor clusterware is being used, upload the vendor clusterware logs
AODU>
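A hedged sketch of pulling CHM/OS data around a reboot, reusing the documented example time of 2012-02-18 05:00 as a hypothetical incident start and collecting the following five hours:

$GRID_HOME/bin/diagcollection.pl --collect --crshome $ORA_CRS_HOME \
    --chmos --incidenttime 02/18/201205:00:00 --incidentduration 05:00
$GRID_HOME/bin/crsctl stat res ora.crf -init      # check the CHM resource status
$GRID_HOME/bin/crsctl stop res ora.crf -init      # stop CHM if required
$GRID_HOME/bin/crsctl start res ora.crf -init     # start it again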