Focus On Oracle


Admin-Managed and Policy-Managed (RAC, RAC One Node)

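According to the Oracle 11gR2 white paper, policy-based cluster management was introduced in 11.2.0.1 and RAC One Node in 11.2.0.2 (it was already present in 11.2.0.1, but was harder to configure and not yet mature). Release 11.2.0.4 also introduced the Oracle RAC Configuration Audit Tool (RACcheck) and the Oracle Trace File Analyzer (TFA) Collector. See the link below for details.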

What's New in Oracle RAC Administration and Deployment?
http://docs.oracle.com/cd/E11882_01/rac.112/e41960/whatsnew.htm#RACAD000

Administrator managed (Admin-Managed)
Database administrators define on which servers a database resource should run, and place resources manually as needed. This is the management strategy used in previous releases.
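The database administrator manually specifies which servers the database resources run on. This is the administrator-managed approach, and it is also how earlier releases were managed. Before 11gR2, when creating a database with DBCA you selected the nodes on which to create it (at least 2 nodes, possibly more); once created, the database kept running on those nodes unless nodes were added or removed. In other words, after installation the set of nodes never changes automatically, and this is what Admin-Managed means. An Admin-Managed database does in fact have a server pool as well, but that pool cannot be modified.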

[orgrid@ohs1 ~]$ srvctl config database -d pgold
Database unique name: pgold
Database name: pgold
Oracle home: /ordb/oracle/product/112
Oracle user: oracle
Spfile: +DATA_PGOLD/pgold/spfilepgold.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: pgold
Database instances: pgold1,pgold2
Disk Groups: DATA_PGOLD,SYSTEMDG
Mount point paths:
Services:
Type: RAC
Database is administrator managed 

[orgrid@ohs2 ~]$

[orgrid@ohs2 ~]$ srvctl config srvpool -g pgold
PRKO-3160 : Server pool pgold is internally managed as part of administrator-managed database configuration and therefore cannot be queried directly via srvpool object.
[orgrid@ohs2 ~]$

Policy managed (Policy-Managed)
Database administrators specify the server pool (excluding generic or free) in which the database resource runs. Oracle Clusterware places the database resource on a server.
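The database administrator specifies the server pool (other than Generic or Free) in which the database runs, so this management style is built on server pools. We create a server pool, add servers to it, and define a policy: which servers the database instances run on and how many instances run. The number of instances and the hosts they run on can then be adjusted as needed. Instance names also differ between the two styles: a Policy-Managed database uses an underscore suffix (racdb_1, racdb_2, racdb_3), while an Admin-Managed database uses a plain numeric suffix (racdb1, racdb2, racdb3). The advantages of Policy-Managed databases become apparent when there are many servers.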
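
The instance naming convention can be illustrated with a few lines of Python. This is only a sketch of the suffix pattern seen in the srvctl and ps output in this article, where the admin-managed database pgold has instances pgold1 and pgold2 and the policy-managed instance is racdb_1. The helper name `instance_names` is hypothetical and not part of any Oracle tool.

```python
def instance_names(db_name, count, policy_managed):
    """Return illustrative instance names for a database.

    Policy-managed instances carry an underscore suffix (racdb_1),
    admin-managed instances a bare number (racdb1).
    """
    separator = "_" if policy_managed else ""
    return [f"{db_name}{separator}{i}" for i in range(1, count + 1)]


print(instance_names("racdb", 3, policy_managed=True))
print(instance_names("pgold", 2, policy_managed=False))
```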

Cardinality
The number of servers on which a resource can run, simultaneously.
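That is, the number of nodes on which a resource can run at the same time. For example, if a Policy-Managed database is created with a cardinality of 3 on a 4-node cluster, it will have at most three instances after DBCA creates it.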

RAC One Node

Oracle Real Application Clusters One Node (Oracle RAC One Node) is a single instance of an Oracle Real Application Clusters (Oracle RAC) database that runs on one node in a cluster. This option adds to the flexibility that Oracle offers for database consolidation.

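RAC One Node: literally, a one-node RAC database. It is new in 11.2. A single instance of a RAC database runs in the cluster, on exactly one node at a time (with 2 nodes, if the database is running normally on node 1 and is then started on node 2, the instance on node 1 is shut down), and it also supports failover. This resembles using another vendor's cluster software, such as IBM HACMP, to manage a single-instance Oracle database. Both are high-availability configurations, but RAC One Node is implemented with Oracle's own clusterware, which manages the database almost exactly like a regular RAC database (the srvctl and crsctl commands are nearly identical). Because everything comes from Oracle, administration, maintenance, and troubleshooting are simpler; with a third-party HA solution you also need to learn that vendor's cluster software (IBM HACMP, HP MC/SG).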

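Advantages and characteristics of RAC One Node
A. Only one instance runs in the cluster at a time. Compared with a RAC database, this reduces the time spent on inter-instance messaging and data block requests, as well as GC (global cache) wait time.
B. If the node currently running the database needs maintenance, the database can be manually relocated to a standby server, which reduces downtime.
C. It is easy to convert to a RAC database; the conversion can be done online without stopping the database.
D. Multiple RAC One Node databases can be created in one Grid Infrastructure cluster and run on different nodes, which improves hardware utilization.
E. It can be created as either an Admin-Managed or a Policy-Managed database.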

RAC and RAC One Node databases can be converted into each other

srvctl convert database -d db_unique_name -c RACONENODE [-i instance_name -w timeout]
srvctl convert database -d db_unique_name -c RAC [-n node_name]
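
When the conversion is scripted, it can help to assemble the command line in one place. The following is a minimal sketch that only builds the argument list from the two usage lines above and does not execute anything. The function name `build_convert_command` and the sample values (racdb, racdb_1, 30, ohs1) are illustrative assumptions taken from the test environment in this article.

```python
def build_convert_command(db_unique_name, target, instance_name=None,
                          timeout=None, node_name=None):
    """Return the argv list for `srvctl convert database` (nothing is run)."""
    argv = ["srvctl", "convert", "database", "-d", db_unique_name, "-c", target]
    if target == "RACONENODE":
        # The usage line groups -i and -w in one optional bracket,
        # so this sketch requires them to be given together.
        if (instance_name is None) != (timeout is None):
            raise ValueError("instance_name and timeout must be given together")
        if instance_name is not None:
            argv += ["-i", instance_name, "-w", str(timeout)]
    elif target == "RAC":
        if node_name is not None:
            argv += ["-n", node_name]
    else:
        raise ValueError("target must be 'RACONENODE' or 'RAC'")
    return argv


print(" ".join(build_convert_command("racdb", "RACONENODE", "racdb_1", 30)))
print(" ".join(build_convert_command("racdb", "RAC", node_name="ohs1")))
```

Returning a list rather than a string means the result can be handed to `subprocess.run` without shell quoting concerns, should you choose to execute it.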

http://docs.oracle.com/cd/E11882_01/rac.112/e41960/onenode.htm

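What is a server pool?
A server pool is a logical grouping of the servers in a cluster that hosts applications, databases, or both. At any given time, a server can belong to only one pool. There are three kinds of pool: the Free pool, the Generic pool, and user-defined pools. Each pool has three attributes, MIN_SIZE, MAX_SIZE, and IMPORTANCE, with the following meanings: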

MIN_SIZE

The minimum number of servers the server pool should contain.

MAX_SIZE

The maximum number of servers the server pool should contain. In practice it is bounded by the number of nodes actually in the cluster.

IMPORTANCE

A number from 0 to 1000 (0 being least important) that ranks a server pool among all other server pools in a cluster. The higher the value, the more critical the pool, and the earlier it is considered when satisfying MIN_SIZE. The default is 0.

Types of server pool

There are three types of server pool: the Free pool, the Generic pool, and user-defined pools.

Free Pool:
It contains servers that are not assigned to any other server pools. The attributes of the Free server pool are restricted, as follows:
    SERVER_NAMES, MIN_SIZE, and MAX_SIZE cannot be edited by the user
    IMPORTANCE and ACL can be edited by the user

Generic Pool:

It stores pre-11g release 2 (11.2) Oracle Databases and administrator-managed databases that have fixed configurations.
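
How MIN_SIZE, MAX_SIZE, and IMPORTANCE interact is easier to see with a toy model. The sketch below is a deliberately simplified assumption, not Oracle Clusterware's actual placement algorithm: it works with server counts only, ignores candidate server names and the Generic pool, satisfies MIN_SIZE for the most important pools first, then grows pools toward MAX_SIZE, and leaves whatever remains in Free. The function name `allocate` is hypothetical.

```python
def allocate(server_count, pools):
    """Distribute server_count servers over user-defined pools.

    pools maps pool name -> (min_size, max_size, importance);
    a max_size of -1 means unlimited. Returns pool name -> server count,
    with the leftover servers reported under "Free".
    """
    # Most important pool first; ties keep their original order.
    order = sorted(pools, key=lambda name: pools[name][2], reverse=True)
    assigned = {name: 0 for name in pools}
    remaining = server_count

    # Pass 1: satisfy MIN_SIZE in order of importance.
    for name in order:
        take = min(pools[name][0], remaining)
        assigned[name] = take
        remaining -= take

    # Pass 2: grow each pool toward MAX_SIZE in order of importance.
    for name in order:
        max_size = pools[name][1]
        room = remaining if max_size == -1 else max(0, max_size - assigned[name])
        take = min(room, remaining)
        assigned[name] += take
        remaining -= take

    assigned["Free"] = remaining
    return assigned


# A pool shaped like racp in the test below, before and after the change.
print(allocate(2, {"racp": (0, 2, 0)}))
print(allocate(2, {"racp": (1, 1, 100)}))
```

The model deliberately says nothing about which server leaves a pool when it shrinks; that choice belongs to Clusterware.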

Testing a Policy-Managed RAC database

A 2-node RAC database using Policy-Managed mode. We set the maximum size of the server pool to 1 and observe how the database changes.

[orgrid@hs1 ~]$ srvctl config database -d racdb
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/ordb/oracle/product/112
Oracle user: oracle
Spfile: +DATA_PGOLD/RACDB/spfileRACDB.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: racp
Database instances:
Disk Groups: DATA_PGOLD
Mount point paths:
Services:
Type: RAC
Database is policy managed

[orgrid@ohs1 ~]$ srvctl config serverpool
Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: racp
Importance: 0, Min: 0, Max: 2
Candidate server names: 
[orgrid@ohs1 ~]$
The racp pool was defined by us when the database was created with DBCA. Min: 0, Max: 2 means the pool may contain as few as 0 and as many as 2 servers.

[orgrid@ohs1 ~]$ srvctl status server -n ohs1 -a       # ohs1 is in the racp pool
Server name: ohs1
Server state: ONLINE
Server active pools: ora.racp
Server state details: 
[orgrid@ohs1 ~]$ srvctl status server -n ohs2 -a     # ohs2 is also in the racp pool
Server name: ohs2
Server state: ONLINE
Server active pools: ora.racp
Server state details:
[orgrid@ohs1 ~]$

Change the maximum size of the server pool (2 --> 1)
[orgrid@ohs1 ~]$ srvctl modify srvpool -h
Modifies the configuration for the server pool.
Usage: srvctl modify srvpool -g <pool_name> [-l <min>] [-u <max>] [-i <importance>] [-n "<server_list>"] [-f]
    -g <pool_name>           Server pool name
    -l <min>                 Minimum size of the server pool
    -u <max>                 Maximum size of the server pool, -1 for unlimited maximum size
    -i <importance>          Importance of the server pool
    -n "<server_list>"       Comma separated list of candidate server names
    -f                       Force the operation even though some resource(s) will be stopped
    -h                       Print usage

[orgrid@ohs1 ~]$

[orgrid@ohs1 ~]$ srvctl modify srvpool -g racp -l 1 -u 1 -i 100

PRCS-1011 : Failed to modify server pool racp
CRS-2736: The operation requires stopping resource 'ora.racdb.db' on server 'ohs1'
CRS-2738: Unable to modify server pool 'ora.racp' as this will affect running resources, but the force option was not specified

[orgrid@ohs1 ~]$ srvctl modify srvpool -g racp -l 1 -u 1 -i 100 -f    # adjust the maximum number of servers the racp pool can hold

[orgrid@ohs1 ~]$

[orgrid@ohs1 ~]$ srvctl config serverpool
Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: racp
Importance: 100, Min: 1, Max: 1
Candidate server names:
[orgrid@ohs1 ~]$
[orgrid@ohs1 ~]$ ps -ef|grep pmon
oracle    5765     1  0 00:37 ?        00:00:00 asm_pmon_+ASM1
oracle   10621  7687  0 01:07 pts/1    00:00:00 grep pmon

[orgrid@ohs1 ~]$ srvctl status server -n ohs1 -a     # ohs1 has been removed from the racp pool and automatically added to the Free pool

Server name: ohs1
Server state: ONLINE
Server active pools: Free
Server state details: 

[orgrid@ohs1 ~]$ srvctl status server -n ohs2 -a      # ohs2 is still in the racp pool
Server name: ohs2
Server state: ONLINE
Server active pools: ora.racp
Server state details: 

[orgrid@ohs1 ~]$

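We can see that the instance on node 1 was shut down (no ora_pmon process remains on ohs1), node 1 was removed from the racp server pool, and node 1 now belongs to the Free pool.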
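
To compare pool attributes before and after a change like this from a script, the text printed by `srvctl config serverpool` can be parsed into a dictionary. The sketch below assumes the 11.2 output layout shown in this article ("Server pool name:" followed by an "Importance: ..., Min: ..., Max: ..." line); other releases may print something different. The function name `parse_serverpools` is hypothetical, and the sample text is copied from the output above rather than fetched from a live cluster.

```python
import re

ATTRIBUTES = re.compile(
    r"Importance:\s*(-?\d+),\s*Min:\s*(-?\d+),\s*Max:\s*(-?\d+)"
)


def parse_serverpools(text):
    """Parse `srvctl config serverpool` output into {pool: {attribute: int}}."""
    pools = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Server pool name:"):
            current = line.split(":", 1)[1].strip()
            pools[current] = {}
        elif current is not None:
            match = ATTRIBUTES.match(line)
            if match:
                importance, min_size, max_size = map(int, match.groups())
                pools[current] = {
                    "importance": importance,
                    "min": min_size,
                    "max": max_size,
                }
    return pools


SAMPLE = """Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names:
Server pool name: racp
Importance: 100, Min: 1, Max: 1
Candidate server names:
"""

print(parse_serverpools(SAMPLE)["racp"])
```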


Testing a Policy-Managed RAC One Node database

A 2-node RAC One Node database using Policy-Managed mode. We kill the pmon process on node 1 to test failover.
[orgrid@ohs1 ~]$ srvctl config database -d racdb

Database unique name: racdb
Database name: racdb
Oracle home: /ordb/oracle/product/112
Oracle user: oracle
Spfile: +DATA_PGOLD/racdb/spfileracdb.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: onp
Database instances: 
Disk Groups: DATA_PGOLD
Mount point paths: 
Services: ap
Type: RACOneNode
Online relocation timeout: 30
Instance name prefix: racdb
Candidate servers: 
Database is policy managed

[orgrid@ohs1 ~]$ srvctl config service -s ap -d racdb
Service name: ap
Service is enabled
Server pool: onp
Cardinality: SINGLETON
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition: 
Service is enabled on nodes: 
Service is disabled on nodes: 
[orgrid@ohs1 ~]$

[orgrid@ohs1 ~]$ srvctl config serverpool -g onp
Server pool name: onp
Importance: 0, Min: 0, Max: 2
Candidate server names: 

[orgrid@ohs1 ~]$ srvctl status server -n ohs2 -a
Server name: ohs2
Server state: ONLINE
Server active pools: ora.onp
Server state details:
 

[orgrid@ohs1 ~]$ srvctl status server -n ohs1 -a
Server name: ohs1
Server state: ONLINE
Server active pools: ora.onp
Server state details: 
[orgrid@ohs1 ~]$


The database is running on node 1:
[orgrid@ohs1 ~]$ ps -ef|grep pmon
orgrid    5772     1  0 14:16 ?        00:00:00 asm_pmon_+ASM1
oracle    6645     1  0 14:17 ?        00:00:00 ora_pmon_racdb_1
[orgrid@ohs1 ~]$
[root@ohs2 ~]# ps -ef|grep pmon
orgrid    5820     1  0 14:16 ?        00:00:00 asm_pmon_+ASM2
root     12907  9205  0 15:14 pts/1    00:00:00 grep pmon
[root@ohs2 ~]# 


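Kill the pmon process on node 1. After one kill, Clusterware first tries to restart the instance on the same node and only switches to another node if that restart fails. In this test, the database switched to the second node only after pmon had been killed twice.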

[orgrid@ohs1 ~]$ ps -ef|grep pmon
orgrid    5772     1  0 14:16 ?        00:00:00 asm_pmon_+ASM1
oracle    6645     1  0 14:17 ?        00:00:00 ora_pmon_racdb_1
oracle   14861 10475  0 15:14 pts/1    00:00:00 grep pmon
[orgrid@ohs1 ~]$ kill -9 6645
[orgrid@ohs1 ~]$ ps -ef|grep pmon
orgrid    5772     1  0 14:16 ?        00:00:00 asm_pmon_+ASM1
oracle   14903     1  0 15:14 ?        00:00:00 ora_pmon_racdb_1
oracle   15027 10475  0 15:14 pts/1    00:00:00 grep pmon
[orgrid@ohs1 ~]$ kill -9 14903
[orgrid@ohs1 ~]$ ps -ef|grep pmon
orgrid    5772     1  0 14:16 ?        00:00:00 asm_pmon_+ASM1
oracle   15213 10475  0 15:15 pts/1    00:00:00 grep pmon
[orgrid@ohs1 ~]$ 

The database has started successfully on the second node. Note that the SID is racdb_1.
[root@ohs2 ~]# ps -ef|grep ora_pmon
oracle   13224     1  0 15:15 ?        00:00:00 ora_pmon_racdb_1
root     13432  9205  0 15:16 pts/1    00:00:00 grep pmon
[root@ohs2 ~]# 

The database is now running on node 2. We can use relocate to move it back to node 1. Note that the SID changes after the relocation.
[orgrid@ohs1 ~]$ srvctl relocate database -d racdb -n ohs1
[orgrid@ohs1 ~]$ ps -ef|grep pmon
orgrid    5772     1  0 14:16 ?        00:00:00 asm_pmon_+ASM1
oracle   17548     1  0 15:29 ?        00:00:00 ora_pmon_racdb_2
oracle   17909 10475  0 15:30 pts/1    00:00:00 grep pmon
[orgrid@ohs1 ~]$ 
[root@ohs2 ~]# ps -ef|grep pmon
orgrid    5820     1  0 14:16 ?        00:00:00 asm_pmon_+ASM2
root     15915  9205  0 15:33 pts/1    00:00:00 grep pmon
[root@ohs2 ~]#

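RAC One Node summary

If the instance terminates unexpectedly (a failover), for example through kill -9 or some other cause that crashes the database, GI first tries to restart the instance on the same node. If that fails, it starts the instance on another node. If the instance name was racdb_1, it is still racdb_1 after the failover. This explains why, in the test above, the instance restarted on the same node after pmon was killed the first time, and why the instance name did not change once it started on the second node.

During normal operation we can also move the instance manually (a switchover). Suppose the instance is running on node 2 and we use the relocate command to move the database to node 1. Oracle creates a pfile on node 1 and, because the current SID is still in use, starts the new instance under a new SID. Once the instance on node 1 has started successfully, the instance on node 2 is shut down. This is why the instance name changed after the relocate in the test above.

For an Admin-Managed RAC One Node database, the configuration is fixed, so the instance name does not change.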
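
The difference between the two behaviours for a Policy-Managed RAC One Node database can be captured in a toy model. This is a sketch of what the test in this article observed, not of Clusterware internals. The class name `OneNodeDatabase` is hypothetical, and the assumption that the suffix alternates between _1 and _2 on each relocation goes beyond the single relocation shown here.

```python
class OneNodeDatabase:
    """Toy model of instance naming for a policy-managed RAC One Node database."""

    def __init__(self, prefix, node):
        self.prefix = prefix
        self.node = node
        self.suffix = 1

    @property
    def instance_name(self):
        return f"{self.prefix}_{self.suffix}"

    def failover(self, new_node):
        # The old instance is already gone, so its name is free to reuse.
        self.node = new_node

    def relocate(self, new_node):
        # The new instance starts while the old one is still running,
        # so it must take the other suffix (assumed to alternate 1 <-> 2).
        self.suffix = 2 if self.suffix == 1 else 1
        self.node = new_node


db = OneNodeDatabase("racdb", "ohs1")
db.failover("ohs2")
print(db.node, db.instance_name)
db.relocate("ohs1")
print(db.node, db.instance_name)
```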


Reference

Glossary
http://docs.oracle.com/cd/E11882_01/rac.112/e41959/glossary.htm#CWADD91440

What's New in Oracle RAC Administration and Deployment?
http://docs.oracle.com/cd/E11882_01/rac.112/e41960/whatsnew.htm#RACAD000

Understanding Server pool
http://docs.oracle.com/cd/E11882_01/install.112/e41961/concepts.htm#CWLIN2966

Administering Oracle RAC One Node
http://docs.oracle.com/cd/E11882_01/rac.112/e41960/onenode.htm#RACAD7894

Server Control Utility Reference
http://docs.oracle.com/cd/E11882_01/rac.112/e41960/srvctladmin.htm#RACAD005

Administering Database Instances and Cluster Databases
http://docs.oracle.com/cd/E11882_01/rac.112/e41960/admin.htm#RACAD900

Administering Oracle Clusterware
http://docs.oracle.com/cd/E11882_01/rac.112/e41959/admin.htm#CWADD838


