What is System Change Number (SCN)?
Concept
The SCN is a stamp that defines a committed version of a database at a point in time. Oracle assigns every committed transaction a unique SCN. The value of a SCN is the logical point in time at which changes are made to a database. This number is utilized by Oracle to log the changes made to the database.
SCN is a 6 Byte (48 bit) number whose value is 281,474,976,710,656 and represented as 2 parts - SCN_BASE and SCN_WRAP.
SCN_BASE is a 4 Byte (32 bit) number
SCN_WRAP is a 2 Byte (16 bit) number. Whenever SCN_BASE reaches its maximum (2 power 32 = 4294967296 ), SCN_WRAP is increased by one and SCN_BASE will be reset to 0. This continues until SCN_WRAP reaches its maximum, i.e. 2 power 16 = 65536.
SCN = (SCN_WRAP * 4294967296) + SCN_BASE
When the base exceeds 4 billion, then the wrap is incremented by 1. Essentially, wrap counts the number of times base wrapped around 4 billion.
Logically, The maximum value of the wrap defines the maximum value of SCN i.e. maximum value of wrap*4 billion = 65536* 4* power(2,30) = 281474976710656 = 281 trillion values.
SCN is incremented with the completion of each and every transaction. A commit does not write to datafiles at all. It does not update control files at all.
The SCN is written to the controlfiles when the control files are updated - which happens as the result of a few things, one being "end of official checkpoint".
SCN's are written to redo logs continuously - when you commit they are emitted into the redo stream, and semantically speaking, after the checkpoint COMPLETES, not after the checkpoint is INITIATED.
System Change Number,使用了6个字节(48bit),有2部分组成,SCN_WRAP(2字节,16bit),SCN_BASE(4字节,32bit),最大值为281474976710656。算法为(SCN_WRAP * 4294967296) + SCN_BASE,按照16384每秒的增长速率,Oracle数据库可以处理的数据的年限可以超过500年。
SQL> select power(2,48) max_scn from dual; MAX_SCN --------------- 281474976710656 SQL> SQL> select power(2,48)/16384/365/24/3600 years from dual; YEARS ---------- 544.770078 SQL>
查询当前SCN的值
select dbms_flashback.get_system_change_number scn from dual;
select current_scn from v$database;
Max Reasonable SCN
即在当前时间SCN允许达到的最大值,这是一个上限值,要避免数据库SCN无限制地增大,如果达到了SCN的最大值,就会出现故障,甚至可能要重建库。算法为(当前时间-1988年1月1日*24*3600*SCN每秒最大可能增长速率。当前时间减1988年1月1日的结果是天数,24表示1天24小时,3600表示1小时3600秒。按16K的最大值,SCN要增长到最大,要超过500年。算法可参考scnhealthcheck.sql(patch:13498243)。SCN每秒最大可增长速率跟Oracle版本有一定的关系,这个隐含参数是_max_reasonable_scn_rate,在11.2.0.2之前是16384,在11.2.0.2及之后版本是32768。
SCN HeadRoom
是指Max Reasonable SCN与当前数据库SCN的差值,以天((Max Reasonable SCN-Current SCN)/16384/3600/24)为单位。意思就是说,如果按SCN的最大增长速率,多少天会到达Max Reasonable SCN。
The difference between the current SCN the database is using, and the "not to exceed" upper limit, is known as the SCN headroom. For almost all Oracle Databases, this headroom is constantly increasing every second. However, Oracle has determined that some software bugs could cause the database to attempt to exceed the current maximum SCN value (or get closer to the limit than was warranted). Generally if the database does try to exceed the current maximum SCN value, the transaction that caused this event would be cancelled by the database, and the application would see an error. The next second the limit increases, so typically the application then continues with a slight hiccough in processing. However, in some very rare cases, the database does need to shutdown to preserve its integrity. In no cases is data lost or corrupted.
SCN Synchronize grows abnormally
一般情况下,Oracle SCN增长速率16384每秒/32768每秒可以满足需求。但Oracle的SCN会通过dblink进行传播(一个简单的select就可以让SCN同步),涉及到dblink操作的多个库,其它数据库的SCN同步到这些库中的最大的SCN。例如,如果A库通过dblink连接到B库,如果A库的SCN高于B库的SCN,那么B库就会递增SCN到跟A库一样,如果A库的SCN低于B库的SCN,那么A库的SCN会递增到跟B库的SCN一样。如果SCN异常增长,可能会出现SCN用尽的情况,由于SCN不能回退,所以必须重建库才能使用。
Similar to how clocks are kept synchronized in a computer network, when two databases communicate with each other over a database link, they synchronize their SCNs by picking the largest SCN in use by the two. So in some cases, databases experienced rapidly decreasing SCN headroom not because of a bug in that specific database, but because the bug was active in one or more of the databases that database was connected to. Since the database always rejects SCNs that exceed the current maximum SCN, the provision of being able to run Oracle Databases for more than 500 years was not affected in any of the cases.
SCN Bugfix
为了防止SCN异常增长,Oracle在2012年1季度的CPU和相关的PSU补丁中解决了这个问题。增加了一些隐含参数(_max_reasonable_scn_rate即SCN的最大增长速率,_external_scn_rejection_threshold_hours即拒绝外部SCN的阀值)。在有dblink的操作中,如果计算出来的HeadRoom的值小于_external_scn_rejection_threshold_hours的值,Oracle会拒绝同步,会出现ORA-19706错误。打了2012年1月CPU或PSU补丁,11.2.0.2及以后的版本默认是24小时,其他版本是31天即744小时。Oracle建议10g和11.1的数据库将此值设置为24,防止有SCN HeadRoom问题的系统将故障传播到其他系统。
All the associated bugs have been fixed in the January 2012 CPU (and associated PSU). The same fixes are also available in the database Patchset Update (PSU) and the latest Oracle Exadata and Windows bundled patches.The hidden parameter "_external_scn_rejection_threshold_hours" is introduced in January 2012 Critical Patch Update (CPU) and Patch Set Update (PSU) releases (and related bundles). Oracle recommends setting this parameter to the value 24 in 10g and 11.1 releases - it does not need to be set in 11.2 releases. The parameter is static and so must be set in the init.ora or spfile used to start the instance.
In init.ora:
# Set threshold on dd/mon/yyyy - See MOS Document 1393363.1
_external_scn_rejection_threshold_hours = 24
In the spfile:
alter system set "_external_scn_rejection_threshold_hours" = 24 comment='Set threshold on dd/mon/yyyy - See MOS Document 1393363.1' scope=spfile;
ORA-19706错误
[oracle@db1 ~]$ oerr ora 19706
19706, 00000, "invalid SCN"
// *Cause: The input SCN is either not a positive integer or too large.
// *Action: Check the input SCN and make sure it is a valid SCN.
[oracle@db1 ~]$
重现ORA-19706(invalid SCN)
在dblink涉及的数据库,只要其中任何一个SCN HeadRoom空间小于临界值时,都不允许同步,下面分别为在11.2.0.4和12.1.0.2上的测试,为了测试,你需要增加SCN,可以参考How to Increase SCN
测试A:UPGR 11.2.0.4,SCN为14874127569274,在UPGR上创建一dblink pdb11,这个数据库链接连接到pdb11(12.1.0.2.0),pdb11 SCN为2139331
SQL> select name,current_scn from v$database;
NAME CURRENT_SCN
--------- -----------
UPGR 1.4874E+13
SQL> set numwid 15
SQL> select name,current_scn from v$database;
NAME CURRENT_SCN
--------- ---------------
UPGR 14874127569274(为了测试,把SCN增加到这个值,只是为了测试)
SQL> create database link pdb11 connect to system identified by oracle using 'pdb11';
Database link created.
SQL> select * from dual@pdb11;
select * from dual@pdb11
*
ERROR at line 1:
ORA-19706: invalid SCN
ORA-02063: preceding line from PDB11
SQL> select
2 to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
3 (((
4 ((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) +
5 ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
6 (((to_number(to_char(sysdate,'DD'))-1))*24*60*60) +
7 (to_number(to_char(sysdate,'HH24'))*60*60) +
8 (to_number(to_char(sysdate,'MI'))*60) +
9 (to_number(to_char(sysdate,'SS')))
10 ) * (16*1024)) - dbms_flashback.get_system_change_number)/(16*1024*60*60*24) scn_headeroom from dual;
DATE_TIME SCN_HEADEROOM
------------------- ---------------
2016/03/30 11:25:54 .01451076648853 -->HeadRoom小于1天
SQL>
测试B:CDB1 12.1.0.2,SCN为2139282,在CDB1上创建一dblink upgr,这个数据库链接连接到upgr(11.2.0.4),upgr SCN为2139331
SQL> show pdbs;
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB11 READ WRITE NO
4 PDB12 MOUNTED
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
2139282
SQL> create database link upgr connect to system identified by oracle using 'upgr';
Database link created.
SQL> select * from dual@upgr;
select * from dual@upgr
*
ERROR at line 1:
ORA-19706: invalid SCN
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
2139331
SQL>
alert日志
2016-03-30 11:21:55.141000 +08:00
Session (71,50247): EXTERNAL SCN ALERT: Rejected the attempt to advance SCN over limit by 23 hours worth to 0x0d87.271710ef, by outbound distributed transaction logon with returned scn
Session (71,50247): EXTERNAL SCN SOURCE: Outbound connection to DBID: cba3f82 GLOBAL_DB_NAME: UPGR
Session (71,50247): EXTERNAL SCN SOURCE: DBlink Name: UPGR, Connect String: (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.120)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME=upgr))), Remote Machine: db1
2016-03-30 11:29:34.262000 +08:00
Session (59,32064): EXTERNAL SCN ALERT: Rejected the attempt to advance SCN over limit by 23 hours worth to 0x0d87.27171195, by inbound distributed transaction logon with scn
Session (59,32064): EXTERNAL SCN SOURCE: Inbound connection from DBID: cba3f82 GLOBAL_DB_NAME: UPGR
Session (59,32064): EXTERNAL SCN SOURCE: DB Logon User: SYSTEM, Remote Machine: db1, Program: oracle@db1 (TNS V1-V3), OS User: oracle 2016-03-30 13:00:03.665000 +08:00 Setting Resource Manager plan SCHEDULER[0x4442]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN at pdb PDB11 (3) via parameter
2016-03-30 13:05:10.936000 +08:00
Session (59,50561): EXTERNAL SCN ALERT: Rejected the attempt to advance SCN over limit by 21 hours worth to 0x0d87.271726f8, by inbound distributed transaction logon with scn
Session (59,50561): EXTERNAL SCN SOURCE: Inbound connection from DBID: cba3f82 GLOBAL_DB_NAME: UPGR
Session (59,50561): EXTERNAL SCN SOURCE: DB Logon User: SYSTEM, Remote Machine: db1, Program: oracle@db1 (TNS V1-V3), OS User: oracle
2016-03-30 13:05:42.480000 +08:00
Session (38,63135): EXTERNAL SCN ALERT: Rejected the attempt to advance SCN over limit by 21 hours worth to 0x0d87.2717270a, by inbound distributed transaction logon with scn
Session (38,63135): EXTERNAL SCN SOURCE: Inbound connection from DBID: cba3f82 GLOBAL_DB_NAME: UPGR
Session (38,63135): EXTERNAL SCN SOURCE: DB Logon User: SYSTEM, Remote Machine: db1, Program: oracle@db1 (TNS V1-V3), OS User: oracle
2016-03-30 13:07:13.320000 +08:00
Session (71,50247): EXTERNAL SCN ALERT: Rejected the attempt to advance SCN over limit by 21 hours worth to 0x0d87.2717272a, by outbound distributed transaction logon with returned scn
Session (71,50247): EXTERNAL SCN SOURCE: Outbound connection to DBID: cba3f82 GLOBAL_DB_NAME: UPGR
Session (71,50247): EXTERNAL SCN SOURCE: DBlink Name: UPGR, Connect String: (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.120)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME=upgr))), Remote Machine: db1
修改时间为2016/03/31 15:00:00,即把当前时间提前了26小时,模拟解决故障
SQL> select
2 to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
3 (((
4 ((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) +
5 ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
6 (((to_number(to_char(sysdate,'DD'))-1))*24*60*60) +
7 (to_number(to_char(sysdate,'HH24'))*60*60) +
8 (to_number(to_char(sysdate,'MI'))*60) +
9 (to_number(to_char(sysdate,'SS')))
10 ) * (16*1024)) - dbms_flashback.get_system_change_number)/(16*1024*60*60) scn_headeroom from dual;
DATE_TIME SCN_HEADEROOM
------------------- -------------
2016/03/31 15:03:08 27.9686788
SQL> select * from dual@pdb11;
D
-
X
SQL>
为了测试,把操作系统时间提前了,提前后HeadRoom增大到了27小时,再次查询就可以了,SCN较小的库也被SCN较大的库同步了
注意:在生产、测试库上都不能通过此方法解决问题,此次只是为了测试
查询当前SCN HeadRoom值(单位为小时)
select
to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
(((
((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) +
((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
(((to_number(to_char(sysdate,'DD'))-1))*24*60*60) +
(to_number(to_char(sysdate,'HH24'))*60*60) +
(to_number(to_char(sysdate,'MI'))*60) +
(to_number(to_char(sysdate,'SS')))
) * (16*1024)) - dbms_flashback.get_system_change_number)/(16*1024*60*60)
scn_headeroom from dual;
查询当前可达到的最大SCN值
select
to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
(
((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) +
((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
(((to_number(to_char(sysdate,'DD'))-1))*24*60*60) +
(to_number(to_char(sysdate,'HH24'))*60*60) +
(to_number(to_char(sysdate,'MI'))*60) +
(to_number(to_char(sysdate,'SS')))
) * (16*1024) max_scn from dual;
SCN相关参数
SQL> col name for a40
SQL> col description for a60
SQL> col value for a10
SQL> set pages 1000 lines 156
SQL> SELECT i.ksppinm name,
2 cv.ksppstvl value,
3 i.ksppdesc description
4 FROM sys.x$ksppi i, sys.x$ksppcv cv
5 WHERE i.inst_id = USERENV ('Instance')
6 AND CV.inst_id = USERENV ('Instance')
7 and i.indx = cv.indx
8 and i.ksppinm like '%&1%'
9 order by replace (i.ksppinm, '_', '');
Enter value for 1: scn
old 8: and i.ksppinm like '%&1%'
new 8: and i.ksppinm like '%scn%'
NAME VALUE DESCRIPTION ---------------------------------------- ---------- ------------------------------------------------------------ _broadcast_scn_wait_timeout 10 broadcast-on-commit scn wait timeout in centiseconds db_unrecoverable_scn_tracking TRUE Track nologging SCN in controlfile _dump_scn_increment_stack Dumps scn increment stack per session _enable_cscn_caching FALSE enable commit SCN caching for all transactions _enable_minscn_cr TRUE enable/disable minscn optimization for CR _enable_scn_wait_interface TRUE use this to turn off scn wait interface in kta _external_scn_logging_threshold_seconds 86400 High delta SCN threshold in seconds _external_scn_rejection_delta_threshold_ 0 external SCN rejection delta threshold in minutes minutes _external_scn_rejection_threshold_hours 24 Lag in hours between max allowed SCN and an external SCN _fbda_global_bscn_lag 0 flashback archiver global barrier scn lag _gc_check_bscn TRUE if TRUE, check for stale blocks _gc_global_checkpoint_scn TRUE if TRUE, enable global checkpoint scn _kdli_recent_scn FALSE use recent (not dependent) scns for block format/allocation _max_pending_scn_bcasts 8 maximum number of pending SCN broadcasts _max_reasonable_scn_rate 32768 Max reasonable SCN rate _scn_wait_interface_max_backoff_time_sec 600 max exponential backoff time for scn wait interface in kta s _scn_wait_interface_max_timeout_secs 2147483647 max timeout for scn wait interface in kta 17 rows selected. SQL>
scnhealthcheck.sql
define LOWTHRESHOLD=10
define MIDTHRESHOLD=62
define VERBOSE=TRUE
set veri off;
set feedback off;
set serverout on
DECLARE
verbose boolean:=&&VERBOSE;
BEGIN
For C in (
select
version,
date_time,
dbms_flashback.get_system_change_number current_scn,
indicator
from
(
select
version,
to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
((((
((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) +
((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
(((to_number(to_char(sysdate,'DD'))-1))*24*60*60) +
(to_number(to_char(sysdate,'HH24'))*60*60) +
(to_number(to_char(sysdate,'MI'))*60) +
(to_number(to_char(sysdate,'SS')))
) * (16*1024)) - dbms_flashback.get_system_change_number)
/ (16*1024*60*60*24)
) indicator
from v$instance
)
) LOOP
dbms_output.put_line( '-----------------------------------------------------'
|| '---------' );
dbms_output.put_line( 'ScnHealthCheck' );
dbms_output.put_line( '-----------------------------------------------------'
|| '---------' );
dbms_output.put_line( 'Current Date: '||C.date_time );
dbms_output.put_line( 'Current SCN: '||C.current_scn );
if (verbose) then
dbms_output.put_line( 'SCN Headroom: '||round(C.indicator,2) );
end if;
dbms_output.put_line( 'Version: '||C.version );
dbms_output.put_line( '-----------------------------------------------------'
|| '---------' );
IF C.version > '10.2.0.5.0' and
C.version NOT LIKE '9.2%' THEN
IF C.indicator>&MIDTHRESHOLD THEN
dbms_output.put_line('Result: A - SCN Headroom is good');
dbms_output.put_line('Apply the latest recommended patches');
dbms_output.put_line('based on your maintenance schedule');
IF (C.version < '11.2.0.2') THEN
dbms_output.put_line('AND set _external_scn_rejection_threshold_hours='
|| '24 after apply.');
END IF;
ELSIF C.indicator<=&LOWTHRESHOLD THEN
dbms_output.put_line('Result: C - SCN Headroom is low');
dbms_output.put_line('If you have not already done so apply' );
dbms_output.put_line('the latest recommended patches right now' );
IF (C.version < '11.2.0.2') THEN
dbms_output.put_line('set _external_scn_rejection_threshold_hours=24 '
|| 'after apply');
END IF;
dbms_output.put_line('AND contact Oracle support immediately.' );
ELSE
dbms_output.put_line('Result: B - SCN Headroom is low');
dbms_output.put_line('If you have not already done so apply' );
dbms_output.put_line('the latest recommended patches right now');
IF (C.version < '11.2.0.2') THEN
dbms_output.put_line('AND set _external_scn_rejection_threshold_hours='
||'24 after apply.');
END IF;
END IF;
ELSE
IF C.indicator<=&MIDTHRESHOLD THEN
dbms_output.put_line('Result: C - SCN Headroom is low');
dbms_output.put_line('If you have not already done so apply' );
dbms_output.put_line('the latest recommended patches right now' );
IF (C.version >= '10.1.0.5.0' and
C.version <= '10.2.0.5.0' and
C.version NOT LIKE '9.2%') THEN
dbms_output.put_line(', set _external_scn_rejection_threshold_hours=24'
|| ' after apply');
END IF;
dbms_output.put_line('AND contact Oracle support immediately.' );
ELSE
dbms_output.put_line('Result: A - SCN Headroom is good');
dbms_output.put_line('Apply the latest recommended patches');
dbms_output.put_line('based on your maintenance schedule ');
IF (C.version >= '10.1.0.5.0' and
C.version <= '10.2.0.5.0' and
C.version NOT LIKE '9.2%') THEN
dbms_output.put_line('AND set _external_scn_rejection_threshold_hours=24'
|| ' after apply.');
END IF;
END IF;
END IF;
dbms_output.put_line(
'For further information review MOS document id 1393363.1');
dbms_output.put_line( '-----------------------------------------------------'
|| '---------' );
END LOOP;
end;
/
--------------------------------------------------------------
ScnHealthCheck
--------------------------------------------------------------
Current Date: 2016/03/31 16:46:00
Current SCN: 14874127583521
SCN Headroom: 1.24
Version: 12.1.0.2.0
--------------------------------------------------------------
Result: C - SCN Headroom is low
If you have not already done so apply
the latest recommended patches right now
AND contact Oracle support immediately.
For further information review MOS document id 1393363.1
--------------------------------------------------------------
SQL>
SCN历史HeadRoom信息查询
set numwidth 17
set pages 1000
alter session set nls_date_format='DD/Mon/YYYY HH24:MI:SS';
SELECT tim, gscn,
round(rate),
round((chk16kscn - gscn)/24/3600/16/1024,1) "Headroom"
FROM
(
select tim, gscn, rate,
((
((to_number(to_char(tim,'YYYY'))-1988)*12*31*24*60*60) +
((to_number(to_char(tim,'MM'))-1)*31*24*60*60) +
(((to_number(to_char(tim,'DD'))-1))*24*60*60) +
(to_number(to_char(tim,'HH24'))*60*60) +
(to_number(to_char(tim,'MI'))*60) +
(to_number(to_char(tim,'SS')))
) * (16*1024)) chk16kscn
from
(
select FIRST_TIME tim , FIRST_CHANGE# gscn,
((NEXT_CHANGE#-FIRST_CHANGE#)/
((NEXT_TIME-FIRST_TIME)*24*60*60)) rate
from v$archived_log
where (next_time > first_time)
)
)
order by 1,2
;
Reference
Installing, Executing and Interpreting output from the "scnhealthcheck.sql" script (Note ID 1393363.1)
System Change Number (SCN), Headroom, Security and Patch Information (Note ID 1376995.1)
http://www.oracle.com/technetwork/topics/security/cpujan2012-366304.html
Patch Set Update and Critical Patch Update January 2012 Availability Document (Note ID 1374524.1)