Enabling Advanced Features of Oracle Net Services
一、什么是TAF
Transparent Application Failover ,透明应用程序故障转移,当客户端和oracle数据库建立连接后,正在执行某个应用,此时客户端连接的某个实例突然宕机,那么客户端的连接就会被转移到其他健康实例上去, 对于客户端的taf,如果执行的是update,insert等事物语句,那么就会回退,如果是select语句,那么会转移到新的实例继续执行,这个过程不需要用户的的介入,是透明的。
TAF 是和下列类型的故障转移
Oracle Real Application Clusters
Replicated systems
Standby databases
Single instance Oracle database
二:net service参数:
Parameter Description
FAILOVER
在连接监听的时候,如果FAILOVER=on,在连接地址列表中的第一个地址,监听连接失败,指示net去连接地址列表中的其他监听,当FAILOVER=on=off 时,只会连接地址列表中的第一地址的监听,
Connect-time failover is turned on by default for multiple address lists (ADDRESS_LIST), connect descriptors (DESCRIPTION), and multiple connect descriptors (DESCRIPTION_LIST).
When using a connect descriptor with a SERVICE_NAME, ensure that the value is not a GLOBAL_DBNAME in any SID_DESCentry, or a SID_NAME in any SID_DESC entry without a GLOBAL_DBNAME set.
LOAD_BALANCE
当LOAD_BALANCE =on 那么指示net progress 以随机的方式连接地址列表中地址,在不同监听之间平衡负载压力。当LOAD_BALANCE =off
是让net progress 安顺序访问地址列表的监听,直到连接成功为止.
Client load balancing is turned on by default for multiple connect descriptors (DESCRIPTION_LIST).
SOURCE_ROUTE
When set to on, instructs Oracle Net to use each address in the order presented until the destination is reached. This parameter is required for reaching the destination using a specific route, that is, by specific computers. This parameter is used to enable connections to Oracle Connection Manager.
Creating a List of Listener Protocol Addresses
sales.us.example.com=
(DESCRIPTION=
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=sales1-server)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=sales2-server)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)))
Table 13-1 Address List Parameters in the tnsnames.ora File
Parameter | Description |
---|---|
At connect time, this parameter instructs Oracle Net to fail over to a different listener if the first listener fails when set to Connect-time failover is turned | |
When set to Client load balancing is turned | |
When set to |
Note:
You cannot set source routing (SOURCE_ROUTE) at the same level as connect-time failover (FAILOVER) or client load balancing (LOAD_BALANCE). Source routing connects to each address in the list sequentially whereas connect-time failover and client load balancing select a single address from a list.
When a connect descriptor in a tnsnames.ora
file contains at least two protocol addresses for an Oracle Connection Manager hop, then parameters for connect-time failover and load balancing within the hop can be included in the file.
Table 13-2 describes the address list options.
Table 13-2 Address List Options Dialog Box
Option | Parameter Setting |
---|---|
Try each address, in order, until one succeeds. | |
Try each address, randomly, until one succeeds. | LOAD_BALANCE=on FAILOVER=on |
Try one address, selected at random. | LOAD_BALANCE=on |
Use each address in order until destination reached. | SOURCE_ROUTE=on |
Use only the first address. | LOAD_BALANCE=off FAILOVER=off SOURCE_ROUTE=off |
sales.us.example.com=
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=off)
(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=sales1-server)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=sales2-server)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME=sales.us.example.com)))
三、TAF 参数
BACKUP
A different net service name for backup connections. A backup should be specified when using preconnect to pre-establish connections.
DELAY
每次重试的时间间隔. 如果 RETRIES 已经指定,那么DELAY默认是1秒,If a callback function is registered, then this parameter is ignored.
METHOD
设置快速转移从 primary node 到 backup node:
basic: 在故障转移的时间间隔建立连接。此选项是故障转移前是不去对backup node做连接的,只有在故障发生时,才开始去连接backup node。
preconnect: 建立预连接。此选项比basic方式提供了更快的故障转移,。用户建立connect,对primary node和backup node 节点都会建立连接
BASIC方式在Failover时会有时间延迟,PRECONNECT方式虽然没有时间延迟,但是建立多个冗余连接会消耗更多资源
RETRIES
故障转移重试的次数,如果指定了DELAY,那么默认重试5次,If a callback function is registered, then this parameter is ignored.
TYPE
故障转移的类型. Three types of Oracle Net failover functionality are available by default to Oracle Call Interface (OCI) applications:
session: Set to failover the session. If a user's connection is lost, then a new session is automatically created for the user on the backup. This type of failover does not attempt to recover select operations.
select: Set to enable users with open cursors to continue fetching on them after failure. However, this mode involves overhead on the client side in normal select operations.
none: This is the default. No failover functionality is used. This can also be explicitly specified to prevent failover from happening.
对于select和session方式对于未提交的事务都会自动回滚,区别在于对select 语句的处理,对于select,用户正在执行的select语句会被转移到新的实例上,在新的节点上继续返回后续结果集,而已经返回的记录集则抛弃。
假设用户正在节点1上执行查询,整个结果集共有100条记录,现在已从节点1上返回10条记录,这时节点1宕机,用户连接被转移到节点2上,如果是session模式,则需要重新执行查询语句;如果是select方式,会从节点2上继续返回剩下的90天记录,而已经从节点1返回的10条记录不会重复返回给用户,对于用户而言,感受不到这种切换,显然为了实现select 方式,Oracle 必须为每个session保存更多的内容,包括游标,用户上下文等,需要更多的资源也是用资源换时间的方案。
四、举例
TAF with Connect-Time Failover and Client Load Balancing
在客户端的tnsname.ora里添加如下
sales.us.example.com=
(DESCRIPTION=
(LOAD_BALANCE=on) ----随机访问一个监听,直到成功
(FAILOVER=on)
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server) ---ip地址或者主机名
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(FAILOVER_MODE=
(TYPE=select) --select方式的故障转移
(METHOD=basic))))
此方式建立的的故障转移,只能故障转移select语句,例如 updateinsert 语句在failover后操作会回退。
update t1 set sal=sal+1
此时数据库出现故障,那么就会被故障转移到另外一个节点,这是提交就会保存哦
SQL> commit;
commit
*
第 1 行出现错误:
ORA-25402: 事务处理必须重新运行
以此方式建立的taf,是建立在客户端的tnsname.ora里,这种方式建立的taf有一定的弊端,就是当客户端连接的时候,如果某个实例是在mount状态,那么客户端也是可能连接上去,由于是mount状态,所以会报错,而且现在应用程序都是用中间层,在客户端配置容易出错误,所以 有了服务端的taf,能解决此问题。
TAF Retrying a Connection
sales.us.example.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic)
(RETRIES=20)
(DELAY=15)))) ----TAF重新连接的时间,每次重试连接间隔15秒,总共重试20次,故障转移类型为select方式
TAF Pre-establishing a Connection
sales1.us.example.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_NAME=sales1)
(FAILOVER_MODE=
(BACKUP=sales2.us.example.com) -----此backup指向下面红色字体
(TYPE=select)
(METHOD=preconnect))))
sales2.us.example.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_NAME=sales2)
(FAILOVER_MODE=
(BACKUP=sales1.us.example.com)
(TYPE=select)
(METHOD=preconnect))))
通过测试此预连接方式,只能故障转移已经连接到数据库上的连接select。如果已经发生故障,再去连接sales1.us.example.com是连接不上的. 发生故障前连接到发生故障的数据库的update ,在发生故障时没有提交的会回退。
Specifying the Instance Role for Primary and Secondary Instance Configurations
instance_role 在下列情况使用:
1、你想明确指定连接到第一个实例还是第二个实例上去,默认是连接到第一个实例,主实例上。
2、你想使用taf预连接到第二实例上去。
The INSTANCE_ROLE 值有:
primary: 指定连接到 primary instance.
secondary: 指定连接到secondary instance.
any: 可以连接到连接地址里的任何一个实例。
Connection to Instance Role Type
sales_primary=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_ROLE=primary)))
sales_secondary=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_ROLE=secondary)))
Connection To a Specific Instance
sales1=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_ROLE=any)
(INSTANCE_NAME=sales1)
(SERVER=dedicated)))
sales2=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_ROLE=any)
(INSTANCE_NAME=sales2)
(SERVER=dedicated)))
TAF Pre-Establishing a Connection
sales1.example.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales1-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_ROLE=primary)
(FAILOVER_MODE=
(BACKUP=sales2.example.com)
(TYPE=select)
(METHOD=preconnect))))
sales2.example.com=
(DESCRIPTION=
(ADDRESS=
(PROTOCOL=tcp)
(HOST=sales2-server)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)
(INSTANCE_ROLE=secondary)))
通过语句可以查看是否已经发生了故障转移,以及故障转移的方式
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION
GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;
MACHINE FAILOVER_TYPE FAILOVER_M FAI COUNT(*)
---------------------------------------------------------------- ------------- ---------- --- ----------
rac1 NONE NONE NO 46
rac1 SESSION BASIC YES 1