Changing the hostname mapped to the private network IP in 11.2

Oracle Clusterware uses oifcfg to manage network information (NICs, subnets, interface roles and so on), but it does not manage the actual IP address on each NIC.
Therefore oifcfg cannot be used to change an IP address (only the subnet).
You can use oifcfg getif to show the interface information currently stored in the OCR, for example:

[root@RAC1 ~]# which oifcfg
/u01/11.2.0/grid/bin/oifcfg
[root@RAC1 ~]# env|grep ORA
ORACLE_SID=+ASM1
ORACLE_BASE=/u01/app/grid
ORACLE_HOME=/u01/11.2.0/grid
[root@RAC1 ~]# 
[root@RAC1 ~]# oifcfg getif 
eth0  192.168.56.0  global  public
eth1  192.168.88.0  global  cluster_interconnect
[root@RAC1 ~]# 
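
If only the interconnect subnet needs to change (not the per-node address inside it), a minimal sketch with oifcfg looks like the following; the 192.168.99.0 subnet is a made-up example, and the commands are run with the clusterware stack available:

# as the GI owner or root, from the GI home
oifcfg delif -global eth1
oifcfg setif -global eth1/192.168.99.0:cluster_interconnect
oifcfg getif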

On UNIX and Linux the network interface names are normally assigned by the OS.
Windows is a bit different (I am not familiar with RAC on Windows; I have installed it on that platform fewer than ten times in my whole career...).
The network through which the database serves its clients is usually called the PUBLIC network; the VIP is an address on the same subnet as the public network and is stored in the OCR.
Note: the VIP is a virtual IP bound on top of the public IP; it only exists once CRS is up.
The network used for communication between RAC nodes, and between the RDBMS and ASM instances, is called the private network (the interconnect).
Starting with 11.2 the cluster_interconnect is also used for the Clusterware heartbeats.
Before 11.2 (11.1 and 10g), RAC used the hostname mapped to the private IP (for example lunar-priv) for the cluster heartbeat (the installer prompts for it during a RAC installation).
.
When the private network IP or its hostname is changed, CRS (10g and 11.1) or GI (11.2 and later) must be updated accordingly.
8i and 9i did not have this problem: back then Oracle had no clusterware of its own, only a "cluster database", and all clustering was provided by third-party cluster software,
for example:
MC/ServiceGuard (OPS option) on HP-UX
HACMP on AIX
Sun Cluster on Solaris
TruCluster on Tru64 UNIX
...
.
Before 11.2 the hostname mapped to the private IP was stored in the OCR, so the private-network hostname could not be changed.
For the same reason, though, you could change the private IP itself, because the hostname it mapped to did not actually change.
If the hostname really had to change, you simply ran rootdelete.sh and rootdeinstall.sh and reconfigured CRS.
For a worked example see "Rebuilding CRS in a 10.2.0.1 RAC" later in this document.
.
Starting with 11.2, CRS (Cluster Ready Services) became GI (Grid Infrastructure).
The private hostname mapped to the private IP is no longer stored in the OCR; it is managed through the GPnP architecture (Grid Plug and Play, covered later).
There is no longer any dependency between the OCR and the private hostname, so /etc/hosts can be edited freely.
For example, here
we change the private hostnames on both nodes:
RAC1-priv becomes lunar1-priv
RAC2-priv becomes lunar2-priv

[root@RAC1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

#rac1
192.168.56.81   rac1.localdomain        rac1
192.168.56.85   rac1-vip.localdomain    rac1-vip
192.168.88.81   lunar1-priv.localdomain lunar1-priv

#rac2
192.168.56.82   rac2.localdomain        rac2
192.168.56.86   rac2-vip.localdomain    rac2-vip
192.168.88.82   lunar2-priv.localdomain lunar2-priv

#scan-ip
192.168.56.88   scan


10.10.10.66 s-dg
10.10.10.11 r-dg1
10.10.10.12 r-dg2
10.10.10.21 r-dg1-vip
10.10.10.22 r-dg2-vip
[root@RAC1 ~]# 

Stop CRS:

[root@RAC1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.racdb.db' on 'rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac1'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac2'
CRS-2677: Stop of 'ora.racdb.db' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ASMDATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2676: Start of 'ora.scan1.vip' on 'rac2' succeeded
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.ASMDATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'rac2'
CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'rac2'
CRS-2676: Start of 'ora.cvu' on 'rac2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@RAC1 ~]# 

Change the private network IP address and its hostname at the OS level.
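At the OS level this typically means editing the interface configuration and the hosts file; a minimal sketch (eth1 and the lunar*-priv names are the ones used in this environment, the rest follows normal Linux conventions):

# on each node, as root
vi /etc/sysconfig/network-scripts/ifcfg-eth1   # set the new private IPADDR
vi /etc/hosts                                  # map lunar1-priv / lunar2-priv to the new addresses
service network restart                        # or ifdown eth1; ifup eth1
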
Then simply restart CRS:

[root@RAC1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@RAC1 ~]# 

CRS on node 1 comes up cleanly (oc4j is slow to start; that can be ignored).
Node 2 was shut down and not started, so only node 1 shows up here.
Also, Data Guard in this setup has its own dedicated NIC, which I adjusted as well; how to repair the ADG-specific network2 is covered later.

[root@RAC1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMDATA.dg
               ONLINE  ONLINE       rac1                                         
ora.DATA.dg
               ONLINE  ONLINE       rac1                                         
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                                         
ora.LISTENER_DG.lsnr
               ONLINE  OFFLINE      rac1                                         
ora.asm
               ONLINE  ONLINE       rac1                     Started             
ora.gsd
               OFFLINE OFFLINE      rac1                                         
ora.net1.network
               ONLINE  ONLINE       rac1                                         
ora.net2.network
               ONLINE  OFFLINE      rac1                                         
ora.ons
               ONLINE  ONLINE       rac1                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                                         
ora.cvu
      1        ONLINE  ONLINE       rac1                                         
ora.oc4j
      1        ONLINE  OFFLINE                               STARTING            
ora.r-dg1-vip.vip
      1        ONLINE  OFFLINE                                                   
ora.r-dg2-vip.vip
      1        ONLINE  OFFLINE                                                   
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                                         
ora.rac2.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER         
ora.racdb.db
      1        ONLINE  ONLINE       rac1                     Open                
      2        ONLINE  OFFLINE                                                   
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                                         
[root@RAC1 ~]# 

Repairing CRS startup failures in 11.2 — example 1: fixing a damaged node with the gpnp profile from a healthy node

The OS of one node of an 11.2 RAC on my SSD would not come up any more; I fiddled with it for a long time with no luck.
The SSD was bought in 2013, only two years old... not sure whether that is the reason; rather frustrating.
Meanwhile the Oracle 8.0.6 on RedHat 2, sitting on a portable disk more than ten years old, still works fine.
.
For this environment I copied an old backup of node 1 from another portable disk onto the SSD and tried to repair the whole RAC.
Since all that was needed was to make node 1's IP configuration match my current VirtualBox setup, and node 2 was healthy, no heavy weapons were required.
Only two things were needed:
1. Fix node 1's network configuration at the OS level:

	/etc/hosts
	/etc/sysconfig/network-scripts/ifcfg-eth1
	/etc/sysconfig/network

2. Copy node 2's gpnp profile over to node 1.
.
The details:
1. With CRS shut down on both nodes, copy node 2's profile.xml to node 1.
First confirm that CRS is down on node 2:

[root@RAC2 ~]# ps -ef|grep d.bin
root      5026  3697  0 22:49 pts/4    00:00:00 grep d.bin
[root@RAC2 ~]#

2. Confirm that node 2's current gpnp profile information is correct.
In my case the key point is that the private network should be on the 192.168.20.0 subnet:

As the output below shows, it is correct.

[root@RAC2 ~]# gpnptool get
Warning: some command line parameters were defaulted. Resulting command line: 
         /u01/11.2.0/grid/bin/gpnptool.bin get -o-

Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running). 
GPnP service is not running on localhost. Found locally cached profile...
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="12" ClusterUId="4c62524e6a7dff62ffdae005dc6a08d6" ClusterName="racdb" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.56.0" Adapter="eth0" Use="public,cluster_interconnect"/><gpnp:Network id="net2" Adapter="eth1" IP="192.168.20.0" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/racdb/asmparameterfile/registry.253.814453247"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>s+dn2dL74Wsg58TpBl1wukYt3JM=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>hGmm6px+trv2uYThXOlkXUvuMjSJng7ZXgcWdwGeOugAXWRd58f/cHvHbioeKi2XK0kcUnh5OW2a9Mlhpy52Xi8+QdZdHNh5DSZ02HggiEJf0o0T29TZJr2mafTYyNKqzhHRv3aidAxwrPjLPr80rk6tEhB60hY9Ew+G15Do7D4=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.

Error CLSGPNP_NO_DAEMON getting profile.
[root@RAC2 ~]#

Note: when none of the CRS processes are running, gpnptool reports the profile from its own local cache (my guess is that this so-called cache is simply read from the profile saved on disk).
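For reference, the locally cached copy that gpnptool falls back to lives under the GI home (the rac2 part of the path is host-specific, as also seen in step 6 below):

ls -l $GRID_HOME/gpnp/rac2/profiles/peer/profile.xml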
.
3. Look at node 1's current gpnp profile; note that its net2 entry is wrong:

<gpnp:Network id="net2" IP="192.168.88.0" Adapter="eth1" Use="cluster_interconnect"/>
[root@RAC1 bin]# ./gpnptool get
Warning: some command line parameters were defaulted. Resulting command line: 
         ./gpnptool.bin get -o-

Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running). 
GPnP service is not running on localhost. Found locally cached profile...
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="8" ClusterUId="4c62524e6a7dff62ffdae005dc6a08d6" ClusterName="racdb" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.56.0" Adapter="eth0" Use="public"/><gpnp:Network id="net2" IP="192.168.88.0" Adapter="eth1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/racdb/asmparameterfile/registry.253.814453247"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>oNh4/PhSSqqj2jARHJa0GCNngHE=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>WThZxj2jGML+m+UrGrPBGZztS5xw6KJMb4eFL5l2MURoPAZCso5Ld9uTJ/taYpZAnamwYNYvkpouc/g/PuTI1WJGc2IBzFmF0ECnWXGGEPcS/8Sm4iiVvkyAYc/kNUA/DLzCvq1hozOopVgmQgfJZ/B+WVP423mBrnPFKYaFRKY=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.

Error CLSGPNP_NO_DAEMON getting profile.
[root@RAC1 bin]#

4. Confirm that all CRS processes on node 1 are down:

[root@RAC1 bin]# ps -ef|grep d.bin
root      3523  2865  0 22:48 pts/1    00:00:00 grep d.bin
[root@RAC1 bin]#

5. Back up node 1's current gpnp profile:

[root@RAC1 bin]# mkdir /home/oracle/gpnp
[root@RAC1 bin]# export GPNPDIR=/home/oracle/gpnp
[root@RAC1 bin]# env|grep GPNPDIR
GPNPDIR=/home/oracle/gpnp
[root@RAC1 bin]# ./gpnptool get -o=$GPNPDIR/profile.original
Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running). 
GPnP service is not running on localhost. Found locally cached profile...
Resulting profile written to "/home/oracle/gpnp/profile.original".
Success.

Error CLSGPNP_NO_DAEMON getting profile.
[root@RAC1 bin]# ll /home/oracle/gpnp/profile.original
-rw-r--r-- 1 root root 1878 Apr 11 22:52 /home/oracle/gpnp/profile.original
[root@RAC1 bin]#

6. Copy node 2's gpnp profile to node 1:

[root@RAC2 ~]# ll $GRID_HOME/gpnp/rac2
total 8
drwxr-x--T 6 grid oinstall 4096 May  3  2013 wallets
drwxr-x--- 3 grid oinstall 4096 May  3  2013 profiles
[root@RAC2 ~]# scp $GRID_HOME/gpnp/rac2/profiles/peer/profile.xml rac1:$GRID_HOME/gpnp/rac1/profiles/peer/profile.xml
root@rac1's password: 
profile.xml                                                                                                                                                 100% 1900     1.9KB/s   00:00    
[root@RAC2 ~]#

7. Start CRS on node 1 and node 2 (a normal start is enough):

[root@RAC1 bin]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@RAC1 bin]# 

8. Node 1 can now start normally:

[root@RAC1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMDATA.dg
               ONLINE  ONLINE       rac1                                         
ora.DATA.dg
               ONLINE  ONLINE       rac1                                         
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                                         
ora.LISTENER_DG.lsnr
               ONLINE  ONLINE       rac1                                         
ora.asm
               ONLINE  ONLINE       rac1                     Started             
ora.gsd
               OFFLINE OFFLINE      rac1                                         
ora.net1.network
               ONLINE  ONLINE       rac1                                         
ora.net2.network
               ONLINE  ONLINE       rac1                                         
ora.ons
               ONLINE  ONLINE       rac1                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                                         
ora.cvu
      1        ONLINE  ONLINE       rac1                                         
ora.oc4j
      1        ONLINE  ONLINE       rac1                                         
ora.r-dg1-vip.vip
      1        ONLINE  ONLINE       rac1                                         
ora.r-dg2-vip.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER         
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                                         
ora.rac2.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER         
ora.racdb.db
      1        ONLINE  ONLINE       rac1                     Open                
      2        ONLINE  OFFLINE                                                   
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                                         
[root@RAC1 ~]# gpnptool get
Warning: some command line parameters were defaulted. Resulting command line: 
         /u01/11.2.0/grid/bin/gpnptool.bin get -o-

<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="12" ClusterUId="4c62524e6a7dff62ffdae005dc6a08d6" ClusterName="racdb" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.56.0" Adapter="eth0" Use="public,cluster_interconnect"/><gpnp:Network id="net2" Adapter="eth1" IP="192.168.20.0" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/racdb/asmparameterfile/registry.253.814453247"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>s+dn2dL74Wsg58TpBl1wukYt3JM=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>hGmm6px+trv2uYThXOlkXUvuMjSJng7ZXgcWdwGeOugAXWRd58f/cHvHbioeKi2XK0kcUnh5OW2a9Mlhpy52Xi8+QdZdHNh5DSZ02HggiEJf0o0T29TZJr2mafTYyNKqzhHRv3aidAxwrPjLPr80rk6tEhB60hY9Ew+G15Do7D4=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
Success.
[root@RAC1 ~]# 

The network information on node 1 is back to normal as well.

Now start node 2:

[root@RAC2 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@RAC2 ~]# 

Except for network2 (the network used for ADG, whose configuration has not been fixed yet), everything is back to normal.

[root@RAC2 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMDATA.dg
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.DATA.dg
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.LISTENER_DG.lsnr
               ONLINE  ONLINE       rac1                                         
               ONLINE  OFFLINE      rac2                                         
ora.asm
               ONLINE  ONLINE       rac1                     Started             
               ONLINE  ONLINE       rac2                     Started             
ora.gsd
               OFFLINE OFFLINE      rac1                                         
               OFFLINE OFFLINE      rac2                                         
ora.net1.network
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.net2.network
               ONLINE  ONLINE       rac1                                         
               ONLINE  OFFLINE      rac2                                         
ora.ons
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                                         
ora.cvu
      1        ONLINE  ONLINE       rac1                                         
ora.oc4j
      1        ONLINE  ONLINE       rac1                                         
ora.r-dg1-vip.vip
      1        ONLINE  ONLINE       rac1                                         
ora.r-dg2-vip.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER         
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                                         
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                                         
ora.racdb.db
      1        ONLINE  ONLINE       rac1                     Open                
      2        ONLINE  ONLINE       rac2                     Open                
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                                         
[root@RAC2 ~]# 

Rebuilding CRS in a 10.2.0.1 RAC (for example when a bad IP change leaves CRS unable to start)

The VIP, introduced in Oracle 10.1, is a static IP bound on top of the public IP; its information is stored in the OCR (Oracle Cluster Registry) together with the PUBLIC NIC information.
Customers often have this requirement: the test environment uses one set of IPs, and after moving to production a different set is used.
If the customer does not want to reinstall, the RAC IPs have to be changed.
.
Normally this is a small operation, nothing dramatic, provided the official procedure is followed: for instance, should CRS be up or down while the IP is changed?
Another common mistake: when changing the IP, what actually has to be supplied is the subnet, not the IP address itself; this too frequently leaves CRS unable to start after an IP change.
Overlook such small details and CRS will not start.
Whether it is 10.2, 11.1 or 11.2, the core of all this has changed little; the difference is that 11.2 is more polished and offers more repair options (including some that are not officially supported).
.
A rough summary:
1. The hostname associated with the PUBLIC network cannot be changed; officially, changing it requires reinstalling CRS.
My own feeling: if you rename the directories that are named after the hostname and then reconfigure CRS, it should work in theory.
On 10g that means rootdelete.sh, rootdeinstall.sh and root.sh;
on 11.2 and later it means deconfig and reconfig.
That is fine for test boxes and playing around, but in production I would stick to the official procedure; otherwise you are digging a hole for yourself when odd problems turn up later...
.
2. For the PUBLIC network, changing only the public IP without touching the hostname requires no change to the CRS configuration, because neither 10g, 11.1 nor 11.2 stores the concrete IP in CRS.
3. Changing the PUBLIC network's interface name, subnet or netmask has to be done with oifcfg (see the sketch after this list).
A customer once asked how long that takes.
Answer: in theory under 10 minutes; if verification and a rollback plan are needed, ask for a 30-minute to 1-hour maintenance window.
.
4. Changing the private network: in 10.2 and 11.1 this information is stored in CRS, so CRS must be updated accordingly (with oifcfg).
From 11.2 onward, CRS (Cluster Ready Services) became GI (Grid Infrastructure), and the private hostname mapped to the private IP is no longer stored in the OCR;
it is managed through the GPnP architecture (Grid Plug and Play, more on that later), and there is no longer any dependency between the OCR and the private hostname.
It can therefore be changed freely without touching anything at the CRS level.
.
5. If you get it wrong, no problem: in 10.2 and 11.1 the big hammer is rootdelete.sh, rootdeinstall.sh and root.sh,
while 11.2 adds, besides the deconfig/reconfig of the cluster, the ability to repair using the gpnp profile.
gpnptool can fix not only this kind of problem but also broken information such as the ASM spfile location or the CRS profile location
(for example, if the ASM disk discovery string is wrong, GI cannot start).
.
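As a rough sketch of point 3 above (the addresses and subnet are made-up examples; the node name rh1 comes from the 10.2 environment later in this post), the public subnet is changed with oifcfg, and the VIP definition, which lives in the OCR, is re-pointed with srvctl:

# run from the clusterware home as root; all values are examples only
oifcfg delif -global eth0
oifcfg setif -global eth0/10.10.1.0:public
srvctl modify nodeapps -n rh1 -A 10.10.1.21/255.255.255.0/eth0
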
Now, the procedure for rebuilding CRS in 10.2.
It can be used to repair two kinds of problems:
1. CRS cannot start because an IP (public IP, VIP, private IP) was changed incorrectly.
2. CRS cannot start because of some other problem with the CRS configuration.
.
The steps are as follows:
1. Stop CRS on node 1

[root@rh1 ~]# crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[root@rh1 ~]#

2. Run rootdelete.sh on node 1 to remove the CRS configuration:

[root@rh1 ~]# cd $ORA_CRS_HOME/install
[root@rh1 install]# ls
cluster.ini         install.excl  paramfile.crs  rootaddnode.sbs   rootdeletenode.sh  rootlocaladd
cmdllroot.sh        install.incl  preupdate.sh   rootconfig        rootdelete.sh      rootupgrade
envVars.properties  make.log      readme.txt     rootdeinstall.sh  rootinstall        templocal
[root@rh1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rh1 install]# 

Check the current NIC configuration:

[root@rh1 install]#ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:8A:1A:12
          inet addr:192.168.10.11  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe8a:1a12/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1504 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1295 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:129195 (126.1 KiB)  TX bytes:118741 (115.9 KiB)
          Interrupt:10 Base address:0x1400

eth1      Link encap:Ethernet  HWaddr 00:0C:29:8A:1A:1C
          inet addr:192.168.20.11  Bcast:192.168.20.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe8a:1a1c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:204 errors:0 dropped:0 overruns:0 frame:0
          TX packets:210 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:19882 (19.4 KiB)  TX bytes:19404 (18.9 KiB)
          Interrupt:9 Base address:0x1480

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:19827 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19827 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6788710 (6.4 MiB)  TX bytes:6788710 (6.4 MiB)

[root@rh1 install]#

3. Stop CRS on node 2

[root@rh2 ~]# crsctl stop crs
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
[root@rh2 ~]# 

4. Run rootdelete.sh on node 2 to remove the CRS configuration:

[root@rh2 ~]# cd $ORA_CRS_HOME/install
[root@rh2 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rh2 install]#

5. Run rootdeinstall.sh on node 1 to wipe the OCR device:

[root@rh1 install]# ./rootdeinstall.sh

Removing contents from OCR device
2560+0 records in
2560+0 records out
[root@rh1 install]#

[root@rh1 install]# ps -e | grep -i 'ocs[s]d'
[root@rh1 install]# ps -e | grep -i 'cr[s]d.bin'
[root@rh1 install]# ps -e | grep -i 'ev[m]d.bin'
[root@rh1 install]# ps -ef|grep crs
root      2309 32489  0 13:32 pts/1    00:00:00 grep crs
[root@rh1 install]# ps -ef|grep d.bin
root      2311 32489  0 13:32 pts/1    00:00:00 grep d.bin
[root@rh1 install]#

6. Run root.sh on node 1 to reconfigure CRS on node 1:

[root@rh1 crs]# ./root.sh
WARNING: directory '/u01/app/oracle/product/ora10g' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/product/ora10g' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
assigning default hostname rh1 for node 1.
assigning default hostname rh2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rh1 int1 rh1
node 2: rh2 int2 rh2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw1
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rh1
CSS is inactive on these nodes.
        rh2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
[root@rh1 crs]#

7. Run root.sh on node 2 to reconfigure CRS on node 2:

[root@rh2 crs]# ./root.sh
WARNING: directory '/u01/app/oracle/product/ora10g' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/product/ora10g' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
assigning default hostname rh1 for node 1.
assigning default hostname rh2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rh1 int1 rh1
node 2: rh2 int2 rh2
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rh1
        rh2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.
[root@rh2 crs]# 

Confirm that the CRS processes are up on both nodes:

[root@rh1 crs]# ps -ef|grep d.bin
root     29156     1  0 12:17 ?        00:00:00 /u01/app/oracle/product/ora10g/crs/bin/crsd.bin reboot
oracle   29418 29154  0 12:18 ?        00:00:00 /u01/app/oracle/product/ora10g/crs/bin/evmd.bin
oracle   29585 29555  0 12:18 ?        00:00:00 /u01/app/oracle/product/ora10g/crs/bin/ocssd.bin

[root@rh1 crs]#

[root@rh2 crs]# ps -ef|grep d.bin
root     19689     1  0 11:57 ?        00:00:00 /u01/app/oracle/product/ora10g/crs/bin/crsd.bin reboot
oracle   19961 19687  0 11:58 ?        00:00:00 /u01/app/oracle/product/ora10g/crs/bin/evmd.bin
oracle   20096 20070  0 11:58 ?        00:00:00 /u01/app/oracle/product/ora10g/crs/bin/ocssd.bin
root     21283  8784  0 11:59 pts/1    00:00:00 grep d.bin
[root@rh2 crs]# 

Check the NIC configuration again; the VIP addresses are now plumbed as interface aliases:

[root@rh1 crs]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:3F:E6:E7
          inet addr:192.168.10.11  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe3f:e6e7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1305 errors:0 dropped:0 overruns:0 frame:0
          TX packets:731 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:202565 (197.8 KiB)  TX bytes:184325 (180.0 KiB)
          Interrupt:9 Base address:0x1400

eth1      Link encap:Ethernet  HWaddr 00:0C:29:3F:E6:F1
          inet addr:192.168.20.11  Bcast:192.168.20.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe3f:e6f1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5727 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8359 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4252719 (4.0 MiB)  TX bytes:7524822 (7.1 MiB)
          Interrupt:10 Base address:0x1480

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:3F:E6:F1
          inet addr:192.168.10.21  Bcast:192.168.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0x1480

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:85980 errors:0 dropped:0 overruns:0 frame:0
          TX packets:85980 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6918640 (6.5 MiB)  TX bytes:6918640 (6.5 MiB)

[root@rh1 crs]# 

Then run VIPCA on node 1 (rh1) to configure the nodeapps.
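Because root.sh on node 2 complained that eth0 "is not public", vipca usually has to be launched manually as root; a minimal sketch (the DISPLAY value is only an example):

# as root on rh1, with an X display available
export DISPLAY=:0.0
cd $ORA_CRS_HOME/bin
./vipca
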

[root@rh1 crs]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2
[root@rh1 crs]#

[root@rh2 crs]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2
[root@rh2 crs]#

Add the database, instance and ASM resources back on node 1:

[oracle@rh1 ~]$ srvctl add database -d rac -o /u01/app/oracle/product/ora10g/db
[oracle@rh1 ~]$ srvctl add instance -d rac -i rac1 -n rh1
[oracle@rh1 ~]$ srvctl add instance -d rac -i rac2 -n rh2
[oracle@rh1 ~]$ srvctl add asm -n rh1 -i +ASM1 -o $ORACLE_HOME
[oracle@rh1 ~]$ srvctl add asm -n rh2 -i +ASM2 -o $ORACLE_HOME
[oracle@rh1 ~]$

Start ASM and the database:
[oracle@rh1 ~]$ srvctl start asm -n rh1
[oracle@rh1 ~]$ srvctl start asm -n rh2
[oracle@rh1 ~]$ srvctl start database -d rac
[oracle@rh1 ~]$

Everything is OK now:

[oracle@rh1 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rac.db     application    ONLINE    ONLINE    rh1
ora....c1.inst application    ONLINE    ONLINE    rh1
ora....c2.inst application    ONLINE    ONLINE    rh2
ora....SM1.asm application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh2
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2
[oracle@rh1 ~]$

Recreate the LISTENERs with netca:
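A minimal sketch of that step (run as oracle; netca is driven interactively here, and the DISPLAY value is only an example):

# as oracle on rh1, with an X display available
export DISPLAY=:0.0
$ORACLE_HOME/bin/netca        # choose the cluster configuration and add LISTENER on both nodes
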

[oracle@rh1 admin]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.rac.db     application    ONLINE    ONLINE    rh1         
ora....c1.inst application    ONLINE    ONLINE    rh1         
ora....c2.inst application    ONLINE    ONLINE    rh2         
ora....SM1.asm application    ONLINE    ONLINE    rh1         
ora....H1.lsnr application    ONLINE    ONLINE    rh1         
ora.rh1.gsd    application    ONLINE    ONLINE    rh1         
ora.rh1.ons    application    ONLINE    ONLINE    rh1         
ora.rh1.vip    application    ONLINE    ONLINE    rh1         
ora....SM2.asm application    ONLINE    ONLINE    rh2         
ora....H2.lsnr application    ONLINE    ONLINE    rh2         
ora.rh2.gsd    application    ONLINE    ONLINE    rh2         
ora.rh2.ons    application    ONLINE    ONLINE    rh2         
ora.rh2.vip    application    ONLINE    ONLINE    rh2         
[oracle@rh1 admin]$ 

Timing the creation of 31 GB files of various types and a 10 TB bigfile on Exadata X5-2 (HC)

This week I will be load-testing an Exadata X5-2 (HC) with SwingBench, so I prepared the tablespaces, redo logs and so on in advance.
1. First, how fast a 31 GB data file is created.
Adding a 31 GB data file to an ordinary tablespace takes roughly 12 seconds:

sys@DBM>create tablespace Lunar_500g datafile '+DATADG' size 31g 
autoextend on next 8g maxsize 31g  
22:53:20   2  22:53:20   3  EXTENT MANAGEMENT LOCAL AUTOALLOCATE  
22:53:20   4  SEGMENT SPACE MANAGEMENT AUTO ;  

Tablespace created.

22:53:32 sys@DBM>

For comparison, an all-flash EF550 (also about 20 TB usable; the Exadata X5-2 quarter rack likewise carries about 20 TB of flash):


[screenshot omitted]


Adding a 31 GB tempfile to the temporary tablespace takes just under 20 seconds:

00:17:04 sys@DBM>alter tablespace temp add tempfile '+DATADG' SIZE 31g;

Tablespace altered.

00:17:22 sys@DBM>

Adding a 31 GB data file to the undo tablespace takes just under 30 seconds:

00:18:34 sys@DBM>alter tablespace UNDOTBS1 add datafile '+DATADG' SIZE 31g;

Tablespace altered.

00:19:02 sys@DBM>

Adding a 31 GB data file to SYSTEM takes considerably longer, a bit under 80 seconds:

00:19:02 sys@DBM>alter tablespace SYSTEM add datafile '+DATADG' SIZE 31g;

Tablespace altered.

00:20:19 sys@DBM>

Adding a data file to SYSAUX takes about 20 seconds:

00:20:19 sys@DBM>alter tablespace SYSAUX add datafile '+DATADG' SIZE 31g;

Tablespace altered.

00:20:40 sys@DBM>

I also timed how fast a 10 TB bigfile tablespace can be created on Exadata; the command:

create bigfile tablespace soe datafile '+DATADG' size 10T 
autoextend on next 8g  
EXTENT MANAGEMENT LOCAL uniform size 1m  
SEGMENT SPACE MANAGEMENT AUTO ;

The statement ran for roughly one hour; in other words, in this environment creating a 10 TB bigfile takes about an hour, which I already find very fast:

00:43:20 sys@DBM>create bigfile tablespace soe datafile '+DATADG' size 10T 
autoextend on next 8g  
EXTENT MANAGEMENT LOCAL uniform size 1m  
SEGMENT SPACE MANAGEMENT AUTO ;
00:43:21   2  00:43:21   3  00:43:21   4  

Tablespace created.

01:45:24 sys@DBM>

.
First, the relationship between offload and Smart Scan: in many contexts offload and Smart Scan can be treated as the same thing.
It means moving processing from the database servers down to the storage servers.
This shifts CPU usage, but more importantly it greatly reduces the I/O returned to the database tier:
1. Less data transferred between the storage and the database servers.
2. Less CPU consumed on the database servers (decompression happens on the cells).
3. Less time spent reading disks at the storage tier.
.
In some cases, however, there are other offload operations that are not Smart Scan:
1. Smart file creation (data blocks are formatted on the storage cells).
2. RMAN incremental backups.
.
While creating the large tablespaces we see the wait event "cell smart file creation".
That is exactly what we hope to see: this I/O goes through smart file creation (offload is Exadata's signature weapon, and smart file creation is one of the non-Smart-Scan kinds of I/O offload).

01:17:51 sys@DBM>select inst_id,event,count(1) from gv$session where wait_class#<> 6 group by inst_id,event order by 1,3;

   INST_ID EVENT                                                              COUNT(1)
---------- ---------------------------------------------------------------- ----------
         1 cell smart file creation                                                  1

01:18:06 sys@DBM>/

   INST_ID EVENT                                                              COUNT(1)
---------- ---------------------------------------------------------------- ----------
         1 cell smart file creation                                                  1

01:18:21 sys@DBM>

Estimating the time for the bigfile creation (during the test, CPU usage on both the DB nodes and the cells stayed low):
6 GB per second
360 GB per minute
about 20 TB per hour (but if it can do ~20 TB per hour, why did my 10 TB tablespace take a full hour rather than about 30 minutes?)
My guess: because the disk group uses ASM NORMAL redundancy (Exadata has no RAID; it relies on ASM's own mirroring and I/O rebalancing),
every write goes out twice, so 10 TB of file space is roughly 20 TB of physical writes, which at ~20 TB/h is about one hour.
The charts:


[screenshots omitted]



Testing an Exadata X5 (HC) with Oracle 11.2's DBMS_RESOURCE_MANAGER.CALIBRATE_IO

The X3 flash cards measured earlier work out to roughly 2 million IOPS for a full rack (see 《Exadata上的IOPS和MBPS》).
Related post: measuring IOPS and MBPS on the X5-2 with Exadata's calibrate command.
The machine under test is an Exadata X5-2.
CPU model:

[root@dm01cel02 cellos]# grep -E '^model name|^cpu MHz' /proc/cpuinfo
cpu MHz         : 2401.000
model name      : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
[root@dm01cel02 cellos]# 

[root@dm01db01 ibdiagtools]# grep -E '^model name|^cpu MHz' /proc/cpuinfo
cpu MHz         : 2301.000
model name      : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
[root@dm01db01 ibdiagtools]# 

Server model:

[root@dm01db01 ibdiagtools]# /opt/oracle.SupportTools/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware matches supported profile for server=ORACLE_SERVER_X5-2
[root@dm01db01 ibdiagtools]# 

[root@dm01cel01 ~]# /opt/oracle.SupportTools/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware matches supported profile for server=ORACLE_SERVER_X5-2L
[root@dm01cel01 ~]# 

.
Oracle 11.2 provides DBMS_RESOURCE_MANAGER.CALIBRATE_IO, which can be used to test disk I/O.
A few things to be aware of:
1. CALIBRATE_IO cannot be run in parallel with another calibration, otherwise it raises an error:

declare
*
ERROR at line 1:
ORA-56705: I/O calibration already in progress
ORA-06512: at "SYS.DBMS_RMIN", line 463
ORA-06512: at "SYS.DBMS_RESOURCE_MANAGER", line 1302
ORA-06512: at line 6

2. As a benchmark, this procedure feels quite unsuitable to me (the numbers come out far too low... I do not know whether some parameter needs tuning or whether it simply should not be used on Exadata).
3. The latency argument cannot be set below 10 ms (an Oracle restriction, presumably with spinning disks in mind).
4. The monitoring below uses the mon tool from a former Oracle colleague (Kaiyao, a good friend of mine), built on the open-source collect RPM.
On the newer Exadata release the script seems to have some issues, so some of the charts were lost (for example the IOPS chart and the flash-card chart).
5. The test script:

set timing on serveroutput on;
declare
  v_max_iops BINARY_INTEGER;
  v_max_mbps BINARY_INTEGER;
  v_act_lat BINARY_INTEGER;
begin
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO(102,10 ,v_max_iops,v_max_mbps,v_act_lat);
  dbms_output.put_line('max iops : ' || v_max_iops );
  dbms_output.put_line('max mbps : ' || v_max_mbps );
  dbms_output.put_line('actual latency : ' || v_act_lat );
end;
/

Here 102 is the number of disks and 10 is the latency limit.
102 is used because this is a quarter rack: a default onecommand install leaves 102 ASM disks available, which can be confirmed in V$ASM_DISK (see the query sketch below).
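A minimal way to confirm that disk count (run as the grid user against the ASM instance; the heredoc is just one convenient form):

sqlplus -S "/ as sysasm" <<'EOF'
select count(*) from v$asm_disk;
EOF
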
The result of this run:

15:15:05 sys@DBM>set timing on serveroutput on;
15:15:21 sys@DBM>declare
15:15:21   2    v_max_iops BINARY_INTEGER;
15:15:21   3    v_max_mbps BINARY_INTEGER;
15:15:21   4    v_act_lat BINARY_INTEGER;
begin
15:15:21   5  15:15:21   6    DBMS_RESOURCE_MANAGER.CALIBRATE_IO(102,10 ,v_max_iops,v_max_mbps,v_act_lat);
15:15:21   7    dbms_output.put_line('max iops : ' || v_max_iops );
  dbms_output.put_line('max mbps : ' || v_max_mbps );
15:15:21   8  15:15:21   9    dbms_output.put_line('actual latency : ' || v_act_lat );
15:15:21  10  end;
15:15:21  11  /
max iops : 10592
max mbps : 3515
actual latency : 11

PL/SQL procedure successfully completed.

Elapsed: 00:06:05.44
15:21:27 sys@DBM>

The run reports a bit over 10,000 IOPS, roughly 3.5 GB/s of throughput, and an actual latency of 11 ms.
These figures are close to what the mon tool (written by Kaiyao) reports:


[screenshot omitted]

Here the disk throughput is roughly 3.5 GB/s, close to what mon reports.

[screenshot omitted]

The InfiniBand throughput is roughly 3.4 GB/s.

[screenshot omitted]

The logical-disk IOPS are around 10,000, close to the figures from calibrate on the cells.

[screenshot omitted]

This is a snapshot taken while the run was in progress.

16:35:13 sys@DBM>select to_char(START_TIME,'yyyy-mm-dd hh24:mi:ss') START_TIME,
                 to_char(END_TIME,'yyyy-mm-dd hh24:mi:ss') START_TIME,
16:35:13   2  16:35:13   3               MAX_IOPS, MAX_MBPS, MAX_PMBPS,LATENCY,NUM_PHYSICAL_DISKS
16:35:13   4  from dba_rsrc_io_calibrate;

START_TIME          START_TIME                  MAX_IOPS         MAX_MBPS        MAX_PMBPS          LATENCY NUM_PHYSICAL_DISKS
------------------- ------------------- ---------------- ---------------- ---------------- ---------------- ------------------
2015-04-06 15:15:22 2015-04-06 15:21:27            10592             3515              320               11                102

Elapsed: 00:00:00.00
16:35:14 sys@DBM>

Measuring IOPS and MBPS on the X5-2 with Exadata's calibrate command

The X3 flash cards measured earlier work out to roughly 2 million IOPS for a full rack (see 《Exadata上的IOPS和MBPS》).
Related post: testing an Exadata X5 (HC) with Oracle 11.2's DBMS_RESOURCE_MANAGER.CALIBRATE_IO.

Things to note when testing with Exadata's calibrate command:
1. Do not run it during business peak hours, or the application may hang.
2. Normally cellsrv is shut down for the test; otherwise the -force option has to be used.
3. On older releases there is another possible cause when the command fails: it needs to generate some temporary LUN configuration files and format them itself.
If it fails, those temporary LUN files can be deleted by hand.


[screenshot omitted]

The current release has not shown this problem, because Oracle has changed the script: every generated file now goes into a date-stamped directory (the date the test started),
so it no longer fails with errors about not being able to format the LUNs, as shown here:

[screenshot omitted]


4. Under the hood, Exadata's calibrate uses the orion approach: it takes its own predefined LUN combinations and hammers them for several rounds
(different LUN groups, many passes, all automatic, no manual intervention needed).
5. calibrate runs read-only (to keep the data safe).
6. Two calibrate runs cannot run concurrently on the same cell, or it errors out.
.
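A minimal sketch of running it on one cell with cellsrv stopped (the dcli one-liner with force across all cells is shown further below); run as root or celladmin on the cell:

# on one storage cell
cellcli -e alter cell shutdown services cellsrv    # stop cellsrv first (or keep it up and add FORCE)
cellcli -e calibrate
cellcli -e alter cell startup services cellsrv     # bring cellsrv back afterwards
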
The machine under test is an Exadata X5-2.
CPU model:

[root@dm01cel02 cellos]# grep -E '^model name|^cpu MHz' /proc/cpuinfo
cpu MHz         : 2401.000
model name      : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
[root@dm01cel02 cellos]# 

[root@dm01db01 ibdiagtools]# grep -E '^model name|^cpu MHz' /proc/cpuinfo
cpu MHz         : 2301.000
model name      : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
[root@dm01db01 ibdiagtools]# 

Server model:

[root@dm01db01 ibdiagtools]# /opt/oracle.SupportTools/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware matches supported profile for server=ORACLE_SERVER_X5-2
[root@dm01db01 ibdiagtools]# 

[root@dm01cel01 ~]# /opt/oracle.SupportTools/CheckHWnFWProfile -c loose
[SUCCESS] The hardware and firmware matches supported profile for server=ORACLE_SERVER_X5-2L
[root@dm01cel01 ~]# 

Estimating the flash-card IOPS:

[root@dm01db01 lunar]# cat a.txt |grep FLASH
dm01cel01: LUN 1_1  on drive [FLASH_1_1] random read throughput: 3,227.00 MBPS, and 386910 IOPS
dm01cel01: LUN 2_1  on drive [FLASH_2_1] random read throughput: 3,226.00 MBPS, and 357090 IOPS
dm01cel01: LUN 4_1  on drive [FLASH_4_1] random read throughput: 3,227.00 MBPS, and 372007 IOPS
dm01cel01: LUN 5_1  on drive [FLASH_5_1] random read throughput: 3,226.00 MBPS, and 399469 IOPS
dm01cel02: LUN 1_1  on drive [FLASH_1_1] random read throughput: 3,226.00 MBPS, and 366062 IOPS
dm01cel02: LUN 2_1  on drive [FLASH_2_1] random read throughput: 3,226.00 MBPS, and 401361 IOPS
dm01cel02: LUN 4_1  on drive [FLASH_4_1] random read throughput: 3,226.00 MBPS, and 403004 IOPS
dm01cel02: LUN 5_1  on drive [FLASH_5_1] random read throughput: 3,226.00 MBPS, and 408830 IOPS
dm01cel03: LUN 1_1  on drive [FLASH_1_1] random read throughput: 3,226.00 MBPS, and 340622 IOPS
dm01cel03: LUN 2_1  on drive [FLASH_2_1] random read throughput: 3,226.00 MBPS, and 339174 IOPS
dm01cel03: LUN 4_1  on drive [FLASH_4_1] random read throughput: 3,227.00 MBPS, and 405781 IOPS
dm01cel03: LUN 5_1  on drive [FLASH_5_1] random read throughput: 3,226.00 MBPS, and 408817 IOPS
[root@dm01db01 lunar]#                                     

So, in theory, the flash LUN IOPS of this X5-2 quarter rack add up to more than 4.5 million.
.
By comparison, the X3 flash cards measured earlier work out to roughly 2 million IOPS for a full rack (see 《Exadata上的IOPS和MBPS》):
Full rack (14 cells), hard disk LUN MBPS estimate: 1962 * 14 cells = 27468
Full rack (14 cells), flash disk LUN MBPS estimate: 4184 * 14 cells = 58576
Full rack (14 cells), hard disk LUN IOPS estimate: 4832 * 14 cells = 67648
Full rack (14 cells), flash disk LUN IOPS estimate: 146413 * 14 cells = 2049782
.
The charts below come from the mon tool written by an Oracle colleague (Kaiyao), built on the open-source collect; on the current, newer release the script still seems to have some issues that I have not debugged yet,
so only the disk throughput is shown here, not the flash.
Disk throughput peaks at roughly 3.5 GB/s, with average CPU utilization below 17%.


[screenshot omitted]

This is a snapshot taken during the run; with this LUN group and these parameters the MBPS is close to its peak:

[screenshot omitted]

This chart is the orion run for another LUN group; each node uses six orion processes per test pass, and this LUN group clearly delivers lower MBPS:

[screenshot omitted]


The test itself:
Duration: roughly 7 minutes.
Method: the calibrate command that ships with Exadata:

[root@dm01db01 monitor]# dcli -g /opt/oracle.SupportTools/onecommand/cell_group -l root "cellcli -e calibrate force"
dm01cel01: Calibration will take a few minutes...
dm01cel01: Aggregate random read throughput across all hard disk LUNs: 1220 MBPS
dm01cel01: Aggregate random read throughput across all flash disk LUNs: 12568 MBPS
dm01cel01: Aggregate random read IOs per second (IOPS) across all hard disk LUNs: 2889
dm01cel01: Calibrating hard disks (read only) ...
dm01cel01: LUN 0_0  on drive [8:0      ] random read throughput: 147.00 MBPS, and 256 IOPS
dm01cel01: LUN 0_1  on drive [8:1      ] random read throughput: 142.00 MBPS, and 255 IOPS
dm01cel01: LUN 0_10 on drive [8:10     ] random read throughput: 148.00 MBPS, and 244 IOPS
dm01cel01: LUN 0_11 on drive [8:11     ] random read throughput: 145.00 MBPS, and 255 IOPS
dm01cel01: LUN 0_2  on drive [8:2      ] random read throughput: 145.00 MBPS, and 255 IOPS
dm01cel01: LUN 0_3  on drive [8:3      ] random read throughput: 146.00 MBPS, and 238 IOPS
dm01cel01: LUN 0_4  on drive [8:4      ] random read throughput: 146.00 MBPS, and 252 IOPS
dm01cel01: LUN 0_5  on drive [8:5      ] random read throughput: 146.00 MBPS, and 251 IOPS
dm01cel01: LUN 0_6  on drive [8:6      ] random read throughput: 143.00 MBPS, and 255 IOPS
dm01cel01: LUN 0_7  on drive [8:7      ] random read throughput: 150.00 MBPS, and 251 IOPS
dm01cel01: LUN 0_8  on drive [8:8      ] random read throughput: 151.00 MBPS, and 256 IOPS
dm01cel01: LUN 0_9  on drive [8:9      ] random read throughput: 145.00 MBPS, and 255 IOPS
dm01cel01: Calibrating flash disks (read only, note that writes will be significantly slower) ...
dm01cel01: LUN 1_1  on drive [FLASH_1_1] random read throughput: 3,227.00 MBPS, and 386910 IOPS
dm01cel01: LUN 2_1  on drive [FLASH_2_1] random read throughput: 3,226.00 MBPS, and 357090 IOPS
dm01cel01: LUN 4_1  on drive [FLASH_4_1] random read throughput: 3,227.00 MBPS, and 372007 IOPS
dm01cel01: LUN 5_1  on drive [FLASH_5_1] random read throughput: 3,226.00 MBPS, and 399469 IOPS
dm01cel01: CALIBRATE results are within an acceptable range.
dm01cel01: Calibration has finished.
dm01cel02: Calibration will take a few minutes...
dm01cel02: Aggregate random read throughput across all hard disk LUNs: 1195 MBPS
dm01cel02: Aggregate random read throughput across all flash disk LUNs: 12827 MBPS
dm01cel02: Aggregate random read IOs per second (IOPS) across all hard disk LUNs: 2990
dm01cel02: Calibrating hard disks (read only) ...
dm01cel02: LUN 0_0  on drive [8:0      ] random read throughput: 149.00 MBPS, and 254 IOPS
dm01cel02: LUN 0_1  on drive [8:1      ] random read throughput: 138.00 MBPS, and 253 IOPS
dm01cel02: LUN 0_10 on drive [8:10     ] random read throughput: 148.00 MBPS, and 255 IOPS
dm01cel02: LUN 0_11 on drive [8:11     ] random read throughput: 148.00 MBPS, and 248 IOPS
dm01cel02: LUN 0_2  on drive [8:2      ] random read throughput: 149.00 MBPS, and 250 IOPS
dm01cel02: LUN 0_3  on drive [8:3      ] random read throughput: 150.00 MBPS, and 252 IOPS
dm01cel02: LUN 0_4  on drive [8:4      ] random read throughput: 143.00 MBPS, and 252 IOPS
dm01cel02: LUN 0_5  on drive [8:5      ] random read throughput: 148.00 MBPS, and 241 IOPS
dm01cel02: LUN 0_6  on drive [8:6      ] random read throughput: 148.00 MBPS, and 255 IOPS
dm01cel02: LUN 0_7  on drive [8:7      ] random read throughput: 152.00 MBPS, and 256 IOPS
dm01cel02: LUN 0_8  on drive [8:8      ] random read throughput: 150.00 MBPS, and 255 IOPS
dm01cel02: LUN 0_9  on drive [8:9      ] random read throughput: 150.00 MBPS, and 242 IOPS
dm01cel02: Calibrating flash disks (read only, note that writes will be significantly slower) ...
dm01cel02: LUN 1_1  on drive [FLASH_1_1] random read throughput: 3,226.00 MBPS, and 366062 IOPS
dm01cel02: LUN 2_1  on drive [FLASH_2_1] random read throughput: 3,226.00 MBPS, and 401361 IOPS
dm01cel02: LUN 4_1  on drive [FLASH_4_1] random read throughput: 3,226.00 MBPS, and 403004 IOPS
dm01cel02: LUN 5_1  on drive [FLASH_5_1] random read throughput: 3,226.00 MBPS, and 408830 IOPS
dm01cel02: CALIBRATE results are within an acceptable range.
dm01cel02: Calibration has finished.
dm01cel03: Calibration will take a few minutes...
dm01cel03: Aggregate random read throughput across all hard disk LUNs: 1233 MBPS
dm01cel03: Aggregate random read throughput across all flash disk LUNs: 11763 MBPS
dm01cel03: Aggregate random read IOs per second (IOPS) across all hard disk LUNs: 3028
dm01cel03: Calibrating hard disks (read only) ...
dm01cel03: LUN 0_0  on drive [8:0      ] random read throughput: 148.00 MBPS, and 253 IOPS
dm01cel03: LUN 0_1  on drive [8:1      ] random read throughput: 152.00 MBPS, and 255 IOPS
dm01cel03: LUN 0_10 on drive [8:10     ] random read throughput: 148.00 MBPS, and 251 IOPS
dm01cel03: LUN 0_11 on drive [8:11     ] random read throughput: 154.00 MBPS, and 253 IOPS
dm01cel03: LUN 0_2  on drive [8:2      ] random read throughput: 150.00 MBPS, and 255 IOPS
dm01cel03: LUN 0_3  on drive [8:3      ] random read throughput: 152.00 MBPS, and 256 IOPS
dm01cel03: LUN 0_4  on drive [8:4      ] random read throughput: 148.00 MBPS, and 253 IOPS
dm01cel03: LUN 0_5  on drive [8:5      ] random read throughput: 147.00 MBPS, and 255 IOPS
dm01cel03: LUN 0_6  on drive [8:6      ] random read throughput: 144.00 MBPS, and 252 IOPS
dm01cel03: LUN 0_7  on drive [8:7      ] random read throughput: 153.00 MBPS, and 252 IOPS
dm01cel03: LUN 0_8  on drive [8:8      ] random read throughput: 143.00 MBPS, and 256 IOPS
dm01cel03: LUN 0_9  on drive [8:9      ] random read throughput: 142.00 MBPS, and 250 IOPS
dm01cel03: Calibrating flash disks (read only, note that writes will be significantly slower) ...
dm01cel03: LUN 1_1  on drive [FLASH_1_1] random read throughput: 3,226.00 MBPS, and 340622 IOPS
dm01cel03: LUN 2_1  on drive [FLASH_2_1] random read throughput: 3,226.00 MBPS, and 339174 IOPS
dm01cel03: LUN 4_1  on drive [FLASH_4_1] random read throughput: 3,227.00 MBPS, and 405781 IOPS
dm01cel03: LUN 5_1  on drive [FLASH_5_1] random read throughput: 3,226.00 MBPS, and 408817 IOPS
dm01cel03: CALIBRATE results are within an acceptable range.
dm01cel03: Calibration has finished.
[root@dm01db01 monitor]# 

CPU-related settings on Exadata (already quite good, except for irqbalance)

In some cases we see degraded I/O on Exadata even though neither the disks nor the flash cards are broken.
It is then worth checking whether the APIC timer is working properly (an APIC timer that is not correctly enabled can make the throughput reported by iostat look degraded).
When this happens, compare against bi/bo in vmstat (average read and write throughput), because vmstat is not affected by the APIC timer.
.
The APIC timer (APIC = Advanced Programmable Interrupt Controller) is a clock-interrupt device.
CPUs communicate with each other by sending interrupts; the APIC is programmed so that the local APIC generates periodic interrupts, which are then injected into the vCPU.
Every CPU made since about 2000 supports it; it is the core of Intel's multiprocessing specification.
The APIC timer has three modes:
periodic
one-shot
TSC-deadline (supported on the newest CPUs).
.
On older Exadata releases, check dmesg | grep "Using local APIC timer interrupts" to confirm the APIC timer is enabled correctly.
On newer Exadata releases, check dmesg | grep "Using IOAPIC for interrupt routing".
.
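Across a whole rack the check can be run in one shot with dcli (the group file name here is hypothetical; any file listing the db and cell hosts will do):

# as root on the first db node
dcli -g /root/all_group -l root 'dmesg | grep "Using IOAPIC for interrupt routing"'
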
For example, the output below shows that it is correctly enabled on this quarter-rack Exadata X5-2:

dm01db01: ACPI: Using IOAPIC for interrupt routing
dm01db02: ACPI: Using IOAPIC for interrupt routing
dm01cel01: ACPI: Using IOAPIC for interrupt routing
dm01cel02: ACPI: Using IOAPIC for interrupt routing
dm01cel03: ACPI: Using IOAPIC for interrupt routing

Many other factors also affect CPU performance, for example whether dynamic frequency scaling is enabled (which in practice mostly means scaling down, e.g. a power-saving mode).
cpufreq-info shows the current CPU frequency and whether dynamic scaling is in use:

cpufreq-info|grep "current CPU frequency"
cpufreq-info|grep "driver"
cpufreq-info|grep "The governor"
cpufreq-info|grep "cpufreq governors"

For example, the output below:

dm01db01: current CPU frequency is 2.30 GHz (asserted by call to hardware).  
dm01db02: current CPU frequency is 2.30 GHz (asserted by call to hardware).

Is this the CPU's maximum clock speed?
On Exadata there are plenty of commands for checking the CPU and server model (a whole pile of them do roughly the same thing); here I use CheckHWnFWProfile -s:

[root@dm01db01 ~]# /opt/oracle.SupportTools/CheckHWnFWProfile -s|grep E5-2699
      <Processor SIGNATURE="Intel(R) Xeon(R) E5-2699 v3 2.30GHz">
        <ProcessorModel VALUE="Intel(R) Xeon(R) E5-2699 v3 2.30GHz"/>
[root@dm01db01 ~]# 

The same information is available in /proc/cpuinfo:

[root@dm01db01 ibdiagtools]# grep -E '^model name|^cpu MHz' /proc/cpuinfo
cpu MHz         : 2301.000
model name      : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
........
[root@dm01cel02 cellos]# grep -E '^model name|^cpu MHz' /proc/cpuinfo
cpu MHz         : 2401.000
model name      : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
........

So on this Exadata X5-2:
the DB node CPUs are Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz,
the cell node CPUs are Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz.
Going back to the result above, the CPUs are already running at their maximum rated frequency.
.
Next, the output of cpufreq-info | grep "driver":

dm01db01: driver: acpi-cpufreq  
dm01db02: driver: acpi-cpufreq

This shows that ACPI (the acpi-cpufreq driver) is correctly in use.
Now the output of cpufreq-info | grep "The governor":

dm01db01: The governor "performance" may decide which speed to use  
dm01db02: The governor "performance" may decide which speed to use

This means the CPUs always run at the highest frequency.
There are three governor policies:
ondemand (the system scales the frequency dynamically to save power; the exact policy depends on the kernel's power-management algorithm)
userspace (the user sets the CPU frequency manually)
performance (the CPU always runs at the highest frequency)

Finally, the output of cpufreq-info | grep "cpufreq governors":

dm01db01: available cpufreq governors: userspace, performance  
dm01db02: available cpufreq governors: userspace, performance  

这里表示CPU总是在最高主频下工作,用户可以自己设定cpu的频率。
The BIOS also shows whether CPU power saving is enabled (it is by default); this is worth watching on any server that uses SSDs.
I will go through the CPU power-saving settings for SSD servers another time, when I have a machine at hand to demonstrate on, haha.
One more thing to keep an eye on is irqbalance (a Linux service, not Oracle-specific). It has bitten many users on older Exadata releases, so we normally turn it off: although it claims to rebalance CPU interrupts and improve performance, in practice it brings too many stability problems, so my advice is to disable it. On Exadata 12.1.2.1.0 it is enabled by default on the db nodes and disabled by default on the cell nodes.
For the upcoming stress test on this Exadata I disabled it on one db node and left the other one at the default (enabled); we will see whether Oracle has improved it in the new release.
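
Checking and disabling irqbalance is straightforward on the Oracle Linux releases these images run; a sketch:

# is irqbalance running, and is it set to start at boot?
service irqbalance status
chkconfig --list irqbalance
# stop it now and keep it off across reboots
service irqbalance stop
chkconfig irqbalance off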


Troubleshooting a database performance problem (11g automatic maintenance tasks, log file sync)

The AWR report for the 30 minutes before the problem started shows an extremely high system load:


[screenshot 1]


Query the database's current wait events:

SQL> set pages 999
SQL> set linesize 1000
SQL> SELECT inst_id,event,count(*) FROM gv$session WHERE wait_class#<>6 group by inst_id,event order by 1,3;

   INST_ID EVENT                                                              COUNT(*)
---------- ---------------------------------------------------------------- ----------
         1 DBWR slave I/O                                                            1
         1 enq: TX - row lock contention                                             1
         1 db file parallel write                                                    1
         1 asynch descriptor resize                                                  1
         1 read by other session                                                     1
         1 log file sync                                                             3
         1 direct path write temp                                                    3
         1 db file sequential read                                                   6
         1 enq: TX - index contention                                              193
         1 resmgr:cpu quantum                                                      226
         1 buffer busy waits                                                       822

11 rows selected.

SQL> 

This matches what the AWR shows:


[screenshot 2]


My gut feeling was that resmgr:cpu quantum triggered the whole chain of problems; this is not the first time I have run into it.
The resmgr:cpu quantum wait comes from the Oracle Resource Manager setup introduced in 11.2; it tends to cause a lot of problems, so most of the time we disable it straight away when installing a system.
The nastiest part of the 11.2 resource-manager setup is the weekend maintenance window; I have been bitten by it several times.
How to disable it:
1. Disable the maintenance jobs that are not needed:

exec dbms_scheduler.disable( 'ORACLE_OCM.MGMT_CONFIG_JOB' );
exec dbms_scheduler.disable( 'ORACLE_OCM.MGMT_STATS_CONFIG_JOB' );

Disable the database's auto space advisor, to keep it from consuming too much I/O:

BEGIN
DBMS_AUTO_TASK_ADMIN.DISABLE(
client_name => 'auto space advisor',
operation => NULL,
window_name => NULL);
END;
/

Disable the database's automatic SQL tuning advisor:

BEGIN
DBMS_AUTO_TASK_ADMIN.DISABLE(
client_name => 'sql tuning advisor',
operation => NULL,
window_name => NULL);
END;
/

Clear the DEFAULT_MAINTENANCE_PLAN resource plan from the maintenance windows:

alter system set resource_manager_plan='' scope=both;
execute dbms_scheduler.set_attribute('WEEKNIGHT_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('WEEKEND_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('MONDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('TUESDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('WEDNESDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('THURSDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('FRIDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('SATURDAY_WINDOW','RESOURCE_PLAN','');
execute dbms_scheduler.set_attribute('SUNDAY_WINDOW','RESOURCE_PLAN','');
Commit;

Check the currently active resource plan to confirm it is now off:

SQL> select * from v$rsrc_plan;

        ID NAME                             IS_TO CPU INS PARALLEL_SERVERS_ACTIVE PARALLEL_SERVERS_TOTAL
---------- -------------------------------- ----- --- --- ----------------------- ----------------------
PARALLEL_EXECUTION_MANAGED
--------------------------------
     12540 INTERNAL_PLAN                    TRUE  OFF OFF                       0                     64
FIFO

SQL> 
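
As a further cross-check, the state of the automatic maintenance clients can be queried directly; a small sketch run as SYSDBA (dba_autotask_client is the 11g view behind DBMS_AUTO_TASK_ADMIN):

sqlplus -s / as sysdba <<'EOF'
set pages 100 lines 120
col client_name for a40
select client_name, status from dba_autotask_client;
EOF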

At this point the AWR shows that the system load has dropped a little, but it is still very high:


[screenshot 3]


After the changes above, the database's dominant wait event has become "log file sync":


[screenshot 4]


The corresponding foreground wait events:


[screenshot 5]


The background wait events:


[screenshot 6]


The LGWR trace may contain messages like the following:

kcrfw_update_adaptive_sync_mode: poll->post current_sched_delay=0 switch_sched_delay=154 current_sync_count_delta=887 switch_sync_count_delta=837

This is a new 11.2 feature, Adaptive Switching Between Log Write Methods (LGWR automatically switches between post/wait and polling). The feature has a pile of bugs and frequently leads to slow commits and the resulting "log file sync" waits.
Disable this LGWR write-mode switching by setting the following hidden parameter:

ALTER SYSTEM SET "_use_adaptive_log_file_sync"=FALSE;
ALTER SYSTEM SET "_use_adaptive_log_file_sync"=FALSE scope=spfile sid='*';
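
The value actually in effect can be confirmed with the usual hidden-parameter lookup; a sketch, run as SYSDBA:

sqlplus -s / as sysdba <<'EOF'
select a.ksppinm name, b.ksppstvl value
  from x$ksppi a, x$ksppcv b
 where a.indx = b.indx
   and a.ksppinm = '_use_adaptive_log_file_sync';
EOF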

After the change, the wait events went back to the application's normal day-to-day profile:

   INST_ID EVENT                                                              COUNT(*)
---------- ---------------------------------------------------------------- ----------
         1 db file parallel write                                                    1
         1 direct path write temp                                                    1
         1 DBWR slave I/O                                                            1
         1 SQL*Net message to client                                                 2
         1 direct path read temp                                                     3
         1 read by other session                                                    18
         1 db file sequential read                                                  70
         1 enq: TX - index contention                                              118

8 rows selected.

Elapsed: 00:00:00.01
SQL> /

   INST_ID EVENT                                                              COUNT(*)
---------- ---------------------------------------------------------------- ----------
         1 db file parallel write                                                    1
         1 direct path write temp                                                    1
         1 buffer busy waits                                                         1
         1 SQL*Net message to client                                                 2
         1 direct path read temp                                                     3
         1 DBWR slave I/O                                                            4
         1 log file sync                                                             9
         1 read by other session                                                    12
         1 enq: TX - index contention                                               35
         1 db file sequential read                                                  67
SQL>

The AWR now shows that the system load has come down:


[screenshot 8]


During normal operation the application's database connection counts look like this:

   INST_ID MACHINE                     COUNT(*)
---------- ------------------------- ----------
         1 800BEST\CLIENTLOG01                1
         1 800BEST\LUNARSET01                 43
         1 800BEST\LUNARSET102                76
         1 800BEST\TERMINAL                   2
         1 800BEST\WX01                      57
         1 IIS APPPOOL\LUNARSET01              5
         1 IIS APPPOOL\LUNARSET02            666
         1 IIS APPPOOL\LUNARSETBAK01          64
         1 LUNARDB                  1

9 rows selected.

SQL> /

   INST_ID MACHINE                     COUNT(*)
---------- ------------------------- ----------
         1 800BEST\CLIENTLOG01                1
         1 800BEST\LUNARSET01                 43
         1 800BEST\LUNARSET102                76
         1 800BEST\TERMINAL                   3
         1 800BEST\WX01                      56
         1 IIS APPPOOL\LUNARSET01              5
         1 IIS APPPOOL\LUNARSET02            450
         1 IIS APPPOOL\LUNARSETBAK01          64
         1 LUNARDB                  1

The incident is relieved for now and the application reports that everything is back to normal, but the information above tells us there is still a lot to tune for this application:
1. A full review of the system parameters
2. SQL tuning (how the SQL is written, indexes, statistics checks, etc.)
3. System I/O (the OS I/O configuration, the database filesystemio_options setting, etc.)
4. A check of the control files (I still suspect something is wrong there)
5. enq: TX - index contention
However, my schedule is already booked until the end of May, so if this is not too urgent the application team will do their own tuning first and I will get involved in June if still needed, haha.


Tuning NIC parameters on Exadata (mainly the IB MTU, which differs from the usual value)

Comparing an ordinary server with Exadata, the main difference is the MTU configuration, which is tied to the InfiniBand-based network environment on Exadata (InfiniBand cards, IB switches, and so on):

1. The Ethernet interfaces (eth0) on the db and cell nodes both use the default MTU: 1500
.
2. The IB interfaces on the cell nodes use an MTU of 2044:
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1    --------------IB MTU on the cell nodes: 2044
.
3. The IB interfaces on the db nodes use an MTU of 7000:
          UP BROADCAST RUNNING MULTICAST  MTU:7000  Metric:1    -------------  IB MTU on the db nodes: 7000
.          

MTU stands for Maximum Transmission Unit: the largest packet that can be transmitted on the network.
The MTU is a property of the communication interface (network card, serial port, etc.). A larger MTU is not automatically better: the larger the MTU, the higher the latency of transmitting a single packet, and the higher the probability of a bit error within it, so the value has to be optimised for the particular network.
The MTU is measured in bytes. Generally, if the local MTU is larger than the gateway's MTU, large packets get fragmented for transmission, which produces many fragments, increases packet loss, and slows the network down.
Setting the local MTU equal to or smaller than the gateway's MTU reduces packet loss.
An ordinary machine defaults to an MTU of 1500, which comes from the design of the Ethernet frame.
Traditionally Ethernet splits data into fixed-size frames for transmission and reception, and the specification caps the frame size at 1,518 bytes.
With the evolution of network gear, today's 10GbE (and faster) equipment supports jumbo frames; the maximum frame size varies by device, mostly somewhere around 9,000 to 16,000 bytes.
.
.
Changing the MTU is simple (even though plenty of people have caused problems in RAC environments by changing it incorrectly):
ifconfig eth0 mtu xxxx (the value you need), for example: ifconfig eth0 mtu 9000
Afterwards verify it with netstat -i or ifconfig | grep MTU.
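
The ifconfig change is lost at reboot; on RHEL/OL it can also be made persistent in the interface configuration file. A sketch, assuming eth0 and a switch/gateway that actually supports jumbo frames:

# runtime change plus a persistent setting in the ifcfg file
ifconfig eth0 mtu 9000
grep -q '^MTU=' /etc/sysconfig/network-scripts/ifcfg-eth0 \
  || echo 'MTU=9000' >> /etc/sysconfig/network-scripts/ifcfg-eth0
netstat -i | grep eth0      # confirm the active MTU
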
Oracle does support an MTU larger than 1500 on the private network (interconnect). As discussed above, the right value depends on the environment; bigger is not better, and without a well-thought-out design it is usually best left unchanged.
.
.
Multicast, on the other hand, is something RAC requires to be enabled, as Oracle's best-practice documentation states explicitly:

For Grid Infrastructure 11.2.0.2, the private network must support multicast on the 230.0.1.0 subnet.
Patch 9974223 for Oracle Grid Infrastructure 11.2.0.2
(included in GI PSU 11.2.0.2.1 and later) enables multicast on the 224.0.0.251 multicast address on the private network.
Multicast must work on the private network for at least one of these two addresses (assuming the patch has been applied).
Document 1212703.1 provides further information and a procedure for testing multicast.

Oracle also provides a detailed way to test multicast:

# perl mcasttest.pl -n <node1>,<node2>,<node_n...> -i <interface1>,<interface2>,<interface_n...>
.
For details see:
Grid Infrastructure Startup During Patching, Install or Upgrade May Fail Due to Multicasting Requirement (Doc ID 1212703.1)
.

Output like the following indicates that the multicast test failed:

Test for Multicast address 230.0.1.0
Nov 8 09:05:33 | Multicast Failed for eth1 using address 230.0.1.0:42000
Nov 8 09:05:34 | Multicast Failed for eth2 using address 230.0.1.0:42001
Test for Multicast address 224.0.0.251
Nov 8 09:05:35 | Multicast Succeeded for eth1 using address 224.0.0.251:42002
Nov 8 09:05:36 | Multicast Succeeded for eth2 using address 224.0.0.251:42003

.
A few things to watch for regarding the MTU in ordinary (non-Exadata) environments:
1. Normally just keep the default of 1500; anything larger requires special configuration, see:
CRS root.sh Script Failing on Second Node When MTU Larger than 1500 (Doc ID 1085885.1)
.
2. If the MTU setting differs between nodes, the instances may fail to start, see:
Instances Unable To Start If MTU Size Is Different for Cluster_interconnect (Doc ID 300388.1)
.
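
A quick way to verify note 2, i.e. that all cluster nodes agree on the interconnect MTU (node names and interface name here are hypothetical):

# compare the private-interconnect MTU across nodes before starting the instances
for h in rac1 rac2; do
    echo -n "$h: "
    ssh "$h" "ip link show eth1 | grep -o 'mtu [0-9]*'"
done
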
Below is the full ifconfig output from a db node, with annotations:

[root@dm01db01 ~# ifconfig
bondeth0  Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C1    --------encap:Ethernet is the link type (Ethernet); HWaddr is the hardware (MAC) address
          inet addr:10.45.11.50  Bcast:10.45.11.255  Mask:255.255.252.0  ---inet addr is the bound IP address, Bcast its broadcast address, Mask the subnet mask
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1   ---UP, BROADCAST, MULTICAST, RUNNING are the interface flags; MTU is the maximum transmission unit, Ethernet default 1500
          RX packets:7610626 errors:0 dropped:1950540 overruns:0 frame:0   ---RX packets is the number of packets received; errors/dropped are the error and dropped packet counts
          TX packets:3719744 errors:0 dropped:0 overruns:0 carrier:0  ---TX packets is the number of packets transmitted; errors/dropped are the error and dropped packet counts
          collisions:0 txqueuelen:0            ----------txqueuelen is the transmit queue length limit; 0 means unlimited (a plain Ethernet NIC uses 1000)
          RX bytes:544097236 (518.8 MiB)  TX bytes:230272474 (219.6 MiB)   ----RX bytes / TX bytes are the bytes received / transmitted

bondeth0:1 Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C1  
          inet addr:10.45.11.51  Bcast:10.45.11.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

bondeth0:2 Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C1  
          inet addr:10.45.11.54  Bcast:10.45.11.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

eth0      Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C0  
          inet addr:10.45.1.191  Bcast:10.45.1.255  Mask:255.255.255.0
          inet6 addr: fd0b:dddd:b7d:0:210:e0ff:fe68:b7c0/64 Scope:Global
          inet6 addr: fe80::210:e0ff:fe68:b7c0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:215294 errors:0 dropped:0 overruns:0 frame:0
          TX packets:103056 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:39010029 (37.2 MiB)  TX bytes:15403532 (14.6 MiB)

eth1      Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C1  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:5660114 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3719744 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:395715879 (377.3 MiB)  TX bytes:230272474 (219.6 MiB)

eth2      Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C1  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:1950512 errors:0 dropped:1950510 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:148381357 (141.5 MiB)  TX bytes:0 (0.0 b)

eth3      Link encap:Ethernet  HWaddr 00:10:E0:68:B7:C3  
          inet addr:192.168.18.11  Bcast:192.168.19.255  Mask:255.255.254.0
          inet6 addr: fe80::210:e0ff:fe68:b7c3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:81 errors:0 dropped:0 overruns:0 frame:0
          TX packets:221 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:12056 (11.7 KiB)  TX bytes:31357 (30.6 KiB)

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib0       Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00   -----encap:InfiniBand is the link type (InfiniBand); HWaddr is the hardware (MAC) address
          inet addr:192.168.10.1  Bcast:192.168.11.255  Mask:255.255.252.0  ---inet addr is the bound IP address, Bcast its broadcast address, Mask the subnet mask
          inet6 addr: fe80::210:e000:15a:e611/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:7000  Metric:1    ---UP, BROADCAST, MULTICAST, RUNNING are the interface flags; MTU is the maximum transmission unit, 7000 for IB on the db nodes
          RX packets:843743 errors:0 dropped:0 overruns:0 frame:0  ---RX packets is the number of packets received; errors/dropped are the error and dropped packet counts
          TX packets:113542 errors:0 dropped:0 overruns:0 carrier:0  ---TX packets is the number of packets transmitted; errors/dropped are the error and dropped packet counts
          collisions:0 txqueuelen:1024             ----------txqueuelen is the transmit queue length limit; IB interfaces use 1024
          RX bytes:188246789 (179.5 MiB)  TX bytes:7920860 (7.5 MiB)   ----RX bytes / TX bytes are the bytes received / transmitted

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib0:1     Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:169.254.102.112  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:7000  Metric:1

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib1       Link encap:InfiniBand  HWaddr 80:00:00:49:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:192.168.10.2  Bcast:192.168.11.255  Mask:255.255.252.0
          inet6 addr: fe80::210:e000:15a:e612/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:7000  Metric:1
          RX packets:1141936 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1965335 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1024 
          RX bytes:894917258 (853.4 MiB)  TX bytes:878056699 (837.3 MiB)

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib1:1     Link encap:InfiniBand  HWaddr 80:00:00:49:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:169.254.237.6  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:7000  Metric:1

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:2204010 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2204010 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:191767284 (182.8 MiB)  TX bytes:191767284 (182.8 MiB)

[root@dm01db01 ~# 

Fixing an Oracle Restart (standalone) failure caused by a hostname change (ORA-29701 raised in ASM I/O path)

Not long ago a colleague of mine changed the hostname of a standalone (Oracle Restart) server, but only at the OS level; the HAS configuration was not touched. The database miraculously still opened (in this situation it sometimes cannot open at all and simply errors out), but after the hostname change both the ASM and the database alert logs kept reporting errors, and after analysing them I concluded that if the errors were not fixed they would eventually crash the database.
The procedure to fix this is simple; a few notes first:
1. Up to and including 8i the product was called OPS (Oracle Parallel Server), from 9.0.1 to 11.1 it was RAC (Real Application Clusters), and from 11.2 onwards the clusterware is GI (Grid Infrastructure).
2. The standalone (Oracle Restart) configuration is a single-instance database using ASM; it only needs the cssd layer for db-to-ASM communication, not the full CRS stack.
3. Starting, stopping and managing HAS is similar to RAC, but do not mix the two up: standalone uses crsctl start/stop has while RAC uses crsctl start/stop crs (in both cases every process in the stack is spawned by ohasd). Getting this wrong can lead to odd behaviour; in another case we saw ASM error out directly because of it, and I will write that one up later.
4. Whether it is standalone or RAC, changing the hostname or IP needs extra care: unlike a plain single-instance database, changing it at the OS level is not enough, the change also has to be reconfigured in HAS or CRS respectively.
5. Whatever happens, as long as ASM and the database themselves have not been touched, there is no need to worry about data loss: the HAS/CRS layer is independent of the database, and in the worst case you reinstall it and bring ASM and the db back up.
6. Before changing IPs or hostnames, or upgrading the CRS/HAS or database software, always back up the existing environment, either with a tool such as NBU or simply with tar (see the sketch below). Never rely on ftp or a plain copy; those are not dependable and cannot be used to restore files.
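
A minimal tar backup of the kind meant in note 6 might look like this (the /backup location and the oraInventory path are assumptions; the Grid home path is the one shown later in this post):

# back up the Grid home, the central inventory and the local registry/config files
tar -czpf /backup/grid_home_$(date +%Y%m%d).tar.gz -C /u01/app/grid/product/11.2.0 grid
tar -czpf /backup/oraInventory_$(date +%Y%m%d).tar.gz -C /u01/app oraInventory
tar -czpf /backup/etc_oracle_$(date +%Y%m%d).tar.gz /etc/oracle /etc/oratab /etc/inittab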

The actual process was as follows:
ASM alert log:

Tue Mar 03 14:58:34 2015
Warning: VKTM detected a time drift.
Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.
Wed Mar 18 17:01:23 2015 
NOTE: client lunarogg:lunarogg deregistered
Wed Mar 18 17:07:18 2015 
NOTE: client lunarogg:lunarogg registered, osid 16085, mbr 0x1
~

The database alert log:

Fri Mar 20 11:29:44 2015
Archived Log entry 650 added for thread 1 sequence 107161 ID 0x9911db3 dest 1:
Fri Mar 20 11:30:15 2015
ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process 25695
Fri Mar 20 11:32:05 2015
ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process 25711
Fri Mar 20 11:32:54 2015
Thread 1 cannot allocate new log, sequence 107163
Private strand flush not complete
  Current log# 3 seq# 107162 mem# 0: /u01/oradata/redo/lunarogg/onlinelog/redo03.log
Fri Mar 20 11:32:55 2015
ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process 25766
Thread 1 advanced to log sequence 107163 (LGWR switch)
  Current log# 4 seq# 107163 mem# 0: /u01/oradata/redo/lunarogg/onlinelog/redo04.log
Fri Mar 20 11:32:59 2015
Archived Log entry 651 added for thread 1 sequence 107162 ID 0x9911db3 dest 1:

The CRS/ohasd logs also show obvious errors:

2015-03-03 14:32:22.832: [UiServer][1135483200]{0:0:2} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:22.833: [UiServer][1135483200]{0:0:2} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:22.836: [UiServer][1135483200]{0:0:54} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:22.836: [UiServer][1135483200]{0:0:54} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:53.108: [UiServer][1135483200]{0:0:78} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:53.109: [UiServer][1135483200]{0:0:78} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:53.118: [UiServer][1135483200]{0:0:79} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:32:53.119: [UiServer][1135483200]{0:0:79} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:33:22.609: [  CRSSEC][1133381952]{0:0:95} Exception: setGroupIdentity failed while checking groupname: asmadmin
2015-03-03 14:33:22.615: [   CRSPE][1133381952]{0:0:95} Exception in setGroupIdentity (ignoring groupIdentity): Authorization Exception:The input identity user id does not belong to the group is invalid
2015-03-03 14:33:53.733: [UiServer][1135483200]{0:0:91} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:33:53.738: [UiServer][1135483200]{0:0:94} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:33:53.739: [UiServer][1135483200]{0:0:95} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-03 14:33:53.740: [UiServer][1135483200]{0:0:95} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-18 17:01:17.748: [  CRSSEC][1133381952]{0:0:8811} Exception: setGroupIdentity failed while checking groupname: asmadmin
2015-03-18 17:01:17.748: [   CRSPE][1133381952]{0:0:8811} Exception in setGroupIdentity (ignoring groupIdentity): Authorization Exception:The input identity user id does not belong to the group is invalid
2015-03-18 17:01:59.271: [  CRSSEC][1133381952]{0:0:8812} Exception: setGroupIdentity failed while checking groupname: asmadmin
2015-03-18 17:01:59.271: [   CRSPE][1133381952]{0:0:8812} Exception in setGroupIdentity (ignoring groupIdentity): Authorization Exception:The input identity user id does not belong to the group is invalid
2015-03-18 17:01:59.277: [UiServer][1135483200]{0:0:8812} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-18 17:01:59.295: [UiServer][1135483200]{0:0:8812} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-18 17:01:59.296: [UiServer][1135483200]{0:0:8812} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-18 17:34:14.510: [UiServer][1135483200]{0:0:8834} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.
2015-03-18 17:34:14.510: [UiServer][1135483200]{0:0:8834} Communication exception sending reply back to client.FatalCommsException : Failed to send response to client.

My guess was that at this point any operation that adds or drops a file would crash the database, and indeed, when I asked my colleague to run an RMAN backup the database crashed.
The fix is simple: update the HAS configuration so that it knows about the new hostname. The steps are as follows:
First record the key information about the current setup, because after HAS is rebuilt the ASM instance, database, disk groups and so on have to be re-registered in it:

[grid@lunardb ~]$ srvctl config asm
ASM home: /u01/app/grid/product/11.2.0/grid
ASM listener: LISTENER
Spfile: +DATA/asm/asmparameterfile/registry.253.847035651
ASM diskgroup discovery string: /dev/mapper/*
[grid@lunardb ~]$ 
[oracle@lunardb ~]$ srvctl config database -d lunarogg
Database unique name: lunarogg
Database name: lunarogg
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +DATA/lunarogg/spfilelunarogg.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Database instance: lunarogg
Disk Groups: DATA,DATA2
Services: 
[oracle@lunardb ~]$ 

While doing this I found another problem: the ASM spfile registered in HAS does not match the parameter file ASM is actually using.

[grid@lunardb ~]$ asmcmd spbackup +DATA/asm/asmparameterfile/registry.253.847035651 /tmp/ASMspfile.backup
ASMCMD-8303: invalid SPFILE '+DATA/asm/asmparameterfile/registry.253.847035651'
ORA-15056: additional error message
ORA-15173: entry 'asm' does not exist in directory '/'
ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 322
ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)
[grid@lunardb ~]$ 
Checking the ASM instance shows it was indeed started with a pfile, which the alert log also confirms:
[grid@lunardb ~]$ ss

SQL*Plus: Release 11.2.0.4.0 Production on Mon Mar 23 13:56:01 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option

SQL> show parameter pfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string
SQL> 


Searching all disk groups confirms that the spfile registered in HAS for the ASM instance does not actually exist:

[grid@lunardb ~]$ asmcmd find --type ASMPARAMETERFILE +DATA2 "*"
[grid@lunardb ~]$ asmcmd find --type ASMPARAMETERFILE +DATA "*"
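
A quick way to cross-check where ASM thinks its spfile should be versus what is sitting locally (asmcmd spget reads the registered spfile location; a sketch):

# registered spfile location vs. parameter files present in $ORACLE_HOME/dbs
asmcmd spget
ls -l $ORACLE_HOME/dbs/*ASM*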

Running kfod shows errors against a large number of devices:

[grid@lunardb ~]$ kfod asm_diskstring='/dev/mapper/*' disks=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group   
================================================================================
   1:    1429504 Mb /dev/mapper/D_R5_01_LUN01                grid     oinstall
   2:    1429504 Mb /dev/mapper/D_R5_01_LUN02                grid     oinstall
   3:    1425408 Mb /dev/mapper/D_R5_02_LUN01                grid     oinstall
   4:    1425408 Mb /dev/mapper/D_R5_02_LUN02                grid     oinstall
   5:    1425408 Mb /dev/mapper/D_R5_03_LUN01                grid     oinstall
   6:    1425408 Mb /dev/mapper/D_R5_03_LUN02                grid     oinstall
   7:    1425408 Mb /dev/mapper/D_R5_04_LUN01                grid     oinstall
   8:    1425408 Mb /dev/mapper/D_R5_04_LUN02                grid     oinstall
KFOD-00301: Unable to contact Cluster Synchronization Services (CSS). Return code 2 from kgxgncin.
KFOD-00311: Error scanning device /dev/mapper/control
ORA-15025: could not open disk "/dev/mapper/control"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/VolGroup00-LogVol00
ORA-15025: could not open disk "/dev/mapper/VolGroup00-LogVol00"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/VolGroup00-LogVol03
ORA-15025: could not open disk "/dev/mapper/VolGroup00-LogVol03"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/VolGroup00-LogVol04
ORA-15025: could not open disk "/dev/mapper/VolGroup00-LogVol04"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/VolGroup00-LogVol02
ORA-15025: could not open disk "/dev/mapper/VolGroup00-LogVol02"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/VolGroup00-LogVol01
ORA-15025: could not open disk "/dev/mapper/VolGroup00-LogVol01"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/Vol_oradata01-lv_oradata1
ORA-15025: could not open disk "/dev/mapper/Vol_oradata01-lv_oradata1"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
KFOD-00311: Error scanning device /dev/mapper/lunarogg01lun1
ORA-15025: could not open disk "/dev/mapper/lunarogg01lun1"
Linux-x86_64 Error: 13: Permission denied
Additional information: 42
Additional information: 22621383
Additional information: 1598903119
[grid@lunardb ~]$ 

Next I set about rebuilding HAS. To begin with, even with force the HAS stack could not be stopped, because it was already broken; this was expected and not a problem:

[root@lunardb ~]# crsctl stop has -f
CLSU-00100: Operating System function: opendir failed with error data: 2
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: scrsearch1
CLSU-00104: additional error information: cant open scr home dir scls_scr_getval
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
[root@lunardb ~]# 
[root@lunardb ~]# crsctl stop crs
CRS-4013: This command is not supported in a single-node configuration.
CRS-4000: Command Stop failed, or completed with errors.
[root@lunardb ~]# ps -ef|grep d.bin
grid      9107     1  0 Mar03 ?        00:01:43 /u01/app/grid/product/11.2.0/grid/bin/evmd.bin
grid      9140  9107  0 Mar03 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmlogger.bin -o /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.log
grid      9149     1  0 Mar03 ?        00:15:09 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      9172     1  0 Mar03 ?        00:01:50 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin 
grid     16751     1  0 Mar18 ?        00:00:17 /u01/app/grid/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit
root     31046 30878  0 14:17 pts/8    00:00:00 grep d.bin
[root@lunardb ~]# 

These processes could no longer be stopped, so the only option was to reboot the server (killing ocssd.bin would reboot the host anyway, so I simply rebooted it manually).
After the reboot, first remove the existing HAS configuration:

[root@lunardb ~]# $ORACLE_HOME/crs/install/roothas.pl -deconfig -force
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Delete failed, or completed with errors.
CLSU-00100: Operating System function: opendir failed with error data: 2
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: scrsearch1
CLSU-00104: additional error information: cant open scr home dir scls_scr_getval
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Successfully deconfigured Oracle Restart stack
[root@lunardb ~]# 

Then run the reconfiguration:

[root@lunardb ~]# $ORACLE_HOME/crs/install/roothas.pl
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE 
Creating OCR keys for user 'grid', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE 
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node lunardb successfully pinned.
Adding Clusterware entries to inittab

lunardb     2015/03/23 14:29:29     /u01/app/grid/product/11.2.0/grid/cdata/lunardb/backup_20150323_142929.olr
Successfully configured Oracle Grid Infrastructure for a Standalone Server
You have new mail in /var/spool/mail/root
[root@lunardb ~]# 

This is quick, much quicker than a full GI configuration; HAS really is a simple structure, haha.
Once the configuration is done, check the resources; everything is fine, the only thing left is to switch the following resources to auto start:

[grid@lunardb ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ons
               OFFLINE OFFLINE      lunardb                                    
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE                                                   
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  ONLINE       lunardb                                    
[grid@lunardb ~]$ 
[grid@lunardb ~]$ crsctl modify resource "ora.cssd" -attr "AUTO_START=1"
[grid@lunardb ~]$ crsctl modify resource "ora.evmd" -attr "AUTO_START=1"
[grid@lunardb ~]$ crsctl modify resource "ora.ons" -attr "AUTO_START=1"

Recreate the listener with netca:

[grid@lunardb ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       lunardb                                    
ora.ons
               OFFLINE OFFLINE      lunardb                                    
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE                                                   
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  ONLINE       lunardb                                    
[grid@lunardb ~]$ 

Add ASM:

[grid@lunardb ~]$ srvctl add asm
[grid@lunardb ~]$ crsctl modify resource "ora.asm" -attr "AUTO_START=1"
[grid@lunardb ~]$ srvctl config asm
ASM home: /u01/app/grid/product/11.2.0/grid
ASM listener: LISTENER
Spfile: 
ASM diskgroup discovery string: ++no-value-at-resource-creation--never-updated-through-ASM++
[grid@lunardb ~]$
Modify the ASM and listener configuration:
[grid@lunardb dbs]$ srvctl modify asm -l LISTENER -p "/u01/app/grid/product/11.2.0/grid/dbs/init+ASM.ora" -d "/dev/mapper/*"
[grid@lunardb dbs]$ crsctl modify resource "ora.asm" -attr "AUTO_START=1"
[grid@lunardb dbs]$ srvctl config asm
ASM home: /u01/app/grid/product/11.2.0/grid
ASM listener: LISTENER
Spfile: /u01/app/grid/product/11.2.0/grid/dbs/init+ASM.ora
ASM diskgroup discovery string: /dev/mapper/*
[grid@lunardb dbs]$ 

Register the database in HAS:

[oracle@lunardb ~]$ srvctl add database -d lunarogg -n lunardb -o /u01/app/oracle/product/11.2.0/db_1 -p +DATA/lunarogg/spfilelunarogg.ora
[oracle@lunardb ~]$ srvctl modify database -d lunarogg -a "DATA,DATA2"
[oracle@lunardb ~]$

Now check the HAS configuration again; everything is in place:

[grid@lunardb dbs]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       lunardb                                    
ora.DATA2.dg
               ONLINE  ONLINE       lunardb                                    
ora.LISTENER.lsnr
               ONLINE  ONLINE       lunardb                                    
ora.asm
               ONLINE  ONLINE       lunardb                Started             
ora.ons
               OFFLINE OFFLINE      lunardb                                    
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       lunardb                                    
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  ONLINE       lunardb                                    
ora.lunarogg.db
      1        OFFLINE OFFLINE                                                   
[grid@lunardb dbs]$ 

Running kfod against the ASM disks no longer reports any errors:

[grid@lunardb dbs]$ kfod asm_diskstring='/dev/mapper/*' disks=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group   
================================================================================
   1:    1429504 Mb /dev/mapper/D_R5_01_LUN01                grid     oinstall
   2:    1429504 Mb /dev/mapper/D_R5_01_LUN02                grid     oinstall
   3:    1425408 Mb /dev/mapper/D_R5_02_LUN01                grid     oinstall
   4:    1425408 Mb /dev/mapper/D_R5_02_LUN02                grid     oinstall
   5:    1425408 Mb /dev/mapper/D_R5_03_LUN01                grid     oinstall
   6:    1425408 Mb /dev/mapper/D_R5_03_LUN02                grid     oinstall
   7:    1425408 Mb /dev/mapper/D_R5_04_LUN01                grid     oinstall
   8:    1425408 Mb /dev/mapper/D_R5_04_LUN02                grid     oinstall
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME                                                          
================================================================================
      +ASM /u01/app/grid/product/11.2.0/grid                                    
[grid@lunardb dbs]$ 

Restarting via HAS brings all the resources and the database back up (see the command sketch after the status output below):

[grid@lunardb dbs]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       lunardb                                    
ora.DATA2.dg
               ONLINE  ONLINE       lunardb                                    
ora.LISTENER.lsnr
               ONLINE  ONLINE       lunardb                                    
ora.asm
               ONLINE  ONLINE       lunardb                Started             
ora.ons
               OFFLINE OFFLINE      lunardb                                    
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       lunardb                                    
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  ONLINE       lunardb                                    
ora.lunarogg.db
      1        ONLINE  ONLINE       lunardb                Open                
[grid@lunardb dbs]$ 
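
For reference, the restart itself is just the standard Oracle Restart sequence; a sketch (commands assumed rather than captured from the original session):

# bounce the whole HAS stack; resources with AUTO_START=1 come up on their own
crsctl stop has
crsctl start has
# start the database through its newly registered resource and check the stack
srvctl start database -d lunarogg
crsctl status res -t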

Done, everything is OK now.
