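My new laptop is underpowered and struggles to run RAC in a VM, so I copied a STANDALONE setup (ASM + SINGLE DATABASE) from a friend. After simply changing the hostname, I found that the CSS information was abnormal and HAS could not start...
After changing the hostname directly to lunar, HAS reported the following: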
[root@lunar bin]# crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.DATA.dg    ora....up.type 0/5    0/     ONLINE    OFFLINE               -----Note: ora.DATA.dg is abnormal
ora....ER.lsnr ora....er.type 0/5    0/     ONLINE    ONLINE    dabaobao
ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    dabaobao
ora.cssd       ora.cssd.type  0/5    0/5    ONLINE    ONLINE    dabaobao
ora.diskmon    ora....on.type 0/10   0/5    OFFLINE   OFFLINE               --this one is not supposed to be running anyway
ora.evmd       ora.evm.type   0/10   0/5    ONLINE    ONLINE    dabaobao
ora.ons        ora.ons.type   0/3    0/     OFFLINE   OFFLINE
[root@lunar bin]#crsctl status res -init
NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on dabaobao

NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=ONLINE on dabaobao

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=ONLINE on dabaobao

NAME=ora.diskmon
TYPE=ora.diskmon.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.evmd
TYPE=ora.evm.type
TARGET=ONLINE
STATE=ONLINE on dabaobao

NAME=ora.ons
TYPE=ora.ons.type
TARGET=OFFLINE
STATE=OFFLINE

[root@lunar bin]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      dabaobao
ora.LISTENER.lsnr
               ONLINE  ONLINE       dabaobao
ora.asm
               ONLINE  ONLINE       dabaobao                 Started
ora.ons
               OFFLINE OFFLINE      dabaobao
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       dabaobao
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       dabaobao
[root@lunar bin]# crsctl check has
CRS-4638: Oracle High Availability Services is online
[root@lunar bin]#
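The mismatch in the output above (ora.DATA.dg has TARGET=ONLINE but STATE=OFFLINE) can also be picked out mechanically. Here is a minimal sketch, assuming the NAME=/TARGET=/STATE= record layout that crsctl status res prints above; the function name list_target_state_mismatch is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl status res`-style records on stdin and
# print the NAME of every resource whose TARGET is ONLINE but whose STATE
# does not begin with ONLINE.
list_target_state_mismatch() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  {
            # STATE may read "ONLINE on <host>", so compare the first word only.
            split($2, words, " ")
            if (target == "ONLINE" && words[1] != "ONLINE") print name
        }
    '
}
```

It would be used as: crsctl status res -init | list_target_state_mismatch. Resources whose target is OFFLINE (ora.diskmon and ora.ons here) are deliberately ignored.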
After restarting HAS, it failed to start with the following errors:
[root@lunar ~]# crsctl start has
CLSU-00100: Operating System function: opendir failed with error data: 2
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: scrsearch1
CLSU-00104: additional error information: cant open scr home dir scls_scr_getval
CRS-4000: Command Start failed, or completed with errors.
[root@lunar ~]#
From "error location: scrsearch1" and "cant open scr home dir scls_scr_getval", it is clear that this is related to the hostname change, so I changed the hostname back to the previous one, dabaobao:
[root@lunar install]# hostname dabaobao
[root@lunar install]# exit
logout
Last login: Sun Jul 13 09:39:31 2014 from 192.168.56.1
[root@dabaobao ~]#
[root@dabaobao ~]#
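The "cant open scr home dir" message suggests that HAS looks for a directory named after the host. The following is a minimal sketch for checking that, assuming the SCR files live under /etc/oracle/scls_scr/<hostname>. That is the usual location on Linux as far as I know, but it is not shown in the session above, so verify it on your own system. The function name scr_dir_status is made up for illustration.

```shell
#!/bin/sh
# Hypothetical check: does an SCR directory exist for the given host name?
# The default base directory /etc/oracle/scls_scr is an assumption about a
# typical Linux install; pass a different base as the second argument.
scr_dir_status() {
    host=$1
    base=${2:-/etc/oracle/scls_scr}
    if [ -d "$base/$host" ]; then
        echo "found: $base/$host"
    else
        echo "missing: $base/$host"
        # Show which host-named directories do exist, if any.
        ls -1 "$base" 2>/dev/null | sed 's/^/existing: /'
    fi
}
```

For example, scr_dir_status lunar would report the directory as missing and list dabaobao as existing if the configuration was still keyed to the old hostname.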
After changing back to the old hostname "dabaobao", I restarted HAS again and this time it started. Evidently the HAS architecture is very simple...
[root@dabaobao ~]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@dabaobao ~]# ps -ef|grep d.bin
grid      3192     1  0 08:56 ?        00:00:02 /u01/app/grid/product/11.2.0/db_1/bin/evmd.bin
grid      3271     1  0 08:56 ?        00:00:03 /u01/app/grid/product/11.2.0/db_1/bin/ocssd.bin
grid      5324     1  2 10:00 ?        00:00:02 /u01/app/grid/product/11.2.0/db_1/bin/ohasd.bin reboot
root      5478  5293  0 10:02 pts/2    00:00:00 grep d.bin
[root@dabaobao ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      dabaobao
ora.LISTENER.lsnr
               ONLINE  ONLINE       dabaobao
ora.asm
               ONLINE  ONLINE       dabaobao                 Started
ora.ons
               OFFLINE OFFLINE      dabaobao
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       dabaobao
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       dabaobao
[root@dabaobao ~]#
Here, apart from the ora.DATA.dg resource, all other resources are in a normal state. At this point we use roothas.pl to remove the HAS configuration:
[root@dabaobao install]# ./roothas.pl -deconfig -force
Using configuration parameter file: ./crsconfig_params
CRS resources for listeners are still configured
PRKO-2573 : ONS daemon is already stopped.
CRS-2673: Attempting to stop 'ora.asm' on 'dabaobao'
ORA-21561: OID generation failed
CRS-5022: Stop of resource "ora.asm" failed: current state is "UNKNOWN"
CRS-2675: Stop of 'ora.asm' on 'dabaobao' failed
CRS-4000: Command Stop failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'dabaobao'
CRS-2673: Attempting to stop 'ora.asm' on 'dabaobao'
ORA-21561: OID generation failed
CRS-5022: Stop of resource "ora.asm" failed: current state is "UNKNOWN"
CRS-2675: Stop of 'ora.asm' on 'dabaobao' failed
CRS-2679: Attempting to clean 'ora.asm' on 'dabaobao'
ORA-21561: OID generation failed
CRS-5022: Stop of resource "ora.asm" failed: current state is "UNKNOWN"
CRS-2678: 'ora.asm' on 'dabaobao' has experienced an unrecoverable failure
CRS-2799: Failed to shut down resource 'ora.asm' on 'dabaobao'
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'dabaobao' has failed
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
You must kill ohasd processes or reboot the system to properly cleanup the processes started by Oracle clusterware
Successfully deconfigured Oracle Restart stack
[root@dabaobao install]#
Then change the hostname to lunar and run roothas.pl again, so that it automatically generates the configuration from the current hostname and IP:
[root@dabaobao install]# hostname lunar
[root@dabaobao install]#
[root@dabaobao install]# exit
logout
[root@lunar install]# ./roothas.pl
Using configuration parameter file: ./crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'grid', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node lunar successfully pinned.
Adding Clusterware entries to inittab

lunar     2014/07/13 10:14:08     /u01/app/grid/product/11.2.0/db_1/cdata/lunar/backup_20140713_101408.olr
Successfully configured Oracle Grid Infrastructure for a Standalone Server
[root@lunar install]#
[root@lunar install]# crsctl start has
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
[root@lunar install]#
As shown, the HAS configuration for node name lunar has now been generated:
[root@lunar install]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ons
               OFFLINE OFFLINE      lunar
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       lunar
[root@lunar install]#
Add ASM:
[root@lunar bin]# su - grid
[grid@lunar ~]$ srvctl add asm
[grid@lunar ~]$ srvctl start asm
[grid@lunar ~]$
[root@lunar install]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.asm
               ONLINE  ONLINE       lunar                    Started
ora.ons
               OFFLINE OFFLINE      lunar
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       lunar
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       lunar
[root@lunar install]#
[root@lunar install]# crsctl modify resource "ora.asm" -attr "AUTO_START=1"
Add the ASM disk group:
[grid@lunar ~]$ vi init+ASM.ora
asm_diskgroups='DATA'
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile='EXCLUSIVE'

SQL> shutdown immediate
ASM diskgroups dismounted
ASM instance shutdown
SQL> startup pfile=/home/grid/init+ASM.ora
ASM instance started

Total System Global Area 1135747072 bytes
Fixed Size                  2260728 bytes
Variable Size            1108320520 bytes
ASM Cache                  25165824 bytes
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

SQL>
This error means ASM did not discover the disks it needs, so I added asm_diskstring to the parameter file:
[grid@lunar ~]$ vi init+ASM.ora
asm_diskgroups='DATA'
asm_diskstring= "/dev/asm-disk*"
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile='EXCLUSIVE'
~
[grid@lunar ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Sun Jul 13 10:40:17 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option

SQL> shutdown abort
ASM instance shutdown
SQL> startup pfile=/home/grid/init+ASM.ora
ASM instance started

Total System Global Area 1135747072 bytes
Fixed Size                  2260728 bytes
Variable Size            1108320520 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL>
As you can see, the disk group is now mounted.
Next we create an spfile in preparation for restarting HAS:
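When ORA-15063 appears, a quick sanity check is to see whether the pattern given to asm_diskstring matches anything at all from the shell. The following is a rough sketch under that assumption; the function name count_glob_matches is made up for illustration, and a shell glob only approximates ASM disk discovery (it says nothing about ownership, permissions, or disk headers).

```shell
#!/bin/sh
# Hypothetical helper: count the existing paths that a shell glob such as
# /dev/asm-disk* expands to.
count_glob_matches() {
    pattern=$1
    count=0
    # $pattern is left unquoted on purpose so that the shell expands the glob.
    for path in $pattern; do
        # An unmatched glob stays as the literal pattern, so test existence.
        [ -e "$path" ] && count=$((count + 1))
    done
    echo "$count"
}
```

For example, count_glob_matches "/dev/asm-disk*" printing 0 would mean the pattern matches no device files at all.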
SQL> create spfile='+DATA' from pfile='/home/grid/init+ASM.ora';

File created.

SQL> show parameter spfile;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string
SQL>
[root@lunar install]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       lunar
ora.asm
               ONLINE  ONLINE       lunar                    Started
ora.ons
               OFFLINE OFFLINE      lunar
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       lunar
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       lunar
[root@lunar install]# crsctl modify resource "ora.DATA.dg" -attr "AUTO_START=1"
[root@lunar install]#
Restart HAS:
[root@lunar install]# crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'lunar'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'lunar'
CRS-2677: Stop of 'ora.DATA.dg' on 'lunar' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'lunar'
CRS-2677: Stop of 'ora.asm' on 'lunar' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'lunar'
CRS-2677: Stop of 'ora.cssd' on 'lunar' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'lunar'
CRS-2677: Stop of 'ora.evmd' on 'lunar' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'lunar' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@lunar install]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@lunar install]#
[root@lunar install]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      lunar
ora.asm
               ONLINE  OFFLINE      lunar                    Instance Shutdown,STARTING
ora.ons
               OFFLINE OFFLINE      lunar
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       lunar
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  INTERMEDIATE lunar
[root@lunar install]#
After waiting a while, everything is OK:
[root@lunar install]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       lunar
ora.asm
               ONLINE  ONLINE       lunar                    Started
ora.ons
               OFFLINE OFFLINE      lunar
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       lunar
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       lunar
[root@lunar install]#
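Rather than re-running the status command by hand while the stack comes up, the waiting can be scripted. Below is a generic retry sketch; wait_until is a made-up name, and the check command you pass to it is up to you (any command that exits 0 once the resources look healthy).

```shell
#!/bin/sh
# Hypothetical helper: run a check command up to <attempts> times, sleeping
# <delay> seconds after each failed attempt. Returns 0 as soon as the check
# succeeds, or 1 if every attempt fails.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_until 30 10 some_check would try some_check every 10 seconds for up to 30 attempts, where some_check stands for whatever health test you trust.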
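Summary:
1. When you find HAS or CRS in an abnormal state, do not stop CRS or HAS.
2. If errors appear while changing the hostname or IP, do not stop CRS or HAS (some of the subsequent operations need these resources).
3. The procedure for changing the hostname and IP in a HAS environment:
(1) First clean up the old configuration with roothas.pl -deconfig -force
(2) Change the hostname (/etc/hosts, /etc/sysconfig/network, hostname, and so on)
(3) Run ./roothas.pl (it automatically generates new configuration from the current settings)
(4) Add the ASM resource
(5) Add the disk group
(6) Restart HAS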
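The procedure can be kept as a small dry-run script that only prints the commands in order, so nothing is executed by accident. This is a sketch built from the commands used in this walkthrough. The function name print_rename_plan is made up, and the install directory is left as a comment because its full path does not appear in the session.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print, in order, the commands used in this
# walkthrough to rename an Oracle Restart host. Nothing is executed.
print_rename_plan() {
    new_host=$1
    cat <<EOF
# as root, in the Grid home install directory
./roothas.pl -deconfig -force
hostname $new_host   # also update /etc/hosts and /etc/sysconfig/network
./roothas.pl
# as grid
srvctl add asm
srvctl start asm
# mount the disk group (set asm_diskstring if the disks are not discovered)
# as root
crsctl stop has
crsctl start has
EOF
}
```

Running print_rename_plan lunar prints the checklist for the new hostname lunar, which can then be worked through by hand.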