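Handling an oversized crfclust.bdb file
Symptom
During a routine inspection, a server in a production RAC environment was found to have only 8.3G free on the /oracle filesystem. The large files had to be cleaned up promptly to keep the disk from filling up and causing an outage.
-- Check disk space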
[root@rac01 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_rac01-lv_root
50G 11G 37G 23% /
tmpfs 64G 37G 27G 59% /dev/shm
/dev/sda1 485M 38M 422M 9% /boot
/dev/mapper/vg_rac01-lv_oracle
99G 86G 8.3G 92% /oracle
/dev/mapper/vg_rac01-lv_tmp
50G 180M 47G 1% /tmp
/dev/mapper/vg_rac01-lv_usr
50G 3.0G 44G 7% /usr
/dev/mapper/vg_rac01-lv_var
50G 582M 47G 2% /var
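The manual check above can be turned into a small filter. The following is a minimal sketch, not part of the original procedure: the function name `over_threshold` and the 90% threshold are assumptions chosen for illustration. It reads `df -P` style output on stdin; the `-P` (POSIX) format keeps each filesystem on one line, unlike the wrapped output shown above.

```shell
#!/bin/sh
# Print "<mount point> <use%>" for every filesystem at or above a threshold.
# Reads `df -P` output on stdin, so it can be exercised without a real disk.
# Field 5 is the capacity column and field 6 the mount point in -P format.
over_threshold() {
    threshold="$1"
    awk -v t="$threshold" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)                 # strip the trailing percent sign
            if (use + 0 >= t + 0) print $6, use "%"
        }'
}

# Assumed usage: list filesystems that are at least 90% full.
# df -P | over_threshold 90
```

Mount points that contain spaces are not handled by this sketch.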
Analysis
Use the find command to locate files larger than 1G
[root@rac01 ~]# find /oracle -type f -size +1024M
/oracle/grid_home/crf/db/rac01/crfloclts.bdb
/oracle/grid_home/crf/db/rac01/crfclust.bdb
/oracle/grid_home/log/diag/tnslsnr/rac01/listener_scan1/trace/listener_scan1.log
/oracle/app/diag/rdbms/orcl/orcl1/trace/alert_orcl1.log
/oracle/app/diag/rdbms/icpsp/icpsp1/trace/alert_icpsp1.log
[root@rac01 ~]# ls -lh /oracle/grid_home/crf/db/rac01/crfloclts.bdb
-rw-r----- 1 root root 1.2G Dec 29 10:50 /oracle/grid_home/crf/db/rac01/crfloclts.bdb
You have mail in /var/spool/mail/root
[root@rac01 ~]# ls -lh /oracle/grid_home/crf/db/rac01/crfclust.bdb
-rw-r----- 1 root root 53G Dec 29 10:50 /oracle/grid_home/crf/db/rac01/crfclust.bdb
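The find output above lists large files but not in size order. As a hedged sketch, the helper below ranks files by size so the biggest offender comes first. The name `largest_files` is hypothetical, and the `-printf` action is specific to GNU find (assumed here because the servers run Linux).

```shell
#!/bin/sh
# List the N largest regular files under a directory, biggest first.
# Output format: "<size in bytes> <path>".
# -xdev keeps find on the starting filesystem; -printf requires GNU find.
largest_files() {
    dir="$1"
    count="${2:-10}"
    find "$dir" -xdev -type f -printf '%s %p\n' 2>/dev/null |
        sort -rn |
        head -n "$count"
}

# Assumed usage: the five largest files under /oracle.
# largest_files /oracle 5
```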
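Root cause
crfclust.bdb is a Cluster Health Monitor (CHM) file. Its default size is 1G, but on some platforms and versions a bug causes it to grow far beyond that. See the following Oracle support notes: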
Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)
Bug 20186278 – crfclust.bdb Becomes Huge Size Due to Sudden Retention Change (Doc ID 20186278.8)
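Purpose of ora.crf
The ora.crf resource provides CHM. Cluster Health Monitor (CHM) is an Oracle tool that automatically collects operating system resource usage (CPU, memory, swap, processes, I/O, network, and so on). CHM samples this data once per second. The data is very helpful for diagnosing node reboots, hangs, instance evictions, and performance problems in a cluster. It can also be used to spot high system load or abnormal memory usage early, before they turn into more serious problems.
crfclust.bdb is the file in which the CRF service of Oracle Cluster Health Monitor (CHM) stores its data. By default only a limited period of data is retained, so the file normally does not grow large; the default size is 1G. On some platforms and versions, however, bugs cause it to grow excessively.
For example, in 11.2.0.4, bug 10165314 can cause the ora.crf service to generate very large files, which puts pressure on the space available under $GI_HOME. In such cases it may be necessary to delete these files or to prevent ora.crf from starting automatically with OHAS.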
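Because the file is expected to stay around 1G, its size is a simple thing to watch. Below is a minimal sketch of such a check; the function name `file_exceeds` and the 2 GiB limit in the usage comment are assumptions for illustration, and `stat -c %s` is the GNU coreutils form.

```shell
#!/bin/sh
# Return 0 when FILE is larger than LIMIT_BYTES, 1 when it is not,
# and 2 when the file cannot be inspected.
# Uses GNU stat (-c %s prints the size in bytes).
file_exceeds() {
    file="$1"
    limit_bytes="$2"
    size=$(stat -c %s "$file" 2>/dev/null) || return 2
    [ "$size" -gt "$limit_bytes" ]
}

# Assumed usage: warn when crfclust.bdb grows past 2 GiB.
# file_exceeds /oracle/grid_home/crf/db/rac01/crfclust.bdb 2147483648 &&
#     echo "crfclust.bdb is larger than expected"
```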
Resolution steps
Get the CHM path
-- Get the Cluster Health Monitor (CHM) repository path
[grid@rac01 bin]$ /oracle/grid_home/bin/oclumon manage -get reppath
CHM Repository Path = /oracle/grid_home/crf/db/rac02
Done
In this production environment, the reported CHM repository path could not be entered on this node:
[grid@rac01 bin]$ cd /oracle/grid_home/crf/db/rac02
-bash: cd: /oracle/grid_home/crf/db/rac02: No such file or directory
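In a virtual machine test environment the reported path was accessible; the reason for the difference is unknown. The analysis continues in the local node's own directory: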
[grid@rac01 bin]$ cd /oracle/grid_home/crf/db/rac01
[root@wldb01 wldb01]# du -sh
58G .
[grid@rac01 wldb01]# ls -lhtr
total 58G
-rw-r-----. 1 root root 16M Dec 30 14:35 log.0000047847
-rw-r-----. 1 root root 8.0K Dec 30 14:35 repdhosts.bdb
-rw-r-----. 1 root root 24K Dec 30 14:36 __db.001
-rw-r--r--. 1 root root 115M Dec 30 14:36 wldb01.ldb
-rw-r-----. 1 root root 8.0K Dec 30 14:36 crfconn.bdb
-rw-r-----. 1 root root 329M Dec 30 14:36 crfts.bdb
-rw-r-----. 1 root root 508M Dec 30 14:36 crfloclts.bdb
-rw-r-----. 1 root root 54G Dec 30 14:35 crfclust.bdb
-rw-r-----. 1 root root 392K Dec 30 14:35 __db.002
-rw-r-----. 1 root root 16M Dec 30 14:36 log.0000047848
-rw-r-----. 1 root root 504M Dec 30 14:36 crfhosts.bdb
-rw-r-----. 1 root root 650M Dec 30 14:36 crfcpu.bdb
-rw-r-----. 1 root root 534M Dec 30 14:36 crfalert.bdb
-rw-r-----. 1 root root 56K Dec 30 14:36 __db.006
-rw-r-----. 1 root root 1.2M Dec 30 14:36 __db.005
-rw-r-----. 1 root root 2.1M Dec 30 14:36 __db.004
-rw-r-----. 1 root root 2.6M Dec 30 14:36 __db.003
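The lookup above (ask oclumon for the path, then fall back to the local node's directory when that path does not exist) can be sketched as two small helpers. Both function names are hypothetical. The `<base>/<hostname>` layout is an assumption taken from the directories seen in this environment, not a documented rule.

```shell
#!/bin/sh
# Extract the path from `oclumon manage -get reppath` output on stdin.
# Expects the "CHM Repository Path = <path>" line shown in the transcript.
parse_reppath() {
    sed -n 's/^CHM Repository Path = //p'
}

# Choose the directory to inspect on this node: the reported path if it
# exists locally, otherwise <base>/<host> (assumed layout, see lead-in).
local_chm_dir() {
    reported="$1"
    base="$2"
    host="$3"
    if [ -d "$reported" ]; then
        printf '%s\n' "$reported"
    else
        printf '%s/%s\n' "$base" "$host"
    fi
}

# Assumed usage:
# reported=$(/oracle/grid_home/bin/oclumon manage -get reppath | parse_reppath)
# du -sh "$(local_chm_dir "$reported" /oracle/grid_home/crf/db "$(hostname -s)")"
```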
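Clean up the bdb files
Clean up the two nodes one at a time: finish node 1 first, then do node 2.
-- Show the status of all cluster resources. Initialization resources such as ora.cssd, ora.ctssd, and ora.diskmon are not shown.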
[root@rac01 rac01]# crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.DATA.dg
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.LISTENER.lsnr
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.asm
ONLINE ONLINE rac01 Started
ONLINE ONLINE rac02 Started
ora.gsd
OFFLINE OFFLINE rac01
OFFLINE OFFLINE rac02
ora.net1.network
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.ons
ONLINE ONLINE rac01
ONLINE ONLINE rac02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac02
ora.cvu
1 ONLINE ONLINE rac02
ora.icpsp.db
1 ONLINE ONLINE rac01 Open
2 ONLINE ONLINE rac02 Open
ora.oc4j
1 ONLINE ONLINE rac02
ora.orcl.db
1 ONLINE ONLINE rac01 Open
2 ONLINE ONLINE rac02 Open
ora.rac01.vip
1 ONLINE ONLINE rac01
ora.rac02.vip
1 ONLINE ONLINE rac02
ora.scan1.vip
1 ONLINE ONLINE rac02
-- Daemon status
-init: this option shows the status of the initialization resources, which typically include base resources such as ora.cssd, ora.ctssd, and ora.diskmon.
[root@rac01 rac01]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac01 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE rac01
ora.crf
1 ONLINE ONLINE rac01
ora.crsd
1 ONLINE ONLINE rac01
ora.cssd
1 ONLINE ONLINE rac01
ora.cssdmonitor
1 ONLINE ONLINE rac01
ora.ctssd
1 ONLINE ONLINE rac01 OBSERVER
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE ONLINE rac01
ora.gipcd
1 ONLINE ONLINE rac01
ora.gpnpd
1 ONLINE ONLINE rac01
ora.mdnsd
1 ONLINE ONLINE rac01
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'wldb01'
CRS-2677: Stop of 'ora.crf' on 'wldb01' succeeded
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE wldb01 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE wldb01
ora.crf
1 OFFLINE OFFLINE
ora.crsd
1 ONLINE ONLINE wldb01
ora.cssd
1 ONLINE ONLINE wldb01
ora.cssdmonitor
1 ONLINE ONLINE wldb01
ora.ctssd
1 ONLINE ONLINE wldb01 ACTIVE:0
ora.diskmon
1 OFFLINE OFFLINE
ora.drivers.acfs
1 ONLINE ONLINE wldb01
ora.evmd
1 ONLINE ONLINE wldb01
ora.gipcd
1 ONLINE ONLINE wldb01
ora.gpnpd
1 ONLINE ONLINE wldb01
ora.mdnsd
1 ONLINE ONLINE wldb01
-- Delete the files
[root@wldb01 wldb01]# rm -rf crfclust.bdb
or
[root@wldb01 wldb01]# rm -rf *.bdb
-- Check disk space
[root@rac01 rac01]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_rac01-lv_root
50G 11G 37G 23% /
tmpfs 64G 37G 27G 58% /dev/shm
/dev/sda1 485M 38M 422M 9% /boot
/dev/mapper/vg_rac01-lv_oracle
99G 33G 61G 35% /oracle
/dev/mapper/vg_rac01-lv_tmp
50G 180M 47G 1% /tmp
/dev/mapper/vg_rac01-lv_usr
50G 3.0G 44G 7% /usr
/dev/mapper/vg_rac01-lv_var
50G 582M 47G 2% /var
[root@rac01 rac01]# du -sh
4.9G
-- Start the crf service
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'wldb01'
CRS-2676: Start of 'ora.crf' on 'wldb01' succeeded
[root@wldb01 wldb01]# /u01/app/11.2.0/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE wldb01 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE wldb01
ora.crf
1 ONLINE ONLINE wldb01
ora.crsd
1 ONLINE ONLINE wldb01
ora.cssd
1 ONLINE ONLINE wldb01
ora.cssdmonitor
1 ONLINE ONLINE wldb01
ora.ctssd
1 ONLINE ONLINE wldb01 ACTIVE:0
ora.diskmon
1 OFFLINE OFFLINE
ora.drivers.acfs
1 ONLINE ONLINE wldb01
ora.evmd
1 ONLINE ONLINE wldb01
ora.gipcd
1 ONLINE ONLINE wldb01
ora.gpnpd
1 ONLINE ONLINE wldb01
ora.mdnsd
1 ONLINE ONLINE wldb01
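The deletion step in the sequence above is the risky one, since a wildcard rm in the wrong directory is hard to undo. The following is a sketch of a guarded version, under the assumption that a valid CHM repository directory always contains crfclust.bdb. The function name `remove_chm_bdb` is hypothetical.

```shell
#!/bin/sh
# Remove the *.bdb files from one CHM repository directory.
# Refuses to act unless crfclust.bdb is present, as a guard against
# being pointed at the wrong directory. Other files (log.*, __db.*)
# are left untouched.
remove_chm_bdb() {
    chm_dir="$1"
    if [ ! -f "$chm_dir/crfclust.bdb" ]; then
        echo "no crfclust.bdb in $chm_dir, nothing removed" >&2
        return 1
    fi
    rm -f "$chm_dir"/*.bdb
}

# Assumed per-node sequence, run as root, one node at a time:
# "$GRID_HOME"/bin/crsctl stop res ora.crf -init
# remove_chm_bdb /oracle/grid_home/crf/db/rac01
# "$GRID_HOME"/bin/crsctl start res ora.crf -init
```

Plain `rm -f` is used rather than `rm -rf` because only regular files are being removed.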
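If you would rather skip these steps, you can leave the service running and delete the files directly; crf recreates them automatically. This worked without problems in my own testing, but stopping the service before deleting is still recommended to avoid surprises. Keep in mind that on Linux the space of a deleted file is only released once no process holds it open.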
rm -f *.bdb
If you are sure the service is not needed, it can be disabled:
crsctl modify resource "ora.crf" -attr "AUTO_START=0" -init
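Open question
-- Problem description
The ora.crf resource was not shown in the output.
-- Cause
A gap in fundamentals: I had not understood the difference between the two commands crsctl status res -t and crsctl status res -t -init.
-- How the question was resolved
-- Show the status of all cluster resources. Initialization resources such as ora.cssd, ora.ctssd, and ora.diskmon are not shown.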
[root@rac01 rac01]# crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.DATA.dg
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.LISTENER.lsnr
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.asm
ONLINE ONLINE rac01 Started
ONLINE ONLINE rac02 Started
ora.gsd
OFFLINE OFFLINE rac01
OFFLINE OFFLINE rac02
ora.net1.network
ONLINE ONLINE rac01
ONLINE ONLINE rac02
ora.ons
ONLINE ONLINE rac01
ONLINE ONLINE rac02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac02
ora.cvu
1 ONLINE ONLINE rac02
ora.icpsp.db
1 ONLINE ONLINE rac01 Open
2 ONLINE ONLINE rac02 Open
ora.oc4j
1 ONLINE ONLINE rac02
ora.orcl.db
1 ONLINE ONLINE rac01 Open
2 ONLINE ONLINE rac02 Open
ora.rac01.vip
1 ONLINE ONLINE rac01
ora.rac02.vip
1 ONLINE ONLINE rac02
ora.scan1.vip
1 ONLINE ONLINE rac02
-- Daemon status
-init: this option shows the status of the initialization resources, which typically include base resources such as ora.cssd, ora.ctssd, and ora.diskmon.
[root@rac01 rac01]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac01 Started
ora.cluster_interconnect.haip
1 ONLINE ONLINE rac01
ora.crf
1 ONLINE ONLINE rac01
ora.crsd
1 ONLINE ONLINE rac01
ora.cssd
1 ONLINE ONLINE rac01
ora.cssdmonitor
1 ONLINE ONLINE rac01
ora.ctssd
1 ONLINE ONLINE rac01 OBSERVER
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE ONLINE rac01
ora.gipcd
1 ONLINE ONLINE rac01
ora.gpnpd
1 ONLINE ONLINE rac01
ora.mdnsd
1 ONLINE ONLINE rac01
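The difference between the two commands can also be checked mechanically. The sketch below pulls the TARGET and STATE columns for one resource out of `crsctl status res -t -init` style output. The function name `res_state` is hypothetical, and the parser assumes the layout shown above, where the resource name is on its own line and the next line starts with an instance number.

```shell
#!/bin/sh
# Print "TARGET STATE" for one resource, reading crsctl tabular output
# on stdin. Prints nothing when the resource is not listed, which is the
# case for ora.crf when -init is omitted.
res_state() {
    awk -v name="$1" '
        found      { print $2, $3; exit }   # line after the name: index, target, state
        $1 == name { found = 1 }
    '
}

# Assumed usage:
# crsctl status res -t -init | res_state ora.crf
```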
https://www.xifenfei.com/2017/03/high-space-usage-crfclust-bdb.html
https://blog.csdn.net/weixin_43700866/article/details/114382015