Oracle 10gR2 (10.2.0.5) RAC: Excessive CPU Load

From Fantasy's wiki
Revision as of 04:01, 1 October 2014 (Wednesday) by Maoenfeng


Symptom 1

/u01/app/oracle/admin/TBNLOT/udump/tbnlot2_ora_1610.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name:    Linux
Node name:      nrac2
Release:        2.6.18-371.11.1.el5
Version:        #1 SMP Wed Jul 23 15:12:55 EDT 2014
Machine:        x86_64
Instance name: TBNLOT2
Redo thread mounted by this instance: 2
Oracle process number: 465
Unix process pid: 1610, image: oracle@nrac2

*** ACTION NAME:() 2014-09-29 15:06:32.377
*** MODULE NAME:(TCPrintAppServer.exe) 2014-09-29 15:06:32.377
*** SERVICE NAME:(TBNLOTT) 2014-09-29 15:06:32.377
*** SESSION ID:(845.3) 2014-09-29 15:06:32.377
WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0
WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65516
WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0
ksfdgo()+1488<-ksfdaio1()+9848<-kfkUfsIO()+594<-kfkDoIO()+631<-kfkIOPriv()+616<-kfdIOPriv()+95<-kfioSubmitIO()+503<-kfioRequestPriv()+166<-kfioRequest()+689<-ksfd_osmgo()+1286<-ksfdgo()+1488<-ksfdaio1()+9848<-kcflbi()+498<-kcbldio()+1897<-kcblrs()+357<-kcblrd()+154
<-kcblgt()+1215<-kcbldrget()+218<-kcbgtcr()+25990<-kdlrdb()+146<-kdlccbk()+355<-kdlwfb()+716<-kdl_write1()+2399<-kdl_copy()+972ssd_unwind_bp: unhandled instruction at 0x1f01b1f instr=f
<-kokliclo()+565<-koklcre()+512<-kokleva()+863<-evaopn2()+458<-insolev()+196<-insbrp()+360<-insrow()+236<-insdrv()+589<-inscovexe()+399<-insExecStmtExecIniEngine()+85<-insexe()+384<-opiexe()+9334<-kpoal8()+2295<-opiodr()+1184<-ttcpip()+1226<-opitsk()+1310<-opiino()+1024
<-opiodr()+1184<-opidrv()+548<-sou2o()+114<-opimai_real()+163<-main()+116<-__libc_start_main()+244<-_start()+41
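The io_submit warning in this trace is a convenient signature for finding other affected sessions. The following is a minimal sketch, not a definitive tool: the function name find_aio_traces is hypothetical, and the default directory is simply the udump path seen in the trace header on this page.

```shell
# List trace files that contain the io_submit warning.
# $1: directory to search (assumed default: the udump path from this page).
find_aio_traces() {
  # -l prints only the names of matching files;
  # -s suppresses errors about missing or unreadable files.
  grep -ls 'io_submit failed due to kernel limitations' \
    "${1:-/u01/app/oracle/admin/TBNLOT/udump}"/*.trc
}
```

Calling find_aio_traces with no argument searches the default directory; on a RAC cluster it would need to be run on each node, since every instance writes its own trace files.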

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9407 oracle    25   0 32.2g  59m  51m R 99.9  0.0 944:57.20 oracle
24494 oracle    25   0 32.1g  30m  23m R 99.9  0.0  53:47.66 oracle
27200 oracle    25   0 32.1g  46m  39m R 99.9  0.0   1156:21 oracle
30885 oracle    25   0 32.1g  30m  23m R 99.9  0.0 103:06.92 oracle
31073 oracle    25   0 32.1g  30m  23m R 99.9  0.0 102:45.68 oracle
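The same picture can be gathered without an interactive top session. This is a rough sketch under stated assumptions: the helper name hot_oracle_procs and the 90% threshold are illustrative, and the pcpu column of ps is an average over the life of the process, not the instantaneous figure that top shows.

```shell
# Print "pid pcpu cputime" for oracle processes at or above a CPU threshold.
# Reads rows of "pid pcpu time comm" on stdin.
# $1: minimum %CPU (assumed default: 90).
hot_oracle_procs() {
  awk -v min="${1:-90}" \
    '$4 == "oracle" && $2 + 0 >= min + 0 { print $1, $2, $3 }'
}
```

A typical invocation would be `ps -eo pid=,pcpu=,time=,comm= | hot_oracle_procs 90`. Processes that have spun for a long time, like those above, keep a high lifetime average and should still show up.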

Symptom 2

/u01/app/oracle/admin/TBNLOT/udump/tbnlot1_ora_14007.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name:    Linux
Node name:      nrac1
Release:        2.6.18-371.11.1.el5
Version:        #1 SMP Wed Jul 23 15:12:55 EDT 2014
Machine:        x86_64
Instance name: TBNLOT1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 0
Unix process pid: 14007, image: oracle@nrac1

Number of resource hash buckets is 6955
Parsing user specified table space list to be ignored
2014-09-29 15:12:21.689: [ COMMCRS]clsc_set_clsd_NS_trace: called before init completed
Number of resource hash buckets is 6955

  • kjfcnfy: kjinumbuckets = 32768

Dynamic strand is set to TRUE
Running with 2 shared and 0 private strand(s). Zero-copy redo is FALSE

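Some research showed that this is yet another bug, with two documented workarounds: (1) raise the operating system kernel parameter AIO-MAX-NR, or (2) disable disk asynchronous I/O. I resolved the problem by raising AIO-MAX-NR.

1. The kernel parameter aio-max-nr can be changed temporarily (the change is lost on reboot):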

  # echo 1048576 > /proc/sys/fs/aio-max-nr

2. To change aio-max-nr permanently, add the following line to /etc/sysctl.conf:

  fs.aio-max-nr = 1048576

Then run the following command to apply the setting:

  # /sbin/sysctl -p
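After changing the limit, it may be worth confirming how many asynchronous I/O slots are in use compared with the kernel limit. A minimal sketch: the function name aio_headroom is hypothetical, and the two optional path arguments exist only so the function can be pointed at other files; by default it reads the standard /proc counters.

```shell
# Report async I/O slot usage against the kernel limit.
# $1: path to the in-use counter (default: /proc/sys/fs/aio-nr)
# $2: path to the limit          (default: /proc/sys/fs/aio-max-nr)
aio_headroom() {
  nr=$(cat "${1:-/proc/sys/fs/aio-nr}") || return 1
  max=$(cat "${2:-/proc/sys/fs/aio-max-nr}") || return 1
  # free is the number of slots still available system-wide.
  echo "aio-nr=$nr aio-max-nr=$max free=$((max - nr))"
}
```

With the values from the trace in Symptom 1 (AIO-NR=65516, AIO-MAX-NR=65536) this would report only 20 free slots, which is consistent with the io_submit failures.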

Solution

Appendix: reference material

Bug 9949948  Linux: Process spin under ksfdrwat0 if OS Async IO not configured high enough

This note gives a brief overview of bug 9949948. The content was last updated on: 28-OCT-2011

Affects:
  Product (Component): Oracle Server (Rdbms)
  Range of versions believed to be affected: Versions >= 10.2.0.4 but BELOW 11.1
  Versions confirmed as being affected: 10.2.0.5
  Platforms affected: Linux X86-64bit, Linux 32bit

It is believed to be a regression in default behaviour thus: Regression introduced in 10.2.0.5

Fixed:
  This issue is fixed in: 11.1.0.6 (Base Release); 10.2.0.5.2 Patch Set Update; 10.2.0.5 Patch 5 on Windows Platforms

Symptoms: Hang (Process Spins); Waits for "i/o slave wait"
Related To: DISK_ASYNCH_IO

Description
This problem is introduced in 10.2.0.5.

It only affects platforms where Oracle has to reserve async IO slots, such as Linux platforms.

If the OS async IO layer is underconfigured and an Oracle process cannot get sufficient AIO slots then rather than reverting to using non AIO call the process may go into an infinite spin under ksfdrwat0.

Rediscovery notes:
The spin will be preceded by messages in the trace file of the form:

  WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0
  WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65518
  WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0

Notice specifically that the value for aiolimit is reported as "0" for this bug.

The process then spins in ksfdrwat0 typically with a stack showing

  skgfqio ()
  ksfdgo ()
  ksfdwtio ()
  ksfdwat1 ()
  ksfdrwat0 ()     <<< Spin point
  ksfdblock ()
  kcflwi ()
  kcflci ()
  kcblci ()
  kcblcio ()
  kcblgt ()
  kcbldrget ()

It will show repeated waits for "i/o slave wait", which can be misleading as that is normally considered an idle wait event.
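One way to check a suspect process against these rediscovery notes is to look for the ksfdrwat0 frame in its current call stack. A minimal sketch, assuming a stack dump is available as text (for example from pstack, which is part of the gdb package on RHEL 5); the function name stack_shows_aio_spin is hypothetical.

```shell
# Exit 0 when the stack text on stdin contains the ksfdrwat0 frame, 1 otherwise.
stack_shows_aio_spin() {
  grep -q 'ksfdrwat0'
}
```

For one of the busy PIDs from the top output, a possible invocation is `pstack 9407 | stack_shows_aio_spin && echo "spinning in ksfdrwat0"`. A single sample only shows where the process is at that moment, so repeating the check a few times gives a more reliable picture.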

Workaround
Raise the OS AIO limits such that the number of concurrent slot requirements never exceeds the OS limit, i.e. increase AIO-MAX-NR
OR
Disable async IO (set DISK_ASYNCH_IO=FALSE)
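The second workaround was not the one applied on this system, so the following is only a sketch of what it might look like. It assumes the instances use an spfile; the function name disable_async_io_sql is hypothetical. Because disk_asynch_io is a static parameter, the change takes effect only after the instances are restarted.

```shell
# Print the statement for the DISK_ASYNCH_IO workaround; nothing is executed here.
# SCOPE = SPFILE is needed because the parameter is static;
# SID = '*' applies the change to every RAC instance.
disable_async_io_sql() {
  cat <<'EOF'
ALTER SYSTEM SET disk_asynch_io = FALSE SCOPE = SPFILE SID = '*';
EOF
}
```

The statement could then be reviewed and piped to SQL*Plus, for example `disable_async_io_sql | sqlplus -S "/ as sysdba"`, assuming OS authentication as a SYSDBA user. Disabling asynchronous I/O can affect I/O throughput, so raising AIO-MAX-NR is the less intrusive of the two options.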

See Note:1313555.1 for additional notes on this issue.

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

References
  Bug:9949948
  Note:245840.1 Information on the sections in this article
