This article was motivated by a real-life issue experienced on an Exadata system running a few dozen Oracle databases.
I have already written about Exadata shortcomings in one of my previous articles, where I stressed that one of the main disadvantages of the Exadata machine is the lack of virtualization technology.
Oracle recognizes Oracle VM as the only supported virtualization technology on the x86 architecture and the only way to control license costs, as it allows you to license only part of the available CPU cores on the host (server box).
The main problem with Oracle VM is that many Oracle clients, for various reasons, avoid using it, especially for heavy database loads, and instead prefer VMware, run on bare metal, or switch to IBM AIX, which is fully supported from a virtualization standpoint (although 1 CPU core on AIX equals 1 Oracle DB license, while on the x86 platform, Linux and Windows, the ratio is 2 CPU cores = 1 license).
On the other side, VMware (acquired by EMC, which was later acquired by Dell), the dominant virtualization technology on the x86 platform, is officially not supported.
That means there is no way to control license costs with VMware: you have to license the whole host (server box), as a VMware virtual CPU is not recognized by Oracle.
The problem is even worse today, as modern servers usually come with many cores.
Another problem is what happens when you hit an issue.
Oracle will provide support for their products, but if Oracle Support concludes that the issue might be related to the virtualization layer, you will be told to contact VMware support, and a ping-pong game can begin.
There are only three options left to Oracle customers:
1. to use IBM AIX (or Sun Solaris), as its virtualization technology is fully supported
2. to migrate to the Cloud
3. to install many Oracle databases on the same host with no virtualization (bare metal)
At the time of writing this article, option 3 is the approach the majority of Oracle customers use.
Although the majority of Oracle-related books have passed through my hands, and I have gone through the official Oracle documentation, I have not yet found a description of how to set up the Linux OS so that it can handle a few dozen Oracle databases running from one or more Oracle Homes.
For that reason, Exadata owners in particular often have a difficult time balancing between not using the available resources by running only one or two databases per Exadata host, and using those resources by installing many databases, with the risk of crashing the system due to an inappropriate OS setup.
I’ll address how to set up the Linux OS for running many Oracle databases without virtualization in one of my future articles.
In this article I’ll describe how to set up the capturing of kernel dumps in case of a Linux crash caused by a misconfigured OS.
Assuming you are using OEL 6 or 7, there are several options for analyzing a kernel crash.
One option is the DTrace utility, which was ported from the former Sun Solaris (now Oracle Solaris) UNIX system to Oracle Linux.
Although DTrace is probably the most powerful utility of its kind, the Linux version is not as powerful as the Solaris version.
But the main reason that limits the popularity of DTrace on Oracle Linux systems is its prerequisite that you subscribe your system to the following channels:
– Oracle Linux 6 Latest (x86_64) (ol6_x86_64_latest)
– Unbreakable Enterprise Kernel Release 3 (UEK R3) for Oracle Linux 6 (x86_64)
– Oracle Linux 6 Dtrace Userspace Tools (x86_64)
I’m not going to waste space on DTrace here, as you can find more details on the following page (for OEL 6):
https://docs.oracle.com/cd/E37670_01/E37355/html/ol_config_dtrace.html
There is also a whole book dedicated to leveraging that powerful utility, which can also be a topic for one of my future articles.
For that reason, the tool that will be most appropriate for the majority of sysadmins running Oracle software (Database, WebLogic…) on Linux (Oracle Linux) is the Kdump utility, which comes from the Red Hat laboratory.
Kdump first appeared in RHEL 5 and was further enhanced in RHEL 6 and 7.
In case of a crash condition, Kdump boots a secondary (crash) kernel that has been preloaded into a reserved area of memory.
This secondary kernel then copies the memory pages to the crash dump location (configurable either through the GUI or by editing the /etc/kdump.conf configuration file).
Explaining in detail how to use the tool, or what kinds of crashes can be recorded, is far beyond the scope of this article.
I’ll point out just a few important things.
First we need to install Kdump:
root@oel:/var/crash>yum install kexec-tools system-config-kdump
While the first package installs the utility itself, the latter is needed only if you want a GUI from which you can enable/disable Kdump and configure it (reserved memory, location on disk, etc.).
The following screenshots show that GUI interface, from which you can control the Kdump utility: enable or disable it, set the path for the dump, and so on.
As the majority of production Linux servers have no GUI installed, all steps here will be performed from the terminal.
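If you want to double-check that both packages are actually installed before going further, a simple query from the terminal is enough, for example:
root@oel:/>rpm -q kexec-tools system-config-kdump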
The next step is to determine the kernel version, which can easily be done by executing the following command:
root@oel:/>uname -r
2.6.39-400.300.2.el6uek.x86_64
Now that I know the kernel version, the next step is to add the
crashkernel= kernel option
to the /boot/grub/grub.conf file.
For the Red Hat kernel you can add
crashkernel=auto
but if you are using UEK (Oracle Linux Unbreakable Enterprise Kernel), you won’t be able to start the Kdump service unless you add the following:
crashkernel=448M@64M numa=off
Here is the complete kernel entry used in the /boot/grub/grub.conf file:
title Oracle Linux Server Unbreakable Enterprise Kernel (2.6.39-400.300.2.el6uek.x86_64)
root (hd0,0)
kernel /boot/vmlinuz-2.6.39-400.300.2.el6uek.x86_64 root=UUID=f5e110df-ed68-4d24-8435-b39657f5691e ro selinux=0 rd_NO_LUKS LANG=en_US.UTF-8 KEYBOARDTYPE=pc KEYTABLE=croat rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb rd_NO_LVM rd_NO_DM crashkernel=448M@64M numa=off
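The example above applies to OEL 6 with the legacy GRUB. If you are on OEL 7, which uses GRUB 2, the equivalent approach (a sketch; adjust the crashkernel value to your environment) is to append the option to GRUB_CMDLINE_LINUX in /etc/default/grub and then regenerate the configuration:
root@oel:/>grub2-mkconfig -o /boot/grub2/grub.cfg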
The next step is to check the /boot/config-`uname -r` file, which in my case is /boot/config-2.6.39-400.300.2.el6uek.x86_64, for the following options:
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
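A quick way to verify both options at once (both should be set to y, as shown above), for example:
root@oel:/>grep -E "CONFIG_KEXEC=|CONFIG_CRASH_DUMP=" /boot/config-`uname -r`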
The last step is to review the Kdump configuration file, /etc/kdump.conf, and change it to fit your purpose.
The most important setting is where the dump will be written (the default is /var/crash, as defined by the path variable inside the kdump.conf file).
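For example, if you want to send dumps to a dedicated filesystem instead of the default, only the path directive needs to change (/u01/crash below is a hypothetical location that must already exist and have enough free space):
path /u01/crash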
As the entire content of memory is rarely needed to analyze and troubleshoot kernel issues, especially when the host has a lot of memory, which is common today (Exadata has TBs of memory), dumping all of it can be slow and can add performance overhead.
For that reason we need to configure the core collector to capture only what is needed; usually zero, free, cache and user pages are not needed, and dump level 31 (-d 31) excludes exactly those page types, while -c compresses the resulting dump.
I therefore usually make one change in the kdump.conf file:
#core_collector makedumpfile -p --message-level 1 -d 31
is replaced with
core_collector makedumpfile -d 31 -c
Now it’s time to reboot the Linux machine.
After the reboot we can check whether the Kdump service has been started automatically:
root@oel:/var/crash/127.0.0.1-2018-08-13-14:24:36>chkconfig --list |grep kdump
kdump 0:off 1:off 2:on 3:on 4:on 5:on 6:off
or with the following command:
root@oel:/var/crash/127.0.0.1-2018-08-13-14:24:36>service kdump status
Kdump is operational
If it is not, you can start Kdump by executing the following:
root@oel:/>service kdump start
or with the following command:
root@oel:/>/etc/init.d/kdump start
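If the service is not configured to start at boot in the appropriate runlevels, you can enable it with chkconfig, and it is also worth confirming that the crashkernel memory reservation from grub.conf actually took effect after the reboot, for example:
root@oel:/>chkconfig kdump on
root@oel:/>cat /proc/cmdline
root@oel:/>dmesg | grep -i crashkernel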
Once Kdump is set up, a vmcore file will be created at the configured location whenever the kernel panics.
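Before deliberately crashing the machine, it is worth verifying that the crash kernel is actually loaded into the reserved memory area, otherwise the system would simply go down without producing a dump. One way to check (a value of 1 means it is loaded):
root@oel:/>cat /sys/kernel/kexec_crash_loaded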
To test it you can manually trigger a kernel panic. I’ll trigger one with the following command:
root@oel:/>echo c > /proc/sysrq-trigger
After the restart, in the configured location (the default is /var/crash, as defined in kdump.conf), you can find a directory like the following:
drwxr-xr-x 2 root root 4096 Aug 13 14:32 127.0.0.1-2018-08-13-14:24:36
Inside you can find:
root@oel:/var/crash/127.0.0.1-2018-08-13-14:24:36>ls -la
total 21680
drwxr-xr-x 2 root root 4096 Aug 13 14:32 .
drwxr-xr-x. 3 root root 4096 Aug 13 14:24 ..
-rw------- 1 root root 22133473 Aug 13 14:24 vmcore
-rw-r--r-- 1 root root 26204 Aug 13 14:24 vmcore-dmesg.txt
Starting with RHEL 6.4 (OEL 6.4), Kdump also dumps the kernel log to a file named vmcore-dmesg.txt, which you can inspect directly (it is a plain text file).
In this case, near the end you can see:
<4> [] write_sysrq_trigger+0x4a/0x50
…
<1>RIP [] sysrq_handle_crash+0x16/0x20
as we triggered the kernel crash ourselves.
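If you just want a quick look without opening the whole file, the end of the log is usually where the panic trace ends up, for example:
root@oel:/var/crash/127.0.0.1-2018-08-13-14:24:36>tail -50 vmcore-dmesg.txt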
At this point you can get an idea of what was going on at the time of the crash and what caused it.
As a next step you can upload the vmcore (a binary file) along with vmcore-dmesg.txt to Red Hat or Oracle Linux support to get more details about what caused the OS crash, along with a possible solution.
As an alternative (or in parallel with raising a ticket with Red Hat or Oracle support), you can perform further analysis of the kernel vmcore yourself by using:
– crash utility
– kernel debugging symbols
To install the crash utility you need to execute:
root@oel:/>yum install crash
After that you need to install the debuginfo packages (the debug symbols). For that you need a subscription to the debuginfo channel, or you can download the packages for your particular kernel version (uname -r) from Red Hat support, or in the case of Oracle Linux from the following web pages:
OEL 7:
OEL 6:
Finally you need to install them:
root@oel:/>yum install kernel-uek-debuginfo-common-2.6.39-400.300.2.el6uek.x86_64.rpm
root@oel:/>yum install kernel-uek-debug-debuginfo-2.6.39-400.300.2.el6uek.x86_64.rpm
Now everything is ready to execute crash:
root@oel:/var/crash/127.0.0.1-2018-08-13-14:24:36>crash /usr/lib/debug/lib/modules/2.6.39-400.300.2.el6uek.x86_64.debug/vmlinux /var/crash/127.0.0.1-2018-08-13-14\:24\:36/vmcore
DUMPFILE: /tmp/vmcore [PARTIAL DUMP]
CPUS: 2
DATE: Thu May 5 14:32:50 2011
UPTIME: 00:01:15
LOAD AVERAGE: 1.19, 0.34, 0.12
TASKS: 252
NODENAME: oel6
RELEASE: 2.6.39-400.300.2.el6uek.x86_64
VERSION: #1 SMP Mon Aug 21 19:45:17 CET 2018
MACHINE: x86_64 (3214 Mhz)
MEMORY: 8 GB
PANIC: "Oops: 0002 [#1] SMP " (check log for details)
PID: 91478
COMMAND: "bash"
TASK: ffff81147a3bbb70 [THREAD_INFO: ffff81147a3bbb70]
CPU: 0
STATE: TASK_RUNNING (PANIC)
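From the crash> prompt you can then dig further into the vmcore. A few commands that are usually a good starting point (the exact output depends on your dump):
crash> bt        (backtrace of the task that triggered the panic)
crash> log       (kernel message buffer)
crash> ps        (processes at the time of the crash)
crash> sys       (system summary, the same information as the header above)
crash> exit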
Summary:
When someone lacks the knowledge to find out what is causing an issue, usually the first to be blamed is the network or external consultants.
The purpose of this article is to show what is really going on and how to start troubleshooting, as the only way to move forward is to fix the problem instead of blaming other teams.
Here I have barely scratched the surface of what can be done, and I did not go deep into topics such as the available options, how to prevent interruption of core collection, how to configure the collector, the types of events that can be tracked, crash commands, incomplete cores, how to perform the analysis, etc.
Still, you should get a feeling for where to start, after you have first checked the basics, like inspecting /var/log with appropriate tools.
There are some other very useful tools you might consider, like SystemTap and DTrace, with which you can get more specific information and create scripts targeted at the problematic part of the kernel.
I also want to mention OSWatcher and ExaWatcher as complementary tools.
The job of an OS today is harder than it once was, as many applications, databases and tasks have to be executed simultaneously.
At the same time, today’s servers are equipped with many CPU cores and often with TBs of memory, which also makes the OS’s job harder than ever before.
Still, with an intelligent setup of the OS, databases, applications and architecture running on a host, all tasks can be handled efficiently without jeopardizing system stability.
I hope this article helps you troubleshoot what is causing system instability.