If your server hangs but no useful errors are shown in /var/log/messages, the way to find the cause is to install and configure KDUMP, so that it creates a kernel core dump file when the server hangs. In some cases KDUMP cannot help on Xen PV machines, but it is worth trying in any case; on regular servers it does a really good job.
Here are the steps to install and configure KDUMP:
- Install kexec-tools:
yum install kexec-tools
- Edit /etc/kdump.conf and set the path variable to point to a directory with enough space to hold the kernel dump file (the default location is /var/crash/). The file size will be about the size of the server's RAM plus 1 GB.
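The space requirement above can be checked before a crash happens. The following is a minimal sketch, assuming the "RAM + 1 GB" estimate from this step; the DUMP_DIR variable and the two helper functions are hypothetical names used only for illustration, and DUMP_DIR should be set to whatever path is configured in /etc/kdump.conf.

```shell
#!/bin/sh
# Sketch: compare free space in the dump directory with RAM + 1 GB.
# DUMP_DIR is an assumption; set it to the "path" value from /etc/kdump.conf.
DUMP_DIR="${DUMP_DIR:-/var/crash}"

# Required space in kB: total RAM (kB) plus 1 GB (1048576 kB).
required_kb() {
    echo $(( $1 + 1048576 ))
}

# Read MemTotal (in kB) from a meminfo-style file.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "$1"
}

if [ -r /proc/meminfo ] && [ -d "$DUMP_DIR" ]; then
    need=$(required_kb "$(mem_total_kb /proc/meminfo)")
    # df -Pk prints the available space in kB in column 4 of the second line.
    have=$(df -Pk "$DUMP_DIR" | awk 'NR==2 {print $4}')
    if [ "$have" -ge "$need" ]; then
        echo "OK: ${have} kB free, ${need} kB needed"
    else
        echo "WARNING: only ${have} kB free, ${need} kB needed"
    fi
fi
```

If the warning is printed, point the path variable at a larger filesystem before relying on KDUMP.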
- Edit /etc/grub.conf.
For CloudLinux 5, add the following to the kernel line as another boot parameter:
crashkernel=160M@12M
For CloudLinux 6, add the following to the kernel line as another boot parameter:
crashkernel=160M
For CloudLinux 7, edit /etc/default/grub and add crashkernel=160M to the GRUB_CMDLINE_LINUX parameter so that it looks like:
GRUB_CMDLINE_LINUX="crashkernel=160M rhgb quiet"
Then regenerate the GRUB configuration file with the following command:
grub2-mkconfig -o /boot/grub2/grub.cfg
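The boot parameter only takes effect after a reboot. As a rough sketch, the running kernel's command line in /proc/cmdline can be inspected to confirm that the crashkernel parameter was picked up; the crashkernel_value helper below is a hypothetical name introduced for this example.

```shell
#!/bin/sh
# Print the crashkernel= value found in a kernel command line string.
# Prints nothing if the parameter is absent.
crashkernel_value() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p'
}

if [ -r /proc/cmdline ]; then
    value=$(crashkernel_value "$(cat /proc/cmdline)")
    if [ -n "$value" ]; then
        echo "crashkernel reservation requested: $value"
    else
        echo "crashkernel= is missing from the kernel command line"
    fi
fi
```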
For CloudLinux 5 and 6, add kdump to chkconfig and enable it at boot:
chkconfig --add kdump
chkconfig kdump on
- Modify the /etc/sysctl.conf file and add the following block to catch all possible panic states:
# Enable reboots on panic to allow kdump to make dumps
kernel.sysrq=1
kernel.hung_task_panic = 1
kernel.panic = 1
kernel.panic_on_io_nmi = 1
kernel.panic_on_oops = 1
kernel.panic_on_stackoverflow = 1
kernel.panic_on_unrecovered_nmi = 1
kernel.softlockup_panic = 1
kernel.unknown_nmi_panic = 1
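Some of these keys may not exist on every kernel, in which case sysctl reports an error for them when the file is loaded. The sketch below, which assumes the standard mapping of sysctl keys to files under /proc/sys, lists which of the keys above the running kernel exposes; sysctl_path is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a sysctl key such as kernel.panic to its /proc/sys path.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}

# Report which of the panic-related keys exist on the running kernel.
for key in kernel.sysrq kernel.hung_task_panic kernel.panic \
           kernel.panic_on_io_nmi kernel.panic_on_oops \
           kernel.panic_on_stackoverflow kernel.panic_on_unrecovered_nmi \
           kernel.softlockup_panic kernel.unknown_nmi_panic; do
    if [ -e "$(sysctl_path "$key")" ]; then
        echo "supported:   $key"
    else
        echo "unsupported: $key"
    fi
done

# To load the new values without rebooting (run as root):
# sysctl -p /etc/sysctl.conf
```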
After the server boots, check whether kdump is running with:
service kdump status
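As an additional check, kernels that expose /sys/kernel/kexec_crash_loaded report there whether a crash kernel is actually loaded (1) or not (0). The following is a small sketch under that assumption; the crash_kernel_loaded function is a hypothetical helper, and on older kernels the file may simply be absent, in which case the script reports "not loaded".

```shell
#!/bin/sh
# Succeed if the given flag file is readable and contains "1".
crash_kernel_loaded() {
    [ -r "$1" ] && [ "$(cat "$1")" = "1" ]
}

if crash_kernel_loaded /sys/kernel/kexec_crash_loaded; then
    echo "crash kernel is loaded"
else
    echo "crash kernel is NOT loaded"
fi
```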
Obtaining a core dump when the server hangs is described here.