Friday, 21 February 2014

AIX livedump


What is a livedump file system and how is it created


What is an AIX livedump
The concept of an AIX livedump was introduced in AIX 6.1. A livedump is a small dump taken on a running server; it does not require a system restart. The dumps are saved under /var/adm/ras by default and have a file extension that ends in .DZ. Livedumps are part of an ongoing AIX initiative called RAS: reliability, availability, and serviceability.

Only components registered with the AIX kernel can be dumped. A livedump can be initiated by the kernel, by a kernel extension, or from the command line. Each dump has either a critical or an informational priority. The errdemon records an entry in /var/adm/ras/errlog when a livedump occurs.

The LVM kernel component often registers informational livedumps. These can be triggered by various LVM errors that stem from I/O subsystem problems. For this reason the LVM-related livedump itself can usually be disregarded; what should be investigated is why LVM is having an issue in the first place.

While a livedump is being taken, the system is frozen as the data is written to pinned memory. The system is then unfrozen for the copy to the livedump file system. Livedumps are typically very small, so the freeze lasts only a very short interval. At this time livedumps are written serially, not in parallel.

Livedump in the Error Report
This is an example of a livedump as shown in the output of "errpt -a":
Date/Time:       Wed Nov  6 08:42:43 2013
Sequence Number: 3550
Machine Id:      000CF31BD400
Node Id:         p260vio1
Class:           S
Type:            INFO
WPAR:            Global
Resource Name:   livedump
Live dump complete
Detail Data
File name and message text

This entry notes that this is an informational livedump, shows where the dump is located and how it is named, and shows that it was generated by the emfcdd kernel component. An error while writing a livedump may report LDM_DUMPERR to the error log.
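
To check whether any livedumps have been recorded at all, the error report can be filtered on the Resource Name field shown in the example above. The following is a minimal sketch, not an official tool: the function name count_livedump_entries is hypothetical, and it assumes the "Resource Name:" line has the same layout as in the entry above.

```shell
# Count error-log entries whose Resource Name is "livedump".
# Reads "errpt -a" style text on standard input.
# count_livedump_entries is a hypothetical helper name.
count_livedump_entries() {
    awk '
        # "Resource Name:   livedump" splits into: Resource / Name: / livedump
        /^Resource Name:/ && $3 == "livedump" { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}

# Intended use on an AIX host (not executed here):
#   errpt -a | count_livedump_entries
```

errpt can also select entries by resource name with its -N flag, so "errpt -a -N livedump" should narrow the report before any filtering is needed.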

AIX livedump data is gathered by the snap command when using the "-D" option and is stored in a directory called dumpdata.

AIX livedump File Systems
A new installation of AIX 6.1 or 7.1 will, by default, have an LVM logical volume called livedump, which is mounted as the JFS2 file system /var/adm/ras/livedump. The file system has a default size of 64MB or 1/4 of system memory, whichever is less. However, the physical partition (PP) size of the volume group is also a determining factor in the size of this file system.
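
The sizing rule above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name default_livedump_mb is hypothetical, and the rounding up to a whole number of physical partitions is an assumption about how the PP size comes into play, not something taken from the installer itself.

```shell
# default_livedump_mb MEM_MB PP_MB
# Prints the expected default size in MB: the lesser of 64MB and
# 1/4 of memory, rounded up to whole physical partitions (assumption).
default_livedump_mb() {
    mem_mb=$1
    pp_mb=$2
    quarter=$(( mem_mb / 4 ))
    if [ "$quarter" -lt 64 ]; then
        want=$quarter
    else
        want=64
    fi
    # A logical volume is allocated in whole partitions, so round up.
    pps=$(( (want + pp_mb - 1) / pp_mb ))
    if [ "$pps" -lt 1 ]; then
        pps=1
    fi
    echo $(( pps * pp_mb ))
}
```

For example, with 8192MB of memory the target is 64MB, which occupies one partition whether the PP size is 64MB or 128MB; in the latter case the file system would end up at 128MB.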

A migration from an earlier version of AIX will not create this file system, although a directory may be created under /var/adm/ras.

The crontab for root runs dumpcheck at 3:00 PM daily. One of the checks it makes is whether free space in /var/adm/ras/livedump is greater than the livedump_ threshold, which is set to 25% of the file system space as a default value. If there is insufficient space then a message with the label of DMPCHK_LDMPFSFULL will be written to the AIX error log. If the /var/adm/ras/livedump file system does not exist or is not mounted the check will be made against /var.
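
The free-space comparison described above can be reproduced by hand. The sketch below is not the dumpcheck implementation; livedump_space_ok is a hypothetical helper that only mirrors the rule as stated here (free space must be greater than the threshold, 25% by default).

```shell
# livedump_space_ok TOTAL_KB FREE_KB [THRESHOLD_PCT]
# Succeeds (exit status 0) when free space is strictly greater than
# THRESHOLD_PCT percent of the file system; the default is 25.
livedump_space_ok() {
    total_kb=$1
    free_kb=$2
    threshold_pct=${3:-25}
    # Integer-only comparison: free/total > pct/100
    [ $(( free_kb * 100 )) -gt $(( total_kb * threshold_pct )) ]
}

# Intended use on an AIX host (not executed here), taking the
# "1024-blocks" and "Free" columns from "df -k /var/adm/ras/livedump":
#   livedump_space_ok 262144 131072 && echo "enough free space"
```

A file system that is exactly at the threshold counts as insufficient in this sketch, because the text says the free space must be greater than the threshold.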

If a livedump file system does not exist it can be created as follows:

Livedump Creation
1. The file system will need a minimum of 256MB of space.

2. Type "lsvg rootvg" and look at FREE PPs and PP_SIZE to make sure sufficient space exists in the rootvg to create the livedump logical volume. If the rootvg is mirrored, you will need to take this into account.

3. Create and mount the file system. The following example assumes a PP_SIZE of 64MB.

# mklv -t jfs2 -y livedump rootvg 4
# crfs -v jfs2 -d /dev/livedump -m /var/adm/ras/livedump -A yes

If the mount point does not exist, create it and then mount the file system.
# mkdir /var/adm/ras/livedump
# mount /var/adm/ras/livedump

4. If you need to mirror the livedump logical volume, type

# mklvcopy livedump 2
# syncvg -l livedump
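
The partition arithmetic behind steps 1 to 4 can be sketched as follows. The helper names lps_needed and pps_consumed are hypothetical; the sketch assumes the 256MB minimum from step 1 and that each mirror copy consumes one extra physical partition per logical partition.

```shell
# lps_needed SIZE_MB PP_MB
# Logical partitions needed to hold SIZE_MB, rounded up. This is the
# count passed as the last argument to mklv in step 3.
lps_needed() {
    size_mb=$1
    pp_mb=$2
    echo $(( (size_mb + pp_mb - 1) / pp_mb ))
}

# pps_consumed SIZE_MB PP_MB [COPIES]
# Physical partitions used in the volume group; COPIES is 2 for a
# mirrored logical volume and defaults to 1.
pps_consumed() {
    copies=${3:-1}
    lps=$(lps_needed "$1" "$2")
    echo $(( lps * copies ))
}
```

With a PP_SIZE of 64MB, lps_needed 256 64 gives the 4 used in the mklv example, and pps_consumed 256 64 2 gives 8, which is the number of FREE PPs the rootvg should have available if the logical volume is to be mirrored.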

1 comment:

  1. Hi,

    We do have the "/var/adm/ras/livedump" filesystem created as part of the OS installation, but I never see any activity inside that filesystem.

    How do I enable live dumps in AIX? Can you please give me some idea?
    Is there a way to check whether it is working as expected?