Tuesday, 2 April 2013

Replace a “Hot Swap” mirrored rootvg disk (hdisk1)

In the following example, an AIX system has two disks in rootvg, and every logical volume is mirrored except ibmDIRlv. The boot list contains both hdisk0 and hdisk1. rootvg holds no logical volumes other than the AIX system logical volumes. hdisk1 has failed and needs replacing; both hdisk0 and hdisk1 are in “Hot Swap” carriers, so the machine does not need to be shut down. Steps 9 to 15 of the procedure below apply only to internal disks and not to SAN LUNs.
Prerequisite: Note the physical location (slot) of the disk to be replaced, using lscfg -vl hdisk1
                          hdisk1           U787B.001.DNW8950-P1-T14-L3-L0  16 Bit LVD SCSI Disk Drive (146800 MB)
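As a small sketch of recording the slot: in the lscfg line shown above, the location code is the second whitespace-separated field. The helper name location_code is hypothetical and the sample line is copied from this article; on a live system the output of lscfg -vl hdisk1 would be piped in instead.

```shell
#!/bin/sh
# Hypothetical helper: print the location code (2nd field) from the line
# of `lscfg -vl <disk>` output that describes the given disk.
location_code() {
    awk -v disk="$1" '$1 == disk { print $2; exit }'
}

# Sample line taken from the article, not from a live system.
sample='  hdisk1           U787B.001.DNW8950-P1-T14-L3-L0  16 Bit LVD SCSI Disk Drive (146800 MB)'

printf '%s\n' "$sample" | location_code hdisk1
```

Keeping the location code alongside the change record gives the onsite engineer something to check against before pulling the carrier.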
Step1: If hdisk1 holds logical volumes that have no copy on hdisk0, migrate them first. Here hdisk0 holds a copy of every logical volume on hdisk1 except ibmDIRlv. To move the physical partitions of logical volume ibmDIRlv from hdisk1 to hdisk0, enter
#migratepv -l ibmDIRlv hdisk1 hdisk0
Step2: Unmirror hdisk1 from rootvg (this breaks the mirror).
#unmirrorvg rootvg hdisk1
Step3: Clear the boot record of hdisk1.
#chpv -c hdisk1
Step4: Reduce rootvg by removing hdisk1.
#reducevg rootvg hdisk1
Step5: Set the boot list to hdisk0.
#bootlist -m normal hdisk0     (-m selects the boot mode; here the system will boot in normal mode from hdisk0).
Step6: With hdisk1 removed from rootvg, quorum can no longer be met, so disable quorum checking to keep the volume group from being varied off automatically.
#chvg -Qn rootvg   (With two disks each counting for 50%, quorum needs more than half, i.e. at least 51%, which a single disk cannot provide, so quorum has to be disabled).
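Step7: Synchronize the ODM with the LVM data, preserving the ownership and permissions of the logical volume special files.
#synclvodm -Pv rootvg
Step8: Remove the hdisk1 definition from the server. Note that rmdev takes the logical device name, not the /dev path.
#rmdev -dl hdisk1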
Step9: Turn on the identify indicator for hdisk1 as shown below. The lit indicator on the hdisk1 slot helps the onsite engineer locate the disk.
diag --> Task Selection --> Identify and Attention Indicators
Step10: Perform the following, then ask the onsite engineer to physically remove the failed disk from the server and insert the new disk.
diag --> Enter --> Task Selection --> Hot Plug Task --> SCSI and SCSI RAID Hot Plug Manager --> select Replace/Remove --> select hdisk1 --> select device has been removed/replaced
Step11: Press Enter. The disk drive slot exits the Remove state and enters the Normal state.
Step12: Scan for the new device by running cfgmgr.
Step13: Update the system log using diag.
diag --> Task Selection --> Log Repair Action --> hdisk1 --> F3
Step14: Check the error log with errpt | more
Step15: Certify the device using diag.
diag --> Task Selection --> Certify Media --> hdisk1
Step16: Add hdisk1 to rootvg
#extendvg rootvg hdisk1
Step17: Mirror and Sync rootvg with disk hdisk1
#mirrorvg -S rootvg
Step18: Create a boot image on hdisk1
#bosboot -ad /dev/hdisk1
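The LVM commands from Steps 2 to 8 and Steps 16 to 18 can be rehearsed with a dry-run wrapper before anything is touched. This is only a sketch: the names run, detach_disk, attach_disk and DRY_RUN are hypothetical, and by default the script prints the commands rather than executing them. The final bootlist line is an assumption that is not part of the steps above; it restores the two-disk boot list described in the introduction. The physical swap and the diag steps (9 to 15) still happen between the two functions.

```shell
#!/bin/sh
# Dry-run sketch: commands are only printed unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Steps 2-8: take the failed disk out of the volume group.
# Stops at the first command that fails.
detach_disk() {
    vg=$1 good=$2 bad=$3
    run unmirrorvg "$vg" "$bad"    || return 1
    run chpv -c "$bad"             || return 1
    run reducevg "$vg" "$bad"      || return 1
    run bootlist -m normal "$good" || return 1
    run chvg -Qn "$vg"             || return 1
    run synclvodm -Pv "$vg"        || return 1
    run rmdev -dl "$bad"
}

# Steps 16-18: bring the replacement disk back into the mirror.
attach_disk() {
    vg=$1 good=$2 new=$3
    run extendvg "$vg" "$new"      || return 1
    run mirrorvg -S "$vg"          || return 1
    run bosboot -ad "/dev/$new"    || return 1
    # Assumption, not in the original steps: boot from both disks again.
    run bootlist -m normal "$good" "$new"
}

detach_disk rootvg hdisk0 hdisk1
# ... physical replacement and diag steps 9-15 happen here ...
attach_disk rootvg hdisk0 hdisk1
```

Reading the printed command list against the failed disk's name before setting DRY_RUN=0 is a cheap guard against mixing up hdisk0 and hdisk1.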


  1. I think you get your hdisk1/0 mixed up here and there. you turned on the attention indicator for hdisk1 ...you said hdisk0 was failed ... after this your system fails ...

  2. Have corrected them, thanks for informing us.