
Saturday, 26 July 2014

PowerHA/HACMP: Moving a Resource Group (RG) from one node to another

We are going to discuss how to move a resource group (RG) from one node to another in PowerHA.
Here are the steps:

1) Extend the PATH variable with the cluster paths

The cluster paths are sometimes not included in the default PATH; run the command below if you are not able to run the cluster commands directly.
export PATH=$PATH:/usr/es/sbin/cluster:/usr/es/sbin/cluster/utilities:/usr/es/sbin/cluster/sbin:/usr/es/sbin/cluster/cspoc
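
To make these paths available in every new login shell, the same directories can also be appended to the PATH entry in /etc/environment (the PowerHA 7.1 article later in this blog recommends the same for the cluster utilities). A minimal sketch, taking a backup copy first:

# cp /etc/environment /etc/environment.bak
# vi /etc/environment     (append :/usr/es/sbin/cluster:/usr/es/sbin/cluster/utilities to the PATH= line)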

2) Check whether the cluster services are up on the destination node

#clshowsrv -v
Status of the RSCT subsystems used by HACMP:
Subsystem         Group            PID          Status
 topsvcs          topsvcs          278684       active
 grpsvcs          grpsvcs          332026       active
 grpglsm          grpsvcs                       inoperative
 emsvcs           emsvcs           446712       active
 emaixos          emsvcs           294942       active
 ctrmc            rsct             131212       active

Status of the HACMP subsystems:
Subsystem         Group            PID          Status
 clcomdES         clcomdES         204984       active
 clstrmgrES       cluster          86080        active

Status of the optional HACMP subsystems:
Subsystem         Group            PID          Status
 clinfoES         cluster          360702       active

3) Check the current state of the resource group

# clRGinfo
-----------------------------------------------------------------------------
Group Name     Type           State      Location
-----------------------------------------------------------------------------
UMRG1            non-concurrent OFFLINE    umhaserv1
                                ONLINE     umhaserv2
#

4) Move the resource group using the command below

==>  clRGmove -g <RG> -n  <node> -m

# clRGmove -g UMRG1 -n umhaserv1 -m
Attempting to move group UMRG1 to node umhaserv1.
Waiting for cluster to process the resource group movement request....
Waiting for the cluster to stabilize..................
Resource group movement successful.
Resource group UMRG1 is online on node umhaserv1.

You can also use the smitty path:

smitty cl_admin => HACMP Resource Group and Application Management => Move a Resource Group to Another Node / Site

5) Verify the RG movement

# clRGinfo
-----------------------------------------------------------------------------
Group Name     Type           State      Location
-----------------------------------------------------------------------------
UMRG1          non-concurrent   ONLINE     umhaserv1
                                OFFLINE    umhaserv2
#

Wednesday, 24 July 2013

Importvg in PowerHA (HACMP)


There is a slight difference when we deal with importvg in a PowerHA (HACMP) cluster environment: we need to give the additional -V flag with the volume group's major number, so that the major number is the same on all cluster nodes.

1) First, get the major number

    On the node where the volume group is already defined, check its device file:
      # ls -l /dev/<volume_group_name>        (the major number is the first of the two device numbers shown)
    Example:
      # ls -l /dev/applvg
crw-rw---- 1 root system 48, 0 Jul 16 23:02 applvg

Note: here "applvg" is the volume group name and its major number is 48.

2) Importvg

# importvg -V <major_number> -y <volume_group_name> <hdisk_number>
# importvg -V 48 -y applvg hdisk7
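
Before importing, it is also worth confirming that the major number you pass with -V is actually free on the node performing the import; lvlstmajor lists the free major numbers on a node. A quick sketch (the output below is only illustrative):

# lvlstmajor
43..98

Here 48 falls inside a free range, so importvg -V 48 can use it on this node as well.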

Tuesday, 23 July 2013

PuTTYCM (PuTTY Connection Manager) alternatives

This post is a continuation of my earlier post on PuTTYCM. There are a few alternatives to PuTTYCM, and you can try them as well.
1) SuperPutty:
This is very similar to PuTTY CM and also uses PuTTY as the back-end program. You can use different tabs in the same window, which lets you see multiple connections at once.
The look and feel is essentially the same as PuTTY, so if you are already familiar with it, the transition is completely effortless; the PuTTY SSH client simply opens in tabs. Additionally, there is support for SCP to transfer files.

Features

  • Docking user interface allows a personalized workspace and makes managing multiple PuTTY sessions easy
  • Export/Import session configuration
  • Upload files securely using the scp or sftp protocols
  • Layouts
  • Supports PuTTY session configurations including Private Keys
  • Supports SSH, RLogin, Telnet and RAW protocols
  • Supports local shell via MinTTY or puttycyg
  • Supports KiTTY
Download SuperPuTTy

2) Multi-Tabbed Putty:
Multi-Tabbed PuTTY is another free application that uses PuTTY or KiTTY (a forked version of PuTTY) in the background. It provides a clear tabbed user interface for each connection. Servers are grouped in a sidebar, similar to PuTTY Connection Manager. It also allows detaching tabs and converting them into regular PuTTY windows.

Multi-Tabbed Putty

3) Poderosa Connection Manager:
This one is different: it does not build on PuTTY and works independently. Users can split the window space into multiple panes and allocate each pane to a different connection. Telnet and SSH are used to establish connections. Moreover, developers can extend Poderosa by creating new plugins.

Download Poderosa

4) PuTTY Tab Manager:
A Windows tool for managing multiple PuTTY instances in a single tabbed window. PuTTY Tab Manager allows you to run multiple sessions of PuTTY, the well-known SSH and Telnet client for Windows, in one window with tabs.
  • Single executable; no installation needed.
  • Runs on Win32; does not require the Microsoft .NET Framework.
  • Unlimited tabs.
  • PuTTY menu for each open session.
  • Multilingual (English, Spanish).
  • Supports KiTTY (kitty.exe), a fork of PuTTY.
  • Portable version.
  • Send scripts.
  • Automatic capture mode: catches all PuTTY sessions that are open or opening.
  • Option to launch sessions from a file (-f <file>).
  • Hotkeys:
    • Add tab: [Ctrl] + [Ins] / [Ctrl] + [+]
    • Next tab: [Ctrl] + [Tab] / [Ctrl] + [Page Down]
    • Previous tab: [Ctrl] + [Shift] + [Tab] / [Ctrl] + [Page Up]
Download PuTTYTabManager

5) AutoPuTTY:
AutoPuTTY is a simple connection manager and launcher.
It is written in C#, so you will need at least Microsoft .NET Framework Version 2.0.
What you can do with it:
- Manage a server list and connect through PuTTY, WinSCP, Microsoft Remote Desktop and VNC (only VNC 3.3 encryption is currently supported for passwords)
- Easily connect to multiple servers at once
- Import a list from a simple text file
- Protect the application startup with a password (note that the list is always encrypted)
Download AutoPuTTY

6) Tunnelier:
Tunnelier is an SSH and SFTP client for Windows that incorporates:
  • one of the most advanced graphical SFTP clients;
  • state-of-the-art terminal emulation with support for the bvterm, xterm, and vt100 protocols;
  • support for corporation-wide single sign-on using SSPI (GSSAPI) Kerberos 5 and NTLM user authentication, as well as Kerberos 5 host authentication;
  • support for RSA and DSA public key authentication with comprehensive user keypair management;
  • powerful SSH port forwarding capabilities, including dynamic forwarding through integrated SOCKS and HTTP CONNECT proxy;
  • powerful command-line parameters which make the SSH client highly customizable and suitable for use in specific situations and controlled environments;
  • an advanced, scriptable command-line SFTP client (sftpc);
  • a scriptable command-line remote execution client (sexec) and a command-line terminal emulation client (stermc);
  • an FTP-to-SFTP bridge allowing you to connect to an SFTP server using legacy FTP applications;
  • Bitvise SSH Server remote administration features;
  • single-click Remote Desktop forwarding.
Download Tunnelier


7) SecureCRT:
SecureCRT® combines rock-solid terminal emulation with the strong encryption, broad range of authentication options, and data integrity of the Secure Shell protocol for secure network administration and end user access.

Download SecureCRT
8) mRemote:
mRemoteNG is a fork of mRemote, an open source, tabbed, multi-protocol, remote connections manager. mRemoteNG adds bug fixes and new features to mRemote.
It allows you to view all of your remote connections in a simple yet powerful tabbed interface.
mRemoteNG supports the following protocols:
  • RDP (Remote Desktop/Terminal Server)
  • VNC (Virtual Network Computing)
  • ICA (Citrix Independent Computing Architecture)
  • SSH (Secure Shell)
  • Telnet (TELecommunication NETwork)
  • HTTP/HTTPS (Hypertext Transfer Protocol)
  • rlogin
  • Raw Socket Connections
Download mRemote


Tuesday, 23 April 2013

IBM PowerHA 7.1 heartbeat over SAN

Introduction

IBM PowerHA System Mirror for AIX is clustering software which gives the capability for a resource or group of resources (an application) to be automatically or manually moved to another IBM AIX® system in the event of a system failure.

Heartbeat and failure detection is performed over all interfaces available to the cluster. This could be network interfaces, Fibre Channel (FC) adapter interfaces, and the Cluster Aware AIX (CAA) repository disk.

In PowerHA 6.1 and earlier versions, heartbeat over FC adapter interfaces was not supported, and instead, a SAN-attached heartbeat disk was made available to both nodes, and this was used for heartbeat and failure detection. In PowerHA 7.1, the use of heartbeat disks is no longer supported, and configuring heartbeat over SAN is the supported method to use in place of heartbeat disks.

For this heartbeat over SAN to take place, the FC adapter in the AIX system needs to be configured to act as a target and an initiator. In most SAN environments, an initiator device belongs to the server, which is typically a host bus adapter (HBA), and a target is typically a storage device, such as a storage controller or a tape device. The IBM AIX 7.1 Information Center contains a list of supported FC adapters that can support the target mode. These adapters can be used for heartbeat over SAN.

Overview

In this article, I have provided simple examples of how to set up the SAN heartbeat in two scenarios; the first example with two AIX systems using physical I/O and the other example with two AIX logical partitions (LPARs) using Virtual I/O Server and N-Port ID Virtualization (NPIV).

In each of the examples, we have a two-node PowerHA 7.1 cluster, with each node residing on a different IBM POWER® processor-based server. This article does not cover how to configure shared storage, advanced network communications, or application controllers. This is a practical example of how to build a very simple cluster, and get the SAN heartbeat working.

Requirements

The following minimum requirements must be met to ensure that we can create the cluster and configure the SAN heartbeat:

  • AIX 6.1 or preferably AIX 7.1 needs to be installed on both AIX systems, using the latest technology level and service pack.
  • PowerHA 7.1 needs to be installed on both AIX systems, using the latest service pack.
  • The FC adapters in the servers must support target mode, and if NPIV is in use, they must be 8 GBps adapters supporting NPIV. NPIV support is required for Scenario 2 that is explained in this article.
  • If Virtual I/O Server is in use, then the VIOS code should be the latest service pack of IOS 2.2. This is required for Scenario 2 in this article.
  • If NPIV is in use, then the fabric switches must have NPIV support enabled, and be on a supported level of firmware. This is required for Scenario 2 that is explained in this article.
  • There must be a logical unit number (LUN) allocated to both AIX systems for use as the CAA repository disk.
  • There must be a LUN allocated to both AIX systems for use as shared storage for the cluster.

Scenario 1: Two nodes using physical I/O

In this scenario, we have a very simple environment where there are two POWER processor-based systems, each with a single instance of AIX. These systems are in a PowerHA cluster and connected through redundant SAN fabrics to shared storage.

The following figure gives a high-level overview of this scenario.

Figure 1. Overview of scenario 1

SAN zoning requirements

 Before the cluster can be created, SAN zoning is required. You need to configure the following two types of zones.

  • Storage zones
  • Heartbeat zones
To configure the zoning, first log in to each of the nodes, verify that the FC adapters are available, and capture the worldwide port number (WWPN) of each adapter port, as shown in the following example.
root@ha71_node1:/home/root# lsdev -Cc adapter |grep fcs
fcs0   Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs1   Available 03-T1 8Gb PCI Express Dual Port FC Adapter
fcs2   Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs3   Available 03-T1 8Gb PCI Express Dual Port FC Adapter
root@ha71_node1:/home/root# for i in `lsdev -Cc adapter |awk '{print $1}' 
|grep fcs `; do print ${i} - $(lscfg -vl $i |grep Network |awk '{print $2}' 
|cut -c21-50| sed 's/../&:/g;s/:$//'); done                                            
fcs0 - 10:00:00:00:C9:CC:49:44
fcs1 - 10:00:00:00:C9:CC:49:45
fcs2 - 10:00:00:00:C9:C8:85:CC
fcs3 - 10:00:00:00:C9:C8:85:CD
root@ha71_node1:/home/root#
root@ha71_node2:/home/root# lsdev -Cc adapter |grep fcs
fcs0   Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs1   Available 03-T1 8Gb PCI Express Dual Port FC Adapter
fcs2   Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs3   Available 03-T1 8Gb PCI Express Dual Port FC Adapter
root@ha71_node2:/home/root# for i in `lsdev -Cc adapter |awk '{print $1}' 
|grep fcs `; do print ${i} - $(lscfg -vl $i |grep Network |awk '{print $2}' 
|cut -c21-50| sed 's/../&:/g;s/:$//'); done
fcs0 - 10:00:00:00:C9:A9:2E:96
fcs1 - 10:00:00:00:C9:A9:2E:97
fcs2 - 10:00:00:00:C9:CC:2A:7C
fcs3 - 10:00:00:00:C9:CC:2A:7D
root@ha71_node2:/home/root#
After the WWPNs are known, zoning can be performed on the fabric switches. Zone the HBA adapters to the storage ports on the storage controller used for the shared storage, and also create zones that can be used for the heartbeat. The following diagram gives an overview of how the heartbeat zones should be created.

Figure 2. Overview of creating heartbeat zones (scenario 1)
Ensure that you zone one port from each FC adapter on the first node to another port on each FC adapter on the second node.
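
As an illustration only, on a Brocade-based fabric the heartbeat zone joining fcs0 of node 1 to fcs0 of node 2 could be created roughly as shown below, using the WWPNs captured earlier. The zone name and the zoning configuration name (prod_cfg) are hypothetical; use your fabric's existing configuration name, and repeat the step for the fcs2 pair on the second fabric.

zonecreate "ha71_sanhb_fcs0", "10:00:00:00:c9:cc:49:44; 10:00:00:00:c9:a9:2e:96"
cfgadd "prod_cfg", "ha71_sanhb_fcs0"
cfgsave
cfgenable "prod_cfg"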

Device configuration in AIX

After the zoning is complete, the next step is to enable target mode on each of the adapter devices in AIX. This needs to be performed on each adapter that has been used for a heartbeat zone. In the SAN zoning example, the adapters fcs0 and fcs2 on each node have been used for the SAN heartbeat zones.

For target mode to be enabled, both dyntrk (dynamic tracking) and fast_fail need to be enabled on the fscsi device, and target mode needs to be enabled on the fcs device.

To enable target mode, perform the following steps on both nodes.
root@ha71_node1:/home/root# rmdev -l fcs0 -R
fscsi0 Defined
fcs0 Defined
root@ha71_node1:/home/root# rmdev -l fcs2 -R
fscsi2 Defined
fcs2 Defined
root@ha71_node1:/home/root# chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail
fscsi0 changed
root@ha71_node1:/home/root# chdev -l fscsi2 -a dyntrk=yes -a fc_err_recov=fast_fail
fscsi2 changed
root@ha71_node1:/home/root# chdev -l fcs0 -a tme=yes
fcs0 changed
root@ha71_node1:/home/root# chdev -l fcs2 -a tme=yes
fcs2 changed
root@ha71_node1:/home/root# cfgmgr
root@ha71_node1:/home/root#
root@ha71_node2:/home/root# rmdev -l fcs0 -R
fscsi0 Defined
fcs0 Defined
root@ha71_node2:/home/root# rmdev -l fcs2 -R
fscsi2 Defined
fcs2 Defined
root@ha71_node2:/home/root# chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail
fscsi0 changed
root@ha71_node2:/home/root# chdev -l fscsi2 -a dyntrk=yes -a fc_err_recov=fast_fail
fscsi2 changed
root@ha71_node2:/home/root# chdev -l fcs0 -a tme=yes
fcs0 changed
root@ha71_node2:/home/root# chdev -l fcs2 -a tme=yes
fcs2 changed
root@ha71_node2:/home/root# cfgmgr
root@ha71_node2:/home/root#
If the devices are busy, make the changes with the -P option at the end of the command, and restart the server. This will cause the change to be applied at the next start of the server.
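
For example, if the fcs0 devices cannot be removed because they are busy, the same attribute changes can be staged for the next boot; a short sketch:

root@ha71_node1:/home/root# chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail -P
fscsi0 changed
root@ha71_node1:/home/root# chdev -l fcs0 -a tme=yes -P
fcs0 changed
root@ha71_node1:/home/root# shutdown -Fr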

The target mode setting can be verified by checking the attributes of the fscsi devices. The following example shows how to check fscsi0 and fcs0 on one of the nodes. This should be checked on each of the fcs0 and fcs2 adapters on both nodes.
root@ha71_node1:/home/root# lsattr -El fscsi0
attach       switch    How this adapter is CONNECTED         False
dyntrk       yes       Dynamic Tracking of FC Devices        True
fc_err_recov fast_fail FC Fabric Event Error RECOVERY Policy True
scsi_id      0xbc0e0a  Adapter SCSI ID                       False
sw_fc_class  3         FC Class for Fabric                   True
root@ha71_node1:/home/root# lsattr -El fcs0 |grep tme
tme           yes        Target Mode Enabled                                True
root@ha71_node1:/home/root#
After the target mode is enabled, we should next look for the available sfwcomm devices. These devices are used for the PowerHA error detection and heartbeat over SAN.

Check whether these devices are available on both nodes.
root@ha71_node1:/home/root# lsdev -C |grep sfwcomm
sfwcomm0      Available 02-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm1      Available 03-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm2      Available 02-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm3      Available 03-T1-01-FF Fibre Channel Storage Framework Comm
root@ha71_node1:/home/root#

root@ha71_node2:/home/root# lsdev -C |grep sfwcomm
sfwcomm0      Available 02-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm1      Available 03-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm2      Available 02-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm3      Available 03-T1-01-FF Fibre Channel Storage Framework Comm
root@ha71_node2:/home/root#

Scenario 2: Two nodes using Virtual I/O Server

In this scenario, we have a slightly more complex environment with two POWER processor-based systems, each with dual VIOS and client LPARs using virtual I/O. These LPARs are in a PowerHA cluster and are connected through redundant SAN fabrics to shared storage.

When using VIOS, what differs from the physical I/O scenario is that the FC ports of the Virtual I/O Server must be zoned together. There is then a private virtual LAN (VLAN) with the port VLAN ID of 3358 (3358 is the only VLAN ID that will work) used to carry the heartbeat communication over the hypervisor from the Virtual I/O Server to the client LPAR, which is our PowerHA node.

In this case, the following high-level steps are required.

  1. Turn on target mode on the VIOS FC adapters.
  2. Zone the VIOS ports together.
  3. Configure the private 3358 VLAN for heartbeat traffic.
  4. Configure the PowerHA cluster.
The following figure gives a high-level overview of this scenario.

Figure 3. Overview of Scenario 2

SAN zoning requirements

Before the cluster can be created, SAN zoning is required. You need to configure the following two types of zones.

  • Storage zones
    • Contains the LPAR's virtual WWPNs
    • Contains the storage controller's WWPNs
  • Heartbeat zones (contains the VIOS physical WWPNs)
    • The VIOS on each machine should be zoned together.
    • The virtual WWPNs of the client LPARs should not be zoned together.
When performing the zoning, log in to each of the VIOS (both VIOS on each managed system) and verify that the FC adapters are available, and capture the WWPN information for zoning. The following example shows how to perform this step on one VIOS.
$ lsdev -type adapter |grep fcs
fcs0   Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs1   Available 03-T1 8Gb PCI Express Dual Port FC Adapter
fcs2   Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs3   Available 03-T1 8Gb PCI Express Dual Port FC Adapter
$ for i in `lsdev -type adapter |awk '{print $1}' |grep fcs `; 
do print ${i} - $(lsdev -dev $i -vpd 
|grep Network |awk '{print $2}' |sed 's/Address.............//g'
| sed 's/../&:/g;s/:$//'); done
fcs0 - 10:00:00:00:C9:B7:65:32
fcs1 - 10:00:00:00:C9:B7:65:33
fcs2 - 10:00:00:00:C9:B7:63:60
fcs3 - 10:00:00:00:C9:B7:63:61
The virtual WWPNs also need to be captured from the client LPAR for the storage zones. The following example shows how to perform this step on both nodes.
root@ha71_node1:/home/root# lsdev -Cc adapter |grep fcs
fcs0   Available 02-T1 Virtual Fibre Channel Client Adapter
fcs1   Available 03-T1 Virtual Fibre Channel Client Adapter
fcs2   Available 02-T1 Virtual Fibre Channel Client Adapter
fcs3   Available 03-T1 Virtual Fibre Channel Client Adapter
root@ha71_node1:/home/root# for i in `lsdev -Cc adapter |awk '{print $1}' 
|grep fcs `; do print ${i} - $(lscfg -vl $i |grep Network |awk '{print $2}' 
|cut -c21-50| sed 's/../&:/g;s/:$//'); done
fcs0 - c0:50:76:04:f8:f6:00:40
fcs1 - c0:50:76:04:f8:f6:00:42
fcs2 - c0:50:76:04:f8:f6:00:44
fcs3 - c0:50:76:04:f8:f6:00:46
root@ha71_node1:/home/root#
root@ha71_node2:/home/root# lsdev -Cc adapter |grep fcs
fcs0   Available 02-T1 Virtual Fibre Channel Client Adapter
fcs1   Available 03-T1 Virtual Fibre Channel Client Adapter
fcs2   Available 02-T1 Virtual Fibre Channel Client Adapter
fcs3   Available 03-T1 Virtual Fibre Channel Client Adapter
root@ha71_node2:/home/root# for i in `lsdev -Cc adapter |awk '{print $1}' 
|grep fcs `; do print ${i} - $(lscfg -vl $i |grep Network |awk '{print $2}' 
|cut -c21-50| sed 's/../&:/g;s/:$//'); done                                            
fcs0 - C0:50:76:04:F8:F6:00:00
fcs1 - C0:50:76:04:F8:F6:00:02
fcs2 - C0:50:76:04:F8:F6:00:04
fcs3 - C0:50:76:04:F8:F6:00:06
root@ha71_node2:/home/root#
After the WWPNs are known, zoning can be performed on the fabric switches. Zone the LPAR’s virtual WWPNs to the storage ports on the storage controller used for the shared storage, and also create zones containing the VIOS physical ports, which will be used for the heartbeat. The following figure gives an overview of how the heartbeat zones should be created.

Figure 4. Overview of creating heartbeat zones (scenario 2)

Virtual I/O Server FC adapter configuration

After the zoning is complete, the next step is to enable target mode on each of the adapter devices in each VIOS. This needs to be performed on each adapter that has been used for a heartbeat zone. In the SAN zoning example, the fcs0 and fcs2 adapters on each node have been used for the SAN heartbeat zones.

For target mode to be enabled, both dyntrk (dynamic tracking) and fast_fail need to be enabled on the fscsi device, and target mode needs to be enabled on the fcs device.

To enable target mode, perform the following steps on both VIOS on each managed system.
$ chdev -dev fscsi0 -attr dyntrk=yes fc_err_recov=fast_fail -perm
fscsi0 changed
$ chdev -dev fcs0 -attr tme=yes -perm
fcs0 changed
$ chdev -dev fscsi2 -attr dyntrk=yes fc_err_recov=fast_fail -perm
fscsi2 changed
$ chdev -dev fcs2 -attr tme=yes -perm
fcs2 changed
$ shutdown -restart

A restart of each VIOS is required, and therefore, it is strongly recommended to modify one VIOS at a time.

Virtual I/O Server network configuration

When VIOS is in use, the physical FC adapters belonging to the VIOS are zoned together. This provides connectivity between the VIOS on each managed system; however, for the client LPAR (PowerHA node), a private VLAN must be configured to provide connectivity.

The VLAN ID must be 3358 for this to work. The following figure describes the virtual Ethernet setup.

Figure 5. Virtual Ethernet setup
First, log in to each of the VIOS, and add an additional VLAN to each shared Ethernet bridge adapter. This provides the VIOS connectivity to the 3358 VLAN.

The following figure shows how this additional VLAN can be added to the bridge adapter.

Figure 6. Adding the additional VLAN to the bridge adapter
Next, create a virtual Ethernet adapter on the client partition, and set the port VLAN ID to 3358. This provides the client LPAR connectivity to the 3358 VLAN.

From AIX, run the cfgmgr command to pick up the virtual Ethernet adapter.

Do not put an IP address on this interface.

Figure 7. Creating a virtual Ethernet adapter on the client partition
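Once cfgmgr has picked up the new adapter on the client LPAR, you can confirm that it carries port VLAN ID 3358 before building the cluster. A quick check (the adapter name ent1 and the output are only illustrative):

root@ha71_node1:/home/root# lsdev -Cc adapter | grep "Virtual I/O Ethernet"
ent1      Available   Virtual I/O Ethernet Adapter (l-lan)
root@ha71_node1:/home/root# entstat -d ent1 | grep "Port VLAN ID"
Port VLAN ID:    3358
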
After this is complete, we can create our PowerHA cluster, and the SAN heartbeat is ready for use.

PowerHA cluster configuration

The first step, before creating the cluster, is to perform the following tasks:

  • Edit /etc/environment and add /usr/es/sbin/cluster/utilities and /usr/es/sbin/cluster/ to the $PATH variable.
  • Populate /etc/cluster/rhosts.
  • Populate /usr/es/sbin/cluster/netmon.cf (a sketch of these last two files follows this list).
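As a minimal sketch of the last two items: /etc/cluster/rhosts simply lists the resolvable host names or IP addresses of all cluster nodes, one per line, and netmon.cf lists addresses outside the cluster that PowerHA can ping when deciding whether a local adapter is down. The node names below match the example cluster that follows; the gateway address in netmon.cf is purely hypothetical.

root@ha71_node1:/home/root# cat /etc/cluster/rhosts
ha71_node1
ha71_node2
root@ha71_node1:/home/root# cat /usr/es/sbin/cluster/netmon.cf
172.16.5.1
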
After this is complete, the cluster can be created using smitty sysmirror or on the command line. In the following example, I have created a simple two-node cluster called ha71_cluster.
root@ha71_node1:/home/root # clmgr add cluster ha71_cluster NODES="ha71_node1 ha71_node2"
Warning: to complete this configuration, a repository disk must be defined.

Cluster Name: ha71_cluster
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
Repository Disk: None
Cluster IP Address:
There are 2 node(s) and 1 network(s) defined

NODE ha71_node1:
        Network net_ether_01
                ha71_node1      172.16.5.251

NODE ha71_node2:
        Network net_ether_01
                ha71_node2      172.16.5.252

No resource groups defined
Initializing..
Gathering cluster information, which may take a few minutes...
Processing...

….. etc…..

Retrieving data from available cluster nodes.  This could take a few minutes.

        Start data collection on node ha71_node1
        Start data collection on node ha71_node2
        Collector on node ha71_node1 completed
        Collector on node ha71_node2 completed
        Data collection complete
        Completed 10 percent of the verification checks
        Completed 20 percent of the verification checks
        Completed 30 percent of the verification checks
        Completed 40 percent of the verification checks
        Completed 50 percent of the verification checks
        Completed 60 percent of the verification checks
        Completed 70 percent of the verification checks
        Completed 80 percent of the verification checks
        Completed 90 percent of the verification checks
        Completed 100 percent of the verification checks
IP Network Discovery completed normally

Current cluster configuration:

Discovering Volume Group Configuration

root@ha71_node1:/home/root #

After creating the cluster definition, the next step is to check whether there is a free disk on each node, so that we can configure the CAA repository.
root@ha71_node1:/home/root# lsdev -Cc disk
hdisk0 Available 00-00-01 IBM MPIO FC 2107
hdisk1 Available 00-00-01 IBM MPIO FC 2107
root@ha71_node1:/home/root# lspv
hdisk0          000966fa5e41e427      rootvg          active
hdisk1          000966fa08520349      None
root@ha71_node1:/home/root#
root@ha71_node2:/home/root# lsdev -Cc disk
hdisk0 Available 00-00-01 IBM MPIO FC 2107
hdisk1 Available 00-00-01 IBM MPIO FC 2107
root@ha71_node2:/home/root# lspv
hdisk0          000966fa46c8abcb         rootvg          active
hdisk1          000966fa08520349         None
root@ha71_node2:/home/root#

From the above example, it is clear that hdisk1 is a free disk on each node, so it can be used for the repository. Next, modify the cluster definition to include hdisk1 as the cluster repository disk.

This can be performed using smitty hacmp or on the command line. The following example shows how to perform this step on the command line.
root@ha71_node1:/home/root # clmgr modify cluster ha71_cluster REPOSITORY=hdisk1
Cluster Name: ha71_cluster
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
Repository Disk: hdisk1
Cluster IP Address:
There are 2 node(s) and 1 network(s) defined

NODE ha71_node1:
        Network net_ether_01
                ha71_node1      172.16.5.251

NODE ha71_node2:
        Network net_ether_01
                ha71_node2      172.16.5.252

No resource groups defined


Current cluster configuration:

root@ha71_node1:/home/root #

The next step is to verify and synchronize the cluster configuration. This can be performed using smitty hacmp or on the command line. The following example shows how to synchronize the cluster topology and resources on the command line.
root@ha71_node1:/home/root # cldare -rt
Timer object autoclverify already exists

Verification to be performed on the following:
        Cluster Topology
        Cluster Resources

Retrieving data from available cluster nodes.  This could take a few minutes.

        Start data collection on node ha71_node1
        Start data collection on node ha71_node2
        Collector on node ha71_node2 completed
        Collector on node ha71_node1 completed
        Data collection complete

Verifying Cluster Topology...

        Completed 10 percent of the verification checks

WARNING: Multiple communication interfaces are recommended for networks that
use IP aliasing in order to prevent the communication interface from
becoming a single point of failure. There are fewer than the recommended
number of communication interfaces defined on the following node(s) for
the given network(s):

    Node:                                Network:
    ----------------------------------   ----------------------------------
    ha71_node1                           net_ether_01
    ha71_node2                           net_ether_01

        Completed 20 percent of the verification checks
        Completed 30 percent of the verification checks
Saving existing /var/hacmp/clverify/ver_mping/ver_mping.log to
 /var/hacmp/clverify/ver_mping/ver_mping.log.bak
Verifying clcomd communication, please be patient.

Verifying multicast communication with mping.


Verifying Cluster Resources...

        Completed 40 percent of the verification checks
        Completed 50 percent of the verification checks
        Completed 60 percent of the verification checks
        Completed 70 percent of the verification checks
        Completed 80 percent of the verification checks
        Completed 90 percent of the verification checks
        Completed 100 percent of the verification checks
… etc…
Committing any changes, as required, to all available nodes...
Adding any necessary PowerHA SystemMirror entries to 
/etc/inittab and /etc/rc.net for IPAT on node ha71_node1.
Adding any necessary PowerHA SystemMirror entries 
to /etc/inittab and /etc/rc.net for IPAT on node ha71_node2.

Verification has completed normally.
root@ha71_node1:/home/root #

Now that a basic cluster has been configured, the last step is to verify that the SAN heartbeat is up.

The lscluster -i command displays the cluster interfaces and their status. The sfwcom (Storage Framework Communication) interface is the SAN heartbeat.

In the following example, we check this from one of the nodes; the interface state reported at the end of the output is UP, which is good news!
root@ha71_node1:/home/root # lscluster -i sfwcom
Network/Storage Interface Query

Cluster Name:  ha71_cluster
Cluster uuid:  7ed966a0-f28e-11e1-b39b-62d58cd52c04
Number of nodes reporting = 2
Number of nodes expected = 2
Node ha71_node1
Node uuid = 7ecf4e5e-f28e-11e1-b39b-62d58cd52c04
Number of interfaces discovered = 3

Interface number 3 sfwcom
 ifnet type = 0 ndd type = 304
 Mac address length = 0
 Mac address = 0.0.0.0.0.0
 Smoothed rrt across interface = 0
 Mean Deviation in network rrt across interface = 0
 Probe interval for interface = 100 ms
 ifnet flags for interface = 0x0
 ndd flags for interface = 0x9
 Interface state UP
root@ha71_node1:/home/root #
The remaining steps for cluster configuration, such as configuring shared storage, mirror pools, file collections, application controllers, monitors, and so on are not covered in this article.


Draft IBM PowerHA SystemMirror 7.1.2 Enterprise Edition for AIX Redbook

Download and read the draft of the IBM PowerHA SystemMirror 7.1.2 Enterprise Edition for AIX Redbooks publication.

This publication delivers the following table of contents:

Part 1 - Introduction

Chapter 1. Concepts and overview of the IBM PowerHA SystemMirror 7.1.2 Enterprise Edition
Chapter 2. Differences between PowerHA Enterprise Edition 6.1 and 7.1.2
Chapter 3. Planning

Part 2 - Campus-style disaster recovery (stretched clusters)

Chapter 4. Implementing DS8800 HyperSwap
Chapter 5. Cross-site LVM mirroring with PowerHA Standard Edition

Part 3 - Extended disaster recovery (linked clusters)

Chapter 6. Configuring PowerHA SystemMirror Enterprise Edition linked clusters with SVC replication
Chapter 7. Configuring PowerHA SystemMirror Enterprise Edition with XIV replication

Part 4 - System administration, monitoring, maintenance, and management

Chapter 8. Migrating to PowerHA SystemMirror 7.1.2 Enterprise Edition
Chapter 9. PowerHA 7.1.2 Systems Director plug-in enhancements
Chapter 10. Cluster partition management

Part 5 - Appendices

Appendix A. Configuring PowerHA with IPv6

Appendix B. DNS change for the IBM Systems Director environment with PowerHA

This IBM Redbooks publication can be downloaded at the following website:
http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg248106.html?Open

Wednesday, 17 April 2013

All About HACMP/PowerHA

PowerHA/SystemMirror

PowerHA implementation steps

PowerHA Limits

Component                                     Maximum number / other limits
Nodes                                         32
Resource groups                               64
Networks                                      48
Network interfaces, devices, and labels       256
Cluster resources                             While 128 is the maximum that clinfo can handle, there can be more in the cluster
Parent-child dependencies                     Max of 3 levels
Sites                                         2
Interfaces                                    7 interfaces per node per network
Application monitors per site                 128
Persistent IP alias                           One per node per network
XD_data networks                              4 per cluster
GLVM modes                                    Synchronous, asynchronous, non-concurrent
GLVM devices                                  All PVs supported by AIX; no need to be the same locally and remotely

 Log Files

File                                            Description
/var/hacmp/adm/cluster.log                      Generated by cluster scripts and daemons
/var/hacmp/log/hacmp.out                        Generated by event scripts and utilities
/var/hacmp/adm/history/cluster.mmddyyyy         Cluster history files, generated daily
/var/hacmp/clcomd/clcomd.log                    Generated by the clcomd daemon
/var/hacmp/clcomd/clcomddiag.log                Generated by the clcomd daemon (debug information)
/var/hacmp/clverify/clverify.log                Generated by the cluster verification utility
/var/hacmp/log/autoverify.log                   Generated by auto verify and synchronize
/var/hacmp/log/clavan.log                       Generated by the Application Availability Analysis tool
/var/hacmp/log/clinfo.log                       Generated by a client node running clinfo
/var/hacmp/log/cl_testtool.log                  Generated by the Cluster Test Tool
/var/hacmp/log/clconfigassist.log               Generated by the Two-Node Cluster Configuration Assistant
/var/hacmp/log/clstrmgr.debug                   Generated by the clstrmgr daemon
/var/hacmp/log/clstrmgr.debug.long              Detailed information from the clstrmgr daemon
/var/hacmp/log/clutils.log                      Generated by cluster utilities and file propagation
/var/hacmp/log/cspoc.log                        Generated by C-SPOC commands
/var/hacmp/log/cspoc.log.remote                 Detailed information from C-SPOC commands
/var/hacmp/log/migration.log                    Generated by cluster migration
/var/hacmp/log/sa.log                           Generated by Application Discovery
"odmget HACMPlogs"                              Displays a complete list of HACMP log files
/var/ha/log/topsvcs.default                     Topology Services startup configuration log
/var/ha/log/topsvcs.dd.hhmmss.lang              Topology Services start/stop log
/var/ha/log/topsvcs.dd.hhmmss                   Topology Services activity log
/var/ha/log/nim.topsvcs.IF.clustername          NIM heartbeat activity log for each interface
/var/ha/log/nmDiag.topsvcs.IF.cluster           NIM diagnostic log for each interface
/var/ha/log/grpsvcs.default.nodenum_instnum     Group Services startup log
/var/ha/log/grpsvcs_nodenum_instnum             Group Services activity log

Useful HACMP Commands

Command                   Purpose
clstat                    Displays topology and resource group status (clinfoES and snmpd must be running).
cldump                    Displays topology and resource group status and configuration (snmpd must be running).
cldisp                    Like cldump, but application oriented (snmpd must be running).
cltopinfo (cllsif)        Displays topology configuration.
clRGinfo (clfindres)      Displays resource group status.
clshowres                 Displays resource group configuration.
clshowsrv                 Calls lssrc to display the status of the HACMP subsystems (clshowsrv -a) or the HACMP and RSCT subsystems (clshowsrv -v).
clcycle                   Rotates selected log files.
clgetactivenodes          Displays active nodes. Must specify which node to ask (-n node).
clsnap                    Saves HACMP log files and configuration information.
cl_ls_shared_vgs          Lists shared VGs.
cl_lsfs                   Lists shared file systems.
cllsgrp                   Lists the resource groups.
cllsres                   Shows short resource group information.
clRGmove                  Brings a resource group offline or online, or moves it.
lssrc -ls clstrmgrES      Displays cluster services status.

Notes:

Installation changes 

The following AIX configuration changes are made:

1. Files modified:

/etc/hosts
/etc/inittab
/etc/rc.net
/etc/services
/etc/snmpd.conf
/etc/snmpd.peers
/etc/syslog.conf
/etc/trcfmt
/var/spool/cron/crontab/root 

2. The hacmp group is added.  

3. Also, using cluster configuration and verification, the file /etc/hosts can be changed by adding or modifying entries. 

4. The following network options are set to “1” by RSCT topsvcs startup:

- nonlocsrcroute
- ipsrcrouterecv
- ipsrcroutesend
- ipsrcrouteforward
- ip6forwarding

5. The verification utility ensures that the value of each network option is consistent across all cluster nodes for the following settings (a quick manual check is sketched after the list):

- tcp_pmtu_discover
- udp_pmtu_discover
- ipignoreredirects
- routerevalidate
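
A quick way to spot-check these values yourself, before cluster verification flags a mismatch, is to list them on each node and compare the output; a short sketch:

# no -a | egrep "pmtu_discover|ipignoreredirects|routerevalidate"

Run the same command on every cluster node; the reported values should match.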

The cluster communications daemon

With the introduction of clcomdES, there is no need for a /.rhosts file to be configured. The cluster communications daemon is started from inittab, with the entry being created by the installation of PowerHA. The daemon is controlled by the system resource controller, so startsrc, stopsrc and refresh work; in particular, refresh is used to re-read /usr/es/sbin/cluster/etc/rhosts and to move the log files. The cluster communications daemon uses port 6191.
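
Because the daemon runs under the SRC, its state can be checked and its rhosts file re-read without a reboot; a short sketch (the PID shown is illustrative):

# lssrc -s clcomdES
Subsystem         Group            PID          Status
 clcomdES         clcomdES         204984       active
# refresh -s clcomdES
# grep 6191 /etc/services     (the clcomd entry for port 6191 should be listed here)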

Resource group components:

Service IP Label
Volume Group
Filesystem
Application Server
NFS mounts
NFS exports

Resource group Startup options:

Online on home node only.
Online on first available node.
Online on all available nodes.
Online using distribution policy.

Resource group Fallover options:

Fall over to next priority node in list
Fallover using dynamic node priority
Bring offline (on error only)

Resource group Fallback options:

Fall back to higher priority node in list
Never fall back

Resource group attributes

Settling time
Delayed fallback timers
Distribution policy
Dynamic node priorities
Resource group processing order
Priority override location
Resource group dependencies - parent / child
Resource group dependencies - location

Resource group operations (a clRGmove sketch follows this list)

Bring a resource group offline
Bring a resource group online
Move a resource group to another node/site
Suspend/resume application monitoring
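
The first three operations map directly onto clRGmove, as sketched below with a hypothetical resource group and node name; suspend/resume of application monitoring is normally driven from the same smitty cl_admin menus.

# clRGmove -g appRG -n node2 -d     (bring the resource group offline on node2)
# clRGmove -g appRG -n node2 -u     (bring the resource group online on node2)
# clRGmove -g appRG -n node2 -m     (move the resource group to node2)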

Sources of HACMP information

HACMP manuals come with the product –READ THEM!
Sales Manual: www.ibm.com/common/ssi
/usr/es/sbin/cluster/release_notes

IBM courses:

– HACMP Administration I: Planning and Implementation (AU54/Q1554)
– HACMP Administration II: Administration and Problem Determination (AU61/Q1561)
– HACMP V5 Internals (AU60/Q1560)

IBM Web Site:

– http://www-03.ibm.com/systems/p/ha/

Non-IBM sources (not endorsed by IBM but probably worth a look):

– http://www.matilda.com/hacmp/
– http://groups.yahoo.com/group/hacmp/

Saturday, 6 April 2013

AIX PowerHA (HACMP) Commands


Most commands should work on all PowerHA (HACMP prior to 5.5) versions.
If there is some syntax error, please consult the manual page for that command.

Sometimes the path to the cluster commands is not included in the default PATH variable; to overcome this, run the command below before running the HA commands:

export PATH=$PATH:/usr/es/sbin/cluster:/usr/es/sbin/cluster/utilities:/usr/es/sbin/cluster/sbin:/usr/es/sbin/cluster/cspoc
PowerHA(HACMP) Commands
How to start cluster daemons (options in that order:
 clstrmgr, clsmuxpd, broadcast message, clinfo, cllockd)
clstart -m -s -b -i -l
How to show cluster state and substate (depends on clinfo)
 clstat
SNMP-based tool to show cluster state
 cldump
Similar to cldump, perl script to show cluster state
 cldisp
How to list the local view of the cluster topology
 cltopinfo
How to list the local view of the cluster subsystems
 clshowsrv -a
How to show all necessary info about HACMP
 clshowsrv -v
How to show HACMP version
 lslpp -L | grep cluster.es.server.rte
How to verify the HACMP configuration
 /usr/es/sbin/cluster/diag/clconfig -v -O                                                                                                    
How to list app servers configured including start/stop scripts
 cllsserv
How to locate the resource groups and display their status
 clRGinfo -v
How to rotate some of the log files
 clcycle
A cluster ping program with more arguments
 cl_ping
Cluster rsh program that takes cluster node names as arguments
 clrsh
How to find out the name of the local node
 get_local_nodename
How to check the HACMP ODM
 clconfig
How to put online/offline or move resource groups
 clRGmove
How to list the resource groups
 cllsgrp
How to create a large snapshot of the hacmp configuration
 clsnapshotinfo
How to show short resource group information
 cllsres
How to list the cluster manager state
 lssrc -ls clstrmgrES
Cluster manager states (a quick check is sketched after this list)

  • ST_NOT_CONFIGURED Node never started
  • ST_INIT Node configured but down - not  running
  • ST_STABLE Node up and running
  • ST_RP_RUNNING 
  • ST_JOINING 
  • ST_BARRIER 
  • ST_CBARRIER 
  • ST_VOTING 
  • ST_RP_FAILED Node with event error         
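
The current state can be read straight from the cluster manager subsystem; a quick check (output illustrative):

# lssrc -ls clstrmgrES | grep -i "Current state"
Current state: ST_STABLE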

How to show heartbeat information
 lssrc -ls topsvcs
How to check logs related to hacmp
 odmget HACMPlogs
How to list all information from topology HACMP
 lssrc -ls topsvcs
How to show all info about group
  lssrc -ls grpsvcs
How to list the logs
 cllistlogs
How to list the resources defined for all resource group
 clshowres
How to show resource information by resource group
 clshowres -g'RG'
How to show resource information by node
 clshowres -n'NODE'
How to locate the resource groups and display status (-s)    
 clfindres
How to list the interface name, interface device name, and netmask associated with a specified IP label / IP address of a specific node
 clgetif
Cluster verification utility
 clverify
How to list cluster topology information
 cllscf
X utility for cluster configuration
 xclconfig
X utility for hacmp management 
 xhacmpm
X utility for cluster status
 xclstat
How to force a shutdown of the cluster immediately without releasing resources
 clstop -f -N
How to do graceful shutdown immediately with no takeover
 clstop -g -N
How to do graceful shutdown immediately with takeover
 clstop -gr -N
How to sync the cluster topology
 cldare -t
How to do the mock sync of topology
 cldare -t -f
How to sync the cluster resources
 cldare -r
How to do the mock sync of resources
 cldare -r -f
How to list the name and security level of the cluster
 cllsclstr
How to list the info about the cluster nodes
 cllsnode
How to list info about node69
 cllsnode -i node69
How to list the PVID of the shared hard disk for resource group dataRG
 cllsdisk -g dataRG
How to list all cluster networks
 cllsnw
How to list the details of network ether1
 cllsnw -n ether1
How to show network ip/nonip interface information
 cllsif
How to list the details of network adapter node1_service
 cllsif -n node1_service
How to list the shared vgs which can be accessed by all nodes
 cllsvg
How to list the shared vgs in resource group dbRG
 cllsvg -g dbRG
How to list the shared lvs
 cllslv
How to list the shared lvs in the resource group dbRG
 cllslv -g dbRG
How to list the PVID of disks in the resource group appRG
 cllsdisk -g appRG
How to list the shared file systems
 cllsfs
How to list the shared file systems in the resource group sapRG
 cllsfs -g sapRG
How to show info about all network modules
 cllsnim
How to show info about ether network module
 cllsnim -n ether
How to list the runtime parameters for the node node1
 cllsparam -n node1
How to add a cluster definition with name dcm and id 3
 claddclstr -i 3 -n dcm
How to create resource group sapRG with nodes n1,n2 in cascade
 claddgrp -g sapRG -r cascading -n n1 n2
How to create an application server ser1 with start script /usr/start and stop script /usr/stop
 claddserv -s ser1 -b /usr/start -e /usr/stop
How to change cluster definitions name to dcmds and id to 2
 clchclstr -i 2 -n dcmds
How to change the cluster security to enhanced
 clchclstr -s enhanced
How to delete the resource group appRG and related resources
 clrmgrp -g appRG
How to remove the node node69
 clrmnode -n node69
How to remove the adapter named node69_svc
 clrmnode -a node69_svc
How to remove all resources from resource group appRG
 clrmres -g appRG
How to remove the application server app69
 clrmserv app69
How to remove all application servers
 clrmserv ALL
How to list the nodes with active cluster manager processes from the cluster manager on node node1
 clgetactivenodes -n node1
How to get a pingable address from node node1
 clgetaddr node1
How to list the info about resource group sapRG
 clgetgrp -g sapRG
How to list the participating nodes in the resource group sapRG
 clgetgrp -g sapRG -f nodes
How to get the ip label associated to the resource group
  clgetip sapRG
How to list the network for ip 192.168.100.2, netmask 255.255.255.0
 clgetnet 192.168.100.2 255.255.255.0
How to list the VG of LV nodelv
 clgetvg -l nodelv
How to add node5 to the cluster
 clnodename -a node5
How to change the cluster node name node5 to node3
 clnodename -o node5 -n node3