
Thursday, 19 June 2014

How to Convert OpenSSH to SSH2 and Vice Versa

The program SSH (Secure Shell) provides an encrypted channel for logging into another computer over a network, executing commands on a remote computer, and moving files from one computer to another. SSH provides strong host-to-host and user authentication as well as secure encrypted communications over the Internet.

SSH2 is a more secure, efficient, and portable version of SSH.

Connecting two servers running different types of SSH can be a daunting task if you do not know how to convert the keys. In this article, we are going to learn how to convert keys between SSH (OpenSSH) and SSH2 formats.

How to Generate an OpenSSH (SSH) Key:

umadm@umixserv1 [/home/umadm/.ssh]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/umadm/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/umadm/.ssh/id_rsa.
Your public key has been saved in /home/umadm/.ssh/id_rsa.pub.
The key fingerprint is:
5b:ac:ea:c3:25:cf:2d:31:a2:aa:83:76:4b:a2:c9:eb umadm@umixserv1
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|         .       |
|        S o      |
|. o   . .+       |
|+o o + oo        |
|Bo.   =.         |
|#Eo..oo.         |
+-----------------+
umadm@umixserv1 [/home/umadm/.ssh]$
Here we get two keys: a private key (id_rsa) and a public key (id_rsa.pub), both stored under the ~/.ssh directory.
  
You can generate a DSA key by using the command below.
#ssh-keygen -t dsa

Convert SSH2 to OpenSSH (SSH):


The command below can be used to convert an SSH2 private key into the OpenSSH format:
ssh-keygen -i -f path/to/private.key > path/to/new/opensshprivate.key
The command below can be used to convert an SSH2 public key into the OpenSSH format:
ssh-keygen -i -f path/to/publicsshkey.pub > path/to/publickey.pub
Here, the -i option tells ssh-keygen to read an SSH2-format key and convert it into the OpenSSH format.

Convert OpenSSH(SSH) to SSH2:

The reverse process converts an OpenSSH key into the SSH2 format, in case a client application requires that format. This can be done using the following commands:

OpenSSH to SSH2 Private key conversion:
ssh-keygen -e -f path/to/opensshprivate.key > path/to/ssh2privatekey/ssh2privatekey
OpenSSH to SSH2 Public key conversion:
ssh-keygen -e -f path/to/publickey.pub > path/to/ssh2privatekey/ssh2publickey.pub
Here, the -e option tells ssh-keygen to read an OpenSSH key file and export it in SSH2 format.

Note: If you need passwordless authentication between two hosts running different SSH implementations, you need to convert the public key to the format that the destination server's SSH version expects and append it to ~/.ssh/authorized_keys (OpenSSH) or ~/.ssh2/authorized_keys (SSH2) on the destination server.
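For example, here is a minimal sketch of enabling passwordless login from a host running SSH2 into a host running OpenSSH; the key file name (mykey.pub), user, and host names below are only placeholders:

# On the SSH2 client: copy the SSH2-format public key to the OpenSSH server
scp ~/.ssh2/mykey.pub umadm@umixserv1:/tmp/mykey.pub

# On the OpenSSH server: convert the key to OpenSSH format and append it to authorized_keys
ssh-keygen -i -f /tmp/mykey.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys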

Thursday, 8 May 2014

Tivoli System Automation (TSA) Overview

Introduction

The purpose of this guide is to introduce Tivoli® System Automation for Multiplatforms and provide a quick-start, purpose-driven approach to users that need to use the software, but have little or no past experience with it.

This guide describes the role that TSA plays within IBM’s Smart Analytics System solution and the commands that can be used to manipulate the application. Further, some basic problem diagnosis techniques will be discussed, which may help with minor issues that could be experienced during regular use.
When the Smart Analytics system is built with High Availability, TSA is automatically installed and configured by the ATK. Therefore, this guide will not describe how to install or configure a TSA cluster (domain) from scratch, but rather how to manipulate and work with an existing environment. To learn to define a cluster of servers, please refer to the References appendix for IBM courses that are available.

Terminology

It is advisable to become familiar with the following terms, since they are used throughout this guide. It will also help you become familiar with the scopes of the different components within TSA.
Table 1. Terminology
Term: Definition
Peer Domain: A cluster of servers, or nodes, for which TSA is responsible
Resource: Hardware or software that can be monitored or controlled. These can be fixed or floating. Floating resources can move between nodes.
Resource group: A virtual group or collection of resources
Relationships: Describe how resources work together. A start-stop relationship creates a dependency (see below) on another resource. A location relationship applies when resources should be started on the same or different nodes.
Dependency: A limitation on a resource that restricts operation. For example, if resource A depends on resource B, then resource B must be online for resource A to be started.
Equivalency: A set of fixed resources of the same resource class that provide the same functionality
Quorum: A cluster is said to have quorum when it can form a majority among its nodes. The cluster can lose quorum when there is a communication failure and sub-clusters form with an even number of nodes.
Nominal State: This can be online or offline. It is the desired state of a resource, and can be changed so that TSA will bring a resource online or shut it down.
Tie Breaker: Used to maintain quorum, even in a split-brain situation (as mentioned in the definition of quorum). A tie-breaker allows sub-clusters to determine which set of nodes will take control of the domain.
Failover: When a failure (typically hardware) causes resources to be moved from one machine to another, the resources are said to have "failed over"

Getting Started

The purpose of TSA in the Smart Analytics system is to manage software and hardware resources, so that in the event of a failure they can be restarted or moved to a backup system. TSA uses background scripts to check the status of processes and ensure that everything is working correctly. It also uses "heart-beating" between all the nodes in the domain to ensure that every server is reachable. Should a process fail its status check, or a node fail to respond to a heartbeat, TSA will take appropriate action to bring the system back to its nominal state.
Let’s start with the basics. In a Smart Analytics System, the TSA domain includes the DB2 Admin node, the Data nodes, and any Standby/backup nodes. The management server is not part of the domain and TSA commands will not work there. Further, all TSA commands are run as the root user.
The first thing you want to do is check the status of the domain, and start it if required:
    # lsrpdomain
    Name      OpState RSCTActiveVersion MixedVersions TSPort GSPort 
    bcudomain Online  2.5.3.3           No            12347  12348
In this case it is already started, but if OpState showed "Offline", then the command to start the domain would be:
startrpdomain bcudomain
Notice that the domain name is bcudomain, and it is required for the start command. Likewise, if you want to stop the domain, the command is,
stoprpdomain bcudomain
If TSA is in an unstable state, you can also forcefully shut down the domain using the -f parameter in the stoprpdomain command. However, this is typically not recommended:
stoprpdomain -f bcudomain
You should not stop a domain until all your resources have been properly shut down. If your system uses GPFS to manage the /db2home mount, then you need to manually unmount the GPFS filesystems before you can stop the TSA domain using the following command,
/usr/lpp/mmfs/bin/mmunmount /db2home
Next, you’ll want to check the status of the nodes in the domain. The following command will do this:
        # lsrpnode
        Name      OpState RSCTVersion 
        beluga006 Online  2.5.3.3     
        beluga008 Online  2.5.3.3     
        beluga007 Online  2.5.3.3
You can see that we have 3 nodes in this domain: beluga006, beluga007, and beluga008. This also shows their state. If they are Online, then TSA can work with them. If they are Offline, they are either turned off or TSA cannot communicate with them (and thus unavailable). Nodes don’t always appear in the order that you would expect, so be sure to scan the whole output (in this case, beluga008 shows up before beluga007).
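As a quick illustration, these two checks can be combined in a small snippet run as root; this is only a sketch and assumes the domain is named bcudomain, as in the examples above:

# Start the peer domain only if it is not already Online
if ! lsrpdomain | grep bcudomain | grep -qw Online; then
    startrpdomain bcudomain
fi

# List the nodes and flag any that are not Online
lsrpnode | awk 'NR > 1 && $2 != "Online" {print $1 " is " $2}'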

Resource Groups

After you have verified that the Domain is started, and all your nodes are Online, you will want to check the status of your resources. TSA manages all resources through resource groups. You cannot start a resource individually through TSA. When you start a resource group however, it will start all resources that belong to that group.
To check the status of your DB2 resources, use the hals command. This gives you a summary of all nodes in the peer domain, including their primary and backup locations, current location, and failover state.
+===============+===============+===============+==================+==================+===========+
|  PARTITIONS   |    PRIMARY    |   SECONDARY   | CURRENT LOCATION | RESOURCE OPSTATE | HA STATUS |
+===============+===============+===============+==================+==================+===========+
| 0             | dwadmp1x      | dwhap1x       | dwadmp1x         | Online           | Normal    |
| 1,2,3,4       | dwdmp1x       | dwhap1x       | dwdmp1x          | Online           | Normal    |
| 5,6,7,8       | dwdmp2x       | dwhap1x       | dwdmp2x          | Online           | Normal    |
| 9,10,11,12    | dwdmp3x       | dwhap1x       | dwhap1x          | Online           | Failover  |
| 13,14,15,16   | dwdmp4x       | dwhap1x       | dwdmp4x          | Online           | Normal    |
+===============+===============+===============+==================+==================+===========+
In this example, we see that the admin node is dwadmp1x since it holds partition 0. There are 4 data nodes in this system, and all are in Normal state except for data node 3. We can see that data node 3 is in Failover state and its current location is dwhap1x, the backup server.
The hals command is actually a summary of the complete output. For more detailed information about each resource, use the lssam command. The following output is an example of a cluster with the following nodes:
Admin node:   beluga006
Data node:    beluga007
Standby node: beluga008

# lssam | grep Nominal
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_2-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_3-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online
Notice that the full output was filtered with grep for "Nominal". This is a trick to shorten the output so that we only see the nominal states; as you will see shortly, the full output can otherwise get quite long.
Let’s step through the above output:
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
This first line tells us that we have a resource group named SA-nfsserver-rg and it is Online. The Nominal state is also Online, so it is working as expected. By the name, we can tell that this resource group manages the NFS server resources. Typically, this should always be online.
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
Next we have a resource group called db2_bculinux_NLG_beluga006-rg. This is the resource group belonging to the Admin node. We know that because beluga006 is the hostname for the Admin node. Here, we have 1 DB2 partition (the coordinator partition). For every partition, we define a resource group. You’ll see why shortly. The resource group for the admin partition, partition 0, is called db2_bculinux_0-rg.
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_2-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_3-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online
Lastly, we have our data partition group, db2_bculinux_NLG_beluga007-rg. Every data node in a Balanced Warehouse has 4 partitions, and they can easily be seen here.
Now, let us examine the full lssam output. Try to find each of the lines from the grepped output in the full output:
# lssam
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
        |- Online IBM.AgFileSystem:shared_db2home
                |- Online IBM.AgFileSystem:shared_db2home:beluga006
                '- Offline IBM.AgFileSystem:shared_db2home:beluga008
        |- Online IBM.AgFileSystem:varlibnfs
                |- Online IBM.AgFileSystem:varlibnfs:beluga006
                '- Offline IBM.AgFileSystem:varlibnfs:beluga008
        |- Online IBM.Application:SA-nfsserver-server
                |- Online IBM.Application:SA-nfsserver-server:beluga006
                '- Offline IBM.Application:SA-nfsserver-server:beluga008
        '- Online IBM.ServiceIP:SA-nfsserver-ip-1
                |- Online IBM.ServiceIP:SA-nfsserver-ip-1:beluga006
                '- Offline IBM.ServiceIP:SA-nfsserver-ip-1:beluga008
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_0-rs
                   |- Online IBM.Application:db2_bculinux_0-rs:beluga006
                   '- Offline IBM.Application:db2_bculinux_0-rs:beluga008
                |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0000-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0000-rs:beluga006
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0000-rs:beluga008
                '- Online IBM.ServiceIP:db2ip_172_16_10_228-rs
                    |- Online IBM.ServiceIP:db2ip_172_16_10_228-rs:beluga006
                    '- Offline IBM.ServiceIP:db2ip_172_16_10_228-rs:beluga008
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_1-rs
                    |- Online IBM.Application:db2_bculinux_1-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_1-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0001-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0001-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0001-rs:beluga008
        |- Online IBM.ResourceGroup:db2_bculinux_2-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_2-rs
                    |- Online IBM.Application:db2_bculinux_2-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_2-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0002-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0002-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0002-rs:beluga008
        |- Online IBM.ResourceGroup:db2_bculinux_3-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_3-rs
                    |- Online IBM.Application:db2_bculinux_3-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_3-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0003-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0003-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0003-rs:beluga008
        '- Online IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_4-rs
                    |- Online IBM.Application:db2_bculinux_4-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_4-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0004-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0004-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0004-rs:beluga008

Let us take a look at the NFS resource group:


Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
        |- Online IBM.AgFileSystem:shared_db2home
             |- Online IBM.AgFileSystem:shared_db2home:beluga006
             '- Offline IBM.AgFileSystem:shared_db2home:beluga008
The first line was what we had seen before (lssam | grep Nom). Now, we can see what resources actually form the resource group. This first resource is of type AgFileSystem and represents the db2home mount. We can see that it can exist on beluga006 and beluga008, and that it is Online in beluga006 and Offline in beluga008.
Similarly, for the admin node, we can now see the individual resources:
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_0-rs
                   |- Online IBM.Application:db2_bculinux_0-rs:beluga006
                   '- Offline IBM.Application:db2_bculinux_0-rs:beluga008
The first two lines were part of the previous grepped output, but now we can see an Application resource. You can see similar results for the data node and each of its 4 data partitions. The reason that each of these resources exist on two nodes (beluga006 and beluga008) is for high availability. If beluga006 were to fail, TSA will move all those resources that are currently Online there to beluga008. Then, you would see that they are Offline in beluga006, and Online in beluga008. You can see how this output is useful to determine on which nodes the resources exist.
The lssam command also shows Equivalencies as part of the output. I will include it here for the sake of completeness, but we will discuss this later on:
Online IBM.Equivalency:SA-nfsserver-nieq-1
        |- Online IBM.NetworkInterface:bond0:beluga006
        '- Online IBM.NetworkInterface:bond0:beluga008
Online IBM.Equivalency:db2_FCM_network
        |- Online IBM.NetworkInterface:bond0:beluga006
        |- Online IBM.NetworkInterface:bond0:beluga007
        '- Online IBM.NetworkInterface:bond0:beluga008
Online IBM.Equivalency:db2_bculinux_0-rg_group-equ
        |- Online IBM.PeerNode:beluga006:beluga006
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_1-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_2-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_3-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_4-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_NLG_beluga006-equ
        |- Online IBM.PeerNode:beluga006:beluga006
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_NLG_beluga007-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
The lssam command also lets you limit the output to a particular resource group, with the -g option:
# lssam -g SA-nfsserver-rg
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
        |- Online IBM.AgFileSystem:shared_db2home
                |- Online IBM.AgFileSystem:shared_db2home:beluga006
                '- Offline IBM.AgFileSystem:shared_db2home:beluga008
        |- Online IBM.AgFileSystem:varlibnfs
                |- Online IBM.AgFileSystem:varlibnfs:beluga006
                '- Offline IBM.AgFileSystem:varlibnfs:beluga008
        |- Online IBM.Application:SA-nfsserver-server
                |- Online IBM.Application:SA-nfsserver-server:beluga006
                '- Offline IBM.Application:SA-nfsserver-server:beluga008
        '- Online IBM.ServiceIP:SA-nfsserver-ip-1
                |- Online IBM.ServiceIP:SA-nfsserver-ip-1:beluga006
                '- Offline IBM.ServiceIP:SA-nfsserver-ip-1:beluga008
With the Smart Analytics System, some new commands were introduced to make it easier to monitor and use TSA with DB2:
Table 2. Useful Commands
Command: Definition
hals: Shows an HA status summary for all DB2 partitions
hachknode: Shows the status of a node in the domain and details about the private and public networks
hastartdb2: Starts the DB2 partition resources
hastopdb2: Stops the DB2 partition resources
hafailback: Moves partitions back to the primary machine specified in the primary_machine argument
hafailover: Moves partitions off the primary machine specified in the primary_machine argument to its standby
hareset: Attempts to reset pending, failed, or stuck resource states

Stopping and Starting Resources

If you want to stop or start the DB2 service, you need to stop or start the respective DB2 resource groups using TSA commands. TSA will then stop or start DB2.
The command to do this is chrg. To stop a resource group named db2_bculinux_NLG_beluga007, issue the command,
chrg -o offline -s "Name == 'db2_bculinux_NLG_beluga007'"
Similarly, to start the resource group:
chrg -o online -s "Name == 'db2_bculinux_NLG_beluga007'"
You can also stop/start all resources at the same time:
chrg -o online -s "1=1"
The Smart Analytics System also has some pre-configured commands:
hastartdb2 and hastopdb2
These two commands, however, are specific to DB2 and if there has been customization to TSA, they may not stop/start all resources.
If TSA has pre-configured rules/dependencies, they will ensure that resources are stopped and started in the correct order. For example, DB2 resources that depend on NFS will not start if the NFS share is Offline.
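As an illustration, here is a hedged sketch (not an official utility) of a small helper that takes a resource group offline and waits for lssam to report it as Offline; the group name, exactly as shown by lssam, is passed as the first argument:

#!/bin/bash
# Usage: ./stop_rg.sh <resource-group-name-as-shown-by-lssam>
RG="$1"

# Request that TSA take the resource group offline
chrg -o offline -s "Name == '$RG'"

# Poll lssam until the group's observed state is Offline
until lssam -g "$RG" | head -1 | grep -q "^Offline"; do
    echo "Waiting for $RG to go Offline..."
    sleep 10
done
echo "$RG is now Offline"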

TSA Components

Now that you understand the basics of Tivoli System Automation, we can discuss some of the other components that it can manage.

Service IP

A service IP is a virtual, floating resource attached to a network device. Essentially, it is an IP address that can move from one machine to another, in the event of a failover. Service IPs play a key role in a highly available environment. Because they move from a failed machine to a standby, they allow an application to reconnect to the new machine using the same IP address – as if the original server had simply restarted.
The following command will allow you to view what service IPs have been configured for your system.
# lsrsrc -Ab IBM.ServiceIP
    Resource Persistent and Dynamic Attributes for IBM.ServiceIP
    resource 1:
     Name              = "db2ip_10_160_20_210-rs"
     ResourceType      = 0
     AggregateResource = "0x2029 0xffff 0x414c690c 0x7cc2abfa 0x919b42d5 0xbf62ab75"
     IPAddress         = "10.160.20.210"
     NetMask           = "255.255.255.0"
     ProtectionMode    = 1
     NetPrefix         = 0
     ActivePeerDomain  = "bcudomain"
     NodeNameList      = {"t6udb3a"}
     OpState           = 2
     ConfigChanged     = 0
     ChangedAttributes = {}
    resource 2:
     Name              = "db2ip_10_160_20_210-rs"
     ResourceType      = 0
     AggregateResource = "0x2029 0xffff 0x414c690c 0x7cc2abfa 0x919b42d5 0xbf62ab75"
     IPAddress         = "10.160.20.210"
     NetMask           = "255.255.255.0"
     ProtectionMode    = 1
     NetPrefix         = 0
     ActivePeerDomain  = "bcudomain"
     NodeNameList      = {"t6udb1a"}
     OpState           = 1
     ConfigChanged     = 0
     ChangedAttributes = {}
    resource 3:
     Name              = "db2ip_10_160_20_210-rs"
     ResourceType      = 1
     AggregateResource = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
     IPAddress         = "10.160.20.210"
     NetMask           = "255.255.255.0"
     ProtectionMode    = 1
     NetPrefix         = 0
     ActivePeerDomain  = "bcudomain"
     NodeNameList      = {"t6udb1a","t6udb3a"}
     OpState           = 1
     ConfigChanged     = 0
     ChangedAttributes = {}
The above example shows three resources with the same name, db2ip_10_160_20_210-rs. The NodeNameList attribute tells us which node(s) each entry refers to. The second entry (node t6udb1a) has OpState 1 (Online), which tells us that this is where the service IP is currently active. The first entry (node t6udb3a) has OpState 2 (Offline), so it is the standby location for the IP. The third entry contains both nodes in its NodeNameList, which tells TSA that this is a floating resource that can move between those two nodes.
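If you only want the handful of attributes discussed above rather than the full attribute dump, lsrsrc also accepts a list of attribute names after the resource class; a short example (using the attribute names shown in the output above):

# Show only the name, IP address, node list, and operational state of each service IP
lsrsrc IBM.ServiceIP Name IPAddress NodeNameList OpState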

Application Resources

TSA manages resources using scripts. Some scripts are built in (and part of TSA), such as those for controlling DB2. These scripts are responsible for starting, stopping and monitoring the application. Sometimes it can be useful to understand these scripts, or even edit them for problem diagnosis. To find out where they are located, we use the lsrsrc command, which provides us with the complete configuration of a particular resource.
Following is an example:
# lsrsrc -Ab IBM.Application
resource 12:
  Name                  = "db2_dbedw1da_8-rs" 
  ResourceType          = 1
  AggregateResource     = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
  StartCommand          = "/usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh dbedw1da 8"
  StopCommand           = "/usr/sbin/rsct/sapolicies/db2/db2V97_stop.ksh dbedw1da 8"
  MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/db2V97_monitor.ksh dbedw1da 8"
  MonitorCommandPeriod  = 60
  MonitorCommandTimeout = 180
  StartCommandTimeout   = 330
  StopCommandTimeout    = 140
  UserName              = "root"
  RunCommandsSync       = 1
  ProtectionMode        = 1
  HealthCommand         = ""
  HealthCommandPeriod   = 10
  HealthCommandTimeout  = 5
  InstanceName          = ""
  InstanceLocation      = ""
  SetHealthState        = 0
  MovePrepareCommand    = ""
  MoveCompleteCommand   = ""
  MoveCancelCommand     = ""
  CleanupList           = {}
  CleanupCommand        = ""
  CleanupCommandTimeout = 10
  ProcessCommandString  = ""
  ResetState            = 0
  ReRegistrationPeriod  = 0
  CleanupNodeList       = {}
  MonitorUserName       = ""
  ActivePeerDomain      = "bcudomain"
  NodeNameList          = {"d8udb11a","d8udb3a"}
  OpState               = 1
  ConfigChanged         = 0
  ChangedAttributes     = {}
  HealthState           = 0
  HealthMessage         = ""
  MoveState             = [32768,{}]
  RegisteredPID         = 0
Some of the more common and useful attributes are described in Table 3.
Table 3. Resource Attributes
AttributeDefinition
ResourceType: Indicates whether the resource is allowed to run on multiple nodes or on a single node. A fixed resource is identified by a ResourceType value of 0, and a floating resource has a value of 1.
StartCommand: Specifies the command to be run when the resource is started
StopCommand: Specifies the command to be run when the resource is stopped
MonitorCommand: Specifies the command to be run when the resource is being monitored. This happens at a regular interval, and you will likely see this command often when you run "ps -ef".
UserName: The userid that TSA will use to start this resource
NodeNameList: Indicates on which nodes the resource is allowed to run. This is an attribute of an RSCT resource.
OpState: Specifies the operational state of a resource or a resource group. The valid states are,
0 - UNKNOWN
1 - ONLINE
2 - OFFLINE
3 - FAILED_OFFLINE
4 - STUCK_ONLINE
5 - PENDING_ONLINE
6 - PENDING_OFFLINE

Network Resources

Every machine typically has an Ethernet adapter with a configured network address. TSA is aware of these adapters, and you can see how they have been configured with the lsrsrc command. For example,
# lsrsrc -Ab IBM.NetworkInterface
    resource 1:
        Name             = "en0"
        DeviceName       = ""
        IPAddress        = "172.22.1.217"
        SubnetMask       = "255.255.252.0"
        Subnet           = "172.22.0.0"
        CommGroup        = "CG1"
        HeartbeatActive  = 1
        Aliases          = {}
        DeviceSubType    = 6
        LogicalID        = 0
        NetworkID        = 0
        NetworkID64      = 0
        PortID           = 0
        HardwareAddress  = "00:21:5e:a3:be:60"
        DevicePathName   = ""
        IPVersion        = 4
        Role             = 0
        ActivePeerDomain = "bcudomain"

Log Files

It is important to be aware of the log files that TSA actively writes to:
  1. History file – this logs the commands that were sent to TSA
    /var/ct/IBM.RecoveryRM.log2
  2. Error and monitor logs – these logs are simply the AIX and Linux system logs. They will show you the output of the start, stop, and monitor scripts as well as any diagnostic information coming from TSA. Although the system administrator can configure the location for these logs, they are typically located in the following locations,
    AIX: /tmp/syslog.out
    Linux: /var/log/messages

Command Reference

Table 4 describes the most common commands that a TSA administrator will use.
Table 4. Common TSA Commands
Command: Definition
hals: Display HA configuration summary
hastopdb2: Stop DB2 using TSA
hastartdb2: Start DB2 using TSA
mkequ: Makes an equivalency resource
chequ: Changes a resource equivalency
lsequ: Lists equivalencies and their attributes
rmequ: Removes one or more resource equivalencies
mkrg: Makes a resource group
chrg: Changes persistent attribute values of a resource group (including starting and stopping a resource group)
lsrg: Lists persistent attribute values of a resource group or its resource group members
rmrg: Removes a resource group
mkrel: Makes a managed relationship between resources
chrel: Changes one or more managed relationships between resources
lsrel: Lists managed relationships
rmrel: Removes a managed relationship between resources
samcrl: Sets the IBM TSA control parameters
lssamctrl: Lists the IBM TSA controls
addrgmbr: Adds one or more resources to a resource group
chrgmbr: Changes the persistent attribute value(s) of a managed resource in a resource group
rmrgmbr: Removes one or more resources from the resource group
lsrgreq: Lists outstanding requests applied against resource groups or managed resources
rgmbrreq: Requests a managed resource to be started or stopped, or cancels the request
rgreq: Requests a resource group to be started, stopped, or moved, or cancels the request
lssam: Lists the defined resource groups and their members in a tree format

Command Tips

Following are some useful commands with examples.
Show relationships/dependencies:
lsrel | sort
Show details for a specific relationship:
# lsrel -A b -s "Name = 'db2_bculinux_0-rs_DependOn_db2_bculinux_qp-rel'"
Managed Relationship 1:
        Class:Resource:Node[Source] = IBM.Application:db2_bculinux_qp
        Class:Resource:Node[Target] = {IBM.Application:db2_bculinux_0-rs}
        Relationship                = DependsOn
        Conditional                 = NoCondition
        Name                        = db2_bculinux_0-rs_DependOn_db2_bculinux_qp-rel
        ActivePeerDomain            = bcudomain
        ConfigValidity              =
Delete/remove a relationship:
rmrel -s "Name like 'db2_bculinux_%-rs_DependsOn_db2_bculinux_0-rs-rel'"
Change a resource attribute:
chrsrc -s "Name=='<resource_name>'" <resource_class> attribute=value
Example:
chrsrc -s "Name=='db2ip_10_160_10_27-rs'" IBM.ServiceIP NetMask='255.255.255.0'
To save current SAMP policy information:
sampolicy -s /tmp/sampolicy.current.xml
To check if the policy in the input file is valid:
sampolicy -c /tmp/sampolicy.current.xml
To activate it:
sampolicy -a /tmp/sampolicy.current.xml

Troubleshooting

This section describes methods that can be used to determine the cause of a particular problem or failure. Though techniques vary depending on the type of problem, the following should be a good starting point for most issues.
Resolving FAILED OFFLINE status
A FAILED OFFLINE status will prevent you from setting the nominal state to ONLINE, so failed resources must first be resolved and brought to OFFLINE before the nominal state is turned back to ONLINE. Make sure that the nominal state is showing OFFLINE before resolving them.
To resolve the Failed offline messages, use the resetrsrc command.
resetrsrc -s 'Name = "db2whse_appinstance_01.abxplatform_server1"' IBM.Application
resetrsrc -s 'Name = "db2whse_appinstance_01.adminconsole_server1"' IBM.Application
Recovery from a failed failover attempt
Take all TSA resources offline. The lssam output should reflect “Offline” for all resources before you attempt to bring them back online. To reset NFS resources, use:
resetrsrc -s "Name like 'SA-nfsserver-%'" IBM.Application (if necessary)
resetrsrc -s "Name like 'SA-nfsserver-%'" IBM.ServiceIP (if necessary)
When testing goes wrong, you are often left with resources in various states such as online, offline, and unknown. When the state of a resource is unknown, before attempting to restart it, you must issue resetrsrc for that particular resource.
When you are restarting DB2, you must verify that all the resources are offline before attempting to bring them online again. You must also correct the db2nodes.cfg file. Make sure you have backup copies of db2nodes.cfg and db2ha.sys.
NFS mounts stop functioning
In testing the NFS failover, we were able to move the server over successfully, but the existing NFS client mounts stopped functioning. We solved this problem by unmounting and remounting the NFS volume.
Resolving Binding=Sacrificed
To resolve this problem, you have to look at the overall cluster and how it is set up and defined. The issues that cause this typically have a cluster-wide impact rather than affecting one specific resource.
  1. Check for failed relationships by listing the relationships with the following command "lsrel -Ab", and then determine if one or more of the relationships relating to the failed resource group have not been satisfied.
  2. Check for failed equivalencies by listing them with the following command "lsequ -Ab", and then determine if one or more of the equivalencies have not been satisfied.
  3. Check your resource group attributes and look for anything that may be set incorrectly. Some of the commands to use are listed as follows (the placeholders indicate the resource group, resource, or resource class being checked):
    lsrg -Ab -g <resource_group_name>
    lsrsrc -s 'Name="failed_resource"' -Ab IBM.<resource_class>
    lsrg -m -g <resource_group_name>
    samdiag -g <resource_group_name>
  4. Check for anything specific to your configuration that all of the sacrificed resources share in common, like a mount point, a database instance, a virtual IP.
Check the hardware configuration:
dmesg - check for initialization errors
date - check server time synchronization
ifconfig - check the network adapters
netstat -I - check the network configuration
ps -ef | grep inetd - provides a list of the running processes, including group and PID
Resource state is unknown
Try resetting the resource using the resetrsrc command:
resetrsrc -s "Name like 'db2_db2inst2_%'" IBM.Application
resetrsrc -s "Name like 'db2_db2inst2_%'" IBM.ServiceIP
Timeout values for resources
For the health query interval of each resource, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application MonitorCommandPeriod=300
For the health query timeout, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application MonitorCommandTimeout=290
For the resource startup script timeout, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application StartCommandTimeout=300
For the Resource Stop script timeout, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application StopCommandTimeout=720
Recycling the automation manager
If the problem is most likely related to the automation manager, you should try recycling the automation manager (IBM.RecoveryRM) before contacting IBM support. This can be done using the following commands:
Find out on which node the RecoveryRM master daemon is running:
# lssrc -ls IBM.RecoveryRM | grep Master
On the node running the master, retrieve the PID and kill the automation manager:
# lssrc -ls IBM.RecoveryRM | grep PID
# kill -9 
As a result, an automation manager on another node in the domain will take over the master role and proceed with making automation decisions. The subsystem will restart the killed automation manager immediately.
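As a rough sketch, the two steps can be combined on the master node as follows; this assumes the PID appears as the last field of the lssrc output line containing "PID", so verify the output format on your system first:

# Run on the node currently holding the IBM.RecoveryRM master role
PID=$(lssrc -ls IBM.RecoveryRM | grep PID | awk '{print $NF}')
echo "Killing IBM.RecoveryRM master daemon (PID $PID)"
kill -9 "$PID"
# The subsystem restarts the daemon automatically and another node takes over as master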
Resolving lssam hangs
http://www-01.ibm.com/support/docview.wss?uid=swg21293701
Move to another node in the same HA group and see if you can run the lssam command. If you can, go back to the original node and see if you can now run the lssam command there. If it still does not work, then run the following commands:
lssrc -ls IBM.RecoveryRM | grep -i master 
lssrc -ls IBM.GblResRM | grep -i leader
Check whether either of the above commands reports the "hanging" node; if one does, reboot just that node and see if the issue is resolved.
AVOID the following (DON’Ts)
  • Do not use rpower -a, or rpower on more than one node in the same HA group, when SAMP HA is up and running.
  • Do not offline HA-NFS using a sudo command while logged in as the instance owner and while in the /db2home directory. HA-NFS will get stuck online, and the RecoveryRM daemon has to be killed on the master. If RecoveryRM will not start, reboot may be required.
  • Do not use ifdown to bring down a network interface. This will result in the eth (or en) device being deleted from the equivalency member, and will require you to add the "eth" device (in Linux) or "en" device (in AIX) back into the network equivalency using the chequ command.
  • Do not manipulate any BW resources that are under active SAMP control.
    Turn automation off (samctrl -M T) before manipulating these BW resources.
  • Do not implement changes to the SA MP policy unless exhaustive testing of the HA test cases is completed.
Check the following frequently (DOs)
  • Ensure the /home and /db2home directories are always mounted before starting up a node.
  • Check for process ids that may be blocking stop, start and monitor commands.
  • Save backup copies of the db2nodes.cfg and db2ha.sys files.
  • Save backup copies of the current SAMP policy before and after every SAMP change. Compare the current SAMP policy to the backup SAMP policy every time there is an HA incident.
  • Save backup copies of db2pd -ha output before and after every SAMP change. Compare the current db2pd outputs to the backup db2pd outputs every time there is an HA incident.
  • Save backup copies of the samdiag outputs. 

Monday, 5 May 2014

HP Cluster "Serviceguard" Commands


1)Starting the Cluster:

Use the cmruncl command to start the cluster when all cluster nodes are down. Particular command options can be used to start the cluster under specific circumstances.

The following command starts all nodes configured in the cluster and verifies the network information:
cmruncl
By default, cmruncl will do network validation, making sure the actual network setup matches the configured network setup. This is the recommended method.

If you have recently checked the network and find the check takes a very long time, you can use the -w none option to bypass the validation.

Use the -v (verbose) option to display the greatest number of messages.
cmruncl -v

2)Stop/Halt entire cluster:

Use cmhaltcl to halt the entire cluster. This command causes all nodes in a configured cluster to halt their Serviceguard daemons.
cmhaltcl  -v
You can use the -f option to force the cluster to halt even when packages are running. You can use the command on any running node, for example:
cmhaltcl -f -v
This halts all the cluster nodes.

3)Check status of cluster:

cmviewcl displays the current status information of a cluster in line output  format.
cmviewcl

4)Adding Previously Configured Nodes to a Running Cluster:

Use the cmrunnode command to join one or more nodes to an already running cluster.

Any node you add must already be a part of the cluster configuration. The following example adds node umhpux9 to the cluster that was just started with only nodes ftsys9 and ftsys10.

The -v (verbose) option prints out all the messages:
cmrunnode -v umhpux9

5)Removing Nodes from Participation in a Running Cluster:

You can use Serviceguard Manager, or Serviceguard commands as shown below, to remove nodes from active participation in a cluster.

This operation halts the cluster daemon, but it does not modify the cluster configuration. Halting a node is a convenient way of bringing it down for system maintenance while keeping its packages available on other nodes. After maintenance, the package can be returned to its primary node.
cmhaltnode -f -v umhpuxsg9
Get the current configuration:
cmgetconf - Get cluster or package configuration information:
cmgetconf
You can use -c cluster_name to get the configuration of a specific named cluster.

6)Managing Packages and Services:

Starting a Package:

Use the cmrunpkg command to run the package on a particular node, then use the cmmodpkg command to enable switching for the package. For example, to start a failover package:
cmrunpkg -n nodename package_name
cmrunpkg -n umhpux9 pkg1
cmmodpkg -e pkg1
This starts up the package on umhpux9, then enables package switching.

This sequence is necessary when a package has previously been halted on some node, since halting the package disables switching.

Halting a Package:

Use the cmhaltpkg command to halt a package, as follows:
cmhaltpkg package_name
cmhaltpkg pkg1
This halts pkg1, and, if pkg1 is a failover package, also disables it from switching to another node.

You cannot halt a package unless all packages that depend on it are down.

7)Moving a Failover Package:

Before you move a failover package to a new node, it is a good idea to run cmviewcl -v -l package and look at dependencies. If the package has dependencies, be sure they can be met on the new node.

To move the package, first halt it where it is running using the cmhaltpkg command. This action not only halts the package, but also disables package switching.

After it halts, run the package on the new node using the cmrunpkg command, then re-enable switching as described under "Starting a Package".
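Putting the steps together, a minimal sketch of the whole move, using the example node and package names from above, looks like this:

# 1. Check the package and its dependencies before moving it
cmviewcl -v -l package

# 2. Halt the package where it currently runs (this also disables switching)
cmhaltpkg pkg1

# 3. Start it on the new node, then re-enable package switching
cmrunpkg -n umhpux9 pkg1
cmmodpkg -e pkg1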

The package switching behaviour can be changed using the options below.

There are two options to consider:
  • Whether the package can switch (fail over) or not.
  • Whether the package can switch to a particular node or not.

8)Changing Package Switching:

Enable or disable the switching attribute for a package:
# cmmodpkg -e package_name  (enable)
# cmmodpkg -d package_name  (disable)
After a package has failed on one node, that node is disabled. This means the package will not be able to run on that node.

The following command will enable the package to run on the specified node.
# cmmodpkg -e -n nodename package_name
# cmmodpkg -e -n umhpux9 pkg1
Disabling a package from running on a particular node:
# cmmodpkg -d -n nodename package_name
# cmmodpkg -d -n umhpux9 pkg1
This command will disable the package from running on the specified node.

Friday, 25 April 2014

Unix /Linux Mail Command Tutorial with Examples

The mail/Mail/mailx commands are used on Unix-flavoured operating systems such as Linux, HP-UX, AIX, and many other Unix-based systems. They are used to send emails to users, to read received emails, to delete emails, and so on.

They are very useful when you are working with shell scripts. A good application for mail/mailx would be to send alerts, to process a file and then email it to somebody, or to extract data from your database/application and email the resulting data or file.
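For instance, a minimal sketch of a disk-usage alert script using mail might look like the following; the filesystem, threshold, and recipient address are placeholders, and on HP-UX you would typically use bdf instead of df -P:

#!/bin/sh
# Alert by email if /var usage exceeds 90% (placeholder filesystem and recipient)
THRESHOLD=90
USAGE=$(df -P /var | awk 'NR==2 {print $5}' | tr -d '%')

if [ "$USAGE" -gt "$THRESHOLD" ]; then
    echo "/var is ${USAGE}% full on $(hostname)" | \
        mail -s "Disk space alert from $(hostname)" admin@example.com
fi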



The syntax of mail command is:
mail [options] to-address [-- sendmail-options]

-v : Verbose mode. Delivery details are displayed on the terminal.
-s : Specify the subject of the mail
-c : Send carbon copies of the mail to the list of users. This is like the cc option in Microsoft Outlook.
-b : Send blind copies of the mail to the list of users. This is like the bcc option in Outlook.
-f : Read the contents of the mailbox
-e : Tests for the presence of mail in the system mailbox.
-F : Records the message in a file named after the recipient.
-r : Specify the from address in send mail options.
-u : Specifies an abbreviated equivalent of doing mail -f /var/spool/mail/UserID.

1)To start the Mail program and list the messages in your mailbox:

# mail
The mail command lists every message in your system mailbox. The mail system then displays the mailbox prompt (?) to indicate that it is waiting for input.

When you see this prompt, enter any mailbox subcommand. To see a list of subcommands, type:
?
This entry lists the Mail subcommands.

2) Sending  email to  a user:

# echo "Test of Mail body" | mail -s "Mail subject" [email protected]

Here the echo statement is used for specifying the body of the email.
The -s option is used for specifying the mail subject. The mail command sends the email to the user user@example.com.
Another way is:
# mail -s "Mail subject" user@example.com
In this example you are then expected to type in your message and end it with Control-D at the beginning of a line, or simply type a dot (.) on a line by itself, as follows:
Hi,
This is a test
.
Cc:
If you wish to send the mail to multiple users, just add the mail addresses side by side, separated by spaces.

Let's rewrite the above command for multiple recipients:
# echo "Test of Mail body" | mail -s "Mail subject" user1@example.com user2@example.com user3@example.com

3) Sending contents of a text file

You can send the contents in two ways: using cat/echo, or using the input redirection operator <.

Let's say, for example, we need to send the contents of somefile.txt through mail:
# cat somefile.txt | mail -s "Mail subject" "user1@example.com,user2@example.com"
# mail -s "Mail subject" "user1@example.com,user2@example.com" < somefile.txt

4) Mail Usage with CC & Bcc :

 You can copy the emails to more users by using the -c (cc) and -b (bcc) options. An example is shown below:
# echo "something" | mailx -s "subject" -b bcc-user@example.com -c cc-user@example.com -r sender@example.com recipient@example.com
# mail -s "Mail subject" -c "cc-user@example.com" -b "bcc-user@example.com" "recipient@example.com" < somefile.txt

5) Attaching files:

The mail command does not provide an option for attaching files.

There is a workaround for attaching files using the uuencode command: concatenate the message body with the output of uuencode, and pipe the result to mail.
 # (cat somefile.txt; uuencode attachment-file attachment-file) | mail -s "Mail subject" "user@example.com"
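If you need a message body together with more than one attachment, the same idea extends by concatenating several uuencode outputs after the body in a subshell; the file names below are placeholders:

# Send body.txt as the message body with two uuencoded attachments
(cat body.txt; uuencode report.csv report.csv; uuencode logs.tar.gz logs.tar.gz) | mail -s "Daily report" "user@example.com"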

Working with Mailbox in the server

1) To start the Mail program and list the messages in your mailbox:

# mail
The mail command lists every message in your system mailbox. The mail system then displays the mailbox prompt (?) to indicate that it is waiting for input.

When you see this prompt, enter any mailbox subcommand. To see a list of subcommands, type:
?

Another way of viewing the emails is using the -f option. This is shown below:
# mail -f /var/spool/mail/user
Mail version 8.1 6/6/93.  Type ? for help.
"/var/spool/mail/user": 2 messages 2 new
>N  1 root@hostname  Tue May 17 00:00  21/1013  "Mail subject 1"
 N  2 root@hostname  Wed May 18 00:00  21/1053  "Mail subject 2"
&
From the above output, you can see that it displays the from address, date, and subject of the emails in the inbox. It also displays the ampersand (&) prompt at the end. To go back to the main prompt, type CTRL+Z or CTRL+D (depending on your operating system) and press enter. The ampersand prompt allows you to read, reply to, navigate through, and delete the emails.

2. Reading an email.

To read the Nth email, just enter the mail number at the ampersand prompt and press enter. This is shown below:
> mail -f /var/spool/mail/user

Mail version 8.1 6/6/93.  Type ? for help.
"/var/spool/mail/user": 2 messages 2 new
>N  1 root@hostname  Tue May 17 00:00  21/1013  "Mail subject 1"
 N  2 root@hostname  Wed May 18 00:00  21/1053  "Mail subject 2"
&2
Message 2:
From root@hostname  Wed May 18 00:00  21/1053
---------------
Subject: Mail subject 2
------------

This displays the second email details.

3. Navigating through inbox emails. 

To go to the next email, enter the + symbol. To go back to the previous email, enter the - symbol at the ampersand prompt.
&-
Message 1:
From root@hostname  Tue May 17 00:00  21/1013
---------------
Subject: Mail subject 1
------------

4. Replying to an email. 

Once you have read an email, you can reply to it by typing "reply" and pressing enter.
&reply
To: root@hostname
    root@hostname
Subject: Re: Mail subject1

5. Deleting emails. 

You can delete a read email by typing d and pressing enter. You can also give email numbers to the d command to delete specific messages.
To delete the current (read) email
&d
To delete emails 1 and 2
&d 1 2
To delete a range of emails, from 10 to 30
&d 10-30
To delete all emails in the mbox (mail box)
&d *

Friday, 28 March 2014

A Simple Way to Send Multiple Line Commands Over SSH


Below are three methods for sending multi-line commands over SSH. The first method is a quick overview of running remote commands over SSH, the second method uses the bash command to run remote commands over SSH, and the third method uses HERE documents to run remote commands over SSH. Each has its limitations, which I will cover.


Running Remote Commands Over SSH

To run one command on a remote server over SSH:
ssh $HOST ls
To run two commands on a remote server over SSH:
ssh $HOST 'ls; pwd'
To run a third, fourth, fifth, etc. command on a remote server over SSH, keep appending commands separated by semicolons inside the single quotes.

But, what if you want to remotely run many more commands, or if statements, or while loops, etc., and make it all readable?
#!/bin/bash
ssh $HOST '
ls

pwd

if true; then
    echo "This is true"
else
    echo "This is false"
fi

echo "Hello world"
'
The above shell script works but begins to break if local variables are added.

For example, the following shell script will run, but the local variable HELLO will not be parsed inside the remote if statement:
#!/bin/bash

HELLO="world"

ssh $HOST '
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"
'
In order to parse the local variable HELLO so it is used in the remote if statement, read on to the next section.

Using SSH with the BASH Command

As mentioned above, in order to parse the local variable HELLO so it is used in the remote if statement, the bash command can be used:
#!/bin/bash

HELLO="world"

ssh $HOST bash -c "'
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"
'"
Perhaps you want to use a remote sudo command within the shell script:
#!/bin/bash

HELLO="world"

ssh $HOST bash -c "'
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"

sudo ls /root
'"
When the above shell script is run, everything will work as intended until the remote sudo command which will throw the following error:
sudo: sorry, you must have a tty to run sudo
This error is thrown because the remote sudo command is prompting for a password which needs an interactive tty/shell. To force a pseudo interactive tty/shell, add the -t command line switch to the ssh command:
#!/bin/bash

HELLO="world"

ssh -t $HOST bash -c "'
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"

sudo ls /root
'"
With a pseudo interactive tty/shell available, the remote sudo command’s password prompt will be displayed, the remote sudo password can then be entered, and the contents of the remote root’s home directory will be displayed.

However, I recently needed to run one specific remote sed command over SSH to find a line and delete it along with the subsequent three lines, and another specific remote sed command over SSH to find a line and insert another line of text above it, so I naturally tried using the bash method mentioned above:
#!/bin/bash

ssh $HOST bash -c "'
cat << EOFTEST1 > /tmp/test1
line one
line two
line three
line four
EOFTEST1

cat << EOFTEST2 > /tmp/test2
line two
EOFTEST2

sed -i -e '/line one/,+3 d' /tmp/test1

sed -i -e '/^line two$/i line one' /tmp/test2
'"
Every time I ran the above shell script, I would get the following error:
sed: -e expression #1, char 5: unterminated address regex
However, the same commands work when run by themselves:
ssh $HOST "sed -i -e '/line one/,+3 d' /tmp/test1"
ssh $HOST "sed -i -e '/^line two$/i line one' /tmp/test2"
I thought the problem might be caused by single quotes within single quotes. The bash command above requires everything to be wrapped in single quotes, and a sed command requires the regular expression to be wrapped in single quotes as well. As mentioned in the BASH manual, "a single quote may not occur between single quotes, even when preceded by a backslash".

However, I debunked the theory that single quotes were my problem, because running a simple remote sed search-and-replace command inside of the bash command worked just fine:
#!/bin/bash

ssh $HOST bash -c "'

echo "Hello" >> /tmp/test3

sed -i -e 's/Hello/World/g' /tmp/test3
'"
I can only assume the problem with the specific remote sed commands is something with the syntax that I have not yet figured out.

Despite all this, I eventually figured out that the specific remote sed commands I wanted to run would work when using SSH with HERE documents.

Using SSH with HERE Documents

As mentioned above, the specific remote sed commands I wanted to run did work when using SSH with HERE documents:
ssh $HOST << EOF
cat << EOFTEST1 > /tmp/test1
line one
line two
line three
line four
EOFTEST1

cat << EOFTEST2 > /tmp/test2
line two
EOFTEST2

sed -i -e '/line one/,+3 d' /tmp/test1

sed -i -e '/^line two$/i line one' /tmp/test2
EOF
Despite the remote sed commands working, the following warning message was thrown:
Pseudo-terminal will not be allocated because stdin is not a terminal.
To stop this warning message from appearing, add the -T command line switch to the ssh command to disable pseudo-tty allocation (a pseudo-terminal can never be allocated when using HERE documents because it is reading from standard input):
ssh -T $HOST << EOF
cat << EOFTEST1 > /tmp/test1
line one
line two
line three
line four
EOFTEST1

cat << EOFTEST2 > /tmp/test2
line two
EOFTEST2

sed -i -e '/line one/,+3 d' /tmp/test1

sed -i -e '/^line two$/i line one' /tmp/test2
EOF
With this working, I later discovered that remote sudo commands which require a password prompt will not work with HERE documents over SSH.
ssh $HOST << EOF
sudo ls /root
EOF
The above ssh command will throw the following error if the SSH user you are logging into requires a password when using the remote sudo command:
Pseudo-terminal will not be allocated because stdin is not a terminal.
user@host's password: 
sudo: no tty present and no askpass program specified
However, the remote sudo command will work if the SSH user’s sudo settings allow that user to use sudo without a password by setting user ALL=(ALL) NOPASSWD:ALL in /etc/sudoers.

References

What’s the Cleanest Way to SSH and Run Multiple Commands in Bash?
Chapter 19. Here Documents

Saturday, 1 March 2014

Find NFS Clients Connected to NFS Server


Question: How do you find the NFS clients connected to an NFS server?

Answer:
The simple way is to use:

showmount -e  <NFS Server Name>

# showmount -e umbox04
export list for umbox04:
/gpfs/edw/common  umlpar1 umlpar2 umlpar3 umlpar4 umlpar5
/export/nim      udaixserv1,udaixserv2,udaixserv3,udaixserv4,udaixserv5
/bigdata (everyone)

But there is a problem here: if the NFS mount is shared to everyone (say, /bigdata in the above example), you will not be able to tell which clients are using it.

To overcome this issue, there are two ways to find the NFS clients connected to an NFS server.

1. Using "netstat":

The idea with "netstat" is an indirect method: we use the NFS port, which is 2049, to get information about the clients.
netstat -an | grep nfs.server.ip:port
If your NFS server IP address is 10.6.55.21 and the port is 2049, enter the command below.
Sample output:
# netstat -an | grep 10.6.55.21:2049
tcp        0      0 10.6.55.21:2049       10.6.55.33:757         ESTABLISHED
tcp        0      0 10.6.55.21:2049       10.6.55.34:892         ESTABLISHED

Where,

  •     10.6.55.21 - NFS server IP address
  •     2049 - NFS server port
  •     10.6.55.33 and 10.6.55.34 - NFS clients IP address
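To turn this into a quick list of unique client IP addresses, the netstat output can be filtered a little further; this is a sketch that assumes the column layout shown above (foreign address in the fifth column):

# List the unique NFS client addresses connected to port 2049
netstat -an | grep ':2049' | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort -u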

2. Using showmount command

You can use the showmount command to see mount information for an NFS server. Note that this method may produce unreliable results (see the rpc.mountd note below). You can type this command on any one of the NFS clients:
showmount -a <NFS Server Name>
Sample outputs:

#showmount -a mynfsserv01
All mount points on mynfsserv01:
10.6.55.33:/umdata
10.6.55.34:/umdata
10.6.55.69:/umdata
10.6.55.3:/umdata
10.6.55.6:/umdata
10.6.55.16:/umdata

Where,
  • -a: List both the client hostname or IP address and mounted directory in host:dir format. This info should not be considered reliable.

As per the rpc.mountd(8) man page:
The rpc.mountd daemon registers every successful MNT request by adding an entry to the /var/lib/nfs/rmtab file. When receiving a UMNT request from an NFS client, rpc.mountd simply removes the matching entry from /var/lib/nfs/rmtab, as long as the access control list for that export allows that sender to access the export.

Clients can discover the list of file systems an NFS server is currently exporting, or the list of other clients that have mounted its exports, by using the showmount(8) command. showmount(8) uses other procedures in the NFS MOUNT protocol to report information about the server's exported file systems.

Note, however, that there is little to guarantee that the contents of /var/lib/nfs/rmtab are accurate. A client may continue accessing an export even after invoking UMNT. If the client reboots without sending a UMNT request, stale entries remain for that client in /var/lib/nfs/rmtab.
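If you want to see exactly what rpc.mountd has recorded, you can also read this file directly on the NFS server, keeping the same staleness caveat in mind (each line records a client and the export it mounted):
cat /var/lib/nfs/rmtab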

Saturday, 15 February 2014

Regular Expressions Special Characters Explained

“Regular expressions” (often shortened to “regex”) is a language used to represent patterns for matching text. Regular expressions are the primary text-matching mechanism in text-processing tools such as grep, egrep, awk and sed.
The following table contains the basic elements, along with description and examples.
Regex : Description and example
^ : The start-of-line marker. Example: ^tux matches any line that starts with tux.
$ : The end-of-line marker. Example: tux$ matches any line that ends with tux.
. : Matches any one character. Example: Hack. matches Hack1 and Hacki, but not Hack12 or Hackil; only one additional character is matched.
[] : Matches any one of the characters in the set inside []. Example: coo[kl] matches cook or cool.
[^] : Exclusion set: the caret negates the set of characters in the square brackets; text matching this set is not returned as a match. Example: 9[^01] matches 92 and 93, but not 91 or 90.
[-] : Matches any character within the range specified in []. Example: [1-5] matches any digit from 1 to 5.
? : The preceding item must match one or zero times. Example: colou?r matches color or colour, but not colouur.
+ : The preceding item must match one or more times. Example: Rollno-9+ matches Rollno-99 and Rollno-9, but not Rollno-.
* : The preceding item must match zero or more times. Example: co*l matches cl, col and coool.
() : Creates a substring (group) in the regex match. Explained below, in the section “Substring match and back-referencing”.
{n} : The preceding item must match exactly n times. Example: [0-9]{3} matches any three-digit number; it can be expanded as [0-9][0-9][0-9].
{n,} : The preceding item must match at least n times. Example: [0-9]{2,} matches any number that is two or more digits long.
{n,m} : The preceding item must match at least n and at most m times. Example: [0-9]{2,5} matches any number that is between two and five digits long.
| : Alternation; one of the items on either side of | should match. Example: Oct (1st|2nd) matches Oct 1st or Oct 2nd.
\ : The escape character for the special characters listed above. Example: a\.b matches a.b but not ajb; the dot is not interpreted as the "match any one character" regex, but as a literal period. Similarly, to search for the US currency symbol "$" rather than the end-of-line marker, precede it with a backslash: \$.
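A few of these elements in action with grep -E (extended regular expressions); this is only an illustrative sketch with made-up sample strings:
echo "color colour colouur" | grep -Eo 'colou?r'                 # prints color and colour
echo "cook cool coot" | grep -Eo 'coo[kl]'                       # prints cook and cool
printf 'Rollno-9\nRollno-99\nRollno-\n' | grep -E 'Rollno-9+'    # matches the first two lines only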
There are a few character classes, called POSIX classes, in the format [:name:] that can be conveniently used, instead of spelling out the character set each time. Note that, as shown in the example column, you need to enclose the class itself in another pair of square brackets. For example:
$ echo -e "max\nOR\nMatrix" | sed '/[:alpha:]/d'
OR
$ echo -e "max\nOR\nMatrix" | sed '/[[:alpha:]]/d'
$
In the first case, the bracket expression is interpreted literally as the set of characters :, a, l, p and h; the lines max and Matrix are deleted because they contain a, one of the letters in that set. In the second command, with another pair of square brackets around the class, all input lines are deleted, because every line contains at least one alphabetic character.
Regex : Description and example
[:alnum:] : Alphanumeric characters. Example: [[:alnum:]]+
[:alpha:] : Alphabetic characters (lowercase and uppercase). Example: [[:alpha:]]{4}
[:blank:] : Space and tab. Example: [[:blank:]]*
[:digit:] : Digits. Example: [[:digit:]]?
[:lower:] : Lowercase letters. Example: [[:lower:]]{5,}
[:upper:] : Uppercase letters. Example: ([[:upper:]]+)?
[:punct:] : Punctuation. Example: [[:punct:]]
[:space:] : All whitespace characters, including newline and carriage return. Example: [[:space:]]+
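As a quick illustration (the sample string is made up), a POSIX class with grep -o pulls out just the digits:
$ echo "user123 tried 4 times" | grep -Eo '[[:digit:]]+'
123
4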
The following meta-characters are Perl-style regular expression extensions that are supported by only a subset of text-processing utilities; not every tool supports these notations.
Regex : Description and example
\b : Word boundary. Example: \bcool\b matches only cool, not coolant.
\B : Non-word boundary. Example: cool\B matches coolant but not cool.
\d : A single digit character. Example: b\db matches b2b but not bcb.
\D : A single non-digit character. Example: b\Db matches bcb but not b2b.
\w : A single word character (alphanumeric and _). Example: \w matches 1 or a but not &.
\W : A single non-word character. Example: \W matches & but not 1 or a.
\n : Newline. Example: \n matches a new line.
\s : A single whitespace character. Example: x\sx matches x x but not xx.
\S : A single non-whitespace character. Example: x\Sx matches xkx but not x x.
\r : Carriage return. Example: \r matches a carriage return.
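To see a couple of these Perl-style classes in action, here is a sketch assuming GNU grep built with PCRE support (the -P option):
$ echo "b2b bcb" | grep -Po 'b\db'
b2b
$ echo "cool coolant" | grep -Po '\bcool\b'
cool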
The above tables can be used as a reference while constructing regular expression patterns.
Let us go through a few examples of regular expressions.

Treatment of special characters

Regular expressions use some characters such as $, ^, ., *, +, {, and } as special characters. But what if we want to use these characters as non-special characters (normal text characters)? Let’s see an example. Regex: [a-z]*.[0-9].
How is this interpreted? It could mean zero or more characters from [a-z] ([a-z]*), followed by any one character (.), followed by one character from the set [0-9], so that it matches abcdeO9. It could also be read as a single character from [a-z], then a literal *, then a literal . (period), then a digit, so that it matches x*.8. To remove this ambiguity, we precede the character with a backslash \ (doing this is called “escaping the character”). Characters such as * that have more than one possible meaning are prefixed with \ either to give them a special meaning or to strip them of it.
Whether a given character must be escaped to be special, or escaped to be literal, depends on the tool you are using. In short, “special meaning” means that a character is interpreted as something other than its literal ASCII value. For example, a* means zero or more occurrences of a (a, aa, aaa and so on); here * has a special meaning, since it is not interpreted as the literal ASCII character *.
Some characters must be escaped with \ to take on their special meaning, for example: \+, \{, \}, \(, \), \?.
Characters that are special by default (you need to escape these in order to use them as literal ASCII characters): *, ., ^, $, [, ].
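For example (a small sketch with a made-up string), compare an escaped and an unescaped asterisk in grep:
$ echo "3*4=12" | grep -o '3*4'
4
$ echo "3*4=12" | grep -o '3\*4'
3*4
In the first command the * keeps its special meaning (zero or more 3s followed by a 4), so only the 4 matches; in the second, the escaped \* matches a literal asterisk.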
To match any line containing only the word test, and no other characters on it, use ^test$. This is interpreted as “start of line marker” followed by “test” followed by “end of line marker”.
Another good example is extracting email addresses from a given text. An email address has the format username@domain.tld. We can formulate the regular expression as:
[A-Za-z0-9.]+@[A-Za-z0-9.]+\.[a-zA-Z]{2,4}. The [A-Za-z0-9.]+ before the @ states that the given character class should occur one or more times, just as it does after the @. At the end of the email address we have the TLD (top-level domain), which can be two to four characters in length, as specified by {2,4}.
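For instance, the pattern can be tried out with grep -Eo (the addresses below are made up for illustration):
$ echo "Contact admin@example.com or sales@example.org" | grep -Eo '[A-Za-z0-9.]+@[A-Za-z0-9.]+\.[a-zA-Z]{2,4}'
admin@example.com
sales@example.org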

Points to remember

In sed, which uses basic regular expressions, “one or more” must be written as \+ (with a backslash) to have its special meaning, while “zero or more” (*) is special without a backslash. We can do an inverse match using /PATTERN/!{statements}; that is, place the bang (!) after the closing slash of PATTERN.
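Two short sed sketches illustrating these points (GNU sed assumed):
$ echo "aaab" | sed 's/a\+/X/'
Xb
$ printf 'keep this\ndrop that\n' | sed '/keep/!d'
keep this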