Showing posts with label Solaris. Show all posts

Monday, 20 October 2014

Webmin: Managing Unix Systems Graphically

What is Webmin?

Webmin is a web-based interface for Unix system administration. Using any modern web browser, you can set up user accounts, Apache, DNS, file sharing and much more.

Demo:

http://webmin-demo.virtualmin.com/ (login: demo, password: demo)

Download Link:

How to Install:

Install on RedHat/CentOS/Fedora:

If you are using the RPM version of Webmin, first download the file from the downloads page, or run the command:
[root@UMLinux1 ~]# wget http://prdownloads.sourceforge.net/webadmin/webmin-1.710-1.noarch.rpm

Then run the command:

[root@UMLinux1 ~]# rpm -U webmin-1.710-1.noarch.rpm
The rest of the install is done automatically into the directory /usr/libexec/webmin, with the administration username set to root and the password set to your current root password. You should now be able to log in to Webmin at the URL http://localhost:10000/. If you are accessing it remotely, replace localhost with your system's IP address.

If you want to connect from a remote server and your system has a firewall installed, see this page for instructions on how to open up port 10000.
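The firewall step can be sketched as follows, assuming an iptables-based firewall of the kind shipped with RHEL/CentOS 6; other firewalls need different commands. The run helper and the DRY_RUN variable are hypothetical names used only for this illustration, and by default the script prints the commands instead of executing them.

```shell
# Sketch only: open the Webmin port on an iptables-based firewall.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 and run
# as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"
WEBMIN_PORT=10000

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

open_webmin_port() {
    # Accept incoming TCP connections on the Webmin port.
    run iptables -I INPUT -p tcp --dport "$WEBMIN_PORT" -j ACCEPT
    # Persist the rule across reboots (RHEL/CentOS 6 style init script).
    run service iptables save
}

open_webmin_port
```

Review the printed commands before switching off the dry run, since inserting a rule at the top of the INPUT chain may not match your site's firewall policy.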

Install on Debian:

If you are using the DEB version of Webmin, first download the file from the downloads page, or run the command:
[root@UMLinux1 ~]# wget http://prdownloads.sourceforge.net/webadmin/webmin_1.710_all.deb

Then run the command:

[root@UMLinux1 ~]# dpkg --install webmin_1.710_all.deb
The install is done automatically into /usr/share/webmin, with the administration username set to root and the password set to your current root password. You should now be able to log in to Webmin at the URL http://localhost:10000/. If you are accessing it remotely, replace localhost with your system's IP address.

How to Stop & Start Webmin Services:

In order to start the Webmin service on CentOS (Linux) you will need to issue the following command:
[root@UMLinux1 ~]# service webmin start
You can check to make sure that Webmin is running by issuing the following command:
[root@UMLinux1 ~]# service webmin status
Webmin (pid 1729) is running
[root@UMLinux1 ~]#
If you wish to configure your server to ensure that the Webmin service is started at boot time you can issue the following command:
[root@UMLinux1 ~]# chkconfig --level 3 webmin on
To verify that Webmin will start at boot, issue the following command:
[root@UMLinux1 ~]# chkconfig --list webmin
webmin 0:off 1:off 2:off 3:on 4:off 5:off 6:off
[root@UMLinux1 ~]#
In the previous listing, Webmin is listed to start in run level 3, which is the default run level that the dedicated servers boot into.
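The boot-time check above can also be scripted. The following is a minimal sketch: runlevel_enabled is a hypothetical helper name, and the sample line is copied from the listing above so that the example runs without chkconfig installed.

```shell
# Hypothetical helper: succeeds if the given run level is "on" in a
# `chkconfig --list <service>` line read from standard input.
runlevel_enabled() {
    level="$1"
    awk -v want="${level}:on" '
        { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Sample line copied from the listing above. On a live system you would
# pipe in the output of: chkconfig --list webmin
sample='webmin 0:off 1:off 2:off 3:on 4:off 5:off 6:off'

if printf '%s\n' "$sample" | runlevel_enabled 3; then
    echo "webmin starts in run level 3"
else
    echo "webmin does not start in run level 3"
fi
```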

Thursday, 24 July 2014

How to Enable the Name Service Cache Daemon (NSCD)

Question

How do you enable NSCD to improve the performance of the hostname, password, name, and group lookups that IBM Rational ClearCase performs frequently?

Cause

Enabling the operating system's Name Service Cache Daemon (NSCD) can significantly improve performance when naming services such as DNS, NIS, NIS+, or LDAP are in use.

Answer

Benefit of name service cache daemon (NSCD) for ClearCase

Example:

Without NSCD:
[user@host]$ time cleartool co -nc "/var/tmp/file"
Checked out "/var/tmp/file" from version "/main/10".
real    0m3.355s
user    0m0.020s
sys     0m0.018s
With NSCD:
[user@host]$ time cleartool co -nc "/var/tmp/file"
Checked out "/var/tmp/file" from version "/main/11".
real    0m0.556s
user    0m0.021s
sys     0m0.016s
Enabling NSCD
Solaris:
/etc/init.d/nscd start

Linux:
service nscd start

AIX:
startsrc -s netcd
Note: In addition to starting nscd, make sure the service will also be started after a reboot. For instance, on Red Hat and SuSE you can run:
chkconfig nscd on
For more details on how to configure or enable NSCD, refer to your operating system vendor's man page.

Note that this service is not yet available on HP-UX platforms.
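As a small sketch, the per-platform start commands listed above can be selected from the output of uname -s. The function name nscd_start_command is hypothetical; the function only prints the command so that you can review it before running it as root.

```shell
# Hypothetical helper: print the start command from the list above for the
# platform name reported by `uname -s`.
nscd_start_command() {
    case "$1" in
        SunOS) echo "/etc/init.d/nscd start" ;;   # Solaris
        Linux) echo "service nscd start" ;;
        AIX)   echo "startsrc -s netcd" ;;
        *)     echo "no NSCD start command known for: $1" >&2
               return 1 ;;
    esac
}

# Show the command for the machine this script runs on.
nscd_start_command "$(uname -s)" || true
```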

Thursday, 19 June 2014

How to Convert OpenSSH to SSH2 and vice versa

The program SSH (Secure Shell) provides an encrypted channel for logging into another computer over a network, executing commands on a remote computer, and moving files from one computer to another. SSH provides strong host-to-host and user authentication as well as secure encrypted communications over the Internet.

SSH2 is a more secure, efficient, and portable version of SSH.

Connecting two servers that run different types of SSH can be a daunting task if you do not know how to convert the keys. In this article, we are going to learn how to convert keys between SSH (OpenSSH) and SSH2.

How to Generate an OpenSSH (SSH) key:

umadm@umixserv1 [/home/umadm/.ssh]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/umadm/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/umadm/.ssh/id_rsa.
Your public key has been saved in /home/umadm/.ssh/id_rsa.pub.
The key fingerprint is:
5b:ac:ea:c3:25:cf:2d:31:a2:aa:83:76:4b:a2:c9:eb umadm@umixserv1
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|         .       |
|        S o      |
|. o   . .+       |
|+o o + oo        |
|Bo.   =.         |
|#Eo..oo.         |
+-----------------+
umadm@umixserv1 [/home/umadm/.ssh]$
This gives us two key files under the $HOME/.ssh directory: the private key (id_rsa) and the public key (id_rsa.pub).

You can generate a DSA key by using the command below.
# ssh-keygen -t dsa

Convert SSH2 to OpenSSH (SSH):


The command below can be used to convert an SSH2 private key into the OpenSSH format:
ssh-keygen -i -f path/to/private.key > path/to/new/opensshprivate.key
The command below can be used to convert an SSH2 public key into the OpenSSH format:
ssh-keygen -i -f path/to/publicsshkey.pub > path/to/publickey.pub
Here, -i tells ssh-keygen to read an SSH2 key and convert it into the OpenSSH format.

Convert OpenSSH (SSH) to SSH2:

The reverse process converts an OpenSSH key into the SSH2 format, in the event that a client application requires the other format. This can be done using the following commands:

OpenSSH to SSH2 conversion, starting from the private key file:
ssh-keygen -e -f path/to/opensshprivate.key > path/to/ssh2privatekey/ssh2privatekey
OpenSSH to SSH2 Public key conversion:
ssh-keygen -e -f path/to/publickey.pub > path/to/ssh2privatekey/ssh2publickey.pub
Here, -e tells ssh-keygen to read an OpenSSH key file and convert it to SSH2 format. Note that ssh-keygen -e always prints the public key, even when it is given a private key file, so both commands above produce a public key in SSH2 format.

Note: If you need passwordless authentication between two different hosts, you need to convert the public key to match the SSH version of the destination server and append it to ~/.ssh/authorized_keys or ~/.ssh2/authorized_keys on the destination server.
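Before converting, it helps to know which format a public key file is already in. The sketch below guesses this from the first line: SSH2 (RFC 4716) public key files begin with a "---- BEGIN SSH2 PUBLIC KEY ----" marker, while OpenSSH RSA and DSA public keys are single lines starting with ssh-rsa or ssh-dss. The function name pubkey_format is hypothetical, and the key data in the example is a placeholder, not a real key.

```shell
# Hypothetical helper: read a public key on standard input and print
# "ssh2", "openssh" or "unknown" based on its first line.
pubkey_format() {
    IFS= read -r first_line || true
    case "$first_line" in
        "---- BEGIN SSH2 PUBLIC KEY ----"*) echo "ssh2" ;;
        "ssh-rsa "*|"ssh-dss "*)            echo "openssh" ;;
        *)                                  echo "unknown" ;;
    esac
}

# Placeholder inputs; with a real file you would run:
#   pubkey_format < path/to/publickey.pub
printf '%s\n' 'ssh-rsa AAAAplaceholderkeydata umadm@umixserv1' | pubkey_format
printf '%s\n' '---- BEGIN SSH2 PUBLIC KEY ----' | pubkey_format
```

A result of ssh2 means the file needs ssh-keygen -i before it can be used with OpenSSH, and a result of openssh means it needs ssh-keygen -e before it can be used with an SSH2 server.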

Thursday, 8 May 2014

Tivoli System Automation (TSA) Overview

Introduction

The purpose of this guide is to introduce Tivoli® System Automation for Multiplatforms and provide a quick-start, purpose-driven approach to users that need to use the software, but have little or no past experience with it.

This guide describes the role that TSA plays within IBM’s Smart Analytics System solution and the commands that can be used to manipulate the application. Further, some basic problem diagnosis techniques will be discussed, which may help with minor issues that could be experienced during regular use.
When the Smart Analytics system is built with High Availability, TSA is automatically installed and configured by the ATK. Therefore, this guide will not describe how to install or configure a TSA cluster (domain) from scratch, but rather how to manipulate and work with an existing environment. To learn to define a cluster of servers, please refer to the References appendix for IBM courses that are available.

Terminology

It is advisable to become familiar with the following terms, since they are used throughout this guide. It will also help you become familiar with the scopes of the different components within TSA.
Table 1. Terminology
Term: Definition
Peer Domain: A cluster of servers, or nodes, for which TSA is responsible
Resource: Hardware or software that can be monitored or controlled. These can be fixed or floating. Floating resources can move between nodes.
Resource group: A virtual group or collection of resources
Relationships: Describe how resources work together. A start-stop relationship creates a dependency (see below) on another resource. A location relationship applies when resources should be started on the same or different nodes.
Dependency: A limitation on a resource that restricts operation. For example, if resource A depends on resource B, then resource B must be online for resource A to be started.
Equivalency: A set of fixed resources of the same resource class that provide the same functionality
Quorum: A cluster is said to have quorum when it has the capability to form a majority within its nodes. The cluster can lose quorum when there is a communication failure, and sub-clusters form with an even number of nodes.
Nominal State: This can be online or offline. It is the desired state of a resource, and can be changed so that TSA will bring a resource online or shut it down.
Tie Breaker: Used to maintain quorum, even in a split-brain situation (as mentioned in the definition of quorum). A tie-breaker allows sub-clusters to determine which set of nodes will take control of the domain.
Failover: When a failure occurs (typically hardware), which causes resources to be moved from one machine to another machine, the resources are said to have “failed over”

Getting Started

The purpose of TSA in the Smart Analytics system is to manage software and hardware resources, so that in the event of a failure, they can be restarted or moved to a backup system. TSA uses background scripts to check the status of processes and ensure that everything is working correctly. It also uses “heart-beating” between all the nodes in the domain to ensure that every server is reachable. Should a process fail the status check, or a node fail to respond to a heartbeat, TSA will take appropriate action to bring the system back to its nominal state.
Let’s start with the basics. In a Smart Analytics System, the TSA domain includes the DB2 Admin node, the Data nodes, and any Standby/backup nodes. The management server is not part of the domain and TSA commands will not work there. Further, all TSA commands are run as the root user.
The first thing you want to do is check the status of the domain, and start it if required:
    # lsrpdomain
    Name      OpState RSCTActiveVersion MixedVersions TSPort GSPort 
    bcudomain Online  2.5.3.3           No            12347  12348
In this case it’s already started, but if OpState showed “Offline”, the command to start the domain would be,
startrpdomain bcudomain
Notice that the domain name is bcudomain, and it is required for the start command. Likewise, if you want to stop the domain, the command is,
stoprpdomain bcudomain
If TSA is in an unstable state, you can also forcefully shut down the domain using the -f parameter in the stoprpdomain command. However, this is typically not recommended:
stoprpdomain -f bcudomain
You should not stop a domain until all your resources have been properly shut down. If your system uses GPFS to manage the /db2home mount, then you need to manually unmount the GPFS filesystems before you can stop the TSA domain using the following command,
/usr/lpp/mmfs/bin/mmunmount /db2home
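The domain check described above can be scripted by reading the OpState column of the lsrpdomain output. This is a minimal sketch: domain_opstate is a hypothetical helper name, and the sample text is copied from the listing above so that the example runs without TSA installed. It only prints the start command that would be needed instead of running it.

```shell
# Hypothetical helper: print the OpState of the named peer domain from
# `lsrpdomain` output read on standard input.
domain_opstate() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Sample copied from the listing above; on a TSA node you would use:
#   lsrpdomain | domain_opstate bcudomain
sample='Name      OpState RSCTActiveVersion MixedVersions TSPort GSPort
bcudomain Online  2.5.3.3           No            12347  12348'

state=$(printf '%s\n' "$sample" | domain_opstate bcudomain)
if [ "$state" = "Offline" ]; then
    echo "domain is Offline, start it with: startrpdomain bcudomain"
else
    echo "bcudomain is $state"
fi
```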
Next, you’ll want to check the status of the nodes in the domain. The following command will do this:
        # lsrpnode
        Name      OpState RSCTVersion 
        beluga006 Online  2.5.3.3     
        beluga008 Online  2.5.3.3     
        beluga007 Online  2.5.3.3
You can see that we have 3 nodes in this domain: beluga006, beluga007, and beluga008. This also shows their state. If they are Online, then TSA can work with them. If they are Offline, they are either turned off or TSA cannot communicate with them (and thus unavailable). Nodes don’t always appear in the order that you would expect, so be sure to scan the whole output (in this case, beluga008 shows up before beluga007).
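Because the node order is not predictable, filtering the output is less error-prone than reading it by eye. The sketch below lists every node whose OpState is not Online. The helper name offline_nodes is hypothetical, and the sample marks beluga008 as Offline purely for illustration (in the listing above all three nodes are Online).

```shell
# Hypothetical helper: print the names of nodes that are not Online,
# reading `lsrpnode` output on standard input (first line is the header).
offline_nodes() {
    awk 'NR > 1 && NF >= 2 && $2 != "Online" { print $1 }'
}

# Illustrative sample; on a TSA node you would use: lsrpnode | offline_nodes
sample='Name      OpState RSCTVersion
beluga006 Online  2.5.3.3
beluga008 Offline 2.5.3.3
beluga007 Online  2.5.3.3'

down=$(printf '%s\n' "$sample" | offline_nodes)
if [ -n "$down" ]; then
    echo "nodes not Online: $down"
else
    echo "all nodes are Online"
fi
```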

Resource Groups

After you have verified that the Domain is started, and all your nodes are Online, you will want to check the status of your resources. TSA manages all resources through resource groups. You cannot start a resource individually through TSA. When you start a resource group however, it will start all resources that belong to that group.
To check the status of your DB2 resources, use the hals command. This gives you a summary of all nodes in the peer domain, including their primary and backup locations, current location, and failover state.
+===============+===============+===============+==================+==================+===========+
|  PARTITIONS   |    PRIMARY    |   SECONDARY   | CURRENT LOCATION | RESOURCE OPSTATE | HA STATUS |
+===============+===============+===============+==================+==================+===========+
| 0             | dwadmp1x      | dwhap1x       | dwadmp1x         | Online           | Normal    |
| 1,2,3,4       | dwdmp1x       | dwhap1x       | dwdmp1x          | Online           | Normal    |
| 5,6,7,8       | dwdmp2x       | dwhap1x       | dwdmp2x          | Online           | Normal    |
| 9,10,11,12    | dwdmp3x       | dwhap1x       | dwhap1x          | Online           | Failover  |
| 13,14,15,16   | dwdmp4x       | dwhap1x       | dwdmp4x          | Online           | Normal    |
+===============+===============+===============+==================+==================+===========+
In this example, we see that the admin node is dwadmp1x since it holds partition 0. There are 4 data nodes in this system, and all are in Normal state except for data node 3. We can see that data node 3 is in Failover state and its current location is dwhap1x, the backup server.
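On larger systems the Failover rows can be picked out of the table automatically. The following sketch assumes the pipe-delimited layout shown above; the helper name failed_over_partitions is hypothetical, and the sample rows are copied from the example output.

```shell
# Hypothetical helper: read `hals` output on standard input and report
# every row whose HA STATUS column says Failover.
failed_over_partitions() {
    awk -F'|' '
        NF >= 7 && $7 ~ /Failover/ {
            parts = $2; primary = $3; current = $5
            gsub(/ /, "", parts); gsub(/ /, "", primary); gsub(/ /, "", current)
            print "partitions " parts ": primary " primary ", currently on " current
        }'
}

# Two rows copied from the table above; on a live system you would use:
#   hals | failed_over_partitions
sample='| 5,6,7,8       | dwdmp2x       | dwhap1x       | dwdmp2x          | Online           | Normal    |
| 9,10,11,12    | dwdmp3x       | dwhap1x       | dwhap1x          | Online           | Failover  |'

printf '%s\n' "$sample" | failed_over_partitions
```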
The hals command is actually a summary of the complete output. For more detailed information about each resource, use the lssam command. The following output is an example of a cluster with the following nodes:
Admin node:   beluga006
Data node:    beluga007
Standby node: beluga008

# lssam | grep Nominal
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_2-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_3-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online
Notice that the full output was grepped to “Nominal”. This is a trick to shorten the output so that we only see the Nominal states, and soon you will see that it can get quite long otherwise.
Let’s step through the above output:
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
This first line tells us that we have a resource group named SA-nfsserver-rg and it is Online. The Nominal state is also Online, so it is working as expected. By the name, we can tell that this resource group manages the NFS server resources. Typically, this should always be online.
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
Next we have a resource group called db2_bculinux_NLG_beluga006-rg. This is the resource group belonging to the Admin node. We know that because beluga006 is the hostname for the Admin node. Here, we have 1 DB2 partition (the coordinator partition). For every partition, we define a resource group. You’ll see why shortly. The resource group for the admin partition, partition 0, is called db2_bculinux_0-rg.
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_2-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_3-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online
Lastly, we have our data partition group, db2_bculinux_NLG_beluga007-rg. Every data node in a Balanced Warehouse has 4 partitions, and they can be easily seen here.
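The same grepped output can be checked mechanically for resource groups whose current state differs from their Nominal state. This is a sketch under two assumptions: the helper name mismatched_groups is hypothetical, and states are assumed to be single words such as Online or Offline. The last sample line is invented to show a mismatch; the others are copied from the output above.

```shell
# Hypothetical helper: read `lssam` output on standard input and print each
# resource group whose current state differs from its Nominal state.
mismatched_groups() {
    awk '{
        for (i = 2; i < NF; i++) {
            if ($i ~ /^IBM\.ResourceGroup:/ && $(i + 1) ~ /^Nominal=/) {
                nominal = substr($(i + 1), 9)   # text after "Nominal="
                group = substr($i, 19)          # text after "IBM.ResourceGroup:"
                if ($(i - 1) != nominal)
                    print group ": " $(i - 1) " (nominal " nominal ")"
            }
        }
    }'
}

# On a live system you would use: lssam | mismatched_groups
sample="Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
        '- Offline IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online"

printf '%s\n' "$sample" | mismatched_groups
```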
Now, let us examine the full lssam output. Try to find each of the lines from the grepped output in the full output:
# lssam
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
        |- Online IBM.AgFileSystem:shared_db2home
                |- Online IBM.AgFileSystem:shared_db2home:beluga006
                '- Offline IBM.AgFileSystem:shared_db2home:beluga008
        |- Online IBM.AgFileSystem:varlibnfs
                |- Online IBM.AgFileSystem:varlibnfs:beluga006
                '- Offline IBM.AgFileSystem:varlibnfs:beluga008
        |- Online IBM.Application:SA-nfsserver-server
                |- Online IBM.Application:SA-nfsserver-server:beluga006
                '- Offline IBM.Application:SA-nfsserver-server:beluga008
        '- Online IBM.ServiceIP:SA-nfsserver-ip-1
                |- Online IBM.ServiceIP:SA-nfsserver-ip-1:beluga006
                '- Offline IBM.ServiceIP:SA-nfsserver-ip-1:beluga008
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_0-rs
                   |- Online IBM.Application:db2_bculinux_0-rs:beluga006
                   '- Offline IBM.Application:db2_bculinux_0-rs:beluga008
                |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0000-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0000-rs:beluga006
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0000-rs:beluga008
                '- Online IBM.ServiceIP:db2ip_172_16_10_228-rs
                    |- Online IBM.ServiceIP:db2ip_172_16_10_228-rs:beluga006
                    '- Offline IBM.ServiceIP:db2ip_172_16_10_228-rs:beluga008
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga007-rg Nominal=Online
        |- Online IBM.ResourceGroup:db2_bculinux_1-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_1-rs
                    |- Online IBM.Application:db2_bculinux_1-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_1-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0001-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0001-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0001-rs:beluga008
        |- Online IBM.ResourceGroup:db2_bculinux_2-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_2-rs
                    |- Online IBM.Application:db2_bculinux_2-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_2-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0002-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0002-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0002-rs:beluga008
        |- Online IBM.ResourceGroup:db2_bculinux_3-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_3-rs
                    |- Online IBM.Application:db2_bculinux_3-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_3-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0003-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0003-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0003-rs:beluga008
        '- Online IBM.ResourceGroup:db2_bculinux_4-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_4-rs
                    |- Online IBM.Application:db2_bculinux_4-rs:beluga007
                    '- Offline IBM.Application:db2_bculinux_4-rs:beluga008
                '- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0004-rs
                    |- Online IBM.Application:db2mnt-db2fs_bculinux_NODE0004-rs:beluga007
                    '- Offline IBM.Application:db2mnt-db2fs_bculinux_NODE0004-rs:beluga008

Let us take a look at the NFS resource group:


Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
        |- Online IBM.AgFileSystem:shared_db2home
             |- Online IBM.AgFileSystem:shared_db2home:beluga006
             '- Offline IBM.AgFileSystem:shared_db2home:beluga008
The first line was what we had seen before (lssam | grep Nom). Now, we can see what resources actually form the resource group. This first resource is of type AgFileSystem and represents the db2home mount. We can see that it can exist on beluga006 and beluga008, and that it is Online in beluga006 and Offline in beluga008.
Similarly, for the admin node, we can now see the individual resources:
Online IBM.ResourceGroup:db2_bculinux_NLG_beluga006-rg Nominal=Online
        '- Online IBM.ResourceGroup:db2_bculinux_0-rg Nominal=Online
                |- Online IBM.Application:db2_bculinux_0-rs
                   |- Online IBM.Application:db2_bculinux_0-rs:beluga006
                   '- Offline IBM.Application:db2_bculinux_0-rs:beluga008
The first two lines were part of the previous grepped output, but now we can see an Application resource. You can see similar results for the data node and each of its 4 data partitions. Each of these resources exists on two nodes (beluga006 and beluga008) for high availability. If beluga006 were to fail, TSA would move all the resources that are currently Online there to beluga008. You would then see that they are Offline on beluga006 and Online on beluga008. This output is therefore useful for determining on which nodes the resources exist.
The lssam command also shows Equivalencies as part of the output. I will include it for the sake of completeness, but we will discuss this later on:
Online IBM.Equivalency:SA-nfsserver-nieq-1
        |- Online IBM.NetworkInterface:bond0:beluga006
        '- Online IBM.NetworkInterface:bond0:beluga008
Online IBM.Equivalency:db2_FCM_network
        |- Online IBM.NetworkInterface:bond0:beluga006
        |- Online IBM.NetworkInterface:bond0:beluga007
        '- Online IBM.NetworkInterface:bond0:beluga008
Online IBM.Equivalency:db2_bculinux_0-rg_group-equ
        |- Online IBM.PeerNode:beluga006:beluga006
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_1-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_2-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_3-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_4-rg_group-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_NLG_beluga006-equ
        |- Online IBM.PeerNode:beluga006:beluga006
        '- Online IBM.PeerNode:beluga008:beluga008
Online IBM.Equivalency:db2_bculinux_NLG_beluga007-equ
        |- Online IBM.PeerNode:beluga007:beluga007
        '- Online IBM.PeerNode:beluga008:beluga008
The lssam command also lets you limit the output to a particular resource group, with the -g option:
# lssam -g SA-nfsserver-rg
Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
        |- Online IBM.AgFileSystem:shared_db2home
                |- Online IBM.AgFileSystem:shared_db2home:beluga006
                '- Offline IBM.AgFileSystem:shared_db2home:beluga008
        |- Online IBM.AgFileSystem:varlibnfs
                |- Online IBM.AgFileSystem:varlibnfs:beluga006
                '- Offline IBM.AgFileSystem:varlibnfs:beluga008
        |- Online IBM.Application:SA-nfsserver-server
                |- Online IBM.Application:SA-nfsserver-server:beluga006
                '- Offline IBM.Application:SA-nfsserver-server:beluga008
        '- Online IBM.ServiceIP:SA-nfsserver-ip-1
                |- Online IBM.ServiceIP:SA-nfsserver-ip-1:beluga006
                '- Offline IBM.ServiceIP:SA-nfsserver-ip-1:beluga008
With the Smart Analytics System, some new commands were introduced to make it easier to monitor and use TSA with DB2:
Table 2. Useful Commands
Command: Definition
hals: shows the HA status summary for all DB2 partitions
hachknode: shows the status of the node in the domain and details about the private and public networks
hastartdb2: starts DB2 partition resources
hastopdb2: stops DB2 partition resources
hafailback: moves partitions back to the primary machine specified in the primary_machine argument
hafailover: moves partitions off of the primary machine specified in the primary_machine argument to its standby
hareset: attempts to reset pending, failed, or stuck resource states

Stopping and Starting Resources

If you want to stop or start the DB2 service, you need to stop or start the respective DB2 resource groups using TSA commands. TSA will then start or stop DB2.
The command to do this is chrg. To stop a resource group named db2_bculinux_NLG_beluga007, issue the command,
chrg -o offline -s "Name == 'db2_bculinux_NLG_beluga007'"
Similarly, to start the resource group
chrg -o online -s "Name == 'db2_bculinux_NLG_beluga007'"
You can also stop/start all resources at the same time:
chrg -o online -s "1=1"
The Smart Analytics System also has some pre-configured commands:
hastartdb2 and hastopdb2
These two commands, however, are specific to DB2 and if there has been customization to TSA, they may not stop/start all resources.
If TSA has pre-configured rules/dependencies, they will ensure that resources are stopped and started in the correct order. For example, DB2 resources that depend on NFS will not start if the NFS share is Offline.
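The selection string passed to chrg -s is easy to mistype, and it must use plain ASCII hyphens and quotes to survive the shell. The sketch below builds the command line for a given state and resource group and prints it for review instead of running it. The function name chrg_command is hypothetical; the group name is the one used in the example above.

```shell
# Hypothetical helper: print the chrg command line that sets the nominal
# state (online or offline) of one resource group.
chrg_command() {
    state="$1"
    group="$2"
    case "$state" in
        online|offline) ;;
        *) echo "state must be online or offline, got: $state" >&2
           return 1 ;;
    esac
    # Outer double quotes protect the whole selection string from the shell;
    # inner single quotes delimit the group name inside the selection string.
    printf "chrg -o %s -s \"Name == '%s'\"\n" "$state" "$group"
}

chrg_command offline db2_bculinux_NLG_beluga007
chrg_command online db2_bculinux_NLG_beluga007
```

Printing the command first gives you a chance to confirm the group name against the lssam output before changing anything on a production cluster.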

TSA Components

Now that you understand the basics of Tivoli System Automation, we can discuss some of the other components that it can manage.

Service IP

A service IP is a virtual, floating resource attached to a network device. Essentially, it is an IP address that can move from one machine to another, in the event of a failover. Service IPs play a key role in a highly available environment. Because they move from a failed machine to a standby, they allow an application to reconnect to the new machine using the same IP address – as if the original server had simply restarted.
The following command will allow you to view what service IPs have been configured for your system.
# lsrsrc -Ab IBM.ServiceIP
    Resource Persistent and Dynamic Attributes for IBM.ServiceIP
    resource 1:
     Name              = "db2ip_10_160_20_210-rs"
     ResourceType      = 0
     AggregateResource = "0x2029 0xffff 0x414c690c 0x7cc2abfa 0x919b42d5 0xbf62ab75"
     IPAddress         = "10.160.20.210"
     NetMask           = "255.255.255.0"
     ProtectionMode    = 1
     NetPrefix         = 0
     ActivePeerDomain  = "bcudomain"
     NodeNameList      = {"t6udb3a"}
     OpState           = 2
     ConfigChanged     = 0
     ChangedAttributes = {}
    resource 2:
     Name              = "db2ip_10_160_20_210-rs"
     ResourceType      = 0
     AggregateResource = "0x2029 0xffff 0x414c690c 0x7cc2abfa 0x919b42d5 0xbf62ab75"
     IPAddress         = "10.160.20.210"
     NetMask           = "255.255.255.0"
     ProtectionMode    = 1
     NetPrefix         = 0
     ActivePeerDomain  = "bcudomain"
     NodeNameList      = {"t6udb1a"}
     OpState           = 1
     ConfigChanged     = 0
     ChangedAttributes = {}
    resource 3:
     Name              = "db2ip_10_160_20_210-rs"
     ResourceType      = 1
     AggregateResource = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
     IPAddress         = "10.160.20.210"
     NetMask           = "255.255.255.0"
     ProtectionMode    = 1
     NetPrefix         = 0
     ActivePeerDomain  = "bcudomain"
     NodeNameList      = {"t6udb1a","t6udb3a"}
     OpState           = 1
     ConfigChanged     = 0
     ChangedAttributes = {}
The above example shows three resources with the same name, db2ip_10_160_20_210-rs. The NodeNameList parameter tells us which node(s) each resource refers to. The first two resources are fixed resources (ResourceType 0), one per node. The first has OpState 2 which, according to the OpState values listed in Table 3, means the service IP is Offline on t6udb3a, so that node is not hosting the address at the moment. The second has OpState 1 (Online), which tells us that t6udb1a is where the service IP is currently active. The third resource has ResourceType 1 and contains both nodes in its NodeNameList parameter, which tells TSA that this is a floating resource between those two nodes.
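To find the node that currently holds a service IP without reading every attribute, the per-node entries can be filtered. This sketch takes OpState 1 to mean Online, as listed in Table 3, and only looks at fixed resources (ResourceType 0). The helper name active_service_ip_node is hypothetical, and the sample is a shortened copy of the output above.

```shell
# Hypothetical helper: read `lsrsrc -Ab IBM.ServiceIP` output on standard
# input and print the node list of each fixed resource that is Online.
active_service_ip_node() {
    awk -F' = ' '
        { key = $1; gsub(/ /, "", key) }
        key == "ResourceType" { rtype = $2 + 0 }
        key == "NodeNameList" { nodes = $2 }
        key == "OpState" && rtype == 0 && ($2 + 0) == 1 {
            gsub(/[{}" ]/, "", nodes)
            print nodes
        }'
}

# Shortened copy of the listing above; on a live system you would use:
#   lsrsrc -Ab IBM.ServiceIP | active_service_ip_node
sample='resource 1:
     ResourceType      = 0
     NodeNameList      = {"t6udb3a"}
     OpState           = 2
resource 2:
     ResourceType      = 0
     NodeNameList      = {"t6udb1a"}
     OpState           = 1
resource 3:
     ResourceType      = 1
     NodeNameList      = {"t6udb1a","t6udb3a"}
     OpState           = 1'

printf '%s\n' "$sample" | active_service_ip_node
```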

Application Resources

TSA manages resources using scripts. Some scripts are built in (and part of TSA), such as those for controlling DB2. These scripts are responsible for starting, stopping and monitoring the application. Sometimes it can be useful to understand these scripts, or even edit them for problem diagnosis. To find out where they are located, we use the lsrsrc command, which provides us with the complete configuration of a particular resource.
Following is an example:
# lsrsrc -Ab IBM.Application
resource 12:
  Name                  = "db2_dbedw1da_8-rs" 
  ResourceType          = 1
  AggregateResource     = "0x3fff 0xffff 0x00000000 0x00000000 0x00000000 0x00000000"
  StartCommand          = "/usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh dbedw1da 8"
  StopCommand           = "/usr/sbin/rsct/sapolicies/db2/db2V97_stop.ksh dbedw1da 8"
  MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/db2V97_monitor.ksh dbedw1da 8"
  MonitorCommandPeriod  = 60
  MonitorCommandTimeout = 180
  StartCommandTimeout   = 330
  StopCommandTimeout    = 140
  UserName              = "root"
  RunCommandsSync       = 1
  ProtectionMode        = 1
  HealthCommand         = ""
  HealthCommandPeriod   = 10
  HealthCommandTimeout  = 5
  InstanceName          = ""
  InstanceLocation      = ""
  SetHealthState        = 0
  MovePrepareCommand    = ""
  MoveCompleteCommand   = ""
  MoveCancelCommand     = ""
  CleanupList           = {}
  CleanupCommand        = ""
  CleanupCommandTimeout = 10
  ProcessCommandString  = ""
  ResetState            = 0
  ReRegistrationPeriod  = 0
  CleanupNodeList       = {}
  MonitorUserName       = ""
  ActivePeerDomain      = "bcudomain"
  NodeNameList          = {"d8udb11a","d8udb3a"}
  OpState               = 1
  ConfigChanged         = 0
  ChangedAttributes     = {}
  HealthState           = 0
  HealthMessage         = ""
  MoveState             = [32768,{}]
  RegisteredPID         = 0
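Since the full attribute list is long, a filter that keeps only the script locations is convenient. The following sketch extracts the StartCommand, StopCommand and MonitorCommand values; the helper name resource_scripts is hypothetical, and the sample lines are copied from the output above.

```shell
# Hypothetical helper: read `lsrsrc -Ab IBM.Application` output on standard
# input and print only the start, stop and monitor commands.
resource_scripts() {
    awk -F' = ' '
        $1 ~ /^ *(StartCommand|StopCommand|MonitorCommand) *$/ {
            name = $1; gsub(/ /, "", name)
            value = $2; gsub(/"/, "", value); sub(/ *$/, "", value)
            print name ": " value
        }'
}

# Lines copied from the listing above; on a live system you would use:
#   lsrsrc -Ab IBM.Application | resource_scripts
sample='  StartCommand          = "/usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh dbedw1da 8"
  StopCommand           = "/usr/sbin/rsct/sapolicies/db2/db2V97_stop.ksh dbedw1da 8"
  MonitorCommand        = "/usr/sbin/rsct/sapolicies/db2/db2V97_monitor.ksh dbedw1da 8"
  MonitorCommandPeriod  = 60'

printf '%s\n' "$sample" | resource_scripts
```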
Some of the more common and useful attributes are described in Table 3.
Table 3. Resource Attributes
Attribute: Definition
ResourceType: Indicates whether the resource is allowed to run on multiple nodes, or a single node. A fixed resource is identified with a ResourceType value of 0, and a floating resource has a value of 1.
StartCommand: Specifies the command to be run when the resource is started
StopCommand: Specifies the command to be run when the resource is stopped
MonitorCommand: Specifies the command to be run when the resource is being monitored. This happens on a regular interval, and you will likely see this command often when you run the “ps -ef” command.
UserName: The userid that TSA will use to start this resource
NodeNameList: Indicates on which nodes the resource is allowed to run. This is an attribute of an RSCT resource.
OpState: Specifies the operational state of a resource or a resource group. The valid states are,
0 - UNKNOWN
1 - ONLINE
2 - OFFLINE
3 - FAILED_OFFLINE
4 - STUCK_ONLINE
5 - PENDING_ONLINE
6 - PENDING_OFFLINE
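When scripting around lsrsrc output, it can help to translate the numeric OpState into its name. The following is a minimal sketch; the function name opstate_name is hypothetical and not part of TSA, and the mapping is taken from the list above.

```shell
#!/bin/bash
# Hypothetical helper: map a numeric OpState (as listed above) to its name.
opstate_name() {
    case "$1" in
        0) echo "UNKNOWN" ;;
        1) echo "ONLINE" ;;
        2) echo "OFFLINE" ;;
        3) echo "FAILED_OFFLINE" ;;
        4) echo "STUCK_ONLINE" ;;
        5) echo "PENDING_ONLINE" ;;
        6) echo "PENDING_OFFLINE" ;;
        *) echo "INVALID" ;;   # any value outside the documented range
    esac
}

opstate_name 1
opstate_name 3
```

The two calls at the end print ONLINE and FAILED_OFFLINE.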

Network Resources

Every machine typically has an Ethernet adaptor, with a configured network address. TSA is aware of this and you can see how they have been configured with the lsrsrc command. For example,
# lsrsrc -Ab IBM.NetworkInterface
    resource 1:
        Name             = "en0"
        DeviceName       = ""
        IPAddress        = "172.22.1.217"
        SubnetMask       = "255.255.252.0"
        Subnet           = "172.22.0.0"
        CommGroup        = "CG1"
        HeartbeatActive  = 1
        Aliases          = {}
        DeviceSubType    = 6
        LogicalID        = 0
        NetworkID        = 0
        NetworkID64      = 0
        PortID           = 0
        HardwareAddress  = "00:21:5e:a3:be:60"
        DevicePathName   = ""
        IPVersion        = 4
        Role             = 0
        ActivePeerDomain = "bcudomain"
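
To get a quick overview of many interfaces, the Name and IPAddress attributes can be pulled out of this output. The following is a sketch only; list_interfaces is a hypothetical helper, and it assumes the attribute layout shown above.

```shell
#!/bin/bash
# Hypothetical helper: read lsrsrc-style output on stdin and print
# "interface address" pairs from the Name and IPAddress attributes.
list_interfaces() {
    awk -F'"' '
        $1 ~ /^[ \t]*Name[ \t]*=/      { name = $2 }         # remember the interface name
        $1 ~ /^[ \t]*IPAddress[ \t]*=/ { print name, $2 }    # print it with its address
    '
}

# Sample input copied from the output above. On a real node you would
# pipe "lsrsrc -Ab IBM.NetworkInterface" into the function instead.
list_interfaces <<'EOF'
    resource 1:
        Name             = "en0"
        DeviceName       = ""
        IPAddress        = "172.22.1.217"
EOF
```

For the sample input this prints "en0 172.22.1.217".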

Log Files

It is important to be aware of the log files that TSA actively writes to:
  1. History file – this logs the commands that were sent to TSA
    /var/ct/IBM.RecoveryRM.log2
  2. Error and monitor logs – these logs are simply the AIX and Linux system logs. They will show you the output of the start, stop, and monitor scripts as well as any diagnostic information coming from TSA. Although the system administrator can configure the location for these logs, they are typically located in the following locations,
    AIX: /tmp/syslog.out
    Linux: /var/log/messages

Command Reference

Table 4 describes the most common commands that a TSA administrator will use.
Table 4. Common TSA Commands
Command: Definition
hals: Display HA configuration summary
hastopdb2: Stop DB2 using TSA
hastartdb2: Start DB2 using TSA
mkequ: Makes an equivalency resource
chequ: Changes a resource equivalency
lsequ: Lists equivalencies and their attributes
rmequ: Removes one or more resource equivalencies
mkrg: Makes a resource group
chrg: Changes persistent attribute values of a resource group (including starting and stopping a resource group)
lsrg: Lists persistent attribute values of a resource group or its resource group members
rmrg: Removes a resource group
mkrel: Makes a managed relationship between resources
chrel: Changes one or more managed relationships between resources
lsrel: Lists managed relationships
rmrel: Removes a managed relationship between resources
samctrl: Sets the IBM TSA control parameters
lssamctrl: Lists the IBM TSA controls
addrgmbr: Adds one or more resources to a resource group
chrgmbr: Changes the persistent attribute value(s) of a managed resource in a resource group
rmrgmbr: Removes one or more resources from the resource group
lsrgreq: Lists outstanding requests applied against resource groups or managed resources
rgmbrreq: Requests a managed resource to be started or stopped, or cancels the request
rgreq: Requests a resource group to be started, stopped, or moved, or cancels the request
lssam: Lists the defined resource groups and their members in a tree format

Command Tips

Following are some useful commands with examples.
Show relationships/dependencies:
lsrel | sort
Show details for a specific relationship:
# lsrel -A b -s "Name = 'db2_bculinux_0-rs_DependOn_db2_bculinux_qp-rel'"
Managed Relationship 1:
        Class:Resource:Node[Source] = IBM.Application:db2_bculinux_qp
        Class:Resource:Node[Target] = {IBM.Application:db2_bculinux_0-rs}
        Relationship                = DependsOn
        Conditional                 = NoCondition
        Name                        = db2_bculinux_0-rs_DependOn_db2_bculinux_qp-rel
        ActivePeerDomain            = bcudomain
        ConfigValidity              =
Delete/remove a relationship:
rmrel -s "Name like 'db2_bculinux_%-rs_DependsOn_db2_bculinux_0-rs-rel'"
Change a resource attribute:
chrsrc -s "Name=='"  attribute=value
Example:
chrsrc -s "Name=='db2ip_10_160_10_27-rs'" IBM.ServiceIP NetMask='255.255.255.0'
To save current SAMP policy information:
sampolicy -s /tmp/sampolicy.current.xml
To check if the policy in the input file is valid:
sampolicy -c /tmp/sampolicy.current.xml
To activate it:
sampolicy -a /tmp/sampolicy.current.xml

Troubleshooting

This section describes methods that can be used to determine the cause of a particular problem or failure. Though techniques vary depending on the type of problem, the following should be a good starting point for most issues.
Resolving FAILED OFFLINE status
A FAILED OFFLINE status prevents you from setting the nominal status to ONLINE, so the affected resources must first be reset to OFFLINE before they can be brought back ONLINE. Make sure that the nominal status shows OFFLINE before resolving it.
To resolve the FAILED OFFLINE status, use the resetrsrc command.
resetrsrc -s 'Name = "db2whse_appinstance_01.abxplatform_server1"' IBM.Application
resetrsrc -s 'Name = "db2whse_appinstance_01.adminconsole_server1"' IBM.Application
Recovery from a failed failover attempt
Take all TSA resources offline. The lssam output should reflect “Offline” for all resources before you attempt to bring them back online. To reset NFS resources, use:
resetrsrc -s "Name like 'SA-nfsserver-%'" IBM.Application (if necessary)
resetrsrc -s "Name like 'SA-nfsserver-%'" IBM.ServiceIP (if necessary)
When testing goes wrong, you are often left with resources in various states such as online, offline, and unknown. When the state of a resource is unknown, before attempting to restart it, you must issue resetrsrc for that particular resource.
When you are restarting DB2, you must verify that all the resources are offline before attempting to bring them online again. You must also correct the db2nodes.cfg file. Make sure you have backup copies of db2nodes.cfg and db2ha.sys.
NFS mounts stop functioning
In testing the NFS failover, we were able to move the server over successfully, but the existing NFS client mounts stopped functioning. We solved this problem by unmounting and remounting the NFS volume.
Resolving Binding=Sacrificed
To resolve this problem you have to look at the overall cluster and how it is set up and defined. The issues that cause this typically have a cluster-wide impact rather than affecting one specific resource.
  1. Check for failed relationships by listing the relationships with the command "lsrel -Ab", and then determine whether one or more of the relationships relating to the failed resource group have not been satisfied.
  2. Check for failed equivalencies by listing them with the command "lsequ -Ab", and then determine whether one or more of the equivalencies have not been satisfied.
  3. Check your resource group attributes and look for anything that may be set incorrectly. Some of the commands to use are listed as follows:
    lsrg -Ab -g 
    lsrsrc -s 'Name="failed_resource"' -Ab IBM.
    lsrg -m -g 
    samdiag -g <resource_group_name>
  4. Check for anything specific to your configuration that all of the sacrificed resources share in common, like a mount point, a database instance, a virtual IP.
Check hardware configuration:
dmesg – check initialization errors
date – check server synchronization
ifconfig – check network adapters
netstat -I – check network configuration
ps -ef | grep inetd – list the matching running processes, including user and PID
Resource state is unknown
Try resetting the resource using the resetrsrc command:
resetrsrc -s "Name like 'db2_db2inst2_%'" IBM.Application
resetrsrc -s "Name like 'db2_db2inst2_%'" IBM.ServiceIP
Timeout values for resources
For the health query interval of each resource, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application MonitorCommandPeriod=300
For the health query timeout, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application MonitorCommandTimeout=290
For the resource startup script timeout, use
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application StartCommandTimeout=300
For the Resource Stop script timeout, use:
chrsrc -s 'Name like "db2_db2inst2%"' IBM.Application StopCommandTimeout=720
Recycling the automation manager
If the problem is most likely related to the automation manager, you should try recycling the automation manager (IBM.RecoveryRM) before contacting IBM support. This can be done using the following commands:
Find out on which node the RecoveryRM master daemon is running:
# lssrc -ls IBM.RecoveryRM | grep Master
On the node running the master, retrieve the PID and kill the automation manager:
# lssrc -ls IBM.RecoveryRM | grep PID
# kill -9 
As a result, an automation manager on another node in the domain will take over the master role and proceed with making automation decisions. The subsystem will restart the killed automation manager immediately.
Resolving lssam hangs
http://www-01.ibm.com/support/docview.wss?uid=swg21293701
Move to another node in the same HA group and see if you can run the lssam command. If you can, go back to the original node to see if you can now do the lssam command. If this still does not work, then run the following commands:
lssrc -ls IBM.RecoveryRM | grep -i master 
lssrc -ls IBM.GblResRM | grep -i leader
Make sure that neither command output names the “hanging” node. If that is the case, reboot just that node and see whether the issue is resolved.
AVOID the following (DON’Ts)
  • Do not use rpower -a, or rpower on more than one node in the same HA group when SAMP HA is up and running.
  • Do not offline HA-NFS using a sudo command while logged in as the instance owner and while in the /db2home directory. HA-NFS will get stuck online, and the RecoveryRM daemon has to be killed on the master. If RecoveryRM will not start, reboot may be required.
  • Do not use ifdown to bring down a network interface. This causes the eth (or en) device to be deleted from the equivalency membership, and you will then have to add the "eth" device (in Linux) or "en" device (in AIX) back into the network equivalency using the chequ command.
  • Do not manipulate any BW resources that are under active SAMP control.
    Turn automation off (samctrl -M T) before manipulating these BW resources.
  • Do not implement changes to the SA MP policy unless exhaustive testing of the HA test cases is completed.
Check the following frequently (DOs)
  • Ensure the /home and /db2home directories are always mounted before starting up a node.
  • Check for process ids that may be blocking stop, start and monitor commands.
  • Save backup copies of the db2nodes.cfg and db2ha.sys files.
  • Save the backup copies of the current SAMP policy before and after every SAMP change. Compare the current SAMP policy to the backup SAMP policy every time there is an HA incident.
  • Save backup copies of db2pd -ha output before and after every SAMP change. Compare the current db2pd outputs to the backup db2pd outputs every time there is an HA incident.
  • Save backup copies of the samdiag outputs. 
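The policy comparison recommended above can be scripted. This is a minimal sketch, assuming both policies were saved earlier with sampolicy -s; compare_policy is a hypothetical helper name, and a byte-for-byte comparison may flag differences that are not meaningful.

```shell
#!/bin/bash
# Hypothetical helper: report whether two saved policy files differ.
# Usage: compare_policy BACKUP_FILE CURRENT_FILE
compare_policy() {
    # cmp -s prints nothing and returns 0 only when the files are identical
    if cmp -s "$1" "$2"; then
        echo "unchanged"
    else
        echo "changed"
    fi
}
```

For example, after saving the current policy to /tmp/sampolicy.current.xml, you could run compare_policy against an earlier backup file (the backup file name is up to you) and investigate further with diff if the result is "changed".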

Friday, 28 March 2014

A Simple Way to Send Multiple Line Commands Over SSH

Below are three methods to send multiple line commands over SSH. The first method is a quick overview of running remote commands over SSH, the second method uses the bash command to run remote commands over SSH, and the third method uses HERE documents to run remote commands over SSH. Each has its limitations, which I will cover.

Running Remote Commands Over SSH

To run one command on a remote server over SSH:
ssh $HOST ls
To run two commands on a remote server over SSH:
ssh $HOST 'ls; pwd'
To run the third, fourth, fifth, etc. commands on a remote server over SSH keep appending commands with a semicolon inside the single quotes.

But, what if you want to remotely run many more commands, or if statements, or while loops, etc., and make it all readable?
#!/bin/bash
ssh $HOST '
ls

pwd

if true; then
    echo "This is true"
else
    echo "This is false"
fi

echo "Hello world"
'
The above shell script works but begins to break if local variables are added.

For example, the following shell script will run, but the local variable HELLO will not be parsed inside the remote if statement:
#!/bin/bash

HELLO="world"

ssh $HOST '
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"
'
In order to expand the local variable HELLO so it is used in the remote if statement, read on to the next section.
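
The effect of the quoting can be reproduced locally. In this sketch, sh -c stands in for the remote shell (an assumption made so the example runs without a server), and HELLO is assumed not to be exported.

```shell
#!/bin/bash
HELLO="world"

# Single quotes: the text reaches the second shell unchanged, and that
# shell has no HELLO variable, so an empty line is printed.
sh -c 'echo "$HELLO"'

# Double quotes: the local shell substitutes the value first, so the
# second shell simply runs: echo "world"
sh -c "echo \"$HELLO\""
```

The first command prints an empty line and the second prints world, which mirrors what happens to the single-quoted ssh script above.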

Using SSH with the BASH Command

As mentioned above, in order to expand the local variable HELLO so it is used in the remote if statement, the bash command can be used:
#!/bin/bash

HELLO="world"

ssh $HOST bash -c "'
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"
'"
Perhaps you want to use a remote sudo command within the shell script:
#!/bin/bash

HELLO="world"

ssh $HOST bash -c "'
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"

sudo ls /root
'"
When the above shell script is run, everything will work as intended until the remote sudo command which will throw the following error:
sudo: sorry, you must have a tty to run sudo
This error is thrown because the remote sudo command is prompting for a password which needs an interactive tty/shell. To force a pseudo interactive tty/shell, add the -t command line switch to the ssh command:
#!/bin/bash

HELLO="world"

ssh -t $HOST bash -c "'
ls

pwd

if true; then
    echo $HELLO
else
    echo "This is false"
fi

echo "Hello world"

sudo ls /root
'"
With a pseudo interactive tty/shell available, the remote sudo command’s password prompt will be displayed, the remote sudo password can then be entered, and the contents of the remote root’s home directory will be displayed.

However, I recently needed to run two specific remote sed commands over SSH: one to find a line and delete it together with the subsequent three lines, and another to find a line and insert a line of text above it. I naturally tried the bash method mentioned above:
#!/bin/bash

ssh $HOST bash -c "'
cat << EOFTEST1 > /tmp/test1
line one
line two
line three
line four
EOFTEST1

cat << EOFTEST2 > /tmp/test2
line two
EOFTEST2

sed -i -e '/line one/,+3 d' /tmp/test1

sed -i -e '/^line two$/i line one' /tmp/test2
'"
Every time I ran the above shell script, I got the following error:
sed: -e expression #1, char 5: unterminated address regex
However, the same commands work when run by themselves:
ssh $HOST "sed -i -e '/line one/,+3 d' /tmp/test1"
ssh $HOST "sed -i -e '/^line two$/i line one' /tmp/test2"
I thought the problem may be because of single quotes within single quotes. The bash command above requires everything to be wrapped in single quotes and a sed command requires the regular expression to be wrapped in single quotes as well. As mentioned in the BASH manual, “a single quote may not occur between single quotes, even when preceded by a backslash”.

However, I debunked this single quote theory being my problem because running a simple remote sed search and replace command inside of the bash command worked just fine:
#!/bin/bash

ssh $HOST bash -c "'

echo "Hello" >> /tmp/test3

sed -i -e 's/Hello/World/g' /tmp/test3
'"
I can only assume the problem with the specific remote sed commands is something with the syntax that I have not yet figured out.
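
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the single quotes may be the cause after all. ssh joins its arguments with spaces and the remote shell parses the resulting string again, so a single quote inside the sed expression closes the quote opened after bash -c. An expression without spaces, such as s/Hello/World/g, still rejoins into one word, but '/line one/,+3 d' is cut at its first space. That would leave sed with just /line, which is five characters long and fits the "char 5: unterminated address regex" message. The example below reproduces the splitting locally; fake_ssh is a hypothetical stand-in for ssh, assumed to mimic how ssh hands a command to the remote shell.

```shell
#!/bin/bash
# Hypothetical local stand-in for ssh: like ssh, it joins its arguments
# with spaces and lets a second shell parse the resulting string.
fake_ssh() { sh -c "$*"; }

# Expression without spaces: the quoted pieces rejoin into one word.
no_space=$(fake_ssh sh -c "'printf %s 's/Hello/World/g''")

# Expression with spaces: the inner quote ends the outer one, so the
# script passed to -c stops at the first space inside the expression.
with_space=$(fake_ssh sh -c "'printf %s '/line one/,+3 d''")

echo "$no_space"     # prints: s/Hello/World/g
echo "$with_space"   # prints: /line
```

If this is indeed the cause, it would also explain why the HERE document method in the next section works: there the script is not wrapped in an outer pair of single quotes.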

Despite all this, I eventually figured out that the specific remote sed commands I wanted to run would work when using SSH with HERE documents.

Using SSH with HERE Documents

As mentioned above, the specific remote sed commands I wanted to run did work when using SSH with HERE documents:
ssh $HOST << EOF
cat << EOFTEST1 > /tmp/test1
line one
line two
line three
line four
EOFTEST1

cat << EOFTEST2 > /tmp/test2
line two
EOFTEST2

sed -i -e '/line one/,+3 d' /tmp/test1

sed -i -e '/^line two$/i line one' /tmp/test2
EOF
Despite the remote sed commands working, the following warning message was thrown:
Pseudo-terminal will not be allocated because stdin is not a terminal.
To stop this warning message from appearing, add the -T command line switch to the ssh command to disable pseudo-tty allocation (a pseudo-terminal can never be allocated when using HERE documents because it is reading from standard input):
ssh -T $HOST << EOF
cat << EOFTEST1 > /tmp/test1
line one
line two
line three
line four
EOFTEST1

cat << EOFTEST2 > /tmp/test2
line two
EOFTEST2

sed -i -e '/line one/,+3 d' /tmp/test1

sed -i -e '/^line two$/i line one' /tmp/test2
EOF
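One detail worth knowing when using this method: with an unquoted delimiter such as EOF, the local shell still expands variables inside the HERE document before it is sent. Quoting the delimiter ('EOF') sends the text unchanged. The sketch below shows the difference locally, with sh standing in for ssh $HOST (an assumption so that it runs without a server); send_unquoted and send_quoted are hypothetical names, and HELLO is assumed not to be exported.

```shell
#!/bin/bash
HELLO="world"

# Unquoted delimiter: the local shell expands $HELLO before sending.
send_unquoted() {
sh << EOF
echo "unquoted: $HELLO"
EOF
}

# Quoted delimiter: the text is sent as-is; the receiving shell has no
# HELLO variable, so nothing follows the colon.
send_quoted() {
sh << 'EOF'
echo "quoted: $HELLO"
EOF
}

send_unquoted
send_quoted
```

The first call prints "unquoted: world"; the second prints "quoted: " with nothing after it.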
With this working, I later discovered remote sudo commands that require a password prompt will not work with HERE documents over SSH.
ssh $HOST << EOF
sudo ls /root
EOF
The above ssh command will throw the following error if the SSH user you are logging into requires a password when using the remote sudo command:
Pseudo-terminal will not be allocated because stdin is not a terminal.
user@host's password: 
sudo: no tty present and no askpass program specified
However, the remote sudo command will work if the SSH user’s sudo settings allow that user to use sudo without a password by setting user ALL=(ALL) NOPASSWD:ALL in /etc/sudoers.

References

What’s the Cleanest Way to SSH and Run Multiple Commands in Bash?
Chapter 19. Here Documents

Saturday, 8 March 2014

A Quick Guide To Unix Shell Scripting

1) what is shell script ?

Normally shells are interactive: the shell accepts commands from you (via the keyboard) and executes them. But if you use the same sequence of commands repeatedly, you can store this sequence in a text file and tell the shell to execute the file instead of entering the commands one by one. This is known as a shell script.
A shell script is defined as:

"A shell script is a series of commands written in a plain text file. A shell script is just like a batch file in MS-DOS but has more power than the MS-DOS batch file."

2)  Shebang:

Naturally, a shell script should start with a line such as the following:
#!/bin/bash
This indicates that the script should be run in the bash shell regardless of which interactive shell the user has chosen. This is very important, since the syntax of different shells can vary greatly.

3) How to write shell script ?

Now I will walk you through how to write a shell script, execute it, and so on. We will get started by writing a small shell script that prints "Hello UnixMantra" on the screen.
The following steps are required to write a shell script:
(1) Use any editor like vi or exedit to write the shell script.

(2) After writing shell script set execute permission for your script as follows

syntax: chmod permission your-script-name

Examples:
$ chmod +x your-script-name
$ chmod 755 your-script-name

Note: 755 sets read, write and execute (7) permission for the owner, and read and execute only (5) for the group and others.

(3) Execute your script as

syntax: bash your-script-name
sh your-script-name
./your-script-name

Examples:
$ bash bar
$ sh bar
$ ./bar

NOTE: In the last syntax, ./ means the current directory. A single . (dot) on its own means execute the given command file in the current shell without starting a new copy of the shell. The syntax for the . (dot) command is as follows:

Syntax: . command-name

Example:
$ . foo

Now you are ready to write first shell script that will print "Hello UnixMantra" on screen. See the common vi command list , if you are new to vi.

$ vi firstscript
#
# My first shell script
#
clear
echo "Hello UnixMantra"

After saving the above script, you can run the script as follows:
$ ./firstscript

This will not run the script, since we have not yet set execute permission for it. To do so, type:

$ chmod 755 firstscript
$ ./firstscript

4) Commenting Commands:

Any line beginning with a hash '#' character in the first column is taken to be a comment and is ignored. The only exception is the first line (shebang #!/bin/sh)  in the file, where the comment is used to indicate which shell should be used.

5) Shell Variables:

Like every programming language, shells support variables. Shell variables may be assigned values, manipulated, and used. Some variables are automatically assigned for use by the shell.
There are two types of variables:

(1) System variables - Created and maintained by the Unix OS itself. By convention, these variables are named in CAPITAL LETTERS.
(2) User defined variables (UDV) - Created and maintained by the user. By convention, these variables are named in lower-case letters.
You define a variable as follows:
Y="hello"
and refer to it as follows:
$Y
More specifically, $Y is used to denote the value of the variable Y.

$ no=10
# this is ok
$ 10=no
# Error, NOT Ok, Value must be on right side of = sign.

To define variable called 'vech' having value car
$ vech=car

To define variable called n having value 10
$ n=10
Caution: Do not modify system variables; this can sometimes create problems.
You can print the value of a variable using "echo" or "print":
# echo "$Y"
hello
I suggest always using curly braces {} to delimit variable names, so that the shell knows exactly where the name ends.
Eg:
# X=Hello
# echo "$XWorld"
The above command prints only an empty line, because the shell looks for a variable named "XWorld" rather than "X". We can avoid this embarrassing situation by using curly braces.
# echo "${X}World"
HelloWorld

6)  Analysing  quotes:

There are three types of quotes

Quotes   Name            Meaning
"        Double quotes   Characters enclosed in double quotes lose their special meaning (except \, $ and `).
'        Single quotes   Text enclosed in single quotes remains unchanged.
`        Back quote      The enclosed command is executed and replaced by its output.
Eg:
MY_VALUE=Hello
$ echo '$MY_VALUE'
$MY_VALUE
$ echo "$MY_VALUE"
Hello
$ echo "Today is date"
Today is date
This does not print today's date, because date is treated as plain text.
$ echo "Today is `date`"
Today is Fri Mar 07 15:35:08 EDT 2014

7)  Conditional Statement:

if or elif
Conditionals are used where an action is appropriate only under certain circumstances. The most frequently used conditional operator is the if-statement. For example, the script below displays the contents of a file on the screen using cat, but lists the contents of a directory using ls.
#!/bin/sh
# show script
# quoting "$1" keeps names that contain spaces in one piece
if [ -d "$1" ]
then
  ls "$1"
else
  cat "$1"
fi
Here, we notice a number of points:

  • The if-statement begins with the keyword if, and ends with the keyword fi (if, reversed).
  • The if keyword is followed by a condition, which is enclosed in square brackets. In this case, the condition -d $1 may be read as: if $1 is a directory.
  • The line after the if keyword contains the keyword then.
  • Optionally, you may include an else keyword.
If the condition is satisfied (in this case, if $1 is a directory) then the commands between the then and else keywords are executed; if the condition isn't satisfied then the commands between the else and fi keywords are executed. If an else keyword isn't included, then the commands between the then and fi keywords are executed if the condition is true; otherwise the whole section is skipped.

Type 1: to run a simple test

if condition
then
    statement1
    statement2
    ..........
fi

Type 2: to specify an alternate action when the condition fails

if condition
then
    statement1
    statement2
    ..........
else
    statement3
fi

Type 3: to test for another condition if the first "if" fails. Note that any number of elifs can be added.

if condition1
then
    statement1
    statement2
    ..........
elif condition2
then
    statement3
    statement4
    ........
elif condition3
then
    statement5
    statement6
    ........
fi

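As a concrete instance of the elif form, here is a small hypothetical example (the function name classify is made up for illustration) that sorts a number into one of three groups:

```shell
#!/bin/sh
# Hypothetical example: classify a number with if / elif / else.
classify() {
    if [ "$1" -lt 0 ]; then
        echo "negative"
    elif [ "$1" -eq 0 ]; then
        echo "zero"
    else
        echo "positive"
    fi
}

classify -5
classify 0
classify 7
```

The three calls print negative, zero and positive. The -lt and -eq operators used here belong to the test command, which is described next.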
The Test Command and Operators
The command used in conditionals nearly all the time is the test command. Test returns true or false (more accurately, exits with 0 or non zero status) depending respectively on whether the test is passed or failed. It works like this:
test operand1 operator operand2
For some tests, only one operand (operand2) is needed. The test command is typically abbreviated in this form:
[ operand1 operator operand2 ]
To bring this discussion back down to earth, we give a few examples:
#!/bin/bash
X=3
Y=4
empty_string=""
if [ $X -lt $Y ] # is $X less than $Y ? 
then
 echo "\$X=${X}, which is smaller than \$Y=${Y}"
fi

if [ -n "$empty_string" ]; then
 echo "empty string is non_empty"
fi

if [ -e "${HOME}/.surya" ]; then    # test to see if ~/.surya exists
 echo "you have a .surya file"
 if [ -L "${HOME}/.surya" ]; then   # is it a symlink ?  
  echo "it's a symbolic link"
 elif [ -f "${HOME}/.surya" ]; then  # is it a regular file ?
  echo "it's a regular file"
 fi
else
 echo "you have no .surya file"
fi
A brief summary of test operators
Here's a quick list of test operators. It's by no means comprehensive, but it's likely to be all you'll need to remember (if you need anything else, you can always check the bash manpage).

operator   produces true if...                                                              number of operands
-n         operand has non-zero length                                                      1
-z         operand has zero length                                                          1
-d         there exists a directory whose name is operand                                   1
-f         there exists a file whose name is operand                                        1
-eq        the operands are integers and they are equal                                     2
-ne        the opposite of -eq                                                              2
=          the operands are equal (as strings)                                              2
!=         opposite of =                                                                    2
-lt        operand1 is strictly less than operand2 (both operands should be integers)       2
-gt        operand1 is strictly greater than operand2 (both operands should be integers)    2
-ge        operand1 is greater than or equal to operand2 (both operands should be integers) 2
-le        operand1 is less than or equal to operand2 (both operands should be integers)    2
Case Statements:
The case construct has the following syntax:
case word in
pattern) list ;;
...
esac
An example of this should make things clearer:
#!/bin/sh
case $1
in
1) echo 'First Choice';;
2) echo 'Second Choice';;
*) echo 'Other Choice';;
esac
"1", "2" and "*" are patterns. The word is compared to each pattern in turn, and if a match is found, the body of the corresponding pattern is executed. We have used "*" to match everything; since it is checked last, "1" and "2" are still caught because they are checked first. In our example the word is "$1", the first parameter, so if the script is run with the argument "1" it outputs "First Choice", with "2" it outputs "Second Choice", and with anything else "Other Choice". In this example we compared against numbers (essentially still a string comparison), but the pattern can be more complex; see the sh man page for more information.

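As an illustration of more complex case patterns, the following hypothetical example (file_kind is a made-up name) combines alternatives with | and uses shell wildcards:

```shell
#!/bin/sh
# Hypothetical example: case patterns with alternation (|) and wildcards.
file_kind() {
    case "$1" in
        *.txt|*.md)  echo "text" ;;                 # either extension matches
        *.sh)        echo "script" ;;
        [0-9]*)      echo "starts with a digit" ;;
        *)           echo "other" ;;                # checked last, catches the rest
    esac
}

file_kind notes.txt
file_kind backup.sh
file_kind 2014-report
file_kind Makefile
```

The four calls print text, script, starts with a digit, and other.
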
8) Looping Commands:

Whereas conditional statements allow programs to make choices about what to do, looping commands support repetition. Many scripts are written precisely because some repetitious processing of many files is required, so looping commands are extremely important.

Loops are constructions that enable one to reiterate a procedure or perform the same procedure on several different items. There are the following kinds of loops available in bash

  • for loops
  • while loops
'For' loops
The syntax for the for loops is best demonstrated by example.
#!/bin/bash
for X in red green blue
do
 echo $X
done
The for loop iterates over the space-separated items. Note that if some of the items have embedded spaces, you need to protect them with quotes. Here's an example:
#!/bin/bash
colour1="red"
colour2="light blue"
colour3="dark green"
for X in "$colour1" "$colour2" "$colour3"
do
 echo $X
done
Can you guess what would happen if we left out the quotes in the for statement ? This indicates that variable names should be protected with quotes unless you are pretty sure that they do not contain any spaces.
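To see the difference for yourself, here is a short sketch that loops over the same value with and without quotes:

```shell
#!/bin/bash
colour2="light blue"

# Quoted: the value stays one item, so the loop body runs once.
for X in "$colour2"; do
    echo "quoted: $X"
done

# Unquoted: the shell splits the value at the space, so the loop body
# runs twice, once for "light" and once for "blue".
for X in $colour2; do
    echo "unquoted: $X"
done
```

This prints one "quoted:" line followed by two "unquoted:" lines.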
'While' Loops
While loops iterate "while" a given condition is true. An example of this:
#!/bin/bash
X=0
while [ $X -le 20 ]
do
 echo $X
 X=$((X+1))
done
This raises a natural question: does bash allow C-like for loops? It does, using double parentheses:
for ((X=1; X<10; X++))
Even so, heavy iteration is discouraged: bash is an interpreted language, and a rather slow one at that.

9) Functions:

When a program gets complex, we need to use the divide and conquer technique: whenever a program gets complicated, we divide it into small chunks, which are known as functions.
A function is a series of instructions/commands that performs a particular task in the shell. To define a function, use the following syntax:
Syntax:
           function-name ( )
           {
                command1
                command2
                .....
                ...
                commandN
                return
           }
Where function-name is the name of your function, which executes a series of commands. A return statement terminates the function. Example:
Type SayHello() at the $ prompt as follows:
$ SayHello()
{
    echo "Hello $LOGNAME, Have nice computing"
    return
}
To execute this SayHello() function, just type its name as follows:
$ SayHello
Hello surya, Have nice computing
This is how you call a function.
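Functions can also take arguments, which are available inside the function as $1, $2 and so on, and they can report success or failure through the return status. A minimal sketch (greet is a hypothetical name):

```shell
#!/bin/bash
# Hypothetical example: a function that takes one argument and uses
# its return status to signal whether the argument was supplied.
greet() {
    if [ -z "$1" ]; then
        echo "usage: greet NAME"
        return 1            # non-zero status means failure
    fi
    echo "Hello $1, Have nice computing"
    return 0                # zero status means success
}

greet surya
greet || echo "greet failed"
```

The first call prints the greeting. The second call has no argument, so it prints the usage line and the failure message.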

10)  Command Substitution:

Command Substitution is a very handy feature of the bash shell. It enables you to take the output of a command and treat it as though it was written on the command line. For example, if you want to set the variable X to the output of a command, the way you do this is via command substitution.

There are two forms of command substitution: the $( ) form and the backtick form.
The $( ) form works as follows:
$(commands) expands to the output of commands

This form permits nesting, so commands can itself contain $( ) substitutions.

The backtick form expands
`commands` to the output of commands
An example is given:
#!/bin/bash
files="$(ls)"
web_files=`ls public_html`
echo "$files"      # we need the quotes to preserve embedded newlines in $files
echo "$web_files"  # we need the quotes to preserve newlines 
X=`expr 3 \* 2 + 4` # expr evaluates arithmetic expressions. man expr for details.
echo "$X"
The advantage of the $() substitution method is almost self-evident: it is very easy to nest. It is supported by most of the Bourne shell variants (the POSIX shell or better is OK). However, the backtick substitution is slightly more readable, and is supported by even the most basic shells (any #!/bin/sh version is just fine).

Note that if strings are not quote-protected in the above echo statement, new lines are replaced by spaces in the output.
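
To illustrate the nesting point, here is a short sketch that extracts the name of a file's parent directory in both styles. The path is a made-up example value.

```shell
#!/bin/bash
path="/home/surya/public_html/index.html"   # hypothetical example path

# $( ) nests without any escaping.
parent=$(basename "$(dirname "$path")")

# The backtick form needs the inner backticks escaped with backslashes.
parent_bt=`basename \`dirname "$path"\``

echo "$parent"
echo "$parent_bt"
```

Both lines print public_html; the $( ) version is noticeably easier to read once substitutions are nested.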

11)  Shell Arithmetic:

Use the expr command to perform arithmetic operations.
Syntax:
expr op1 math-operator op2
Examples:
$ expr 1 + 3
$ expr 2 - 1
$ expr 10 / 2
$ expr 20 % 3
$ expr 10 \* 3
$ echo `expr 6 + 3`
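
As an alternative to expr, POSIX shells also offer arithmetic expansion with $(( )), the same form used in the while loop earlier. A brief sketch:

```shell
#!/bin/sh
# Arithmetic expansion is handled by the shell itself, so no external
# command is started and * needs no backslash.
echo $((1 + 3))     # 4
echo $((10 / 2))    # 5
echo $((20 % 3))    # 2
echo $((10 * 3))    # 30
```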

12)  How to de-bug the shell script?

While writing shell scripts, you sometimes need to find and correct errors (bugs) in a script. For this purpose you can use the -v and -x options with the sh or bash command.
General syntax is as follows:
sh   option   { shell-script-name }
OR
bash   option   { shell-script-name }

Option can be

-v Print shell input lines as they are read.
-x After expanding each simple-command, bash displays the expanded value of PS4 system variable, followed by the command and its expanded arguments.
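
A quick way to see what -x does is to trace a one-line script. This is only a sketch; the inline script is a made-up example.

```shell
#!/bin/bash
# Run a one-line script with tracing enabled. The trace is written to
# standard error, so 2>&1 is used to show it together with the output.
bash -x -c 'X=3; echo "X is $X"' 2>&1
```

Each traced command is shown after expansion, prefixed with the value of PS4 (by default a plus sign), followed by the normal output "X is 3". Inside a script, set -x and set +x turn the same tracing on and off around just the part you want to inspect.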

Saturday, 1 March 2014

Find NFS Clients Connected to NFS Server

Question: How to find NFS clients connected to an NFS server

Ans:
The simplest way is to use

showmount -e  <NFS Server Name>

# showmount -e umbox04
export list for umbox04:
/gpfs/edw/common  umlpar1 umlpar2 umlpar3 umlpar4 umlpar5
/export/nim      udaixserv1,udaixserv2,udaixserv3,udaixserv4,udaixserv5
/bigdata (everyone)

But there is a problem here: if the NFS export is shared with everyone (say /bigdata in the above example), you will not be able to tell which clients are using it.

To overcome this issue, there are two ways to find the NFS clients connected to an NFS server.

1. Using "netstat":

Using "netstat" is an indirect method: we use the NFS port, which is 2049, to get the client information.
netstat -an | grep nfs.server.ip:port
If your NFS server IP address is 10.6.55.21 and the port is 2049, enter:
# netstat -an | grep 10.6.55.21:2049
Sample outputs:
tcp        0      0 10.6.55.21:2049       10.6.55.33:757         ESTABLISHED
tcp        0      0 10.6.55.21:2049       10.6.55.34:892         ESTABLISHED

Where,

  •     10.6.55.21 - NFS server IP address
  •     2049 - NFS server port
  •     10.6.55.33 and 10.6.55.34 - NFS client IP addresses
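When many clients are connected, the raw netstat lines can be reduced to a list of unique client addresses. The helper below, list_nfs_clients, is a hypothetical name and only a sketch: it assumes the Linux-style column layout shown in the sample output (local address in column 4, remote address in column 5, state in column 6) and only sees TCP connections.

```shell
# list_nfs_clients SERVER_IP [PORT]
# Reads `netstat -an` output on stdin and prints each connected client IP once.
list_nfs_clients() {
    server_ip=$1
    port=${2:-2049}    # default to the standard NFS port
    awk -v local="$server_ip:$port" '
        $4 == local && $6 == "ESTABLISHED" {
            sub(/:[0-9]+$/, "", $5)    # strip the client-side port
            print $5
        }' | sort -u
}

# Demo using the sample lines from this post.
list_nfs_clients 10.6.55.21 <<'EOF'
tcp        0      0 10.6.55.21:2049       10.6.55.33:757         ESTABLISHED
tcp        0      0 10.6.55.21:2049       10.6.55.34:892         ESTABLISHED
EOF
```

On a live server the equivalent call would be netstat -an | list_nfs_clients 10.6.55.21. Comparing column 4 exactly, rather than grepping the whole line, avoids matching unrelated lines that merely contain the same text.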

2. Using showmount command

You can use the showmount command to see mount information for an NFS server. The following command should not be relied on, as it may produce unreliable results (you can type this command on any one of the NFS clients):
showmount -a <NFS Server Name>
Sample outputs:

# showmount -a mynfsserv01
All mount points on mynfsserv01:
10.6.55.33:/umdata
10.6.55.34:/umdata
10.6.55.69:/umdata
10.6.55.3:/umdata
10.6.55.6:/umdata
10.6.55.16:/umdata

Where,
  • -a: List both the client hostname or IP address and mounted directory in host:dir format. This info should not be considered reliable.

As per the rpc.mountd(8) man page:
The rpc.mountd daemon registers every successful MNT request by adding an entry to the /var/lib/nfs/rmtab file. When receiving a UMNT request from an NFS client, rpc.mountd simply removes the matching entry from /var/lib/nfs/rmtab, as long as the access control list for that export allows that sender to access the export.

Clients can discover the list of file systems an NFS server is currently exporting, or the list of other clients that have mounted its exports, by using the showmount(8) command. showmount(8) uses other procedures in the NFS MOUNT protocol to report information about the server's exported file systems.

Note, however, that there is little to guarantee that the contents of /var/lib/nfs/rmtab are accurate. A client may continue accessing an export even after invoking UMNT. If the client reboots without sending a UMNT request, stale entries remain for that client in /var/lib/nfs/rmtab.
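Because rmtab entries can be stale, one way to sanity-check them is to compare the showmount -a list with the clients that currently hold a connection to port 2049 (method 1). The function below is only a sketch under stated assumptions: flag_unconnected_mounts is a hypothetical name, the entries are assumed to use IPv4 addresses rather than hostnames, and the list of live clients is assumed to be in a file with one IP per line.

```shell
# flag_unconnected_mounts LIVE_CLIENTS_FILE
# Reads `showmount -a` output on stdin and prints the host:dir entries
# whose host is not listed in LIVE_CLIENTS_FILE.
flag_unconnected_mounts() {
    awk -F: '
        FILENAME == ARGV[1] { live[$1] = 1; next }    # load the live client list
        /^All mount points/ { next }                  # skip the header line
        NF >= 2 && !($1 in live) { print }            # exact match on the host
    ' "$1" -
}

# Demo using the sample data from this post.
live_clients=$(mktemp)
printf '10.6.55.33\n10.6.55.34\n' > "$live_clients"
flag_unconnected_mounts "$live_clients" <<'EOF'
All mount points on mynfsserv01:
10.6.55.33:/umdata
10.6.55.34:/umdata
10.6.55.69:/umdata
10.6.55.3:/umdata
10.6.55.6:/umdata
10.6.55.16:/umdata
EOF
rm -f "$live_clients"
```

The host is compared exactly, so 10.6.55.3 is not mistaken for 10.6.55.33. A flagged entry is only a candidate for being stale: the client may be idle, or may be using UDP, so the result should be treated as a hint rather than proof.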