Thursday, September 13, 2018

NFS error not responding still trying error on Linux

"NFS error not responding still trying" ........ for filer from storage.
df -hT in hang sate ..........
dsmc q sched not working.........

We received all of the above alerts from our monitoring software. At first we didn't understand what exactly was wrong, because there were multiple errors on the Linux server. On the affected server we checked /etc/fstab and found around 6 NFS shares mounted. When we ran ls and cd on these NFS shares, 2 of them showed the issue.

For 2 NFS shares, ls and cd did not succeed, so we decided to check with the storage admin.
We provided them the affected server IP and filer details and asked them to verify whether everything was correctly shared from their side. The storage admin answered that all permissions were OK and the filer was correctly shared to the Linux server. We decided to re-export the same filer. After re-exporting, we were able to access the NFS shares and df -hT was working again.
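When a share is in this state, a plain ls or cd simply hangs. A safer way to probe the mounts is to read /proc/mounts (which never blocks, unlike df -hT) and wrap any access in a timeout. A minimal sketch; the path /mnt/share1 is illustrative:

```shell
# List currently mounted NFS shares without touching them
# (df -hT itself hangs on a dead share, /proc/mounts does not).
grep ' nfs' /proc/mounts || echo "no NFS mounts found"

# Probe a share with a hard time limit so the shell never hangs;
# /mnt/share1 is an illustrative mount point.
if timeout 5 stat -t /mnt/share1 >/dev/null 2>&1; then
    echo "share responding"
else
    echo "share hung or not mounted"
fi
```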

But the joy of solving the issue was not permanent: the same issue occurred again on this Linux server.
So what's next..........

Now we suspected there must be some issue at the network level causing this, and in the Linux server log file "messages" we found the following entries:

Sep  6 18:22:04 linuxclient1 kernel: nfs: server not responding, still trying
Sep  6 18:22:42 linuxclient1 kernel: nfs: server not responding, still trying
Sep  6 18:23:22 linuxclient1 kernel: nfs: server OK
Sep  6 18:23:22 linuxclient1 kernel: nfs: server OK

From the above logs we found that the filer server was not responding to the Linux server's requests, so there was a possibility that a firewall was blocking communication between the NFS filer and the Linux client.
We provided all required details to the network team, but after analysis they found no issue on their side either. The remaining team was the VMware team who created this VM, but they also said that all VM configuration was correct.

After their answer we did a Google search and found one interesting lead for this type of error: MTU. MTU is the maximum transmission unit of an Ethernet interface, and an incorrect MTU configuration causes performance issues on any Linux/UNIX or Windows server.

On the affected Linux server the MTU was 9000. We checked the MTU on other servers in the same IP range and found it was 1500 everywhere, while on the affected server it was 9000. We decided to take downtime to change the MTU value. We changed the MTU to 1500 on the primary interface and rebooted the Linux server, and guess what: after the reboot everything was working perfectly. df -hT, dsmc q sched, and the ls and cd commands all worked on these NFS shares.
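The check and the change can be sketched as below. The interface name eth0 is an assumption (substitute your primary interface); the sysfs read uses lo only so the read-only part runs on any Linux host:

```shell
# Read the current MTU from sysfs (no extra tools needed).
# 'lo' is used so this line runs anywhere; replace it with your
# primary interface, e.g. eth0, on a real server.
cat /sys/class/net/lo/mtu

# On the affected server the primary interface showed 9000.
# Change it at runtime (needs root; lost on reboot):
#   ip link set dev eth0 mtu 1500
# On RHEL-style systems, make it persistent by setting MTU=1500 in
# /etc/sysconfig/network-scripts/ifcfg-eth0 and restarting the
# network, or reboot as we did.
```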

In the end we can say that the incorrect MTU configuration caused the NFS share hang, and this NFS hang in turn affected the execution of df -hT, dsmc q sched, and the ls and cd commands.

There may be multiple causes of an NFS hang:

1. NFS server hung or down.
2. Firewall blocking communication between NFS server and Linux client.
3. Incorrect MTU configuration on the client or server side.
4. Overloaded NFS server causing timeouts for client requests.
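For cause 3 in particular, an MTU mismatch along the path can be confirmed from the client with a don't-fragment ping. A sketch, with nfsserver as a placeholder hostname; the payload arithmetic is MTU minus 20 bytes of IP header minus 8 bytes of ICMP header:

```shell
# Largest ICMP payload that fills a 9000-byte (jumbo) MTU:
MTU=9000
PAYLOAD=$((MTU - 20 - 8))   # 20-byte IP header + 8-byte ICMP header
echo "$PAYLOAD"             # prints 8972

# If any hop only supports 1500, the jumbo probe fails with
# "Message too long" while the 1472-byte probe succeeds:
#   ping -M do -s "$PAYLOAD" -c 3 nfsserver   # nfsserver is illustrative
#   ping -M do -s 1472       -c 3 nfsserver
# Quick checks for the other causes above:
#   rpcinfo -p nfsserver     # cause 1: is nfsd registered and alive?
#   showmount -e nfsserver   # causes 1/2: can we reach the export list?
```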

Thanks !!!


NFS export from an AIX HACMP cluster to Linux servers

Scenario: Export /hacmp/export from aixhanode1 to Linux hosts linuxnode1 and linuxnode2.

On a PowerHA cluster, the file /usr/es/sbin/cluster/etc/exports is used for exporting filesystems.

Step 1: When doing an NFS export from a cluster node, edit the file /usr/es/sbin/cluster/etc/exports. Don't use /etc/exports; that file is used in a non-cluster environment.
Step 2: Find the stanza for the export directory and add the hostnames to which the directory needs to be exported.
Here, find the stanza "/hacmp/export" in /usr/es/sbin/cluster/etc/exports, add linuxnode1 and linuxnode2 at the end without disturbing the current hostname list, and save the file:
vi /usr/es/sbin/cluster/etc/exports
/hacmp/export -rw,root=linuxnode1:linuxnode2
Step 3: Execute the following command to activate the changes made in /usr/es/sbin/cluster/etc/exports:
exportfs -a -f /usr/es/sbin/cluster/etc/exports
Step 4: Check on both clients whether the NFS directory is exported, using the command below:
showmount -e nfsservername
showmount -e aixhanode1
Step 5: Mount the shared directory using the mount command.
mount aixhanode1:/hacmp/export /mnt      # mounting on a temporary mount point
Or you can mount it on a mount point with the same name, using the command below.
Before executing, make sure that /hacmp/export exists on both NFS clients.
mount aixhanode1:/hacmp/export  /hacmp/export

Verify whether the directory/FS is mounted by executing the command below:
df -hT /hacmp/export
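To make the mount survive a reboot on each Linux client, the share can also go into /etc/fstab. A sketch of the entry; the mount options rw and hard are an assumption, adjust them to your environment:

```
aixhanode1:/hacmp/export  /hacmp/export  nfs  rw,hard  0 0
```

With this line in place, mount -a (or a reboot) brings the share up automatically.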