* Re: NFS synchronization problems on Beowulf cluster
@ 2006-07-17 18:17 Mario Storti
  2006-07-17 18:49 ` Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Mario Storti @ 2006-07-17 18:17 UTC (permalink / raw)
  To: nfs


Hi all, 

We still have problems with NFS in our Beowulf cluster. The original
post is here:

http://thread.gmane.org/gmane.linux.nfs/9564/focus=9566

The news is that we reproduced the problem on Scientific Linux 4.2
and also without VNFS. In addition, we have taken a tethereal capture
that may help in understanding the problem.

In brief we have a Beowulf class cluster built on Scientific Linux
(Beryllium) 4.2, (kernel 2.6.9-22.0.1). The cluster is disk-less
(nodes don't have hard disks) based on the Warewulf package. NFS
traffic is reduced by using VNFS filesystems at the nodes. However we
reproduced the problem in a configuration with disks at the nodes. 

The problem is that for some files in user accounts, if we make some
modifications to the file on the server, these changes are not seen on
the compute nodes. The NFS server is NFS3 with the standard
configuration (8 instances of the server and default parameters). The
cluster has 20 nodes at this time, but we have made experiments with a
`cloned' cluster and even with only two nodes the problem persists.

The experiment is as follows: We change a text file in a user account
with Emacs and check whether the change is seen in the compute
nodes. Sometimes the change is immediately seen in the compute nodes,
but many times some nodes don't see the change. 

This happens even when the cluster is idle (no computing or network
load on the slave nodes or on the server), so it's not a performance
problem. It is plainly not working correctly.

`rpcinfo -p' and `nfsstat' on the server and on the client can be
found in the original post.

We tried several combinations for the export and mount options
(sync/async...) with no success. 

We ran an experiment, and you can find the tethereal log here:

http://venus.ceride.gov.ar/~mstorti/tempo/tethereal-raw10.log.gz

The process is as follows: we start with a file `foo5' and we verify
that two compute nodes (node1 and node11) see the same file. After
that we modify the file on the server and look at what the compute
nodes see. I verify that `node1' sees the change (correctly), while
`node11' doesn't see the change.

These are the operations:

* Modify file foo5 on the server, adding a line with 'jaja27'
* grep jaja on node1 -> reports `jaja27'   (OK)
* grep jaja on node11 -> reports `jaja27'   (OK)
* Modify file foo5 on the server, replacing 'jaja27' by 'jaja28'
* grep jaja on node1 -> reports `jaja28'   (OK)
* grep jaja on node11 -> reports `jaja27'   (NOT OK)


These are the shell commands:
================================================================
                            [[turn on tethereal]]
                            [[Emacs writes #jaja27 to foo5 on server]]
[mstorti@aquiles ~]$ rsh node1 grep jaja foo5
# jaja27
[mstorti@aquiles ~]$ rsh node11 grep jaja foo5
# jaja27
                            [[Emacs changes 27 to 28 in foo5 on server]]
[mstorti@aquiles ~]$ rsh node1 grep jaja foo5
# jaja28
[mstorti@aquiles ~]$ rsh node11 grep jaja foo5
# jaja27
[mstorti@aquiles ~]$
                            [[turn off tethereal]]
================================================================
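
The manual check above could be scripted roughly as follows. This is
only a sketch: `grep_on_node` and `find_stale_nodes` are hypothetical
helper names, and it assumes rsh access from the server to the nodes
exactly as in the session above.

```python
import subprocess
from typing import Callable, Dict, Iterable


def grep_on_node(node: str, pattern: str, path: str) -> str:
    """Run `grep pattern path` on a node via rsh and return its stdout."""
    result = subprocess.run(
        ["rsh", node, "grep", pattern, path],
        capture_output=True,
        text=True,
        check=False,  # grep exits non-zero when nothing matches
    )
    return result.stdout.strip()


def find_stale_nodes(
    nodes: Iterable[str],
    expected: str,
    fetch: Callable[[str], str],
) -> Dict[str, str]:
    """Return {node: observed} for every node whose output differs from expected."""
    observed = {node: fetch(node) for node in nodes}
    return {node: out for node, out in observed.items() if out != expected}


# Example call (needs a real cluster, so it is left commented out):
# stale = find_stale_nodes(
#     ["node1", "node11"],
#     "# jaja28",
#     lambda node: grep_on_node(node, "jaja", "foo5"),
# )
```

Passing the fetch function as a parameter keeps the comparison logic
separate from rsh, so it can be tried without a cluster.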

I examined the frames in the tethereal log and I found the following. 

first rsh from node1:   frame 183: GETATTR reply mtime: 21:36:21.975422000

first rsh from node11:  frame 378: GETATTR reply mtime: 21:36:21.975422000

second rsh from node1:  frame 564: GETATTR reply mtime: 21:37:37.515422000
                        frame 566: READ

second rsh from node11: frame 713: LOOKUP reply mtime: 21:37:37.515422000

Note: the inode for the file /u/mstorti/foo  is 11590453. 
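
For reference, the gap between the two mtimes reported in the replies
can be computed as below. This is a small sketch; `parse_mtime` is a
hypothetical helper that truncates the nanosecond field to
microseconds, since Python's `%f` accepts at most six digits.

```python
from datetime import datetime


def parse_mtime(stamp: str) -> datetime:
    """Parse an HH:MM:SS.nnnnnnnnn time, keeping microsecond precision."""
    head, frac = stamp.split(".")
    return datetime.strptime(f"{head}.{frac[:6]}", "%H:%M:%S.%f")


def mtime_gap_seconds(first: str, second: str) -> float:
    """Seconds elapsed between two mtimes taken on the same day."""
    return (parse_mtime(second) - parse_mtime(first)).total_seconds()


gap = mtime_gap_seconds("21:36:21.975422000", "21:37:37.515422000")
print(f"mtime advanced by {gap:.2f} s")
```

The second mtime is about 75.5 seconds later than the first, so the
attribute change is well outside any sub-second timestamp resolution.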

So, during the second rsh both clients receive an mtime for the file
that is later than the one seen during the first read. The client at
node1 correctly requests the new file contents with a READ, while the
client at node11 doesn't. I assume that the client at node11 uses the
contents in its cache (which is wrong).
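
To make the expected behaviour concrete, here is a toy model of
mtime-based cache revalidation. It is not the Linux NFS client code:
`ToyServer`, `ToyClient` and the `revalidate` flag are invented for
illustration, and the non-revalidating client simply ignores a changed
mtime, which only mirrors the assumption stated above about node11.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class ToyServer:
    """Holds file contents plus an mtime that advances on every write."""

    def __init__(self) -> None:
        self.files: Dict[str, Tuple[int, bytes]] = {}
        self.clock = 0

    def write(self, name: str, data: bytes) -> None:
        self.clock += 1
        self.files[name] = (self.clock, data)

    def getattr(self, name: str) -> int:
        return self.files[name][0]

    def read(self, name: str) -> bytes:
        return self.files[name][1]


@dataclass
class CacheEntry:
    mtime: int   # mtime seen when the data was fetched
    data: bytes


class ToyClient:
    """Caches file data and (optionally) revalidates it against the mtime."""

    def __init__(self, server: ToyServer, revalidate: bool = True) -> None:
        self.server = server
        self.revalidate = revalidate
        self.cache: Dict[str, CacheEntry] = {}

    def read(self, name: str) -> bytes:
        mtime = self.server.getattr(name)
        entry = self.cache.get(name)
        changed = entry is not None and entry.mtime != mtime
        # Fetch on a cold cache, or when the mtime changed and we honour it.
        if entry is None or (changed and self.revalidate):
            entry = CacheEntry(mtime, self.server.read(name))
            self.cache[name] = entry
        return entry.data


def run_scenario() -> Tuple[List[bytes], List[bytes]]:
    """Replay the jaja27 -> jaja28 edit against both kinds of client."""
    server = ToyServer()
    good = ToyClient(server, revalidate=True)     # behaves like node1
    stale = ToyClient(server, revalidate=False)   # behaves like node11
    server.write("foo5", b"# jaja27\n")
    good_reads = [good.read("foo5")]
    stale_reads = [stale.read("foo5")]
    server.write("foo5", b"# jaja28\n")
    good_reads.append(good.read("foo5"))
    stale_reads.append(stale.read("foo5"))
    return good_reads, stale_reads
```

In this model both clients see the newer mtime, but only the one that
compares it with the cached value fetches the new contents.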

Sorry for the very long post. Any help is appreciated,

TIA, Mario



_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* NFS synchronization problems on Beowulf cluster
@ 2006-05-31 19:23 Mario Storti
  2006-06-01 13:58 ` Mario Storti
  0 siblings, 1 reply; 12+ messages in thread
From: Mario Storti @ 2006-05-31 19:23 UTC (permalink / raw)
  To: nfs

Hi all, 

We have a Beowulf class cluster built on Linux Fedora Core 3 (kernel
2.6.15). The cluster is disk-less (nodes don't have hard disks) based
on the Warewulf package. NFS traffic is reduced by using VNFS
filesystems at the nodes.

We found something strange related to NFS. For some files in user
accounts, if we make some modifications to the file on the server,
these changes are not seen on the compute nodes. The NFS server is NFS3
with the standard configuration (8 instances of the server and default
parameters). The cluster has 20 nodes at this time, but we have made
experiments with a `cloned' cluster and even with only two nodes the
problem persists.

The experiment is as follows: We change a text file in a user account
with Emacs and check whether the change is seen in the compute
nodes. Sometimes the change is immediately seen in the compute nodes,
but many times some nodes don't see the change. 

This happens even when the cluster is idle (no computing or network
load on the slave nodes or on the server), so it's not a performance
problem. It is plainly not working correctly.

Below are the `rpcinfo -p' and `nfsstat' outputs from the server and
the client. The version of nfs-utils is 1.0.6.

We tried several combinations for the export and mount options
(sync/async...) with no success. 

Any help is appreciated,

TIA, Mario

http://www.cimec.org.ar/mstorti

======================= SERVER =================================
[root@aquiles ~]# rpcinfo -p
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp    947  status
    100024    1   tcp    950  status
    100011    1   udp    777  rquotad
    100011    2   udp    777  rquotad
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100021    1   udp  32775  nlockmgr
    100021    3   udp  32775  nlockmgr
    100021    4   udp  32775  nlockmgr
    100021    1   tcp  32769  nlockmgr
    100021    3   tcp  32769  nlockmgr
    100021    4   tcp  32769  nlockmgr
    100005    1   udp    795  mountd
    100005    1   tcp    798  mountd
    100005    3   udp    795  mountd
    100005    3   tcp    798  mountd
[root@aquiles ~]# nfsstat
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
3734126    0          0          0          0       
Server nfs v3:
null       getattr    setattr    lookup     access     readlink   
240     0% 1375308 36% 86317   2% 135591  3% 605405 16% 1719    0% 
read       write      create     mkdir      symlink    mknod      
72626   1% 1352149 36% 9697    0% 0       0% 0       0% 160     0% 
remove     rmdir      rename     link       readdir    readdirplus
4909    0% 0       0% 64      0% 30      0% 18      0% 60812   1% 
fsstat     fsinfo     pathconf   commit     
0       0% 122     0% 0       0% 28958   0% 

======================= COMPUTE NODE =============================

[root@node1 ~]# rpcinfo -p
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp    892  status
    100024    1   tcp    895  status
    100021    1   udp  32768  nlockmgr
    100021    3   udp  32768  nlockmgr
    100021    4   udp  32768  nlockmgr
    100021    1   tcp  53111  nlockmgr
    100021    3   tcp  53111  nlockmgr
    100021    4   tcp  53111  nlockmgr
[root@node1 ~]# nfsstat
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
0          0          0          0          0       
Client rpc stats:
calls      retrans    authrefrsh
2950634    0          0       
Client nfs v3:
null       getattr    setattr    lookup     access     readlink   
0       0% 988154 33% 86986   2% 44612   1% 353215 11% 102     0% 
read       write      create     mkdir      symlink    mknod      
6821    0% 1367868 46% 9227    0% 0       0% 0       0% 8       0% 
remove     rmdir      rename     link       readdir    readdirplus
4577    0% 0       0% 7       0% 18      0% 18      0% 59879   2% 
fsstat     fsinfo     pathconf   commit     
0       0% 2       0% 0       0% 29138   0% 







Thread overview: 12+ messages
2006-07-17 18:17 NFS synchronization problems on Beowulf cluster Mario Storti
2006-07-17 18:49 ` Trond Myklebust
2006-07-18 11:19   ` Mario Storti
2006-07-18 13:48     ` Trond Myklebust
2006-07-20 10:49       ` Mario Storti
2006-07-20 11:04         ` Trond Myklebust
2006-07-20 11:35           ` Mario Storti
  -- strict thread matches above, loose matches on Subject: below --
2006-05-31 19:23 Mario Storti
2006-06-01 13:58 ` Mario Storti
2006-06-01 14:25   ` Peter Staubach
2006-06-01 14:54     ` Mario Storti
2006-06-01 15:56       ` Peter Staubach
2006-06-01 16:35         ` Mario Storti
2006-06-01 17:22           ` Peter Staubach
2006-06-01 18:50             ` Mario Storti
