linux-fsdevel.vger.kernel.org archive mirror
* NFS client hang on attempt to do async blocking posix lock enqueue
@ 2007-11-29 19:15 J. Bruce Fields
  2007-11-29 22:41 ` Marc Eshel
  0 siblings, 1 reply; 12+ messages in thread
From: J. Bruce Fields @ 2007-11-29 19:15 UTC
  To: Oleg Drokin; +Cc: Manoj Naik, linux-fsdevel, Marc Eshel

On Thu, Nov 29, 2007 at 02:04:40PM -0500, Oleg Drokin wrote:
> Hello!
>
>     There is a problem with blocking async posix lock enqueue in
>     2.6.22 and 2.6.23 kernels.  The lock call to the underlying FS is
>     done just fine, but when fl_grant is called to inform lockd of
>     successful granting, nothing happens, and no reply is sent to the
>     client.  The end result is that the client reports that the
>     server is not responding.  I enabled dprintks in the code and I
>     see that immediately after fl_grant, the nlmsvc_grant_blocked
>     message (after the callback: label) is printed.  Then the "server
>     not responding" messages start, and after every message about
>     "couldn't create RPC handle for localhost" I see the
>     nlmsvc_grant_blocked "lockd: GRANTing blocked lock" message again,
>     with no activity from the underlying FS.
>
>     I am attaching a reproducer that I have; it is quite simple
>     actually.  Note that the path to the file to lock is hardcoded, so
>     please adjust it for your environment.  Locking should be
>     performed on a file that resides on an NFS client mountpoint.
>
>     I reproduced the problem with 2.6.22 and 2.6.23 with Lustre (I am
>     working on adapting Lustre to the async posix locks API) and GFS2.
>     The setup is totally local, i.e. I have a single node on which
>     there is GFS (both server and client) (or Lustre, just the client,
>     but that does not make any difference), an NFS server, and an NFS
>     client that mounts the exported GFS or Lustre.

Thanks, I'll take a look.  Replying now just to add Marc to the cc:.

--b.

* NFS client hang on attempt to do async blocking posix lock enqueue
@ 2007-11-29 19:04 Oleg Drokin
  0 siblings, 0 replies; 12+ messages in thread
From: Oleg Drokin @ 2007-11-29 19:04 UTC
  To: J. Bruce Fields, Manoj Naik; +Cc: linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 1415 bytes --]

Hello!

     There is a problem with blocking async posix lock enqueue in
     2.6.22 and 2.6.23 kernels.  The lock call to the underlying FS is
     done just fine, but when fl_grant is called to inform lockd of
     successful granting, nothing happens, and no reply is sent to the
     client.  The end result is that the client reports that the
     server is not responding.  I enabled dprintks in the code and I
     see that immediately after fl_grant, the nlmsvc_grant_blocked
     message (after the callback: label) is printed.  Then the "server
     not responding" messages start, and after every message about
     "couldn't create RPC handle for localhost" I see the
     nlmsvc_grant_blocked "lockd: GRANTing blocked lock" message again,
     with no activity from the underlying FS.

     I am attaching a reproducer that I have; it is quite simple
     actually.  Note that the path to the file to lock is hardcoded, so
     please adjust it for your environment.  Locking should be
     performed on a file that resides on an NFS client mountpoint.

     I reproduced the problem with 2.6.22 and 2.6.23 with Lustre (I am
     working on adapting Lustre to the async posix locks API) and GFS2.
     The setup is totally local, i.e. I have a single node on which
     there is GFS (both server and client) (or Lustre, just the client,
     but that does not make any difference), an NFS server, and an NFS
     client that mounts the exported GFS or Lustre.

Bye,
     Oleg

[-- Attachment #2: flock.c --]
[-- Type: application/octet-stream, Size: 5959 bytes --]

/* -*- mode: c; c-basic-offset: 8; indent-tabs-mode: nil; -*-
 * vim:expandtab:shiftwidth=8:tabstop=8:
 *
 * Lustre Light user test program
 *
 *  Copyright (c) 2002, 2003 Cluster File Systems, Inc.
 *
 *   This file is part of Lustre, http://www.lustre.org.
 *
 *   Lustre is free software; you can redistribute it and/or
 *   modify it under the terms of version 2 of the GNU General Public
 *   License as published by the Free Software Foundation.
 *
 *   Lustre is distributed in the hope that it will be useful,
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *   GNU General Public License for more details.
 *
 *   You should have received a copy of the GNU General Public License
 *   along with Lustre; if not, write to the Free Software
 *   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

#define _BSD_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/queue.h>
#include <signal.h>
#include <errno.h>
#include <dirent.h>
#include <sys/uio.h>
#include <sys/time.h>
#include <stdarg.h>

static char lustre_path[] = "/mnt/lustre2";

#define ENTRY(str)                                                      \
        do {                                                            \
                char buf[100];                                          \
                int len;                                                \
                snprintf(buf, sizeof(buf), "===== START %s: %s ",      \
                         __FUNCTION__, (str));                          \
                len = strlen(buf);                                      \
                if (len < 79) {                                         \
                        memset(buf+len, '=', 100-len);                  \
                        buf[79] = '\n';                                 \
                        buf[80] = 0;                                    \
                }                                                       \
                printf("%s", buf);                                      \
        } while (0)

#define LEAVE()                                                         \
        do {                                                            \
                char buf[100];                                          \
                int len;                                                \
                snprintf(buf, sizeof(buf), "===== END TEST %s: successfully ", \
                         __FUNCTION__);                                 \
                len = strlen(buf);                                      \
                if (len < 79) {                                         \
                        memset(buf+len, '=', 100-len);                  \
                        buf[79] = '\n';                                 \
                        buf[80] = 0;                                    \
                }                                                       \
                printf("%s", buf);                                      \
        } while (0)

#define EXIT return

#define MAX_PATH_LENGTH 4096


int t_fcntl(int fd, int cmd, ...)
{
	va_list ap;
	long arg;
	struct flock *lock = NULL;
	int rc = -1;

	va_start(ap, cmd);
	switch (cmd) {
	case F_GETFL:
		va_end(ap);
		rc = fcntl(fd, cmd);
		if (rc == -1) {
			printf("fcntl GETFL failed: %s\n",
				 strerror(errno));
			EXIT(1);
		}
		break;
	case F_SETFL:
		arg = va_arg(ap, long);
		va_end(ap);
		rc = fcntl(fd, cmd, arg);
		if (rc == -1) {
			printf("fcntl SETFL %ld failed: %s\n",
				 arg, strerror(errno));
			EXIT(1);
		}
		break;
	case F_GETLK:
	case F_SETLK:
	case F_SETLKW:
		lock = va_arg(ap, struct flock *);
		va_end(ap);
		rc = fcntl(fd, cmd, lock);
		if (rc == -1) {
			printf("fcntl cmd %d failed: %s\n",
				 cmd, strerror(errno));
			EXIT(1);
		}
		break;
	case F_DUPFD:
		arg = va_arg(ap, long);
		va_end(ap);
		rc = fcntl(fd, cmd, arg);
		if (rc == -1) {
			printf("fcntl F_DUPFD %d failed: %s\n",
				 (int)arg, strerror(errno));
			EXIT(1);
		}
		break;
	default:
		va_end(ap);
		printf("fcntl cmd %d not supported\n", cmd);
		EXIT(1);
	}
        if (lock)
                printf("fcntl %d = %d, ltype = %d\n", cmd, rc, lock->l_type);
	return rc;
}

int t_unlink(const char *path)
{
        int rc;

        rc = unlink(path);
        if (rc) {
                printf("unlink(%s) error: %s\n", path, strerror(errno));
                EXIT(-1);
        }
        return rc;
}

void t21()
{
        char file[MAX_PATH_LENGTH] = "";
        int fd, ret;
	struct flock lock = {
		.l_type = F_RDLCK,
		.l_whence = SEEK_SET,
	};

        ENTRY("basic fcntl support");
        snprintf(file, MAX_PATH_LENGTH, "%s/test_t21_file", lustre_path);

        fd = open(file, O_RDWR|O_CREAT, (mode_t)0666);
        if (fd < 0) {
                printf("error open file %s: %s\n", file, strerror(errno));
                exit(-1);
        }

        t_fcntl(fd, F_SETFL, O_APPEND);
        /* Parenthesize so that '!' applies to the masked flag, not to ret. */
        if (!((ret = t_fcntl(fd, F_GETFL)) & O_APPEND)) {
                printf("error get flag: ret %x\n", ret);
                exit(-1);
        }

	t_fcntl(fd, F_SETLK, &lock);
	t_fcntl(fd, F_GETLK, &lock);
	lock.l_type = F_WRLCK;
	t_fcntl(fd, F_SETLKW, &lock);
	t_fcntl(fd, F_GETLK, &lock);
	lock.l_type = F_UNLCK;
	t_fcntl(fd, F_SETLK, &lock);

        close(fd);
        t_unlink(file);
        LEAVE();
}


int main(int argc, char * const argv[])
{
        /* Set D_VFSTRACE to see messages from ll_file_flock.
           The test passes either with -o flock or -o noflock 
           mount -o flock -t lustre uml1:/mds1/client /mnt/lustre */
        t21();

	printf("completed successfully\n");
	return 0;
}
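
For reference, the lock sequence that t21() exercises can be condensed into a sketch that takes the path as a parameter instead of hardcoding it. This is an illustration only and not part of the attached flock.c: the names do_lock and run_lock_sequence are made up here, and the whole-file range (l_start = 0, l_len = 0) is an assumption that matches the zero-initialized struct flock in t21().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not in flock.c): issue one whole-file fcntl()
 * lock request and report any failure. */
static int do_lock(int fd, int cmd, short type)
{
        struct flock fl = {
                .l_type = type,
                .l_whence = SEEK_SET,
                .l_start = 0,
                .l_len = 0,     /* 0 means "until end of file" */
        };

        if (fcntl(fd, cmd, &fl) == -1) {
                fprintf(stderr, "fcntl cmd %d type %d failed: %s\n",
                        cmd, (int)type, strerror(errno));
                return -1;
        }
        return 0;
}

/* Hypothetical entry point: same lock order as t21(), but the path is
 * supplied by the caller.  Returns 0 if every lock call succeeded. */
int run_lock_sequence(const char *path)
{
        int rc = -1;
        int fd = open(path, O_RDWR | O_CREAT, 0666);

        if (fd < 0) {
                fprintf(stderr, "open(%s) failed: %s\n", path, strerror(errno));
                return -1;
        }

        if (do_lock(fd, F_SETLK, F_RDLCK) == 0 &&
            do_lock(fd, F_SETLKW, F_WRLCK) == 0 &&  /* blocking request */
            do_lock(fd, F_SETLK, F_UNLCK) == 0)
                rc = 0;

        close(fd);
        unlink(path);   /* remove the test file, as t21() does */
        return rc;
}
```

Calling run_lock_sequence() from a small main() with a file on the NFS client mountpoint should issue the same F_SETLKW request as the reproducer; on a local filesystem it is expected to return 0 right away.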

[-- Attachment #3: Type: text/plain, Size: 1 bytes --]




end of thread, other threads:[~2008-02-08 21:31 UTC | newest]

Thread overview: 12+ messages
2007-11-29 19:15 NFS client hang on attempt to do async blocking posix lock enqueue J. Bruce Fields
2007-11-29 22:41 ` Marc Eshel
2008-01-18 23:07   ` J. Bruce Fields
2008-01-20 14:58     ` Oleg Drokin
2008-02-07 23:26       ` J. Bruce Fields
2008-02-08 12:15         ` Jeff Layton
2008-02-08 14:33           ` J. Bruce Fields
2008-02-08 18:49             ` david m. richter
2008-02-08 20:54               ` Jeff Layton
2008-02-08 21:12                 ` J. Bruce Fields
2008-02-08 21:27                   ` Jeff Layton
  -- strict thread matches above, loose matches on Subject: below --
2007-11-29 19:04 Oleg Drokin
