* NFS file locking?
@ 2001-10-14 18:11 Larry McVoy
2001-10-14 23:52 ` Neil Brown
2001-10-15 1:43 ` Alan Cox
0 siblings, 2 replies; 5+ messages in thread
From: Larry McVoy @ 2001-10-14 18:11 UTC (permalink / raw)
To: linux-kernel
Hi, the open(2) man page says:
O_EXCL When used with O_CREAT, if the file already exists
it is an error and the open will fail. O_EXCL is
broken on NFS file systems, programs which rely on
it for performing locking tasks will contain a race
condition. The solution for performing atomic file
locking using a lockfile is to create a unique file
on the same fs (e.g., incorporating hostname and
pid), use link(2) to make a link to the lockfile.
If link() returns 0, the lock is successful. Otherwise,
use stat(2) on the unique file to check if
its link count has increased to 2, in which case
the lock is also successful.
I coded this up and tried it here on a cluster of different operating
systems (Linux 2.4.5 server, linux, freebsd, solaris, aix, hpux, irix
clients) and it doesn't work.
2 questions:
a) is it the belief of folks here that this should work?
b) if performance isn't a big issue, is there any portable way to do
locking over NFS with just files?
* Re: NFS file locking?
2001-10-14 18:11 NFS file locking? Larry McVoy
@ 2001-10-14 23:52 ` Neil Brown
2001-10-15 2:38 ` Larry McVoy
2001-10-15 1:43 ` Alan Cox
1 sibling, 1 reply; 5+ messages in thread
From: Neil Brown @ 2001-10-14 23:52 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel

On Sunday October 14, lm@bitmover.com wrote:
> Hi, the open(2) man page says:
>
> O_EXCL When used with O_CREAT, if the file already exists
> it is an error and the open will fail. O_EXCL is
> broken on NFS file systems, programs which rely on
> it for performing locking tasks will contain a race
> condition. The solution for performing atomic file
> locking using a lockfile is to create a unique file
> on the same fs (e.g., incorporating hostname and
> pid), use link(2) to make a link to the lockfile.
> If link() returns 0, the lock is successful. Otherwise,
> use stat(2) on the unique file to check if
> its link count has increased to 2, in which case
> the lock is also successful.
>
> I coded this up and tried it here on a cluster of different operating
> systems (Linux 2.4.5 server, linux, freebsd, solaris, aix, hpux, irix
> clients) and it doesn't work.
>
> 2 questions:
>
> a) is it the belief of folks here that this should work?

No. It is unsupportable with NFSv2.
The NFSv3 protocol does provide support, but I don't think the Linux
NFSv3 client supports it yet because the VFS layer tries to handle all
the exclusion, and doesn't give the file-system a chance.

>
> b) if performance isn't a big issue, is there any portable way to do
> locking over NFS with just files?

Instead of creating a lock file, create a lock symlink.
Have the content of the symlink be something recognisably unique.
e.g. hostname.pid
If the "symlink" syscall succeeds, you have got the lock.
If it fails, issue a readlink and see if the content is what you
tried to create (RPC packet loss and retransmit could have caused
an incorrect failure return). If it is, you have the lock.
If not, you don't.

Similar tricks can be done with hard links if you really want a file.
i.e. create a file with a unique name and then hard-link it to the
lock-file-name. On apparent failure, check the inode number.

With all these approaches (including O_EXCL) the tricky bit is
cleaning up after a failed application left a lockfile lying around.
Automatically deleting it is racy unless you guarantee that only one
process could ever consider deleting an old lock file.
e.g. a cron job on the fileserver that runs every 5 minutes and
deletes any lock file older than 10 minutes.

NeilBrown
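
The cleanup scheme suggested above (a single job on the file server that deletes lock files older than some limit) might look roughly like this in C. It is a sketch under assumptions: the name reap_stale_lock is invented for illustration, and it is only safe if exactly one such reaper exists and no legitimate holder keeps a lock longer than max_age.

```c
#include <errno.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Remove `lockfile` if it is older than `max_age` seconds.
 * Returns 1 if it was removed, 0 if it is absent or still fresh,
 * -1 on any other error.
 * (reap_stale_lock is a hypothetical helper, not an existing API.)
 */
int
reap_stale_lock(const char *lockfile, time_t max_age)
{
	struct	stat sb;
	time_t	now = time(NULL);

	/*
	 * lstat() so that a symlink lock is examined itself; its target
	 * (e.g. "hostname.pid") normally does not exist as a file.
	 */
	if (lstat(lockfile, &sb) == -1)
		return (errno == ENOENT ? 0 : -1);

	if (now - sb.st_mtime <= max_age)
		return (0);		/* still fresh, leave it alone */

	return (unlink(lockfile) == 0 ? 1 : -1);
}
```

Run from cron on the server every 5 minutes with max_age set to 600, this corresponds to the "older than 10 minutes" rule. Running it on the server means the file timestamps and time(NULL) come from the same clock.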
* Re: NFS file locking?
2001-10-14 23:52 ` Neil Brown
@ 2001-10-15 2:38 ` Larry McVoy
2001-10-17 11:15 ` Miquel van Smoorenburg
0 siblings, 1 reply; 5+ messages in thread
From: Larry McVoy @ 2001-10-15 2:38 UTC (permalink / raw)
To: Neil Brown; +Cc: Larry McVoy, linux-kernel

> Instead of creating a lock file, create a lock symlink.
> Have the content of the symlink be something recognisably unique.
> e.g. hostname.pid
> If the "symlink" syscall succeeds, you have got the lock.
> If it fails, issue a readlink and see if the content is what you
> tried to create (RPC packet loss and retransmit could have caused
> an incorrect failure return). If it is, you have the lock.
> If not, you don't.

OK, tried that too, here's the code. Doesn't work. Neither does the
link approach. Am I doing something wrong? It seems to me that I'm
completely at the mercy of the client NFS implementation - if it caches
stuff wrong, I'm hosed. There has to be some cute trick to get past this.

--lm

#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Helper macros used below. They were not part of the post;
 * these definitions are assumed.
 */
#define	unless(e)	if (!(e))
#define	streq(a, b)	(!strcmp((a), (b)))

char	*aprintf(char *fmt, ...);
char	*sccs_gethost(void);
int	mine(char *file);

int
sccs_lockfile(char *lockfile, int seconds)
{
	char	*s;
	char	buf[300];
	int	n, uslp = 1000, waited = 0;

	s = aprintf("%u %s", (unsigned)getpid(), sccs_gethost());
	for ( ;; ) {
		if (symlink(s, lockfile) == 0) {
			free(s);
			return (0);
		}
		/* leave room for the terminating NUL */
		n = readlink(lockfile, buf, sizeof(buf) - 1);
		if (n > 0) {
			buf[n] = 0;
			if (streq(s, buf)) {
				free(s);
				return (0);
			}
		}
		if (seconds && ((waited / 1000000) >= seconds)) {
			fprintf(stderr, "timed out waiting for %s\n", lockfile);
			free(s);
			return (-1);
		}
		usleep(uslp);
		waited += uslp;
		if (uslp < 20000) uslp <<= 1;
	}
	/* NOTREACHED */
}

/*
 * Usage: a.out iterations lockfile
 */
int
main(int ac, char **av)
{
	int	i, iter;
	int	me = getpid();

	unless (ac == 3) return (1);
	unless ((iter = atoi(av[1])) > 0) return (1);
	printf("%d starts\n", me);
	for (i = 1; i <= iter; ++i) {
		sccs_lockfile(av[2], 0);
		assert(mine(av[2]));
		unlink(av[2]);
		unless (i % 10) printf("%d locked %d times\n", me, i);
	}
	printf("%d done\n", me);
	return (0);
}

/* Return 1 if the lock symlink currently names this process, else 0. */
int
mine(char *file)
{
	char	buf[300];
	char	*s;
	int	n;

	n = readlink(file, buf, sizeof(buf) - 1);
	if (n > 0) {
		s = aprintf("%u %s", (unsigned)getpid(), sccs_gethost());
		buf[n] = 0;
		n = streq(s, buf);
		unless (n) fprintf(stderr, "%s != %s\n", s, buf);
		free(s);
		return (n);
	}
	return (0);
}

/*
 * This function works like sprintf(), except it returns a
 * malloc'ed buffer which the caller should free when done.
 */
char *
aprintf(char *fmt, ...)
{
	va_list	ptr;
	int	rc, size = strlen(fmt) + 64;
	char	*buf = malloc(size);

	va_start(ptr, fmt);
	rc = vsnprintf(buf, size, fmt, ptr);
	va_end(ptr);
	/*
	 * On IRIX, it truncates and returns size-1.
	 * We can't assume that that is OK, even though that might be
	 * a perfect fit. We always bump up the size and try again.
	 * This can rarely lead to an extra alloc that we didn't need,
	 * but that's tough.
	 */
	while ((rc < 0) || (rc >= (size-1))) {
		size *= 2;
		free(buf);
		buf = malloc(size);
		va_start(ptr, fmt);
		rc = vsnprintf(buf, size, fmt, ptr);
		va_end(ptr);
	}
	return (buf); /* caller should free */
}

char *
sccs_gethost(void)
{
	static	char host[256];

	if (gethostname(host, sizeof(host)) == -1) return "?";
	return (host);
}
* Re: NFS file locking?
2001-10-15 2:38 ` Larry McVoy
@ 2001-10-17 11:15 ` Miquel van Smoorenburg
0 siblings, 0 replies; 5+ messages in thread
From: Miquel van Smoorenburg @ 2001-10-17 11:15 UTC (permalink / raw)
To: linux-kernel

In article <20011014193844.C13153@work.bitmover.com>,
Larry McVoy <lm@bitmover.com> wrote:
>OK, tried that too, here's the code. Doesn't work. Neither does the
>link approach. Am I doing something wrong? It seems to me that I'm
>completely at the mercy of the client NFS implementation - if it caches
>stuff wrong, I'm hosed. There has to be some cute trick to get past this.

Download
ftp://ftp.debian.org/debian/pool/main/libl/liblockfile/liblockfile_1.03.tar.gz

It contains NFS safe locking functions, and it knows how to
work around NFS client caches. And it documents all algorithms
in the manpages too.

ALGORITHM
The algorithm that is used to create a lockfile in an atomic way,
even over NFS, is as follows:

1 A unique file is created. In printf format, the name of the
  file is .lk%05d%x%s. The first argument (%05d) is the current
  process id. The second argument (%x) consists of the 4 minor
  bits of the value returned by time(2). The last argument is
  the system hostname.

2 Then the lockfile is created using link(2). The return value
  of link is ignored.

3 Now the lockfile is stat()ed. If the stat fails, we go to
  step 6.

4 The stat value of the lockfile is compared with that of the
  temporary file. If they are the same, we have the lock. The
  temporary file is deleted and a value of 0 (success) is
  returned to the caller.

5 A check is made to see if the existing lockfile is a valid
  one. If it isn't valid, the stale lockfile is deleted.

6 Before retrying, we sleep for n seconds. n is initially 5
  seconds, but after every retry 5 extra seconds is added up to
  a maximum of 60 seconds (an incremental backoff). Then we go
  to step 2 up to retries times.

REMOTE FILE SYSTEMS AND THE KERNEL ATTRIBUTE CACHE
If you are using lockfile_create to create a lock on a file that
resides on a remote server, and you already have that file open,
you need to flush the NFS attribute cache after locking. This is
needed to prevent the following scenario:

o open /var/mail/USERNAME
o attributes, such as size, inode, etc are now cached in the
  kernel!
o meanwhile, another remote system appends data to
  /var/mail/USERNAME
o grab lock using lockfile_create()
o seek to end of file
o write data

Now the end of the file really isn't the end of the file - the
kernel cached the attributes on open, and st_size is not the end
of the file anymore. So after locking the file, you need to tell
the kernel to flush the NFS file attribute cache.

The only portable way to do this is the POSIX fcntl() file
locking primitives - locking a file using fcntl() has the
fortunate side-effect of invalidating the NFS file attribute
cache of the kernel.

lockfile_create() cannot do this for you for two reasons. One,
it just creates a lockfile - it doesn't know which file you are
actually trying to lock! Two, even if it could deduce the file
you're locking from the filename, by just opening and closing it,
it would invalidate any existing POSIX locks the program might
already have on that file (yes, POSIX locking semantics are
insane!).

So basically what you need to do is something like this:

	fd = open("/var/mail/USER", O_RDWR);
	.. program code ..

	lockfile_create("/var/mail/USER.lock", x, y);

	/* Invalidate NFS attribute cache using POSIX locks */
	if (lockf(fd, F_TLOCK, 0) == 0) lockf(fd, F_ULOCK, 0);

You have to be careful with this if you're putting this in an
existing program that might already be using fcntl(), flock() or
lockf() locking - you might invalidate existing locks.

There is also a non-portable way. A lot of NFS operations return
the updated attributes - and the Linux kernel actually uses these
to update the attribute cache. One of these operations is
chmod(2).

So stat()ing a file and then chmod()ing it to st.st_mode will not
actually change the file, nor will it interfere with any locks on
the file, but it will invalidate the attribute cache. The
equivalent to use from a shell script would be

	chmod u=u /var/mail/USER

Mike.
--
Move sig.
* Re: NFS file locking?
2001-10-14 18:11 NFS file locking? Larry McVoy
2001-10-14 23:52 ` Neil Brown
@ 2001-10-15 1:43 ` Alan Cox
1 sibling, 0 replies; 5+ messages in thread
From: Alan Cox @ 2001-10-15 1:43 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel

> a) is it the belief of folks here that this should work?

NFSv2 doesn't have the needed semantics

> b) if performance isn't a big issue, is there any portable way to do
> locking over NFS with just files?

The classic way is to use link().