From: Peter Staubach <staubach@redhat.com>
To: Timo Sirainen <tss@iki.fi>
Cc: linux-nfs@vger.kernel.org
Subject: Re: inode caching
Date: Tue, 27 May 2008 08:48:27 -0400
Message-ID: <483C031B.80601@redhat.com>
In-Reply-To: <1211835499.3904.231.camel@hurina>

Timo Sirainen wrote:
> NFS server: Linux 2.6.25
> NFS client: Linux debian 2.6.25-2 (or 2.6.23.1)
>
> If I do:
>
> NFS client: fd1 = creat("foo"); write(fd1, "xx", 2); fsync(fd1);
> NFS server: unlink("foo"); creat("foo");
> NFS client: fd2 = open("foo"); fstat(fd1, &st1); fstat(fd2, &st2);
> fstat(fd1, &st3);
>
> The result is usually that the fstat(fd1) fails with ESTALE. But
> sometimes the result is st1.st_ino == st2.st_ino == st3.st_ino and
> st1.st_size == 2 but st2.st_size == 0. So I see two different files
> using the same inode number. I'd really want to avoid seeing that
> condition.
>

This is really up to the file system on the server. It is the one
that selects the inode number when creating a new file.

> So what I'd want to know is:
>
> a) Why does this happen only sometimes? I can't really figure out from
> the code what invalidates the fd1 inode. Apparently the second open()
> somehow, but since it uses the new "foo" file with a different struct
> inode, where does the old struct inode get invalidated?
>

This will always happen, but you may see occasional successful
fstat() calls on the client due to attribute caching and/or
dentry caching.

> b) Can this be fixed? Or is it just luck that it works as well as it
> does now?
>

This can be fixed, somewhat. I have some changes to address the
ESTALE situation in system calls that take file names as arguments,
but I need to work with some more people to get them included.
The system calls that do not take file names as arguments cannot
recover, because the file they refer to is really gone or at
least not accessible anymore.
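
As an illustration of that last point, an application holding a
descriptor can at best detect ESTALE and reopen the file by name.
The sketch below is a user-space example only, not the kernel change
mentioned above. fstat_or_reopen() is a hypothetical helper name and
is not part of any library. After a reopen, the descriptor refers to
whatever file currently has that name, which may not be the file
that was originally opened.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: fstat() the descriptor in *fdp. If the handle is
 * stale (ESTALE), open path again and fstat() the new descriptor instead.
 *
 * Returns 0 if the old descriptor was still usable, 1 if *fdp was
 * replaced by a freshly opened descriptor, -1 on any other error
 * (errno is set, *fdp is left untouched).
 */
int fstat_or_reopen(int *fdp, const char *path, int flags, struct stat *st)
{
	assert(fdp != NULL && path != NULL && st != NULL);

	if (fstat(*fdp, st) == 0)
		return 0;
	if (errno != ESTALE)
		return -1;

	/* The old handle cannot be revived; look the name up again. */
	int newfd = open(path, flags);
	if (newfd < 0)
		return -1;
	if (fstat(newfd, st) < 0) {
		int saved = errno;
		close(newfd);
		errno = saved;
		return -1;
	}

	close(*fdp);
	*fdp = newfd;
	return 1;
}
```

A caller would pass the address of its descriptor and the name it
was opened with, for example fstat_or_reopen(&fd, "foo", O_RDWR, &st),
and treat a return value of 1 as "this is now a different file".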

The reuse of the inode number is just a fact of life and the way
that file systems work. I would suggest rethinking your application
in order to reduce or eliminate any dependence that it might have
on inode numbers not being reused.

All this said, making changes on both the server and the client is
dangerous and can easily lead to consistency and/or performance
issues.

Thanx...

ps


> Attached a test program. Usage:
>
> NFS client: Mount with actimeo=2
> NFS client: ./t
> (Run the next two commands within 2 seconds)
> NFS server: rm -f foo;touch foo
> NFS client: hit enter
>
> Once in a while the result will be:
> 1a: ino=15646940 size=2
> 1b: ino=15646940 size=2
> 1c: ino=15646940 size=2
> 2: ino=15646940 size=0
> 1d: ino=15646940 size=2
>
> ------------------------------------------------------------------------
>
> #include <errno.h>
> #include <string.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <sys/stat.h>
>
> int main(void) {
> 	struct stat st;
> 	int fd, fd2;
> 	char buf[100];
>
> 	fd = open("foo", O_RDWR | O_CREAT, 0666);
> 	write(fd, "xx", 2); fsync(fd);
> 	if (fstat(fd, &st) < 0) perror("fstat()");
> 	printf("1a: ino=%ld size=%ld\n", (long)st.st_ino, st.st_size);
>
> 	fgets(buf, sizeof(buf), stdin);
> 	if (fstat(fd, &st) < 0) perror("fstat()");
> 	else printf("1b: ino=%ld size=%ld\n", (long)st.st_ino, st.st_size);
>
> 	fd2 = open("foo", O_RDWR);
> 	if (fstat(fd, &st) < 0) perror("fstat()");
> 	else printf("1c: ino=%ld size=%ld\n", (long)st.st_ino, st.st_size);
> 	if (fstat(fd2, &st) < 0) perror("fstat()");
> 	else printf("2: ino=%ld size=%ld\n", (long)st.st_ino, st.st_size);
> 	if (fstat(fd, &st) < 0) perror("fstat()");
> 	else printf("1d: ino=%ld size=%ld\n", (long)st.st_ino, st.st_size);
> 	return 0;
> }


