From: "Richard B. Johnson" <root@chaos.analogic.com>
To: Sean Hunter <sean@uncarved.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	nfs@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] 2.4.19-rc1/2.5.25 provide dummy fsync() routine for directories on NFS mounts
Date: Mon, 15 Jul 2002 08:45:12 -0400 (EDT)
Message-ID: <agugr0$eor$2@main.gmane.org>
In-Reply-To: <20020715075221.GC21470@uncarved.com>

On Mon, 15 Jul 2002, Sean Hunter wrote:

> On Tue, Jul 09, 2002 at 03:50:17PM -0400, Richard B. Johnson wrote:
> > On Tue, 9 Jul 2002, Alan Cox wrote:
> > 
> > > > That is what it's supposed to do with files. The attached code clearly
> > > > shows that it doesn't work with directories. The fsync() instantly
> > > > returns, even though there is buffered data still to be written.
> > > 
> > > Your understanding or code is wrong. Its hard to tell which.
> > > 
> > > fsync on the directory syncs the directory metadata not the file metadata
> > > 
> > 
> > Well the original complaint was that Linux NFS didn't allow a directory to
> > be fsync()ed. I showed that POSIX.4 doesn't provide for fsync()ing
> > directories, only files, that you have to fsync() individual files, not
> > the directories that contain them. Others said that fsync()ing individual
> > files was not necessary, that you only have to fsync() the directory. I
> > explained that you have to cheat to even get a fd that can be used
> > to fsync() a directory. Then I showed that fsync()ing a directory in this
> > manner doesn't work so, we are actually in violent agreement.
> 
> I'm not sure whether or not you've got the gist with all the flamage and
> shrapnel flying about, however as I understand it, fsync on a directory fd
> ensures that all directory ops such as rename()s unlinks(), links() etc are
> committed, not that all data pending to all files in that dir are flushed.
> 
> To get all changes you need to fsync the dirfd and all the fds of the files as
> well.
> 
> Because directory changes (such as renames, unlinks etc) are synchronous on NFS
> any way, fsync() on a dir fd on an NFS mount can simply return.  There will
> never be any outstanding dir ops to flush.  ergo: no bug.
> 
> Hope that's clear.
> 
> Sean
> 


NFS has characteristics that make it 'special'. You have a
server that performs local actions on behalf of a remote
client. As long as the server doesn't crash, everything it did
for the client is safe, even if the client crashes and burns.
From the client's perspective, it makes little difference
whether it ever calls fsync() on anything, as long as the
server doesn't crash. Therefore, for this discussion, I will
ignore NFS and other client/server file access systems. But
just because they are special doesn't mean that they should be
treated specially.

Given the following:

		/1/2/3/4/5/6/7/8/9/file

... I suggest that it MUST be sufficient to fsync() 'file' to
assure that the file data can be recovered. That's what POSIX.4
states. If the implementation doesn't allow this, i.e., 'file' can
end up in 'lost+found', then there is a problem that should be
addressed. A local user's program may not know the entire directory
tree, for example in a chrooted environment. Also, the task has no
way of knowing which, if any, of these directory entries have
already been flushed to disk. A directory tree could, in principle,
be as deep as a path of _POSIX_PATH_MAX bytes allows.
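
To make the case concrete, here is a minimal sketch of what the
program in that scenario can actually do: write through one
descriptor and fsync() that same descriptor, knowing nothing about
the directories above it. The function name write_and_fsync() is
made up for illustration. Whether this alone also makes the
directory entries durable is exactly the point under discussion,
and the sketch doesn't settle it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `data` to `path`, then fsync() the very
 * descriptor that did the writing. Returns 0 on success, -1 on failure. */
int write_and_fsync(const char *path, const char *data)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t left = strlen(data);
    while (left > 0) {
        ssize_t n = write(fd, data, left);
        if (n <= 0) {           /* error, or no progress */
            close(fd);
            return -1;
        }
        data += n;
        left -= (size_t)n;
    }

    /* The only sync the program issues: on the file itself,
     * not on any of the directories in its path. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A caller would use it as write_and_fsync("file", "some data\n")
and treat a return of 0 as "the data is on the media".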

In the beginning, when God created Unix, files and directories
were all the same. I could fix a bad directory entry with an
editor. Over the years, certain rules were established to prevent
users from accessing directories as files. They still are files,
but the Operating System(s) try their best to make sure you don't
muck with directories as files.

So now you have to read a directory with getdents() or rather,
since that's not even POSIX, with readdir(). Also, an attempt to
open the directory in any mode other than read-only will fail.
These are all artificial constraints, imposed to make sure you
follow the rules.
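
Both constraints are easy to observe. The sketch below uses two
made-up helpers: count_dir_entries() walks a directory with
opendir()/readdir(), and open_dir_for_write_errno() tries to open
the same directory O_WRONLY and reports the errno it gets back.
POSIX lists EISDIR for that case, but treat the exact value as
something to check on your own system.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: count the entries readdir() returns for `path`
 * (including "." and ".." where the filesystem reports them).
 * Returns -1 if the directory cannot be opened. */
int count_dir_entries(const char *path)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    int n = 0;
    while (readdir(d) != NULL)
        n++;
    closedir(d);
    return n;
}

/* Hypothetical helper: try to open a directory write-only.
 * Returns 0 if the open unexpectedly succeeds, otherwise the errno. */
int open_dir_for_write_errno(const char *path)
{
    int fd = open(path, O_WRONLY);
    if (fd >= 0) {
        close(fd);
        return 0;
    }
    return errno;
}
```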

So, you get a read-only file descriptor and fsync() it! What does
that mean? Obviously, the file must already have existed for it to
be opened read-only. Since I opened it read-only, I could not have
altered its contents, so fsync() has nothing of mine to write.
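
For reference, this is the whole "trick" in code: a sketch, with
a made-up helper name fsync_directory(), that opens a directory
read-only and calls fsync() on the result. What the call returns,
and what it does if it returns 0, depends on the filesystem
underneath; the code only reports the outcome.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fsync() a directory through a read-only
 * descriptor. Returns 0 if both open() and fsync() succeed,
 * otherwise the errno of whichever call failed. */
int fsync_directory(const char *path)
{
    /* O_DIRECTORY makes open() fail with ENOTDIR on a non-directory. */
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return errno;

    int err = (fsync(fd) < 0) ? errno : 0;
    close(fd);
    return err;
}
```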

So, let's say two tasks open the same file. One opens it read-only
and the other read-write. The read-write task is happily writing
to the file. The read-only task executes fsync(). Does this cause
the writer to wait until the file has been flushed to disk? I don't
know, but if it does, we have a very broken system where an
unprivileged reader can severely affect the performance of a
file server with a denial-of-service attack. So, I suggest that
a read-only file descriptor CANNOT cause the contents of a file
to be written. If it does, it's broken. Given this, fsync() on
a directory entry, accessed by a read-only file descriptor, can't
do anything.
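
The two-descriptor case can be set up in a single process. The
sketch below (reader_fsync_errno() is a made-up name, and the
path is whatever scratch file the caller picks) writes through a
read-write descriptor and then calls fsync() on a second,
read-only descriptor for the same file. It shows only whether the
call is accepted. It does not show whether a writer was made to
wait, which would need timing across separate tasks.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns -1 if the setup fails, 0 if fsync()
 * on the read-only descriptor succeeds, otherwise fsync()'s errno.
 * The scratch file at `path` is created and removed again. */
int reader_fsync_errno(const char *path)
{
    int wfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (wfd < 0)
        return -1;

    int rfd = open(path, O_RDONLY);
    if (rfd < 0) {
        close(wfd);
        unlink(path);
        return -1;
    }

    static const char msg[] = "pending data\n";
    int result;
    if (write(wfd, msg, sizeof msg - 1) < 0)
        result = -1;                          /* writer failed */
    else
        result = (fsync(rfd) < 0) ? errno : 0; /* the reader syncs */

    close(rfd);
    close(wfd);
    unlink(path);
    return result;
}
```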

These are things that should be addressed rather than flamed
away. I think that the intent of fsync() on a file is to make
certain that it is on the physical media in a state from which
it can be accessed after a crash. If this is the intent, then
playing games with individual directories is not useful, and
fsync() on the read/write file descriptor actually updating the
file should be sufficient.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.



