public inbox for linux-kernel@vger.kernel.org
From: Andrew Morton <akpm@zip.com.au>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: Linus Torvalds <torvalds@transmeta.com>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [patch 12/16] fix race between writeback and unlink
Date: Sat, 01 Jun 2002 15:25:56 -0700
Message-ID: <3CF949F4.62F13F11@zip.com.au>
In-Reply-To: <3CF88933.2EC13C8F@zip.com.au> <Pine.LNX.4.44.0206010935290.10978-100000@home.transmeta.com> <3CF91E48.C76B34FA@zip.com.au> <20020601200414.GD14918@holomorphy.com>

William Lee Irwin III wrote:
> 
> Linus Torvalds wrote:
> >> The general VFS layer really shouldn't have assigned that strong a meaning
> >> to "i_nlink" anyway, it's not for the VFS layer to decide (and it only
> >> causes problems for any non-UNIX-on-a-disk filesystems).
> 
> On Sat, Jun 01, 2002 at 12:19:36PM -0700, Andrew Morton wrote:
> > Yes, I suspect all the inode refcounting, locking, I_FREEING, I_LOCK, etc
> > could do with a spring clean. Make it a bit more conventional.  I'll
> > discuss with Al when he resurfaces.
> 
> I'm somewhat concerned about the protection of ->i_size, since that
> appears to be accessed in generic_file_read() without any protection
> against writers to the field. From a quick glance at current 2.5 (it
> looks like 2.4 has this too) it looks like it's written to by
> vmtruncate() through notify_change() with the ->i_sem and BKL held at
> the moment, but generic_file_read() doesn't take either before reading
> it, and there may be still other writers.

truncate and write change i_size, under i_sem.   The i_size test on
the read path doesn't really need to be there, I suspect.  It handles
the window where i_size has been decreased by truncate but the filesystem
hasn't finished truncating the blocks yet.  It also optimises reads
outside the end of file - no point in calling into the filesystem
to try to map blocks which aren't there.
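
To illustrate the two jobs that check does, here is a minimal sketch in
plain C. It is not the code in generic_file_read(); clamp_read() and its
parameters are hypothetical names, and it assumes the caller already has
a consistent snapshot of i_size.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: given a snapshot of the
 * file size, return how many bytes a read of `count` bytes at offset
 * `pos` may return.
 */
static size_t clamp_read(int64_t i_size, int64_t pos, size_t count)
{
	int64_t left;

	/* At or beyond end of file: nothing to map, skip the filesystem. */
	if (pos < 0 || pos >= i_size)
		return 0;

	/* Never hand back data past i_size, even if blocks still exist. */
	left = i_size - pos;
	return (uint64_t)left < (uint64_t)count ? (size_t)left : count;
}
```

With a 100-byte file, a 50-byte read at offset 90 is cut to 10 bytes, and
a read at offset 100 or later returns 0 without touching the filesystem.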

> I also don't see anything
> like read_barrier_depends() for lockless algorithms or any atomic reads.
> Even on machines with extremely strong memory consistency models like
> i386, as loff_t is long long, it would seem possible to catch a partial
> update and see an entirely bogus ->i_size value.

That's true.  sys_stat() also could see a confusing intermediate value.
A while back Ingo and Linus were tossing around possible solutions to
this based on x86 compare-and-exchange operations, but nothing conclusive
came out of it.   It's a "known bug".
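
For illustration only, here is one way a torn 64-bit read could be
avoided on a 32-bit machine. This is not the compare-and-exchange
approach mentioned above; it is a sequence-counter retry loop, written
as userspace C11 rather than kernel code. struct sized, size_write() and
size_read() are hypothetical names, and the writer is assumed to be
serialised already (the role i_sem plays for truncate and write).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode's size field. */
struct sized {
	_Atomic uint32_t seq;	/* even: stable, odd: write in progress */
	_Atomic uint32_t lo;	/* low 32 bits of the size */
	_Atomic uint32_t hi;	/* high 32 bits of the size */
};

/* Writer: the caller is assumed to hold the lock that serialises writers. */
static void size_write(struct sized *s, uint64_t v)
{
	uint32_t q = atomic_load_explicit(&s->seq, memory_order_relaxed);

	/* Mark the update as in progress before touching either half. */
	atomic_store_explicit(&s->seq, q + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	/* Publish both halves; the counter is even again. */
	atomic_store_explicit(&s->seq, q + 2, memory_order_release);
}

/* Reader: takes no lock, retries if a write overlapped the two loads. */
static uint64_t size_read(struct sized *s)
{
	uint32_t q1, q2, lo, hi;

	do {
		q1 = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		q2 = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((q1 & 1) || q1 != q2);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer. It pays for that with a retry
whenever the counter was odd or changed between its two samples, which
is the case where it might have paired a new low half with an old high
half.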
