linux-fsdevel.vger.kernel.org archive mirror
From: Nick Piggin <npiggin@suse.de>
To: Chris Mason <chris.mason@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, david@fromorbit.com
Subject: Re: [patch 9/9] mm: do_sync_mapping_range integrity fix
Date: Sat, 1 Nov 2008 09:04:23 +0100
Message-ID: <20081101080423.GB15112@wotan.suse.de>
In-Reply-To: <1225462219.10549.19.camel@think.oraclecorp.com>

On Fri, Oct 31, 2008 at 10:10:19AM -0400, Chris Mason wrote:
> On Fri, 2008-10-31 at 10:16 +0100, Nick Piggin wrote:
> > On Thu, Oct 30, 2008 at 04:13:44PM -0700, Andrew Morton wrote:
> > > On Wed, 29 Oct 2008 01:47:24 +1100
> > > npiggin@suse.de wrote:
> > > 
> > > > Chris Mason notices do_sync_mapping_range didn't actually ask for
> > > > data integrity writeout. Unfortunately, it is advertised as being
> > > > usable for data integrity operations.
> > > > 
> > > > This is a data integrity bug.
> > > > 
> 
> [ use WB_SYNC_ALL instead of WB_SYNC_NONE ]
> 
> > >  If the caller
> > > is using sync_file_range() for integrity then the caller has done a
> > > SYNC_FILE_RANGE_WAIT_BEFORE.
> > 
> > No disputes about whether the API works "by design". But I think the
> > implementation has a bug. I'll explain:
> 
> I'll definitely agree the current usage is clumsy, and there is a bug in
> fs/buffer.c.  A grep through the rest of the filesystems doesn't turn up
> many assumptions that WB_SYNC_NONE means it's ok to skip dirty pages.
> 
> Greps for WB_SYNC_ALL and WB_SYNC_NONE in the fs code reveal:
> 
> fs/buffer.c:__block_write_full_page()
>                 if (wbc->sync_mode != WB_SYNC_NONE || !wbc->nonblocking) {
>                         lock_buffer(bh);
>                 } else if (!trylock_buffer(bh)) {
>                         redirty_page_for_writepage(wbc, page);
>                         continue;
>                 }
> 
> Easily fixed s/||/&&/, which is what XFS does.  reiser3 has the same bug
> in fs/reiserfs/inode.c
> 
> ntfs and gfs2 each have a check that assumes WB_SYNC_NONE means optional
> writeback, both seem fixable with one liners.

Well, we went over this before :) Semantics... Whether or not it can be
considered a bug, WB_SYNC_NONE has for a long time been treated as not
being a data integrity sync, in both code and comments (e.g. buffer.c,
fs-writeback.c and filesystems).

So that's not something I can really change (especially not for stable
releases).

It's _probably_ a good idea not to fragment bios, and it's probably not
going to introduce bugs if we convert everything over. But presently,
all the bugs/misunderstandings/whatever mean that WB_SYNC_NONE is not
usable for data integrity. Obviously this has been the case since long
before sync_mapping_range was added.
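
To make the quoted fs/buffer.c check concrete, here is a minimal user-space
sketch of that condition. It is not kernel code: struct wbc_model and both
helper functions are hypothetical names invented for illustration, and the
model keeps only the two fields that the quoted test reads. It simply
tabulates when a contended buffer would be skipped (page redirtied) rather
than waited for.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's writeback sync modes (illustration only). */
enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/* Hypothetical cut-down model of struct writeback_control. */
struct wbc_model {
	enum sync_mode sync_mode;
	bool nonblocking;
};

/*
 * Mirrors the condition quoted from __block_write_full_page():
 * true means the code takes lock_buffer() and so waits for the buffer.
 */
static bool waits_for_buffer(const struct wbc_model *wbc)
{
	return wbc->sync_mode != WB_SYNC_NONE || !wbc->nonblocking;
}

/*
 * True when the code falls through to trylock_buffer(), i.e. a buffer
 * that is already locked may be skipped and the page redirtied.
 */
static bool may_skip_busy_buffer(const struct wbc_model *wbc)
{
	return !waits_for_buffer(wbc);
}
```

Under this model the only combination that can skip a busy buffer is
WB_SYNC_NONE together with nonblocking set; WB_SYNC_ALL always waits. That
is the sense in which a WB_SYNC_NONE pass cannot be relied on to have
written every dirty buffer.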


> Everywhere we wait on page writeback while we're trying to build nice
> big bios hurts performance.  I'd rather see us switch to something
> closer to the do_sync_mapping_range expectation of WB_SYNC_NONE than
> sprinkle WB_SYNC_ALLs everywhere.

They already are I guess. At least, this is the only place I have seen
that needed changing. After that series, I think we get a lot closer to
being correct in a lot of the writeback paths, which is a good starting
point to make improvements. I'll be looking at the "livelock" avoidance,
but you or other fs developers are probably in a better position to look
at things like waiting for locked or writeback pages...

