From: Nick Piggin <npiggin@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, david@fromorbit.com,
chris.mason@oracle.com
Subject: Re: [patch 9/9] mm: do_sync_mapping_range integrity fix
Date: Fri, 31 Oct 2008 10:16:16 +0100
Message-ID: <20081031091616.GF19268@wotan.suse.de>
In-Reply-To: <20081030161344.0ed5ca52.akpm@linux-foundation.org>

On Thu, Oct 30, 2008 at 04:13:44PM -0700, Andrew Morton wrote:
> On Wed, 29 Oct 2008 01:47:24 +1100
> npiggin@suse.de wrote:
>
> > Chris Mason notices do_sync_mapping_range didn't actually ask for data
> > integrity writeout. Unfortunately, it is advertised as being usable for
> > data integrity operations.
> >
> > This is a data integrity bug.
> >
> > Signed-off-by: Nick Piggin <npiggin@suse.de>
> > ---
> > Index: linux-2.6/fs/sync.c
> > ===================================================================
> > --- linux-2.6.orig/fs/sync.c
> > +++ linux-2.6/fs/sync.c
> > @@ -269,7 +269,7 @@ int do_sync_mapping_range(struct address
> >
> >  	if (flags & SYNC_FILE_RANGE_WRITE) {
> >  		ret = __filemap_fdatawrite_range(mapping, offset, endbyte,
> > -						WB_SYNC_NONE);
> > +						WB_SYNC_ALL);
> >  		if (ret < 0)
> >  			goto out;
> >  	}
> >
>
> Really?
>
> Some thought did go into the code which you're "fixing".

Yes, I even remember something of a flamewar involving me and you :)

> If the caller
> is using sync_file_range() for integrity then the caller has done a
> SYNC_FILE_RANGE_WAIT_BEFORE.

No disputes about whether the API works "by design". But I think the
implementation has a bug. I'll explain:

> So all we need to guarantee here is that
> __filemap_fdatawrite_range(WB_SYNC_NONE) will start writeout on all
> dirty pages in the range. Probably that gets broken lower down as part
> of various hacks^woptimisations have gone in, but which ones, and
> where? Perhaps _this_ (if it's there) is what should be fixed.

WB_SYNC_NONE has never (until this was introduced) been used for data
integrity AFAIKS. There is code littered throughout fs/ which assumes
WB_SYNC_NONE ~= "efficiency / background writeback". At the very least,
the buffer.c "trylock" will definitely cause dirty pages to be skipped.
There is also a fair amount of filesystem code I haven't looked at. The
fs-writeback.c code might not affect this path, but it also definitely
makes the same assumption about WB_SYNC_NONE, so it would be ugly to
mandate that WB_SYNC_NONE is for data integrity from the mapping downward,
but not from the inode upward...

I didn't check, but I suspect this has been broken since it got merged.
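
To illustrate the skipping behaviour described above, here is a toy
userspace model, not kernel code. Every name in it (toy_page,
toy_sync_mode, toy_writeback) is made up for the illustration, and the
"wait for the lock" step is simply modelled as the lock becoming free.
It only shows why a trylock-style pass can leave dirty pages behind
while a blocking pass does not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

struct toy_page {
	bool dirty;	/* page still needs writeout */
	bool locked;	/* somebody else currently holds the lock */
};

/*
 * Walk the range and "write out" dirty pages.
 * Returns the number of pages that are still dirty afterwards.
 */
static size_t toy_writeback(struct toy_page *pages, size_t n,
			    enum toy_sync_mode mode)
{
	size_t left_dirty = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].dirty)
			continue;

		if (pages[i].locked) {
			if (mode == TOY_SYNC_NONE) {
				/* trylock failed: skip the page, it stays dirty */
				left_dirty++;
				continue;
			}
			/* TOY_SYNC_ALL: block until the holder lets go (modelled) */
			pages[i].locked = false;
		}

		/* writeout started and completed in this toy model */
		pages[i].dirty = false;
	}
	return left_dirty;
}
```

Calling toy_writeback() on a range containing one dirty, locked page
leaves that page dirty in TOY_SYNC_NONE mode and cleans it in
TOY_SYNC_ALL mode, which is the difference that matters for an
integrity sync.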
> And if we _do_ make the above change, we don't need to run the
> wait_on_page_writeback_range() if userspace asked for
> SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE, yes?

Now you're asking the hard questions... I think we still have to wait,
because SYNC_FILE_RANGE_WRITE itself doesn't necessarily wait for writeback.
After the optimisation to skip waiting for clean and writeback pages
in write_cache_pages, actually I think this change (to use WB_SYNC_ALL)
should not hurt very much...
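
For reference, the userspace side of this discussion, a caller using
sync_file_range() for integrity over a byte range, might look roughly
like the sketch below. flush_range() is a hypothetical helper name, not
an existing API; the flag combination is the one the sync_file_range(2)
man page describes for a write-and-wait over a range. Note that,
according to the man page, none of these operations write out file
metadata, so this is not a drop-in replacement for fdatasync().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * flush_range() is a hypothetical helper for illustration only.
 *
 * Wait for writeout already in flight, start writeout of the dirty
 * pages in [offset, offset + nbytes), then wait for that writeout too.
 * Returns 0 on success or a negative errno value on failure.
 */
static int flush_range(int fd, long long offset, long long nbytes)
{
	assert(offset >= 0 && nbytes >= 0);

	if (sync_file_range(fd, offset, nbytes,
			    SYNC_FILE_RANGE_WAIT_BEFORE |
			    SYNC_FILE_RANGE_WRITE |
			    SYNC_FILE_RANGE_WAIT_AFTER) < 0)
		return -errno;

	return 0;
}
```

Whether the WRITE step really starts writeout on every dirty page in
the range is exactly what the WB_SYNC_NONE vs WB_SYNC_ALL question
above is about; the sketch only shows what the caller asks for.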
>
>
>
> IOW, I don't think enough thought (or at least description of that
> thought) has gone into this one.