From: Jan Kara <jack@suse.cz>
To: Curt Wohlgemuth <curtw@google.com>
Cc: Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@infradead.org>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
fengguang.wu@intel.com
Subject: Re: [PATCH] writeback: Don't wait for completion in writeback_inodes_sb_nr
Date: Mon, 11 Jul 2011 21:48:35 +0200 [thread overview]
Message-ID: <20110711194835.GI5482@quack.suse.cz> (raw)
In-Reply-To: <CAO81RMairX_L5qGOcKKpw7gWdcTd2Y+5NL=+BA9EwTCNLCNmLw@mail.gmail.com>
On Mon 11-07-11 10:11:17, Curt Wohlgemuth wrote:
> >> One other issue I have with sync as it's structured is that we don't
> >> do a WB_SYNC_ALL pass on any inode that's only associated with a block
> >> device, and not on a mounted filesystem. Blockdev mounts are
> >> pseudo-mounts, and are explicitly skipped in __sync_filesystem(). So
> >> if you've written directly to a block device and do a sync, the only
> >> pass over the pages for this inode are via the
> >> wakeup_flusher_threads() -- which operates on a BDI, regardless of the
> >> superblock, and uses WB_SYNC_NONE.
> >>
> >> All the sync_filesystem() calls are per-sb, not per-BDI, and they'll
> >> exclude pseudo-superblocks.
> >>
> >> I've seen cases in our modified kernels here at Google in which
> >> lilo/shutdown failed because of a lack of WB_SYNC_ALL writeback for
> >> /dev/sda (though I haven't been able to come up with a consistent test
> >> case, nor reproduce this on an upstream kernel).
> > Ho hum, I wonder what would be the cleanest way to fix this. I guess we
> > could make an exception for 'bdev' superblock in the code but it's ugly.
>
> Yes, it's ugly.
>
> You could declare that sync(2) is just doing it's job, that it's not
> designed to write dirty pages from devices that don't have filesystems
> mounted on them; that's what Christoph seems to be saying, and it's
> what the man pages for sync(2) that I've seen say as well. But
> everybody I've talked to about this is surprised, so you could declare
> it a bug as well :-) .
>
> It seems to me that sys_sync *could* just dispense with iterating over
> the superblocks, and just iterate over the BDIs, just as
> wakeup_flusher_threads() does. I.e., just do two passes over all BDIs
> -- one with WB_SYNC_NONE, and one with WB_SYNC_ALL. Well, not quite.
> It would still have to go to each SB and call the quota_sync and
> sync_fs callbacks, but those should be cheap.
Well, they're not exactly cheap (journaling filesystems have to force
transaction commits and wait for them) but that does not really matter.
The real problem is that to wait for IO completion you need some list of
inodes you want to wait for and you can currently get such list only from
superblock.
> And yes, this would make sys_sync and sys_syncfs more different than
> they are today.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2011-07-11 19:48 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-28 23:43 [PATCH] writeback: Don't wait for completion in writeback_inodes_sb_nr Curt Wohlgemuth
2011-06-29 0:54 ` Dave Chinner
2011-06-29 1:56 ` Curt Wohlgemuth
2011-06-29 8:11 ` Christoph Hellwig
2011-06-29 16:57 ` Jan Kara
2011-06-29 17:55 ` Christoph Hellwig
2011-06-29 19:15 ` Jan Kara
2011-06-29 20:12 ` Christoph Hellwig
2011-06-30 12:15 ` Jan Kara
2011-06-30 12:37 ` Christoph Hellwig
2011-07-01 22:55 ` Curt Wohlgemuth
2011-07-02 11:32 ` Christoph Hellwig
2011-07-11 17:00 ` Jan Kara
2011-07-11 17:11 ` Curt Wohlgemuth
2011-07-11 19:48 ` Jan Kara [this message]
2011-07-11 19:51 ` Curt Wohlgemuth
2011-07-11 20:11 ` Christoph Hellwig
2011-07-12 10:34 ` Jan Kara
2011-07-12 10:41 ` Christoph Hellwig
2011-07-12 22:37 ` Jan Kara
2011-07-14 16:29 ` Curt Wohlgemuth
2011-07-14 23:08 ` Jan Kara
2011-07-19 16:56 ` Christoph Hellwig
2011-07-21 18:35 ` Jan Kara
2011-07-22 15:16 ` Christoph Hellwig
2011-07-19 16:53 ` Christoph Hellwig
2011-07-19 16:51 ` Christoph Hellwig
2011-07-20 22:00 ` Jan Kara
2011-07-22 15:11 ` Christoph Hellwig
2011-06-29 17:26 ` Curt Wohlgemuth
2011-06-29 18:00 ` Christoph Hellwig
2011-06-29 21:30 ` Curt Wohlgemuth
2011-07-19 15:54 ` Christoph Hellwig
2011-06-29 6:42 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110711194835.GI5482@quack.suse.cz \
--to=jack@suse.cz \
--cc=curtw@google.com \
--cc=fengguang.wu@intel.com \
--cc=hch@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).