From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: trying to understand READ_META, READ_SYNC, WRITE_SYNC & co
Date: Mon, 21 Jun 2010 13:04:36 +0200 [thread overview]
Message-ID: <20100621110436.GA4056@lst.de> (raw)
In-Reply-To: <4C1F3916.4070608@kernel.dk>
On Mon, Jun 21, 2010 at 12:04:06PM +0200, Jens Axboe wrote:
> It's also annotation for blktrace, so you can tell which parts of the IO
> is meta data etc. The scheduler impact is questionable, I doubt it makes
> a whole lot of difference.
> For analysis purposes, annotating all meta data IOs as such would be
> beneficial (reads as well as writes). As mentioned above, the current
> scheduler impact isn't huge. There may be some interesting test and
> benchmarks there for improving that part.
As mentioned in the previous mail the use of the flag is not very
wide spread. Currently it's only ext3/ext3 inodes and directories as well
as all metadata I/O in gfs2 that gets marked this way. And I'd be much more
comfortable to add more annotations if it didn't also have some form
of schedule impact. The I/O schedules in general and cfq in particular
have caused us far too many issues with such subtile differences.
> > This one is used in quite a few places, with many of them
> > obsfucated by macros like rw_is_sync, rq_is_sync and
> > cfq_bio_sync. In general all uses seem to imply giving
> > a write request the same priority as a read request and
> > treat it as synchronous. I could not spot a place where
> > it actually has any effect on reads.
>
> Reads are sync by nature in the block layer, so they don't get that
> special annotation.
Well, we do give them this special annotation in a few places, but we
don't actually use it.
> So a large part of that problem is the overloaded meaning of sync. For
> some cases it means "unplug on issue", and for others it means that the
> IO itself is syncronous. The other nasty bit is the implicit plugging
> that happens behind the back of the submitter, but that's an issue to
> tackle separately. I'd suggest avoiding unnecessary churn in naming of
> those.
Well, the current naming is extremly confusing. The best thing I could
come up with is to completely drop READ_SYNC and WRITE_SYNC and just
pass REQ_UNPLUG explicitly together with READ / WRITE_SYNC_PLUG.
There's only 5 respective 8 users of them, so explicitly documenting
our intentions there seems useful. Especially if we want to add more
_META annotation in which case the simple READ_* / WRITE_* macros
don't do anymore either. Similarly it might be useful to remove
READ_META/WRITE_META and replace them with explicit | REQ_META, which
is just about as short and a lot more descriptive, especially for
synchronous metadata writes.
> > Why do O_DIRECT writes not want to set REQ_NOIDLE (and that
> > exactly does REQ_NOIDLE mean anyway). It's the only sync writes
> > that do not set it, so if this special case went away we
> > could get rid of the flag and key it off REQ_SYNC.
>
> See above for NOIDLE. You kill O_DIRECT write throughput if you don't
> idle at the end of a write, if you have other activity on the disk.
Ok, makes sense. Would you mind taking a patch to kill the
WRITE_ODIRECT_PLUG and just do a
/*
* O_DIRECT writes are synchronous, but we must not disable the
* idling logic in CFQ to avoid killing performance.
*/
if (rw & WRITE)
rw |= REQ_SYNC;
But that leaves the question why disabling the idling logical for
data integrity ->writepage is fine? This gets called from ->fsync
or O_SYNC writes and will have the same impact as O_DIRECT writes.
next prev parent reply other threads:[~2010-06-21 11:04 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-06-21 9:48 trying to understand READ_META, READ_SYNC, WRITE_SYNC & co Christoph Hellwig
2010-06-21 10:04 ` Jens Axboe
2010-06-21 11:04 ` Christoph Hellwig [this message]
2010-06-21 18:56 ` Jens Axboe
2010-06-21 19:14 ` Christoph Hellwig
2010-06-21 19:16 ` Jens Axboe
2010-06-21 19:20 ` Christoph Hellwig
2010-06-21 21:36 ` Vivek Goyal
2010-06-23 10:01 ` Christoph Hellwig
2010-06-24 1:44 ` Vivek Goyal
2010-06-25 11:03 ` Christoph Hellwig
2010-06-26 3:35 ` Vivek Goyal
2010-06-26 10:05 ` Christoph Hellwig
2010-06-26 11:20 ` Jens Axboe
2010-06-26 11:56 ` Christoph Hellwig
2010-06-27 15:44 ` Jeff Moyer
2010-06-29 9:06 ` Corrado Zoccolo
2010-06-29 12:30 ` Vivek Goyal
2010-06-29 12:30 ` Vivek Goyal
2010-06-30 15:30 ` Corrado Zoccolo
2010-06-30 15:30 ` Corrado Zoccolo
2010-06-26 9:25 ` Nick Piggin
2010-06-26 9:27 ` Christoph Hellwig
2010-06-26 10:10 ` Nick Piggin
2010-06-26 10:16 ` Christoph Hellwig
2010-06-21 18:52 ` Jeff Moyer
2010-06-21 18:58 ` Jens Axboe
2010-06-21 19:08 ` Jeff Moyer
2010-06-23 9:26 ` Christoph Hellwig
2010-06-21 20:25 ` Vivek Goyal
2010-06-23 10:02 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100621110436.GA4056@lst.de \
--to=hch@lst.de \
--cc=axboe@kernel.dk \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.