From: Andrew Morton <akpm@linux-foundation.org>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
	Linux Kernel Developers List <linux-kernel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	jack@suse.cz
Subject: Re: [PATCH 1/3] block_write_full_page: Use synchronous writes for WBC_SYNC_ALL writebacks
Date: Tue, 7 Apr 2009 00:23:13 -0700
Message-ID: <20090407002313.fcdd1da0.akpm@linux-foundation.org>
In-Reply-To: <20090407070835.GM5178@kernel.dk>

On Tue, 7 Apr 2009 09:08:36 +0200 Jens Axboe <jens.axboe@oracle.com> wrote:

> On Mon, Apr 06 2009, Andrew Morton wrote:
> > On Mon, 6 Apr 2009 23:21:41 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > > I mean, let's graph it:
> > > 
> > > WRITE_SYNC -> WRITE_SYNC_PLUG -> BIO_RW_SYNCIO -> bio_sync() -> REQ_RW_SYNC -> rw_is_sync() -> does something mysterious in get_request()
> > >                                                                             -> rq_is_sync() -> does something mysterious in IO schedulers
> > >                               -> BIO_RW_NOIDLE -> bio_noidle() -> REQ_NOIDLE -> rq_noidle() -> does something mysterious in cfq-iosched only
> > >            -> BIO_RW_UNPLUG   -> bio_unplug() -> REQ_UNPLUG -> OK, the cognoscenti know what this is supposed to do, but it is unused!

I think the number of different greps which were needed to find all the
above was excessive.  Too many levels of wrappers and helpers.

If there was documentation at the intermediate levels then that would
terminate the search early.  But working out the _actual_ semantics of
(say) BIO_RW_SYNCIO is quite hard!

> > whoop, I found a use of bio_unplug() in __make_request().
> > 
> > So it appears that the intent of your patch is to cause an unplug after
> > submission of each WB_SYNC_ALL block?
> > 
> > But what about all the other stuff which WRITE_SYNC might or might not
> > do?  What does WRITE_SYNC _actually_ do, and what are the actual
> > effects of this change??
> > 
> > And what effect will this large stream of unplugs have upon merging?
> 
> It looks like a good candidate for WRITE_SYNC_PLUG instead,

Perhaps that means that Ted didn't know what his own patch did.  I
certainly couldn't work it out.  That's a problem, IMO!

> since it
> does more than one buffer submission before waiting. It likely won't mean
> a whole lot since we'll usually only have a single buffer on that page,
> but for < PAGE_CACHE_SIZE block sizes it could easily make a big
> difference (4 ios instead of 1!).

OK.

But what is the advantage in doing this stream of unplugs?  For your
average fsync(), we're probably doing tens of thousands per second.
Does it actually help?

I assume that the actual code path for such a buffer becomes a lot
longer because we need to go beyond the queueing layer and perhaps
as far down as the device driver for each block, so the CPU cost will
go up?

> So on the write side, basically we have:

Could we get this in patch form, pretty please?

> WRITE                   Normal async write.
> WRITE_SYNC_PLUG         Sync write, someone will wait on this so don't
>                         treat it as background activity. This is a hint
>                         to the io schedulers. This one does NOT unplug
>                         the queue, either the caller should do it after
>                         submission, or he should make sure that the
>                         wait_on_* callbacks do it for him.

The description isn't terribly useful unless the reader is told what
actions the schedulers are expected to take in response to the hint.

> WRITE_SYNC              Like WRITE_SYNC_PLUG, but causes immediate
>                         unplug of the queue after submission. Most
>                         uses of this should likely use WRITE_SYNC_PLUG,
>                         at least in the normal IO path.
> WRITE_ODIRECT           Like WRITE_SYNC, but also passes a hint to the
>                         IO scheduler that we should expect more IO.
>                         This is similar to how a read is treated in the
>                         scheduler, it'll enable anticipation/idling.

Ditto, somewhat.

> Ditto for the SWRITE* variants, which are special hacks for
> ll_rw_block() only.
> 
> I have killed REQ_UNPLUG, it doesn't make sense to pass it further down
> than to __make_request(), so the bio flag is enough.

OK.
