From: Vivek Goyal <vgoyal@redhat.com>
To: Shaohua Li <shaohua.li@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jaxboe@fusionio.com" <jaxboe@fusionio.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	"khlebnikov@openvz.org" <khlebnikov@openvz.org>,
	"jmoyer@redhat.com" <jmoyer@redhat.com>
Subject: Re: [RFC PATCH 0/3] block: Fix fsync slowness with CFQ cgroups
Date: Tue, 28 Jun 2011 21:29:55 -0400
Message-ID: <20110629012955.GA19041@redhat.com>
In-Reply-To: <1309309495.15392.213.camel@sli10-conroe>

On Wed, Jun 29, 2011 at 09:04:55AM +0800, Shaohua Li wrote:

[..]
> > We idle on last queue on sync-noidle tree. So we idle on fsync queue as
> > it is last queue on sync-noidle tree. That's how we provide protection
> > to all sync-noidle queues against sync-idle queues. Instead of idling
> > on individual queues we do idling in group and that is on service tree.
> Ok. but this looks silly. We are idling in a noidle service tree or a
> group (backed by the last queue of the tree or group) because we assume
> the tree or group can dispatch a request soon. But if the think time of
> the tree or group is big, the assumption isn't true. Doing idle here is
> blind. I thought we can extend the think time check for both service
> tree and group.

We can implement thinktime for the noidle service tree and for group idle
as well. That's not a problem, though I am yet to be convinced that
thinktime makes sense for the group. I guess it will just mean: in the
past, have you done a bunch of IO with a gap of less than 8ms between
IOs? If yes, then we expect you to do more IO in future. Frankly
speaking, I am not too sure how well the past IO pattern predicts the
future IO pattern of the group.
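
As a rough illustration of the check described above, here is a minimal
Python sketch. It is not the CFQ code: the class name, the method names
and the 7/8 decay weight are assumptions made for the example. Only the
8ms threshold comes from the mail.

```python
SLICE_IDLE_MS = 8.0  # idle window quoted in the mail; treated as a constant here


class ThinkTime:
    """Hypothetical tracker: decaying average of the gaps between one IO
    completing and the next IO being issued by the same queue, tree or group."""

    def __init__(self, weight=7 / 8):
        self.weight = weight   # share of the old average kept per sample (assumed)
        self.mean_ms = None    # no history yet

    def record_gap(self, gap_ms):
        # The first sample seeds the average; later samples are blended in.
        if self.mean_ms is None:
            self.mean_ms = float(gap_ms)
        else:
            self.mean_ms = self.weight * self.mean_ms + (1 - self.weight) * gap_ms
        return self.mean_ms

    def worth_idling(self):
        # Idle only if past gaps suggest the next IO arrives within the window.
        return self.mean_ms is not None and self.mean_ms < SLICE_IDLE_MS
```

The decision is based purely on past gaps, which is exactly the doubt
raised above: nothing in it says the group will keep the same pattern.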

But anyway, the point is, even if we implement it, it will not solve
the fsync issue at hand. I explained the reason in my previous mail. We
will be oscillating between high and low thinktime depending on whether
we are idling or not. There is no correlation between the think time of
the fsync thread and idling here.
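
The oscillation can be sketched with a toy model in Python. The model is
an assumption made for illustration, not something measured: it supposes
that whenever the scheduler idles on the fsync queue, the journal IO the
fsync thread depends on is held back by the full idle window, so the
fsync thread's next IO arrives that much later. The function name, the
7/8 decay weight and the 2ms journal time are hypothetical.

```python
SLICE_IDLE_MS = 8.0  # idle window from the mail


def simulate(rounds, journal_ms, weight=7 / 8):
    """Return the idle decision taken in each round of a toy feedback loop.

    Assumed model: if we idle on the fsync queue, the journal IO waits out
    the idle window first, so the gap seen by the fsync queue grows by
    SLICE_IDLE_MS. If we do not idle, the gap is just the journal IO time.
    """
    mean = float(journal_ms)  # start from the undisturbed gap
    decisions = []
    for _ in range(rounds):
        idling = mean < SLICE_IDLE_MS
        gap = journal_ms + (SLICE_IDLE_MS if idling else 0.0)
        mean = weight * mean + (1 - weight) * gap
        decisions.append(idling)
    return decisions


if __name__ == "__main__":
    # Hypothetical 2ms journal commit: the decision keeps flipping.
    print(simulate(40, journal_ms=2.0))
```

Under these assumptions the average think time is driven by the idling
decision itself rather than by the fsync thread, so the check never
settles on a stable answer.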

I think you are banking on the fact that after fsync, journaling thread
IO can take more than 8ms, hence delaying the next IO from the fsync
thread and pushing its thinktime above 8ms, so that we will not idle on
the fsync thread at all. It is just one corner case and I think it
breaks in multiple cases.

- If filesystem barriers are disabled or backend storage has battery
  backup, then journal IO will most likely go into cache and barriers
  will be ignored. In that case the write will finish almost instantly
  and we will get the next IO from the fsync thread very soon, pushing
  down the thinktime of the fsync thread, which will enable idling, and
  we will be back to the problem we are trying to solve.

- The fsync thread might be submitting a string of IOs (say 10-12) before
  it moves to the journal thread to commit metadata. In that case we
  might have lowered the thinktime of the fsync thread and hence enabled
  idling.
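
Both cases above can be checked with a small Python sketch. All numbers
(25ms for a slow journal commit, 0.5ms for a cached one, 0.2ms between
IOs in a burst of 11) and the 7/8 decay weight are hypothetical values
picked for illustration. Only the 8ms threshold comes from the mail, and
the averaging is a simplified stand-in for the real think time
accounting.

```python
SLICE_IDLE_MS = 8.0  # idle window from the mail


def think_time_trace(gaps_ms, weight=7 / 8):
    """Return the decaying-average think time after each observed gap."""
    trace = []
    mean = None
    for gap in gaps_ms:
        mean = float(gap) if mean is None else weight * mean + (1 - weight) * gap
        trace.append(mean)
    return trace


# The case being banked on: every journal commit takes a (hypothetical) 25ms.
slow_journal = think_time_trace([25.0] * 10)

# Case 1: barriers disabled or battery-backed cache, commit returns in ~0.5ms.
cached_journal = think_time_trace([0.5] * 10)

# Case 2: bursts of 11 IOs 0.2ms apart, then one 25ms wait on the journal.
bursty = think_time_trace(([0.2] * 11 + [25.0]) * 5)

if __name__ == "__main__":
    for name, trace in (("slow", slow_journal),
                        ("cached", cached_journal),
                        ("bursty", bursty)):
        print(name, "idles at some point:", min(trace) < SLICE_IDLE_MS)
```

With these inputs only the slow-journal trace stays above the threshold.
The cached and bursty traces stay below 8ms throughout, so a think time
check would keep idling on the fsync queue in both.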

So implementing think time for service tree/group might be a good idea
in general, but it will not solve this IO dependency issue across cgroups.

Thanks
Vivek

Thread overview: 18+ messages
2011-06-27 20:17 [RFC PATCH 0/3] block: Fix fsync slowness with CFQ cgroups Vivek Goyal
2011-06-27 20:17 ` [PATCH 1/3] block: A new interface for specifying IO dependencing among tasks Vivek Goyal
2011-06-27 20:17 ` [PATCH 2/3] ext4: Explicitly specify fsync dependency on journaling thread Vivek Goyal
2011-06-27 20:17 ` [PATCH 3/3] ext3: " Vivek Goyal
2011-06-28  1:18 ` [RFC PATCH 0/3] block: Fix fsync slowness with CFQ cgroups Shaohua Li
2011-06-28  1:40   ` Vivek Goyal
2011-06-28  2:03     ` Shaohua Li
2011-06-28 13:04       ` Vivek Goyal
2011-06-29  1:04         ` Shaohua Li
2011-06-29  1:29           ` Vivek Goyal [this message]
2011-06-30  0:29             ` Shaohua Li
2011-06-28  2:47 ` Dave Chinner
2011-06-28 13:35   ` Vivek Goyal
2011-06-28 11:00 ` Konstantin Khlebnikov
2011-06-28 13:45   ` Vivek Goyal
2011-06-28 14:42     ` Konstantin Khlebnikov
2011-06-28 14:47       ` Vivek Goyal
2011-06-28 21:20         ` Vivek Goyal
