From: Vivek Goyal <vgoyal@redhat.com>
To: Corrado Zoccolo <czoccolo@gmail.com>
Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cfq: Make use of service count to estimate the rb_key offset
Date: Mon, 30 Nov 2009 10:36:04 -0500
Message-ID: <20091130153604.GB11670@redhat.com>
In-Reply-To: <4e5e476b0911260108s2fe4cd86lcb32c7be76b4f75c@mail.gmail.com>

On Thu, Nov 26, 2009 at 10:08:36AM +0100, Corrado Zoccolo wrote:
> Hi Gui, Jens
> On Thu, Nov 26, 2009 at 7:20 AM, Gui Jianfeng
> <guijianfeng@cn.fujitsu.com> wrote:
> > Hi Jens, Czoccolo
> >
> > For the moment, different workload cfq queues are put into different
> > service trees. But CFQ still uses "busy_queues" to estimate rb_key
> > offset when inserting a cfq queue into a service tree. I think this
> > isn't appropriate, and it should make use of service tree count to do
> > this estimation. This patch is for for-2.6.33 branch.
> 
> In cfq_choose_wl, we rely on consistency of rb_keys across service
> trees to compute the next workload to be serviced.
>         for (i = 0; i < 3; ++i) {
>                 /* otherwise, select the one with lowest rb_key */
>                 queue = cfq_rb_first(service_tree_for(prio, i, cfqd));
>                 if (queue &&
>                     (!key_valid || time_before(queue->rb_key, lowest_key))) {
>                         lowest_key = queue->rb_key;
>                         cur_best = i;
>                         key_valid = true;
>                 }
>         }
> 
> If you change how the rb_key is computed (so it is no longer
> consistent across service trees) without changing how it is used, you
> can introduce problems.
> 

Hi Corrado,

Currently rb_key seems to be a combination of two things: busy_queues and
jiffies.

In the new scheme, where we decide the share of a workload and then switch
to a new workload, the dependence on busy_queues does not seem to make much
sense.

Assume a bunch of sequential readers get backlogged and then a few random
readers get backlogged. Now a random reader will get a higher rb_key because
there are, say, 8 sequential readers on the sync-idle tree.

IIUC, with the above logic, even if we expire the sync-idle workload
duration once, we might not switch to the sync-noidle workload and might
start running the sync-idle workload again (because of minimum slice length
restrictions, or if low_latency is not set).
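
To put rough numbers on this, here is a small userspace sketch (not kernel
code). The helper names and the constants (a 100-jiffy base sync slice and
a scale factor of 5) are assumptions picked for illustration. Only the shape
of the offset formula follows cfq_slice_offset() as it appears in the patch
quoted below, and the sketch models sync queues only.

```c
/* Assumed values for illustration only; not taken from this thread. */
#define BASE_SYNC_SLICE 100	/* base slice for sync queues, in jiffies */
#define SLICE_SCALE	5	/* divisor used to scale the slice by ioprio */

/*
 * Hypothetical stand-in for cfq_prio_slice() on a sync queue:
 * a lower ioprio value (higher priority) yields a longer slice.
 */
long prio_slice(int ioprio)
{
	return BASE_SYNC_SLICE + (BASE_SYNC_SLICE / SLICE_SCALE) * (4 - ioprio);
}

/* Offset as computed today: scaled by all busy queues in the system. */
long offset_from_busy_queues(long busy_queues, int ioprio)
{
	return (busy_queues - 1) * (prio_slice(0) - prio_slice(ioprio));
}

/* Offset as in the proposed patch: scaled by the queue's own tree count. */
long offset_from_tree_count(long tree_count, int ioprio)
{
	return tree_count * (prio_slice(0) - prio_slice(ioprio));
}
```

Under these assumptions, a random reader at ioprio 4 arriving while 8
sequential readers are busy gets offset_from_busy_queues(9, 4) = 8 * 80 =
640 jiffies added to its rb_key, whereas offset_from_tree_count(0, 4) is 0
if its own tree is empty when it is added.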

So instead of relying on rb_keys to switch the workload type, why not do
it in a round-robin manner across the workload types? Then rb_key would be
significant only within a service tree and not across service trees.
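
A minimal sketch of what round-robin switching could look like, in plain C
rather than against the real cfq structures. NR_WORKLOADS,
next_workload_rr() and the count[] array are hypothetical names; the only
thing taken from the code quoted above is that three workload trees are
scanned.

```c
#define NR_WORKLOADS 3	/* matches the "i < 3" loop in the quoted code */

/*
 * Hypothetical helper: pick the next workload type after 'last' in
 * round-robin order, skipping trees that have no queues.
 * count[i] is the number of queues on workload tree i.
 * Pass last = -1 when nothing has been serviced yet.
 * Returns the chosen index, or -1 if every tree is empty.
 */
int next_workload_rr(const unsigned int count[NR_WORKLOADS], int last)
{
	int step;

	for (step = 1; step <= NR_WORKLOADS; ++step) {
		int i = (last + step) % NR_WORKLOADS;

		if (count[i] > 0)
			return i;
	}
	return -1;
}
```

With count = {0, 1, 8} and index 2 serviced last, this returns 1, whatever
rb_key the queue on tree 1 happens to have. If only one tree is non-empty,
the same tree is picked again.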

Thanks
Vivek

> Thanks,
> Corrado
> 
> >
> > Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
> > ---
> >  block/cfq-iosched.c |    8 ++++++--
> >  1 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > index 1bcbd8c..467981e 100644
> > --- a/block/cfq-iosched.c
> > +++ b/block/cfq-iosched.c
> > @@ -600,11 +600,15 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
> >  static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
> >                                      struct cfq_queue *cfqq)
> >  {
> > +       struct cfq_rb_root *service_tree;
> > +
> > +       service_tree = service_tree_for(cfqq_prio(cfqq), cfqq_type(cfqq), cfqd);
> > +
> >        /*
> >         * just an approximation, should be ok.
> >         */
> > -       return (cfqd->busy_queues - 1) * (cfq_prio_slice(cfqd, 1, 0) -
> > -                      cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio));
> > +       return  service_tree->count * (cfq_prio_slice(cfqd, 1, 0) -
> > +                  cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio));
> >  }
> >
> >  /*
> > --
> > 1.5.4.rc3
> >
> > --
> > Regards
> > Gui Jianfeng
> >
> >
> 
> 
> 
> -- 
> __________________________________________________________________________
> 
> dott. Corrado Zoccolo                          mailto:czoccolo@gmail.com
> PhD - Department of Computer Science - University of Pisa, Italy
> --------------------------------------------------------------------------
> The self-confidence of a warrior is not the self-confidence of the average
> man. The average man seeks certainty in the eyes of the onlooker and calls
> that self-confidence. The warrior seeks impeccability in his own eyes and
> calls that humbleness.
>                                Tales of Power - C. Castaneda
