From: Vivek Goyal <vgoyal@redhat.com>
To: Corrado Zoccolo <czoccolo@gmail.com>
Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cfq: Make use of service count to estimate the rb_key offset
Date: Mon, 30 Nov 2009 10:36:04 -0500
Message-ID: <20091130153604.GB11670@redhat.com>
In-Reply-To: <4e5e476b0911260108s2fe4cd86lcb32c7be76b4f75c@mail.gmail.com>

On Thu, Nov 26, 2009 at 10:08:36AM +0100, Corrado Zoccolo wrote:
> Hi Gui, Jens
> On Thu, Nov 26, 2009 at 7:20 AM, Gui Jianfeng
> <guijianfeng@cn.fujitsu.com> wrote:
> > Hi Jens, Czoccolo
> >
> > For the moment, different workload cfq queues are put into different
> > service trees. But CFQ still uses "busy_queues" to estimate rb_key
> > offset when inserting a cfq queue into a service tree. I think this
> > isn't appropriate, and it should make use of service tree count to do
> > this estimation. This patch is for for-2.6.33 branch.
> 
> In cfq_choose_wl, we rely on consistency of rb_keys across service
> trees to compute the next workload to be serviced.
>         for (i = 0; i < 3; ++i) {
>                 /* otherwise, select the one with lowest rb_key */
>                 queue = cfq_rb_first(service_tree_for(prio, i, cfqd));
>                 if (queue &&
>                     (!key_valid || time_before(queue->rb_key, lowest_key))) {
>                         lowest_key = queue->rb_key;
>                         cur_best = i;
>                         key_valid = true;
>                 }
>         }
> 
> Changing how the rb_key is computed (so that it is no longer
> consistent across service trees) without changing how it is used can
> introduce problems.
> 

Hi Corrado,

Currently rb_key seems to be a combination of two things: busy_queues and
jiffies.

In the new scheme, where we decide the share of a workload and then switch
to a new workload, the dependence on busy_queues does not seem to make much
sense.

Assume a bunch of sequential readers (say 8) get backlogged and then a few
random readers get backlogged. Now a random reader will get a higher rb_key
because there are 8 sequential readers on the sync-idle tree.
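
To make the numbers concrete, here is a rough model of the two offset
formulas from the quoted patch. The slice model (a 100-tick base slice,
adjusted by base/5 for each ioprio level away from 4) and all names
prefixed with model_ are assumptions for illustration, not values taken
from this thread. Whether service_tree->count already includes the queue
being inserted is not stated here either, so the count is passed in
explicitly.

```c
/* Assumed model constants; not read from any kernel configuration. */
#define MODEL_BASE_SLICE_SYNC 100L  /* assumed sync base slice, in ticks */
#define MODEL_SLICE_SCALE     5L    /* assumed scaling divisor */

/* Assumed slice model: lower ioprio value means a longer slice. */
static long model_prio_slice(long base, int ioprio)
{
    return base + (base / MODEL_SLICE_SCALE) * (4 - ioprio);
}

/* Difference between the ioprio-0 slice and this queue's slice. */
static long model_slice_delta(int ioprio)
{
    return model_prio_slice(MODEL_BASE_SLICE_SYNC, 0) -
           model_prio_slice(MODEL_BASE_SLICE_SYNC, ioprio);
}

/* Shape of the existing formula: scaled by all busy queues minus one. */
static long model_offset_busy_queues(int busy_queues, int ioprio)
{
    return (long)(busy_queues - 1) * model_slice_delta(ioprio);
}

/* Shape of the proposed formula: scaled by the queue's own tree count. */
static long model_offset_tree_count(int tree_count, int ioprio)
{
    return (long)tree_count * model_slice_delta(ioprio);
}
```

Under these assumptions, a priority-4 random reader inserted while 9
queues are busy gets model_offset_busy_queues(9, 4) = 640 ticks added on
top of jiffies, while model_offset_tree_count(0, 4) gives 0 if no other
queue is counted on its own tree.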

IIUC, with the above logic, even if we expire the sync-idle workload
duration once, we might not switch to the sync-noidle workload and instead
start running the sync-idle workload again (because of minimum slice length
restrictions, or if low_latency is not set).
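
A small model of the lowest-key selection quoted above may help show why
the switch can fail to happen. It follows the loop from cfq_choose_wl, with
the service trees reduced to two arrays. The function names, the array
representation and the -1 return for "all trees empty" are assumptions made
for this sketch. model_time_before mirrors the usual wrap-safe jiffies
comparison.

```c
#include <stdbool.h>

/* Wrap-safe "a is earlier than b" test on tick counters. */
static bool model_time_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/*
 * has_queue[i]: tree i is non-empty.
 * first_key[i]: lowest rb_key on tree i (ignored when the tree is empty).
 * Returns the index of the tree holding the lowest key, or -1 if all
 * three trees are empty (a simplification of this sketch).
 */
static int model_choose_lowest_key(const bool has_queue[3],
                                   const unsigned long first_key[3])
{
    unsigned long lowest_key = 0;
    bool key_valid = false;
    int cur_best = -1;

    for (int i = 0; i < 3; ++i) {
        if (has_queue[i] &&
            (!key_valid || model_time_before(first_key[i], lowest_key))) {
            lowest_key = first_key[i];
            cur_best = i;
            key_valid = true;
        }
    }
    return cur_best;
}
```

Taking index 1 as sync-noidle and index 2 as sync-idle (an assumption of
this sketch), has_queue = {false, true, true} with hypothetical keys
first_key = {0, 1640, 1080} returns 2: the random reader's large offset
keeps the sync-idle tree in front even right after its slice expired.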

So instead of relying on rb_keys to switch the workload type, why not do
it in a round-robin manner across the workload types? Then rb_key would be
significant only within a service tree and not across service trees.
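
The round-robin idea could look roughly like the sketch below. This is not
code from any patch in this thread; the function name, the plain array of
per-tree queue counts and the -1 return value are hypothetical. It simply
advances to the next non-empty workload type after the current one and
ignores rb_key across trees.

```c
#define MODEL_WL_TYPES 3  /* three workload types, as in the quoted loop */

/*
 * current:    index of the workload type that just ran.
 * tree_count: number of queues on each workload's service tree.
 * Returns the next workload type to serve in round-robin order, or -1
 * if every tree is empty. The current type is tried last, so it is
 * chosen again only when no other tree has queues.
 */
static int model_next_workload_rr(int current,
                                  const int tree_count[MODEL_WL_TYPES])
{
    for (int step = 1; step <= MODEL_WL_TYPES; ++step) {
        int candidate = (current + step) % MODEL_WL_TYPES;

        if (tree_count[candidate] > 0)
            return candidate;
    }
    return -1;
}
```

With tree_count = {0, 1, 8} and current = 2, this returns 1, so the single
queue on tree 1 gets its turn regardless of its rb_key.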

Thanks
Vivek

> Thanks,
> Corrado
> 
> >
> > Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
> > ---
> >  block/cfq-iosched.c |    8 ++++++--
> >  1 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > index 1bcbd8c..467981e 100644
> > --- a/block/cfq-iosched.c
> > +++ b/block/cfq-iosched.c
> > @@ -600,11 +600,15 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
> >  static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
> >                                      struct cfq_queue *cfqq)
> >  {
> > +       struct cfq_rb_root *service_tree;
> > +
> > +       service_tree = service_tree_for(cfqq_prio(cfqq), cfqq_type(cfqq), cfqd);
> > +
> >        /*
> >         * just an approximation, should be ok.
> >         */
> > -       return (cfqd->busy_queues - 1) * (cfq_prio_slice(cfqd, 1, 0) -
> > -                      cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio));
> > +       return  service_tree->count * (cfq_prio_slice(cfqd, 1, 0) -
> > +                  cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio));
> >  }
> >
> >  /*
> > --
> > 1.5.4.rc3
> >
> > --
> > Regards
> > Gui Jianfeng
> >
> >
> 
> 
> 
> -- 
> __________________________________________________________________________
> 
> dott. Corrado Zoccolo                          mailto:czoccolo@gmail.com
> PhD - Department of Computer Science - University of Pisa, Italy
> --------------------------------------------------------------------------
> The self-confidence of a warrior is not the self-confidence of the average
> man. The average man seeks certainty in the eyes of the onlooker and calls
> that self-confidence. The warrior seeks impeccability in his own eyes and
> calls that humbleness.
>                                Tales of Power - C. Castaneda
