Message-ID: <4E047145.8050601@parallels.com>
Date: Fri, 24 Jun 2011 15:13:09 +0400
From: Konstantin Khlebnikov
To: Vivek Goyal
CC: Jens Axboe, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] cfq-iosched: queue groups more gracefully
References: <20110623162206.3222.3312.stgit@localhost6> <20110623175118.GD20763@redhat.com>
In-Reply-To: <20110623175118.GD20763@redhat.com>

Vivek Goyal wrote:
> On Thu, Jun 23, 2011 at 08:22:06PM +0400, Konstantin Khlebnikov wrote:
>> This patch queue awakened cfq-groups according its current vdisktime,
>> it try to save upto one group timeslice from unused virtual disk time.
>> Thus group does not loses everything, if it was not continuously backlogged.
>>
>> Signed-off-by: Konstantin Khlebnikov
>
> I think this patch is not required till we start preemption across
> groups? Any more details of actual use will help.

I saw some problems with fairness and latency between groups doing
parallel intensive IO and interactive groups -- cfq always puts interactive
groups at the end, so their latency is extremely high. With this patch,
interactive groups get a real chance to be scheduled much earlier.
I'm sorry, I cannot show simple test cases right now.

>
>> ---
>>  block/cfq-iosched.c | 36 ++++++++++++++++++++++++++++++------
>>  1 files changed, 30 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index c71533e..d5c7c79 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -592,6 +592,26 @@ cfq_group_slice(struct cfq_data *cfqd, struct cfq_group *cfqg)
>>  	return cfq_target_latency * cfqg->weight / st->total_weight;
>>  }
>>
>> +static inline u64
>> +cfq_group_vslice(struct cfq_data *cfqd, struct cfq_group *cfqg)
>> +{
>> +	struct cfq_rb_root *st =&cfqd->grp_service_tree;
>> +	u64 vslice;
>> +
>> +	/* There no group slices in iops mode */
>> +	if (iops_mode(cfqd))
>> +		return 0;
>> +
>> +	/*
>> +	 * Equal to cfq_scale_slice(cfq_group_slice(cfqd, cfqg), cfqg).
>> +	 * Add group weight beacuse it currently not in service tree.
>> +	 */
>> +	vslice = (u64)cfq_target_latency<< CFQ_SERVICE_SHIFT;
>> +	vslice *= BLKIO_WEIGHT_DEFAULT;
>> +	do_div(vslice, st->total_weight + cfqg->weight);
>
> Above is not equivalent to cfq_scale_slice(cfq_group_slice(cfqd, cfqg),
> cfqg) as comment says.
>
> you are not calculating cfq_group_slice(). Instead using cfq_target_latency.

No, this expression gives the same value as
cfq_scale_slice(cfq_group_slice()) will give once the group has been added
to the service tree. It is equal to the slice that the group will receive
if it is queued immediately after the addition.

>
> Also it does not make sense. A higher weight group gets lower vslice
> and in turn gets put further away on the tree. This is reverse of what
> you want.
>
>> +	return vslice;
>> +}
>> +
>>  static inline unsigned
>>  cfq_scaled_cfqq_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>>  {
>> @@ -884,16 +904,20 @@ cfq_group_notify_queue_add(struct cfq_data *cfqd, struct cfq_group *cfqg)
>>  		return;
>>
>>  	/*
>> -	 * Currently put the group at the end. Later implement something
>> -	 * so that groups get lesser vtime based on their weights, so that
>> -	 * if group does not loose all if it was not continuously backlogged.
>> +	 * Bump vdisktime to be greater or equal min_vdisktime.
>> +	 */
>> +	cfqg->vdisktime = max_vdisktime(cfqg->vdisktime, st->min_vdisktime);
>> +
>
> why do we need to do this? Time should not go back, it's dangerous.
>
>> +	/*
>> +	 * Put the group at the end, but save one slice from unused time.
>>  	 */
>>  	n = rb_last(&st->rb);
>>  	if (n) {
>>  		__cfqg = rb_entry_cfqg(n);
>> -		cfqg->vdisktime = __cfqg->vdisktime + CFQ_IDLE_DELAY;
>> -	} else
>> -		cfqg->vdisktime = st->min_vdisktime;
>> +		cfqg->vdisktime = max_vdisktime(cfqg->vdisktime,
>                                             ^^^^^^^
> I think you meant st->min_vdisktime here?

No, I adjust the group's vdisktime to put it at the end, but save up to
one slice. There may be a problem with overlap, though, on wakeup after a
very long sleep.

>> +				__cfqg->vdisktime -
>> +				cfq_group_vslice(cfqd, cfqg));
>> +	}
>>  	cfq_group_service_tree_add(st, cfqg);
>>  }
>>
>
> Thanks
> Vivek
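
The arithmetic discussed in this thread can be checked in user space. What
follows is only a minimal sketch, not the kernel code: the constant values
(shift 12, default weight 500, target latency 300) are assumptions chosen
for illustration, the function names are hypothetical stand-ins for
cfq_group_slice(), cfq_scale_slice() and cfq_group_vslice(), and the
comparison in place_vdisktime() is a plain unsigned max that ignores the
wraparound handling a real max_vdisktime() would need.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed illustrative constants, not taken from the thread. */
#define SERVICE_SHIFT   12
#define WEIGHT_DEFAULT  500
#define TARGET_LATENCY  300

/* Wall-clock slice of a group already counted in total_weight. */
static uint64_t group_slice(uint64_t weight, uint64_t total_weight)
{
	return (uint64_t)TARGET_LATENCY * weight / total_weight;
}

/* Convert a wall-clock slice into virtual time for a given weight. */
static uint64_t scale_slice(uint64_t delta, uint64_t weight)
{
	return (delta << SERVICE_SHIFT) * WEIGHT_DEFAULT / weight;
}

/*
 * Direct form from the patch: the waking group is not in the tree yet,
 * so its weight is added to the tree's total before dividing.
 */
static uint64_t group_vslice(uint64_t weight, uint64_t tree_weight)
{
	assert(tree_weight + weight > 0);
	return ((uint64_t)TARGET_LATENCY << SERVICE_SHIFT) * WEIGHT_DEFAULT
		/ (tree_weight + weight);
}

/*
 * Placement from the patch: never before min_vdisktime, and no earlier
 * than one vslice before the last group. Simplified: no wraparound.
 */
static uint64_t place_vdisktime(uint64_t own, uint64_t min_vdisktime,
				uint64_t last, uint64_t vslice)
{
	uint64_t v = own > min_vdisktime ? own : min_vdisktime;
	uint64_t floor = last > vslice ? last - vslice : 0;

	return v > floor ? v : floor;
}
```

With a tree weight of 1000 and a waking group of weight 500,
group_vslice(500, 1000) and scale_slice(group_slice(500, 1500), 500) both
give 409600, which matches the equivalence claimed in the reply; integer
rounding can make the two forms differ slightly for other weights. A group
of weight 1000 on the same tree gets 307200, so a heavier group does get a
smaller vslice, which is the effect Vivek points out.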