From: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: axboe@kernel.dk, linux-kernel@vger.kernel.org
Subject: Re: [patch] cfq-iosched: fix incorrect filing of rt async cfqq
Date: Wed, 21 Jan 2015 13:55:59 +0900 [thread overview]
Message-ID: <54BF315F.5090709@hitachi.com> (raw)
In-Reply-To: <x494mrvu4wi.fsf@segfault.boston.devel.redhat.com>
Hi,
Thank you for the patch, Jeff!
(2015/01/13 5:21), Jeff Moyer wrote:
> Hi,
>
> If you can manage to submit an async write as the first async I/O from
> the context of a process with realtime scheduling priority, then a
> cfq_queue is allocated, but filed into the wrong async_cfqq bucket. It
> ends up in the best effort array, but actually has realtime I/O
> scheduling priority set in cfqq->ioprio.
Actually, this bug causes a severe I/O starvation in the following scenario
(we've experienced):
1. RT demon issues an O_SYNC write to an ext2 filesystem early in the
boot sequence (this is an example which causes the first async write
to a device in an RT process context)
2. Do heavy buffered writes to the device (these writes will be
treated as RT class)
3. Issue a sync I/O to the device, and it will take over tens seconds
to be prevented by many RT class I/Os
I confirmed Jeff's patch solves the above problem.
Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
>
> The reason is that cfq_get_queue assumes the default scheduling class and
> priority when there is no information present (i.e. when the async cfqq
> is created):
>
> static struct cfq_queue *
> cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
> struct bio *bio, gfp_t gfp_mask)
> {
> const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
> const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
>
> cic->ioprio starts out as 0, which is "invalid". So, class of 0
> (IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:
>
> async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
>
> static struct cfq_queue **
> cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
> {
> switch (ioprio_class) {
> case IOPRIO_CLASS_RT:
> return &cfqd->async_cfqq[0][ioprio];
> case IOPRIO_CLASS_NONE:
> ioprio = IOPRIO_NORM;
> /* fall through */
> case IOPRIO_CLASS_BE:
> return &cfqd->async_cfqq[1][ioprio];
> case IOPRIO_CLASS_IDLE:
> return &cfqd->async_idle_cfqq;
> default:
> BUG();
> }
> }
>
> Here, instead of returning a class mapped from the process' scheduling
> priority, we get back the bucket associated with IOPRIO_CLASS_BE.
>
> Now, there is no queue allocated there yet, so we create it:
>
> cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
>
> That function ends up doing this:
>
> cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
> cfq_init_prio_data(cfqq, cic);
>
> cfq_init_cfqq marks the priority as having changed. Then, cfq_init_prio
> data does this:
>
> ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
> switch (ioprio_class) {
> default:
> printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
> case IOPRIO_CLASS_NONE:
> /*
> * no prio set, inherit CPU scheduling settings
> */
> cfqq->ioprio = task_nice_ioprio(tsk);
> cfqq->ioprio_class = task_nice_ioclass(tsk);
> break;
>
> So we basically have two code paths that treat IOPRIO_CLASS_NONE
> differently, which results in an RT async cfqq filed into a best effort
> bucket.
>
> Attached is a patch which fixes the problem. I'm not sure how to make
> it cleaner. Suggestions would be welcome.
>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 6f2751d..b9abdca 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3656,12 +3656,17 @@ static struct cfq_queue *
> cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
> struct bio *bio, gfp_t gfp_mask)
> {
> - const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
> - const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
> + int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
> + int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
> struct cfq_queue **async_cfqq = NULL;
> struct cfq_queue *cfqq = NULL;
>
> if (!is_sync) {
> + if (!ioprio_valid(cic->ioprio)) {
> + struct task_struct *tsk = current;
> + ioprio = task_nice_ioprio(tsk);
> + ioprio_class = task_nice_ioclass(tsk);
> + }
> async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
> cfqq = *async_cfqq;
> }
>
>
--
Hidehiro Kawai
Hitachi, Yokohama Research Laboratory
next prev parent reply other threads:[~2015-01-21 5:30 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-12 20:21 [patch] cfq-iosched: fix incorrect filing of rt async cfqq Jeff Moyer
2015-01-21 4:55 ` Hidehiro Kawai [this message]
2015-01-21 17:37 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54BF315F.5090709@hitachi.com \
--to=hidehiro.kawai.ez@hitachi.com \
--cc=axboe@kernel.dk \
--cc=jmoyer@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.