From: Vivek Goyal <vgoyal@redhat.com>
To: Corrado Zoccolo <czoccolo@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>,
Linux-Kernel <linux-kernel@vger.kernel.org>,
Jeff Moyer <jmoyer@redhat.com>, Shaohua Li <shaohua.li@intel.com>,
Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Subject: Re: [PATCH 2/2] cfq-iosched: rethink seeky detection for SSDs
Date: Mon, 8 Mar 2010 09:08:20 -0500 [thread overview]
Message-ID: <20100308140820.GA10306@redhat.com> (raw)
In-Reply-To: <4e5e476b1003051431m1bcd919and1726339b4570b8@mail.gmail.com>
On Fri, Mar 05, 2010 at 11:31:43PM +0100, Corrado Zoccolo wrote:
[..]
> > ---
> > block/cfq-iosched.c | 7 +++++--
> > 1 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > index 40840da..296aa23 100644
> > --- a/block/cfq-iosched.c
> > +++ b/block/cfq-iosched.c
> > @@ -1788,9 +1788,12 @@ static bool cfq_should_idle(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> > if (prio == IDLE_WORKLOAD)
> > return false;
> >
> > + /* Don't idle on NCQ SSDs */
> > + if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag)
> > + return false;
> > +
> > /* We do for queues that were marked with idle window flag. */
> > - if (cfq_cfqq_idle_window(cfqq) &&
> > - !(blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag))
> > + if (cfq_cfqq_idle_window(cfqq))
> > return true;
> >
> > /*
> > --
> > 1.6.2.5
> >
> >
> > Now I am wondering what the side effects of the above change are.
> >
> > One advantage of idling on the service tree even on an NCQ SSD is that
> > READS get more protection from WRITES, especially on SSDs which implement
> > NCQ but have poor WRITE speed or don't implement parallel IO
> > channels.
> I think the following code:
> /*
> * If this is an async queue and we have sync IO in flight, let it wait
> */
> if (cfqd->rq_in_flight[BLK_RW_SYNC] && !cfq_cfqq_sync(cfqq))
> return false;
> already provides this protection, regardless of the result of cfq_should_idle().
It does in some cases, but not in others. For example, consider the case of
one writer and one reader (sequential or seeky) trying to do some IO. The
reader does one read, then the writer immediately pumps a few requests
(4-5 in the dispatch queue), then the reader does its next read, and so on.
This can add to the reader's latency on bad NCQ SSDs.
>
> >
> > I see cfq_should_idle() being used at four places.
> >
> > - cfq_select_queue()
> > if (cfqq->cfqg->nr_cfqq == 1 && RB_EMPTY_ROOT(&cfqq->sort_list)
> > && cfqq->dispatched && cfq_should_idle(cfqd, cfqq)) {
> > cfqq = NULL;
> > goto keep_queue;
> > }
> >
> > This is for fairness in group scheduling. I think the impact here should
> > not be much, because this condition is primarily hit on media where we idle
> > on the queue.
> >
> > On NCQ SSD, I don't think we get fairness in group scheduling much until
> > and unless a group is generating enough traffic to keep the request queue
> > full.
> >
> > - cfq_select_queue()
> > (cfqq->dispatched && cfq_should_idle(cfqd, cfqq))) {
> > cfqq = NULL;
> >
> > Should we really wait for a dispatched request to complete on NCQ SSD?
> I don't think so. This is the cause for the bad performance we are looking at.
> And remember that, without my patch, you will still get the same bad
> behaviour, just with different conditions (one random and one
> sequential reader, regardless of request size).
> > On SSDs with parallel channels, I think this will reduce throughput?
> >
> > - cfq_may_dispatch()
> >
> > /*
> > * Drain async requests before we start sync IO
> > */
> > if (cfq_should_idle(cfqd, cfqq) && cfqd->rq_in_driver[BLK_RW_ASYNC])
> > return false;
> >
> > If NCQ SSD is implementing parallel IO channels, probably there is no need
> > to clear out WRITES before we start sync IO?
> Right. I think this is present for crazy NCQ rotational disks, where a
> write could sit in cache forever if a stream of reads is coming.
> >
> > - cfq_arm_slice_timer()
> >
> > if (!cfqd->cfq_slice_idle || !cfq_should_idle(cfqd, cfqq))
> > return;
> >
> > cfq_arm_slice_timer() already checks for NCQ SSDs and does not arm the
> > timer, so this function should remain unaffected by this change.
> >
> > So the question is, should we wait on service tree on an NCQ SSD or not?
> I don't see a good reason to wait, and bad results if we do.
Actually, with my patch we will not get fairness on NCQ SSDs (group
scheduling) unless a group has enough traffic to keep the SSD busy (or
the dispatch queue full). Previously we used to wait on an empty
service tree on NCQ SSDs, and now we will not.
But I guess we can move the wait on the group under the tunable
group_isolation=1, so that if one prefers more isolation between groups,
even at the expense of total throughput, he can choose to do so.
> >
> > Secondly, your patch determining a cfqq's seeky nature based on request
> > size on SSDs helps only on non-NCQ SSDs.
> We know it helps on non-NCQ SSDs. If we fix the waiting bug, it could also
> help on NCQ, in scenarios where a lot of processes (> NCQ depth) are
> submitting requests with largely different sizes.
Ok, I will run some more tests with my patch and if I don't see any major
regressions, I will send it to Jens.
Thanks
Vivek