* [PATCH] cfq-iosched: quantum check tweak --resend
@ 2010-03-01 1:50 Shaohua Li
2010-03-01 8:02 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Shaohua Li @ 2010-03-01 1:50 UTC (permalink / raw)
To: jens.axboe; +Cc: linux-kernel, czoccolo, vgoyal, jmoyer, guijianfeng
Currently a queue can only dispatch up to 4 requests if there are other queues.
This isn't optimal: the device can handle more requests; AHCI, for example, can
handle 31. I understand the limit is there for fairness, but we could make a
tweak: if the queue still has a lot of slice left, it seems we could ignore
the limit. Tests show this boosts my workload (two-thread random read on an
SSD) from 78 MB/s to 100 MB/s.
Thanks for suggestions from Corrado and Vivek for the patch.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
block/cfq-iosched.c | 30 ++++++++++++++++++++++++++----
1 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index f27e535..0db07d7 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -19,7 +19,7 @@
  * tunables
  */
 /* max queue in one round of service */
-static const int cfq_quantum = 4;
+static const int cfq_quantum = 8;
 static const int cfq_fifo_expire[2] = { HZ / 4, HZ / 8 };
 /* maximum backwards seek, in KiB */
 static const int cfq_back_max = 16 * 1024;
@@ -2197,6 +2197,19 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
 	return dispatched;
 }
 
+static inline bool cfq_slice_used_soon(struct cfq_data *cfqd,
+	struct cfq_queue *cfqq)
+{
+	/* the queue hasn't finished any request, can't estimate */
+	if (cfq_cfqq_slice_new(cfqq))
+		return 1;
+	if (time_after(jiffies + cfqd->cfq_slice_idle * cfqq->dispatched,
+		cfqq->slice_end))
+		return 1;
+
+	return 0;
+}
+
 static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
 	unsigned int max_dispatch;
@@ -2213,7 +2226,7 @@ static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	if (cfqd->rq_in_flight[BLK_RW_SYNC] && !cfq_cfqq_sync(cfqq))
 		return false;
 
-	max_dispatch = cfqd->cfq_quantum;
+	max_dispatch = max_t(unsigned int, cfqd->cfq_quantum / 2, 1);
 	if (cfq_class_idle(cfqq))
 		max_dispatch = 1;
 
@@ -2230,13 +2243,22 @@ static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 		/*
 		 * We have other queues, don't allow more IO from this one
 		 */
-		if (cfqd->busy_queues > 1)
+		if (cfqd->busy_queues > 1 && cfq_slice_used_soon(cfqd, cfqq))
 			return false;
 
 		/*
 		 * Sole queue user, no limit
 		 */
-		max_dispatch = -1;
+		if (cfqd->busy_queues == 1)
+			max_dispatch = -1;
+		else
+			/*
+			 * Normally we start throttling cfqq when cfq_quantum/2
+			 * requests have been dispatched. But we can drive
+			 * deeper queue depths at the beginning of slice
+			 * subjected to upper limit of cfq_quantum.
+			 * */
+			max_dispatch = cfqd->cfq_quantum;
 	}
 
 	/*
--
1.6.3.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] cfq-iosched: quantum check tweak --resend
2010-03-01 1:50 [PATCH] cfq-iosched: quantum check tweak --resend Shaohua Li
@ 2010-03-01 8:02 ` Jens Axboe
2010-03-01 8:15 ` Shaohua Li
0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2010-03-01 8:02 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-kernel, czoccolo, vgoyal, jmoyer, guijianfeng
On Mon, Mar 01 2010, Shaohua Li wrote:
> Currently a queue can only dispatch up to 4 requests if there are other queues.
> This isn't optimal, device can handle more requests, for example, AHCI can
> handle 31 requests. I can understand the limit is for fairness, but we could
> do a tweak: if the queue still has a lot of slice left, sounds we could
> ignore the limit. Test shows this boost my workload (two thread randread of
> a SSD) from 78m/s to 100m/s.
> Thanks for suggestions from Corrado and Vivek for the patch.
As mentioned before, I think we definitely want to ensure that we drive
the full queue depth whenever possible. I think your patch is a bit
dangerous, though. The problematic workload here is a buffered write,
interleaved with the occasional sync reader. If the sync reader has to
endure 32 requests every time, latency rises dramatically for him.
--
Jens Axboe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] cfq-iosched: quantum check tweak --resend
2010-03-01 8:02 ` Jens Axboe
@ 2010-03-01 8:15 ` Shaohua Li
2010-03-01 8:19 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Shaohua Li @ 2010-03-01 8:15 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-kernel@vger.kernel.org, czoccolo@gmail.com,
vgoyal@redhat.com, jmoyer@redhat.com, guijianfeng@cn.fujitsu.com
On Mon, Mar 01, 2010 at 04:02:34PM +0800, Jens Axboe wrote:
> On Mon, Mar 01 2010, Shaohua Li wrote:
> > Currently a queue can only dispatch up to 4 requests if there are other queues.
> > This isn't optimal, device can handle more requests, for example, AHCI can
> > handle 31 requests. I can understand the limit is for fairness, but we could
> > do a tweak: if the queue still has a lot of slice left, sounds we could
> > ignore the limit. Test shows this boost my workload (two thread randread of
> > a SSD) from 78m/s to 100m/s.
> > Thanks for suggestions from Corrado and Vivek for the patch.
>
> As mentioned before, I think we definitely want to ensure that we drive
> the full queue depth whenever possible. I think your patch is a bit
> dangerous, though. The problematic workload here is a buffered write,
> interleaved with the occasional sync reader. If the sync reader has to
> endure 32 requests every time, latency rises dramatically for him.
The patch still maintains a hard limit on dispatched requests. For an async
queue, the limit is cfq_slice_async/cfq_slice_idle = 5. For a sync queue, the
limit is 8. And we only send out that many requests at the beginning of a slice.
For the workload you mentioned here, we only dispatch 1 extra request.
Thanks,
Shaohua
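
The limits quoted above can be checked with a small user-space sketch. It is not kernel code: the helper names (slice_used_soon, max_burst) are hypothetical, the slice lengths are assumed to be the usual defaults (cfq_slice_sync = HZ/10, cfq_slice_async = HZ/25, cfq_slice_idle = HZ/125) expressed in milliseconds as if HZ were 1000, and priority scaling of the slice, jiffies wraparound, and the cfq_cfqq_slice_new() case are all ignored. It counts how many requests a queue could send back to back, with no completions in between, while other queues are busy.

```c
#include <stdbool.h>

/* Assumed defaults, in milliseconds (jiffies if HZ were 1000). */
enum {
	SLICE_SYNC_MS  = 100,	/* cfq_slice_sync  = HZ / 10  (assumed) */
	SLICE_ASYNC_MS = 40,	/* cfq_slice_async = HZ / 25  (assumed) */
	SLICE_IDLE_MS  = 8,	/* cfq_slice_idle  = HZ / 125 (assumed) */
	QUANTUM        = 8,	/* cfq_quantum after the patch */
};

/*
 * Same comparison as the time_after() test in cfq_slice_used_soon(),
 * with "elapsed" being the time already spent in the slice.
 */
static bool slice_used_soon(long elapsed, long slice_len, int dispatched)
{
	return elapsed + (long)SLICE_IDLE_MS * dispatched > slice_len;
}

/* Number of requests dispatched back to back before throttling kicks in. */
static int max_burst(long elapsed, long slice_len)
{
	int soft = QUANTUM / 2 > 1 ? QUANTUM / 2 : 1;	/* soft limit */
	int dispatched = 0;

	for (;;) {
		int limit = soft;

		if (dispatched >= soft) {
			/* Past the soft limit: continue only with slice to spare. */
			if (slice_used_soon(elapsed, slice_len, dispatched))
				break;
			limit = QUANTUM;	/* hard limit */
		}
		if (dispatched >= limit)
			break;
		dispatched++;
	}
	return dispatched;
}
```

Under these assumptions, one millisecond into the slice, max_burst(1, SLICE_ASYNC_MS) gives 5 (one more than the soft limit of 4) and max_burst(1, SLICE_SYNC_MS) gives 8, capped by the quantum. Late in an async slice, max_burst(30, SLICE_ASYNC_MS) falls back to 4.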
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] cfq-iosched: quantum check tweak --resend
2010-03-01 8:15 ` Shaohua Li
@ 2010-03-01 8:19 ` Jens Axboe
2010-03-01 8:22 ` Shaohua Li
0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2010-03-01 8:19 UTC (permalink / raw)
To: Shaohua Li
Cc: linux-kernel@vger.kernel.org, czoccolo@gmail.com,
vgoyal@redhat.com, jmoyer@redhat.com, guijianfeng@cn.fujitsu.com
On Mon, Mar 01 2010, Shaohua Li wrote:
> On Mon, Mar 01, 2010 at 04:02:34PM +0800, Jens Axboe wrote:
> > On Mon, Mar 01 2010, Shaohua Li wrote:
> > > Currently a queue can only dispatch up to 4 requests if there are other queues.
> > > This isn't optimal, device can handle more requests, for example, AHCI can
> > > handle 31 requests. I can understand the limit is for fairness, but we could
> > > do a tweak: if the queue still has a lot of slice left, sounds we could
> > > ignore the limit. Test shows this boost my workload (two thread randread of
> > > a SSD) from 78m/s to 100m/s.
> > > Thanks for suggestions from Corrado and Vivek for the patch.
> >
> > As mentioned before, I think we definitely want to ensure that we drive
> > the full queue depth whenever possible. I think your patch is a bit
> > dangerous, though. The problematic workload here is a buffered write,
> > interleaved with the occasional sync reader. If the sync reader has to
> > endure 32 requests every time, latency rises dramatically for him.
> the patch still matains a hardlimit for dispatched request. For a async,
> the limit is cfq_slice_async/cfq_slice_idle = 5. For sync, the limit is 8.
> And we only pipe out such number of requests at the begining of a slice.
> For the workload you mentioned here, we only dispatch 1 extra request.
OK, that sounds appropriate. Final question - why change the quantum and
use quantum/2?
--
Jens Axboe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] cfq-iosched: quantum check tweak --resend
2010-03-01 8:19 ` Jens Axboe
@ 2010-03-01 8:22 ` Shaohua Li
2010-03-01 8:25 ` Jens Axboe
0 siblings, 1 reply; 6+ messages in thread
From: Shaohua Li @ 2010-03-01 8:22 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-kernel@vger.kernel.org, czoccolo@gmail.com,
vgoyal@redhat.com, jmoyer@redhat.com, guijianfeng@cn.fujitsu.com
On Mon, Mar 01, 2010 at 04:19:20PM +0800, Jens Axboe wrote:
> On Mon, Mar 01 2010, Shaohua Li wrote:
> > On Mon, Mar 01, 2010 at 04:02:34PM +0800, Jens Axboe wrote:
> > > On Mon, Mar 01 2010, Shaohua Li wrote:
> > > > Currently a queue can only dispatch up to 4 requests if there are other queues.
> > > > This isn't optimal, device can handle more requests, for example, AHCI can
> > > > handle 31 requests. I can understand the limit is for fairness, but we could
> > > > do a tweak: if the queue still has a lot of slice left, sounds we could
> > > > ignore the limit. Test shows this boost my workload (two thread randread of
> > > > a SSD) from 78m/s to 100m/s.
> > > > Thanks for suggestions from Corrado and Vivek for the patch.
> > >
> > > As mentioned before, I think we definitely want to ensure that we drive
> > > the full queue depth whenever possible. I think your patch is a bit
> > > dangerous, though. The problematic workload here is a buffered write,
> > > interleaved with the occasional sync reader. If the sync reader has to
> > > endure 32 requests every time, latency rises dramatically for him.
> > the patch still matains a hardlimit for dispatched request. For a async,
> > the limit is cfq_slice_async/cfq_slice_idle = 5. For sync, the limit is 8.
> > And we only pipe out such number of requests at the begining of a slice.
> > For the workload you mentioned here, we only dispatch 1 extra request.
>
> OK, that sound appropriate. Final question - why change the quantum and
> use quantum/2?
This was suggested by Vivek. This way, quantum is still the hard limit and
doesn't surprise users. We start throttling at 1/2 quantum (soft limit) and
then stop at quantum (hard limit).
Thanks,
Shaohua
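
The soft/hard limit split can be sketched as a standalone function. This is a simplification of the patched cfq_may_dispatch(), not the kernel function itself: may_dispatch and its parameters are hypothetical names, the result of cfq_slice_used_soon() is passed in as a flag, and the idle-class and async-throttling paths are left out.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * quantum:         the cfq_quantum tunable (hard limit)
 * busy_queues:     number of queues with pending requests
 * dispatched:      requests this queue already has in flight
 * slice_used_soon: assumed result of cfq_slice_used_soon() for this queue
 */
static bool may_dispatch(unsigned int quantum, unsigned int busy_queues,
			 unsigned int dispatched, bool slice_used_soon)
{
	/* Soft limit: half the quantum, but never below 1. */
	unsigned int max_dispatch = quantum / 2 > 1 ? quantum / 2 : 1;

	if (dispatched >= max_dispatch) {
		/* Other queues are waiting and the slice is nearly spent. */
		if (busy_queues > 1 && slice_used_soon)
			return false;

		if (busy_queues == 1)
			max_dispatch = UINT_MAX;	/* sole queue user, no limit */
		else
			max_dispatch = quantum;		/* hard limit */
	}

	return dispatched < max_dispatch;
}
```

With quantum = 8 and two busy queues, a queue holding 4 requests may send another only while slice_used_soon is false, and is refused once it reaches 8 regardless. A user who lowers the quantum tunable still gets a ceiling that is never exceeded while other queues are busy.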
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] cfq-iosched: quantum check tweak --resend
2010-03-01 8:22 ` Shaohua Li
@ 2010-03-01 8:25 ` Jens Axboe
0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2010-03-01 8:25 UTC (permalink / raw)
To: Shaohua Li
Cc: linux-kernel@vger.kernel.org, czoccolo@gmail.com,
vgoyal@redhat.com, jmoyer@redhat.com, guijianfeng@cn.fujitsu.com
On Mon, Mar 01 2010, Shaohua Li wrote:
> On Mon, Mar 01, 2010 at 04:19:20PM +0800, Jens Axboe wrote:
> > On Mon, Mar 01 2010, Shaohua Li wrote:
> > > On Mon, Mar 01, 2010 at 04:02:34PM +0800, Jens Axboe wrote:
> > > > On Mon, Mar 01 2010, Shaohua Li wrote:
> > > > > Currently a queue can only dispatch up to 4 requests if there are other queues.
> > > > > This isn't optimal, device can handle more requests, for example, AHCI can
> > > > > handle 31 requests. I can understand the limit is for fairness, but we could
> > > > > do a tweak: if the queue still has a lot of slice left, sounds we could
> > > > > ignore the limit. Test shows this boost my workload (two thread randread of
> > > > > a SSD) from 78m/s to 100m/s.
> > > > > Thanks for suggestions from Corrado and Vivek for the patch.
> > > >
> > > > As mentioned before, I think we definitely want to ensure that we drive
> > > > the full queue depth whenever possible. I think your patch is a bit
> > > > dangerous, though. The problematic workload here is a buffered write,
> > > > interleaved with the occasional sync reader. If the sync reader has to
> > > > endure 32 requests every time, latency rises dramatically for him.
> > > the patch still matains a hardlimit for dispatched request. For a async,
> > > the limit is cfq_slice_async/cfq_slice_idle = 5. For sync, the limit is 8.
> > > And we only pipe out such number of requests at the begining of a slice.
> > > For the workload you mentioned here, we only dispatch 1 extra request.
> >
> > OK, that sound appropriate. Final question - why change the quantum and
> > use quantum/2?
> This is suggested by Vivek. In this way quantum is still the hard limit and
> doesn't surprise users. we do throttling at 1/2 quantum (softlimit) and
> then stop at quantum (hard limit)
OK, that makes sense. I will apply the patch, thanks!
--
Jens Axboe
^ permalink raw reply [flat|nested] 6+ messages in thread