[PATCH] block: Make cfq_target_latency tunable through sysfs.

* [PATCH] block: Make cfq_target_latency tunable through sysfs.
@ 2012-03-27  8:16 Tao Ma
  2012-03-29  0:32 ` Greg KH
  2012-03-31 15:31 ` [PATCH V2 1/2] " Tao Ma
  0 siblings, 2 replies; 7+ messages in thread
From: Tao Ma @ 2012-03-27  8:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

From: Tao Ma <boyu.mt@taobao.com>

In cfq, when we calculate a time slice for a process (or a cfqq, to be
precise), we have to consider cfq_target_latency, which controls the
estimated latency (300ms) of all sync requests. But in some Hadoop
tests we have found that if many processes are doing sequential reads
(24, for example), the throughput is bad because every process can only
work for about 25ms before the cfqq is switched. That leads to more
disk seeks. We can achieve good throughput by setting low_latency=0,
but then the latency of some reads is too high for the application.

So this patch makes cfq_target_latency tunable through sysfs so that we
can tune it and find a value that is acceptable for both throughput and
read latency.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 block/cfq-iosched.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 4572952..3c38536 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -295,6 +295,7 @@ struct cfq_data {
 	unsigned int cfq_slice_idle;
 	unsigned int cfq_group_idle;
 	unsigned int cfq_latency;
+	unsigned int cfq_target_latency;
 
 	/*
 	 * Fallback dummy cfqq for extreme OOM conditions
@@ -604,7 +605,7 @@ cfq_group_slice(struct cfq_data *cfqd, struct cfq_group *cfqg)
 {
 	struct cfq_rb_root *st = &cfqd->grp_service_tree;
 
-	return cfq_target_latency * cfqg->weight / st->total_weight;
+	return cfqd->cfq_target_latency * cfqg->weight / st->total_weight;
 }
 
 static inline unsigned
@@ -2271,7 +2272,8 @@ new_workload:
 		 * to have higher weight. A more accurate thing would be to
 		 * calculate system wide asnc/sync ratio.
 		 */
-		tmp = cfq_target_latency * cfqg_busy_async_queues(cfqd, cfqg);
+		tmp = cfqd->cfq_target_latency *
+			cfqg_busy_async_queues(cfqd, cfqg);
 		tmp = tmp/cfqd->busy_queues;
 		slice = min_t(unsigned, slice, tmp);
 
@@ -3737,6 +3739,7 @@ static void *cfq_init_queue(struct request_queue *q)
 	cfqd->cfq_back_penalty = cfq_back_penalty;
 	cfqd->cfq_slice[0] = cfq_slice_async;
 	cfqd->cfq_slice[1] = cfq_slice_sync;
+	cfqd->cfq_target_latency = cfq_target_latency;
 	cfqd->cfq_slice_async_rq = cfq_slice_async_rq;
 	cfqd->cfq_slice_idle = cfq_slice_idle;
 	cfqd->cfq_group_idle = cfq_group_idle;
@@ -3788,6 +3791,7 @@ SHOW_FUNCTION(cfq_slice_sync_show, cfqd->cfq_slice[1], 1);
 SHOW_FUNCTION(cfq_slice_async_show, cfqd->cfq_slice[0], 1);
 SHOW_FUNCTION(cfq_slice_async_rq_show, cfqd->cfq_slice_async_rq, 0);
 SHOW_FUNCTION(cfq_low_latency_show, cfqd->cfq_latency, 0);
+SHOW_FUNCTION(cfq_target_latency_show, cfqd->cfq_target_latency, 1);
 #undef SHOW_FUNCTION
 
 #define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
@@ -3821,6 +3825,7 @@ STORE_FUNCTION(cfq_slice_async_store, &cfqd->cfq_slice[0], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_slice_async_rq_store, &cfqd->cfq_slice_async_rq, 1,
 		UINT_MAX, 0);
 STORE_FUNCTION(cfq_low_latency_store, &cfqd->cfq_latency, 0, 1, 0);
+STORE_FUNCTION(cfq_target_latency_store, &cfqd->cfq_target_latency, 1, UINT_MAX, 1);
 #undef STORE_FUNCTION
 
 #define CFQ_ATTR(name) \
@@ -3838,6 +3843,7 @@ static struct elv_fs_entry cfq_attrs[] = {
 	CFQ_ATTR(slice_idle),
 	CFQ_ATTR(group_idle),
 	CFQ_ATTR(low_latency),
+	CFQ_ATTR(target_latency),
 	__ATTR_NULL
 };
 
-- 
1.7.0.4
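
For readers following the thread, the cfq_group_slice() hunk and the
new_workload hunk above both scale with the target latency, so raising
the tunable lengthens the slices proportionally. The following is a
minimal userspace sketch of that arithmetic only. It assumes millisecond
units instead of jiffies, and group_slice_ms() and cap_async_slice_ms()
are hypothetical names used for illustration, not kernel functions.

```c
#include <assert.h>

/*
 * Model of the cfq_group_slice() expression from the patch:
 *   target_latency * weight / total_weight
 * Units are milliseconds here; the kernel works in jiffies.
 */
unsigned int group_slice_ms(unsigned int target_latency_ms,
			    unsigned int weight,
			    unsigned int total_weight)
{
	assert(total_weight != 0);	/* guard the division in this sketch */
	return target_latency_ms * weight / total_weight;
}

/*
 * Model of the new_workload hunk: the slice is capped at
 *   target_latency * busy_async_queues / busy_queues
 */
unsigned int cap_async_slice_ms(unsigned int slice_ms,
				unsigned int target_latency_ms,
				unsigned int busy_async_queues,
				unsigned int busy_queues)
{
	unsigned int tmp;

	assert(busy_queues != 0);	/* guard the division in this sketch */
	tmp = target_latency_ms * busy_async_queues / busy_queues;
	return slice_ms < tmp ? slice_ms : tmp;
}
```

With these definitions, a group holding half of the total weight gets
group_slice_ms(300, 500, 1000) = 150 at the 300ms value quoted in the
description, and 300 if the tunable is doubled to 600.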



* Re: [PATCH] block: Make cfq_target_latency tunable through sysfs.
  2012-03-27  8:16 [PATCH] block: Make cfq_target_latency tunable through sysfs Tao Ma
@ 2012-03-29  0:32 ` Greg KH
  2012-03-29  0:36   ` Tao Ma
  2012-03-31 15:31 ` [PATCH V2 1/2] " Tao Ma
  1 sibling, 1 reply; 7+ messages in thread
From: Greg KH @ 2012-03-29  0:32 UTC (permalink / raw)
  To: Tao Ma; +Cc: linux-kernel, Jens Axboe

On Tue, Mar 27, 2012 at 04:16:53PM +0800, Tao Ma wrote:
> From: Tao Ma <boyu.mt@taobao.com>
> 
> In cfq, when we calculate a time slice for a process (or a cfqq, to be
> precise), we have to consider cfq_target_latency, which controls the
> estimated latency (300ms) of all sync requests. But in some Hadoop
> tests we have found that if many processes are doing sequential reads
> (24, for example), the throughput is bad because every process can only
> work for about 25ms before the cfqq is switched. That leads to more
> disk seeks. We can achieve good throughput by setting low_latency=0,
> but then the latency of some reads is too high for the application.
> 
> So this patch makes cfq_target_latency tunable through sysfs so that we
> can tune it and find a value that is acceptable for both throughput and
> read latency.

If you add/modify sysfs files, you HAVE to also have a matching change
to Documentation/ABI.

Please do so.

thanks,

greg k-h


* Re: [PATCH] block: Make cfq_target_latency tunable through sysfs.
  2012-03-29  0:32 ` Greg KH
@ 2012-03-29  0:36   ` Tao Ma
  2012-03-29 10:38     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Tao Ma @ 2012-03-29  0:36 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, Jens Axboe

On 03/29/2012 08:32 AM, Greg KH wrote:
> On Tue, Mar 27, 2012 at 04:16:53PM +0800, Tao Ma wrote:
>> From: Tao Ma <boyu.mt@taobao.com>
>>
>> In cfq, when we calculate a time slice for a process (or a cfqq, to be
>> precise), we have to consider cfq_target_latency, which controls the
>> estimated latency (300ms) of all sync requests. But in some Hadoop
>> tests we have found that if many processes are doing sequential reads
>> (24, for example), the throughput is bad because every process can only
>> work for about 25ms before the cfqq is switched. That leads to more
>> disk seeks. We can achieve good throughput by setting low_latency=0,
>> but then the latency of some reads is too high for the application.
>>
>> So this patch makes cfq_target_latency tunable through sysfs so that we
>> can tune it and find a value that is acceptable for both throughput and
>> read latency.
> 
> If you add/modify sysfs files, you HAVE to also have a matching change
> to Documentation/ABI.
OK, I will add it in the next round. Great thanks.

Tao
> 
> Please do so.
> 
> thanks,
> 
> greg k-h



* Re: [PATCH] block: Make cfq_target_latency tunable through sysfs.
  2012-03-29  0:36   ` Tao Ma
@ 2012-03-29 10:38     ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2012-03-29 10:38 UTC (permalink / raw)
  To: Tao Ma; +Cc: Greg KH, linux-kernel

On 03/29/2012 02:36 AM, Tao Ma wrote:
> On 03/29/2012 08:32 AM, Greg KH wrote:
>> On Tue, Mar 27, 2012 at 04:16:53PM +0800, Tao Ma wrote:
>>> From: Tao Ma <boyu.mt@taobao.com>
>>>
>>> In cfq, when we calculate a time slice for a process (or a cfqq, to be
>>> precise), we have to consider cfq_target_latency, which controls the
>>> estimated latency (300ms) of all sync requests. But in some Hadoop
>>> tests we have found that if many processes are doing sequential reads
>>> (24, for example), the throughput is bad because every process can only
>>> work for about 25ms before the cfqq is switched. That leads to more
>>> disk seeks. We can achieve good throughput by setting low_latency=0,
>>> but then the latency of some reads is too high for the application.
>>>
>>> So this patch makes cfq_target_latency tunable through sysfs so that we
>>> can tune it and find a value that is acceptable for both throughput and
>>> read latency.
>>
>> If you add/modify sysfs files, you HAVE to also have a matching change
>> to Documentation/ABI.
> OK, I will add it in the next round. Great thanks.

If you do that, we can queue up the patch. I'm also a bit nervous about
adding new sysfs files, but target latency is generic enough that it
definitely makes sense to add.

-- 
Jens Axboe



* [PATCH V2 1/2] block: Make cfq_target_latency tunable through sysfs.
  2012-03-27  8:16 [PATCH] block: Make cfq_target_latency tunable through sysfs Tao Ma
  2012-03-29  0:32 ` Greg KH
@ 2012-03-31 15:31 ` Tao Ma
  2012-03-31 15:31   ` [PATCH V2 2/2] Documentation: Add sysfs ABI change for cfq's target latency Tao Ma
  2012-04-01 21:33   ` [PATCH V2 1/2] block: Make cfq_target_latency tunable through sysfs Jens Axboe
  1 sibling, 2 replies; 7+ messages in thread
From: Tao Ma @ 2012-03-31 15:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

From: Tao Ma <boyu.mt@taobao.com>

In cfq, when we calculate a time slice for a process (or a cfqq, to be
precise), we have to consider cfq_target_latency, which controls the
estimated latency (300ms) of all sync requests. But in some Hadoop
tests we have found that if many processes are doing sequential reads
(24, for example), the throughput is bad because every process can only
work for about 25ms before the cfqq is switched. That leads to more
disk seeks. We can achieve good throughput by setting low_latency=0,
but then the latency of some reads is too high for the application.

So this patch makes cfq_target_latency tunable through sysfs so that we
can tune it and find a value that is acceptable for both throughput and
read latency.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 block/cfq-iosched.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 4572952..3c38536 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -295,6 +295,7 @@ struct cfq_data {
 	unsigned int cfq_slice_idle;
 	unsigned int cfq_group_idle;
 	unsigned int cfq_latency;
+	unsigned int cfq_target_latency;
 
 	/*
 	 * Fallback dummy cfqq for extreme OOM conditions
@@ -604,7 +605,7 @@ cfq_group_slice(struct cfq_data *cfqd, struct cfq_group *cfqg)
 {
 	struct cfq_rb_root *st = &cfqd->grp_service_tree;
 
-	return cfq_target_latency * cfqg->weight / st->total_weight;
+	return cfqd->cfq_target_latency * cfqg->weight / st->total_weight;
 }
 
 static inline unsigned
@@ -2271,7 +2272,8 @@ new_workload:
 		 * to have higher weight. A more accurate thing would be to
 		 * calculate system wide asnc/sync ratio.
 		 */
-		tmp = cfq_target_latency * cfqg_busy_async_queues(cfqd, cfqg);
+		tmp = cfqd->cfq_target_latency *
+			cfqg_busy_async_queues(cfqd, cfqg);
 		tmp = tmp/cfqd->busy_queues;
 		slice = min_t(unsigned, slice, tmp);
 
@@ -3737,6 +3739,7 @@ static void *cfq_init_queue(struct request_queue *q)
 	cfqd->cfq_back_penalty = cfq_back_penalty;
 	cfqd->cfq_slice[0] = cfq_slice_async;
 	cfqd->cfq_slice[1] = cfq_slice_sync;
+	cfqd->cfq_target_latency = cfq_target_latency;
 	cfqd->cfq_slice_async_rq = cfq_slice_async_rq;
 	cfqd->cfq_slice_idle = cfq_slice_idle;
 	cfqd->cfq_group_idle = cfq_group_idle;
@@ -3788,6 +3791,7 @@ SHOW_FUNCTION(cfq_slice_sync_show, cfqd->cfq_slice[1], 1);
 SHOW_FUNCTION(cfq_slice_async_show, cfqd->cfq_slice[0], 1);
 SHOW_FUNCTION(cfq_slice_async_rq_show, cfqd->cfq_slice_async_rq, 0);
 SHOW_FUNCTION(cfq_low_latency_show, cfqd->cfq_latency, 0);
+SHOW_FUNCTION(cfq_target_latency_show, cfqd->cfq_target_latency, 1);
 #undef SHOW_FUNCTION
 
 #define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
@@ -3821,6 +3825,7 @@ STORE_FUNCTION(cfq_slice_async_store, &cfqd->cfq_slice[0], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_slice_async_rq_store, &cfqd->cfq_slice_async_rq, 1,
 		UINT_MAX, 0);
 STORE_FUNCTION(cfq_low_latency_store, &cfqd->cfq_latency, 0, 1, 0);
+STORE_FUNCTION(cfq_target_latency_store, &cfqd->cfq_target_latency, 1, UINT_MAX, 1);
 #undef STORE_FUNCTION
 
 #define CFQ_ATTR(name) \
@@ -3838,6 +3843,7 @@ static struct elv_fs_entry cfq_attrs[] = {
 	CFQ_ATTR(slice_idle),
 	CFQ_ATTR(group_idle),
 	CFQ_ATTR(low_latency),
+	CFQ_ATTR(target_latency),
 	__ATTR_NULL
 };
 
-- 
1.7.0.4
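
The STORE_FUNCTION line added above passes MIN=1, MAX=UINT_MAX and
__CONV=1. The macro body is not shown in this thread, so the sketch
below rests on an assumption: that the store path clamps the written
value to [MIN, MAX] and, when __CONV is set, converts milliseconds to
jiffies, with the show path converting back. SKETCH_HZ and every
function name are hypothetical, and the conversion is a simplified
stand-in, not the kernel's msecs_to_jiffies().

```c
#include <limits.h>

#define SKETCH_HZ 250u	/* assumed tick rate; the real HZ is a kernel config choice */

/* Clamp a user-supplied value to the inclusive range [min, max]. */
unsigned int sketch_clamp(unsigned int v, unsigned int min, unsigned int max)
{
	if (v < min)
		return min;
	if (v > max)
		return max;
	return v;
}

/* Simplified milliseconds-to-ticks conversion, rounding up. */
unsigned int sketch_msecs_to_ticks(unsigned int ms)
{
	return (unsigned int)(((unsigned long long)ms * SKETCH_HZ + 999) / 1000);
}

/* Simplified ticks-to-milliseconds conversion, rounding down. */
unsigned int sketch_ticks_to_msecs(unsigned int ticks)
{
	return (unsigned int)((unsigned long long)ticks * 1000 / SKETCH_HZ);
}

/* Assumed store behaviour for MIN=1, MAX=UINT_MAX, __CONV=1. */
unsigned int sketch_target_latency_store(unsigned int user_ms)
{
	return sketch_msecs_to_ticks(sketch_clamp(user_ms, 1, UINT_MAX));
}
```

Under these assumptions a written value of 300 is stored as 75 ticks
and reads back as 300, while a written 0 is raised to the minimum of 1
before conversion.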



* [PATCH V2 2/2] Documentation: Add sysfs ABI change for cfq's target latency.
  2012-03-31 15:31 ` [PATCH V2 1/2] " Tao Ma
@ 2012-03-31 15:31   ` Tao Ma
  2012-04-01 21:33   ` [PATCH V2 1/2] block: Make cfq_target_latency tunable through sysfs Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Tao Ma @ 2012-03-31 15:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe, Greg Kroah-Hartman

From: Tao Ma <boyu.mt@taobao.com>

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 Documentation/ABI/testing/sysfs-cfq-target-latency |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-cfq-target-latency

diff --git a/Documentation/ABI/testing/sysfs-cfq-target-latency b/Documentation/ABI/testing/sysfs-cfq-target-latency
new file mode 100644
index 0000000..df0f782
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-cfq-target-latency
@@ -0,0 +1,8 @@
+What:		/sys/block/<device>/iosched/target_latency
+Date:		March 2012
+Contact:	Tao Ma <boyu.mt@taobao.com>
+Description:
+		The /sys/block/<device>/iosched/target_latency file only
+		exists when cfq is selected in /sys/block/<device>/scheduler.
+		It holds the estimated latency time for cfq, which cfq uses
+		to calculate the time slice for every task.
-- 
1.7.0.4
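
The ABI entry above describes a plain sysfs attribute, so it can be
tuned by writing a number to the file. A minimal sketch of doing that
from C follows. The helper names are hypothetical, the path is passed in
by the caller because the device name is system specific, and the value
is assumed to be in milliseconds based on the conversion flag used in
patch 1/2.

```c
#include <stdio.h>

/*
 * Write a value to a tunable file such as
 * /sys/block/<device>/iosched/target_latency.
 * Returns 0 on success and -1 on failure.
 */
int write_tunable_ms(const char *path, unsigned int ms)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", ms) > 0;
	/* Buffered write errors only surface at close time, so check it. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the current value back. Returns 0 on success and -1 on failure. */
int read_tunable_ms(const char *path, unsigned int *ms)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%u", ms) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass the real attribute path for its device. Writing
usually requires root, and the file is only present while cfq is the
active scheduler for that device.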



* Re: [PATCH V2 1/2] block: Make cfq_target_latency tunable through sysfs.
  2012-03-31 15:31 ` [PATCH V2 1/2] " Tao Ma
  2012-03-31 15:31   ` [PATCH V2 2/2] Documentation: Add sysfs ABI change for cfq's target latency Tao Ma
@ 2012-04-01 21:33   ` Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2012-04-01 21:33 UTC (permalink / raw)
  To: Tao Ma; +Cc: linux-kernel

On 2012-03-31 08:31, Tao Ma wrote:
> From: Tao Ma <boyu.mt@taobao.com>
> 
> In cfq, when we calculate a time slice for a process (or a cfqq, to be
> precise), we have to consider cfq_target_latency, which controls the
> estimated latency (300ms) of all sync requests. But in some Hadoop
> tests we have found that if many processes are doing sequential reads
> (24, for example), the throughput is bad because every process can only
> work for about 25ms before the cfqq is switched. That leads to more
> disk seeks. We can achieve good throughput by setting low_latency=0,
> but then the latency of some reads is too high for the application.
> 
> So this patch makes cfq_target_latency tunable through sysfs so that we
> can tune it and find a value that is acceptable for both throughput and
> read latency.

Thanks, applied both.

-- 
Jens Axboe

