* [PATCH] cfq-iosched: reduce write depth only if sync was delayed
@ 2009-12-04 12:35 Corrado Zoccolo
2009-12-05 11:13 ` Corrado Zoccolo
2009-12-08 18:00 ` Jeff Moyer
0 siblings, 2 replies; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-04 12:35 UTC (permalink / raw)
To: Linux-Kernel, Jens Axboe, Jeff Moyer, Vivek Goyal
The introduction of the ramp-up formula for async queue depths has
slowed down dirty page reclaim by reducing async write performance.
This patch makes sure the formula kicks in only when a sync request
was recently delayed.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
---
block/cfq-iosched.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index b00ca4c..a594388 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -284,7 +284,7 @@ struct cfq_data {
*/
struct cfq_queue oom_cfqq;
- unsigned long last_end_sync_rq;
+ unsigned long last_delayed_sync;
/* List of cfq groups being managed on this device*/
struct hlist_head cfqg_list;
@@ -2264,7 +2264,7 @@ static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
* based on the last sync IO we serviced
*/
if (!cfq_cfqq_sync(cfqq) && cfqd->cfq_latency) {
- unsigned long last_sync = jiffies - cfqd->last_end_sync_rq;
+ unsigned long last_sync = jiffies - cfqd->last_delayed_sync;
unsigned int depth;
depth = last_sync / cfqd->cfq_slice[1];
@@ -3272,7 +3272,8 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
if (sync) {
RQ_CIC(rq)->last_end_request = now;
- cfqd->last_end_sync_rq = now;
+ if (time_after(rq->start_time + cfqd->cfq_fifo_expire[1], now))
+ cfqd->last_delayed_sync = now;
}
/*
@@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
cfqd->cfq_latency = 1;
cfqd->cfq_group_isolation = 0;
cfqd->hw_tag = -1;
- cfqd->last_end_sync_rq = jiffies;
+ cfqd->last_delayed_sync = jiffies - HZ;
return cfqd;
}
--
1.6.2.5
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-04 12:35 Corrado Zoccolo
@ 2009-12-05 11:13 ` Corrado Zoccolo
2009-12-06 10:45 ` Corrado Zoccolo
2009-12-08 18:00 ` Jeff Moyer
1 sibling, 1 reply; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-05 11:13 UTC (permalink / raw)
To: Linux-Kernel, Jens Axboe, Jeff Moyer, Vivek Goyal
On Fri, Dec 4, 2009 at 1:35 PM, Corrado Zoccolo <czoccolo@gmail.com> wrote:
> The introduction of ramp-up formula for async queue depths has
> slowed down dirty page reclaim, by reducing async write performance.
> This patch makes sure the formula kicks in only when sync request
> was recently delayed.
>
> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
> ---
> block/cfq-iosched.c | 9 +++++----
> 1 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index b00ca4c..a594388 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -284,7 +284,7 @@ struct cfq_data {
> */
> struct cfq_queue oom_cfqq;
>
> - unsigned long last_end_sync_rq;
> + unsigned long last_delayed_sync;
>
> /* List of cfq groups being managed on this device*/
> struct hlist_head cfqg_list;
> @@ -2264,7 +2264,7 @@ static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> * based on the last sync IO we serviced
> */
> if (!cfq_cfqq_sync(cfqq) && cfqd->cfq_latency) {
> - unsigned long last_sync = jiffies - cfqd->last_end_sync_rq;
> + unsigned long last_sync = jiffies - cfqd->last_delayed_sync;
> unsigned int depth;
>
> depth = last_sync / cfqd->cfq_slice[1];
> @@ -3272,7 +3272,8 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>
> if (sync) {
> RQ_CIC(rq)->last_end_request = now;
> - cfqd->last_end_sync_rq = now;
> + if (time_after(rq->start_time + cfqd->cfq_fifo_expire[1], now))
> + cfqd->last_delayed_sync = now;
Sorry, the check should be reversed (!time_after).
> }
>
> /*
> @@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
> cfqd->cfq_latency = 1;
> cfqd->cfq_group_isolation = 0;
> cfqd->hw_tag = -1;
> - cfqd->last_end_sync_rq = jiffies;
> + cfqd->last_delayed_sync = jiffies - HZ;
> return cfqd;
> }
>
> --
> 1.6.2.5
>
>
>
>
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-05 11:13 ` Corrado Zoccolo
@ 2009-12-06 10:45 ` Corrado Zoccolo
2009-12-06 10:49 ` Jens Axboe
2009-12-08 0:07 ` Jeff Moyer
0 siblings, 2 replies; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-06 10:45 UTC (permalink / raw)
To: Linux-Kernel, Jens Axboe, Jeff Moyer, Vivek Goyal
Hi Jeff,
I remember you saw a large performance drop on your SAN for sequential
writes with low_latency=1.
Can you test whether Shaohua's patch and this one allow you to recover
some bandwidth? I think that enabling the queue depth ramp-up only when
a sync request was delayed should disable it for fast hardware like
yours, so you should not see the slowdown any more.
Thanks,
Corrado
On Sat, Dec 5, 2009 at 12:13 PM, Corrado Zoccolo <czoccolo@gmail.com> wrote:
> On Fri, Dec 4, 2009 at 1:35 PM, Corrado Zoccolo <czoccolo@gmail.com> wrote:
>> The introduction of ramp-up formula for async queue depths has
>> slowed down dirty page reclaim, by reducing async write performance.
>> This patch makes sure the formula kicks in only when sync request
>> was recently delayed.
>>
>> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
>> ---
>> block/cfq-iosched.c | 9 +++++----
>> 1 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index b00ca4c..a594388 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -284,7 +284,7 @@ struct cfq_data {
>> */
>> struct cfq_queue oom_cfqq;
>>
>> - unsigned long last_end_sync_rq;
>> + unsigned long last_delayed_sync;
>>
>> /* List of cfq groups being managed on this device*/
>> struct hlist_head cfqg_list;
>> @@ -2264,7 +2264,7 @@ static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>> * based on the last sync IO we serviced
>> */
>> if (!cfq_cfqq_sync(cfqq) && cfqd->cfq_latency) {
>> - unsigned long last_sync = jiffies - cfqd->last_end_sync_rq;
>> + unsigned long last_sync = jiffies - cfqd->last_delayed_sync;
>> unsigned int depth;
>>
>> depth = last_sync / cfqd->cfq_slice[1];
>> @@ -3272,7 +3272,8 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>>
>> if (sync) {
>> RQ_CIC(rq)->last_end_request = now;
>> - cfqd->last_end_sync_rq = now;
>> + if (time_after(rq->start_time + cfqd->cfq_fifo_expire[1], now))
>> + cfqd->last_delayed_sync = now;
>
> Sorry, the check should be reversed (!time_after).
>
>> }
>>
>> /*
>> @@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
>> cfqd->cfq_latency = 1;
>> cfqd->cfq_group_isolation = 0;
>> cfqd->hw_tag = -1;
>> - cfqd->last_end_sync_rq = jiffies;
>> + cfqd->last_delayed_sync = jiffies - HZ;
>> return cfqd;
>> }
>>
>> --
>> 1.6.2.5
>>
>>
>>
>>
>
--
__________________________________________________________________________
dott. Corrado Zoccolo mailto:czoccolo@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-06 10:45 ` Corrado Zoccolo
@ 2009-12-06 10:49 ` Jens Axboe
2009-12-07 14:13 ` Jeff Moyer
2009-12-08 0:07 ` Jeff Moyer
1 sibling, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2009-12-06 10:49 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal
On Sun, Dec 06 2009, Corrado Zoccolo wrote:
> Hi Jeff,
> I remember you saw large performance drop on your SAN for sequential
> writes with low_latency=1. Can you test if Shaohua's and this patch
> fix allow to recover some bandwidth? I think that enabling the queue
> depth ramp up only if a sync request was delayed should disable it for
> fast hardware like yours, so you should not be seeing the slowdown any
> more.
I queued this up for post inclusion into 2.6.33, with the time_after()
fixed.
The patch was word-wrapped, btw.
--
Jens Axboe
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-06 10:49 ` Jens Axboe
@ 2009-12-07 14:13 ` Jeff Moyer
2009-12-07 14:41 ` Jens Axboe
0 siblings, 1 reply; 24+ messages in thread
From: Jeff Moyer @ 2009-12-07 14:13 UTC (permalink / raw)
To: Jens Axboe; +Cc: Corrado Zoccolo, Linux-Kernel, Vivek Goyal
Jens Axboe <jens.axboe@oracle.com> writes:
> On Sun, Dec 06 2009, Corrado Zoccolo wrote:
>> Hi Jeff,
>> I remember you saw large performance drop on your SAN for sequential
>> writes with low_latency=1. Can you test if Shaohua's and this patch
>> fix allow to recover some bandwidth? I think that enabling the queue
>> depth ramp up only if a sync request was delayed should disable it for
>> fast hardware like yours, so you should not be seeing the slowdown any
>> more.
>
> I queued this up for post inclusion into 2.6.33, with the time_after()
> fixed.
>
> The patch was word-wrapped, btw.
So in what branch can I find this fix? Once I know that, I can queue up
some tests.
Cheers,
Jeff
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-07 14:13 ` Jeff Moyer
@ 2009-12-07 14:41 ` Jens Axboe
2009-12-07 16:45 ` Jeff Moyer
0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2009-12-07 14:41 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Corrado Zoccolo, Linux-Kernel, Vivek Goyal
On Mon, Dec 07 2009, Jeff Moyer wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > On Sun, Dec 06 2009, Corrado Zoccolo wrote:
> >> Hi Jeff,
> >> I remember you saw large performance drop on your SAN for sequential
> >> writes with low_latency=1. Can you test if Shaohua's and this patch
> >> fix allow to recover some bandwidth? I think that enabling the queue
> >> depth ramp up only if a sync request was delayed should disable it for
> >> fast hardware like yours, so you should not be seeing the slowdown any
> >> more.
> >
> > I queued this up for post inclusion into 2.6.33, with the time_after()
> > fixed.
> >
> > The patch was word-wrapped, btw.
>
> So in what branch can I find this fix? Once I know that I can queue up
> some tests.
It's in next-2.6.33
--
Jens Axboe
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-07 14:41 ` Jens Axboe
@ 2009-12-07 16:45 ` Jeff Moyer
2009-12-07 17:04 ` Vivek Goyal
2009-12-07 17:27 ` Vivek Goyal
0 siblings, 2 replies; 24+ messages in thread
From: Jeff Moyer @ 2009-12-07 16:45 UTC (permalink / raw)
To: Jens Axboe; +Cc: Corrado Zoccolo, Linux-Kernel, Vivek Goyal
Jens Axboe <jens.axboe@oracle.com> writes:
> On Mon, Dec 07 2009, Jeff Moyer wrote:
>> Jens Axboe <jens.axboe@oracle.com> writes:
>>
>> > On Sun, Dec 06 2009, Corrado Zoccolo wrote:
>> >> Hi Jeff,
>> >> I remember you saw large performance drop on your SAN for sequential
>> >> writes with low_latency=1. Can you test if Shaohua's and this patch
>> >> fix allow to recover some bandwidth? I think that enabling the queue
>> >> depth ramp up only if a sync request was delayed should disable it for
>> >> fast hardware like yours, so you should not be seeing the slowdown any
>> >> more.
>> >
>> > I queued this up for post inclusion into 2.6.33, with the time_after()
>> > fixed.
>> >
>> > The patch was word-wrapped, btw.
>>
>> So in what branch can I find this fix? Once I know that I can queue up
>> some tests.
>
> It's in next-2.6.33
next-2.6.33 won't boot for me:
general protection fault: 0000 [#1] SMP
async/0 used greatest stack depth: 4256 bytes left
last sysfs file: /sys/class/firmware/timeout
CPU 1
Modules linked in: ata_piix pata_acpi libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 729, comm: async/1 Not tainted 2.6.32 #1 ProLiant DL320 G5p
RIP: 0010:[<ffffffff81199cee>] [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
RSP: 0018:ffff8801251b1d48 EFLAGS: 00010002
RAX: ffff880126dcdd28 RBX: ffff8801251fa158 RCX: 0000000000170001
RDX: ffff880125556700 RSI: ffff8801251fa158 RDI: 6b6b6b6b6b6b6b6b
RBP: ffff8801251b1d70 R08: ffff8801255a0448 R09: 000000000000005a
R10: ffff8801255a0448 R11: ffffffff818d6210 R12: ffff880126dcdb18
R13: ffff880126dcdb50 R14: 0000000000000286 R15: ffff880125556760
FS: 0000000000000000(0000) GS:ffff88002f200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000001256a5000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process async/1 (pid: 729, threadinfo ffff8801251b0000, task ffff880125556770)
Stack:
ffffffff8119a856 0000000000000002 ffff880126dcdb18 ffff8801251fa158
<0> ffff880125210000 ffff8801251b1d90 ffffffff8119a9e5 ffff8801251f80d0
<0> ffff880126dcdb18 ffff8801251b1db0 ffffffff8119aa62 ffff880126dcdb18
Call Trace:
[<ffffffff8119a856>] ? cfq_put_queue+0xfa/0x102
[<ffffffff8119a9e5>] cfq_exit_cfqq+0x99/0x9e
[<ffffffff8119aa62>] __cfq_exit_single_io_context+0x78/0x85
[<ffffffff8119aaa9>] cfq_exit_single_io_context+0x3a/0x52
[<ffffffff8119aa6f>] ? cfq_exit_single_io_context+0x0/0x52
[<ffffffff8119b27b>] call_for_each_cic+0x56/0x7c
[<ffffffff8119b225>] ? call_for_each_cic+0x0/0x7c
[<ffffffff8119b2b1>] cfq_exit_io_context+0x10/0x12
[<ffffffff81192d3b>] exit_io_context+0x93/0xbc
[<ffffffff81192d03>] ? exit_io_context+0x5b/0xbc
[<ffffffff810474e5>] do_exit+0x71a/0x747
[<ffffffff810628f1>] ? async_thread+0x0/0x1fa
[<ffffffff8105cd9e>] kthread_stop+0x0/0xb3
[<ffffffff81033fa6>] ? complete+0x1c/0x4b
[<ffffffff8100cafa>] child_rip+0xa/0x20
[<ffffffff8103d667>] ? finish_task_switch+0x0/0xe3
[<ffffffff8100c4bc>] ? restore_args+0x0/0x30
[<ffffffff8105ccf8>] ? kthreadd+0xdf/0x100
[<ffffffff8105cd19>] ? kthread+0x0/0x85
[<ffffffff8100caf0>] ? child_rip+0x0/0x20
Code: 48 c7 43 38 00 00 00 00 48 c7 43 40 00 00 00 00 48 89 3e 48 8b 73 48 e8 fd 9e 00 00 eb 08 48 c7 43 48 00 00 00 00 5b 41 5c c9 c3 <8b> 87 d8 01 00 00 55 48 89 e5 85 c0 7f 04 0f 0b eb fe 48 8d 87
RIP [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
RSP <ffff8801251b1d48>
---[ end trace ac909576caca45e8 ]---
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-07 16:45 ` Jeff Moyer
@ 2009-12-07 17:04 ` Vivek Goyal
2009-12-07 17:27 ` Vivek Goyal
1 sibling, 0 replies; 24+ messages in thread
From: Vivek Goyal @ 2009-12-07 17:04 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Jens Axboe, Corrado Zoccolo, Linux-Kernel
On Mon, Dec 07, 2009 at 11:45:43AM -0500, Jeff Moyer wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > On Mon, Dec 07 2009, Jeff Moyer wrote:
> >> Jens Axboe <jens.axboe@oracle.com> writes:
> >>
> >> > On Sun, Dec 06 2009, Corrado Zoccolo wrote:
> >> >> Hi Jeff,
> >> >> I remember you saw large performance drop on your SAN for sequential
> >> >> writes with low_latency=1. Can you test if Shaohua's and this patch
> >> >> fix allow to recover some bandwidth? I think that enabling the queue
> >> >> depth ramp up only if a sync request was delayed should disable it for
> >> >> fast hardware like yours, so you should not be seeing the slowdown any
> >> >> more.
> >> >
> >> > I queued this up for post inclusion into 2.6.33, with the time_after()
> >> > fixed.
> >> >
> >> > The patch was word-wrapped, btw.
> >>
> >> So in what branch can I find this fix? Once I know that I can queue up
> >> some tests.
> >
> > It's in next-2.6.33
>
> next-2.6.33 won't boot for me:
>
> general protection fault: 0000 [#1] SMP
> async/0 used greatest stack depth: 4256 bytes left
>
> last sysfs file: /sys/class/firmware/timeout
> CPU 1
> Modules linked in: ata_piix pata_acpi libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
> Pid: 729, comm: async/1 Not tainted 2.6.32 #1 ProLiant DL320 G5p
> RIP: 0010:[<ffffffff81199cee>] [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
> RSP: 0018:ffff8801251b1d48 EFLAGS: 00010002
Thanks, Jeff. It looks like something is wrong with my code. I am looking
into it now.
Thanks
Vivek
> RAX: ffff880126dcdd28 RBX: ffff8801251fa158 RCX: 0000000000170001
> RDX: ffff880125556700 RSI: ffff8801251fa158 RDI: 6b6b6b6b6b6b6b6b
> RBP: ffff8801251b1d70 R08: ffff8801255a0448 R09: 000000000000005a
> R10: ffff8801255a0448 R11: ffffffff818d6210 R12: ffff880126dcdb18
> R13: ffff880126dcdb50 R14: 0000000000000286 R15: ffff880125556760
> FS: 0000000000000000(0000) GS:ffff88002f200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 00000001256a5000 CR4: 00000000000006a0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process async/1 (pid: 729, threadinfo ffff8801251b0000, task ffff880125556770)
> Stack:
> ffffffff8119a856 0000000000000002 ffff880126dcdb18 ffff8801251fa158
> <0> ffff880125210000 ffff8801251b1d90 ffffffff8119a9e5 ffff8801251f80d0
> <0> ffff880126dcdb18 ffff8801251b1db0 ffffffff8119aa62 ffff880126dcdb18
> Call Trace:
> [<ffffffff8119a856>] ? cfq_put_queue+0xfa/0x102
> [<ffffffff8119a9e5>] cfq_exit_cfqq+0x99/0x9e
> [<ffffffff8119aa62>] __cfq_exit_single_io_context+0x78/0x85
> [<ffffffff8119aaa9>] cfq_exit_single_io_context+0x3a/0x52
> [<ffffffff8119aa6f>] ? cfq_exit_single_io_context+0x0/0x52
> [<ffffffff8119b27b>] call_for_each_cic+0x56/0x7c
> [<ffffffff8119b225>] ? call_for_each_cic+0x0/0x7c
> [<ffffffff8119b2b1>] cfq_exit_io_context+0x10/0x12
> [<ffffffff81192d3b>] exit_io_context+0x93/0xbc
> [<ffffffff81192d03>] ? exit_io_context+0x5b/0xbc
> [<ffffffff810474e5>] do_exit+0x71a/0x747
> [<ffffffff810628f1>] ? async_thread+0x0/0x1fa
> [<ffffffff8105cd9e>] kthread_stop+0x0/0xb3
> [<ffffffff81033fa6>] ? complete+0x1c/0x4b
> [<ffffffff8100cafa>] child_rip+0xa/0x20
> [<ffffffff8103d667>] ? finish_task_switch+0x0/0xe3
> [<ffffffff8100c4bc>] ? restore_args+0x0/0x30
> [<ffffffff8105ccf8>] ? kthreadd+0xdf/0x100
> [<ffffffff8105cd19>] ? kthread+0x0/0x85
> [<ffffffff8100caf0>] ? child_rip+0x0/0x20
> Code: 48 c7 43 38 00 00 00 00 48 c7 43 40 00 00 00 00 48 89 3e 48 8b 73 48 e8 fd 9e 00 00 eb 08 48 c7 43 48 00 00 00 00 5b 41 5c c9 c3 <8b> 87 d8 01 00 00 55 48 89 e5 85 c0 7f 04 0f 0b eb fe 48 8d 87
> RIP [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
> RSP <ffff8801251b1d48>
> ---[ end trace ac909576caca45e8 ]---
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-07 16:45 ` Jeff Moyer
2009-12-07 17:04 ` Vivek Goyal
@ 2009-12-07 17:27 ` Vivek Goyal
2009-12-07 18:03 ` Vivek Goyal
1 sibling, 1 reply; 24+ messages in thread
From: Vivek Goyal @ 2009-12-07 17:27 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Jens Axboe, Corrado Zoccolo, Linux-Kernel
On Mon, Dec 07, 2009 at 11:45:43AM -0500, Jeff Moyer wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > On Mon, Dec 07 2009, Jeff Moyer wrote:
> >> Jens Axboe <jens.axboe@oracle.com> writes:
> >>
> >> > On Sun, Dec 06 2009, Corrado Zoccolo wrote:
> >> >> Hi Jeff,
> >> >> I remember you saw large performance drop on your SAN for sequential
> >> >> writes with low_latency=1. Can you test if Shaohua's and this patch
> >> >> fix allow to recover some bandwidth? I think that enabling the queue
> >> >> depth ramp up only if a sync request was delayed should disable it for
> >> >> fast hardware like yours, so you should not be seeing the slowdown any
> >> >> more.
> >> >
> >> > I queued this up for post inclusion into 2.6.33, with the time_after()
> >> > fixed.
> >> >
> >> > The patch was word-wrapped, btw.
> >>
> >> So in what branch can I find this fix? Once I know that I can queue up
> >> some tests.
> >
> > It's in next-2.6.33
>
> next-2.6.33 won't boot for me:
>
> general protection fault: 0000 [#1] SMP
> async/0 used greatest stack depth: 4256 bytes left
>
> last sysfs file: /sys/class/firmware/timeout
> CPU 1
> Modules linked in: ata_piix pata_acpi libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
> Pid: 729, comm: async/1 Not tainted 2.6.32 #1 ProLiant DL320 G5p
> RIP: 0010:[<ffffffff81199cee>] [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
> RSP: 0018:ffff8801251b1d48 EFLAGS: 00010002
> RAX: ffff880126dcdd28 RBX: ffff8801251fa158 RCX: 0000000000170001
> RDX: ffff880125556700 RSI: ffff8801251fa158 RDI: 6b6b6b6b6b6b6b6b
> RBP: ffff8801251b1d70 R08: ffff8801255a0448 R09: 000000000000005a
> R10: ffff8801255a0448 R11: ffffffff818d6210 R12: ffff880126dcdb18
> R13: ffff880126dcdb50 R14: 0000000000000286 R15: ffff880125556760
> FS: 0000000000000000(0000) GS:ffff88002f200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 00000001256a5000 CR4: 00000000000006a0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process async/1 (pid: 729, threadinfo ffff8801251b0000, task ffff880125556770)
> Stack:
> ffffffff8119a856 0000000000000002 ffff880126dcdb18 ffff8801251fa158
> <0> ffff880125210000 ffff8801251b1d90 ffffffff8119a9e5 ffff8801251f80d0
> <0> ffff880126dcdb18 ffff8801251b1db0 ffffffff8119aa62 ffff880126dcdb18
> Call Trace:
> [<ffffffff8119a856>] ? cfq_put_queue+0xfa/0x102
> [<ffffffff8119a9e5>] cfq_exit_cfqq+0x99/0x9e
> [<ffffffff8119aa62>] __cfq_exit_single_io_context+0x78/0x85
> [<ffffffff8119aaa9>] cfq_exit_single_io_context+0x3a/0x52
> [<ffffffff8119aa6f>] ? cfq_exit_single_io_context+0x0/0x52
> [<ffffffff8119b27b>] call_for_each_cic+0x56/0x7c
> [<ffffffff8119b225>] ? call_for_each_cic+0x0/0x7c
> [<ffffffff8119b2b1>] cfq_exit_io_context+0x10/0x12
> [<ffffffff81192d3b>] exit_io_context+0x93/0xbc
> [<ffffffff81192d03>] ? exit_io_context+0x5b/0xbc
> [<ffffffff810474e5>] do_exit+0x71a/0x747
> [<ffffffff810628f1>] ? async_thread+0x0/0x1fa
> [<ffffffff8105cd9e>] kthread_stop+0x0/0xb3
> [<ffffffff81033fa6>] ? complete+0x1c/0x4b
> [<ffffffff8100cafa>] child_rip+0xa/0x20
> [<ffffffff8103d667>] ? finish_task_switch+0x0/0xe3
> [<ffffffff8100c4bc>] ? restore_args+0x0/0x30
> [<ffffffff8105ccf8>] ? kthreadd+0xdf/0x100
> [<ffffffff8105cd19>] ? kthread+0x0/0x85
> [<ffffffff8100caf0>] ? child_rip+0x0/0x20
> Code: 48 c7 43 38 00 00 00 00 48 c7 43 40 00 00 00 00 48 89 3e 48 8b 73 48 e8 fd 9e 00 00 eb 08 48 c7 43 48 00 00 00 00 5b 41 5c c9 c3 <8b> 87 d8 01 00 00 55 48 89 e5 85 c0 7f 04 0f 0b eb fe 48 8d 87
> RIP [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
> RSP <ffff8801251b1d48>
Hmm, I seem to be accessing cfqq->orig_cfqg after I have freed cfqq.
The following patch should fix it. Testing it now.
Thanks
Vivek
Index: linux-2.6-block/block/cfq-iosched.c
===================================================================
--- linux-2.6-block.orig/block/cfq-iosched.c
+++ linux-2.6-block/block/cfq-iosched.c
@@ -2368,7 +2368,7 @@ static int cfq_dispatch_requests(struct
static void cfq_put_queue(struct cfq_queue *cfqq)
{
struct cfq_data *cfqd = cfqq->cfqd;
- struct cfq_group *cfqg;
+ struct cfq_group *cfqg, *orig_cfqg;
BUG_ON(atomic_read(&cfqq->ref) <= 0);
@@ -2379,6 +2379,7 @@ static void cfq_put_queue(struct cfq_que
BUG_ON(rb_first(&cfqq->sort_list));
BUG_ON(cfqq->allocated[READ] + cfqq->allocated[WRITE]);
cfqg = cfqq->cfqg;
+ orig_cfqg = cfqq->orig_cfqg;
if (unlikely(cfqd->active_queue == cfqq)) {
__cfq_slice_expired(cfqd, cfqq, 0);
@@ -2388,8 +2389,8 @@ static void cfq_put_queue(struct cfq_que
BUG_ON(cfq_cfqq_on_rr(cfqq));
kmem_cache_free(cfq_pool, cfqq);
cfq_put_cfqg(cfqg);
- if (cfqq->orig_cfqg)
- cfq_put_cfqg(cfqq->orig_cfqg);
+ if (orig_cfqg)
+ cfq_put_cfqg(orig_cfqg);
}
/*
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-07 17:27 ` Vivek Goyal
@ 2009-12-07 18:03 ` Vivek Goyal
0 siblings, 0 replies; 24+ messages in thread
From: Vivek Goyal @ 2009-12-07 18:03 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Jens Axboe, Corrado Zoccolo, Linux-Kernel
On Mon, Dec 07, 2009 at 12:27:47PM -0500, Vivek Goyal wrote:
[..]
> > next-2.6.33 won't boot for me:
> >
> > general protection fault: 0000 [#1] SMP
> > async/0 used greatest stack depth: 4256 bytes left
> >
> > last sysfs file: /sys/class/firmware/timeout
> > CPU 1
> > Modules linked in: ata_piix pata_acpi libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
> > Pid: 729, comm: async/1 Not tainted 2.6.32 #1 ProLiant DL320 G5p
> > RIP: 0010:[<ffffffff81199cee>] [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
> > RSP: 0018:ffff8801251b1d48 EFLAGS: 00010002
> > RAX: ffff880126dcdd28 RBX: ffff8801251fa158 RCX: 0000000000170001
> > RDX: ffff880125556700 RSI: ffff8801251fa158 RDI: 6b6b6b6b6b6b6b6b
> > RBP: ffff8801251b1d70 R08: ffff8801255a0448 R09: 000000000000005a
> > R10: ffff8801255a0448 R11: ffffffff818d6210 R12: ffff880126dcdb18
> > R13: ffff880126dcdb50 R14: 0000000000000286 R15: ffff880125556760
> > FS: 0000000000000000(0000) GS:ffff88002f200000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000000 CR3: 00000001256a5000 CR4: 00000000000006a0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process async/1 (pid: 729, threadinfo ffff8801251b0000, task ffff880125556770)
> > Stack:
> > ffffffff8119a856 0000000000000002 ffff880126dcdb18 ffff8801251fa158
> > <0> ffff880125210000 ffff8801251b1d90 ffffffff8119a9e5 ffff8801251f80d0
> > <0> ffff880126dcdb18 ffff8801251b1db0 ffffffff8119aa62 ffff880126dcdb18
> > Call Trace:
> > [<ffffffff8119a856>] ? cfq_put_queue+0xfa/0x102
> > [<ffffffff8119a9e5>] cfq_exit_cfqq+0x99/0x9e
> > [<ffffffff8119aa62>] __cfq_exit_single_io_context+0x78/0x85
> > [<ffffffff8119aaa9>] cfq_exit_single_io_context+0x3a/0x52
> > [<ffffffff8119aa6f>] ? cfq_exit_single_io_context+0x0/0x52
> > [<ffffffff8119b27b>] call_for_each_cic+0x56/0x7c
> > [<ffffffff8119b225>] ? call_for_each_cic+0x0/0x7c
> > [<ffffffff8119b2b1>] cfq_exit_io_context+0x10/0x12
> > [<ffffffff81192d3b>] exit_io_context+0x93/0xbc
> > [<ffffffff81192d03>] ? exit_io_context+0x5b/0xbc
> > [<ffffffff810474e5>] do_exit+0x71a/0x747
> > [<ffffffff810628f1>] ? async_thread+0x0/0x1fa
> > [<ffffffff8105cd9e>] kthread_stop+0x0/0xb3
> > [<ffffffff81033fa6>] ? complete+0x1c/0x4b
> > [<ffffffff8100cafa>] child_rip+0xa/0x20
> > [<ffffffff8103d667>] ? finish_task_switch+0x0/0xe3
> > [<ffffffff8100c4bc>] ? restore_args+0x0/0x30
> > [<ffffffff8105ccf8>] ? kthreadd+0xdf/0x100
> > [<ffffffff8105cd19>] ? kthread+0x0/0x85
> > [<ffffffff8100caf0>] ? child_rip+0x0/0x20
> > Code: 48 c7 43 38 00 00 00 00 48 c7 43 40 00 00 00 00 48 89 3e 48 8b 73 48 e8 fd 9e 00 00 eb 08 48 c7 43 48 00 00 00 00 5b 41 5c c9 c3 <8b> 87 d8 01 00 00 55 48 89 e5 85 c0 7f 04 0f 0b eb fe 48 8d 87
> > RIP [<ffffffff81199cee>] cfq_put_cfqg+0x0/0x91
> > RSP <ffff8801251b1d48>
>
> Hm.., I seem to be accessing cfqq->orig_cfqg, after I have freed cfqq.
> Following patch should fix it. Testing it now.
>
I tested this patch and it is working; the system now boots fine. Thanks
for loaning me the system, Jeff.
Vivek
> Thanks
> Vivek
>
> Index: linux-2.6-block/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6-block.orig/block/cfq-iosched.c
> +++ linux-2.6-block/block/cfq-iosched.c
> @@ -2368,7 +2368,7 @@ static int cfq_dispatch_requests(struct
> static void cfq_put_queue(struct cfq_queue *cfqq)
> {
> struct cfq_data *cfqd = cfqq->cfqd;
> - struct cfq_group *cfqg;
> + struct cfq_group *cfqg, *orig_cfqg;
>
> BUG_ON(atomic_read(&cfqq->ref) <= 0);
>
> @@ -2379,6 +2379,7 @@ static void cfq_put_queue(struct cfq_que
> BUG_ON(rb_first(&cfqq->sort_list));
> BUG_ON(cfqq->allocated[READ] + cfqq->allocated[WRITE]);
> cfqg = cfqq->cfqg;
> + orig_cfqg = cfqq->orig_cfqg;
>
> if (unlikely(cfqd->active_queue == cfqq)) {
> __cfq_slice_expired(cfqd, cfqq, 0);
> @@ -2388,8 +2389,8 @@ static void cfq_put_queue(struct cfq_que
> BUG_ON(cfq_cfqq_on_rr(cfqq));
> kmem_cache_free(cfq_pool, cfqq);
> cfq_put_cfqg(cfqg);
> - if (cfqq->orig_cfqg)
> - cfq_put_cfqg(cfqq->orig_cfqg);
> + if (orig_cfqg)
> + cfq_put_cfqg(orig_cfqg);
> }
>
> /*
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-06 10:45 ` Corrado Zoccolo
2009-12-06 10:49 ` Jens Axboe
@ 2009-12-08 0:07 ` Jeff Moyer
2009-12-08 20:43 ` Corrado Zoccolo
1 sibling, 1 reply; 24+ messages in thread
From: Jeff Moyer @ 2009-12-08 0:07 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> Hi Jeff, I remember you saw large performance drop on your SAN for
> sequential writes with low_latency=1. Can you test if Shaohua's and
> this patch fix allow to recover some bandwidth? I think that enabling
> the queue depth ramp up only if a sync request was delayed should
> disable it for fast hardware like yours, so you should not be seeing
> the slowdown any more.
Average of 10 runs. Low latency set to 0:
Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load
Sequential Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 90.08 94.51% 51.316 10268.25 0.00000 0.00000 95
2.6.32 8192 65536 16 99.36 199.8% 89.248 13883.81 0.00000 0.00000 50
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 81.16 85.90% 45.672 5963.19 0.00000 0.00000 94
2.6.32 8192 65536 16 116.32 230.3% 58.371 6098.36 0.00000 0.00000 51
Sequential Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 112.17 1085.% 42.623 17114.54 0.00152 0.00000 10
2.6.32 8192 65536 16 111.26 2117.% 84.964 26480.60 0.03202 0.00000 5
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 162.48 887.1% 6.106 313.54 0.00000 0.00000 18
2.6.32 8192 65536 16 156.38 1767.% 14.077 1254.62 0.00000 0.00000 9
Average of 10 runs. Low latency set to 1:
Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load
Sequential Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 92.23 100.1% 52.119 6920.18 0.00000 0.00000 92
2.6.32 8192 65536 16 97.88 217.0% 99.691 7453.18 0.00000 0.00000 45
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 98.70 107.5% 42.994 3409.08 0.00000 0.00000 92
2.6.32 8192 65536 16 140.41 323.9% 59.616 4525.46 0.00000 0.00000 43
Sequential Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 112.33 1076.% 42.617 17072.28 0.00076 0.00000 10
2.6.32 8192 65536 16 111.84 2097.% 85.156 28221.77 0.02976 0.00000 5
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 159.33 870.5% 6.469 765.50 0.00000 0.00000 18
2.6.32 8192 65536 16 141.60 1632.% 15.364 2337.57 0.00000 0.00000 9
Cheers,
Jeff
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-04 12:35 Corrado Zoccolo
2009-12-05 11:13 ` Corrado Zoccolo
@ 2009-12-08 18:00 ` Jeff Moyer
2009-12-08 20:46 ` Corrado Zoccolo
1 sibling, 1 reply; 24+ messages in thread
From: Jeff Moyer @ 2009-12-08 18:00 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> The introduction of ramp-up formula for async queue depths has
> slowed down dirty page reclaim, by reducing async write performance.
> This patch makes sure the formula kicks in only when sync request
> was recently delayed.
> @@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
> cfqd->cfq_latency = 1;
> cfqd->cfq_group_isolation = 0;
> cfqd->hw_tag = -1;
> - cfqd->last_end_sync_rq = jiffies;
> + cfqd->last_delayed_sync = jiffies - HZ;
So, umm, what's that about?
Cheers,
Jeff
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-08 0:07 ` Jeff Moyer
@ 2009-12-08 20:43 ` Corrado Zoccolo
2009-12-09 18:09 ` Jeff Moyer
0 siblings, 1 reply; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-08 20:43 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
On Tue, Dec 8, 2009 at 1:07 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Corrado Zoccolo <czoccolo@gmail.com> writes:
>
>> Hi Jeff, I remember you saw large performance drop on your SAN for
>> sequential writes with low_latency=1.  Can you test if Shaohua's and
>> this patch fix allow to recover some bandwidth?  I think that enabling
>> the queue depth ramp up only if a sync request was delayed should
>> disable it for fast hardware like yours, so you should not be seeing
>> the slowdown any more.
>
> Average of 10 runs.  Low latency set to 0:
> [unit information and tiobench results snipped -- quoted in full above]
>
> Average of 10 runs.  Low latency set to 1:
> [tiobench results snipped -- quoted in full above]
>
> Cheers,
> Jeff
The numbers look good. Now, there is no penalty in having low_latency
set for sequential writes, and just a small penalty for random ones.
The fact that random reads are faster with low_latency set is interesting.
Is the test running with your patched tiobench (so that the number
of random operations is comparable with sequential ones)?

Thanks,
Corrado
--
__________________________________________________________________________
dott. Corrado Zoccolo                          mailto:czoccolo@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-08 18:00 ` Jeff Moyer
@ 2009-12-08 20:46 ` Corrado Zoccolo
2009-12-09 18:16 ` Jeff Moyer
0 siblings, 1 reply; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-08 20:46 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Hi Jeff,
On Tue, Dec 8, 2009 at 7:00 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Corrado Zoccolo <czoccolo@gmail.com> writes:
>
>> The introduction of ramp-up formula for async queue depths has
>> slowed down dirty page reclaim, by reducing async write performance.
>> This patch makes sure the formula kicks in only when sync request
>> was recently delayed.
>> @@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
>> cfqd->cfq_latency = 1;
>> cfqd->cfq_group_isolation = 0;
>> cfqd->hw_tag = -1;
>> - cfqd->last_end_sync_rq = jiffies;
>> + cfqd->last_delayed_sync = jiffies - HZ;
>
> So, umm, what's that about?
Previously, when cfq started, it started in a state where writes were
completely throttled. Now, we optimistically prefer to start with a
reasonable max depth (10).
Thanks
Corrado
> Cheers,
> Jeff
>
--
__________________________________________________________________________
dott. Corrado Zoccolo mailto:czoccolo@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-08 20:43 ` Corrado Zoccolo
@ 2009-12-09 18:09 ` Jeff Moyer
2009-12-11 17:15 ` Corrado Zoccolo
0 siblings, 1 reply; 24+ messages in thread
From: Jeff Moyer @ 2009-12-09 18:09 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> The numbers look good. Now, there is no penalty in having low_latency
> set for sequential writes, and just a small penalty for random ones.
> The fact that random reads are faster with low_latency set is interesting.
> Is the test running with your patched tiobench (so that the number
> of random operations is comparable with sequential ones)?
No, I forgot all about that. The number of random operations defaults
to 4000, which is pretty low. I'll re-run the tests with a number
comparable to the sequential runs. Sorry about that.
Cheers,
Jeff
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-08 20:46 ` Corrado Zoccolo
@ 2009-12-09 18:16 ` Jeff Moyer
2009-12-09 19:05 ` Jens Axboe
0 siblings, 1 reply; 24+ messages in thread
From: Jeff Moyer @ 2009-12-09 18:16 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> Hi Jeff,
> On Tue, Dec 8, 2009 at 7:00 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>> Corrado Zoccolo <czoccolo@gmail.com> writes:
>>
>>> The introduction of ramp-up formula for async queue depths has
>>> slowed down dirty page reclaim, by reducing async write performance.
>>> This patch makes sure the formula kicks in only when sync request
>>> was recently delayed.
>>> @@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
>>> cfqd->cfq_latency = 1;
>>> cfqd->cfq_group_isolation = 0;
>>> cfqd->hw_tag = -1;
>>> - cfqd->last_end_sync_rq = jiffies;
>>> + cfqd->last_delayed_sync = jiffies - HZ;
>>
>> So, umm, what's that about?
>
> Previously, when cfq started, it started in a state where writes were
> completely throttled. Now, we optimistically prefer to start with a
> reasonable max depth (10)
OK. Can we put a comment in there and change the initialization to
cfq_slice_sync * 10?
Thanks,
Jeff
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-09 18:16 ` Jeff Moyer
@ 2009-12-09 19:05 ` Jens Axboe
0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2009-12-09 19:05 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Corrado Zoccolo, Linux-Kernel, Vivek Goyal
On Wed, Dec 09 2009, Jeff Moyer wrote:
> Corrado Zoccolo <czoccolo@gmail.com> writes:
>
> > Hi Jeff,
> > On Tue, Dec 8, 2009 at 7:00 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> >> Corrado Zoccolo <czoccolo@gmail.com> writes:
> >>
> >>> The introduction of ramp-up formula for async queue depths has
> >>> slowed down dirty page reclaim, by reducing async write performance.
> >>> This patch makes sure the formula kicks in only when sync request
> >>> was recently delayed.
> >>> @@ -3706,7 +3707,7 @@ static void *cfq_init_queue(struct request_queue *q)
> >>> cfqd->cfq_latency = 1;
> >>> cfqd->cfq_group_isolation = 0;
> >>> cfqd->hw_tag = -1;
> >>> - cfqd->last_end_sync_rq = jiffies;
> >>> + cfqd->last_delayed_sync = jiffies - HZ;
> >>
> >> So, umm, what's that about?
> >
> > Previously, when cfq started, it started in a state where writes were
> > completely throttled. Now, we optimistically prefer to start with a
> > reasonable max depth (10)
>
> OK. Can we put a comment in there and change the initialization to
> cfq_slice_sync * 10?
Agree, that would be MUCH easier to understand.
--
Jens Axboe
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
@ 2009-12-09 19:45 Corrado Zoccolo
2009-12-09 19:51 ` Jeff Moyer
2009-12-09 19:54 ` Jens Axboe
0 siblings, 2 replies; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-09 19:45 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jeff Moyer, Linux-Kernel, Vivek Goyal
On Wed, Dec 9, 2009 at 8:05 PM, Jens Axboe <jens.axboe@oracle.com> wrote:
> On Wed, Dec 09 2009, Jeff Moyer wrote:
>> OK. Can we put a comment in there and change the initialization to
>> cfq_slice_sync * 10?
>
> Agree, that would be MUCH easier to understand.
>
Sure, we can put a comment there, but I don't like hardcoding a constant
that depends on how the formula is computed (what if the formula is
changed, and it doesn't depend on cfq_slice_sync any more, or if
cfq_slice_sync changes dynamically?).
When I wrote it, what I really meant was exactly what you read in the C
code (assume the last delayed sync happened 1 second ago). Then, the
effect would be to start with a queue depth of 10 with the current
formula, but even if we change the formula, 1 second is still meaningful
(while 10 * cfq_slice_sync, which has the same value, becomes
misleading). So my proposed fix is just:
From f06cd83b45b3a7ee13ae7322197b610085dc70dd Mon Sep 17 00:00:00 2001
From: Corrado Zoccolo <corrado@localhost.(none)>
Date: Wed, 9 Dec 2009 20:40:16 +0100
Subject: [PATCH] cfq-iosched: commenting non-obvious initialization
Added a comment to explain the initialization of last_delayed_sync.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
---
block/cfq-iosched.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 98b15b9..69ecee7 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3759,6 +3759,10 @@ static void *cfq_init_queue(struct request_queue *q)
cfqd->cfq_latency = 1;
cfqd->cfq_group_isolation = 0;
cfqd->hw_tag = -1;
+ /*
+ * we optimistically start assuming sync ops weren't delayed in last
+ * second, in order to have larger depth for async operations.
+ */
cfqd->last_delayed_sync = jiffies - HZ;
INIT_RCU_HEAD(&cfqd->rcu);
return cfqd;
--
1.6.2.5
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-09 19:45 [PATCH] cfq-iosched: reduce write depth only if sync was delayed Corrado Zoccolo
@ 2009-12-09 19:51 ` Jeff Moyer
2009-12-09 19:54 ` Jens Axboe
1 sibling, 0 replies; 24+ messages in thread
From: Jeff Moyer @ 2009-12-09 19:51 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Jens Axboe, Linux-Kernel, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> On Wed, Dec 9, 2009 at 8:05 PM, Jens Axboe <jens.axboe@oracle.com> wrote:
>> On Wed, Dec 09 2009, Jeff Moyer wrote:
>>> OK. Can we put a comment in there and change the initialization to
>>> cfq_slice_sync * 10?
>>
>> Agree, that would be MUCH easier to understand.
>>
> Sure, we can put a comment there, but I don't like hardcoding a
> constant that depends on how the formula is computed (what if the
> formula is changed, and it doesn't depend on cfq_slice_sync any more,
> or if cfq_slice_sync changes dynamically?).
Then presumably you'd change the initialization of that variable.
> When I wrote it, what I really meant was exactly what you read in the
> C code (assume the last delayed sync happened 1 second ago). Then, the
> effect would be to start with a queue depth of 10 with the current
> formula, but even if we change the formula, 1 second is still
> meaningful (while 10 *cfq_slice_sync, that has the same value, becomes
> misleading). So my proposed fix is just:
Well, given your initial explanation, my suggestion made sense to me.
Given this new explanation, I'm fine with the change below. Thanks for
clarifying.
Acked-by: Jeff Moyer <jmoyer@redhat.com>
> From f06cd83b45b3a7ee13ae7322197b610085dc70dd Mon Sep 17 00:00:00 2001
> From: Corrado Zoccolo <corrado@localhost.(none)>
> Date: Wed, 9 Dec 2009 20:40:16 +0100
> Subject: [PATCH] cfq-iosched: commenting non-obvious initialization
>
> Added a comment to explain the initialization of last_delayed_sync.
>
> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
> ---
> block/cfq-iosched.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 98b15b9..69ecee7 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3759,6 +3759,10 @@ static void *cfq_init_queue(struct request_queue *q)
> cfqd->cfq_latency = 1;
> cfqd->cfq_group_isolation = 0;
> cfqd->hw_tag = -1;
> + /*
> + * we optimistically start assuming sync ops weren't delayed in last
> + * second, in order to have larger depth for async operations.
> + */
> cfqd->last_delayed_sync = jiffies - HZ;
> INIT_RCU_HEAD(&cfqd->rcu);
> return cfqd;
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-09 19:45 [PATCH] cfq-iosched: reduce write depth only if sync was delayed Corrado Zoccolo
2009-12-09 19:51 ` Jeff Moyer
@ 2009-12-09 19:54 ` Jens Axboe
1 sibling, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2009-12-09 19:54 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Jeff Moyer, Linux-Kernel, Vivek Goyal
On Wed, Dec 09 2009, Corrado Zoccolo wrote:
> On Wed, Dec 9, 2009 at 8:05 PM, Jens Axboe <jens.axboe@oracle.com> wrote:
> > On Wed, Dec 09 2009, Jeff Moyer wrote:
> >> OK. Can we put a comment in there and change the initialization to
> >> cfq_slice_sync * 10?
> >
> > Agree, that would be MUCH easier to understand.
> >
> Sure, we can put a comment there, but I don't like hardcoding a constant that depends on how the formula is computed (what if the formula is changed,
> and it doesn't depend on cfq_slice_sync any more, or if cfq_slice_sync changes dynamically?).
> When I wrote it, what I really meant was exactly what you read in the C code (assume the last delayed sync happened 1 second ago). Then, the effect
> would be to start with a queue depth of 10 with the current formula, but even if we change the formula, 1 second is still meaningful (while 10
> *cfq_slice_sync, that has the same value, becomes misleading). So my proposed fix is just:
>
> From f06cd83b45b3a7ee13ae7322197b610085dc70dd Mon Sep 17 00:00:00 2001
> From: Corrado Zoccolo <corrado@localhost.(none)>
> Date: Wed, 9 Dec 2009 20:40:16 +0100
> Subject: [PATCH] cfq-iosched: commenting non-obvious initialization
>
> Added a comment to explain the initialization of last_delayed_sync.
>
> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
> ---
> block/cfq-iosched.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 98b15b9..69ecee7 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3759,6 +3759,10 @@ static void *cfq_init_queue(struct request_queue *q)
> cfqd->cfq_latency = 1;
> cfqd->cfq_group_isolation = 0;
> cfqd->hw_tag = -1;
> + /*
> + * we optimistically start assuming sync ops weren't delayed in last
> + * second, in order to have larger depth for async operations.
> + */
> cfqd->last_delayed_sync = jiffies - HZ;
Thanks, that explains the HZ nicely. Applied.
--
Jens Axboe
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-09 18:09 ` Jeff Moyer
@ 2009-12-11 17:15 ` Corrado Zoccolo
2009-12-11 17:50 ` Jeff Moyer
2009-12-18 15:32 ` Jeff Moyer
0 siblings, 2 replies; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-11 17:15 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Hi Jeff,
On Wed, Dec 9, 2009 at 7:09 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Corrado Zoccolo <czoccolo@gmail.com> writes:
>
>> The numbers look good. Now, there is no penalty in having low_latency
>> set for sequential writes, and just a small penalty for random ones.
>> The fact that random reads are faster with low_latency set is interesting.
>> Is the test running with your patched tiobench (so that the number
>> of random operations is comparable with sequential ones)?
>
> No, I forgot all about that. The number of random operations defaults
> to 4000, which is pretty low. I'll re-run the tests with a number
> comparable to the sequential runs. Sorry about that.
>
N.P.
if you have time, can you also re-run the test changing:
iosched/fifo_expire_async to 8 ?
I hope that reducing the expire_async will make cfq quicker at switching
between the different threads, allowing more parallelism for seq
writers on your hw.
If this is the case, I think I can try to estimate the
fifo_expire_async in the autotuning patch.
> Cheers,
> Jeff
>
Thanks
Corrado
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-11 17:15 ` Corrado Zoccolo
@ 2009-12-11 17:50 ` Jeff Moyer
2009-12-18 15:32 ` Jeff Moyer
1 sibling, 0 replies; 24+ messages in thread
From: Jeff Moyer @ 2009-12-11 17:50 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> Hi Jeff,
> On Wed, Dec 9, 2009 at 7:09 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>> Corrado Zoccolo <czoccolo@gmail.com> writes:
>>
>>> The numbers look good. Now, there is no penalty in having low_latency
>>> set for sequential writes, and just a small penalty for random ones.
>>> The fact that random reads are faster with low_latency set is interesting.
>>> Is the test running with your patched tiobench (so that the number
>>> of random operations is comparable with sequential ones)?
>>
>> No, I forgot all about that. The number of random operations defaults
>> to 4000, which is pretty low. I'll re-run the tests with a number
>> comparable to the sequential runs. Sorry about that.
>>
> N.P.
low_latency=1
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 19.35 26.10% 74.639 3278.74 0.00000 0.00000 74
2.6.32 8192 65536 16 20.40 53.07% 135.695 3705.55 0.00000 0.00000 38
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 64.64 342.2% 22.708 5960.22 0.00153 0.00000 19
2.6.32 8192 65536 16 61.19 663.7% 47.648 11294.56 0.16098 0.00000 9
low_latency=0
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 19.71 25.32% 70.891 4340.92 0.00000 0.00000 78
2.6.32 8192 65536 16 22.67 58.94% 121.669 6180.49 0.00229 0.00000 38
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 64.69 339.5% 22.651 6770.99 0.00152 0.00000 19
2.6.32 8192 65536 16 61.94 663.0% 46.962 10719.13 0.12741 0.00000 9
> if you have time, can you also re-run the test changing:
> iosched/fifo_expire_async to 8 ?
Sure.
Cheers,
Jeff
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-11 17:15 ` Corrado Zoccolo
2009-12-11 17:50 ` Jeff Moyer
@ 2009-12-18 15:32 ` Jeff Moyer
2009-12-18 21:03 ` Corrado Zoccolo
1 sibling, 1 reply; 24+ messages in thread
From: Jeff Moyer @ 2009-12-18 15:32 UTC (permalink / raw)
To: Corrado Zoccolo; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Corrado Zoccolo <czoccolo@gmail.com> writes:
> Hi Jeff,
> On Wed, Dec 9, 2009 at 7:09 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>> Corrado Zoccolo <czoccolo@gmail.com> writes:
>>
>>> The numbers look good. Now, there is no penalty in having low_latency
>>> set for sequential writes, and just a small penalty for random ones.
>>> The fact that random reads are faster with low_latency set is interesting.
>>> Is the test running with your patched tiobench (so that the number
>>> of random operations is comparable with sequential ones)?
>>
>> No, I forgot all about that. The number of random operations defaults
>> to 4000, which is pretty low. I'll re-run the tests with a number
>> comparable to the sequential runs. Sorry about that.
>>
> N.P.
> if you have time, can you also re-run the test changing:
> iosched/fifo_expire_async to 8 ?
> I hope that reducing the expire_async will make cfq quicker at switching
> between the different threads, allowing more parallelism for seq
> writers on your hw.
> If this is the case, I think I can try to estimate the
> fifo_expire_async in the autotuning patch.
Sorry this took so long. I've been rather busy of late.
Cheers,
Jeff
low_latency=1, fifo_expire_async=8
Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load
Sequential Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 88.39 89.74% 16.388 2032.62 0.00000 0.00000 98
2.6.32 8192 65536 16 90.77 185.3% 32.213 2175.99 0.00000 0.00000 49
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 19.62 25.74% 71.827 3397.26 0.00000 0.00000 76
2.6.32 8192 65536 16 23.82 55.01% 103.361 4075.53 0.00000 0.00000 43
Sequential Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 108.28 1007.% 12.984 5643.55 0.00076 0.00000 11
2.6.32 8192 65536 16 112.40 2014.% 25.430 8592.98 0.00839 0.00000 6
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 63.94 337.7% 22.885 6047.22 0.00076 0.00000 19
2.6.32 8192 65536 16 61.94 662.5% 46.997 12759.69 0.15411 0.00000 9
low_latency=0, fifo_expire_async=8
Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load
Sequential Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 89.13 88.30% 15.872 3101.39 0.00000 0.00000 101
2.6.32 8192 65536 16 86.78 161.7% 30.794 4909.02 0.00000 0.00000 54
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 20.21 26.64% 69.863 4285.42 0.00000 0.00000 76
2.6.32 8192 65536 16 20.10 52.75% 139.761 5986.94 0.00076 0.00000 38
Sequential Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 108.74 1020.% 13.070 5331.78 0.00076 0.00000 11
2.6.32 8192 65536 16 112.18 2020.% 25.559 7903.16 0.00992 0.00000 6
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.32 8192 65536 8 64.53 337.8% 22.671 5388.77 0.00000 0.00000 19
2.6.32 8192 65536 16 61.75 668.9% 47.265 13271.37 0.12894 0.00000 9
* Re: [PATCH] cfq-iosched: reduce write depth only if sync was delayed
2009-12-18 15:32 ` Jeff Moyer
@ 2009-12-18 21:03 ` Corrado Zoccolo
0 siblings, 0 replies; 24+ messages in thread
From: Corrado Zoccolo @ 2009-12-18 21:03 UTC (permalink / raw)
To: Jeff Moyer; +Cc: Linux-Kernel, Jens Axboe, Vivek Goyal
Thanks Jeff.

Your tests show that, with the patch, low_latency doesn't penalize
writeback any more on fast hardware, so the goal of the patch is
fulfilled. Lowering slice_async doesn't change the picture (I was
expecting an improvement in seq write bandwidth).

I'm still puzzled as to why your SAN could achieve 300 MB/s sequential
writes with 2.6.29, but can't achieve that any more.
Can you try lowering slice_idle, e.g. to 2ms?

Thanks,
Corrado
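(Reader aid, not part of the original message: the tunables exercised in this thread live under the per-device iosched sysfs directory. A sketch, where "sdX" is a placeholder for your block device and cfq is assumed to be the active scheduler:)

```shell
# Tunables discussed in this thread ("sdX" is a placeholder).
DEV=sdX
# Confirm cfq is the active scheduler for the device.
cat /sys/block/$DEV/queue/scheduler
# Lower slice_idle (default 8 ms) to 2 ms, as suggested above.
echo 2 > /sys/block/$DEV/queue/iosched/slice_idle
# The other settings used in the benchmark runs below:
echo 8 > /sys/block/$DEV/queue/iosched/fifo_expire_async
echo 1 > /sys/block/$DEV/queue/iosched/low_latency
```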
On Fri, Dec 18, 2009 at 4:32 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Corrado Zoccolo <czoccolo@gmail.com> writes:
>
>> Hi Jeff,
>> On Wed, Dec 9, 2009 at 7:09 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>>> Corrado Zoccolo <czoccolo@gmail.com> writes:
>>>
>>>> The numbers look good. Now, there is no penalty in having low_latency
>>>> set for sequential writes, and just a small penalty for random ones.
>>>> The fact that random reads are faster with low_latency set is interesting.
>>>> Is the test running with your patched tiobench (so that the number
>>>> of random operations is comparable with sequential ones)?
>>>
>>> No, I forgot all about that.  The number of random operations defaults
>>> to 4000, which is pretty low.  I'll re-run the tests with a number
>>> comparable to the sequential runs.  Sorry about that.
>>
>> N.P.
>> If you have time, can you also re-run the test changing
>> iosched/fifo_expire_async to 8?
>> I hope that reducing fifo_expire_async will make cfq quicker at switching
>> between the different threads, allowing more parallelism for seq
>> writers on your hw.
>> If this is the case, I think I can try to estimate
>> fifo_expire_async in the autotuning patch.
>
> Sorry this took so long.  I've been rather busy of late.
>
> Cheers,
> Jeff
>
> low_latency=1, fifo_expire_async=8
>
> Unit information
> ================
> File size = megabytes
> Blk Size  = bytes
> Rate      = megabytes per second
> CPU%      = percentage of CPU used during the test
> Latency   = milliseconds
> Lat%      = percent of requests that took longer than X seconds
> CPU Eff   = Rate divided by CPU% - throughput per cpu load
>
> Sequential Reads
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8   88.39  89.74%   16.388     2032.62  0.00000  0.00000    98
> 2.6.32                       8192   65536   16   90.77  185.3%   32.213     2175.99  0.00000  0.00000    49
>
> Random Reads
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8   19.62  25.74%   71.827     3397.26  0.00000  0.00000    76
> 2.6.32                       8192   65536   16   23.82  55.01%  103.361     4075.53  0.00000  0.00000    43
>
> Sequential Writes
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8  108.28  1007.%   12.984     5643.55  0.00076  0.00000    11
> 2.6.32                       8192   65536   16  112.40  2014.%   25.430     8592.98  0.00839  0.00000     6
>
> Random Writes
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8   63.94  337.7%   22.885     6047.22  0.00076  0.00000    19
> 2.6.32                       8192   65536   16   61.94  662.5%   46.997    12759.69  0.15411  0.00000     9
>
>
> low_latency=0, fifo_expire_async=8
>
> Unit information
> ================
> File size = megabytes
> Blk Size  = bytes
> Rate      = megabytes per second
> CPU%      = percentage of CPU used during the test
> Latency   = milliseconds
> Lat%      = percent of requests that took longer than X seconds
> CPU Eff   = Rate divided by CPU% - throughput per cpu load
>
> Sequential Reads
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8   89.13  88.30%   15.872     3101.39  0.00000  0.00000   101
> 2.6.32                       8192   65536   16   86.78  161.7%   30.794     4909.02  0.00000  0.00000    54
>
> Random Reads
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8   20.21  26.64%   69.863     4285.42  0.00000  0.00000    76
> 2.6.32                       8192   65536   16   20.10  52.75%  139.761     5986.94  0.00076  0.00000    38
>
> Sequential Writes
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8  108.74  1020.%   13.070     5331.78  0.00076  0.00000    11
> 2.6.32                       8192   65536   16  112.18  2020.%   25.559     7903.16  0.00992  0.00000     6
>
> Random Writes
>                              File    Blk   Num    Avg            Maximum      Lat%     Lat%    CPU
> Identifier                   Size    Size  Thr    Rate   (CPU%)  Latency      Latency  >2s      >10s    Eff
> ---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
> 2.6.32                       8192   65536    8   64.53  337.8%   22.671     5388.77  0.00000  0.00000    19
> 2.6.32                       8192   65536   16   61.75  668.9%   47.265    13271.37  0.12894  0.00000     9
>
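(Reader aid, not part of the original message: the behavior being benchmarked above is the async-depth ramp-up introduced by the patch at the top of this thread. A simplified Python model of that logic; HZ and the tunable values below are illustrative assumptions, not the kernel's actual configuration:)

```python
# Simplified model of cfq_may_dispatch's write-depth throttle from the
# patch: async depth grows linearly with the time elapsed since a sync
# request last completed late, one extra request per sync slice.
HZ = 1000                 # jiffies per second (assumption)
cfq_slice_sync = 100      # cfqd->cfq_slice[1] in jiffies (~100 ms, illustrative)
max_depth = 8             # hypothetical hardware queue-depth cap

def allowed_async_depth(now, last_delayed_sync):
    """Depth allowed for async writes at time `now` (jiffies)."""
    elapsed = now - last_delayed_sync
    return min(elapsed // cfq_slice_sync, max_depth)

# The patch initializes last_delayed_sync to jiffies - HZ, so async
# queues start at full depth rather than being throttled right away;
# the throttle only engages once a sync request is actually delayed.
```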
--
__________________________________________________________________________
dott. Corrado Zoccolo                          mailto:czoccolo@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                               Tales of Power - C. Castaneda
end of thread, other threads:[~2009-12-18 21:03 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-12-09 19:45 [PATCH] cfq-iosched: reduce write depth only if sync was delayed Corrado Zoccolo
2009-12-09 19:51 ` Jeff Moyer
2009-12-09 19:54 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2009-12-04 12:35 Corrado Zoccolo
2009-12-05 11:13 ` Corrado Zoccolo
2009-12-06 10:45 ` Corrado Zoccolo
2009-12-06 10:49 ` Jens Axboe
2009-12-07 14:13 ` Jeff Moyer
2009-12-07 14:41 ` Jens Axboe
2009-12-07 16:45 ` Jeff Moyer
2009-12-07 17:04 ` Vivek Goyal
2009-12-07 17:27 ` Vivek Goyal
2009-12-07 18:03 ` Vivek Goyal
2009-12-08 0:07 ` Jeff Moyer
2009-12-08 20:43 ` Corrado Zoccolo
2009-12-09 18:09 ` Jeff Moyer
2009-12-11 17:15 ` Corrado Zoccolo
2009-12-11 17:50 ` Jeff Moyer
2009-12-18 15:32 ` Jeff Moyer
2009-12-18 21:03 ` Corrado Zoccolo
2009-12-08 18:00 ` Jeff Moyer
2009-12-08 20:46 ` Corrado Zoccolo
2009-12-09 18:16 ` Jeff Moyer
2009-12-09 19:05 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox