* [patch]cfq-iosched: delete deep seeky queue idle logic
From: Shaohua Li <shaohua.li@intel.com>
Date: 2011-09-16 3:09
To: lkml
Cc: Jens Axboe, Maxim Patlasov, Vivek Goyal, Corrado Zoccolo

Recently Maxim and I discussed why his aiostress workload performs poorly. If
you didn't follow the discussion, here are the issues we found:

1. cfq seeky detection isn't good. Assume a task accesses sector A, B, C, D,
A+1, B+1, C+1, D+1, A+2... Accessing A, B, C, D is random, so cfq will detect
the queue as seeky; but by the time A+1 is accessed, A+1 is already in the
disk cache, so this should really be detected as sequential. Not sure if any
real workload has such an access pattern, and a clean fix doesn't seem easy
either. Any ideas for this?

2. Deep seeky queue idle. This makes RAID perform poorly; I think we should
revert the logic. Deep queues are more common on high-end hardware, and on
such hardware we'd better not idle.
Note that currently we set a queue's slice after the first request is
finished, which means the drive already idles a little. If the queue is truly
deep, new requests should already have come in, so idling isn't required.
Vivek used to post a patch to revert it, but it got ignored:
http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html

Signed-off-by: Shaohua Li <shaohua.li@intel.com>

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index a33bd43..f75439e 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -334,7 +334,6 @@ enum cfqq_state_flags {
 	CFQ_CFQQ_FLAG_sync,		/* synchronous queue */
 	CFQ_CFQQ_FLAG_coop,		/* cfqq is shared */
 	CFQ_CFQQ_FLAG_split_coop,	/* shared cfqq will be splitted */
-	CFQ_CFQQ_FLAG_deep,		/* sync cfqq experienced large depth */
 	CFQ_CFQQ_FLAG_wait_busy,	/* Waiting for next request */
 };

@@ -363,7 +362,6 @@ CFQ_CFQQ_FNS(slice_new);
 CFQ_CFQQ_FNS(sync);
 CFQ_CFQQ_FNS(coop);
 CFQ_CFQQ_FNS(split_coop);
-CFQ_CFQQ_FNS(deep);
 CFQ_CFQQ_FNS(wait_busy);
 #undef CFQ_CFQQ_FNS

@@ -2375,17 +2373,6 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
 		goto keep_queue;
 	}

-	/*
-	 * This is a deep seek queue, but the device is much faster than
-	 * the queue can deliver, don't idle
-	 **/
-	if (CFQQ_SEEKY(cfqq) && cfq_cfqq_idle_window(cfqq) &&
-	    (cfq_cfqq_slice_new(cfqq) ||
-	    (cfqq->slice_end - jiffies > jiffies - cfqq->slice_start))) {
-		cfq_clear_cfqq_deep(cfqq);
-		cfq_clear_cfqq_idle_window(cfqq);
-	}
-
 	if (cfqq->dispatched && cfq_should_idle(cfqd, cfqq)) {
 		cfqq = NULL;
 		goto keep_queue;
@@ -3298,13 +3285,10 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,

 	enable_idle = old_idle = cfq_cfqq_idle_window(cfqq);

-	if (cfqq->queued[0] + cfqq->queued[1] >= 4)
-		cfq_mark_cfqq_deep(cfqq);
-
 	if (cfqq->next_rq && (cfqq->next_rq->cmd_flags & REQ_NOIDLE))
 		enable_idle = 0;
 	else if (!atomic_read(&cic->ioc->nr_tasks) || !cfqd->cfq_slice_idle ||
-		 (!cfq_cfqq_deep(cfqq) && CFQQ_SEEKY(cfqq)))
+		 CFQQ_SEEKY(cfqq))
 		enable_idle = 0;
 	else if (sample_valid(cic->ttime.ttime_samples)) {
 		if (cic->ttime.ttime_mean > cfqd->cfq_slice_idle)
@@ -3874,11 +3858,6 @@ static void cfq_idle_slice_timer(unsigned long data)
 	 */
 	if (!RB_EMPTY_ROOT(&cfqq->sort_list))
 		goto out_kick;
-
-	/*
-	 * Queue depth flag is reset only when the idle didn't succeed
-	 */
-	cfq_clear_cfqq_deep(cfqq);
 	}
 expire:
 	cfq_slice_expired(cfqd, timed_out);
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Corrado Zoccolo
Date: 2011-09-16 6:04
To: Shaohua Li
Cc: lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal

On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> Recently Maxim and I discussed why his aiostress workload performs poorly. If
> you didn't follow the discussion, here are the issues we found:
> 1. cfq seeky detection isn't good. Assume a task accesses sector A, B, C, D,
> A+1, B+1, C+1, D+1, A+2... Accessing A, B, C, D is random, so cfq will detect
> the queue as seeky; but by the time A+1 is accessed, A+1 is already in the
> disk cache, so this should really be detected as sequential. Not sure if any
> real workload has such an access pattern, and a clean fix doesn't seem easy
> either. Any ideas for this?

Not all disks will cache 4 independent streams; we can't make that assumption
in cfq. The current behaviour of assuming it is seeky should work well enough:
it will be put in the seeky tree, where it can enjoy the seeky tree's quantum
of time. If the second round takes a short time, it will be able to schedule a
third round again after the idle time. If there are other seeky processes
competing for the tree, the cache can be cleared by the time it gets back to
your 4-stream process, so it will behave exactly as a seeky process from cfq's
point of view. If the various accesses were submitted in parallel, the deep
seeky queue logic should kick in and make sure the process gets a sequential
quantum, rather than sharing it with other seeky processes, so depending on
your disk it could perform better.

> 2. Deep seeky queue idle. This makes RAID perform poorly; I think we should
> revert the logic. Deep queues are more common on high-end hardware, and on
> such hardware, we'd better not idle.
> Note that currently we set a queue's slice after the first request is
> finished, which means the drive already idles a little. If the queue is truly
> deep, new requests should already have come in, so idling isn't required.
> Vivek used to post a patch to revert it, but it got ignored:
> http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html

I get a 404 here. I think you are seeing only one side of the coin. That logic
is there mainly to ensure fairness between deep seeky processes and normal
seeky processes that want low latency. If you remove that logic, a single
process making many parallel aio reads could completely swamp one machine,
preventing other seeky processes from progressing. Instead of removing the
logic completely, you should make the depth configurable, so multi-spindle
storage could allow deeper queues before switching to the fairness-enforcing
policy.
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
>
> [..]
> @@ -2375,17 +2373,6 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
>  		goto keep_queue;
>  	}
>
> -	/*
> -	 * This is a deep seek queue, but the device is much faster than
> -	 * the queue can deliver, don't idle
> -	 **/
> -	if (CFQQ_SEEKY(cfqq) && cfq_cfqq_idle_window(cfqq) &&
> -	    (cfq_cfqq_slice_new(cfqq) ||
> -	    (cfqq->slice_end - jiffies > jiffies - cfqq->slice_start))) {
> -		cfq_clear_cfqq_deep(cfqq);
> -		cfq_clear_cfqq_idle_window(cfqq);
> -	}
> -

I haven't seen the patch that introduced this code hunk, but this could
disrupt the cache in your first scenario if the reads A, B, C, D were sent in
parallel. You mistakenly assume your disk can issue more requests in parallel
only because many of them hit the cache. Now you start sending other unrelated
requests (your 4-stream process is not identified as deep any more, so other
processes in the seeky tree compete with it), and this makes your cache hit
ratio drop and everything slows down.

> [..]

Thanks,
Corrado
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Shaohua Li
Date: 2011-09-16 6:40
To: Corrado Zoccolo
Cc: lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal

On Fri, 2011-09-16 at 14:04 +0800, Corrado Zoccolo wrote:
> On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> > [..]
> Not all disks will cache 4 independent streams; we can't make that
> assumption in cfq.

Sure, we can't make such an assumption. I'm wondering whether we should move
the seeky detection to request-finish time: if the time between the completion
of two requests is short, we treat the queue as sequential. This would make
the detection adaptive, but the time measurement doesn't seem easy.

> The current behaviour of assuming it is seeky should work well enough:
> it will be put in the seeky tree, where it can enjoy the seeky tree's
> quantum of time. [..]
> If the various accesses were submitted in parallel, the deep seeky
> queue logic should kick in and make sure the process gets a sequential
> quantum, rather than sharing it with other seeky processes, so
> depending on your disk it could perform better.

Yes, the idle logic makes it OK, but that sounds like "make things wrong first
(in the seeky detection) and then fix it later (in the idle logic)".

> > 2. Deep seeky queue idle. [..]
> > Note that currently we set a queue's slice after the first request is
> > finished, which means the drive already idles a little. If the queue is
> > truly deep, new requests should already have come in, so idling isn't
> > required.

What do you think about this? A seeky request takes a long time, so the queue
is already effectively idling for a little while.

> > Vivek used to post a patch to revert it, but it got ignored:
> > http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html
> I get a 404 here. I think you are seeing only one side of the coin.
> That logic is there mainly to ensure fairness between deep seeky
> processes and normal seeky processes that want low latency.

I didn't understand this. The logic doesn't protect non-deep processes, so how
could it give the normal seeky process low latency? Or do you have a test case
for this, so I can analyze it? I tried a workload with one task driving depth
4 and one task driving depth 16; the behavior appears unchanged with or
without the logic.

> If you remove that logic, a single process making many parallel aio
> reads could completely swamp one machine, preventing other seeky
> processes from progressing.
> Instead of removing the logic completely, you should make the depth
> configurable, so multi-spindle storage could allow deeper queues
> before switching to the fairness-enforcing policy.

We already have too many tunables ;( And we don't have a way to get the
maximum depth a storage can provide. Could the driver detect that it's
multi-spindle storage and report it to the iosched, just like SSD detection?

Thanks,
Shaohua
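The completion-time idea floated above could look roughly like this (a toy
sketch, not kernel code: the `toy_` names, the microsecond cutoff, and the
majority-vote rule are all made up for illustration, and real kernel time
accounting is harder than this). The point is that a device finishing
back-to-back requests quickly treated them as cheap, regardless of the sector
pattern:

```c
/* Sketch: classify a queue by completion interval instead of seek
 * distance. If the device finishes consecutive requests quickly, they
 * were cheap for it (cache hit or short seek), so treat the queue as
 * sequential. All numbers are illustrative, in microseconds. */
#define SEQ_COMPLETION_US 1000UL   /* hypothetical cutoff */

struct toy_queue {
    unsigned long last_complete_us;
    int seq_hits;     /* completions that arrived "fast" */
    int samples;      /* total completions observed */
};

static void toy_on_complete(struct toy_queue *q, unsigned long now_us)
{
    if (q->samples > 0 && now_us - q->last_complete_us < SEQ_COMPLETION_US)
        q->seq_hits++;
    q->samples++;
    q->last_complete_us = now_us;
}

/* Call the queue sequential once we have a few samples and most
 * completions came back fast. */
static int toy_is_sequential(const struct toy_queue *q)
{
    return q->samples >= 4 && 2 * q->seq_hits >= q->samples;
}
```

A queue completing requests at 100us intervals would be classified sequential
after four samples, while one completing at 10ms intervals would stay seeky.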
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Corrado Zoccolo
Date: 2011-09-16 19:25
To: Shaohua Li
Cc: lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal

On Fri, Sep 16, 2011 at 8:40 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> [..]
> > > 2. Deep seeky queue idle. [..]
> > > Note that currently we set a queue's slice after the first request is
> > > finished, which means the drive already idles a little. If the queue is
> > > truly deep, new requests should already have come in, so idling isn't
> > > required.
> What do you think about this? A seeky request takes a long time, so the
> queue is already effectively idling for a little while.

I don't think I understand. If cfq doesn't idle, it will dispatch another
request from the same or another queue (if present) immediately, until all
possible in-flight requests are sent. You then depend on NCQ for the order in
which requests are handled, so you cannot guarantee fairness any more.

> > That logic is there mainly to ensure fairness between deep seeky
> > processes and normal seeky processes that want low latency.
> I didn't understand this. The logic doesn't protect non-deep processes,
> so how could it give the normal seeky process low latency? Or do you have
> a test case for this, so I can analyze it?
> I tried a workload with one task driving depth 4 and one task driving
> depth 16; the behavior appears unchanged with or without the logic.

Try a workload with one shallow seeky queue and one deep (16) one, on a
single-spindle NCQ disk. I think the behaviour when I submitted my patch was
that both were getting a 100ms slice (if this is not happening, probably some
subsequent patch broke it). If you remove idling, they will get disk time
roughly in proportion 16:1, i.e. pretty unfair.

> > If you remove that logic, a single process making many parallel aio
> > reads could completely swamp one machine, preventing other seeky
> > processes from progressing.
> > Instead of removing the logic completely, you should make the depth
> > configurable, so multi-spindle storage could allow deeper queues
> > before switching to the fairness-enforcing policy.
> We already have too many tunables ;( And we don't have a way to get the
> maximum depth a storage can provide. Could the driver detect that it's
> multi-spindle storage and report it to the iosched, just like SSD
> detection?

I'd really like it if this were possible.

Thanks,
Corrado

-- 
__________________________________________________________________________
dott. Corrado Zoccolo                          mailto:czoccolo@gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                              Tales of Power - C. Castaneda
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Shaohua Li
Date: 2011-09-21 11:16
To: Corrado Zoccolo
Cc: lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal

On Sat, 2011-09-17 at 03:25 +0800, Corrado Zoccolo wrote:
> On Fri, Sep 16, 2011 at 8:40 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> > [..]
> > I didn't understand this. The logic doesn't protect non-deep processes,
> > so how could it give the normal seeky process low latency? Or do you have
> > a test case for this, so I can analyze it?
> > I tried a workload with one task driving depth 4 and one task driving
> > depth 16; the behavior appears unchanged with or without the logic.

Sorry for the delay.

> Try a workload with one shallow seeky queue and one deep (16) one, on
> a single-spindle NCQ disk.
> I think the behaviour when I submitted my patch was that both were
> getting a 100ms slice (if this is not happening, probably some
> subsequent patch broke it).
> If you remove idling, they will get disk time roughly in proportion
> 16:1, i.e. pretty unfair.

I thought you were talking about a workload with one thread at depth 4 and the
other at depth 16. I did some tests here. In an old kernel, without the deep
seeky queue idle logic, the threads get disk time in a 1:5 proportion; with
it, they get almost equal disk time, so this achieves your goal. In the latest
kernel there is no big difference with or without the logic (the depth-16
thread gets about 5x more disk time). With the logic, the depth-4 thread gets
equal disk time in the first several slices. But after an idle expiration
(mostly because the current block plug holds requests in the task's plug list
and doesn't add them to the elevator), the queue never gets detected as deep,
because the queue dispatches requests one by one. So the logic has already
been broken for some time (maybe since per-task block plugging was added).

Thanks,
Shaohua
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Vivek Goyal
Date: 2011-09-23 13:24
To: Shaohua Li
Cc: Corrado Zoccolo, lkml, Jens Axboe, Maxim Patlasov

On Wed, Sep 21, 2011 at 07:16:20PM +0800, Shaohua Li wrote:

[..]
> > If you remove idling, they will get disk time roughly in proportion
> > 16:1, i.e. pretty unfair.
> [..] With the logic, the depth-4 thread gets equal disk time in the first
> several slices. But after an idle expiration (mostly because the current
> block plug holds requests in the task's plug list and doesn't add them to
> the elevator), the queue never gets detected as deep, because the queue
> dispatches requests one by one.

When the plugged requests are flushed, they will be added to the elevator, and
at that point the queue should be marked as deep?

Anyway, what's wrong with the idea I suggested in the other mail of expiring a
sync-noidle queue after a few request dispatches, so that it does not starve
other sync-noidle queues?

Thanks
Vivek
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Corrado Zoccolo
Date: 2011-09-25 7:34
To: Vivek Goyal
Cc: Shaohua Li, lkml, Jens Axboe, Maxim Patlasov

On Fri, Sep 23, 2011 at 3:24 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Sep 21, 2011 at 07:16:20PM +0800, Shaohua Li wrote:
>
> [..]
> Anyway, what's wrong with the idea I suggested in the other mail of
> expiring a sync-noidle queue after a few request dispatches, so that it
> does not starve other sync-noidle queues?

I don't know the current state of the code. Are the noidle queues sorted in
some tree, by sector number? If that is the case, then even an expired queue
could still be at the front of the tree.

Thanks,
Corrado
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic
From: Vivek Goyal
Date: 2011-09-27 13:08
To: Corrado Zoccolo
Cc: Shaohua Li, lkml, Jens Axboe, Maxim Patlasov

On Sun, Sep 25, 2011 at 09:34:16AM +0200, Corrado Zoccolo wrote:

[..]
> I don't know the current state of the code. Are the noidle queues
> sorted in some tree, by sector number?
> If that is the case, then even an expired queue could still be at the
> front of the tree.

I am not aware of any queue sorting based on sector number on the sync-noidle
tree, so we should just be able to do round-robin.

Thanks
Vivek
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-23 13:24 ` Vivek Goyal 2011-09-25 7:34 ` Corrado Zoccolo @ 2011-09-26 0:51 ` Shaohua Li 2011-09-27 13:11 ` Vivek Goyal 1 sibling, 1 reply; 19+ messages in thread From: Shaohua Li @ 2011-09-26 0:51 UTC (permalink / raw) To: Vivek Goyal; +Cc: Corrado Zoccolo, lkml, Jens Axboe, Maxim Patlasov On Fri, 2011-09-23 at 21:24 +0800, Vivek Goyal wrote: > On Wed, Sep 21, 2011 at 07:16:20PM +0800, Shaohua Li wrote: > > [..] > > > Try a workload with one shallow seeky queue and one deep (16) one, on > > > a single spindle NCQ disk. > > > I think the behaviour when I submitted my patch was that both were > > > getting 100ms slice (if this is not happening, probably some > > > subsequent patch broke it). > > > If you remove idling, they will get disk time roughly in proportion > > > 16:1, i.e. pretty unfair. > > I thought you are talking about a workload with one thread depth 4, and > > the other thread depth 16. I did some tests here. In an old kernel, > > without the deep seeky idle logic, the threads have disk time in > > proportion 1:5. With it, they get almost equal disk time. SO this > > reaches your goal. In a latest kernel, w/wo the logic, there is no big > > difference (the 16 depth thread get about 5x more disk time). With the > > logic, the depth 4 thread gets equal disk time in first several slices. > > But after an idle expiration(mostly because current block plug hold > > requests in task list and didn't add them to elevator), the queue never > > gets detected as deep, because the queue dispatch request one by one. > > When the plugged requests are flushed, then they will be added to elevator > and at that point of time queue should be marked as deep? The problem is there are just 2 or 3 requests are hold to the per-task list and then get flushed into elevator later, so the queue isn't marked as deep. 
> Anyway, what's wrong with the idea I suggested in the other mail of expiring > a sync-noidle queue after a few request dispatches so that it does not > starve other sync-noidle queues. The problem is deciding how many requests a queue should dispatch. cfq_prio_to_maxrq() == 16, which is too many. Maybe use 4, but that has its risks: seeky requests from one task might still be much farther away than requests from other tasks. Thanks, Shaohua ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-26 0:51 ` Shaohua Li @ 2011-09-27 13:11 ` Vivek Goyal 0 siblings, 0 replies; 19+ messages in thread From: Vivek Goyal @ 2011-09-27 13:11 UTC (permalink / raw) To: Shaohua Li; +Cc: Corrado Zoccolo, lkml, Jens Axboe, Maxim Patlasov On Mon, Sep 26, 2011 at 08:51:39AM +0800, Shaohua Li wrote: > On Fri, 2011-09-23 at 21:24 +0800, Vivek Goyal wrote: > > On Wed, Sep 21, 2011 at 07:16:20PM +0800, Shaohua Li wrote: > > > > [..] > > > > Try a workload with one shallow seeky queue and one deep (16) one, on > > > > a single spindle NCQ disk. > > > > I think the behaviour when I submitted my patch was that both were > > > > getting 100ms slice (if this is not happening, probably some > > > > subsequent patch broke it). > > > > If you remove idling, they will get disk time roughly in proportion > > > > 16:1, i.e. pretty unfair. > > > I thought you are talking about a workload with one thread depth 4, and > > > the other thread depth 16. I did some tests here. In an old kernel, > > > without the deep seeky idle logic, the threads have disk time in > > > proportion 1:5. With it, they get almost equal disk time. SO this > > > reaches your goal. In a latest kernel, w/wo the logic, there is no big > > > difference (the 16 depth thread get about 5x more disk time). With the > > > logic, the depth 4 thread gets equal disk time in first several slices. > > > But after an idle expiration(mostly because current block plug hold > > > requests in task list and didn't add them to elevator), the queue never > > > gets detected as deep, because the queue dispatch request one by one. > > > > When the plugged requests are flushed, then they will be added to elevator > > and at that point of time queue should be marked as deep? > The problem is there are just 2 or 3 requests are hold to the per-task > list and then get flushed into elevator later, so the queue isn't marked > as deep. That would be workload dependent, wouldn't it?
> > > Anyway, what's wrong with the idea I suggested in the other mail of expiring > a sync-noidle queue after a few request dispatches so that it does not > starve other sync-noidle queues. > The problem is how many requests a queue should dispatch. > cfq_prio_to_maxrq() == 16, which is too many. Maybe use 4, but it has > its risk. seeky requests from one task might be still much far way with > requests from other tasks. 4-6 might be a reasonable number to begin with. I am not sure about the throughput impact, because the seek distance might be larger when moving to a different task. And fairness might also have some cost. Let's run some tests and see if something shows up. Thanks Vivek ^ permalink raw reply [flat|nested] 19+ messages in thread
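The expiry idea discussed above can be sketched as a plain userspace C simulation (names like `sim_queue` and `QUANTUM` are hypothetical; this is not CFQ code): each sync-noidle queue gets a small dispatch budget per round, so a deep queue cannot monopolize dispatch while shallow queues still make progress every round.

```c
#include <assert.h>

/* Hypothetical userspace model of round-robin among "sync-noidle"
 * queues, expiring each queue after a small dispatch quantum
 * (4, in the 4-6 range suggested above) instead of
 * cfq_prio_to_maxrq() == 16. */

#define NR_QUEUES 3
#define QUANTUM   4

struct sim_queue {
    int pending;   /* requests waiting in this queue */
    int serviced;  /* requests dispatched so far */
};

/* Dispatch up to "quantum" requests from one queue, then expire it. */
static int dispatch_quantum(struct sim_queue *q, int quantum)
{
    int n = 0;
    while (n < quantum && q->pending > 0) {
        q->pending--;
        q->serviced++;
        n++;
    }
    return n; /* how many requests were actually sent */
}

/* One full round-robin pass over all queues on the tree. */
static void rr_pass(struct sim_queue qs[], int nr)
{
    for (int i = 0; i < nr; i++)
        dispatch_quantum(&qs[i], QUANTUM);
}
```

With two deep queues (16 pending) and one shallow queue (4 pending), every queue is serviced each round; the deep queues simply take more rounds to drain, rather than starving the shallow one.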
[parent not found: <CADX3swq0qURdi7VYLAVbsAmX5psPrzq-uvbqANsnLkHO0xcOMQ@mail.gmail.com>]
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic [not found] ` <CADX3swq0qURdi7VYLAVbsAmX5psPrzq-uvbqANsnLkHO0xcOMQ@mail.gmail.com> @ 2011-09-26 0:55 ` Shaohua Li 2011-09-27 6:07 ` Corrado Zoccolo 0 siblings, 1 reply; 19+ messages in thread From: Shaohua Li @ 2011-09-26 0:55 UTC (permalink / raw) To: Corrado Zoccolo; +Cc: Vivek Goyal, Maxim Patlasov, Jens Axboe, lkml On Fri, 2011-09-23 at 13:50 +0800, Corrado Zoccolo wrote: > Il giorno 21/set/2011 13:16, "Shaohua Li" <shaohua.li@intel.com> ha > scritto: > > > > On Sat, 2011-09-17 at 03:25 +0800, Corrado Zoccolo wrote: > > > On Fri, Sep 16, 2011 at 8:40 AM, Shaohua Li <shaohua.li@intel.com> > wrote: > > > > On Fri, 2011-09-16 at 14:04 +0800, Corrado Zoccolo wrote: > > > >> On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li > <shaohua.li@intel.com> wrote: > > > >> > Recently Maxim and I discussed why his aiostress workload > performs poorly. If > > > >> > you didn't follow the discussion, here are the issues we > found: > > > >> > 1. cfq seeky dection isn't good. Assume a task accesses > sector A, B, C, D, A+1, > > > >> > B+1, C+1, D+1, A+2...Accessing A, B, C, D is random. cfq will > detect the queue > > > >> > as seeky, but since when accessing A+1, A+1 is already in > disk cache, this > > > >> > should be detected as sequential really. Not sure if any real > workload has such > > > >> > access patern, and seems not easy to have a clean fix too. > Any idea for this? > > > >> > > > >> Not all disks will cache 4 independent streams, we can't make > that > > > >> assumption in cfq. > > > > sure thing. we can't make such assumption. I'm thinking if we > should > > > > move the seeky detection in request finish. If time between two > requests > > > > finish is short, we thought the queue is sequential. This will > make the > > > > detection adaptive. but seems time measurement isn't easy. 
> > > > > > > >> The current behaviour of assuming it as seeky should work well > enough, > > > >> in fact it will be put in the seeky tree, and it can enjoy the > seeky > > > >> tree quantum of time. If the second round takes a short time, > it will > > > >> be able to schedule a third round again after the idle time. > > > >> If there are other seeky processes competing for the tree, the > cache > > > >> can be cleared by the time it gets back to your 4 streams > process, so > > > >> it will behave exactly as a seeky process from cfq point of > view. > > > >> If the various accesses were submitted in parallel, the deep > seeky > > > >> queue logic should kick in and make sure the process gets a > sequential > > > >> quantum, rather than sharing it with other seeky processes, so > > > >> depending on your disk, it could perform better. > > > > yes, the idle logic makes it ok, but sounds like "make things > wrong > > > > first (in seeky detection) and then fix it later (the idle > logic)". > > > > > > > >> > 2. deep seeky queue idle. This makes raid performs poorly. I > would think we > > > >> > revert the logic. Deep queue is more popular with high end > hardware. In such > > > >> > hardware, we'd better not do idle. > > > >> > Note, currently we set a queue's slice after the first > request is finished. > > > >> > This means the drive already idles a little time. If the > queue is truely deep, > > > >> > new requests should already come in, so idle isn't required. > > > > What did you think about this? Assume seeky request takes long > time, so > > > > the queue is already idling for a little time. > > > I don't think I understand. If cfq doesn't idle, it will dispatch > an > > > other request from the same or an other queue (if present) > > > immediately, until all possible in-flight requests are sent. Now, > you > > > depend on NCQ for the order requests are handled, so you cannot > > > guarantee fairness any more. 
> > > > > > > > > > >> > Looks Vivek used to post a patch to rever it, but it gets > ignored. > > > >> > > http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html > > > >> I get a 404 here. I think you are seeing only one half of the > medal. > > > >> That logic is there mainly to ensure fairness between deep > seeky > > > >> processes and normal seeky processes that want low latency. > > > > didn't understand it. The logic doesn't protect non-deep > process. how > > > > could it make the normal seeky process have low latency? or did > you have > > > > a test case for this, so I can analyze? > > > > I tried a workload with one task drives depth 4 and one task > drives > > > > depth 16. Appears the behavior isn't changed w/wo the logic. > > sorry for the delay. > > > > > Try a workload with one shallow seeky queue and one deep (16) one, > on > > > a single spindle NCQ disk. > > > I think the behaviour when I submitted my patch was that both were > > > getting 100ms slice (if this is not happening, probably some > > > subsequent patch broke it). > > > If you remove idling, they will get disk time roughly in > proportion > > > 16:1, i.e. pretty unfair. > > I thought you are talking about a workload with one thread depth 4, > and > > the other thread depth 16. I did some tests here. In an old kernel, > > without the deep seeky idle logic, the threads have disk time in > > proportion 1:5. With it, they get almost equal disk time. SO this > > reaches your goal. In a latest kernel, w/wo the logic, there is no > big > > difference (the 16 depth thread get about 5x more disk time). With > the > > logic, the depth 4 thread gets equal disk time in first several > slices. > > But after an idle expiration(mostly because current block plug hold > > requests in task list and didn't add them to elevator), the queue > never > > gets detected as deep, because the queue dispatch request one by > one. 
So > > the logic is already broken for some time (maybe since block plug is > > added). > Could be that dispatching requests one by one is harming the > performance, then? Not really. Say 4 requests are running; the task dispatches a new request after a previous request completes. Requests are dispatched one by one, but there are still 4 requests running at any time. Checking the in_flight requests is more precise for the deep detection. Thanks, Shaohua ^ permalink raw reply [flat|nested] 19+ messages in thread
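The scenario Shaohua describes can be illustrated with a toy userspace model (hypothetical names; not CFQ code): a task that tops up its depth one request at a time never accumulates 4 requests in the elevator at once, so a "queued >= 4" deep test never fires, even though queued plus in-flight requests stays at the real depth.

```c
#include <assert.h>

/* Toy model: a task maintains a target depth by submitting one new
 * request whenever one completes, and the drive accepts each request
 * as soon as it is queued.  We track the peak number of requests
 * sitting queued in the elevator vs. the peak of queued + in-flight. */

struct depth_sample {
    int max_queued; /* peak requests queued in the elevator */
    int max_total;  /* peak queued + in-flight */
};

static struct depth_sample simulate_one_by_one(int depth, int completions)
{
    struct depth_sample s = {0, 0};
    int in_flight = 0, queued = 0;

    /* ramp up to the target depth, one request at a time */
    for (int i = 0; i < depth; i++) {
        queued++;
        if (queued > s.max_queued)
            s.max_queued = queued;
        if (queued + in_flight > s.max_total)
            s.max_total = queued + in_flight;
        queued--;       /* dispatched to the drive immediately */
        in_flight++;
    }

    /* steady state: one completion, one replacement at a time */
    for (int i = 0; i < completions; i++) {
        in_flight--;    /* a request finishes */
        queued++;       /* the task submits a replacement */
        if (queued > s.max_queued)
            s.max_queued = queued;
        if (queued + in_flight > s.max_total)
            s.max_total = queued + in_flight;
        queued--;       /* dispatched again */
        in_flight++;
    }
    return s;
}
```

At depth 4, the elevator never sees more than one queued request at a time, while queued + in-flight steadily reports the true depth of 4.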
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-26 0:55 ` Shaohua Li @ 2011-09-27 6:07 ` Corrado Zoccolo 2011-09-27 6:33 ` Shaohua Li 0 siblings, 1 reply; 19+ messages in thread From: Corrado Zoccolo @ 2011-09-27 6:07 UTC (permalink / raw) To: Shaohua Li; +Cc: Vivek Goyal, Maxim Patlasov, Jens Axboe, lkml On Mon, Sep 26, 2011 at 2:55 AM, Shaohua Li <shaohua.li@intel.com> wrote: > On Fri, 2011-09-23 at 13:50 +0800, Corrado Zoccolo wrote: >> Il giorno 21/set/2011 13:16, "Shaohua Li" <shaohua.li@intel.com> ha >> scritto: >> > >> > On Sat, 2011-09-17 at 03:25 +0800, Corrado Zoccolo wrote: >> > > On Fri, Sep 16, 2011 at 8:40 AM, Shaohua Li <shaohua.li@intel.com> >> wrote: >> > > > On Fri, 2011-09-16 at 14:04 +0800, Corrado Zoccolo wrote: >> > > >> On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li >> <shaohua.li@intel.com> wrote: >> > > >> > Recently Maxim and I discussed why his aiostress workload >> performs poorly. If >> > > >> > you didn't follow the discussion, here are the issues we >> found: >> > > >> > 1. cfq seeky dection isn't good. Assume a task accesses >> sector A, B, C, D, A+1, >> > > >> > B+1, C+1, D+1, A+2...Accessing A, B, C, D is random. cfq will >> detect the queue >> > > >> > as seeky, but since when accessing A+1, A+1 is already in >> disk cache, this >> > > >> > should be detected as sequential really. Not sure if any real >> workload has such >> > > >> > access patern, and seems not easy to have a clean fix too. >> Any idea for this? >> > > >> >> > > >> Not all disks will cache 4 independent streams, we can't make >> that >> > > >> assumption in cfq. >> > > > sure thing. we can't make such assumption. I'm thinking if we >> should >> > > > move the seeky detection in request finish. If time between two >> requests >> > > > finish is short, we thought the queue is sequential. This will >> make the >> > > > detection adaptive. but seems time measurement isn't easy. 
>> > > > >> > > >> The current behaviour of assuming it as seeky should work well >> enough, >> > > >> in fact it will be put in the seeky tree, and it can enjoy the >> seeky >> > > >> tree quantum of time. If the second round takes a short time, >> it will >> > > >> be able to schedule a third round again after the idle time. >> > > >> If there are other seeky processes competing for the tree, the >> cache >> > > >> can be cleared by the time it gets back to your 4 streams >> process, so >> > > >> it will behave exactly as a seeky process from cfq point of >> view. >> > > >> If the various accesses were submitted in parallel, the deep >> seeky >> > > >> queue logic should kick in and make sure the process gets a >> sequential >> > > >> quantum, rather than sharing it with other seeky processes, so >> > > >> depending on your disk, it could perform better. >> > > > yes, the idle logic makes it ok, but sounds like "make things >> wrong >> > > > first (in seeky detection) and then fix it later (the idle >> logic)". >> > > > >> > > >> > 2. deep seeky queue idle. This makes raid performs poorly. I >> would think we >> > > >> > revert the logic. Deep queue is more popular with high end >> hardware. In such >> > > >> > hardware, we'd better not do idle. >> > > >> > Note, currently we set a queue's slice after the first >> request is finished. >> > > >> > This means the drive already idles a little time. If the >> queue is truely deep, >> > > >> > new requests should already come in, so idle isn't required. >> > > > What did you think about this? Assume seeky request takes long >> time, so >> > > > the queue is already idling for a little time. >> > > I don't think I understand. If cfq doesn't idle, it will dispatch >> an >> > > other request from the same or an other queue (if present) >> > > immediately, until all possible in-flight requests are sent. Now, >> you >> > > depend on NCQ for the order requests are handled, so you cannot >> > > guarantee fairness any more. 
>> > > >> > > > >> > > >> > Looks Vivek used to post a patch to rever it, but it gets >> ignored. >> > > >> > >> http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html >> > > >> I get a 404 here. I think you are seeing only one half of the >> medal. >> > > >> That logic is there mainly to ensure fairness between deep >> seeky >> > > >> processes and normal seeky processes that want low latency. >> > > > didn't understand it. The logic doesn't protect non-deep >> process. how >> > > > could it make the normal seeky process have low latency? or did >> you have >> > > > a test case for this, so I can analyze? >> > > > I tried a workload with one task drives depth 4 and one task >> drives >> > > > depth 16. Appears the behavior isn't changed w/wo the logic. >> > sorry for the delay. >> > >> > > Try a workload with one shallow seeky queue and one deep (16) one, >> on >> > > a single spindle NCQ disk. >> > > I think the behaviour when I submitted my patch was that both were >> > > getting 100ms slice (if this is not happening, probably some >> > > subsequent patch broke it). >> > > If you remove idling, they will get disk time roughly in >> proportion >> > > 16:1, i.e. pretty unfair. >> > I thought you are talking about a workload with one thread depth 4, >> and >> > the other thread depth 16. I did some tests here. In an old kernel, >> > without the deep seeky idle logic, the threads have disk time in >> > proportion 1:5. With it, they get almost equal disk time. SO this >> > reaches your goal. In a latest kernel, w/wo the logic, there is no >> big >> > difference (the 16 depth thread get about 5x more disk time). With >> the >> > logic, the depth 4 thread gets equal disk time in first several >> slices. >> > But after an idle expiration(mostly because current block plug hold >> > requests in task list and didn't add them to elevator), the queue >> never >> > gets detected as deep, because the queue dispatch request one by >> one. 
So >> > the logic is already broken for some time (maybe since block plug is >> > added). >> Could be that dispatching requests one by one is harming the >> performance, then? > Not really. Say 4 requests are running, the task dispatches a request > after one previous request is completed. requests are dispatching one by > one but there are still 4 requests running at any time. Checking the > in_flight requests are more precise for the deep detection. > What happens if there are 4 tasks, each of which could dispatch 4 requests in parallel? Will we reach and sustain 16 in-flight requests, or will it bounce around 4 in flight? I think here we could get a big difference. Probably it is better to move the deep queue detection logic into the per-task queue? Then cfq will decide whether it should dispatch a few requests from every task (shallow case) or all requests from a single task (deep), and then idle. Thanks Corrado ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-27 6:07 ` Corrado Zoccolo @ 2011-09-27 6:33 ` Shaohua Li 2011-09-28 7:09 ` Corrado Zoccolo 0 siblings, 1 reply; 19+ messages in thread From: Shaohua Li @ 2011-09-27 6:33 UTC (permalink / raw) To: Corrado Zoccolo; +Cc: Vivek Goyal, Maxim Patlasov, Jens Axboe, lkml On Tue, 2011-09-27 at 14:07 +0800, Corrado Zoccolo wrote: > On Mon, Sep 26, 2011 at 2:55 AM, Shaohua Li <shaohua.li@intel.com> wrote: > > On Fri, 2011-09-23 at 13:50 +0800, Corrado Zoccolo wrote: > >> Il giorno 21/set/2011 13:16, "Shaohua Li" <shaohua.li@intel.com> ha > >> scritto: > >> > > >> > On Sat, 2011-09-17 at 03:25 +0800, Corrado Zoccolo wrote: > >> > > On Fri, Sep 16, 2011 at 8:40 AM, Shaohua Li <shaohua.li@intel.com> > >> wrote: > >> > > > On Fri, 2011-09-16 at 14:04 +0800, Corrado Zoccolo wrote: > >> > > >> On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li > >> <shaohua.li@intel.com> wrote: > >> > > >> > Recently Maxim and I discussed why his aiostress workload > >> performs poorly. If > >> > > >> > you didn't follow the discussion, here are the issues we > >> found: > >> > > >> > 1. cfq seeky dection isn't good. Assume a task accesses > >> sector A, B, C, D, A+1, > >> > > >> > B+1, C+1, D+1, A+2...Accessing A, B, C, D is random. cfq will > >> detect the queue > >> > > >> > as seeky, but since when accessing A+1, A+1 is already in > >> disk cache, this > >> > > >> > should be detected as sequential really. Not sure if any real > >> workload has such > >> > > >> > access patern, and seems not easy to have a clean fix too. > >> Any idea for this? > >> > > >> > >> > > >> Not all disks will cache 4 independent streams, we can't make > >> that > >> > > >> assumption in cfq. > >> > > > sure thing. we can't make such assumption. I'm thinking if we > >> should > >> > > > move the seeky detection in request finish. If time between two > >> requests > >> > > > finish is short, we thought the queue is sequential. 
This will > >> make the > >> > > > detection adaptive. but seems time measurement isn't easy. > >> > > > > >> > > >> The current behaviour of assuming it as seeky should work well > >> enough, > >> > > >> in fact it will be put in the seeky tree, and it can enjoy the > >> seeky > >> > > >> tree quantum of time. If the second round takes a short time, > >> it will > >> > > >> be able to schedule a third round again after the idle time. > >> > > >> If there are other seeky processes competing for the tree, the > >> cache > >> > > >> can be cleared by the time it gets back to your 4 streams > >> process, so > >> > > >> it will behave exactly as a seeky process from cfq point of > >> view. > >> > > >> If the various accesses were submitted in parallel, the deep > >> seeky > >> > > >> queue logic should kick in and make sure the process gets a > >> sequential > >> > > >> quantum, rather than sharing it with other seeky processes, so > >> > > >> depending on your disk, it could perform better. > >> > > > yes, the idle logic makes it ok, but sounds like "make things > >> wrong > >> > > > first (in seeky detection) and then fix it later (the idle > >> logic)". > >> > > > > >> > > >> > 2. deep seeky queue idle. This makes raid performs poorly. I > >> would think we > >> > > >> > revert the logic. Deep queue is more popular with high end > >> hardware. In such > >> > > >> > hardware, we'd better not do idle. > >> > > >> > Note, currently we set a queue's slice after the first > >> request is finished. > >> > > >> > This means the drive already idles a little time. If the > >> queue is truely deep, > >> > > >> > new requests should already come in, so idle isn't required. > >> > > > What did you think about this? Assume seeky request takes long > >> time, so > >> > > > the queue is already idling for a little time. > >> > > I don't think I understand. 
If cfq doesn't idle, it will dispatch > >> an > >> > > other request from the same or an other queue (if present) > >> > > immediately, until all possible in-flight requests are sent. Now, > >> you > >> > > depend on NCQ for the order requests are handled, so you cannot > >> > > guarantee fairness any more. > >> > > > >> > > > > >> > > >> > Looks Vivek used to post a patch to rever it, but it gets > >> ignored. > >> > > >> > > >> http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html > >> > > >> I get a 404 here. I think you are seeing only one half of the > >> medal. > >> > > >> That logic is there mainly to ensure fairness between deep > >> seeky > >> > > >> processes and normal seeky processes that want low latency. > >> > > > didn't understand it. The logic doesn't protect non-deep > >> process. how > >> > > > could it make the normal seeky process have low latency? or did > >> you have > >> > > > a test case for this, so I can analyze? > >> > > > I tried a workload with one task drives depth 4 and one task > >> drives > >> > > > depth 16. Appears the behavior isn't changed w/wo the logic. > >> > sorry for the delay. > >> > > >> > > Try a workload with one shallow seeky queue and one deep (16) one, > >> on > >> > > a single spindle NCQ disk. > >> > > I think the behaviour when I submitted my patch was that both were > >> > > getting 100ms slice (if this is not happening, probably some > >> > > subsequent patch broke it). > >> > > If you remove idling, they will get disk time roughly in > >> proportion > >> > > 16:1, i.e. pretty unfair. > >> > I thought you are talking about a workload with one thread depth 4, > >> and > >> > the other thread depth 16. I did some tests here. In an old kernel, > >> > without the deep seeky idle logic, the threads have disk time in > >> > proportion 1:5. With it, they get almost equal disk time. SO this > >> > reaches your goal. 
In a latest kernel, w/wo the logic, there is no > >> big > >> > difference (the 16 depth thread get about 5x more disk time). With > >> the > >> > logic, the depth 4 thread gets equal disk time in first several > >> slices. > >> > But after an idle expiration(mostly because current block plug hold > >> > requests in task list and didn't add them to elevator), the queue > >> never > >> > gets detected as deep, because the queue dispatch request one by > >> one. So > >> > the logic is already broken for some time (maybe since block plug is > >> > added). > >> Could be that dispatching requests one by one is harming the > >> performance, then? > > Not really. Say 4 requests are running, the task dispatches a request > > after one previous request is completed. requests are dispatching one by > > one but there are still 4 requests running at any time. Checking the > > in_flight requests are more precise for the deep detection. > > > What happens if there are 4 tasks, all that could dispatch 4 requests > in parallel? Will we reach and sustain 16 in flight requests, or it > will bounce around 4 in flight? I think here we could get a big > difference. Ah, yes, we really should change if (cfqq->queued[0] + cfqq->queued[1] >= 4) cfq_mark_cfqq_deep(cfqq); to if (cfqq->queued[0] + cfqq->queued[1] + cfqq->dispatched >= 4) cfq_mark_cfqq_deep(cfqq); This is a bug, though it isn't related to the original raid issue. > Probably it is better to move the deep queue detection logic in the > per-task queue? It's per-task queue currently. > Then cfq will decide if it should dispatch few requests from every > task (shallow case) or all requests from a single task (deep), and > then idle. I don't get your point. Do you mean doing the deep detection considering all tasks? Thanks, Shaohua ^ permalink raw reply [flat|nested] 19+ messages in thread
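A minimal userspace sketch of the corrected check (the struct is a stand-in, not the real `cfq_queue` definition): counting requests already dispatched to the drive along with those still queued in the elevator lets a queue that keeps its requests in flight still be marked deep.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the relevant fields of cfq_queue (hypothetical). */
struct fake_cfqq {
    int queued[2];  /* requests still sitting in the elevator */
    int dispatched; /* requests already in flight at the drive */
};

#define CFQQ_DEEP_THRESHOLD 4

/* Original condition: only counts requests queued in the elevator. */
static bool old_deep_check(const struct fake_cfqq *q)
{
    return q->queued[0] + q->queued[1] >= CFQQ_DEEP_THRESHOLD;
}

/* Proposed fix: also count requests already dispatched. */
static bool new_deep_check(const struct fake_cfqq *q)
{
    return q->queued[0] + q->queued[1] + q->dispatched
           >= CFQQ_DEEP_THRESHOLD;
}
```

For a queue with one request queued and three in flight (the one-by-one dispatch pattern at depth 4), the old check misses it while the new check marks it deep; a genuinely shallow queue is still not marked.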
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-27 6:33 ` Shaohua Li @ 2011-09-28 7:09 ` Corrado Zoccolo 0 siblings, 0 replies; 19+ messages in thread From: Corrado Zoccolo @ 2011-09-28 7:09 UTC (permalink / raw) To: Shaohua Li; +Cc: Vivek Goyal, Maxim Patlasov, Jens Axboe, lkml On Tue, Sep 27, 2011 at 8:33 AM, Shaohua Li <shaohua.li@intel.com> wrote: > On Tue, 2011-09-27 at 14:07 +0800, Corrado Zoccolo wrote: >> On Mon, Sep 26, 2011 at 2:55 AM, Shaohua Li <shaohua.li@intel.com> wrote: >> > On Fri, 2011-09-23 at 13:50 +0800, Corrado Zoccolo wrote: >> >> Il giorno 21/set/2011 13:16, "Shaohua Li" <shaohua.li@intel.com> ha >> >> scritto: >> >> > >> >> > On Sat, 2011-09-17 at 03:25 +0800, Corrado Zoccolo wrote: >> >> > > On Fri, Sep 16, 2011 at 8:40 AM, Shaohua Li <shaohua.li@intel.com> >> >> wrote: >> >> > > > On Fri, 2011-09-16 at 14:04 +0800, Corrado Zoccolo wrote: >> >> > > >> On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li >> >> <shaohua.li@intel.com> wrote: >> >> > > >> > Recently Maxim and I discussed why his aiostress workload >> >> performs poorly. If >> >> > > >> > you didn't follow the discussion, here are the issues we >> >> found: >> >> > > >> > 1. cfq seeky dection isn't good. Assume a task accesses >> >> sector A, B, C, D, A+1, >> >> > > >> > B+1, C+1, D+1, A+2...Accessing A, B, C, D is random. cfq will >> >> detect the queue >> >> > > >> > as seeky, but since when accessing A+1, A+1 is already in >> >> disk cache, this >> >> > > >> > should be detected as sequential really. Not sure if any real >> >> workload has such >> >> > > >> > access patern, and seems not easy to have a clean fix too. >> >> Any idea for this? >> >> > > >> >> >> > > >> Not all disks will cache 4 independent streams, we can't make >> >> that >> >> > > >> assumption in cfq. >> >> > > > sure thing. we can't make such assumption. I'm thinking if we >> >> should >> >> > > > move the seeky detection in request finish. 
If time between two >> >> requests >> >> > > > finish is short, we thought the queue is sequential. This will >> >> make the >> >> > > > detection adaptive. but seems time measurement isn't easy. >> >> > > > >> >> > > >> The current behaviour of assuming it as seeky should work well >> >> enough, >> >> > > >> in fact it will be put in the seeky tree, and it can enjoy the >> >> seeky >> >> > > >> tree quantum of time. If the second round takes a short time, >> >> it will >> >> > > >> be able to schedule a third round again after the idle time. >> >> > > >> If there are other seeky processes competing for the tree, the >> >> cache >> >> > > >> can be cleared by the time it gets back to your 4 streams >> >> process, so >> >> > > >> it will behave exactly as a seeky process from cfq point of >> >> view. >> >> > > >> If the various accesses were submitted in parallel, the deep >> >> seeky >> >> > > >> queue logic should kick in and make sure the process gets a >> >> sequential >> >> > > >> quantum, rather than sharing it with other seeky processes, so >> >> > > >> depending on your disk, it could perform better. >> >> > > > yes, the idle logic makes it ok, but sounds like "make things >> >> wrong >> >> > > > first (in seeky detection) and then fix it later (the idle >> >> logic)". >> >> > > > >> >> > > >> > 2. deep seeky queue idle. This makes raid performs poorly. I >> >> would think we >> >> > > >> > revert the logic. Deep queue is more popular with high end >> >> hardware. In such >> >> > > >> > hardware, we'd better not do idle. >> >> > > >> > Note, currently we set a queue's slice after the first >> >> request is finished. >> >> > > >> > This means the drive already idles a little time. If the >> >> queue is truely deep, >> >> > > >> > new requests should already come in, so idle isn't required. >> >> > > > What did you think about this? Assume seeky request takes long >> >> time, so >> >> > > > the queue is already idling for a little time. 
>> >> > > I don't think I understand. If cfq doesn't idle, it will dispatch >> >> an >> >> > > other request from the same or an other queue (if present) >> >> > > immediately, until all possible in-flight requests are sent. Now, >> >> you >> >> > > depend on NCQ for the order requests are handled, so you cannot >> >> > > guarantee fairness any more. >> >> > > >> >> > > > >> >> > > >> > Looks Vivek used to post a patch to rever it, but it gets >> >> ignored. >> >> > > >> > >> >> http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html >> >> > > >> I get a 404 here. I think you are seeing only one half of the >> >> medal. >> >> > > >> That logic is there mainly to ensure fairness between deep >> >> seeky >> >> > > >> processes and normal seeky processes that want low latency. >> >> > > > didn't understand it. The logic doesn't protect non-deep >> >> process. how >> >> > > > could it make the normal seeky process have low latency? or did >> >> you have >> >> > > > a test case for this, so I can analyze? >> >> > > > I tried a workload with one task drives depth 4 and one task >> >> drives >> >> > > > depth 16. Appears the behavior isn't changed w/wo the logic. >> >> > sorry for the delay. >> >> > >> >> > > Try a workload with one shallow seeky queue and one deep (16) one, >> >> on >> >> > > a single spindle NCQ disk. >> >> > > I think the behaviour when I submitted my patch was that both were >> >> > > getting 100ms slice (if this is not happening, probably some >> >> > > subsequent patch broke it). >> >> > > If you remove idling, they will get disk time roughly in >> >> proportion >> >> > > 16:1, i.e. pretty unfair. >> >> > I thought you are talking about a workload with one thread depth 4, >> >> and >> >> > the other thread depth 16. I did some tests here. In an old kernel, >> >> > without the deep seeky idle logic, the threads have disk time in >> >> > proportion 1:5. With it, they get almost equal disk time. 
SO this >> >> > reaches your goal. In a latest kernel, w/wo the logic, there is no >> >> big >> >> > difference (the 16 depth thread get about 5x more disk time). With >> >> the >> >> > logic, the depth 4 thread gets equal disk time in first several >> >> slices. >> >> > But after an idle expiration(mostly because current block plug hold >> >> > requests in task list and didn't add them to elevator), the queue >> >> never >> >> > gets detected as deep, because the queue dispatch request one by >> >> one. So >> >> > the logic is already broken for some time (maybe since block plug is >> >> > added). >> >> Could be that dispatching requests one by one is harming the >> >> performance, then? >> > Not really. Say 4 requests are running, the task dispatches a request >> > after one previous request is completed. requests are dispatching one by >> > one but there are still 4 requests running at any time. Checking the >> > in_flight requests are more precise for the deep detection. >> > >> What happens if there are 4 tasks, all that could dispatch 4 requests >> in parallel? Will we reach and sustain 16 in flight requests, or it >> will bounce around 4 in flight? I think here we could get a big >> difference. > ah, yes, we really should change > if (cfqq->queued[0] + cfqq->queued[1] >= 4) > cfq_mark_cfqq_deep(cfqq); > to > if (cfqq->queued[0] + cfqq->queued[1] + cfqq->cfqq->dispatched >= 4) > cfq_mark_cfqq_deep(cfqq); > this is a bug, though this isn't related to the original raid issue. > >> Probably it is better to move the deep queue detection logic in the >> per-task queue? > it's per-task queue currently. > >> Then cfq will decide if it should dispatch few requests from every >> task (shallow case) or all requests from a single task (deep), and >> then idle. > don't get your point. detect the deep logic considering all tasks? 
No no, I was just saying that if no queue is marked deep, then they will all end up on the seeky tree, so cfq will dispatch requests from each of them before idling. This is the ideal case for raid, I think, so you only need to identify the sweet spot at which you prefer to insulate a seeky queue (mark it as deep) from the others, for the reasons that Vivek also points out, i.e. that a single queue, even a seeky one, could have its requests more concentrated, so the seek time could be shorter. Thanks, Corrado > Thanks, > Shaohua > > ^ permalink raw reply [flat|nested] 19+ messages in thread
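The trade-off described above, that a single seeky queue may still have its requests relatively concentrated on disk, can be illustrated with a toy seek-cost model (hypothetical sector numbers; not CFQ code): serving one task's requests back to back can cost far less total seek distance than interleaving two tasks whose working sets are far apart.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model: total seek distance for a given service order, taken as
 * the sum of absolute sector-to-sector jumps.  Real drive seek time
 * is not linear in distance, but the comparison still holds. */
static long seek_cost(const long *sectors, int n)
{
    long cost = 0;
    for (int i = 1; i < n; i++)
        cost += labs(sectors[i] - sectors[i - 1]);
    return cost;
}
```

With task A's requests clustered near sector 1000 and task B's near 900000, grouping each task's requests (insulating the queue) pays the big seek once, while strict interleaving pays it on nearly every request.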
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-16 6:04 ` Corrado Zoccolo 2011-09-16 6:40 ` Shaohua Li @ 2011-09-16 13:24 ` Vivek Goyal 2011-09-16 13:37 ` Vivek Goyal 2 siblings, 0 replies; 19+ messages in thread From: Vivek Goyal @ 2011-09-16 13:24 UTC (permalink / raw) To: Corrado Zoccolo; +Cc: Shaohua Li, lkml, Jens Axboe, Maxim Patlasov

On Fri, Sep 16, 2011 at 08:04:49AM +0200, Corrado Zoccolo wrote:
[..]
> > 2. deep seeky queue idle. This makes raid perform poorly. I would think
> > we revert the logic. Deep queues are more popular with high-end
> > hardware. On such hardware, we'd better not idle.
> > Note, currently we set a queue's slice after the first request is
> > finished. This means the drive already idles a little time. If the
> > queue is truly deep, new requests should already have come in, so idle
> > isn't required.
> > Looks like Vivek used to post a patch to revert it, but it got ignored.
> > http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html
>
> I get a 404 here. I think you are seeing only one half of the medal.
> That logic is there mainly to ensure fairness between deep seeky
> processes and normal seeky processes that want low latency.
> If you remove that logic, a single process making many parallel aio
> reads could completely swamp one machine, preventing other seeky
> processes from progressing.

I think to tackle that we can expire the queue early once it has
dispatched a few requests (something along the lines of async queues being
expired once they dispatch cfq_prio_to_maxrq() requests). That way one
deep queue will not starve other random queues, and at the same time we
will not introduce per-queue idling for deep queues. Anyway, the deep
queue here is seeky, so idling does not help improve throughput.
> Instead of removing the logic completely, you should make the depth
> configurable, so multi-spindle storage could allow deeper queues
> before switching to the fairness-enforcing policy.

I would think getting rid of the deep queue logic is probably better than
introducing more tunables. We just need to expire the queue after
dispatching a few requests so that we do a good round-robin dispatch among
all the queues on the sync-noidle service tree.

Thanks
Vivek

^ permalink raw reply [flat|nested] 19+ messages in thread
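Vivek's alternative, expiring a queue after it dispatches a few requests so that the sync-noidle tree is serviced round-robin, can be sketched as follows. The struct and the budget constant are hypothetical, chosen only to illustrate the dispatch-cap idea; cfq itself would derive such a cap from io priority, as it does for async queues via `cfq_prio_to_maxrq()`:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-queue state for the sketch. */
struct queue_sketch {
	int pending;          /* requests waiting in this queue */
	int dispatched_total; /* requests dispatched so far */
};

/* Expire each queue after this many dispatches per visit
 * (illustrative value, not cfq's). */
#define DISPATCH_BUDGET 2

/* One round-robin pass over the service tree: each queue gets at most
 * DISPATCH_BUDGET requests, so a deep queue cannot starve shallow ones.
 * Returns the total number of requests dispatched in the pass. */
static int dispatch_round(struct queue_sketch *qs, size_t n)
{
	int total = 0;

	for (size_t i = 0; i < n; i++) {
		int batch = qs[i].pending < DISPATCH_BUDGET ?
			    qs[i].pending : DISPATCH_BUDGET;

		qs[i].pending -= batch;
		qs[i].dispatched_total += batch;
		total += batch;
	}
	return total;
}
```

With queues holding 10, 1 and 1 pending requests, one pass dispatches 2, 1 and 1 respectively: the deep queue needs several passes to drain, but the shallow queues are never starved and no per-queue idling is introduced.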
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-16 6:04 ` Corrado Zoccolo 2011-09-16 6:40 ` Shaohua Li 2011-09-16 13:24 ` Vivek Goyal @ 2011-09-16 13:37 ` Vivek Goyal 2 siblings, 0 replies; 19+ messages in thread From: Vivek Goyal @ 2011-09-16 13:37 UTC (permalink / raw) To: Corrado Zoccolo; +Cc: Shaohua Li, lkml, Jens Axboe, Maxim Patlasov

On Fri, Sep 16, 2011 at 08:04:49AM +0200, Corrado Zoccolo wrote:
> On Fri, Sep 16, 2011 at 5:09 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> > Recently Maxim and I discussed why his aiostress workload performs
> > poorly. If you didn't follow the discussion, here are the issues we
> > found:
> > 1. cfq seeky detection isn't good. Assume a task accesses sector A, B,
> > C, D, A+1, B+1, C+1, D+1, A+2... Accessing A, B, C, D is random. cfq
> > will detect the queue as seeky, but when it accesses A+1, A+1 is
> > already in the disk cache, so this should really be detected as
> > sequential. Not sure if any real workload has such an access pattern,
> > and it seems not easy to find a clean fix either. Any idea for this?
>
> Not all disks will cache 4 independent streams; we can't make that
> assumption in cfq.
> The current behaviour of treating it as seeky should work well enough;
> in fact it will be put on the seeky tree, and it can enjoy the seeky
> tree's quantum of time. If the second round takes a short time, it will
> be able to schedule a third round again after the idle time.
> If there are other seeky processes competing for the tree, the cache
> can be cleared by the time it gets back to your 4-stream process, so
> it will behave exactly as a seeky process from cfq's point of view.
> If the various accesses were submitted in parallel, the deep seeky
> queue logic should kick in and make sure the process gets a sequential
> quantum, rather than sharing it with other seeky processes, so
> depending on your disk, it could perform better.
I think I agree that we probably cannot optimize CFQ for cache behavior
without even knowing what a cache on a device might be doing. There are no
guarantees that by making this 4-stream process sequential you will get
better throughput; in fact, additional idling can kill throughput on
faster storage. It should probably be left to the device cache to optimize
for such IO patterns.

Thanks
Vivek

^ permalink raw reply [flat|nested] 19+ messages in thread
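Why the A, B, C, D, A+1, ... pattern from point 1 gets classified as seeky can be shown with a mean-seek-distance heuristic. This is a simplification in the spirit of cfq's per-queue seek tracking; the threshold value and the bookkeeping are illustrative, not cfq's exact ones:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative cutoff in sectors; cfq uses its own seek threshold. */
#define SEEK_THRESHOLD 8192

/* Classify a request stream as seeky if the mean absolute distance
 * between consecutive request sectors exceeds the threshold. */
static int stream_is_seeky(const long long *sectors, size_t n)
{
	long long total = 0;
	size_t i;

	if (n < 2)
		return 0;
	for (i = 1; i < n; i++)
		total += llabs(sectors[i] - sectors[i - 1]);
	return total / (long long)(n - 1) > SEEK_THRESHOLD;
}
```

Four interleaved streams a million sectors apart are classified seeky even though each stream, taken alone, is strictly sequential and might be served from the disk cache; that is exactly the mismatch the original post describes, and as Vivek notes above, cfq cannot tell the two cases apart without knowing what the device cache is doing.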
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-16 3:09 [patch]cfq-iosched: delete deep seeky queue idle logic Shaohua Li 2011-09-16 6:04 ` Corrado Zoccolo @ 2011-09-16 9:54 ` Tao Ma 2011-09-16 14:08 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: Tao Ma @ 2011-09-16 9:54 UTC (permalink / raw) To: Shaohua Li; +Cc: lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal, Corrado Zoccolo

Hi Shaohua,

On 09/16/2011 11:09 AM, Shaohua Li wrote:
> Recently Maxim and I discussed why his aiostress workload performs
> poorly. If you didn't follow the discussion, here are the issues we
> found:
> 1. cfq seeky detection isn't good. Assume a task accesses sector A, B,
> C, D, A+1, B+1, C+1, D+1, A+2... Accessing A, B, C, D is random. cfq
> will detect the queue as seeky, but when it accesses A+1, A+1 is already
> in the disk cache, so this should really be detected as sequential. Not
> sure if any real workload has such an access pattern, and it seems not
> easy to find a clean fix either. Any idea for this?

This year's FAST has a paper named "A Scheduling Framework That Makes Any
Disk Schedulers Non-Work-Conserving Solely Based on Request
Characteristics". It describes this situation and suggests a new scheduler
named "stream scheduler" to resolve it. But I am not sure whether CFQ can
work like that or not.

Thanks
Tao

> 2. deep seeky queue idle. This makes raid perform poorly. I would think
> we revert the logic. Deep queues are more popular with high-end
> hardware. On such hardware, we'd better not idle.
> Note, currently we set a queue's slice after the first request is
> finished. This means the drive already idles a little time. If the queue
> is truly deep, new requests should already have come in, so idle isn't
> required.
> Looks like Vivek used to post a patch to revert it, but it got ignored.
> http://us.generation-nt.com/patch-cfq-iosched-revert-logic-deep-queues-help-198339681.html > > Signed-off-by: Shaohua Li<shaohua.li@intel.com> > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c > index a33bd43..f75439e 100644 > --- a/block/cfq-iosched.c > +++ b/block/cfq-iosched.c > @@ -334,7 +334,6 @@ enum cfqq_state_flags { > CFQ_CFQQ_FLAG_sync, /* synchronous queue */ > CFQ_CFQQ_FLAG_coop, /* cfqq is shared */ > CFQ_CFQQ_FLAG_split_coop, /* shared cfqq will be splitted */ > - CFQ_CFQQ_FLAG_deep, /* sync cfqq experienced large depth */ > CFQ_CFQQ_FLAG_wait_busy, /* Waiting for next request */ > }; > > @@ -363,7 +362,6 @@ CFQ_CFQQ_FNS(slice_new); > CFQ_CFQQ_FNS(sync); > CFQ_CFQQ_FNS(coop); > CFQ_CFQQ_FNS(split_coop); > -CFQ_CFQQ_FNS(deep); > CFQ_CFQQ_FNS(wait_busy); > #undef CFQ_CFQQ_FNS > > @@ -2375,17 +2373,6 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd) > goto keep_queue; > } > > - /* > - * This is a deep seek queue, but the device is much faster than > - * the queue can deliver, don't idle > - **/ > - if (CFQQ_SEEKY(cfqq) && cfq_cfqq_idle_window(cfqq) && > - (cfq_cfqq_slice_new(cfqq) || > - (cfqq->slice_end - jiffies > jiffies - cfqq->slice_start))) { > - cfq_clear_cfqq_deep(cfqq); > - cfq_clear_cfqq_idle_window(cfqq); > - } > - > if (cfqq->dispatched && cfq_should_idle(cfqd, cfqq)) { > cfqq = NULL; > goto keep_queue; > @@ -3298,13 +3285,10 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq, > > enable_idle = old_idle = cfq_cfqq_idle_window(cfqq); > > - if (cfqq->queued[0] + cfqq->queued[1] >= 4) > - cfq_mark_cfqq_deep(cfqq); > - > if (cfqq->next_rq && (cfqq->next_rq->cmd_flags & REQ_NOIDLE)) > enable_idle = 0; > else if (!atomic_read(&cic->ioc->nr_tasks) || !cfqd->cfq_slice_idle || > - (!cfq_cfqq_deep(cfqq) && CFQQ_SEEKY(cfqq))) > + CFQQ_SEEKY(cfqq)) > enable_idle = 0; > else if (sample_valid(cic->ttime.ttime_samples)) { > if (cic->ttime.ttime_mean > cfqd->cfq_slice_idle) > @@ -3874,11 +3858,6 @@ static 
void cfq_idle_slice_timer(unsigned long data) > */ > if (!RB_EMPTY_ROOT(&cfqq->sort_list)) > goto out_kick; > - > - /* > - * Queue depth flag is reset only when the idle didn't succeed > - */ > - cfq_clear_cfqq_deep(cfqq); > } > expire: > cfq_slice_expired(cfqd, timed_out); > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-16 9:54 ` Tao Ma @ 2011-09-16 14:08 ` Christoph Hellwig 2011-09-16 14:50 ` Tao Ma 0 siblings, 1 reply; 19+ messages in thread From: Christoph Hellwig @ 2011-09-16 14:08 UTC (permalink / raw) To: Tao Ma Cc: Shaohua Li, lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal, Corrado Zoccolo

On Fri, Sep 16, 2011 at 05:54:51PM +0800, Tao Ma wrote:
> This year's FAST has a paper named "A Scheduling Framework That Makes
> Any Disk Schedulers Non-Work-Conserving Solely Based on Request
> Characteristics". It has described this situation and suggests a new
> scheduler named "stream scheduler" to resolve this. But I am not sure
> whether CFQ can work like that or not.

As usual I suspect the best thing is to just use noop for these kinds of
cases. E.g. when you use xfs with the filestreams option you'll get
patterns pretty similar to those in the initial post - that is intentional,
as it is generally used to place them into different areas of a complex
RAID array. Any scheduler "smarts" will just help to break these I/O
streams.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [patch]cfq-iosched: delete deep seeky queue idle logic 2011-09-16 14:08 ` Christoph Hellwig @ 2011-09-16 14:50 ` Tao Ma 0 siblings, 0 replies; 19+ messages in thread From: Tao Ma @ 2011-09-16 14:50 UTC (permalink / raw) To: Christoph Hellwig Cc: Shaohua Li, lkml, Jens Axboe, Maxim Patlasov, Vivek Goyal, Corrado Zoccolo

On 09/16/2011 10:08 PM, Christoph Hellwig wrote:
> On Fri, Sep 16, 2011 at 05:54:51PM +0800, Tao Ma wrote:
>> This year's FAST has a paper named "A Scheduling Framework That Makes
>> Any Disk Schedulers Non-Work-Conserving Solely Based on Request
>> Characteristics". It has described this situation and suggests a new
>> scheduler named "stream scheduler" to resolve this. But I am not sure
>> whether CFQ can work like that or not.
>
> As usual I suspect the best thing is to just use noop for these kinds of
> cases. E.g. when you use xfs with the filestreams options you'll get
> patterns pretty similar to that in the initial post - that is
> intentional as it is generally use to place them into different areas
> of a complex RAID array. Any scheduler "smarts" will just help to break
> these I/O streams.

yeah, actually the paper does show that the performance of cfq is worse
than noop in this case. ;) See section 3.4 if you are interested.

Thanks
Tao

^ permalink raw reply [flat|nested] 19+ messages in thread
Thread overview: 19+ messages
2011-09-16 3:09 [patch]cfq-iosched: delete deep seeky queue idle logic Shaohua Li
2011-09-16 6:04 ` Corrado Zoccolo
2011-09-16 6:40 ` Shaohua Li
2011-09-16 19:25 ` Corrado Zoccolo
2011-09-21 11:16 ` Shaohua Li
2011-09-23 13:24 ` Vivek Goyal
2011-09-25 7:34 ` Corrado Zoccolo
2011-09-27 13:08 ` Vivek Goyal
2011-09-26 0:51 ` Shaohua Li
2011-09-27 13:11 ` Vivek Goyal
[not found] ` <CADX3swq0qURdi7VYLAVbsAmX5psPrzq-uvbqANsnLkHO0xcOMQ@mail.gmail.com>
2011-09-26 0:55 ` Shaohua Li
2011-09-27 6:07 ` Corrado Zoccolo
2011-09-27 6:33 ` Shaohua Li
2011-09-28 7:09 ` Corrado Zoccolo
2011-09-16 13:24 ` Vivek Goyal
2011-09-16 13:37 ` Vivek Goyal
2011-09-16 9:54 ` Tao Ma
2011-09-16 14:08 ` Christoph Hellwig
2011-09-16 14:50 ` Tao Ma