* [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling.
@ 2017-03-26  2:18 sbates
  2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
  2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
  0 siblings, 2 replies; 13+ messages in thread
From: sbates @ 2017-03-26  2:18 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-nvme, Damien.LeMoal, osandov, sbates

From: Stephen Bates <sbates@raithlin.com>

Omar recently developed some patches for block layer stats that use
callbacks to determine which bucket an IO should be accounted in. At
the same time there was discussion at LSF/MM that we might not want
to consider all IO when generating stats for certain algorithms
(e.g. IO completion polling).

This set does two things. It makes the bucket callback for stats
signed, so we can now ignore any IO for which the bucket function
returns a negative value. It then uses this new functionality to
filter IO for the IO completion polling algorithms.

This patchset applies cleanly on a83b576c9c25cf (block: fix stacked
driver stats init and free) in Jens' for-next tree.

I've lightly tested this using QEMU and a real NVMe low-latency
device. I do not have performance numbers yet. Feedback would be
appreciated!

[BTW this is my first submission from an all-new setup, so apologies
if this does not come through correctly!]

Cc: Damien.LeMoal@wdc.com
Cc: osandov@osandov.com

Stephen Bates (2):
  blk-stat: convert blk-stat bucket callback to signed
  blk-stat: add a poll_size value to the request_queue struct

 block/blk-mq.c         | 17 +++++++++++++++--
 block/blk-stat.c       |  8 +++++---
 block/blk-stat.h       |  9 +++++----
 block/blk-sysfs.c      | 30 ++++++++++++++++++++++++++++++
 include/linux/blkdev.h |  1 +
 5 files changed, 56 insertions(+), 9 deletions(-)

-- 
2.7.4
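To make the filtering idea concrete before diving into the patches, here
is a minimal sketch of the convention patch 1 introduces: the bucket
callback returns a signed bucket index, and any negative return tells the
stats code to skip the request entirely. The callback below is
hypothetical; only the signed bucket_fn convention and the
blk_rq_bytes()/rq_data_dir() helpers are taken from the kernel.

/*
 * Hypothetical bucket callback using the signed convention from
 * patch 1: return a bucket index >= 0 to account the request, or a
 * negative value so blk_stat_add() ignores it.
 */
static int example_small_io_bucket(const struct request *rq)
{
	/* Only account IO of 4KB or less, one bucket per direction. */
	if (blk_rq_bytes(rq) > 4096)
		return -1;		/* not counted in the stats */
	return rq_data_dir(rq);		/* 0 == READ, 1 == WRITE */
}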
* [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-26  2:18 [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling sbates
@ 2017-03-26  2:18 ` sbates
  2017-03-26  2:49   ` Omar Sandoval
  2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
  1 sibling, 1 reply; 13+ messages in thread
From: sbates @ 2017-03-26  2:18 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-nvme, Damien.LeMoal, osandov, sbates

From: Stephen Bates <sbates@raithlin.com>

In order to allow filtering of IO based on properties of the request
other than data direction, allow the bucket function to return an int.

If the bucket callback returns a negative value, do not count that
request in the stats accumulation.

Signed-off-by: Stephen Bates <sbates@raithlin.com>
---
 block/blk-stat.c | 8 +++++---
 block/blk-stat.h | 9 +++++----
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/block/blk-stat.c b/block/blk-stat.c
index 188b535..936adfb 100644
--- a/block/blk-stat.c
+++ b/block/blk-stat.c
@@ -17,9 +17,9 @@ struct blk_queue_stats {
 	spinlock_t lock;
 };
 
-unsigned int blk_stat_rq_ddir(const struct request *rq)
+int blk_stat_rq_ddir(const struct request *rq)
 {
-	return rq_data_dir(rq);
+	return (int)rq_data_dir(rq);
 }
 EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
 
@@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
 	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
 		if (blk_stat_is_active(cb)) {
 			bucket = cb->bucket_fn(rq);
+			if (bucket < 0)
+				continue;
 			stat = &this_cpu_ptr(cb->cpu_stat)[bucket];
 			__blk_stat_add(stat, value);
 		}
@@ -131,7 +133,7 @@ static void blk_stat_timer_fn(unsigned long data)
 
 struct blk_stat_callback *
 blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
-			unsigned int (*bucket_fn)(const struct request *),
+			int (*bucket_fn)(const struct request *),
 			unsigned int buckets, void *data)
 {
 	struct blk_stat_callback *cb;
diff --git a/block/blk-stat.h b/block/blk-stat.h
index 6ad5b8c..7417805 100644
--- a/block/blk-stat.h
+++ b/block/blk-stat.h
@@ -41,9 +41,10 @@ struct blk_stat_callback {
 
 	/**
 	 * @bucket_fn: Given a request, returns which statistics bucket it
-	 * should be accounted under.
+	 * should be accounted under. Return -1 for no bucket for this
+	 * request.
 	 */
-	unsigned int (*bucket_fn)(const struct request *);
+	int (*bucket_fn)(const struct request *);
 
 	/**
 	 * @buckets: Number of statistics buckets.
@@ -98,7 +99,7 @@ static inline u64 blk_stat_time(struct blk_issue_stat *stat)
  *
  * Return: Data direction of the request, either READ or WRITE.
  */
-unsigned int blk_stat_rq_ddir(const struct request *rq);
+int blk_stat_rq_ddir(const struct request *rq);
 
 /**
  * blk_stat_alloc_callback() - Allocate a block statistics callback.
@@ -113,7 +114,7 @@ unsigned int blk_stat_rq_ddir(const struct request *rq);
  */
 struct blk_stat_callback *
 blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
-			unsigned int (*bucket_fn)(const struct request *),
+			int (*bucket_fn)(const struct request *),
 			unsigned int buckets, void *data);
 
 /**
-- 
2.7.4
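For context, a sketch of how a consumer might pair a filtering bucket
function with the callback API this patch modifies. The two callback
names are placeholders; blk_stat_alloc_callback() and its signature come
from the patch above.

/*
 * Illustrative allocation of a stats callback whose (now signed)
 * bucket_fn may return a negative value to leave a request out of
 * the accounting. my_timer_fn and my_bucket_fn are assumed names.
 */
struct blk_stat_callback *cb;

cb = blk_stat_alloc_callback(my_timer_fn,	/* runs when the window expires */
			     my_bucket_fn,	/* may return < 0 to skip a request */
			     2,			/* one bucket per data direction */
			     q);		/* opaque data, here the queue */
if (!cb)
	goto err_exit;			/* allocation failed */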
* Re: [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
@ 2017-03-26  2:49   ` Omar Sandoval
  2017-03-27 16:00     ` Stephen Bates
  0 siblings, 1 reply; 13+ messages in thread
From: Omar Sandoval @ 2017-03-26  2:49 UTC (permalink / raw)
  To: sbates; +Cc: axboe, linux-block, linux-nvme, Damien.LeMoal

Hey, Stephen,

On Sat, Mar 25, 2017 at 08:18:11PM -0600, sbates@raithlin.com wrote:
> From: Stephen Bates <sbates@raithlin.com>
>
> In order to allow filtering of IO based on properties of the request
> other than data direction, allow the bucket function to return an int.
>
> If the bucket callback returns a negative value, do not count that
> request in the stats accumulation.

This is great, I had it this way at first but didn't know if anyone
would want to omit stats in this way. A couple of comments below.

> Signed-off-by: Stephen Bates <sbates@raithlin.com>
> ---
>  block/blk-stat.c | 8 +++++---
>  block/blk-stat.h | 9 +++++----
>  2 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/block/blk-stat.c b/block/blk-stat.c
> index 188b535..936adfb 100644
> --- a/block/blk-stat.c
> +++ b/block/blk-stat.c
> @@ -17,9 +17,9 @@ struct blk_queue_stats {
>  	spinlock_t lock;
>  };
>
> -unsigned int blk_stat_rq_ddir(const struct request *rq)
> +int blk_stat_rq_ddir(const struct request *rq)
>  {
> -	return rq_data_dir(rq);
> +	return (int)rq_data_dir(rq);

The cast here isn't necessary, let's leave it off.

>  }
>  EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
>
> @@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
>  	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
>  		if (blk_stat_is_active(cb)) {
>  			bucket = cb->bucket_fn(rq);
> +			if (bucket < 0)
> +				continue;

You also need to change the declaration of bucket to int above.

>  			stat = &this_cpu_ptr(cb->cpu_stat)[bucket];
>  			__blk_stat_add(stat, value);
>  		}
> @@ -131,7 +133,7 @@ static void blk_stat_timer_fn(unsigned long data)
>
>  struct blk_stat_callback *
>  blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
> -			unsigned int (*bucket_fn)(const struct request *),
> +			int (*bucket_fn)(const struct request *),
>  			unsigned int buckets, void *data)
>  {
>  	struct blk_stat_callback *cb;
> diff --git a/block/blk-stat.h b/block/blk-stat.h
> index 6ad5b8c..7417805 100644
> --- a/block/blk-stat.h
> +++ b/block/blk-stat.h
> @@ -41,9 +41,10 @@ struct blk_stat_callback {
>
>  	/**
>  	 * @bucket_fn: Given a request, returns which statistics bucket it
> -	 * should be accounted under.
> +	 * should be accounted under. Return -1 for no bucket for this
> +	 * request.
>  	 */
> -	unsigned int (*bucket_fn)(const struct request *);
> +	int (*bucket_fn)(const struct request *);
>
>  	/**
>  	 * @buckets: Number of statistics buckets.
> @@ -98,7 +99,7 @@ static inline u64 blk_stat_time(struct blk_issue_stat *stat)
>   *
>   * Return: Data direction of the request, either READ or WRITE.
>   */
> -unsigned int blk_stat_rq_ddir(const struct request *rq);
> +int blk_stat_rq_ddir(const struct request *rq);
>
>  /**
>   * blk_stat_alloc_callback() - Allocate a block statistics callback.
> @@ -113,7 +114,7 @@ unsigned int blk_stat_rq_ddir(const struct request *rq);
>   */
>  struct blk_stat_callback *
>  blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
> -			unsigned int (*bucket_fn)(const struct request *),
> +			int (*bucket_fn)(const struct request *),
>  			unsigned int buckets, void *data);
>
>  /**
> -- 
> 2.7.4
* Re: [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-26  2:49   ` Omar Sandoval
@ 2017-03-27 16:00     ` Stephen Bates
  2017-03-27 16:01       ` Omar Sandoval
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Bates @ 2017-03-27 16:00 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org, Damien.LeMoal@wdc.com

Thanks for the review Omar!

>>
>> -unsigned int blk_stat_rq_ddir(const struct request *rq)
>> +int blk_stat_rq_ddir(const struct request *rq)
>>  {
>> -	return rq_data_dir(rq);
>> +	return (int)rq_data_dir(rq);
>
> The cast here isn't necessary, let's leave it off.

OK, I will fix that in the respin!

>>  }
>>  EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
>>
>> @@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
>>  	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
>>  		if (blk_stat_is_active(cb)) {
>>  			bucket = cb->bucket_fn(rq);
>> +			if (bucket < 0)
>> +				continue;
>
> You also need to change the declaration of bucket to int above.

Ummmm, it already is an int…

Stephen
* Re: [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-27 16:00     ` Stephen Bates
@ 2017-03-27 16:01       ` Omar Sandoval
  0 siblings, 0 replies; 13+ messages in thread
From: Omar Sandoval @ 2017-03-27 16:01 UTC (permalink / raw)
  To: Stephen Bates
  Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org, Damien.LeMoal@wdc.com

On Mon, Mar 27, 2017 at 04:00:08PM +0000, Stephen Bates wrote:
> Thanks for the review Omar!
>
> >>
> >> -unsigned int blk_stat_rq_ddir(const struct request *rq)
> >> +int blk_stat_rq_ddir(const struct request *rq)
> >>  {
> >> -	return rq_data_dir(rq);
> >> +	return (int)rq_data_dir(rq);
> >
> > The cast here isn't necessary, let's leave it off.
>
> OK, I will fix that in the respin!
>
> >>  }
> >>  EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
> >>
> >> @@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
> >>  	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
> >>  		if (blk_stat_is_active(cb)) {
> >>  			bucket = cb->bucket_fn(rq);
> >> +			if (bucket < 0)
> >> +				continue;
> >
> > You also need to change the declaration of bucket to int above.
>
> Ummmm, it already is an int…

Yup, I was looking at the wrong place, sorry.
* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-26  2:18 [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling sbates
  2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
@ 2017-03-26  2:18 ` sbates
  2017-03-28 10:50   ` Sagi Grimberg
  1 sibling, 1 reply; 13+ messages in thread
From: sbates @ 2017-03-26  2:18 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-nvme, Damien.LeMoal, osandov, sbates

From: Stephen Bates <sbates@raithlin.com>

In order to bucket IO for the polling algorithm we use a sysfs entry
to set the filter value. It is signed and is interpreted as follows:

  0   : No filtering. All IO are considered in stat generation.
  > 0 : Filter based on IO of exactly this size only.
  < 0 : Filter based on IO less than or equal to -1 times this value.

Use this member to implement a new bucket function for the IO polling
stats.

Signed-off-by: Stephen Bates <sbates@raithlin.com>
---
 block/blk-mq.c         | 17 +++++++++++++++--
 block/blk-sysfs.c      | 30 ++++++++++++++++++++++++++++++
 include/linux/blkdev.h |  1 +
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5ff66f2..5ea9481 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2327,6 +2327,18 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	blk_mq_sysfs_register(q);
 }
 
+int blk_mq_poll_stats_bucket(const struct request *rq)
+{
+	int val = rq->q->poll_size;
+
+	if ((val == 0) ||
+	    (val > 0 && blk_rq_bytes(rq) == val) ||
+	    (val < 0 && blk_rq_bytes(rq) <= -val))
+		return (int)rq_data_dir(rq);
+
+	return -1;
+}
+
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 						  struct request_queue *q)
 {
@@ -2338,7 +2350,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 		goto err_exit;
 
 	q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn,
-					     blk_stat_rq_ddir, 2, q);
+					     blk_mq_poll_stats_bucket, 2, q);
 	if (!q->poll_cb)
 		goto err_exit;
 
@@ -2387,9 +2399,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	q->nr_requests = set->queue_depth;
 
 	/*
-	 * Default to classic polling
+	 * Default to classic polling. Default to considering all IO.
 	 */
 	q->poll_nsec = -1;
+	q->poll_size = 0;
 
 	if (set->ops->complete)
 		blk_queue_softirq_done(q, set->ops->complete);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index fa831cb..000d5db 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -394,6 +394,29 @@ static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
 	return count;
 }
 
+static ssize_t queue_poll_size_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%d\n", q->poll_size);
+}
+
+static ssize_t queue_poll_size_store(struct request_queue *q, const char *page,
+				     size_t count)
+{
+	int err, val;
+
+	if (!q->mq_ops || !q->mq_ops->poll)
+		return -EINVAL;
+
+	err = kstrtoint(page, 10, &val);
+	if (err < 0)
+		return err;
+
+	q->poll_size = val;
+
+	return count;
+}
+
+
 static ssize_t queue_poll_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
@@ -654,6 +677,12 @@ static struct queue_sysfs_entry queue_poll_entry = {
 	.store = queue_poll_store,
 };
 
+static struct queue_sysfs_entry queue_poll_size_entry = {
+	.attr = {.name = "io_poll_size", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_poll_size_show,
+	.store = queue_poll_size_store,
+};
+
 static struct queue_sysfs_entry queue_poll_delay_entry = {
 	.attr = {.name = "io_poll_delay", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_poll_delay_show,
@@ -710,6 +739,7 @@ static struct attribute *default_attrs[] = {
 	&queue_dax_entry.attr,
 	&queue_wb_lat_entry.attr,
 	&queue_poll_delay_entry.attr,
+	&queue_poll_size_entry.attr,
 	NULL,
 };
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1a7dc42..7ff5388 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -517,6 +517,7 @@ struct request_queue {
 
 	unsigned int rq_timeout;
 	int poll_nsec;
+	int poll_size;
 
 	struct blk_stat_callback *poll_cb;
 	struct blk_rq_stat poll_stat[2];
-- 
2.7.4
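As a worked example of the three cases above, using the
blk_mq_poll_stats_bucket() from this patch (the sysfs path below is
illustrative and the device name is a placeholder):

/*
 * poll_size == 0:      every IO is counted (the default).
 * poll_size == 4096:   a 4KB read   -> bucket 0 (READ);
 *                      an 8KB read  -> -1, not counted.
 * poll_size == -4096:  a 512B write -> bucket 1 (WRITE);
 *                      a 16KB write -> -1, not counted.
 *
 * The value is set from userspace through the new attribute, e.g.
 *   echo -4096 > /sys/block/<dev>/queue/io_poll_size
 * where <dev> stands in for a real device name.
 */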
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
@ 2017-03-28 10:50   ` Sagi Grimberg
  2017-03-28 19:38     ` Stephen Bates
  0 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2017-03-28 10:50 UTC (permalink / raw)
  To: sbates, axboe; +Cc: linux-block, Damien.LeMoal, osandov, linux-nvme

On 26/03/17 05:18, sbates@raithlin.com wrote:
> From: Stephen Bates <sbates@raithlin.com>
>
> In order to bucket IO for the polling algorithm we use a sysfs entry
> to set the filter value. It is signed and is interpreted as follows:
>
>   0   : No filtering. All IO are considered in stat generation.
>   > 0 : Filter based on IO of exactly this size only.
>   < 0 : Filter based on IO less than or equal to -1 times this value.

I'd say that this is a fairly non-trivial semantic meaning to this...

Is there any use for the exact size match filter? If not then
I suggest we always make it (<=) in its semantics...

> Use this member to implement a new bucket function for the IO polling
> stats.
>
> Signed-off-by: Stephen Bates <sbates@raithlin.com>
> ---
>  block/blk-mq.c         | 17 +++++++++++++++--
>  block/blk-sysfs.c      | 30 ++++++++++++++++++++++++++++++
>  include/linux/blkdev.h |  1 +
>  3 files changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 5ff66f2..5ea9481 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2327,6 +2327,18 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
>  	blk_mq_sysfs_register(q);
>  }
>
> +int blk_mq_poll_stats_bucket(const struct request *rq)
> +{
> +	int val = rq->q->poll_size;
> +
> +	if ((val == 0) ||
> +	    (val > 0 && blk_rq_bytes(rq) == val) ||
> +	    (val < 0 && blk_rq_bytes(rq) <= -val))
> +		return (int)rq_data_dir(rq);
> +
> +	return -1;
> +}
> +
>  struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  						  struct request_queue *q)
>  {
> @@ -2338,7 +2350,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  		goto err_exit;
>
>  	q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn,
> -					     blk_stat_rq_ddir, 2, q);
> +					     blk_mq_poll_stats_bucket, 2, q);
>  	if (!q->poll_cb)
>  		goto err_exit;
>
> @@ -2387,9 +2399,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  	q->nr_requests = set->queue_depth;
>
>  	/*
> -	 * Default to classic polling
> +	 * Default to classic polling. Default to considering all IO.
>  	 */
>  	q->poll_nsec = -1;
> +	q->poll_size = 0;
>
>  	if (set->ops->complete)
>  		blk_queue_softirq_done(q, set->ops->complete);
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index fa831cb..000d5db 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -394,6 +394,29 @@ static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
>  	return count;
>  }
>
> +static ssize_t queue_poll_size_show(struct request_queue *q, char *page)
> +{
> +	return sprintf(page, "%d\n", q->poll_size);
> +}
> +
> +static ssize_t queue_poll_size_store(struct request_queue *q, const char *page,
> +				     size_t count)
> +{
> +	int err, val;
> +
> +	if (!q->mq_ops || !q->mq_ops->poll)
> +		return -EINVAL;
> +
> +	err = kstrtoint(page, 10, &val);
> +	if (err < 0)
> +		return err;
> +
> +	q->poll_size = val;
> +
> +	return count;
> +}
> +
> +
>  static ssize_t queue_poll_show(struct request_queue *q, char *page)
>  {
>  	return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
> @@ -654,6 +677,12 @@ static struct queue_sysfs_entry queue_poll_entry = {
>  	.store = queue_poll_store,
>  };
>
> +static struct queue_sysfs_entry queue_poll_size_entry = {
> +	.attr = {.name = "io_poll_size", .mode = S_IRUGO | S_IWUSR },
> +	.show = queue_poll_size_show,
> +	.store = queue_poll_size_store,
> +};
> +
>  static struct queue_sysfs_entry queue_poll_delay_entry = {
>  	.attr = {.name = "io_poll_delay", .mode = S_IRUGO | S_IWUSR },
>  	.show = queue_poll_delay_show,
> @@ -710,6 +739,7 @@ static struct attribute *default_attrs[] = {
>  	&queue_dax_entry.attr,
>  	&queue_wb_lat_entry.attr,
>  	&queue_poll_delay_entry.attr,
> +	&queue_poll_size_entry.attr,
>  	NULL,
>  };
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 1a7dc42..7ff5388 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -517,6 +517,7 @@ struct request_queue {
>
>  	unsigned int rq_timeout;
>  	int poll_nsec;
> +	int poll_size;
>
>  	struct blk_stat_callback *poll_cb;
>  	struct blk_rq_stat poll_stat[2];
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 10:50   ` Sagi Grimberg
@ 2017-03-28 19:38     ` Stephen Bates
  2017-03-28 19:46       ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Bates @ 2017-03-28 19:38 UTC (permalink / raw)
  To: Sagi Grimberg, axboe@kernel.dk
  Cc: linux-block@vger.kernel.org, Damien.LeMoal@wdc.com,
      osandov@osandov.com, linux-nvme@lists.infradead.org

>>
>> In order to bucket IO for the polling algorithm we use a sysfs entry
>> to set the filter value. It is signed and is interpreted as follows:
>>
>>   0   : No filtering. All IO are considered in stat generation.
>>   > 0 : Filter based on IO of exactly this size only.
>>   < 0 : Filter based on IO less than or equal to -1 times this value.
>
> I'd say that this is a fairly non-trivial semantic meaning to this...
>
> Is there any use for the exact size match filter? If not then
> I suggest we always make it (<=) in its semantics...

Thanks for the review Sagi. I'd be OK going with <=0 as the exact match
would normally be for minimal IO sizes (where <= and = are the same
thing). I will see what other feedback I get and aim to do a respin
soon…

Stephen
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 19:38     ` Stephen Bates
@ 2017-03-28 19:46       ` Jens Axboe
  2017-03-28 19:58         ` Stephen Bates
  0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2017-03-28 19:46 UTC (permalink / raw)
  To: Stephen Bates, Sagi Grimberg
  Cc: linux-block@vger.kernel.org, Damien.LeMoal@wdc.com,
      osandov@osandov.com, linux-nvme@lists.infradead.org

On 03/28/2017 01:38 PM, Stephen Bates wrote:
>>>
>>> In order to bucket IO for the polling algorithm we use a sysfs entry
>>> to set the filter value. It is signed and is interpreted as follows:
>>>
>>>   0   : No filtering. All IO are considered in stat generation.
>>>   > 0 : Filter based on IO of exactly this size only.
>>>   < 0 : Filter based on IO less than or equal to -1 times this value.
>>
>> I'd say that this is a fairly non-trivial semantic meaning to this...
>>
>> Is there any use for the exact size match filter? If not then
>> I suggest we always make it (<=) in its semantics...
>
> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
> match would normally be for minimal IO sizes (where <= and = are the
> same thing). I will see what other feedback I get and aim to do a
> respin soon…

No tunables for this, please. There's absolutely no reason why we
should need it.

-- 
Jens Axboe
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 19:46       ` Jens Axboe
@ 2017-03-28 19:58         ` Stephen Bates
  2017-03-28 20:38           ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Bates @ 2017-03-28 19:58 UTC (permalink / raw)
  To: Jens Axboe, Sagi Grimberg
  Cc: linux-block@vger.kernel.org, Damien.LeMoal@wdc.com,
      osandov@osandov.com, linux-nvme@lists.infradead.org

>>
>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>> match would normally be for minimal IO sizes (where <= and = are the
>> same thing). I will see what other feedback I get and aim to do a
>> respin soon…
>
> No tunables for this, please. There's absolutely no reason why we
> should need it.

Jens – by this you mean you want to only bucket IO that are exactly the
minimum block size supported by the underlying block device? I was
envisioning we might want to relax that in certain cases (e.g. bucket
4KB and below going to a 512B device).

Stephen
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 19:58         ` Stephen Bates
@ 2017-03-28 20:38           ` Jens Axboe
  2017-03-28 21:49             ` Stephen Bates
  2017-03-28 21:52             ` Stephen Bates
  0 siblings, 2 replies; 13+ messages in thread
From: Jens Axboe @ 2017-03-28 20:38 UTC (permalink / raw)
  To: Stephen Bates, Sagi Grimberg
  Cc: linux-block@vger.kernel.org, Damien.LeMoal@wdc.com,
      osandov@osandov.com, linux-nvme@lists.infradead.org

On 03/28/2017 01:58 PM, Stephen Bates wrote:
>>>
>>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>>> match would normally be for minimal IO sizes (where <= and = are the
>>> same thing). I will see what other feedback I get and aim to do a
>>> respin soon…
>>
>> No tunables for this, please. There's absolutely no reason why we
>> should need it.
>
> Jens – by this you mean you want to only bucket IO that are exactly
> the minimum block size supported by the underlying block device? I was
> envisioning we might want to relax that in certain cases (e.g. bucket
> 4KB and below going to a 512B device).

Sorry, the above was a bit terse. I think a much better solution would
be to create a number of buckets (per data direction) and do stats on
all of them. The buckets would cover a reasonable range of request
sizes. Then when you poll for a given request, we can base our timed
number on the data direction AND size of it. You can get pretty far
with a few buckets:

512b
4k
8k
16k
32k
64k
128k

and you could even have your time estimation function turn these into
something sane. Or just use a composite of buckets, if you wish.

-- 
Jens Axboe
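A minimal sketch of the bucketing Jens describes might look like the
following. The function name, the bucket count, and the use of the
kernel's ilog2() helper are assumptions for illustration, not part of
this series:

/*
 * Hypothetical size-aware bucket function: seven size buckets per
 * data direction, covering 512B and the powers of two from 4k up to
 * 128k (ilog2() rounds down, so in-between sizes map to the bucket
 * below them). Requests outside that range are left unaccounted,
 * using the signed-return convention from patch 1.
 */
#define EX_POLL_STATS_BKTS	7	/* 512b, 4k, 8k ... 128k -- assumed */

static int example_poll_stats_bucket(const struct request *rq)
{
	unsigned int bytes = blk_rq_bytes(rq);
	int bucket;

	if (bytes <= 512)
		bucket = 0;			/* the 512b bucket */
	else if (bytes < 4096)
		return -1;			/* between 512b and 4k: not tracked */
	else
		bucket = ilog2(bytes) - 11;	/* 4k -> 1, 8k -> 2, ..., 128k -> 6 */

	if (bucket >= EX_POLL_STATS_BKTS)
		return -1;			/* beyond the listed range: not tracked */

	return rq_data_dir(rq) * EX_POLL_STATS_BKTS + bucket;
}

Such a callback would be registered with blk_stat_alloc_callback(...,
2 * EX_POLL_STATS_BKTS, q), and the poll-time estimator could then pick
the stat matching both the data direction and the size of the request
being polled for.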
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 20:38           ` Jens Axboe
@ 2017-03-28 21:49             ` Stephen Bates
  0 siblings, 0 replies; 13+ messages in thread
From: Stephen Bates @ 2017-03-28 21:49 UTC (permalink / raw)
  To: Jens Axboe, Sagi Grimberg
  Cc: linux-block@vger.kernel.org, Damien.LeMoal@wdc.com,
      osandov@osandov.com, linux-nvme@lists.infradead.org

>>>
>>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>>> match would normally be for minimal IO sizes (where <= and = are the
>>> same thing). I will see what other feedback I get and aim to do a
>>> respin soon…
>>
>> No tunables for this, please. There's absolutely no reason why we
>> should need it.
>
> Jens – by this you mean you want to only bucket IO that are exactly
> the minimum block size supported by the underlying block device? I was
> envisioning we might want to relax that in certain cases (e.g. bucket
> 4KB and below going to a 512B device).

> Sorry, the above was a bit terse. I think a much better solution would
> be to create a number of buckets (per data direction) and do stats on
> all of them. The buckets would cover a reasonable range of request
> sizes. Then when you poll for a given request, we can base our timed
> number on the data direction AND size of it. You can get pretty far
> with a few buckets:
>
> 512b
> 4k
> 8k
> 16k
> 32k
> 64k
> 128k
>
> and you could even have your time estimation function turn these into
> something sane. Or just use a composite of buckets, if you wish.

I did go down this path initially but then moved away from it since we
were focusing only on the smaller IO size. However I can definitely
take a look at this again as I agree that it could be more useful in
the long run.

I would like to keep my first patch in this series alive since I do
think having the option to not bucket an IO is a useful thing to have.

I'll take all the feedback to date and work on a v2. Thanks!

Stephen
* Re: [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 20:38           ` Jens Axboe
  2017-03-28 21:49             ` Stephen Bates
@ 2017-03-28 21:52             ` Stephen Bates
  1 sibling, 0 replies; 13+ messages in thread
From: Stephen Bates @ 2017-03-28 21:52 UTC (permalink / raw)
  To: Jens Axboe, Sagi Grimberg
  Cc: linux-block@vger.kernel.org, Damien.LeMoal@wdc.com,
      osandov@osandov.com, linux-nvme@lists.infradead.org

[Retrying as my new setup secretly converted to html format without
telling me. Apologies for the resend.]

>>>
>>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>>> match would normally be for minimal IO sizes (where <= and = are the
>>> same thing). I will see what other feedback I get and aim to do a
>>> respin soon…
>>
>> No tunables for this, please. There's absolutely no reason why we
>> should need it.
>
> Jens – by this you mean you want to only bucket IO that are exactly
> the minimum block size supported by the underlying block device? I was
> envisioning we might want to relax that in certain cases (e.g. bucket
> 4KB and below going to a 512B device).

> Sorry, the above was a bit terse. I think a much better solution would
> be to create a number of buckets (per data direction) and do stats on
> all of them. The buckets would cover a reasonable range of request
> sizes. Then when you poll for a given request, we can base our timed
> number on the data direction AND size of it. You can get pretty far
> with a few buckets:
>
> 512b
> 4k
> 8k
> 16k
> 32k
> 64k
> 128k
>
> and you could even have your time estimation function turn these into
> something sane. Or just use a composite of buckets, if you wish.

I did go down this path initially but then moved away from it since we
were focusing only on the smaller IO size. However I can definitely
take a look at this again as I agree that it could be more useful in
the long run.

I would like to keep my first patch in this series alive since I do
think having the option to not bucket an IO is a useful thing to have.

I'll take all the feedback to date and work on a v2. Thanks!

Stephen
end of thread, other threads:[~2017-03-28 21:52 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2017-03-26  2:18 [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling sbates
2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
2017-03-26  2:49   ` Omar Sandoval
2017-03-27 16:00     ` Stephen Bates
2017-03-27 16:01       ` Omar Sandoval
2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
2017-03-28 10:50   ` Sagi Grimberg
2017-03-28 19:38     ` Stephen Bates
2017-03-28 19:46       ` Jens Axboe
2017-03-28 19:58         ` Stephen Bates
2017-03-28 20:38           ` Jens Axboe
2017-03-28 21:49             ` Stephen Bates
2017-03-28 21:52             ` Stephen Bates