* [PATCH v3 0/2] blk-stat: Add ability to not bucket IO; improve IO polling. @ 2017-04-07 12:24 sbates 2017-04-07 12:24 ` [PATCH v3 1/2] blk-stat: convert blk-stat bucket callback to signed sbates 2017-04-07 12:24 ` [PATCH v3 2/2] blk-mq: Add a polling specific stats function sbates 0 siblings, 2 replies; 18+ messages in thread From: sbates @ 2017-04-07 12:24 UTC (permalink / raw) From: Stephen Bates <sbates@raithlin.com> Omar recently developed some patches for block layer stats that use callbacks to determine which bucket an IO should be considered for. At the same time there was discussion at LSF/MM that we might not want to consider all IO when generating stats for certain algorithms (e.g. IO completion polling) or to bucket them in a more optimal fashion. This set does two things. It makes the bucket callback for stats signed so we can now ignore IO that cause a negative to be returned from the bucket function. It then improves the IO polling latency estimations by bucketing stats based on IO size and direction. This patchset applies cleanly on 6809ef67eb7b4b68d (Merge branch 'for-4.12/block' into for-next) in Jens' for-next tree. I've lightly tested this using QEMU and a real NVMe low-latency device. I do not have performance number yet. Feedback would be appreciated! I am not *super* happy with how the bucketing by size is done. Any suggestions on how to improve this would be appreciated! Cc: Damien.LeMoal at wdc.com Cc: osandov at osandov.com Changes since v1: Modified the bucket function as per Jens' suggestions. Changes since v1: Dropped the cast in blk_stat_rq_ddir() as per Omar's suggestion. Moved to an array of buckets based on IO size rather than a filter as suggested by Jens and Damien. Rebased on Jen's for-next tree Stephen Bates (2): blk-stat: convert blk-stat bucket callback to signed blk-mq: Add a polling specific stats function block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++---------- block/blk-stat.c | 6 ++++-- block/blk-stat.h | 9 +++++---- 3 files changed, 44 insertions(+), 16 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 1/2] blk-stat: convert blk-stat bucket callback to signed 2017-04-07 12:24 [PATCH v3 0/2] blk-stat: Add ability to not bucket IO; improve IO polling sbates @ 2017-04-07 12:24 ` sbates 2017-04-07 12:24 ` [PATCH v3 2/2] blk-mq: Add a polling specific stats function sbates 1 sibling, 0 replies; 18+ messages in thread From: sbates @ 2017-04-07 12:24 UTC (permalink / raw) From: Stephen Bates <sbates@raithlin.com> In order to allow for filtering of IO based on some other properties of the request than direction we allow the bucket function to return an int. If the bucket callback returns a negative do no count it in the stats accumulation. Signed-off-by: Stephen Bates <sbates at raithlin.com> --- block/blk-stat.c | 6 ++++-- block/blk-stat.h | 9 +++++---- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/block/blk-stat.c b/block/blk-stat.c index e77ec52..dde9d39 100644 --- a/block/blk-stat.c +++ b/block/blk-stat.c @@ -19,7 +19,7 @@ struct blk_queue_stats { bool enable_accounting; }; -unsigned int blk_stat_rq_ddir(const struct request *rq) +int blk_stat_rq_ddir(const struct request *rq) { return rq_data_dir(rq); } @@ -104,6 +104,8 @@ void blk_stat_add(struct request *rq) list_for_each_entry_rcu(cb, &q->stats->callbacks, list) { if (blk_stat_is_active(cb)) { bucket = cb->bucket_fn(rq); + if (bucket < 0) + continue; stat = &this_cpu_ptr(cb->cpu_stat)[bucket]; __blk_stat_add(stat, value); } @@ -135,7 +137,7 @@ static void blk_stat_timer_fn(unsigned long data) struct blk_stat_callback * blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *), - unsigned int (*bucket_fn)(const struct request *), + int (*bucket_fn)(const struct request *), unsigned int buckets, void *data) { struct blk_stat_callback *cb; diff --git a/block/blk-stat.h b/block/blk-stat.h index 53f08a6..622a62c 100644 --- a/block/blk-stat.h +++ b/block/blk-stat.h @@ -48,9 +48,10 @@ struct blk_stat_callback { /** * @bucket_fn: Given a request, returns which statistics bucket it - * should be accounted under. + * should be accounted under. Return -1 for no bucket for this + * request. */ - unsigned int (*bucket_fn)(const struct request *); + int (*bucket_fn)(const struct request *); /** * @buckets: Number of statistics buckets. @@ -120,7 +121,7 @@ void blk_stat_enable_accounting(struct request_queue *q); * * Return: Data direction of the request, either READ or WRITE. */ -unsigned int blk_stat_rq_ddir(const struct request *rq); +int blk_stat_rq_ddir(const struct request *rq); /** * blk_stat_alloc_callback() - Allocate a block statistics callback. @@ -135,7 +136,7 @@ unsigned int blk_stat_rq_ddir(const struct request *rq); */ struct blk_stat_callback * blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *), - unsigned int (*bucket_fn)(const struct request *), + int (*bucket_fn)(const struct request *), unsigned int buckets, void *data); /** -- 2.7.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-07 12:24 [PATCH v3 0/2] blk-stat: Add ability to not bucket IO; improve IO polling sbates 2017-04-07 12:24 ` [PATCH v3 1/2] blk-stat: convert blk-stat bucket callback to signed sbates @ 2017-04-07 12:24 ` sbates 2017-04-20 20:07 ` Omar Sandoval 1 sibling, 1 reply; 18+ messages in thread From: sbates @ 2017-04-07 12:24 UTC (permalink / raw) From: Stephen Bates <sbates@raithlin.com> Rather than bucketing IO statisics based on direction only we also bucket based on the IO size. This leads to improved polling performance. Update the bucket callback function and use it in the polling latency estimation. Signed-off-by: Stephen Bates <sbates at raithlin.com> --- block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++---------- 1 file changed, 35 insertions(+), 10 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 061fc2c..5fd376b 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -42,6 +42,25 @@ static LIST_HEAD(all_q_list); static void blk_mq_poll_stats_start(struct request_queue *q); static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); +/* Must be consisitent with function below */ +#define BLK_MQ_POLL_STATS_BKTS 16 +static int blk_mq_poll_stats_bkt(const struct request *rq) +{ + int ddir, bytes, bucket; + + ddir = blk_stat_rq_ddir(rq); + bytes = blk_rq_bytes(rq); + + bucket = ddir + 2*(ilog2(bytes) - 9); + + if (bucket < 0) + return -1; + else if (bucket >= BLK_MQ_POLL_STATS_BKTS) + return ddir + BLK_MQ_POLL_STATS_BKTS - 2; + + return bucket; +} + /* * Check if any of the ctx's have pending work in this hardware queue */ @@ -2245,7 +2264,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, q->mq_ops = set->ops; q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn, - blk_stat_rq_ddir, 2, q); + blk_mq_poll_stats_bkt, + BLK_MQ_POLL_STATS_BKTS, q); if (!q->poll_cb) goto err_exit; @@ -2663,11 +2683,12 @@ static void blk_mq_poll_stats_start(struct request_queue *q) static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb) { struct request_queue *q = cb->data; + int bucket; - if (cb->stat[READ].nr_samples) - q->poll_stat[READ] = cb->stat[READ]; - if (cb->stat[WRITE].nr_samples) - q->poll_stat[WRITE] = cb->stat[WRITE]; + for (bucket = 0; bucket < BLK_MQ_POLL_STATS_BKTS; bucket++) { + if (cb->stat[bucket].nr_samples) + q->poll_stat[bucket] = cb->stat[bucket]; + } } static unsigned long blk_mq_poll_nsecs(struct request_queue *q, @@ -2675,6 +2696,7 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q, struct request *rq) { unsigned long ret = 0; + int bucket; /* * If stats collection isn't on, don't sleep but turn it on for @@ -2689,12 +2711,15 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q, * For instance, if the completion latencies are tight, we can * get closer than just half the mean. This is especially * important on devices where the completion latencies are longer - * than ~10 usec. + * than ~10 usec. We do use the stats for the relevant IO size + * if available which does lead to better estimates. */ - if (req_op(rq) == REQ_OP_READ && q->poll_stat[READ].nr_samples) - ret = (q->poll_stat[READ].mean + 1) / 2; - else if (req_op(rq) == REQ_OP_WRITE && q->poll_stat[WRITE].nr_samples) - ret = (q->poll_stat[WRITE].mean + 1) / 2; + bucket = blk_mq_poll_stats_bkt(rq); + if (bucket < 0) + return ret; + + if (q->poll_stat[bucket].nr_samples) + ret = (q->poll_stat[bucket].mean + 1) / 2; return ret; } -- 2.7.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-07 12:24 ` [PATCH v3 2/2] blk-mq: Add a polling specific stats function sbates @ 2017-04-20 20:07 ` Omar Sandoval 2017-04-20 20:16 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Omar Sandoval @ 2017-04-20 20:07 UTC (permalink / raw) On Fri, Apr 07, 2017@06:24:03AM -0600, sbates@raithlin.com wrote: > From: Stephen Bates <sbates at raithlin.com> > > Rather than bucketing IO statisics based on direction only we also > bucket based on the IO size. This leads to improved polling > performance. Update the bucket callback function and use it in the > polling latency estimation. > > Signed-off-by: Stephen Bates <sbates at raithlin.com> Hey, Stephen, just taking a look at this now. Comments below. > --- > block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++---------- > 1 file changed, 35 insertions(+), 10 deletions(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 061fc2c..5fd376b 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -42,6 +42,25 @@ static LIST_HEAD(all_q_list); > static void blk_mq_poll_stats_start(struct request_queue *q); > static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); > > +/* Must be consisitent with function below */ > +#define BLK_MQ_POLL_STATS_BKTS 16 > +static int blk_mq_poll_stats_bkt(const struct request *rq) > +{ > + int ddir, bytes, bucket; > + > + ddir = blk_stat_rq_ddir(rq); No need to call the wrapper function here, we can use rq_data_dir() directly. > + bytes = blk_rq_bytes(rq); > + > + bucket = ddir + 2*(ilog2(bytes) - 9); > + > + if (bucket < 0) > + return -1; > + else if (bucket >= BLK_MQ_POLL_STATS_BKTS) > + return ddir + BLK_MQ_POLL_STATS_BKTS - 2; > + > + return bucket; > +} Nitpicking here, but defining things in terms of the number of size buckets seems more natural to me. How about something like this (untested)? Note that this obviates the need for patch 1. #define BLK_MQ_POLL_STATS_SIZE_BKTS 8 static unsigned int blk_mq_poll_stats_bkt(const struct request *rq) { unsigned int size_bucket; size_bucket = clamp(ilog2(blk_rq_bytes(rq)) - 9, 0, BLK_MQ_POLL_STATS_SIZE_BKTS - 1); return 2 * size_bucket + rq_data_dir(rq); } > /* > * Check if any of the ctx's have pending work in this hardware queue > */ > @@ -2245,7 +2264,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, > q->mq_ops = set->ops; > > q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn, > - blk_stat_rq_ddir, 2, q); > + blk_mq_poll_stats_bkt, > + BLK_MQ_POLL_STATS_BKTS, q); With the above change, this would become 2 * BLK_MQ_POLL_STATS_SIZE_BKTS. ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:07 ` Omar Sandoval @ 2017-04-20 20:16 ` Jens Axboe 2017-04-20 20:20 ` Omar Sandoval 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2017-04-20 20:16 UTC (permalink / raw) On 04/20/2017 02:07 PM, Omar Sandoval wrote: > On Fri, Apr 07, 2017@06:24:03AM -0600, sbates@raithlin.com wrote: >> From: Stephen Bates <sbates at raithlin.com> >> >> Rather than bucketing IO statisics based on direction only we also >> bucket based on the IO size. This leads to improved polling >> performance. Update the bucket callback function and use it in the >> polling latency estimation. >> >> Signed-off-by: Stephen Bates <sbates at raithlin.com> > > Hey, Stephen, just taking a look at this now. Comments below. > >> --- >> block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++---------- >> 1 file changed, 35 insertions(+), 10 deletions(-) >> >> diff --git a/block/blk-mq.c b/block/blk-mq.c >> index 061fc2c..5fd376b 100644 >> --- a/block/blk-mq.c >> +++ b/block/blk-mq.c >> @@ -42,6 +42,25 @@ static LIST_HEAD(all_q_list); >> static void blk_mq_poll_stats_start(struct request_queue *q); >> static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); >> >> +/* Must be consisitent with function below */ >> +#define BLK_MQ_POLL_STATS_BKTS 16 >> +static int blk_mq_poll_stats_bkt(const struct request *rq) >> +{ >> + int ddir, bytes, bucket; >> + >> + ddir = blk_stat_rq_ddir(rq); > > No need to call the wrapper function here, we can use rq_data_dir() > directly. > >> + bytes = blk_rq_bytes(rq); >> + >> + bucket = ddir + 2*(ilog2(bytes) - 9); >> + >> + if (bucket < 0) >> + return -1; >> + else if (bucket >= BLK_MQ_POLL_STATS_BKTS) >> + return ddir + BLK_MQ_POLL_STATS_BKTS - 2; >> + >> + return bucket; >> +} > > Nitpicking here, but defining things in terms of the number of size > buckets seems more natural to me. How about something like this > (untested)? Note that this obviates the need for patch 1. > > #define BLK_MQ_POLL_STATS_SIZE_BKTS 8 > static unsigned int blk_mq_poll_stats_bkt(const struct request *rq) > { > unsigned int size_bucket; > > size_bucket = clamp(ilog2(blk_rq_bytes(rq)) - 9, 0, > BLK_MQ_POLL_STATS_SIZE_BKTS - 1); > return 2 * size_bucket + rq_data_dir(rq); > } As I wrote in an earlier reply, it would be a lot cleaner to just have the buckets be: buckets[2][BUCKETS_PER_RW]; and not have to do weird math based on both size and data direction. Just have it return the bucket index based on size, and have the caller do: bucket[rq_data_dir(rq)][bucket_index]; -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:16 ` Jens Axboe @ 2017-04-20 20:20 ` Omar Sandoval 2017-04-20 20:22 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Omar Sandoval @ 2017-04-20 20:20 UTC (permalink / raw) On Thu, Apr 20, 2017@02:16:04PM -0600, Jens Axboe wrote: > On 04/20/2017 02:07 PM, Omar Sandoval wrote: > > On Fri, Apr 07, 2017@06:24:03AM -0600, sbates@raithlin.com wrote: > >> From: Stephen Bates <sbates at raithlin.com> > >> > >> Rather than bucketing IO statisics based on direction only we also > >> bucket based on the IO size. This leads to improved polling > >> performance. Update the bucket callback function and use it in the > >> polling latency estimation. > >> > >> Signed-off-by: Stephen Bates <sbates at raithlin.com> > > > > Hey, Stephen, just taking a look at this now. Comments below. > > > >> --- > >> block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++---------- > >> 1 file changed, 35 insertions(+), 10 deletions(-) > >> > >> diff --git a/block/blk-mq.c b/block/blk-mq.c > >> index 061fc2c..5fd376b 100644 > >> --- a/block/blk-mq.c > >> +++ b/block/blk-mq.c > >> @@ -42,6 +42,25 @@ static LIST_HEAD(all_q_list); > >> static void blk_mq_poll_stats_start(struct request_queue *q); > >> static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); > >> > >> +/* Must be consisitent with function below */ > >> +#define BLK_MQ_POLL_STATS_BKTS 16 > >> +static int blk_mq_poll_stats_bkt(const struct request *rq) > >> +{ > >> + int ddir, bytes, bucket; > >> + > >> + ddir = blk_stat_rq_ddir(rq); > > > > No need to call the wrapper function here, we can use rq_data_dir() > > directly. > > > >> + bytes = blk_rq_bytes(rq); > >> + > >> + bucket = ddir + 2*(ilog2(bytes) - 9); > >> + > >> + if (bucket < 0) > >> + return -1; > >> + else if (bucket >= BLK_MQ_POLL_STATS_BKTS) > >> + return ddir + BLK_MQ_POLL_STATS_BKTS - 2; > >> + > >> + return bucket; > >> +} > > > > Nitpicking here, but defining things in terms of the number of size > > buckets seems more natural to me. How about something like this > > (untested)? Note that this obviates the need for patch 1. > > > > #define BLK_MQ_POLL_STATS_SIZE_BKTS 8 > > static unsigned int blk_mq_poll_stats_bkt(const struct request *rq) > > { > > unsigned int size_bucket; > > > > size_bucket = clamp(ilog2(blk_rq_bytes(rq)) - 9, 0, > > BLK_MQ_POLL_STATS_SIZE_BKTS - 1); > > return 2 * size_bucket + rq_data_dir(rq); > > } > > As I wrote in an earlier reply, it would be a lot cleaner to just have > the buckets be: > > buckets[2][BUCKETS_PER_RW]; > > and not have to do weird math based on both size and data direction. > Just have it return the bucket index based on size, and have the caller > do: > > bucket[rq_data_dir(rq)][bucket_index]; This removes a lot of the flexibility of the interface. Kyber, for one, has this stats callback: static unsigned int rq_sched_domain(const struct request *rq) { unsigned int op = rq->cmd_flags; if ((op & REQ_OP_MASK) == REQ_OP_READ) return KYBER_READ; else if ((op & REQ_OP_MASK) == REQ_OP_WRITE && op_is_sync(op)) return KYBER_SYNC_WRITE; else return KYBER_OTHER; } The buckets aren't subdivisions of read vs. write. We could shoehorn it in your way if we really wanted to, but that's pointless. ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:20 ` Omar Sandoval @ 2017-04-20 20:22 ` Jens Axboe 2017-04-20 20:33 ` Stephen Bates 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2017-04-20 20:22 UTC (permalink / raw) On 04/20/2017 02:20 PM, Omar Sandoval wrote: > On Thu, Apr 20, 2017@02:16:04PM -0600, Jens Axboe wrote: >> On 04/20/2017 02:07 PM, Omar Sandoval wrote: >>> On Fri, Apr 07, 2017@06:24:03AM -0600, sbates@raithlin.com wrote: >>>> From: Stephen Bates <sbates at raithlin.com> >>>> >>>> Rather than bucketing IO statisics based on direction only we also >>>> bucket based on the IO size. This leads to improved polling >>>> performance. Update the bucket callback function and use it in the >>>> polling latency estimation. >>>> >>>> Signed-off-by: Stephen Bates <sbates at raithlin.com> >>> >>> Hey, Stephen, just taking a look at this now. Comments below. >>> >>>> --- >>>> block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++---------- >>>> 1 file changed, 35 insertions(+), 10 deletions(-) >>>> >>>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>>> index 061fc2c..5fd376b 100644 >>>> --- a/block/blk-mq.c >>>> +++ b/block/blk-mq.c >>>> @@ -42,6 +42,25 @@ static LIST_HEAD(all_q_list); >>>> static void blk_mq_poll_stats_start(struct request_queue *q); >>>> static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); >>>> >>>> +/* Must be consisitent with function below */ >>>> +#define BLK_MQ_POLL_STATS_BKTS 16 >>>> +static int blk_mq_poll_stats_bkt(const struct request *rq) >>>> +{ >>>> + int ddir, bytes, bucket; >>>> + >>>> + ddir = blk_stat_rq_ddir(rq); >>> >>> No need to call the wrapper function here, we can use rq_data_dir() >>> directly. >>> >>>> + bytes = blk_rq_bytes(rq); >>>> + >>>> + bucket = ddir + 2*(ilog2(bytes) - 9); >>>> + >>>> + if (bucket < 0) >>>> + return -1; >>>> + else if (bucket >= BLK_MQ_POLL_STATS_BKTS) >>>> + return ddir + BLK_MQ_POLL_STATS_BKTS - 2; >>>> + >>>> + return bucket; >>>> +} >>> >>> Nitpicking here, but defining things in terms of the number of size >>> buckets seems more natural to me. How about something like this >>> (untested)? Note that this obviates the need for patch 1. >>> >>> #define BLK_MQ_POLL_STATS_SIZE_BKTS 8 >>> static unsigned int blk_mq_poll_stats_bkt(const struct request *rq) >>> { >>> unsigned int size_bucket; >>> >>> size_bucket = clamp(ilog2(blk_rq_bytes(rq)) - 9, 0, >>> BLK_MQ_POLL_STATS_SIZE_BKTS - 1); >>> return 2 * size_bucket + rq_data_dir(rq); >>> } >> >> As I wrote in an earlier reply, it would be a lot cleaner to just have >> the buckets be: >> >> buckets[2][BUCKETS_PER_RW]; >> >> and not have to do weird math based on both size and data direction. >> Just have it return the bucket index based on size, and have the caller >> do: >> >> bucket[rq_data_dir(rq)][bucket_index]; > > This removes a lot of the flexibility of the interface. Kyber, for one, > has this stats callback: > > static unsigned int rq_sched_domain(const struct request *rq) > { > unsigned int op = rq->cmd_flags; > > if ((op & REQ_OP_MASK) == REQ_OP_READ) > return KYBER_READ; > else if ((op & REQ_OP_MASK) == REQ_OP_WRITE && op_is_sync(op)) > return KYBER_SYNC_WRITE; > else > return KYBER_OTHER; > } Good point, I guess other users could have different methods of bucketization. > The buckets aren't subdivisions of read vs. write. We could shoehorn it > in your way if we really wanted to, but that's pointless. Nah, let's just leave it as-is then, even though I don't think it's the prettiest thing I've ever seen. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:22 ` Jens Axboe @ 2017-04-20 20:33 ` Stephen Bates 2017-04-20 20:34 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Stephen Bates @ 2017-04-20 20:33 UTC (permalink / raw) > Nah, let's just leave it as-is then, even though I don't think it's the > prettiest thing I've ever seen. I did look at making the stats buckets in the request_queue struct based on dir and size. Something like: - struct blk_rq_stat poll_stat[2]; + struct blk_rq_stat poll_stat[2][BLK_MQ_POLL_STATS_BKTS/2]; This actually did clean up some in some places but because the callback still uses a linear array of buckets we do get this: - if (cb->stat[READ].nr_samples) - q->poll_stat[READ] = cb->stat[READ]; - if (cb->stat[WRITE].nr_samples) - q->poll_stat[WRITE] = cb->stat[WRITE]; + for (bucket = 0; bucket < BLK_MQ_POLL_STATS_BKTS; bucket++) { + if (cb->stat[bucket].nr_samples) + q->poll_stat[bucket%2][bucket/2] = cb->stat[bucket]; I tend to agree with Omar that keeping the buckets in a linear array is overall cleaner and more generalized. However right now I am stuck as I am seeing the kernel oops I reported before in testing of my latest patchset [1]. I will try and find some time to bisect that but it looks like it was introduced when the support for mq schedulers was added (on or around bd166ef18). Stephen [1] http://marc.info/?l=linux-block&m=149156785215919&w=2 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:33 ` Stephen Bates @ 2017-04-20 20:34 ` Jens Axboe 2017-04-20 20:47 ` Stephen Bates 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2017-04-20 20:34 UTC (permalink / raw) On 04/20/2017 02:33 PM, Stephen Bates wrote: > >> Nah, let's just leave it as-is then, even though I don't think it's the >> prettiest thing I've ever seen. > > I did look at making the stats buckets in the request_queue struct > based on dir and size. Something like: > > - struct blk_rq_stat poll_stat[2]; > + struct blk_rq_stat poll_stat[2][BLK_MQ_POLL_STATS_BKTS/2]; > > This actually did clean up some in some places but because the > callback still uses a linear array of buckets we do get this: > > - if (cb->stat[READ].nr_samples) > - q->poll_stat[READ] = cb->stat[READ]; > - if (cb->stat[WRITE].nr_samples) > - q->poll_stat[WRITE] = cb->stat[WRITE]; > + for (bucket = 0; bucket < BLK_MQ_POLL_STATS_BKTS; bucket++) { > + if (cb->stat[bucket].nr_samples) > + q->poll_stat[bucket%2][bucket/2] = cb->stat[bucket]; > > I tend to agree with Omar that keeping the buckets in a linear array > is overall cleaner and more generalized. I agree, it's fine as-is. We should queue it up for 4.12. > However right now I am stuck as I am seeing the kernel oops I reported > before in testing of my latest patchset [1]. I will try and find some > time to bisect that but it looks like it was introduced when the > support for mq schedulers was added (on or around bd166ef18). Just replied to that one, let me know if the suggestion works. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:34 ` Jens Axboe @ 2017-04-20 20:47 ` Stephen Bates 2017-04-20 20:53 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Stephen Bates @ 2017-04-20 20:47 UTC (permalink / raw) > I agree, it's fine as-is. We should queue it up for 4.12. Great. I will get something based on Omar?s latest comments asap. > > However right now I am stuck as I am seeing the kernel oops I reported > > before in testing of my latest patchset [1]. I will try and find some >> time to bisect that but it looks like it was introduced when the > > support for mq schedulers was added (on or around bd166ef18). > Just replied to that one, let me know if the suggestion works. That suggestion worked. Do you want me to send a patch for it? Stephen ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:47 ` Stephen Bates @ 2017-04-20 20:53 ` Jens Axboe 2017-04-20 21:08 ` Stephen Bates 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2017-04-20 20:53 UTC (permalink / raw) On 04/20/2017 02:47 PM, Stephen Bates wrote: > >> I agree, it's fine as-is. We should queue it up for 4.12. > > Great. I will get something based on Omar?s latest comments asap. > >>> However right now I am stuck as I am seeing the kernel oops I reported >>> before in testing of my latest patchset [1]. I will try and find some >>> time to bisect that but it looks like it was introduced when the >>> support for mq schedulers was added (on or around bd166ef18). > >> Just replied to that one, let me know if the suggestion works. > > That suggestion worked. Do you want me to send a patch for it? Is the patch going to be different than the one I sent? Here it is, with a comment added. Can I add you tested-by? diff --git a/block/blk-mq.c b/block/blk-mq.c index 572966f49596..c7836a1ded97 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2928,8 +2928,17 @@ bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie) hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)]; if (!blk_qc_t_is_internal(cookie)) rq = blk_mq_tag_to_rq(hctx->tags, blk_qc_t_to_tag(cookie)); - else + else { rq = blk_mq_tag_to_rq(hctx->sched_tags, blk_qc_t_to_tag(cookie)); + /* + * With scheduling, if the request has completed, we'll + * get a NULL return here, as we clear the sched tag when + * that happens. The request still remains valid, like always, + * so we should be safe with just the NULL check. + */ + if (!rq) + return false; + } return __blk_mq_poll(hctx, rq); } -- Jens Axboe ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 20:53 ` Jens Axboe @ 2017-04-20 21:08 ` Stephen Bates 2017-04-20 21:14 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Stephen Bates @ 2017-04-20 21:08 UTC (permalink / raw) > Is the patch going to be different than the one I sent? Here it > is, with a comment added. Can I add you tested-by? Yes you can add a Tested-By from me?. Tested-By: Stephen Bates <sbates at raithlin.com> ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 21:08 ` Stephen Bates @ 2017-04-20 21:14 ` Jens Axboe 2017-04-20 21:41 ` Stephen Bates 0 siblings, 1 reply; 18+ messages in thread From: Jens Axboe @ 2017-04-20 21:14 UTC (permalink / raw) On 04/20/2017 03:08 PM, Stephen Bates wrote: > >> Is the patch going to be different than the one I sent? Here it >> is, with a comment added. Can I add you tested-by? > > Yes you can add a Tested-By from me?. > > Tested-By: Stephen Bates <sbates at raithlin.com> Great thanks, pushed. I'll get this added for 4.11. Thanks for the report! -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 21:14 ` Jens Axboe @ 2017-04-20 21:41 ` Stephen Bates 2017-04-20 21:42 ` Jens Axboe 0 siblings, 1 reply; 18+ messages in thread From: Stephen Bates @ 2017-04-20 21:41 UTC (permalink / raw) > Great thanks, pushed. I'll get this added for 4.11. Thanks for > the report! I see you applied my v3 series to the for-4.12/block branch. There is one issue there I need to fix. Can I send you a v4 a bit later today instead? -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 21:41 ` Stephen Bates @ 2017-04-20 21:42 ` Jens Axboe 2017-04-20 21:45 ` Stephen Bates 2017-04-20 22:05 ` Stephen Bates 0 siblings, 2 replies; 18+ messages in thread From: Jens Axboe @ 2017-04-20 21:42 UTC (permalink / raw) On 04/20/2017 03:41 PM, Stephen Bates wrote: > >> Great thanks, pushed. I'll get this added for 4.11. Thanks for >> the report! > > I see you applied my v3 series to the for-4.12/block branch. There is > one issue there I need to fix. Can I send you a v4 a bit later today > instead? You can, but it won't do much good since v3 is already applied. Any further changes must be incremental. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 21:42 ` Jens Axboe @ 2017-04-20 21:45 ` Stephen Bates 2017-04-20 22:05 ` Stephen Bates 1 sibling, 0 replies; 18+ messages in thread From: Stephen Bates @ 2017-04-20 21:45 UTC (permalink / raw) > You can, but it won't do much good since v3 is already applied. Any > further changes must be incremental. OK, I will send a small patch on top of what you just applied. Thanks! ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 21:42 ` Jens Axboe 2017-04-20 21:45 ` Stephen Bates @ 2017-04-20 22:05 ` Stephen Bates 2017-04-20 22:06 ` Jens Axboe 1 sibling, 1 reply; 18+ messages in thread From: Stephen Bates @ 2017-04-20 22:05 UTC (permalink / raw) > You can, but it won't do much good since v3 is already applied. Any > further changes must be incremental. BTW getting a compile error from the Kyber code in for-4.12/block due to the fact we now return a signed from the bucket function? batesste at ubuntu64-batesste:~/kernel/linux$ make -j 2 all CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h CHK include/generated/timeconst.h CHK include/generated/bounds.h CHK include/generated/asm-offsets.h CALL scripts/checksyscalls.sh CHK include/generated/compile.h CC block/kyber-iosched.o block/kyber-iosched.c: In function ?kyber_queue_data_alloc?: block/kyber-iosched.c:295:57: error: passing argument 2 of ?blk_stat_alloc_callback? from incompatible pointer type [-Werror=incompatible-pointer-types] kqd->cb = blk_stat_alloc_callback(kyber_stat_timer_fn, rq_sched_domain, ^ In file included from block/blk-mq.h:4:0, from block/blk.h:6, from block/kyber-iosched.c:27: block/blk-stat.h:138:1: note: expected ?int (*)(const struct request *)? but argument is of type ?unsigned int (*)(const struct request *)? blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *), ^ CC block/compat_ioctl.o cc1: some warnings being treated as errors ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 2/2] blk-mq: Add a polling specific stats function 2017-04-20 22:05 ` Stephen Bates @ 2017-04-20 22:06 ` Jens Axboe 0 siblings, 0 replies; 18+ messages in thread From: Jens Axboe @ 2017-04-20 22:06 UTC (permalink / raw) On 04/20/2017 04:05 PM, Stephen Bates wrote: > >> You can, but it won't do much good since v3 is already applied. Any >> further changes must be incremental. > > BTW getting a compile error from the Kyber code in for-4.12/block due to the fact we now return a signed from the bucket function? > > batesste at ubuntu64-batesste:~/kernel/linux$ make -j 2 all > CHK include/config/kernel.release > CHK include/generated/uapi/linux/version.h > CHK include/generated/utsrelease.h > CHK include/generated/timeconst.h > CHK include/generated/bounds.h > CHK include/generated/asm-offsets.h > CALL scripts/checksyscalls.sh > CHK include/generated/compile.h > CC block/kyber-iosched.o > block/kyber-iosched.c: In function ?kyber_queue_data_alloc?: > block/kyber-iosched.c:295:57: error: passing argument 2 of ?blk_stat_alloc_callback? from incompatible pointer type [-Werror=incompatible-pointer-types] > kqd->cb = blk_stat_alloc_callback(kyber_stat_timer_fn, rq_sched_domain, > ^ > In file included from block/blk-mq.h:4:0, > from block/blk.h:6, > from block/kyber-iosched.c:27: > block/blk-stat.h:138:1: note: expected ?int (*)(const struct request *)? but argument is of type ?unsigned int (*)(const struct request *)? > blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *), > ^ > CC block/compat_ioctl.o > cc1: some warnings being treated as errors I did fix that one up, I did a ninja rebase, but apparently you pulled in the few minutes it existed. So just update, and you should be fine. -- Jens Axboe ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2017-04-20 22:06 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-04-07 12:24 [PATCH v3 0/2] blk-stat: Add ability to not bucket IO; improve IO polling sbates 2017-04-07 12:24 ` [PATCH v3 1/2] blk-stat: convert blk-stat bucket callback to signed sbates 2017-04-07 12:24 ` [PATCH v3 2/2] blk-mq: Add a polling specific stats function sbates 2017-04-20 20:07 ` Omar Sandoval 2017-04-20 20:16 ` Jens Axboe 2017-04-20 20:20 ` Omar Sandoval 2017-04-20 20:22 ` Jens Axboe 2017-04-20 20:33 ` Stephen Bates 2017-04-20 20:34 ` Jens Axboe 2017-04-20 20:47 ` Stephen Bates 2017-04-20 20:53 ` Jens Axboe 2017-04-20 21:08 ` Stephen Bates 2017-04-20 21:14 ` Jens Axboe 2017-04-20 21:41 ` Stephen Bates 2017-04-20 21:42 ` Jens Axboe 2017-04-20 21:45 ` Stephen Bates 2017-04-20 22:05 ` Stephen Bates 2017-04-20 22:06 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox