linux-nvme.lists.infradead.org archive mirror
* [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling.
@ 2017-03-26  2:18 sbates
  2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
  2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
  0 siblings, 2 replies; 12+ messages in thread
From: sbates @ 2017-03-26  2:18 UTC


From: Stephen Bates <sbates@raithlin.com>

Omar recently developed some patches for block layer stats that use
callbacks to determine which bucket an IO should be considered for. At
the same time there was discussion at LSF/MM that we might not want to
consider all IO when generating stats for certain algorithms (e.g. IO
completion polling).

This set does two things. It makes the stats bucket callback signed
so that IOs for which the bucket function returns a negative value can
be ignored. It then uses this new functionality to filter the IOs
considered by the IO completion polling algorithms.
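
As a sketch of the idea (the real code is in the patches below), a
bucket callback can now opt an IO out of accounting by returning a
negative value. The callback name and the 4KiB size check here are
illustrative only, not part of this series:

	/* Illustrative only: account 4KiB IOs, bucketed by data
	 * direction; skip everything else.
	 */
	static int example_bucket_fn(const struct request *rq)
	{
		if (blk_rq_bytes(rq) != 4096)
			return -1;	/* negative: do not count this IO */

		return rq_data_dir(rq);	/* READ (0) or WRITE (1) */
	}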

This patchset applies cleanly on a83b576c9c25cf (block: fix stacked
driver stats init and free) in Jens' for-next tree.

I've lightly tested this using QEMU and a real NVMe low-latency
device. I do not have performance numbers yet. Feedback would be
appreciated!

[BTW this is my first submission from an all-new setup so apologies if
it does not come through correctly!]

Cc: Damien.LeMoal@wdc.com
Cc: osandov@osandov.com

Stephen Bates (2):
  blk-stat: convert blk-stat bucket callback to signed
  blk-stat: add a poll_size value to the request_queue struct

 block/blk-mq.c         | 17 +++++++++++++++--
 block/blk-stat.c       |  8 +++++---
 block/blk-stat.h       |  9 +++++----
 block/blk-sysfs.c      | 30 ++++++++++++++++++++++++++++++
 include/linux/blkdev.h |  1 +
 5 files changed, 56 insertions(+), 9 deletions(-)

-- 
2.7.4


* [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-26  2:18 [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling sbates
@ 2017-03-26  2:18 ` sbates
  2017-03-26  2:49   ` Omar Sandoval
  2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
  1 sibling, 1 reply; 12+ messages in thread
From: sbates @ 2017-03-26  2:18 UTC


From: Stephen Bates <sbates@raithlin.com>

In order to allow filtering of IO based on properties of the request
other than direction, allow the bucket function to return an int.

If the bucket callback returns a negative value, the request is not
counted in the stats accumulation.

Signed-off-by: Stephen Bates <sbates@raithlin.com>
---
 block/blk-stat.c | 8 +++++---
 block/blk-stat.h | 9 +++++----
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/block/blk-stat.c b/block/blk-stat.c
index 188b535..936adfb 100644
--- a/block/blk-stat.c
+++ b/block/blk-stat.c
@@ -17,9 +17,9 @@ struct blk_queue_stats {
 	spinlock_t lock;
 };
 
-unsigned int blk_stat_rq_ddir(const struct request *rq)
+int blk_stat_rq_ddir(const struct request *rq)
 {
-	return rq_data_dir(rq);
+	return (int)rq_data_dir(rq);
 }
 EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
 
@@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
 	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
 		if (blk_stat_is_active(cb)) {
 			bucket = cb->bucket_fn(rq);
+			if (bucket < 0)
+				continue;
 			stat = &this_cpu_ptr(cb->cpu_stat)[bucket];
 			__blk_stat_add(stat, value);
 		}
@@ -131,7 +133,7 @@ static void blk_stat_timer_fn(unsigned long data)
 
 struct blk_stat_callback *
 blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
-			unsigned int (*bucket_fn)(const struct request *),
+			int (*bucket_fn)(const struct request *),
 			unsigned int buckets, void *data)
 {
 	struct blk_stat_callback *cb;
diff --git a/block/blk-stat.h b/block/blk-stat.h
index 6ad5b8c..7417805 100644
--- a/block/blk-stat.h
+++ b/block/blk-stat.h
@@ -41,9 +41,10 @@ struct blk_stat_callback {
 
 	/**
 	 * @bucket_fn: Given a request, returns which statistics bucket it
-	 * should be accounted under.
+	 * should be accounted under. Return -1 for no bucket for this
+	 * request.
 	 */
-	unsigned int (*bucket_fn)(const struct request *);
+	int (*bucket_fn)(const struct request *);
 
 	/**
 	 * @buckets: Number of statistics buckets.
@@ -98,7 +99,7 @@ static inline u64 blk_stat_time(struct blk_issue_stat *stat)
  *
  * Return: Data direction of the request, either READ or WRITE.
  */
-unsigned int blk_stat_rq_ddir(const struct request *rq);
+int blk_stat_rq_ddir(const struct request *rq);
 
 /**
  * blk_stat_alloc_callback() - Allocate a block statistics callback.
@@ -113,7 +114,7 @@ unsigned int blk_stat_rq_ddir(const struct request *rq);
  */
 struct blk_stat_callback *
 blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
-			unsigned int (*bucket_fn)(const struct request *),
+			int (*bucket_fn)(const struct request *),
 			unsigned int buckets, void *data);
 
 /**
-- 
2.7.4


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-26  2:18 [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling sbates
  2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
@ 2017-03-26  2:18 ` sbates
  2017-03-28 10:50   ` Sagi Grimberg
  1 sibling, 1 reply; 12+ messages in thread
From: sbates @ 2017-03-26  2:18 UTC


From: Stephen Bates <sbates@raithlin.com>

In order to bucket IO for the polling algorithm we use a sysfs entry
to set the filter value. It is signed and we use it as follows:

 0   : No filtering. All IOs are considered in stat generation.
 > 0 : Filtering based on IOs of exactly this size only.
 < 0 : Filtering based on IOs less than or equal to -1 times this value.

Use this member to implement a new bucket function for the io polling
stats.
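
For example (illustrative values only):

 io_poll_size = 4096  : only 4KiB IOs feed the polling stats
 io_poll_size = -8192 : only IOs of 8KiB or less feed the polling stats
 io_poll_size = 0     : every IO feeds the polling stats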

Signed-off-by: Stephen Bates <sbates@raithlin.com>
---
 block/blk-mq.c         | 17 +++++++++++++++--
 block/blk-sysfs.c      | 30 ++++++++++++++++++++++++++++++
 include/linux/blkdev.h |  1 +
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5ff66f2..5ea9481 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2327,6 +2327,18 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	blk_mq_sysfs_register(q);
 }
 
+int blk_mq_poll_stats_bucket(const struct request *rq)
+{
+	int val = rq->q->poll_size;
+
+	if ((val == 0) ||
+	    (val > 0 && blk_rq_bytes(rq) == val) ||
+	    (val < 0 && blk_rq_bytes(rq) <= -val))
+		return (int)rq_data_dir(rq);
+
+	return -1;
+}
+
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 						  struct request_queue *q)
 {
@@ -2338,7 +2350,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 		goto err_exit;
 
 	q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn,
-					     blk_stat_rq_ddir, 2, q);
+					     blk_mq_poll_stats_bucket, 2, q);
 	if (!q->poll_cb)
 		goto err_exit;
 
@@ -2387,9 +2399,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	q->nr_requests = set->queue_depth;
 
 	/*
-	 * Default to classic polling
+	 * Default to classic polling. Default to considering all IO.
 	 */
 	q->poll_nsec = -1;
+	q->poll_size = 0;
 
 	if (set->ops->complete)
 		blk_queue_softirq_done(q, set->ops->complete);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index fa831cb..000d5db 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -394,6 +394,29 @@ static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
 	return count;
 }
 
+static ssize_t queue_poll_size_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%d\n", q->poll_size);
+}
+
+static ssize_t queue_poll_size_store(struct request_queue *q, const char *page,
+				     size_t count)
+{
+	int err, val;
+
+	if (!q->mq_ops || !q->mq_ops->poll)
+		return -EINVAL;
+
+	err = kstrtoint(page, 10, &val);
+	if (err < 0)
+		return err;
+
+	q->poll_size = val;
+
+	return count;
+}
+
+
 static ssize_t queue_poll_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
@@ -654,6 +677,12 @@ static struct queue_sysfs_entry queue_poll_entry = {
 	.store = queue_poll_store,
 };
 
+static struct queue_sysfs_entry queue_poll_size_entry = {
+	.attr = {.name = "io_poll_size", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_poll_size_show,
+	.store = queue_poll_size_store,
+};
+
 static struct queue_sysfs_entry queue_poll_delay_entry = {
 	.attr = {.name = "io_poll_delay", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_poll_delay_show,
@@ -710,6 +739,7 @@ static struct attribute *default_attrs[] = {
 	&queue_dax_entry.attr,
 	&queue_wb_lat_entry.attr,
 	&queue_poll_delay_entry.attr,
+	&queue_poll_size_entry.attr,
 	NULL,
 };
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1a7dc42..7ff5388 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -517,6 +517,7 @@ struct request_queue {
 
 	unsigned int		rq_timeout;
 	int			poll_nsec;
+	int			poll_size;
 
 	struct blk_stat_callback	*poll_cb;
 	struct blk_rq_stat	poll_stat[2];
-- 
2.7.4


* [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
@ 2017-03-26  2:49   ` Omar Sandoval
  2017-03-27 16:00     ` Stephen  Bates
  0 siblings, 1 reply; 12+ messages in thread
From: Omar Sandoval @ 2017-03-26  2:49 UTC


Hey, Stephen,

On Sat, Mar 25, 2017 at 08:18:11PM -0600, sbates@raithlin.com wrote:
> From: Stephen Bates <sbates@raithlin.com>
> 
> In order to allow filtering of IO based on properties of the request
> other than direction, allow the bucket function to return an int.
>
> If the bucket callback returns a negative value, the request is not
> counted in the stats accumulation.

This is great; I had it this way at first but didn't know if anyone
would want to omit stats in this way. A couple of comments below.

> Signed-off-by: Stephen Bates <sbates@raithlin.com>
> ---
>  block/blk-stat.c | 8 +++++---
>  block/blk-stat.h | 9 +++++----
>  2 files changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-stat.c b/block/blk-stat.c
> index 188b535..936adfb 100644
> --- a/block/blk-stat.c
> +++ b/block/blk-stat.c
> @@ -17,9 +17,9 @@ struct blk_queue_stats {
>  	spinlock_t lock;
>  };
>  
> -unsigned int blk_stat_rq_ddir(const struct request *rq)
> +int blk_stat_rq_ddir(const struct request *rq)
>  {
> -	return rq_data_dir(rq);
> +	return (int)rq_data_dir(rq);

The cast here isn't necessary, let's leave it off.

>  }
>  EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
>  
> @@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
>  	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
>  		if (blk_stat_is_active(cb)) {
>  			bucket = cb->bucket_fn(rq);
> +			if (bucket < 0)
> +				continue;

You also need to change the declaration of bucket to int above.

>  			stat = &this_cpu_ptr(cb->cpu_stat)[bucket];
>  			__blk_stat_add(stat, value);
>  		}
> @@ -131,7 +133,7 @@ static void blk_stat_timer_fn(unsigned long data)
>  
>  struct blk_stat_callback *
>  blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
> -			unsigned int (*bucket_fn)(const struct request *),
> +			int (*bucket_fn)(const struct request *),
>  			unsigned int buckets, void *data)
>  {
>  	struct blk_stat_callback *cb;
> diff --git a/block/blk-stat.h b/block/blk-stat.h
> index 6ad5b8c..7417805 100644
> --- a/block/blk-stat.h
> +++ b/block/blk-stat.h
> @@ -41,9 +41,10 @@ struct blk_stat_callback {
>  
>  	/**
>  	 * @bucket_fn: Given a request, returns which statistics bucket it
> -	 * should be accounted under.
> +	 * should be accounted under. Return -1 for no bucket for this
> +	 * request.
>  	 */
> -	unsigned int (*bucket_fn)(const struct request *);
> +	int (*bucket_fn)(const struct request *);
>  
>  	/**
>  	 * @buckets: Number of statistics buckets.
> @@ -98,7 +99,7 @@ static inline u64 blk_stat_time(struct blk_issue_stat *stat)
>   *
>   * Return: Data direction of the request, either READ or WRITE.
>   */
> -unsigned int blk_stat_rq_ddir(const struct request *rq);
> +int blk_stat_rq_ddir(const struct request *rq);
>  
>  /**
>   * blk_stat_alloc_callback() - Allocate a block statistics callback.
> @@ -113,7 +114,7 @@ unsigned int blk_stat_rq_ddir(const struct request *rq);
>   */
>  struct blk_stat_callback *
>  blk_stat_alloc_callback(void (*timer_fn)(struct blk_stat_callback *),
> -			unsigned int (*bucket_fn)(const struct request *),
> +			int (*bucket_fn)(const struct request *),
>  			unsigned int buckets, void *data);
>  
>  /**
> -- 
> 2.7.4
> 


* [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-26  2:49   ` Omar Sandoval
@ 2017-03-27 16:00     ` Stephen  Bates
  2017-03-27 16:01       ` Omar Sandoval
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Bates @ 2017-03-27 16:00 UTC


Thanks for the review Omar!

>>  
>> -unsigned int blk_stat_rq_ddir(const struct request *rq)
>> +int blk_stat_rq_ddir(const struct request *rq)
>>  {
>> -	return rq_data_dir(rq);
>> +	return (int)rq_data_dir(rq);
>
>The cast here isn't necessary, let's leave it off.
>

OK, I will address that in the respin!

>>  }
>>  EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
>>  
>> @@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
>>  	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
>>  		if (blk_stat_is_active(cb)) {
>>  			bucket = cb->bucket_fn(rq);
>> +			if (bucket < 0)
>> +				continue;
>
>You also need to change the declaration of bucket to int above.
>

Ummmm, it already is an int?

Stephen


* [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed
  2017-03-27 16:00     ` Stephen  Bates
@ 2017-03-27 16:01       ` Omar Sandoval
  0 siblings, 0 replies; 12+ messages in thread
From: Omar Sandoval @ 2017-03-27 16:01 UTC


On Mon, Mar 27, 2017 at 04:00:08PM +0000, Stephen Bates wrote:
> Thanks for the review Omar!
> 
> >>  
> >> -unsigned int blk_stat_rq_ddir(const struct request *rq)
> >> +int blk_stat_rq_ddir(const struct request *rq)
> >>  {
> >> -	return rq_data_dir(rq);
> >> +	return (int)rq_data_dir(rq);
> >
> >The cast here isn't necessary, let's leave it off.
> >
> 
> OK, I will address that in the respin!
> 
> >>  }
> >>  EXPORT_SYMBOL_GPL(blk_stat_rq_ddir);
> >>  
> >> @@ -100,6 +100,8 @@ void blk_stat_add(struct request *rq)
> >>  	list_for_each_entry_rcu(cb, &q->stats->callbacks, list) {
> >>  		if (blk_stat_is_active(cb)) {
> >>  			bucket = cb->bucket_fn(rq);
> >> +			if (bucket < 0)
> >> +				continue;
> >
> >You also need to change the declaration of bucket to int above.
> >
> 
> Ummmm, it already is an int?

Yup, I was looking at the wrong place, sorry.


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
@ 2017-03-28 10:50   ` Sagi Grimberg
  2017-03-28 19:38     ` Stephen  Bates
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2017-03-28 10:50 UTC


On 26/03/17 05:18, sbates@raithlin.com wrote:
> From: Stephen Bates <sbates@raithlin.com>
>
> In order to bucket IO for the polling algorithm we use a sysfs entry
> to set the filter value. It is signed and we use it as follows:
>
>  0   : No filtering. All IOs are considered in stat generation.
>  > 0 : Filtering based on IOs of exactly this size only.
>  < 0 : Filtering based on IOs less than or equal to -1 times this value.

I'd say that this is a fairly non-trivial semantic meaning to attach to this...

Is there any use for the exact-size match filter? If not then
I suggest we always make it (<=) in its semantics...

> Use this member to implement a new bucket function for the io polling
> stats.
>
> Signed-off-by: Stephen Bates <sbates@raithlin.com>
> ---
>  block/blk-mq.c         | 17 +++++++++++++++--
>  block/blk-sysfs.c      | 30 ++++++++++++++++++++++++++++++
>  include/linux/blkdev.h |  1 +
>  3 files changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 5ff66f2..5ea9481 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2327,6 +2327,18 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
>  	blk_mq_sysfs_register(q);
>  }
>
> +int blk_mq_poll_stats_bucket(const struct request *rq)
> +{
> +	int val = rq->q->poll_size;
> +
> +	if ((val == 0) ||
> +	    (val > 0 && blk_rq_bytes(rq) == val) ||
> +	    (val < 0 && blk_rq_bytes(rq) <= -val))
> +		return (int)rq_data_dir(rq);
> +
> +	return -1;
> +}
> +
>  struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  						  struct request_queue *q)
>  {
> @@ -2338,7 +2350,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  		goto err_exit;
>
>  	q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn,
> -					     blk_stat_rq_ddir, 2, q);
> +					     blk_mq_poll_stats_bucket, 2, q);
>  	if (!q->poll_cb)
>  		goto err_exit;
>
> @@ -2387,9 +2399,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  	q->nr_requests = set->queue_depth;
>
>  	/*
> -	 * Default to classic polling
> +	 * Default to classic polling. Default to considering all IO.
>  	 */
>  	q->poll_nsec = -1;
> +	q->poll_size = 0;
>
>  	if (set->ops->complete)
>  		blk_queue_softirq_done(q, set->ops->complete);
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index fa831cb..000d5db 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -394,6 +394,29 @@ static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
>  	return count;
>  }
>
> +static ssize_t queue_poll_size_show(struct request_queue *q, char *page)
> +{
> +	return sprintf(page, "%d\n", q->poll_size);
> +}
> +
> +static ssize_t queue_poll_size_store(struct request_queue *q, const char *page,
> +				     size_t count)
> +{
> +	int err, val;
> +
> +	if (!q->mq_ops || !q->mq_ops->poll)
> +		return -EINVAL;
> +
> +	err = kstrtoint(page, 10, &val);
> +	if (err < 0)
> +		return err;
> +
> +	q->poll_size = val;
> +
> +	return count;
> +}
> +
> +
>  static ssize_t queue_poll_show(struct request_queue *q, char *page)
>  {
>  	return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
> @@ -654,6 +677,12 @@ static struct queue_sysfs_entry queue_poll_entry = {
>  	.store = queue_poll_store,
>  };
>
> +static struct queue_sysfs_entry queue_poll_size_entry = {
> +	.attr = {.name = "io_poll_size", .mode = S_IRUGO | S_IWUSR },
> +	.show = queue_poll_size_show,
> +	.store = queue_poll_size_store,
> +};
> +
>  static struct queue_sysfs_entry queue_poll_delay_entry = {
>  	.attr = {.name = "io_poll_delay", .mode = S_IRUGO | S_IWUSR },
>  	.show = queue_poll_delay_show,
> @@ -710,6 +739,7 @@ static struct attribute *default_attrs[] = {
>  	&queue_dax_entry.attr,
>  	&queue_wb_lat_entry.attr,
>  	&queue_poll_delay_entry.attr,
> +	&queue_poll_size_entry.attr,
>  	NULL,
>  };
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 1a7dc42..7ff5388 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -517,6 +517,7 @@ struct request_queue {
>
>  	unsigned int		rq_timeout;
>  	int			poll_nsec;
> +	int			poll_size;
>
>  	struct blk_stat_callback	*poll_cb;
>  	struct blk_rq_stat	poll_stat[2];
>


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 10:50   ` Sagi Grimberg
@ 2017-03-28 19:38     ` Stephen  Bates
  2017-03-28 19:46       ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Bates @ 2017-03-28 19:38 UTC


>>
>>  In order to bucket IO for the polling algorithm we use a sysfs entry
>>  to set the filter value. It is signed and we use it as follows:
>>
>>   0   : No filtering. All IOs are considered in stat generation.
>>   > 0 : Filtering based on IOs of exactly this size only.
>>   < 0 : Filtering based on IOs less than or equal to -1 times this value.
>
> I'd say that this is a fairly non-trivial semantic meaning to attach to this...
>
> Is there any use for the exact-size match filter? If not then
> I suggest we always make it (<=) in its semantics...

Thanks for the review Sagi. I'd be OK going with <=0 as the exact match would normally be for minimal IO sizes (where <= and = are the same thing). I will see what other feedback I get and aim to do a respin soon.

Stephen


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 19:38     ` Stephen  Bates
@ 2017-03-28 19:46       ` Jens Axboe
  2017-03-28 19:58         ` Stephen  Bates
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2017-03-28 19:46 UTC


On 03/28/2017 01:38 PM, Stephen Bates wrote:
>>>
>>>  In order to bucket IO for the polling algorithm we use a sysfs entry
>>>  to set the filter value. It is signed and we use it as follows:
>>>
>>>   0   : No filtering. All IOs are considered in stat generation.
>>>   > 0 : Filtering based on IOs of exactly this size only.
>>>   < 0 : Filtering based on IOs less than or equal to -1 times this value.
>>
>> I'd say that this is a fairly non-trivial semantic meaning to attach to this...
>>
>> Is there any use for the exact-size match filter? If not then
>> I suggest we always make it (<=) in its semantics...
> 
> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
> match would normally be for minimal IO sizes (where <= and = are the
> same thing). I will see what other feedback I get and aim to do a
> respin soon.

No tunables for this, please. There's absolutely no reason why we should
need it.

-- 
Jens Axboe


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 19:46       ` Jens Axboe
@ 2017-03-28 19:58         ` Stephen  Bates
  2017-03-28 20:38           ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Bates @ 2017-03-28 19:58 UTC


>> 
>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>> match would normally be for minimal IO sizes (where <= and = are the
>> same thing). I will see what other feedback I get and aim to do a
>> respin soon.
>
> No tunables for this, please. There's absolutely no reason why we should
> need it.

Jens, by this do you mean you want to bucket only IOs that are exactly the minimum block size supported by the underlying block device? I was envisioning we might want to relax that in certain cases (e.g. bucketing 4KB and below going to a 512B device).

Stephen


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 19:58         ` Stephen  Bates
@ 2017-03-28 20:38           ` Jens Axboe
  2017-03-28 21:52             ` Stephen  Bates
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2017-03-28 20:38 UTC


On 03/28/2017 01:58 PM, Stephen Bates wrote:
>>>
>>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>>> match would normally be for minimal IO sizes (where <= and = are the
>>> same thing). I will see what other feedback I get and aim to do a
>>> respin soon.
>>
>> No tunables for this, please. There's absolutely no reason why we should
>> need it.
> 
> Jens, by this do you mean you want to bucket only IOs that are exactly
> the minimum block size supported by the underlying block device? I was
> envisioning we might want to relax that in certain cases (e.g.
> bucketing 4KB and below going to a 512B device).

Sorry, the above was a bit terse. I think a much better solution would
be to create a number of buckets (per data direction) and do stats on
all of them. The buckets would cover a reasonable range of request
sizes. Then when you poll for a given request, we can base our timed
number on the data direction AND size of it. You can get pretty far with
a few buckets:

512b
4k
8k
16k
32k
64k
128k

and you could even have your time estimation function turn these into
something sane. Or just use a composite of buckets, if you wish.
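
Roughly something like the following, as a sketch only (untested; the
function and constant names are made up here, and the bucket count and
clamping of oversized IO are illustration, not a requirement):

	/* Hypothetical: 16 buckets = 2 data directions x 8 sizes
	 * (512B, 1K, 2K, ..., 64K).
	 */
	#define BLK_MQ_POLL_STATS_BKTS 16

	static int blk_mq_poll_stats_bkt(const struct request *rq)
	{
		int ddir = rq_data_dir(rq);	/* READ (0) or WRITE (1) */
		int bytes = blk_rq_bytes(rq);
		int bucket;

		/* 512B -> buckets 0/1, 1K -> 2/3, ..., 64K -> 14/15 */
		bucket = ddir + 2 * (ilog2(bytes) - 9);

		if (bucket < 0)
			return -1;	/* below 512B: do not account */
		if (bucket >= BLK_MQ_POLL_STATS_BKTS)
			/* clamp larger IO into the top size bucket */
			return ddir + BLK_MQ_POLL_STATS_BKTS - 2;

		return bucket;
	}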

-- 
Jens Axboe


* [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct
  2017-03-28 20:38           ` Jens Axboe
@ 2017-03-28 21:52             ` Stephen  Bates
  0 siblings, 0 replies; 12+ messages in thread
From: Stephen Bates @ 2017-03-28 21:52 UTC


[Retrying as my new setup secretly converted to html format without telling me. Apologies for the resend.]

>>> 
>>> Thanks for the review Sagi. I'd be OK going with <=0 as the exact
>>> match would normally be for minimal IO sizes (where <= and = are the
>>> same thing). I will see what other feedback I get and aim to do a
>>> respin soon.
>> 
>> No tunables for this, please. There's absolutely no reason why we should
>> need it.
>
> Jens, by this do you mean you want to bucket only IOs that are exactly
> the minimum block size supported by the underlying block device? I was
> envisioning we might want to relax that in certain cases (e.g.
> bucketing 4KB and below going to a 512B device).
 
> Sorry, the above was a bit terse. I think a much better solution would
> be to create a number of buckets (per data direction) and do stats on
> all of them. The buckets would cover a reasonable range of request
> sizes. Then when you poll for a given request, we can base our timed
> number on the data direction AND size of it. You can get pretty far with
> a few buckets:
> 
> 512b
> 4k
> 8k
> 16k
> 32k
> 64k
> 128k
> 
> and you could even have your time estimation function turn these into
> something sane. Or just use a composite of buckets, if you wish.
 
I did go down this path initially but then moved away from it since we were focusing only on the smaller IO sizes. However I can definitely take a look at this again as I agree that it could be more useful in the long run.
 
I would like to keep my first patch in this series alive since I do think having the option to not bucket an IO is a useful thing to have.
 
I'll take all the feedback to date and work on a v2. Thanks!
 
Stephen


Thread overview: 12+ messages
2017-03-26  2:18 [PATCH 0/2] blk-stat: Add ability to not bucket IO, add this to IO polling sbates
2017-03-26  2:18 ` [PATCH 1/2] blk-stat: convert blk-stat bucket callback to signed sbates
2017-03-26  2:49   ` Omar Sandoval
2017-03-27 16:00     ` Stephen  Bates
2017-03-27 16:01       ` Omar Sandoval
2017-03-26  2:18 ` [PATCH 2/2] blk-stat: add a poll_size value to the request_queue struct sbates
2017-03-28 10:50   ` Sagi Grimberg
2017-03-28 19:38     ` Stephen  Bates
2017-03-28 19:46       ` Jens Axboe
2017-03-28 19:58         ` Stephen  Bates
2017-03-28 20:38           ` Jens Axboe
2017-03-28 21:52             ` Stephen  Bates
