* [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT
2018-01-10 0:29 [PATCHSET v2 0/4] struct request optimizations Jens Axboe
@ 2018-01-10 0:29 ` Jens Axboe
2018-01-10 18:25 ` Bart Van Assche
2018-01-10 18:35 ` Omar Sandoval
2018-01-10 0:29 ` [PATCH 2/4] block: add accessors for setting/querying request deadline Jens Axboe
` (2 subsequent siblings)
3 siblings, 2 replies; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 0:29 UTC (permalink / raw)
To: linux-block; +Cc: osandov, bart.vanassche, Jens Axboe
We don't need this to be an atomic flag, it can be a regular
flag. We either end up on the same CPU for the polling, in which
case the state is sane, or we did the sleep which would imply
the needed barrier to ensure we see the right state.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-mq-debugfs.c | 1 -
block/blk-mq.c | 5 ++---
block/blk.h | 2 --
include/linux/blkdev.h | 2 ++
4 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 8adc83786256..2a9c9f8b6162 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -294,7 +294,6 @@ static const char *const rqf_name[] = {
#define RQAF_NAME(name) [REQ_ATOM_##name] = #name
static const char *const rqaf_name[] = {
RQAF_NAME(COMPLETE),
- RQAF_NAME(POLL_SLEPT),
};
#undef RQAF_NAME
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9aa24c9508f9..faa31814983c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -483,7 +483,6 @@ void blk_mq_free_request(struct request *rq)
blk_put_rl(blk_rq_rl(rq));
blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
- clear_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
if (rq->tag != -1)
blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
if (sched_tag != -1)
@@ -2970,7 +2969,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
unsigned int nsecs;
ktime_t kt;
- if (test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
+ if (rq->rq_flags & RQF_MQ_POLL_SLEPT)
return false;
/*
@@ -2990,7 +2989,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
if (!nsecs)
return false;
- set_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
+ rq->rq_flags |= RQF_MQ_POLL_SLEPT;
/*
* This will be replaced with the stats tracking code, using
diff --git a/block/blk.h b/block/blk.h
index a68dbe312ea3..eb306c52121e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -124,8 +124,6 @@ void blk_account_io_done(struct request *req);
*/
enum rq_atomic_flags {
REQ_ATOM_COMPLETE = 0,
-
- REQ_ATOM_POLL_SLEPT,
};
/*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 007a7cf1f262..ba31674d8581 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -127,6 +127,8 @@ typedef __u32 __bitwise req_flags_t;
#define RQF_ZONE_WRITE_LOCKED ((__force req_flags_t)(1 << 19))
/* timeout is expired */
#define RQF_MQ_TIMEOUT_EXPIRED ((__force req_flags_t)(1 << 20))
+/* already slept for hybrid poll */
+#define RQF_MQ_POLL_SLEPT ((__force req_flags_t)(1 << 21))
/* flags that prevent us from merging requests: */
#define RQF_NOMERGE_FLAGS \
--
2.7.4
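A minimal user-space sketch of the flag pattern this patch switches to: a plain bit test and OR on rq_flags instead of atomic bit ops on a separate atomic_flags word. This is an illustration only, not kernel code; the bit value for RQF_MQ_POLL_SLEPT is taken from the blkdev.h hunk above and the actual hrtimer sleep is elided.

#include <stdbool.h>
#include <stdio.h>

typedef unsigned int req_flags_t;
#define RQF_MQ_POLL_SLEPT ((req_flags_t)(1 << 21))

struct request {
	req_flags_t rq_flags;		/* plain flags word, no atomic ops */
};

/* Mirrors the check/set in blk_mq_poll_hybrid_sleep(): plain load, plain OR. */
static bool hybrid_poll_sleep_once(struct request *rq)
{
	if (rq->rq_flags & RQF_MQ_POLL_SLEPT)	/* already slept for this rq */
		return false;

	rq->rq_flags |= RQF_MQ_POLL_SLEPT;	/* regular store, no locked RMW */
	/* ... the real code would set up an hrtimer and sleep here ... */
	return true;
}

int main(void)
{
	struct request rq = { .rq_flags = 0 };

	printf("first poll sleeps: %d\n", hybrid_poll_sleep_once(&rq));
	printf("second poll sleeps: %d\n", hybrid_poll_sleep_once(&rq));
	return 0;
}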
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT
2018-01-10 0:29 ` [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT Jens Axboe
@ 2018-01-10 18:25 ` Bart Van Assche
2018-01-10 18:32 ` Jens Axboe
2018-01-10 18:35 ` Omar Sandoval
1 sibling, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2018-01-10 18:25 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk; +Cc: osandov@fb.com
On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> We don't need this to be an atomic flag, it can be a regular
> flag. We either end up on the same CPU for the polling, in which
> case the state is sane, or we did the sleep which would imply
> the needed barrier to ensure we see the right state.

Please consider updating rqf_name[] in blk-mq-debugfs.c. Anyway:

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT
2018-01-10 18:25 ` Bart Van Assche
@ 2018-01-10 18:32 ` Jens Axboe
0 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 18:32 UTC (permalink / raw)
To: Bart Van Assche, linux-block@vger.kernel.org; +Cc: osandov@fb.com
On 1/10/18 11:25 AM, Bart Van Assche wrote:
> On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
>> We don't need this to be an atomic flag, it can be a regular
>> flag. We either end up on the same CPU for the polling, in which
>> case the state is sane, or we did the sleep which would imply
>> the needed barrier to ensure we see the right state.
>
> Please consider updating rqf_name[] in blk-mq-debugfs.c.
Good catch - I'll do that, and also add a small prep patch that syncs
rqf_name[] with the current set of flags; it looks like we're missing the
timeout and zone write locked entries.
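For context, rqf_name[] maps each rq_flags bit to a string for the debugfs output. A stand-alone sketch of that pattern with the entries in question (bit numbers from the blkdev.h hunk in patch 1; the RQF_NAME macro here is a local stand-in, not the kernel's):

#include <stdio.h>

#define RQF_ZONE_WRITE_LOCKED_BIT	19
#define RQF_MQ_TIMEOUT_EXPIRED_BIT	20
#define RQF_MQ_POLL_SLEPT_BIT		21

#define RQF_NAME(name) [RQF_##name##_BIT] = #name
static const char *const rqf_name[] = {
	RQF_NAME(ZONE_WRITE_LOCKED),
	RQF_NAME(MQ_TIMEOUT_EXPIRED),
	RQF_NAME(MQ_POLL_SLEPT),
};
#undef RQF_NAME

/* Poor man's blk_flags_show(): print the name of every set bit. */
static void show_flags(unsigned int flags)
{
	for (unsigned int i = 0; i < sizeof(rqf_name) / sizeof(rqf_name[0]); i++)
		if ((flags & (1u << i)) && rqf_name[i])
			printf("%s ", rqf_name[i]);
	printf("\n");
}

int main(void)
{
	show_flags((1u << RQF_MQ_POLL_SLEPT_BIT) | (1u << RQF_MQ_TIMEOUT_EXPIRED_BIT));
	return 0;
}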
--
Jens Axboe
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT
2018-01-10 0:29 ` [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT Jens Axboe
2018-01-10 18:25 ` Bart Van Assche
@ 2018-01-10 18:35 ` Omar Sandoval
1 sibling, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2018-01-10 18:35 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, osandov, bart.vanassche
On Tue, Jan 09, 2018 at 05:29:24PM -0700, Jens Axboe wrote:
> We don't need this to be an atomic flag, it can be a regular
> flag. We either end up on the same CPU for the polling, in which
> case the state is sane, or we did the sleep which would imply
> the needed barrier to ensure we see the right state.
Second Bart's comment, looks good otherwise.
Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 2/4] block: add accessors for setting/querying request deadline
2018-01-10 0:29 [PATCHSET v2 0/4] struct request optimizations Jens Axboe
2018-01-10 0:29 ` [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT Jens Axboe
@ 2018-01-10 0:29 ` Jens Axboe
2018-01-10 18:27 ` Bart Van Assche
2018-01-10 18:41 ` Omar Sandoval
2018-01-10 0:29 ` [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit Jens Axboe
2018-01-10 0:29 ` [PATCH 4/4] block: rearrange a few request fields for better cache layout Jens Axboe
3 siblings, 2 replies; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 0:29 UTC (permalink / raw)
To: linux-block; +Cc: osandov, bart.vanassche, Jens Axboe
We reduce the resolution of request expiry, but since we're already
using jiffies for this where resolution depends on the kernel
configuration and since the timeout resolution is coarse anyway,
that should be fine.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-mq.c | 2 +-
block/blk-timeout.c | 14 ++++++++------
block/blk.h | 15 +++++++++++++++
include/linux/blkdev.h | 4 +++-
4 files changed, 27 insertions(+), 8 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index faa31814983c..d875c51bcff8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -858,7 +858,7 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
while (true) {
start = read_seqcount_begin(&rq->gstate_seq);
gstate = READ_ONCE(rq->gstate);
- deadline = rq->deadline;
+ deadline = blk_rq_deadline(rq);
if (!read_seqcount_retry(&rq->gstate_seq, start))
break;
cond_resched();
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index ebe99963386c..a05e3676d24a 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -112,7 +112,9 @@ static void blk_rq_timed_out(struct request *req)
static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout,
unsigned int *next_set)
{
- if (time_after_eq(jiffies, rq->deadline)) {
+ const unsigned long deadline = blk_rq_deadline(rq);
+
+ if (time_after_eq(jiffies, deadline)) {
list_del_init(&rq->timeout_list);
/*
@@ -120,8 +122,8 @@ static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout
*/
if (!blk_mark_rq_complete(rq))
blk_rq_timed_out(rq);
- } else if (!*next_set || time_after(*next_timeout, rq->deadline)) {
- *next_timeout = rq->deadline;
+ } else if (!*next_set || time_after(*next_timeout, deadline)) {
+ *next_timeout = deadline;
*next_set = 1;
}
}
@@ -162,7 +164,7 @@ void blk_abort_request(struct request *req)
* immediately and that scan sees the new timeout value.
* No need for fancy synchronizations.
*/
- req->deadline = jiffies;
+ blk_rq_set_deadline(req, jiffies);
mod_timer(&req->q->timeout, 0);
} else {
if (blk_mark_rq_complete(req))
@@ -213,7 +215,7 @@ void blk_add_timer(struct request *req)
if (!req->timeout)
req->timeout = q->rq_timeout;
- req->deadline = jiffies + req->timeout;
+ blk_rq_set_deadline(req, jiffies + req->timeout);
req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED;
/*
@@ -228,7 +230,7 @@ void blk_add_timer(struct request *req)
* than an existing one, modify the timer. Round up to next nearest
* second.
*/
- expiry = blk_rq_timeout(round_jiffies_up(req->deadline));
+ expiry = blk_rq_timeout(round_jiffies_up(blk_rq_deadline(req)));
if (!timer_pending(&q->timeout) ||
time_before(expiry, q->timeout.expires)) {
diff --git a/block/blk.h b/block/blk.h
index eb306c52121e..bcd9cf7db0d4 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -237,6 +237,21 @@ static inline void req_set_nomerge(struct request_queue *q, struct request *req)
}
/*
+ * Steal a bit from this field for legacy IO path atomic IO marking. Note that
+ * setting the deadline clears the bottom bit, potentially clearing the
+ * completed bit. The user has to be OK with this (current ones are fine).
+ */
+static inline void blk_rq_set_deadline(struct request *rq, unsigned long time)
+{
+ rq->__deadline = time & ~0x1UL;
+}
+
+static inline unsigned long blk_rq_deadline(struct request *rq)
+{
+ return rq->__deadline & ~0x1UL;
+}
+
+/*
* Internal io_context interface
*/
void get_io_context(struct io_context *ioc);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ba31674d8581..aa6698cf483c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -257,7 +257,9 @@ struct request {
struct u64_stats_sync aborted_gstate_sync;
u64 aborted_gstate;
- unsigned long deadline;
+ /* access through blk_rq_set_deadline, blk_rq_deadline */
+ unsigned long __deadline;
+
struct list_head timeout_list;
/*
--
2.7.4
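A stand-alone sketch of what the new accessors do (an illustration only; struct request is reduced to the one field involved): bit 0 of __deadline is reserved, so a jiffies deadline is stored rounded down to an even value and the error is at most one tick.

#include <stdio.h>

struct request {
	unsigned long __deadline;
};

/* Same masking as the helpers this patch adds to block/blk.h. */
static inline void blk_rq_set_deadline(struct request *rq, unsigned long time)
{
	rq->__deadline = time & ~0x1UL;
}

static inline unsigned long blk_rq_deadline(struct request *rq)
{
	return rq->__deadline & ~0x1UL;
}

int main(void)
{
	struct request rq = { 0 };
	unsigned long deadline = 4294944795UL;	/* arbitrary odd jiffies value */

	blk_rq_set_deadline(&rq, deadline);
	printf("requested %lu, stored %lu, error %lu jiffy\n",
	       deadline, blk_rq_deadline(&rq), deadline - blk_rq_deadline(&rq));
	return 0;
}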
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 2/4] block: add accessors for setting/querying request deadline
2018-01-10 0:29 ` [PATCH 2/4] block: add accessors for setting/querying request deadline Jens Axboe
@ 2018-01-10 18:27 ` Bart Van Assche
2018-01-10 18:41 ` Omar Sandoval
1 sibling, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2018-01-10 18:27 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk; +Cc: osandov@fb.com
On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> We reduce the resolution of request expiry, but since we're already
> using jiffies for this where resolution depends on the kernel
> configuration and since the timeout resolution is coarse anyway,
> that should be fine.

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/4] block: add accessors for setting/querying request deadline
2018-01-10 0:29 ` [PATCH 2/4] block: add accessors for setting/querying request deadline Jens Axboe
2018-01-10 18:27 ` Bart Van Assche
@ 2018-01-10 18:41 ` Omar Sandoval
1 sibling, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2018-01-10 18:41 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, osandov, bart.vanassche
On Tue, Jan 09, 2018 at 05:29:25PM -0700, Jens Axboe wrote:
> We reduce the resolution of request expiry, but since we're already
> using jiffies for this where resolution depends on the kernel
> configuration and since the timeout resolution is coarse anyway,
> that should be fine.
Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> block/blk-mq.c | 2 +-
> block/blk-timeout.c | 14 ++++++++------
> block/blk.h | 15 +++++++++++++++
> include/linux/blkdev.h | 4 +++-
> 4 files changed, 27 insertions(+), 8 deletions(-)
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 0:29 [PATCHSET v2 0/4] struct request optimizations Jens Axboe
2018-01-10 0:29 ` [PATCH 1/4] block: remove REQ_ATOM_POLL_SLEPT Jens Axboe
2018-01-10 0:29 ` [PATCH 2/4] block: add accessors for setting/querying request deadline Jens Axboe
@ 2018-01-10 0:29 ` Jens Axboe
2018-01-10 18:29 ` Bart Van Assche
2018-01-10 0:29 ` [PATCH 4/4] block: rearrange a few request fields for better cache layout Jens Axboe
3 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 0:29 UTC (permalink / raw)
To: linux-block; +Cc: osandov, bart.vanassche, Jens Axboe
We only have one atomic flag left. Instead of using an entire
unsigned long for that, steal the bottom bit of the deadline
field that we already reserved.
Remove ->atomic_flags, since it's now unused.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-core.c | 2 +-
block/blk-mq-debugfs.c | 8 --------
block/blk-mq.c | 2 +-
block/blk.h | 19 +++++++++----------
include/linux/blkdev.h | 2 --
5 files changed, 11 insertions(+), 22 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index f843ae4f858d..7ba607527487 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2853,7 +2853,7 @@ void blk_start_request(struct request *req)
wbt_issue(req->q->rq_wb, &req->issue_stat);
}
- BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags));
+ BUG_ON(blk_rq_is_complete(req));
blk_add_timer(req);
}
EXPORT_SYMBOL(blk_start_request);
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 2a9c9f8b6162..ac99b78415ec 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -291,12 +291,6 @@ static const char *const rqf_name[] = {
};
#undef RQF_NAME
-#define RQAF_NAME(name) [REQ_ATOM_##name] = #name
-static const char *const rqaf_name[] = {
- RQAF_NAME(COMPLETE),
-};
-#undef RQAF_NAME
-
int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
{
const struct blk_mq_ops *const mq_ops = rq->q->mq_ops;
@@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
seq_puts(m, ", .rq_flags=");
blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
ARRAY_SIZE(rqf_name));
- seq_puts(m, ", .atomic_flags=");
- blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
rq->internal_tag);
if (mq_ops->show_rq)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d875c51bcff8..7248ee043651 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -294,7 +294,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
rq->rq_flags |= RQF_PREEMPT;
if (blk_queue_io_stat(data->q))
rq->rq_flags |= RQF_IO_STAT;
- /* do not touch atomic flags, it needs atomic ops against the timer */
rq->cpu = -1;
INIT_HLIST_NODE(&rq->hash);
RB_CLEAR_NODE(&rq->rb_node);
@@ -313,6 +312,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
rq->special = NULL;
/* tag was already set */
rq->extra_len = 0;
+ rq->__deadline = 0;
INIT_LIST_HEAD(&rq->timeout_list);
rq->timeout = 0;
diff --git a/block/blk.h b/block/blk.h
index bcd9cf7db0d4..c84ae0e21ebd 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -120,24 +120,23 @@ void blk_account_io_completion(struct request *req, unsigned int bytes);
void blk_account_io_done(struct request *req);
/*
- * Internal atomic flags for request handling
- */
-enum rq_atomic_flags {
- REQ_ATOM_COMPLETE = 0,
-};
-
-/*
* EH timer and IO completion will both attempt to 'grab' the request, make
- * sure that only one of them succeeds
+ * sure that only one of them succeeds. Steal the bottom bit of the
+ * __deadline field for this.
*/
static inline int blk_mark_rq_complete(struct request *rq)
{
- return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
+ return test_and_set_bit(0, &rq->__deadline);
}
static inline void blk_clear_rq_complete(struct request *rq)
{
- clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
+ clear_bit(0, &rq->__deadline);
+}
+
+static inline bool blk_rq_is_complete(struct request *rq)
+{
+ return test_bit(0, &rq->__deadline);
}
/*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index aa6698cf483c..d4b2f7bb18d6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -156,8 +156,6 @@ struct request {
int internal_tag;
- unsigned long atomic_flags;
-
/* the following two fields are internal, NEVER access directly */
unsigned int __data_len; /* total data len */
int tag;
--
2.7.4
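A user-space sketch of the resulting bit-stealing scheme (compiler atomic builtins stand in for the kernel's test_and_set_bit()/clear_bit()/test_bit(); the helpers mirror the ones in block/blk.h but this is an illustration, not the kernel code):

#include <stdbool.h>
#include <stdio.h>

struct request {
	unsigned long __deadline;	/* bit 0: complete mark, other bits: deadline */
};

static int blk_mark_rq_complete(struct request *rq)
{
	return (__atomic_fetch_or(&rq->__deadline, 1UL, __ATOMIC_SEQ_CST) & 1UL) != 0;
}

static void blk_clear_rq_complete(struct request *rq)
{
	__atomic_fetch_and(&rq->__deadline, ~1UL, __ATOMIC_SEQ_CST);
}

static bool blk_rq_is_complete(struct request *rq)
{
	return __atomic_load_n(&rq->__deadline, __ATOMIC_SEQ_CST) & 1UL;
}

static void blk_rq_set_deadline(struct request *rq, unsigned long time)
{
	rq->__deadline = time & ~1UL;	/* plain store; clears the complete bit */
}

static unsigned long blk_rq_deadline(struct request *rq)
{
	return rq->__deadline & ~1UL;
}

int main(void)
{
	struct request rq = { 0 };

	blk_rq_set_deadline(&rq, 1000);
	printf("first mark returns %d (was not yet complete)\n", blk_mark_rq_complete(&rq));
	printf("second mark returns %d (already complete)\n", blk_mark_rq_complete(&rq));
	printf("deadline still reads %lu\n", blk_rq_deadline(&rq));
	blk_clear_rq_complete(&rq);
	printf("complete after clear: %d\n", blk_rq_is_complete(&rq));
	return 0;
}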
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 0:29 ` [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit Jens Axboe
@ 2018-01-10 18:29 ` Bart Van Assche
2018-01-10 18:32 ` Jens Axboe
0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2018-01-10 18:29 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk; +Cc: osandov@fb.com
On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> @@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
>  	seq_puts(m, ", .rq_flags=");
>  	blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
>  		       ARRAY_SIZE(rqf_name));
> -	seq_puts(m, ", .atomic_flags=");
> -	blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
>  	seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
>  		   rq->internal_tag);
>  	if (mq_ops->show_rq)

Whether or not a request has been marked complete is very useful to know. Have you
considered to show the return value of blk_rq_is_complete() in the debugfs output?

Thanks,

Bart.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 18:29 ` Bart Van Assche
@ 2018-01-10 18:32 ` Jens Axboe
2018-01-10 18:33 ` Bart Van Assche
0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 18:32 UTC (permalink / raw)
To: Bart Van Assche, linux-block@vger.kernel.org; +Cc: osandov@fb.com
On 1/10/18 11:29 AM, Bart Van Assche wrote:
> On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
>> @@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
>> seq_puts(m, ", .rq_flags=");
>> blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
>> ARRAY_SIZE(rqf_name));
>> - seq_puts(m, ", .atomic_flags=");
>> - blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
>> seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
>> rq->internal_tag);
>> if (mq_ops->show_rq)
>
> Whether or not a request has been marked complete is very useful to know. Have you
> considered to show the return value of blk_rq_is_complete() in the debugfs output?
Yeah, that's a good point. Let me add that in lieu of the atomic flags that
are being killed. Are you fine with the patch then?
--
Jens Axboe
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 18:32 ` Jens Axboe
@ 2018-01-10 18:33 ` Bart Van Assche
2018-01-10 18:35 ` Jens Axboe
0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2018-01-10 18:33 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk; +Cc: osandov@fb.com
On Wed, 2018-01-10 at 11:32 -0700, Jens Axboe wrote:
> On 1/10/18 11:29 AM, Bart Van Assche wrote:
> > On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> > > @@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
> > >  	seq_puts(m, ", .rq_flags=");
> > >  	blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
> > >  		       ARRAY_SIZE(rqf_name));
> > > -	seq_puts(m, ", .atomic_flags=");
> > > -	blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
> > >  	seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
> > >  		   rq->internal_tag);
> > >  	if (mq_ops->show_rq)
> >
> > Whether or not a request has been marked complete is very useful to know. Have you
> > considered to show the return value of blk_rq_is_complete() in the debugfs output?
>
> Yeah, that's a good point. Let me add that in lieu of the atomic flags that
> are being killed. Are you fine with the patch then?

The rest of the patch looks fine to me. This is the only comment I had about this patch.

Bart.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 18:33 ` Bart Van Assche
@ 2018-01-10 18:35 ` Jens Axboe
2018-01-10 18:42 ` Bart Van Assche
0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 18:35 UTC (permalink / raw)
To: Bart Van Assche, linux-block@vger.kernel.org; +Cc: osandov@fb.com
On 1/10/18 11:33 AM, Bart Van Assche wrote:
> On Wed, 2018-01-10 at 11:32 -0700, Jens Axboe wrote:
>> On 1/10/18 11:29 AM, Bart Van Assche wrote:
>>> On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
>>>> @@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
>>>> seq_puts(m, ", .rq_flags=");
>>>> blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
>>>> ARRAY_SIZE(rqf_name));
>>>> - seq_puts(m, ", .atomic_flags=");
>>>> - blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
>>>> seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
>>>> rq->internal_tag);
>>>> if (mq_ops->show_rq)
>>>
>>> Whether or not a request has been marked complete is very useful to know. Have you
>>> considered to show the return value of blk_rq_is_complete() in the debugfs output?
>>
>> Yeah, that's a good point. Let me add that in lieu of the atomic flags that
>> are being killed. Are you fine with the patch then?
>
> The rest of the patch looks fine to me. This is the only comment I had about this patch.
http://git.kernel.dk/cgit/linux-block/commit/?h=blk-kill-atomic-flags&id=3d34009082f5e72137d6bb38cbc2ff6047f03fd1
Added a complete= entry for it.
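Presumably a one-liner in __blk_mq_debugfs_rq_show() next to the rq_flags output, along these lines (a guess based on the hunks quoted above, not a copy of the committed change):

	seq_printf(m, ", complete=%d", blk_rq_is_complete(rq));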
--
Jens Axboe
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 18:35 ` Jens Axboe
@ 2018-01-10 18:42 ` Bart Van Assche
2018-01-10 18:45 ` Omar Sandoval
0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2018-01-10 18:42 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk; +Cc: osandov@fb.com
On Wed, 2018-01-10 at 11:35 -0700, Jens Axboe wrote:
> On 1/10/18 11:33 AM, Bart Van Assche wrote:
> > On Wed, 2018-01-10 at 11:32 -0700, Jens Axboe wrote:
> > > On 1/10/18 11:29 AM, Bart Van Assche wrote:
> > > > On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> > > > > @@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
> > > > >  	seq_puts(m, ", .rq_flags=");
> > > > >  	blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
> > > > >  		       ARRAY_SIZE(rqf_name));
> > > > > -	seq_puts(m, ", .atomic_flags=");
> > > > > -	blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
> > > > >  	seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
> > > > >  		   rq->internal_tag);
> > > > >  	if (mq_ops->show_rq)
> > > >
> > > > Whether or not a request has been marked complete is very useful to know. Have you
> > > > considered to show the return value of blk_rq_is_complete() in the debugfs output?
> > >
> > > Yeah, that's a good point. Let me add that in lieu of the atomic flags that
> > > are being killed. Are you fine with the patch then?
> >
> > The rest of the patch looks fine to me. This is the only comment I had about this patch.
>
> http://git.kernel.dk/cgit/linux-block/commit/?h=blk-kill-atomic-flags&id=3d34009082f5e72137d6bb38cbc2ff6047f03fd1
>
> Added a complete= entry for it.

For that patch: Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>

Thanks,

Bart.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-10 18:42 ` Bart Van Assche
@ 2018-01-10 18:45 ` Omar Sandoval
0 siblings, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2018-01-10 18:45 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block@vger.kernel.org, axboe@kernel.dk, osandov@fb.com
On Wed, Jan 10, 2018 at 06:42:17PM +0000, Bart Van Assche wrote:
> On Wed, 2018-01-10 at 11:35 -0700, Jens Axboe wrote:
> > On 1/10/18 11:33 AM, Bart Van Assche wrote:
> > > On Wed, 2018-01-10 at 11:32 -0700, Jens Axboe wrote:
> > > > On 1/10/18 11:29 AM, Bart Van Assche wrote:
> > > > > On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> > > > > > @@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
> > > > > > seq_puts(m, ", .rq_flags=");
> > > > > > blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
> > > > > > ARRAY_SIZE(rqf_name));
> > > > > > - seq_puts(m, ", .atomic_flags=");
> > > > > > - blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
> > > > > > seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
> > > > > > rq->internal_tag);
> > > > > > if (mq_ops->show_rq)
> > > > >
> > > > > Whether or not a request has been marked complete is very useful to know. Have you
> > > > > considered to show the return value of blk_rq_is_complete() in the debugfs output?
> > > >
> > > > Yeah, that's a good point. Let me add that in lieu of the atomic flags that
> > > > are being killed. Are you fine with the patch then?
> > >
> > > The rest of the patch looks fine to me. This is the only comment I had about this patch.
> >
> > http://git.kernel.dk/cgit/linux-block/commit/?h=blk-kill-atomic-flags&id=3d34009082f5e72137d6bb38cbc2ff6047f03fd1
> >
> > Added a complete= entry for it.
>
> For that patch: Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Also Reviewed-by: Omar Sandoval <osandov@fb.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 4/4] block: rearrange a few request fields for better cache layout
2018-01-10 0:29 [PATCHSET v2 0/4] struct request optimizations Jens Axboe
` (2 preceding siblings ...)
2018-01-10 0:29 ` [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit Jens Axboe
@ 2018-01-10 0:29 ` Jens Axboe
2018-01-10 18:34 ` Bart Van Assche
2018-01-10 18:43 ` Omar Sandoval
3 siblings, 2 replies; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 0:29 UTC (permalink / raw)
To: linux-block; +Cc: osandov, bart.vanassche, Jens Axboe
Move completion related items (like the call single data) near the
end of the struct, instead of mixing them in with the initial
queueing related fields.
Move queuelist below the bio structures. Then we have all
queueing related bits in the first cache line.
This yields a 1.5-2% increase in IOPS for a null_blk test, both for
sync and for high thread count access. Sync test goes from 975K to
992K, 32-thread case from 20.8M to 21.2M IOPS.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-mq.c | 19 ++++++++++---------
include/linux/blkdev.h | 28 +++++++++++++++-------------
2 files changed, 25 insertions(+), 22 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7248ee043651..ec128001ea8b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -270,8 +270,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
struct request *rq = tags->static_rqs[tag];
- rq->rq_flags = 0;
-
if (data->flags & BLK_MQ_REQ_INTERNAL) {
rq->tag = -1;
rq->internal_tag = tag;
@@ -285,26 +283,23 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
data->hctx->tags->rqs[rq->tag] = rq;
}
- INIT_LIST_HEAD(&rq->queuelist);
/* csd/requeue_work/fifo_time is initialized before use */
rq->q = data->q;
rq->mq_ctx = data->ctx;
+ rq->rq_flags = 0;
+ rq->cpu = -1;
rq->cmd_flags = op;
if (data->flags & BLK_MQ_REQ_PREEMPT)
rq->rq_flags |= RQF_PREEMPT;
if (blk_queue_io_stat(data->q))
rq->rq_flags |= RQF_IO_STAT;
- rq->cpu = -1;
+ /* do not touch atomic flags, it needs atomic ops against the timer */
+ INIT_LIST_HEAD(&rq->queuelist);
INIT_HLIST_NODE(&rq->hash);
RB_CLEAR_NODE(&rq->rb_node);
rq->rq_disk = NULL;
rq->part = NULL;
rq->start_time = jiffies;
-#ifdef CONFIG_BLK_CGROUP
- rq->rl = NULL;
- set_start_time_ns(rq);
- rq->io_start_time_ns = 0;
-#endif
rq->nr_phys_segments = 0;
#if defined(CONFIG_BLK_DEV_INTEGRITY)
rq->nr_integrity_segments = 0;
@@ -321,6 +316,12 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
rq->end_io_data = NULL;
rq->next_rq = NULL;
+#ifdef CONFIG_BLK_CGROUP
+ rq->rl = NULL;
+ set_start_time_ns(rq);
+ rq->io_start_time_ns = 0;
+#endif
+
data->ctx->rq_dispatched[op_is_sync(op)]++;
return rq;
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d4b2f7bb18d6..71a9371c8182 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -141,12 +141,6 @@ typedef __u32 __bitwise req_flags_t;
* especially blk_mq_rq_ctx_init() to take care of the added fields.
*/
struct request {
- struct list_head queuelist;
- union {
- call_single_data_t csd;
- u64 fifo_time;
- };
-
struct request_queue *q;
struct blk_mq_ctx *mq_ctx;
@@ -164,6 +158,8 @@ struct request {
struct bio *bio;
struct bio *biotail;
+ struct list_head queuelist;
+
/*
* The hash is used inside the scheduler, and killed once the
* request reaches the dispatch list. The ipi_list is only used
@@ -211,19 +207,16 @@ struct request {
struct hd_struct *part;
unsigned long start_time;
struct blk_issue_stat issue_stat;
-#ifdef CONFIG_BLK_CGROUP
- struct request_list *rl; /* rl this rq is alloced from */
- unsigned long long start_time_ns;
- unsigned long long io_start_time_ns; /* when passed to hardware */
-#endif
/* Number of scatter-gather DMA addr+len pairs after
* physical address coalescing is performed.
*/
unsigned short nr_phys_segments;
+
#if defined(CONFIG_BLK_DEV_INTEGRITY)
unsigned short nr_integrity_segments;
#endif
+ unsigned short write_hint;
unsigned short ioprio;
unsigned int timeout;
@@ -232,8 +225,6 @@ struct request {
unsigned int extra_len; /* length of alignment and padding */
- unsigned short write_hint;
-
/*
* On blk-mq, the lower bits of ->gstate (generation number and
* state) carry the MQ_RQ_* state value and the upper bits the
@@ -260,6 +251,11 @@ struct request {
struct list_head timeout_list;
+ union {
+ call_single_data_t csd;
+ u64 fifo_time;
+ };
+
/*
* completion callback.
*/
@@ -268,6 +264,12 @@ struct request {
/* for bidi */
struct request *next_rq;
+
+#ifdef CONFIG_BLK_CGROUP
+ struct request_list *rl; /* rl this rq is alloced from */
+ unsigned long long start_time_ns;
+ unsigned long long io_start_time_ns; /* when passed to hardware */
+#endif
};
static inline bool blk_rq_is_scsi(struct request *rq)
--
2.7.4
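A small user-space way to sanity-check this kind of reordering (on a toy struct, not the real struct request, which is much larger; pahole on a built kernel gives the authoritative layout): print field offsets and see which 64-byte cache line each one lands in, assuming a cache-line-aligned allocation.

#include <stddef.h>
#include <stdio.h>

/* Toy layout following the idea of the patch: queueing-related fields up
 * front, completion-related fields (csd/fifo_time union, end_io) at the end.
 */
struct toy_request {
	void *q;
	void *mq_ctx;
	unsigned int cmd_flags;
	unsigned int rq_flags;
	int tag;
	int internal_tag;
	void *bio;
	void *biotail;
	struct { void *next, *prev; } queuelist;

	/* ... many fields elided ... */
	char middle[128];

	union {
		char csd[32];			/* stand-in for call_single_data_t */
		unsigned long long fifo_time;
	};
	void (*end_io)(void *, int);
	void *end_io_data;
};

#define SHOW(field) \
	printf("%-12s offset %3zu, cache line %zu\n", #field, \
	       offsetof(struct toy_request, field), \
	       offsetof(struct toy_request, field) / 64)

int main(void)
{
	SHOW(q);
	SHOW(bio);
	SHOW(queuelist);
	SHOW(fifo_time);
	SHOW(end_io);
	return 0;
}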
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4] block: rearrange a few request fields for better cache layout
2018-01-10 0:29 ` [PATCH 4/4] block: rearrange a few request fields for better cache layout Jens Axboe
@ 2018-01-10 18:34 ` Bart Van Assche
2018-01-10 18:43 ` Omar Sandoval
1 sibling, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2018-01-10 18:34 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk; +Cc: osandov@fb.com
On Tue, 2018-01-09 at 17:29 -0700, Jens Axboe wrote:
> Move completion related items (like the call single data) near the
> end of the struct, instead of mixing them in with the initial
> queueing related fields.
>
> Move queuelist below the bio structures. Then we have all
> queueing related bits in the first cache line.
>
> This yields a 1.5-2% increase in IOPS for a null_blk test, both for
> sync and for high thread count access. Sync test goes from 975K to
> 992K, 32-thread case from 20.8M to 21.2M IOPS.

That's a nice result!

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4] block: rearrange a few request fields for better cache layout
2018-01-10 0:29 ` [PATCH 4/4] block: rearrange a few request fields for better cache layout Jens Axboe
2018-01-10 18:34 ` Bart Van Assche
@ 2018-01-10 18:43 ` Omar Sandoval
2018-01-10 18:45 ` Jens Axboe
1 sibling, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2018-01-10 18:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, osandov, bart.vanassche
On Tue, Jan 09, 2018 at 05:29:27PM -0700, Jens Axboe wrote:
> Move completion related items (like the call single data) near the
> end of the struct, instead of mixing them in with the initial
> queueing related fields.
>
> Move queuelist below the bio structures. Then we have all
> queueing related bits in the first cache line.
>
> This yields a 1.5-2% increase in IOPS for a null_blk test, both for
> sync and for high thread count access. Sync test goes from 975K to
> 992K, 32-thread case from 20.8M to 21.2M IOPS.
One nit below, otherwise
Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> block/blk-mq.c | 19 ++++++++++---------
> include/linux/blkdev.h | 28 +++++++++++++++-------------
> 2 files changed, 25 insertions(+), 22 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 7248ee043651..ec128001ea8b 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -270,8 +270,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
> struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
> struct request *rq = tags->static_rqs[tag];
>
> - rq->rq_flags = 0;
> -
> if (data->flags & BLK_MQ_REQ_INTERNAL) {
> rq->tag = -1;
> rq->internal_tag = tag;
> @@ -285,26 +283,23 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
> data->hctx->tags->rqs[rq->tag] = rq;
> }
>
> - INIT_LIST_HEAD(&rq->queuelist);
> /* csd/requeue_work/fifo_time is initialized before use */
> rq->q = data->q;
> rq->mq_ctx = data->ctx;
> + rq->rq_flags = 0;
> + rq->cpu = -1;
> rq->cmd_flags = op;
> if (data->flags & BLK_MQ_REQ_PREEMPT)
> rq->rq_flags |= RQF_PREEMPT;
> if (blk_queue_io_stat(data->q))
> rq->rq_flags |= RQF_IO_STAT;
> - rq->cpu = -1;
> + /* do not touch atomic flags, it needs atomic ops against the timer */
This comment was just removed in a previous patch but it snuck back in.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4] block: rearrange a few request fields for better cache layout
2018-01-10 18:43 ` Omar Sandoval
@ 2018-01-10 18:45 ` Jens Axboe
0 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2018-01-10 18:45 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-block, osandov, bart.vanassche
On 1/10/18 11:43 AM, Omar Sandoval wrote:
>> - INIT_LIST_HEAD(&rq->queuelist);
>> /* csd/requeue_work/fifo_time is initialized before use */
>> rq->q = data->q;
>> rq->mq_ctx = data->ctx;
>> + rq->rq_flags = 0;
>> + rq->cpu = -1;
>> rq->cmd_flags = op;
>> if (data->flags & BLK_MQ_REQ_PREEMPT)
>> rq->rq_flags |= RQF_PREEMPT;
>> if (blk_queue_io_stat(data->q))
>> rq->rq_flags |= RQF_IO_STAT;
>> - rq->cpu = -1;
>> + /* do not touch atomic flags, it needs atomic ops against the timer */
>
> This comment was just removed in a previous patch but it snuck back in.
Eagle eyes - thanks, I will kill it.
--
Jens Axboe
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-09 18:26 [PATCHSET 0/4] struct request optimizations Jens Axboe
@ 2018-01-09 18:27 ` Jens Axboe
2018-01-09 18:43 ` Bart Van Assche
0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2018-01-09 18:27 UTC (permalink / raw)
To: linux-block; +Cc: Jens Axboe
We only have one atomic flag left. Instead of using an entire
unsigned long for that, steal the bottom bit of the deadline
field that we already reserved.
Remove ->atomic_flags, since it's now unused.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
block/blk-core.c | 2 +-
block/blk-mq-debugfs.c | 8 --------
block/blk.h | 19 +++++++++----------
include/linux/blkdev.h | 2 --
4 files changed, 10 insertions(+), 21 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index f843ae4f858d..7ba607527487 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2853,7 +2853,7 @@ void blk_start_request(struct request *req)
wbt_issue(req->q->rq_wb, &req->issue_stat);
}
- BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags));
+ BUG_ON(blk_rq_is_complete(req));
blk_add_timer(req);
}
EXPORT_SYMBOL(blk_start_request);
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 2a9c9f8b6162..ac99b78415ec 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -291,12 +291,6 @@ static const char *const rqf_name[] = {
};
#undef RQF_NAME
-#define RQAF_NAME(name) [REQ_ATOM_##name] = #name
-static const char *const rqaf_name[] = {
- RQAF_NAME(COMPLETE),
-};
-#undef RQAF_NAME
-
int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
{
const struct blk_mq_ops *const mq_ops = rq->q->mq_ops;
@@ -313,8 +307,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
seq_puts(m, ", .rq_flags=");
blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
ARRAY_SIZE(rqf_name));
- seq_puts(m, ", .atomic_flags=");
- blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
rq->internal_tag);
if (mq_ops->show_rq)
diff --git a/block/blk.h b/block/blk.h
index 8b26a8872e05..d5ae07cc4abb 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -120,24 +120,23 @@ void blk_account_io_completion(struct request *req, unsigned int bytes);
void blk_account_io_done(struct request *req);
/*
- * Internal atomic flags for request handling
- */
-enum rq_atomic_flags {
- REQ_ATOM_COMPLETE = 0,
-};
-
-/*
* EH timer and IO completion will both attempt to 'grab' the request, make
- * sure that only one of them succeeds
+ * sure that only one of them succeeds. Steal the bottom bit of the
+ * __deadline field for this.
*/
static inline int blk_mark_rq_complete(struct request *rq)
{
- return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
+ return test_and_set_bit(0, &rq->__deadline);
}
static inline void blk_clear_rq_complete(struct request *rq)
{
- clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
+ clear_bit(0, &rq->__deadline);
+}
+
+static inline bool blk_rq_is_complete(struct request *rq)
+{
+ return test_bit(0, &rq->__deadline);
}
/*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index aa6698cf483c..d4b2f7bb18d6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -156,8 +156,6 @@ struct request {
int internal_tag;
- unsigned long atomic_flags;
-
/* the following two fields are internal, NEVER access directly */
unsigned int __data_len; /* total data len */
int tag;
--
2.7.4
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-09 18:27 ` [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit Jens Axboe
@ 2018-01-09 18:43 ` Bart Van Assche
2018-01-09 18:44 ` Jens Axboe
0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2018-01-09 18:43 UTC (permalink / raw)
To: linux-block@vger.kernel.org, axboe@kernel.dk
On Tue, 2018-01-09 at 11:27 -0700, Jens Axboe wrote:
>  static inline int blk_mark_rq_complete(struct request *rq)
>  {
> -	return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
> +	return test_and_set_bit(0, &rq->__deadline);
>  }
>
>  static inline void blk_clear_rq_complete(struct request *rq)
>  {
> -	clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
> +	clear_bit(0, &rq->__deadline);
> +}
> +
> +static inline bool blk_rq_is_complete(struct request *rq)
> +{
> +	return test_bit(0, &rq->__deadline);
>  }

Hello Jens,

With this change setting or changing the deadline clears the COMPLETE flag.
Is that the intended behavior? If so, should perhaps a comment be added above
blk_rq_set_deadline()?

Thanks,

Bart.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-09 18:43 ` Bart Van Assche
@ 2018-01-09 18:44 ` Jens Axboe
2018-01-09 18:52 ` Jens Axboe
0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2018-01-09 18:44 UTC (permalink / raw)
To: Bart Van Assche, linux-block@vger.kernel.org
On 1/9/18 11:43 AM, Bart Van Assche wrote:
> On Tue, 2018-01-09 at 11:27 -0700, Jens Axboe wrote:
>> static inline int blk_mark_rq_complete(struct request *rq)
>> {
>> - return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
>> + return test_and_set_bit(0, &rq->__deadline);
>> }
>>
>> static inline void blk_clear_rq_complete(struct request *rq)
>> {
>> - clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
>> + clear_bit(0, &rq->__deadline);
>> +}
>> +
>> +static inline bool blk_rq_is_complete(struct request *rq)
>> +{
>> + return test_bit(0, &rq->__deadline);
>> }
>
> Hello Jens,
>
> With this change setting or changing the deadline clears the COMPLETE flag.
> Is that the intended behavior? If so, should perhaps a comment be added above
> blk_rq_set_deadline()?
Yeah, it's intentional. I can add a comment to that effect. It's only done
before queueing - except for the case where we force a timeout, but for that
it's only on the blk-mq side, which doesn't care.
--
Jens Axboe
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit
2018-01-09 18:44 ` Jens Axboe
@ 2018-01-09 18:52 ` Jens Axboe
0 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2018-01-09 18:52 UTC (permalink / raw)
To: Bart Van Assche, linux-block@vger.kernel.org
On 1/9/18 11:44 AM, Jens Axboe wrote:
> On 1/9/18 11:43 AM, Bart Van Assche wrote:
>> On Tue, 2018-01-09 at 11:27 -0700, Jens Axboe wrote:
>>> static inline int blk_mark_rq_complete(struct request *rq)
>>> {
>>> - return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
>>> + return test_and_set_bit(0, &rq->__deadline);
>>> }
>>>
>>> static inline void blk_clear_rq_complete(struct request *rq)
>>> {
>>> - clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
>>> + clear_bit(0, &rq->__deadline);
>>> +}
>>> +
>>> +static inline bool blk_rq_is_complete(struct request *rq)
>>> +{
>>> + return test_bit(0, &rq->__deadline);
>>> }
>>
>> Hello Jens,
>>
>> With this change setting or changing the deadline clears the COMPLETE flag.
>> Is that the intended behavior? If so, should perhaps a comment be added above
>> blk_rq_set_deadline()?
>
> Yeah, it's intentional. I can add a comment to that effect. It's only done
> before queueing - except for the case where we force a timeout, but for that
> it's only on the blk-mq side, which doesn't care.
Since we clear it when we init the request, we could also just leave the
bit intact when setting the deadline. That's probably the safer choice:
static inline void blk_rq_set_deadline(struct request *rq, unsigned long time)
{
	rq->__deadline = (time & ~0x1UL) | (rq->__deadline & 0x1UL);
}
I'll test that; previous testing didn't find anything wrong with clearing
the bit, but this does seem safer.
--
Jens Axboe
^ permalink raw reply [flat|nested] 23+ messages in thread