All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leonardo Bras <leobras@redhat.com>
To: Jens Axboe <axboe@kernel.dk>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Palmer Dabbelt <palmer@rivosinc.com>,
	Leonardo Bras <leobras@redhat.com>, Guo Ren <guoren@kernel.org>,
	Valentin Schneider <vschneid@redhat.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Juergen Gross <jgross@suse.com>,
	Yury Norov <yury.norov@gmail.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 1/3] blk-mq: Move csd inside struct request so it's 32-byte aligned
Date: Sat, 20 May 2023 02:29:56 -0300	[thread overview]
Message-ID: <20230520052957.798486-2-leobras@redhat.com> (raw)
In-Reply-To: <20230520052957.798486-1-leobras@redhat.com>

Currently, request->csd has type struct __call_single_data.

call_single_data_t is defined in include/linux/smp.h :

/* Use __aligned() to avoid to use 2 cache lines for 1 csd */
typedef struct __call_single_data call_single_data_t
	__aligned(sizeof(struct __call_single_data));

As the comment above the typedef suggests, having struct __call_single_data
split between 2 cachelines causes the need to fetch / invalidate / bounce 2
cachelines instead of 1 when the cpu receiving the request gets to run the
requested function. This is usually bad for performance, due to one extra
memory access and 1 extra cacheline usage.

As an example with a 64-bit machine with
CONFIG_BLK_RQ_ALLOC_TIME=y
CONFIG_BLK_WBT=y
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_INLINE_ENCRYPTION=y

Will output pahole with:
struct request {
[...]
	union {
		struct __call_single_data csd;           /*   240    32 */
		u64                fifo_time;            /*   240     8 */
	};                                               /*   240    32 */
[...]
}

At this config, and any cacheline size between 32 and 256, will cause csd
to be split between 2 cachelines: csd->node (16 bytes) in the first
cacheline, and csd->func (8 bytes) & csd->info (8 bytes) in the second.

During blk_mq_complete_send_ipi(), csd->func and csd->info are getting
changed, and when it calls __smp_call_single_queue() csd->node will get
changed.

On the cpu which got the request, csd->func and csd->info get read by
__flush_smp_call_function_queue() and csd->node gets changed by
csd_unlock(), meaning the two cachelines containing csd will get accessed.

To avoid this, it would be necessary to make sure request->csd is placed
somewhere else in the struct, so it is always in a single cacheline,
while avoiding the introduction of any hole in the struct. In order to
achieve this, move request->csd to after 'struct block_device *part'.

The rationale of this change is that:
- There will be no CONFIG_*-dependent field before csd, so there is no
  chance of having unexpected holes on given configs.
- On 64-bit machines, csd will be at byte 96, and
- On 32-bit machines, csd will be at byte 64.

This means after this change, request->csd will always be cacheline aligned
for cachelines >= 32-bytes (64-bit) and cachelines >= 16-bytes (32-bits),
as long as struct request is cacheline aligned.

In above change, the struct request size is not supposed to change in any
configuration.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 include/linux/blk-mq.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 06caacd77ed6..44201e18681f 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -105,6 +105,11 @@ struct request {
 	};
 
 	struct block_device *part;
+
+	union {
+		struct __call_single_data csd;
+		u64 fifo_time;
+	};
 #ifdef CONFIG_BLK_RQ_ALLOC_TIME
 	/* Time that the first bio started allocating this request. */
 	u64 alloc_time_ns;
@@ -189,11 +194,6 @@ struct request {
 		} flush;
 	};
 
-	union {
-		struct __call_single_data csd;
-		u64 fifo_time;
-	};
-
 	/*
 	 * completion callback.
 	 */
-- 
2.40.1


  reply	other threads:[~2023-05-20  5:31 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-20  5:29 [RFC PATCH v2 0/3] Move usages of struct __call_single_data to call_single_data_t Leonardo Bras
2023-05-20  5:29 ` Leonardo Bras [this message]
2023-05-20  5:29 ` [RFC PATCH v2 2/3] blk-mq: Change request->csd type " Leonardo Bras
2023-05-20  5:29 ` [RFC PATCH v2 3/3] smp: Change signatures to use call_single_data_t Leonardo Bras
2023-06-13  3:51 ` [RFC PATCH v2 0/3] Move usages of struct __call_single_data to call_single_data_t Leonardo Bras Soares Passos
2023-07-04  7:22   ` Leonardo Brás
2023-08-29  0:55     ` Leonardo Brás
2023-08-29  2:29       ` Chengming Zhou
2023-08-30 22:29         ` Leonardo Brás
2023-08-30 22:48           ` Jens Axboe
2023-08-31  2:04             ` Leonardo Brás

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230520052957.798486-2-leobras@redhat.com \
    --to=leobras@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=guoren@kernel.org \
    --cc=jgross@suse.com \
    --cc=jpoimboe@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=palmer@rivosinc.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=vschneid@redhat.com \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.