* [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg
@ 2025-08-06 9:48 Max Kellermann
2025-08-06 9:48 ` [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer Max Kellermann
` (4 more replies)
0 siblings, 5 replies; 16+ messages in thread
From: Max Kellermann @ 2025-08-06 9:48 UTC (permalink / raw)
To: xiubli, idryomov, amarkuze, ceph-devel, linux-kernel; +Cc: Max Kellermann
These patches reduce reloads of con->out_msg by passing pointers that
we already have in local variables (i.e. registers) as parameters.
Access to con->out_queue is now gone completely from v1/v2 and only
few references to con->out_msg remain. In the long run, I'd like to
get rid of con->out_msg completely and instead send the whole
con->out_queue in one kernel_sendmsg() call. This patch series helps
with preparing that.
Max Kellermann (3):
net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
net/ceph/messenger_v[12]: pass ceph_msg* instead of loading
con->out_msg
net/ceph/messenger: add empty check to ceph_con_get_out_msg()
include/linux/ceph/messenger.h | 6 +-
net/ceph/messenger.c | 12 +--
net/ceph/messenger_v1.c | 59 ++++++------
net/ceph/messenger_v2.c | 160 ++++++++++++++++-----------------
4 files changed, 119 insertions(+), 118 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
2025-08-06 9:48 [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Max Kellermann
@ 2025-08-06 9:48 ` Max Kellermann
2025-08-08 17:40 ` Viacheslav Dubeyko
2025-08-06 9:48 ` [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg Max Kellermann
` (3 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: Max Kellermann @ 2025-08-06 9:48 UTC (permalink / raw)
To: xiubli, idryomov, amarkuze, ceph-devel, linux-kernel; +Cc: Max Kellermann
The caller in messenger_v1.c loads it anyway, so let's keep the
pointer in the register instead of reloading it from memory. This
eliminates a tiny bit of unnecessary overhead.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
include/linux/ceph/messenger.h | 2 +-
net/ceph/messenger.c | 4 ++--
net/ceph/messenger_v1.c | 3 +--
3 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 1717cc57cdac..57fa70c6edfb 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -548,7 +548,7 @@ void ceph_addr_set_port(struct ceph_entity_addr *addr, int p);
void ceph_con_process_message(struct ceph_connection *con);
int ceph_con_in_msg_alloc(struct ceph_connection *con,
struct ceph_msg_header *hdr, int *skip);
-void ceph_con_get_out_msg(struct ceph_connection *con);
+struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con);
/* messenger_v1.c */
int ceph_con_v1_try_read(struct ceph_connection *con);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d1b5705dc0c6..7ab2176b977e 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2109,7 +2109,7 @@ int ceph_con_in_msg_alloc(struct ceph_connection *con,
return ret;
}
-void ceph_con_get_out_msg(struct ceph_connection *con)
+struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con)
{
struct ceph_msg *msg;
@@ -2140,7 +2140,7 @@ void ceph_con_get_out_msg(struct ceph_connection *con)
* message or in case of a fault.
*/
WARN_ON(con->out_msg);
- con->out_msg = ceph_msg_get(msg);
+ return con->out_msg = ceph_msg_get(msg);
}
/*
diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
index 0cb61c76b9b8..eebe4e19d75a 100644
--- a/net/ceph/messenger_v1.c
+++ b/net/ceph/messenger_v1.c
@@ -210,8 +210,7 @@ static void prepare_write_message(struct ceph_connection *con)
&con->v1.out_temp_ack);
}
- ceph_con_get_out_msg(con);
- m = con->out_msg;
+ m = ceph_con_get_out_msg(con);
dout("prepare_write_message %p seq %lld type %d len %d+%d+%zd\n",
m, con->out_seq, le16_to_cpu(m->hdr.type),
--
2.47.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg
2025-08-06 9:48 [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Max Kellermann
2025-08-06 9:48 ` [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer Max Kellermann
@ 2025-08-06 9:48 ` Max Kellermann
2025-08-08 17:41 ` Viacheslav Dubeyko
2025-08-06 9:48 ` [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg() Max Kellermann
` (2 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: Max Kellermann @ 2025-08-06 9:48 UTC (permalink / raw)
To: xiubli, idryomov, amarkuze, ceph-devel, linux-kernel; +Cc: Max Kellermann
This pointer is in a register anyway, so let's use that instead of
reloading from memory everywhere.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
include/linux/ceph/messenger.h | 4 +-
net/ceph/messenger.c | 4 +-
net/ceph/messenger_v1.c | 43 +++++----
net/ceph/messenger_v2.c | 158 ++++++++++++++++-----------------
4 files changed, 102 insertions(+), 107 deletions(-)
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 57fa70c6edfb..585844a28237 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -553,7 +553,7 @@ struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con);
/* messenger_v1.c */
int ceph_con_v1_try_read(struct ceph_connection *con);
int ceph_con_v1_try_write(struct ceph_connection *con);
-void ceph_con_v1_revoke(struct ceph_connection *con);
+void ceph_con_v1_revoke(struct ceph_connection *con, struct ceph_msg *msg);
void ceph_con_v1_revoke_incoming(struct ceph_connection *con);
bool ceph_con_v1_opened(struct ceph_connection *con);
void ceph_con_v1_reset_session(struct ceph_connection *con);
@@ -562,7 +562,7 @@ void ceph_con_v1_reset_protocol(struct ceph_connection *con);
/* messenger_v2.c */
int ceph_con_v2_try_read(struct ceph_connection *con);
int ceph_con_v2_try_write(struct ceph_connection *con);
-void ceph_con_v2_revoke(struct ceph_connection *con);
+void ceph_con_v2_revoke(struct ceph_connection *con, struct ceph_msg *msg);
void ceph_con_v2_revoke_incoming(struct ceph_connection *con);
bool ceph_con_v2_opened(struct ceph_connection *con);
void ceph_con_v2_reset_session(struct ceph_connection *con);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 7ab2176b977e..424fb2769b71 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1792,9 +1792,9 @@ void ceph_msg_revoke(struct ceph_msg *msg)
WARN_ON(con->state != CEPH_CON_S_OPEN);
dout("%s con %p msg %p was sending\n", __func__, con, msg);
if (ceph_msgr2(from_msgr(con->msgr)))
- ceph_con_v2_revoke(con);
+ ceph_con_v2_revoke(con, msg);
else
- ceph_con_v1_revoke(con);
+ ceph_con_v1_revoke(con, msg);
ceph_msg_put(con->out_msg);
con->out_msg = NULL;
} else {
diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
index eebe4e19d75a..516f2eeb122a 100644
--- a/net/ceph/messenger_v1.c
+++ b/net/ceph/messenger_v1.c
@@ -169,10 +169,8 @@ static void prepare_message_data(struct ceph_msg *msg, u32 data_len)
* Prepare footer for currently outgoing message, and finish things
* off. Assumes out_kvec* are already valid.. we just add on to the end.
*/
-static void prepare_write_message_footer(struct ceph_connection *con)
+static void prepare_write_message_footer(struct ceph_connection *con, struct ceph_msg *m)
{
- struct ceph_msg *m = con->out_msg;
-
m->footer.flags |= CEPH_MSG_FOOTER_COMPLETE;
dout("prepare_write_message_footer %p\n", con);
@@ -230,31 +228,31 @@ static void prepare_write_message(struct ceph_connection *con)
/* fill in hdr crc and finalize hdr */
crc = crc32c(0, &m->hdr, offsetof(struct ceph_msg_header, crc));
- con->out_msg->hdr.crc = cpu_to_le32(crc);
- memcpy(&con->v1.out_hdr, &con->out_msg->hdr, sizeof(con->v1.out_hdr));
+ m->hdr.crc = cpu_to_le32(crc);
+ memcpy(&con->v1.out_hdr, &m->hdr, sizeof(con->v1.out_hdr));
/* fill in front and middle crc, footer */
crc = crc32c(0, m->front.iov_base, m->front.iov_len);
- con->out_msg->footer.front_crc = cpu_to_le32(crc);
+ m->footer.front_crc = cpu_to_le32(crc);
if (m->middle) {
crc = crc32c(0, m->middle->vec.iov_base,
m->middle->vec.iov_len);
- con->out_msg->footer.middle_crc = cpu_to_le32(crc);
+ m->footer.middle_crc = cpu_to_le32(crc);
} else
- con->out_msg->footer.middle_crc = 0;
+ m->footer.middle_crc = 0;
dout("%s front_crc %u middle_crc %u\n", __func__,
- le32_to_cpu(con->out_msg->footer.front_crc),
- le32_to_cpu(con->out_msg->footer.middle_crc));
- con->out_msg->footer.flags = 0;
+ le32_to_cpu(m->footer.front_crc),
+ le32_to_cpu(m->footer.middle_crc));
+ m->footer.flags = 0;
/* is there a data payload? */
- con->out_msg->footer.data_crc = 0;
+ m->footer.data_crc = 0;
if (m->data_length) {
- prepare_message_data(con->out_msg, m->data_length);
+ prepare_message_data(m, m->data_length);
con->v1.out_more = 1; /* data + footer will follow */
} else {
/* no, queue up footer too and be done */
- prepare_write_message_footer(con);
+ prepare_write_message_footer(con, m);
}
ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
@@ -461,9 +459,8 @@ static int write_partial_kvec(struct ceph_connection *con)
* 0 -> socket full, but more to do
* <0 -> error
*/
-static int write_partial_message_data(struct ceph_connection *con)
+static int write_partial_message_data(struct ceph_connection *con, struct ceph_msg *msg)
{
- struct ceph_msg *msg = con->out_msg;
struct ceph_msg_data_cursor *cursor = &msg->cursor;
bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC);
u32 crc;
@@ -515,7 +512,7 @@ static int write_partial_message_data(struct ceph_connection *con)
else
msg->footer.flags |= CEPH_MSG_FOOTER_NOCRC;
con_out_kvec_reset(con);
- prepare_write_message_footer(con);
+ prepare_write_message_footer(con, msg);
return 1; /* must return > 0 to indicate success */
}
@@ -1471,6 +1468,7 @@ int ceph_con_v1_try_read(struct ceph_connection *con)
*/
int ceph_con_v1_try_write(struct ceph_connection *con)
{
+ struct ceph_msg *msg;
int ret = 1;
dout("try_write start %p state %d\n", con, con->state);
@@ -1517,14 +1515,15 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
}
/* msg pages? */
- if (con->out_msg) {
+ msg = con->out_msg;
+ if (msg) {
if (con->v1.out_msg_done) {
- ceph_msg_put(con->out_msg);
+ ceph_msg_put(msg);
con->out_msg = NULL; /* we're done with this one */
goto do_next;
}
- ret = write_partial_message_data(con);
+ ret = write_partial_message_data(con, msg);
if (ret == 1)
goto more; /* we need to send the footer, too! */
if (ret == 0)
@@ -1563,10 +1562,8 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
return ret;
}
-void ceph_con_v1_revoke(struct ceph_connection *con)
+void ceph_con_v1_revoke(struct ceph_connection *con, struct ceph_msg *msg)
{
- struct ceph_msg *msg = con->out_msg;
-
WARN_ON(con->v1.out_skip);
/* footer */
if (con->v1.out_msg_done) {
diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index 5483b4eed94e..90109fa0fe60 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -1589,10 +1589,10 @@ static int prepare_ack(struct ceph_connection *con)
return prepare_control(con, FRAME_TAG_ACK, con->v2.out_buf, 8);
}
-static void prepare_epilogue_plain(struct ceph_connection *con, bool aborted)
+static void prepare_epilogue_plain(struct ceph_connection *con, struct ceph_msg *msg, bool aborted)
{
dout("%s con %p msg %p aborted %d crcs %u %u %u\n", __func__, con,
- con->out_msg, aborted, con->v2.out_epil.front_crc,
+ msg, aborted, con->v2.out_epil.front_crc,
con->v2.out_epil.middle_crc, con->v2.out_epil.data_crc);
encode_epilogue_plain(con, aborted);
@@ -1603,10 +1603,8 @@ static void prepare_epilogue_plain(struct ceph_connection *con, bool aborted)
* For "used" empty segments, crc is -1. For unused (trailing)
* segments, crc is 0.
*/
-static void prepare_message_plain(struct ceph_connection *con)
+static void prepare_message_plain(struct ceph_connection *con, struct ceph_msg *msg)
{
- struct ceph_msg *msg = con->out_msg;
-
prepare_head_plain(con, con->v2.out_buf,
sizeof(struct ceph_msg_header2), NULL, 0, false);
@@ -1647,7 +1645,7 @@ static void prepare_message_plain(struct ceph_connection *con)
con->v2.out_state = OUT_S_QUEUE_DATA;
} else {
con->v2.out_epil.data_crc = 0;
- prepare_epilogue_plain(con, false);
+ prepare_epilogue_plain(con, msg, false);
con->v2.out_state = OUT_S_FINISH_MESSAGE;
}
}
@@ -1659,7 +1657,7 @@ static void prepare_message_plain(struct ceph_connection *con)
* allocate pages for the entire tail of the message (currently up
* to ~32M) and two sgs arrays (up to ~256K each)...
*/
-static int prepare_message_secure(struct ceph_connection *con)
+static int prepare_message_secure(struct ceph_connection *con, struct ceph_msg *msg)
{
void *zerop = page_address(ceph_zero_page);
struct sg_table enc_sgt = {};
@@ -1674,7 +1672,7 @@ static int prepare_message_secure(struct ceph_connection *con)
if (ret)
return ret;
- tail_len = tail_onwire_len(con->out_msg, true);
+ tail_len = tail_onwire_len(msg, true);
if (!tail_len) {
/*
* Empty message: once the head is written,
@@ -1685,7 +1683,7 @@ static int prepare_message_secure(struct ceph_connection *con)
}
encode_epilogue_secure(con, false);
- ret = setup_message_sgs(&sgt, con->out_msg, zerop, zerop, zerop,
+ ret = setup_message_sgs(&sgt, msg, zerop, zerop, zerop,
&con->v2.out_epil, NULL, 0, false);
if (ret)
goto out;
@@ -1714,7 +1712,7 @@ static int prepare_message_secure(struct ceph_connection *con)
goto out;
dout("%s con %p msg %p sg_cnt %d enc_page_cnt %d\n", __func__, con,
- con->out_msg, sgt.orig_nents, enc_page_cnt);
+ msg, sgt.orig_nents, enc_page_cnt);
con->v2.out_state = OUT_S_QUEUE_ENC_PAGE;
out:
@@ -1723,19 +1721,19 @@ static int prepare_message_secure(struct ceph_connection *con)
return ret;
}
-static int prepare_message(struct ceph_connection *con)
+static int prepare_message(struct ceph_connection *con, struct ceph_msg *msg)
{
int lens[] = {
sizeof(struct ceph_msg_header2),
- front_len(con->out_msg),
- middle_len(con->out_msg),
- data_len(con->out_msg)
+ front_len(msg),
+ middle_len(msg),
+ data_len(msg)
};
struct ceph_frame_desc desc;
int ret;
dout("%s con %p msg %p logical %d+%d+%d+%d\n", __func__, con,
- con->out_msg, lens[0], lens[1], lens[2], lens[3]);
+ msg, lens[0], lens[1], lens[2], lens[3]);
if (con->in_seq > con->in_seq_acked) {
dout("%s con %p in_seq_acked %llu -> %llu\n", __func__, con,
@@ -1746,15 +1744,15 @@ static int prepare_message(struct ceph_connection *con)
reset_out_kvecs(con);
init_frame_desc(&desc, FRAME_TAG_MESSAGE, lens, 4);
encode_preamble(&desc, con->v2.out_buf);
- fill_header2(CTRL_BODY(con->v2.out_buf), &con->out_msg->hdr,
+ fill_header2(CTRL_BODY(con->v2.out_buf), &msg->hdr,
con->in_seq_acked);
if (con_secure(con)) {
- ret = prepare_message_secure(con);
+ ret = prepare_message_secure(con, msg);
if (ret)
return ret;
} else {
- prepare_message_plain(con);
+ prepare_message_plain(con, msg);
}
ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
@@ -3184,20 +3182,20 @@ int ceph_con_v2_try_read(struct ceph_connection *con)
}
}
-static void queue_data(struct ceph_connection *con)
+static void queue_data(struct ceph_connection *con, struct ceph_msg *msg)
{
struct bio_vec bv;
con->v2.out_epil.data_crc = -1;
- ceph_msg_data_cursor_init(&con->v2.out_cursor, con->out_msg,
- data_len(con->out_msg));
+ ceph_msg_data_cursor_init(&con->v2.out_cursor, msg,
+ data_len(msg));
get_bvec_at(&con->v2.out_cursor, &bv);
set_out_bvec(con, &bv, true);
con->v2.out_state = OUT_S_QUEUE_DATA_CONT;
}
-static void queue_data_cont(struct ceph_connection *con)
+static void queue_data_cont(struct ceph_connection *con, struct ceph_msg *msg)
{
struct bio_vec bv;
@@ -3218,7 +3216,7 @@ static void queue_data_cont(struct ceph_connection *con)
* we are done.
*/
reset_out_kvecs(con);
- prepare_epilogue_plain(con, false);
+ prepare_epilogue_plain(con, msg, false);
con->v2.out_state = OUT_S_FINISH_MESSAGE;
}
@@ -3250,7 +3248,7 @@ static void queue_enc_page(struct ceph_connection *con)
con->v2.out_state = OUT_S_FINISH_MESSAGE;
}
-static void queue_zeros(struct ceph_connection *con)
+static void queue_zeros(struct ceph_connection *con, struct ceph_msg *msg)
{
dout("%s con %p out_zero %d\n", __func__, con, con->v2.out_zero);
@@ -3267,7 +3265,7 @@ static void queue_zeros(struct ceph_connection *con)
* Once it's written, we are done patching up for the revoke.
*/
reset_out_kvecs(con);
- prepare_epilogue_plain(con, true);
+ prepare_epilogue_plain(con, msg, true);
con->v2.out_state = OUT_S_FINISH_MESSAGE;
}
@@ -3309,18 +3307,18 @@ static int populate_out_iter(struct ceph_connection *con)
switch (con->v2.out_state) {
case OUT_S_QUEUE_DATA:
WARN_ON(!con->out_msg);
- queue_data(con);
+ queue_data(con, con->out_msg);
goto populated;
case OUT_S_QUEUE_DATA_CONT:
WARN_ON(!con->out_msg);
- queue_data_cont(con);
+ queue_data_cont(con, con->out_msg);
goto populated;
case OUT_S_QUEUE_ENC_PAGE:
queue_enc_page(con);
goto populated;
case OUT_S_QUEUE_ZEROS:
WARN_ON(con->out_msg); /* revoked */
- queue_zeros(con);
+ queue_zeros(con, con->out_msg);
goto populated;
case OUT_S_FINISH_MESSAGE:
finish_message(con);
@@ -3340,8 +3338,8 @@ static int populate_out_iter(struct ceph_connection *con)
return ret;
}
} else if (!list_empty(&con->out_queue)) {
- ceph_con_get_out_msg(con);
- ret = prepare_message(con);
+ struct ceph_msg *msg = ceph_con_get_out_msg(con);
+ ret = prepare_message(con, msg);
if (ret) {
pr_err("prepare_message failed: %d\n", ret);
return ret;
@@ -3453,17 +3451,17 @@ static u32 crc32c_zeros(u32 crc, int zero_len)
return crc;
}
-static void prepare_zero_front(struct ceph_connection *con, int resid)
+static void prepare_zero_front(struct ceph_connection *con, struct ceph_msg *msg, int resid)
{
int sent;
- WARN_ON(!resid || resid > front_len(con->out_msg));
- sent = front_len(con->out_msg) - resid;
+ WARN_ON(!resid || resid > front_len(msg));
+ sent = front_len(msg) - resid;
dout("%s con %p sent %d resid %d\n", __func__, con, sent, resid);
if (sent) {
con->v2.out_epil.front_crc =
- crc32c(-1, con->out_msg->front.iov_base, sent);
+ crc32c(-1, msg->front.iov_base, sent);
con->v2.out_epil.front_crc =
crc32c_zeros(con->v2.out_epil.front_crc, resid);
} else {
@@ -3474,17 +3472,17 @@ static void prepare_zero_front(struct ceph_connection *con, int resid)
out_zero_add(con, resid);
}
-static void prepare_zero_middle(struct ceph_connection *con, int resid)
+static void prepare_zero_middle(struct ceph_connection *con, struct ceph_msg *msg, int resid)
{
int sent;
- WARN_ON(!resid || resid > middle_len(con->out_msg));
- sent = middle_len(con->out_msg) - resid;
+ WARN_ON(!resid || resid > middle_len(msg));
+ sent = middle_len(msg) - resid;
dout("%s con %p sent %d resid %d\n", __func__, con, sent, resid);
if (sent) {
con->v2.out_epil.middle_crc =
- crc32c(-1, con->out_msg->middle->vec.iov_base, sent);
+ crc32c(-1, msg->middle->vec.iov_base, sent);
con->v2.out_epil.middle_crc =
crc32c_zeros(con->v2.out_epil.middle_crc, resid);
} else {
@@ -3495,61 +3493,61 @@ static void prepare_zero_middle(struct ceph_connection *con, int resid)
out_zero_add(con, resid);
}
-static void prepare_zero_data(struct ceph_connection *con)
+static void prepare_zero_data(struct ceph_connection *con, struct ceph_msg *msg)
{
dout("%s con %p\n", __func__, con);
- con->v2.out_epil.data_crc = crc32c_zeros(-1, data_len(con->out_msg));
- out_zero_add(con, data_len(con->out_msg));
+ con->v2.out_epil.data_crc = crc32c_zeros(-1, data_len(msg));
+ out_zero_add(con, data_len(msg));
}
-static void revoke_at_queue_data(struct ceph_connection *con)
+static void revoke_at_queue_data(struct ceph_connection *con, struct ceph_msg *msg)
{
int boundary;
int resid;
- WARN_ON(!data_len(con->out_msg));
+ WARN_ON(!data_len(msg));
WARN_ON(!iov_iter_is_kvec(&con->v2.out_iter));
resid = iov_iter_count(&con->v2.out_iter);
- boundary = front_len(con->out_msg) + middle_len(con->out_msg);
+ boundary = front_len(msg) + middle_len(msg);
if (resid > boundary) {
resid -= boundary;
WARN_ON(resid > MESSAGE_HEAD_PLAIN_LEN);
dout("%s con %p was sending head\n", __func__, con);
- if (front_len(con->out_msg))
- prepare_zero_front(con, front_len(con->out_msg));
- if (middle_len(con->out_msg))
- prepare_zero_middle(con, middle_len(con->out_msg));
- prepare_zero_data(con);
+ if (front_len(msg))
+ prepare_zero_front(con, msg, front_len(msg));
+ if (middle_len(msg))
+ prepare_zero_middle(con, msg, middle_len(msg));
+ prepare_zero_data(con, msg);
WARN_ON(iov_iter_count(&con->v2.out_iter) != resid);
con->v2.out_state = OUT_S_QUEUE_ZEROS;
return;
}
- boundary = middle_len(con->out_msg);
+ boundary = middle_len(msg);
if (resid > boundary) {
resid -= boundary;
dout("%s con %p was sending front\n", __func__, con);
- prepare_zero_front(con, resid);
- if (middle_len(con->out_msg))
- prepare_zero_middle(con, middle_len(con->out_msg));
- prepare_zero_data(con);
- queue_zeros(con);
+ prepare_zero_front(con, msg, resid);
+ if (middle_len(msg))
+ prepare_zero_middle(con, msg, middle_len(msg));
+ prepare_zero_data(con, msg);
+ queue_zeros(con, msg);
return;
}
WARN_ON(!resid);
dout("%s con %p was sending middle\n", __func__, con);
- prepare_zero_middle(con, resid);
- prepare_zero_data(con);
- queue_zeros(con);
+ prepare_zero_middle(con, msg, resid);
+ prepare_zero_data(con, msg);
+ queue_zeros(con, msg);
}
-static void revoke_at_queue_data_cont(struct ceph_connection *con)
+static void revoke_at_queue_data_cont(struct ceph_connection *con, struct ceph_msg *msg)
{
int sent, resid; /* current piece of data */
- WARN_ON(!data_len(con->out_msg));
+ WARN_ON(!data_len(msg));
WARN_ON(!iov_iter_is_bvec(&con->v2.out_iter));
resid = iov_iter_count(&con->v2.out_iter);
WARN_ON(!resid || resid > con->v2.out_bvec.bv_len);
@@ -3568,10 +3566,10 @@ static void revoke_at_queue_data_cont(struct ceph_connection *con)
con->v2.out_iter.count -= resid;
out_zero_add(con, con->v2.out_cursor.total_resid);
- queue_zeros(con);
+ queue_zeros(con, msg);
}
-static void revoke_at_finish_message(struct ceph_connection *con)
+static void revoke_at_finish_message(struct ceph_connection *con, struct ceph_msg *msg)
{
int boundary;
int resid;
@@ -3579,39 +3577,39 @@ static void revoke_at_finish_message(struct ceph_connection *con)
WARN_ON(!iov_iter_is_kvec(&con->v2.out_iter));
resid = iov_iter_count(&con->v2.out_iter);
- if (!front_len(con->out_msg) && !middle_len(con->out_msg) &&
- !data_len(con->out_msg)) {
+ if (!front_len(msg) && !middle_len(msg) &&
+ !data_len(msg)) {
WARN_ON(!resid || resid > MESSAGE_HEAD_PLAIN_LEN);
dout("%s con %p was sending head (empty message) - noop\n",
__func__, con);
return;
}
- boundary = front_len(con->out_msg) + middle_len(con->out_msg) +
+ boundary = front_len(msg) + middle_len(msg) +
CEPH_EPILOGUE_PLAIN_LEN;
if (resid > boundary) {
resid -= boundary;
WARN_ON(resid > MESSAGE_HEAD_PLAIN_LEN);
dout("%s con %p was sending head\n", __func__, con);
- if (front_len(con->out_msg))
- prepare_zero_front(con, front_len(con->out_msg));
- if (middle_len(con->out_msg))
- prepare_zero_middle(con, middle_len(con->out_msg));
+ if (front_len(msg))
+ prepare_zero_front(con, msg, front_len(msg));
+ if (middle_len(msg))
+ prepare_zero_middle(con, msg, middle_len(msg));
con->v2.out_iter.count -= CEPH_EPILOGUE_PLAIN_LEN;
WARN_ON(iov_iter_count(&con->v2.out_iter) != resid);
con->v2.out_state = OUT_S_QUEUE_ZEROS;
return;
}
- boundary = middle_len(con->out_msg) + CEPH_EPILOGUE_PLAIN_LEN;
+ boundary = middle_len(msg) + CEPH_EPILOGUE_PLAIN_LEN;
if (resid > boundary) {
resid -= boundary;
dout("%s con %p was sending front\n", __func__, con);
- prepare_zero_front(con, resid);
- if (middle_len(con->out_msg))
- prepare_zero_middle(con, middle_len(con->out_msg));
+ prepare_zero_front(con, msg, resid);
+ if (middle_len(msg))
+ prepare_zero_middle(con, msg, middle_len(msg));
con->v2.out_iter.count -= CEPH_EPILOGUE_PLAIN_LEN;
- queue_zeros(con);
+ queue_zeros(con, msg);
return;
}
@@ -3619,9 +3617,9 @@ static void revoke_at_finish_message(struct ceph_connection *con)
if (resid > boundary) {
resid -= boundary;
dout("%s con %p was sending middle\n", __func__, con);
- prepare_zero_middle(con, resid);
+ prepare_zero_middle(con, msg, resid);
con->v2.out_iter.count -= CEPH_EPILOGUE_PLAIN_LEN;
- queue_zeros(con);
+ queue_zeros(con, msg);
return;
}
@@ -3629,7 +3627,7 @@ static void revoke_at_finish_message(struct ceph_connection *con)
dout("%s con %p was sending epilogue - noop\n", __func__, con);
}
-void ceph_con_v2_revoke(struct ceph_connection *con)
+void ceph_con_v2_revoke(struct ceph_connection *con, struct ceph_msg *msg)
{
WARN_ON(con->v2.out_zero);
@@ -3642,13 +3640,13 @@ void ceph_con_v2_revoke(struct ceph_connection *con)
switch (con->v2.out_state) {
case OUT_S_QUEUE_DATA:
- revoke_at_queue_data(con);
+ revoke_at_queue_data(con, msg);
break;
case OUT_S_QUEUE_DATA_CONT:
- revoke_at_queue_data_cont(con);
+ revoke_at_queue_data_cont(con, msg);
break;
case OUT_S_FINISH_MESSAGE:
- revoke_at_finish_message(con);
+ revoke_at_finish_message(con, msg);
break;
default:
WARN(1, "bad out_state %d", con->v2.out_state);
--
2.47.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg()
2025-08-06 9:48 [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Max Kellermann
2025-08-06 9:48 ` [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer Max Kellermann
2025-08-06 9:48 ` [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg Max Kellermann
@ 2025-08-06 9:48 ` Max Kellermann
2025-08-08 17:41 ` Viacheslav Dubeyko
2025-10-09 11:18 ` Ilya Dryomov
2025-08-08 17:43 ` [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Viacheslav Dubeyko
2025-08-11 17:05 ` Viacheslav Dubeyko
4 siblings, 2 replies; 16+ messages in thread
From: Max Kellermann @ 2025-08-06 9:48 UTC (permalink / raw)
To: xiubli, idryomov, amarkuze, ceph-devel, linux-kernel; +Cc: Max Kellermann
This moves the list_empty() checks from the two callers (v1 and v2)
into the base messenger.c library. Now the v1/v2 specializations do
not need to know about con->out_queue; that implementation detail is
now hidden behind the ceph_con_get_out_msg() function.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
net/ceph/messenger.c | 4 +++-
net/ceph/messenger_v1.c | 15 ++++++++++-----
net/ceph/messenger_v2.c | 4 ++--
3 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 424fb2769b71..8886c38a55d2 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2113,7 +2113,9 @@ struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con)
{
struct ceph_msg *msg;
- BUG_ON(list_empty(&con->out_queue));
+ if (list_empty(&con->out_queue))
+ return NULL;
+
msg = list_first_entry(&con->out_queue, struct ceph_msg, list_head);
WARN_ON(msg->con != con);
diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
index 516f2eeb122a..5eb6cfdbc494 100644
--- a/net/ceph/messenger_v1.c
+++ b/net/ceph/messenger_v1.c
@@ -189,12 +189,18 @@ static void prepare_write_message_footer(struct ceph_connection *con, struct cep
/*
* Prepare headers for the next outgoing message.
+ *
+ * @return false if there are no outgoing messages
*/
-static void prepare_write_message(struct ceph_connection *con)
+static bool prepare_write_message(struct ceph_connection *con)
{
struct ceph_msg *m;
u32 crc;
+ m = ceph_con_get_out_msg(con);
+ if (m == NULL)
+ return false;
+
con_out_kvec_reset(con);
con->v1.out_msg_done = false;
@@ -208,8 +214,6 @@ static void prepare_write_message(struct ceph_connection *con)
&con->v1.out_temp_ack);
}
- m = ceph_con_get_out_msg(con);
-
dout("prepare_write_message %p seq %lld type %d len %d+%d+%zd\n",
m, con->out_seq, le16_to_cpu(m->hdr.type),
le32_to_cpu(m->hdr.front_len), le32_to_cpu(m->hdr.middle_len),
@@ -256,6 +260,8 @@ static void prepare_write_message(struct ceph_connection *con)
}
ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
+
+ return true;
}
/*
@@ -1543,8 +1549,7 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
goto more;
}
/* is anything else pending? */
- if (!list_empty(&con->out_queue)) {
- prepare_write_message(con);
+ if (prepare_write_message(con)) {
goto more;
}
if (con->in_seq > con->in_seq_acked) {
diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index 90109fa0fe60..e0b5f2e2582d 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -3292,6 +3292,7 @@ static void finish_message(struct ceph_connection *con)
static int populate_out_iter(struct ceph_connection *con)
{
+ struct ceph_msg *msg;
int ret;
dout("%s con %p state %d out_state %d\n", __func__, con, con->state,
@@ -3337,8 +3338,7 @@ static int populate_out_iter(struct ceph_connection *con)
pr_err("prepare_keepalive2 failed: %d\n", ret);
return ret;
}
- } else if (!list_empty(&con->out_queue)) {
- struct ceph_msg *msg = ceph_con_get_out_msg(con);
+ } else if ((msg = ceph_con_get_out_msg(con)) != NULL) {
ret = prepare_message(con, msg);
if (ret) {
pr_err("prepare_message failed: %d\n", ret);
--
2.47.2
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
2025-08-06 9:48 ` [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer Max Kellermann
@ 2025-08-08 17:40 ` Viacheslav Dubeyko
2025-08-11 23:29 ` Viacheslav Dubeyko
0 siblings, 1 reply; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-08 17:40 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org, max.kellermann@ionos.com, Xiubo Li,
idryomov@gmail.com, linux-kernel@vger.kernel.org, Alex Markuze
On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> The caller in messenger_v1.c loads it anyway, so let's keep the
> pointer in the register instead of reloading it from memory. This
> eliminates a tiny bit of unnecessary overhead.
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Looks good.
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
> ---
> include/linux/ceph/messenger.h | 2 +-
> net/ceph/messenger.c | 4 ++--
> net/ceph/messenger_v1.c | 3 +--
> 3 files changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 1717cc57cdac..57fa70c6edfb 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -548,7 +548,7 @@ void ceph_addr_set_port(struct ceph_entity_addr *addr, int p);
> void ceph_con_process_message(struct ceph_connection *con);
> int ceph_con_in_msg_alloc(struct ceph_connection *con,
> struct ceph_msg_header *hdr, int *skip);
> -void ceph_con_get_out_msg(struct ceph_connection *con);
> +struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con);
>
> /* messenger_v1.c */
> int ceph_con_v1_try_read(struct ceph_connection *con);
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index d1b5705dc0c6..7ab2176b977e 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2109,7 +2109,7 @@ int ceph_con_in_msg_alloc(struct ceph_connection *con,
> return ret;
> }
>
> -void ceph_con_get_out_msg(struct ceph_connection *con)
> +struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con)
> {
> struct ceph_msg *msg;
>
> @@ -2140,7 +2140,7 @@ void ceph_con_get_out_msg(struct ceph_connection *con)
> * message or in case of a fault.
> */
> WARN_ON(con->out_msg);
> - con->out_msg = ceph_msg_get(msg);
> + return con->out_msg = ceph_msg_get(msg);
> }
>
> /*
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 0cb61c76b9b8..eebe4e19d75a 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -210,8 +210,7 @@ static void prepare_write_message(struct ceph_connection *con)
> &con->v1.out_temp_ack);
> }
>
> - ceph_con_get_out_msg(con);
> - m = con->out_msg;
> + m = ceph_con_get_out_msg(con);
>
> dout("prepare_write_message %p seq %lld type %d len %d+%d+%zd\n",
> m, con->out_seq, le16_to_cpu(m->hdr.type),
--
Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg
2025-08-06 9:48 ` [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg Max Kellermann
@ 2025-08-08 17:41 ` Viacheslav Dubeyko
2025-08-11 23:28 ` Viacheslav Dubeyko
0 siblings, 1 reply; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-08 17:41 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org, max.kellermann@ionos.com, Xiubo Li,
idryomov@gmail.com, linux-kernel@vger.kernel.org, Alex Markuze
On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> This pointer is in a register anyway, so let's use that instead of
> reloading from memory everywhere.
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Looks good.
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
> ---
> include/linux/ceph/messenger.h | 4 +-
> net/ceph/messenger.c | 4 +-
> net/ceph/messenger_v1.c | 43 +++++----
> net/ceph/messenger_v2.c | 158 ++++++++++++++++-----------------
> 4 files changed, 102 insertions(+), 107 deletions(-)
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 57fa70c6edfb..585844a28237 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -553,7 +553,7 @@ struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con);
> /* messenger_v1.c */
> int ceph_con_v1_try_read(struct ceph_connection *con);
> int ceph_con_v1_try_write(struct ceph_connection *con);
> -void ceph_con_v1_revoke(struct ceph_connection *con);
> +void ceph_con_v1_revoke(struct ceph_connection *con, struct ceph_msg *msg);
> void ceph_con_v1_revoke_incoming(struct ceph_connection *con);
> bool ceph_con_v1_opened(struct ceph_connection *con);
> void ceph_con_v1_reset_session(struct ceph_connection *con);
> @@ -562,7 +562,7 @@ void ceph_con_v1_reset_protocol(struct ceph_connection *con);
> /* messenger_v2.c */
> int ceph_con_v2_try_read(struct ceph_connection *con);
> int ceph_con_v2_try_write(struct ceph_connection *con);
> -void ceph_con_v2_revoke(struct ceph_connection *con);
> +void ceph_con_v2_revoke(struct ceph_connection *con, struct ceph_msg *msg);
> void ceph_con_v2_revoke_incoming(struct ceph_connection *con);
> bool ceph_con_v2_opened(struct ceph_connection *con);
> void ceph_con_v2_reset_session(struct ceph_connection *con);
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 7ab2176b977e..424fb2769b71 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -1792,9 +1792,9 @@ void ceph_msg_revoke(struct ceph_msg *msg)
> WARN_ON(con->state != CEPH_CON_S_OPEN);
> dout("%s con %p msg %p was sending\n", __func__, con, msg);
> if (ceph_msgr2(from_msgr(con->msgr)))
> - ceph_con_v2_revoke(con);
> + ceph_con_v2_revoke(con, msg);
> else
> - ceph_con_v1_revoke(con);
> + ceph_con_v1_revoke(con, msg);
> ceph_msg_put(con->out_msg);
> con->out_msg = NULL;
> } else {
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index eebe4e19d75a..516f2eeb122a 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -169,10 +169,8 @@ static void prepare_message_data(struct ceph_msg *msg, u32 data_len)
> * Prepare footer for currently outgoing message, and finish things
> * off. Assumes out_kvec* are already valid.. we just add on to the end.
> */
> -static void prepare_write_message_footer(struct ceph_connection *con)
> +static void prepare_write_message_footer(struct ceph_connection *con, struct ceph_msg *m)
> {
> - struct ceph_msg *m = con->out_msg;
> -
> m->footer.flags |= CEPH_MSG_FOOTER_COMPLETE;
>
> dout("prepare_write_message_footer %p\n", con);
> @@ -230,31 +228,31 @@ static void prepare_write_message(struct ceph_connection *con)
>
> /* fill in hdr crc and finalize hdr */
> crc = crc32c(0, &m->hdr, offsetof(struct ceph_msg_header, crc));
> - con->out_msg->hdr.crc = cpu_to_le32(crc);
> - memcpy(&con->v1.out_hdr, &con->out_msg->hdr, sizeof(con->v1.out_hdr));
> + m->hdr.crc = cpu_to_le32(crc);
> + memcpy(&con->v1.out_hdr, &m->hdr, sizeof(con->v1.out_hdr));
>
> /* fill in front and middle crc, footer */
> crc = crc32c(0, m->front.iov_base, m->front.iov_len);
> - con->out_msg->footer.front_crc = cpu_to_le32(crc);
> + m->footer.front_crc = cpu_to_le32(crc);
> if (m->middle) {
> crc = crc32c(0, m->middle->vec.iov_base,
> m->middle->vec.iov_len);
> - con->out_msg->footer.middle_crc = cpu_to_le32(crc);
> + m->footer.middle_crc = cpu_to_le32(crc);
> } else
> - con->out_msg->footer.middle_crc = 0;
> + m->footer.middle_crc = 0;
> dout("%s front_crc %u middle_crc %u\n", __func__,
> - le32_to_cpu(con->out_msg->footer.front_crc),
> - le32_to_cpu(con->out_msg->footer.middle_crc));
> - con->out_msg->footer.flags = 0;
> + le32_to_cpu(m->footer.front_crc),
> + le32_to_cpu(m->footer.middle_crc));
> + m->footer.flags = 0;
>
> /* is there a data payload? */
> - con->out_msg->footer.data_crc = 0;
> + m->footer.data_crc = 0;
> if (m->data_length) {
> - prepare_message_data(con->out_msg, m->data_length);
> + prepare_message_data(m, m->data_length);
> con->v1.out_more = 1; /* data + footer will follow */
> } else {
> /* no, queue up footer too and be done */
> - prepare_write_message_footer(con);
> + prepare_write_message_footer(con, m);
> }
>
> ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
> @@ -461,9 +459,8 @@ static int write_partial_kvec(struct ceph_connection *con)
> * 0 -> socket full, but more to do
> * <0 -> error
> */
> -static int write_partial_message_data(struct ceph_connection *con)
> +static int write_partial_message_data(struct ceph_connection *con, struct ceph_msg *msg)
> {
> - struct ceph_msg *msg = con->out_msg;
> struct ceph_msg_data_cursor *cursor = &msg->cursor;
> bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC);
> u32 crc;
> @@ -515,7 +512,7 @@ static int write_partial_message_data(struct ceph_connection *con)
> else
> msg->footer.flags |= CEPH_MSG_FOOTER_NOCRC;
> con_out_kvec_reset(con);
> - prepare_write_message_footer(con);
> + prepare_write_message_footer(con, msg);
>
> return 1; /* must return > 0 to indicate success */
> }
> @@ -1471,6 +1468,7 @@ int ceph_con_v1_try_read(struct ceph_connection *con)
> */
> int ceph_con_v1_try_write(struct ceph_connection *con)
> {
> + struct ceph_msg *msg;
> int ret = 1;
>
> dout("try_write start %p state %d\n", con, con->state);
> @@ -1517,14 +1515,15 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
> }
>
> /* msg pages? */
> - if (con->out_msg) {
> + msg = con->out_msg;
> + if (msg) {
> if (con->v1.out_msg_done) {
> - ceph_msg_put(con->out_msg);
> + ceph_msg_put(msg);
> con->out_msg = NULL; /* we're done with this one */
> goto do_next;
> }
>
> - ret = write_partial_message_data(con);
> + ret = write_partial_message_data(con, msg);
> if (ret == 1)
> goto more; /* we need to send the footer, too! */
> if (ret == 0)
> @@ -1563,10 +1562,8 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
> return ret;
> }
>
> -void ceph_con_v1_revoke(struct ceph_connection *con)
> +void ceph_con_v1_revoke(struct ceph_connection *con, struct ceph_msg *msg)
> {
> - struct ceph_msg *msg = con->out_msg;
> -
> WARN_ON(con->v1.out_skip);
> /* footer */
> if (con->v1.out_msg_done) {
> diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
> index 5483b4eed94e..90109fa0fe60 100644
> --- a/net/ceph/messenger_v2.c
> +++ b/net/ceph/messenger_v2.c
> @@ -1589,10 +1589,10 @@ static int prepare_ack(struct ceph_connection *con)
> return prepare_control(con, FRAME_TAG_ACK, con->v2.out_buf, 8);
> }
>
> -static void prepare_epilogue_plain(struct ceph_connection *con, bool aborted)
> +static void prepare_epilogue_plain(struct ceph_connection *con, struct ceph_msg *msg, bool aborted)
> {
> dout("%s con %p msg %p aborted %d crcs %u %u %u\n", __func__, con,
> - con->out_msg, aborted, con->v2.out_epil.front_crc,
> + msg, aborted, con->v2.out_epil.front_crc,
> con->v2.out_epil.middle_crc, con->v2.out_epil.data_crc);
>
> encode_epilogue_plain(con, aborted);
> @@ -1603,10 +1603,8 @@ static void prepare_epilogue_plain(struct ceph_connection *con, bool aborted)
> * For "used" empty segments, crc is -1. For unused (trailing)
> * segments, crc is 0.
> */
> -static void prepare_message_plain(struct ceph_connection *con)
> +static void prepare_message_plain(struct ceph_connection *con, struct ceph_msg *msg)
> {
> - struct ceph_msg *msg = con->out_msg;
> -
> prepare_head_plain(con, con->v2.out_buf,
> sizeof(struct ceph_msg_header2), NULL, 0, false);
>
> @@ -1647,7 +1645,7 @@ static void prepare_message_plain(struct ceph_connection *con)
> con->v2.out_state = OUT_S_QUEUE_DATA;
> } else {
> con->v2.out_epil.data_crc = 0;
> - prepare_epilogue_plain(con, false);
> + prepare_epilogue_plain(con, msg, false);
> con->v2.out_state = OUT_S_FINISH_MESSAGE;
> }
> }
> @@ -1659,7 +1657,7 @@ static void prepare_message_plain(struct ceph_connection *con)
> * allocate pages for the entire tail of the message (currently up
> * to ~32M) and two sgs arrays (up to ~256K each)...
> */
> -static int prepare_message_secure(struct ceph_connection *con)
> +static int prepare_message_secure(struct ceph_connection *con, struct ceph_msg *msg)
> {
> void *zerop = page_address(ceph_zero_page);
> struct sg_table enc_sgt = {};
> @@ -1674,7 +1672,7 @@ static int prepare_message_secure(struct ceph_connection *con)
> if (ret)
> return ret;
>
> - tail_len = tail_onwire_len(con->out_msg, true);
> + tail_len = tail_onwire_len(msg, true);
> if (!tail_len) {
> /*
> * Empty message: once the head is written,
> @@ -1685,7 +1683,7 @@ static int prepare_message_secure(struct ceph_connection *con)
> }
>
> encode_epilogue_secure(con, false);
> - ret = setup_message_sgs(&sgt, con->out_msg, zerop, zerop, zerop,
> + ret = setup_message_sgs(&sgt, msg, zerop, zerop, zerop,
> &con->v2.out_epil, NULL, 0, false);
> if (ret)
> goto out;
> @@ -1714,7 +1712,7 @@ static int prepare_message_secure(struct ceph_connection *con)
> goto out;
>
> dout("%s con %p msg %p sg_cnt %d enc_page_cnt %d\n", __func__, con,
> - con->out_msg, sgt.orig_nents, enc_page_cnt);
> + msg, sgt.orig_nents, enc_page_cnt);
> con->v2.out_state = OUT_S_QUEUE_ENC_PAGE;
>
> out:
> @@ -1723,19 +1721,19 @@ static int prepare_message_secure(struct ceph_connection *con)
> return ret;
> }
>
> -static int prepare_message(struct ceph_connection *con)
> +static int prepare_message(struct ceph_connection *con, struct ceph_msg *msg)
> {
> int lens[] = {
> sizeof(struct ceph_msg_header2),
> - front_len(con->out_msg),
> - middle_len(con->out_msg),
> - data_len(con->out_msg)
> + front_len(msg),
> + middle_len(msg),
> + data_len(msg)
> };
> struct ceph_frame_desc desc;
> int ret;
>
> dout("%s con %p msg %p logical %d+%d+%d+%d\n", __func__, con,
> - con->out_msg, lens[0], lens[1], lens[2], lens[3]);
> + msg, lens[0], lens[1], lens[2], lens[3]);
>
> if (con->in_seq > con->in_seq_acked) {
> dout("%s con %p in_seq_acked %llu -> %llu\n", __func__, con,
> @@ -1746,15 +1744,15 @@ static int prepare_message(struct ceph_connection *con)
> reset_out_kvecs(con);
> init_frame_desc(&desc, FRAME_TAG_MESSAGE, lens, 4);
> encode_preamble(&desc, con->v2.out_buf);
> - fill_header2(CTRL_BODY(con->v2.out_buf), &con->out_msg->hdr,
> + fill_header2(CTRL_BODY(con->v2.out_buf), &msg->hdr,
> con->in_seq_acked);
>
> if (con_secure(con)) {
> - ret = prepare_message_secure(con);
> + ret = prepare_message_secure(con, msg);
> if (ret)
> return ret;
> } else {
> - prepare_message_plain(con);
> + prepare_message_plain(con, msg);
> }
>
> ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
> @@ -3184,20 +3182,20 @@ int ceph_con_v2_try_read(struct ceph_connection *con)
> }
> }
>
> -static void queue_data(struct ceph_connection *con)
> +static void queue_data(struct ceph_connection *con, struct ceph_msg *msg)
> {
> struct bio_vec bv;
>
> con->v2.out_epil.data_crc = -1;
> - ceph_msg_data_cursor_init(&con->v2.out_cursor, con->out_msg,
> - data_len(con->out_msg));
> + ceph_msg_data_cursor_init(&con->v2.out_cursor, msg,
> + data_len(msg));
>
> get_bvec_at(&con->v2.out_cursor, &bv);
> set_out_bvec(con, &bv, true);
> con->v2.out_state = OUT_S_QUEUE_DATA_CONT;
> }
>
> -static void queue_data_cont(struct ceph_connection *con)
> +static void queue_data_cont(struct ceph_connection *con, struct ceph_msg *msg)
> {
> struct bio_vec bv;
>
> @@ -3218,7 +3216,7 @@ static void queue_data_cont(struct ceph_connection *con)
> * we are done.
> */
> reset_out_kvecs(con);
> - prepare_epilogue_plain(con, false);
> + prepare_epilogue_plain(con, msg, false);
> con->v2.out_state = OUT_S_FINISH_MESSAGE;
> }
>
> @@ -3250,7 +3248,7 @@ static void queue_enc_page(struct ceph_connection *con)
> con->v2.out_state = OUT_S_FINISH_MESSAGE;
> }
>
> -static void queue_zeros(struct ceph_connection *con)
> +static void queue_zeros(struct ceph_connection *con, struct ceph_msg *msg)
> {
> dout("%s con %p out_zero %d\n", __func__, con, con->v2.out_zero);
>
> @@ -3267,7 +3265,7 @@ static void queue_zeros(struct ceph_connection *con)
> * Once it's written, we are done patching up for the revoke.
> */
> reset_out_kvecs(con);
> - prepare_epilogue_plain(con, true);
> + prepare_epilogue_plain(con, msg, true);
> con->v2.out_state = OUT_S_FINISH_MESSAGE;
> }
>
> @@ -3309,18 +3307,18 @@ static int populate_out_iter(struct ceph_connection *con)
> switch (con->v2.out_state) {
> case OUT_S_QUEUE_DATA:
> WARN_ON(!con->out_msg);
> - queue_data(con);
> + queue_data(con, con->out_msg);
> goto populated;
> case OUT_S_QUEUE_DATA_CONT:
> WARN_ON(!con->out_msg);
> - queue_data_cont(con);
> + queue_data_cont(con, con->out_msg);
> goto populated;
> case OUT_S_QUEUE_ENC_PAGE:
> queue_enc_page(con);
> goto populated;
> case OUT_S_QUEUE_ZEROS:
> WARN_ON(con->out_msg); /* revoked */
> - queue_zeros(con);
> + queue_zeros(con, con->out_msg);
> goto populated;
> case OUT_S_FINISH_MESSAGE:
> finish_message(con);
> @@ -3340,8 +3338,8 @@ static int populate_out_iter(struct ceph_connection *con)
> return ret;
> }
> } else if (!list_empty(&con->out_queue)) {
> - ceph_con_get_out_msg(con);
> - ret = prepare_message(con);
> + struct ceph_msg *msg = ceph_con_get_out_msg(con);
> + ret = prepare_message(con, msg);
> if (ret) {
> pr_err("prepare_message failed: %d\n", ret);
> return ret;
> @@ -3453,17 +3451,17 @@ static u32 crc32c_zeros(u32 crc, int zero_len)
> return crc;
> }
>
> -static void prepare_zero_front(struct ceph_connection *con, int resid)
> +static void prepare_zero_front(struct ceph_connection *con, struct ceph_msg *msg, int resid)
> {
> int sent;
>
> - WARN_ON(!resid || resid > front_len(con->out_msg));
> - sent = front_len(con->out_msg) - resid;
> + WARN_ON(!resid || resid > front_len(msg));
> + sent = front_len(msg) - resid;
> dout("%s con %p sent %d resid %d\n", __func__, con, sent, resid);
>
> if (sent) {
> con->v2.out_epil.front_crc =
> - crc32c(-1, con->out_msg->front.iov_base, sent);
> + crc32c(-1, msg->front.iov_base, sent);
> con->v2.out_epil.front_crc =
> crc32c_zeros(con->v2.out_epil.front_crc, resid);
> } else {
> @@ -3474,17 +3472,17 @@ static void prepare_zero_front(struct ceph_connection *con, int resid)
> out_zero_add(con, resid);
> }
>
> -static void prepare_zero_middle(struct ceph_connection *con, int resid)
> +static void prepare_zero_middle(struct ceph_connection *con, struct ceph_msg *msg, int resid)
> {
> int sent;
>
> - WARN_ON(!resid || resid > middle_len(con->out_msg));
> - sent = middle_len(con->out_msg) - resid;
> + WARN_ON(!resid || resid > middle_len(msg));
> + sent = middle_len(msg) - resid;
> dout("%s con %p sent %d resid %d\n", __func__, con, sent, resid);
>
> if (sent) {
> con->v2.out_epil.middle_crc =
> - crc32c(-1, con->out_msg->middle->vec.iov_base, sent);
> + crc32c(-1, msg->middle->vec.iov_base, sent);
> con->v2.out_epil.middle_crc =
> crc32c_zeros(con->v2.out_epil.middle_crc, resid);
> } else {
> @@ -3495,61 +3493,61 @@ static void prepare_zero_middle(struct ceph_connection *con, int resid)
> out_zero_add(con, resid);
> }
>
> -static void prepare_zero_data(struct ceph_connection *con)
> +static void prepare_zero_data(struct ceph_connection *con, struct ceph_msg *msg)
> {
> dout("%s con %p\n", __func__, con);
> - con->v2.out_epil.data_crc = crc32c_zeros(-1, data_len(con->out_msg));
> - out_zero_add(con, data_len(con->out_msg));
> + con->v2.out_epil.data_crc = crc32c_zeros(-1, data_len(msg));
> + out_zero_add(con, data_len(msg));
> }
>
> -static void revoke_at_queue_data(struct ceph_connection *con)
> +static void revoke_at_queue_data(struct ceph_connection *con, struct ceph_msg *msg)
> {
> int boundary;
> int resid;
>
> - WARN_ON(!data_len(con->out_msg));
> + WARN_ON(!data_len(msg));
> WARN_ON(!iov_iter_is_kvec(&con->v2.out_iter));
> resid = iov_iter_count(&con->v2.out_iter);
>
> - boundary = front_len(con->out_msg) + middle_len(con->out_msg);
> + boundary = front_len(msg) + middle_len(msg);
> if (resid > boundary) {
> resid -= boundary;
> WARN_ON(resid > MESSAGE_HEAD_PLAIN_LEN);
> dout("%s con %p was sending head\n", __func__, con);
> - if (front_len(con->out_msg))
> - prepare_zero_front(con, front_len(con->out_msg));
> - if (middle_len(con->out_msg))
> - prepare_zero_middle(con, middle_len(con->out_msg));
> - prepare_zero_data(con);
> + if (front_len(msg))
> + prepare_zero_front(con, msg, front_len(msg));
> + if (middle_len(msg))
> + prepare_zero_middle(con, msg, middle_len(msg));
> + prepare_zero_data(con, msg);
> WARN_ON(iov_iter_count(&con->v2.out_iter) != resid);
> con->v2.out_state = OUT_S_QUEUE_ZEROS;
> return;
> }
>
> - boundary = middle_len(con->out_msg);
> + boundary = middle_len(msg);
> if (resid > boundary) {
> resid -= boundary;
> dout("%s con %p was sending front\n", __func__, con);
> - prepare_zero_front(con, resid);
> - if (middle_len(con->out_msg))
> - prepare_zero_middle(con, middle_len(con->out_msg));
> - prepare_zero_data(con);
> - queue_zeros(con);
> + prepare_zero_front(con, msg, resid);
> + if (middle_len(msg))
> + prepare_zero_middle(con, msg, middle_len(msg));
> + prepare_zero_data(con, msg);
> + queue_zeros(con, msg);
> return;
> }
>
> WARN_ON(!resid);
> dout("%s con %p was sending middle\n", __func__, con);
> - prepare_zero_middle(con, resid);
> - prepare_zero_data(con);
> - queue_zeros(con);
> + prepare_zero_middle(con, msg, resid);
> + prepare_zero_data(con, msg);
> + queue_zeros(con, msg);
> }
>
> -static void revoke_at_queue_data_cont(struct ceph_connection *con)
> +static void revoke_at_queue_data_cont(struct ceph_connection *con, struct ceph_msg *msg)
> {
> int sent, resid; /* current piece of data */
>
> - WARN_ON(!data_len(con->out_msg));
> + WARN_ON(!data_len(msg));
> WARN_ON(!iov_iter_is_bvec(&con->v2.out_iter));
> resid = iov_iter_count(&con->v2.out_iter);
> WARN_ON(!resid || resid > con->v2.out_bvec.bv_len);
> @@ -3568,10 +3566,10 @@ static void revoke_at_queue_data_cont(struct ceph_connection *con)
>
> con->v2.out_iter.count -= resid;
> out_zero_add(con, con->v2.out_cursor.total_resid);
> - queue_zeros(con);
> + queue_zeros(con, msg);
> }
>
> -static void revoke_at_finish_message(struct ceph_connection *con)
> +static void revoke_at_finish_message(struct ceph_connection *con, struct ceph_msg *msg)
> {
> int boundary;
> int resid;
> @@ -3579,39 +3577,39 @@ static void revoke_at_finish_message(struct ceph_connection *con)
> WARN_ON(!iov_iter_is_kvec(&con->v2.out_iter));
> resid = iov_iter_count(&con->v2.out_iter);
>
> - if (!front_len(con->out_msg) && !middle_len(con->out_msg) &&
> - !data_len(con->out_msg)) {
> + if (!front_len(msg) && !middle_len(msg) &&
> + !data_len(msg)) {
> WARN_ON(!resid || resid > MESSAGE_HEAD_PLAIN_LEN);
> dout("%s con %p was sending head (empty message) - noop\n",
> __func__, con);
> return;
> }
>
> - boundary = front_len(con->out_msg) + middle_len(con->out_msg) +
> + boundary = front_len(msg) + middle_len(msg) +
> CEPH_EPILOGUE_PLAIN_LEN;
> if (resid > boundary) {
> resid -= boundary;
> WARN_ON(resid > MESSAGE_HEAD_PLAIN_LEN);
> dout("%s con %p was sending head\n", __func__, con);
> - if (front_len(con->out_msg))
> - prepare_zero_front(con, front_len(con->out_msg));
> - if (middle_len(con->out_msg))
> - prepare_zero_middle(con, middle_len(con->out_msg));
> + if (front_len(msg))
> + prepare_zero_front(con, msg, front_len(msg));
> + if (middle_len(msg))
> + prepare_zero_middle(con, msg, middle_len(msg));
> con->v2.out_iter.count -= CEPH_EPILOGUE_PLAIN_LEN;
> WARN_ON(iov_iter_count(&con->v2.out_iter) != resid);
> con->v2.out_state = OUT_S_QUEUE_ZEROS;
> return;
> }
>
> - boundary = middle_len(con->out_msg) + CEPH_EPILOGUE_PLAIN_LEN;
> + boundary = middle_len(msg) + CEPH_EPILOGUE_PLAIN_LEN;
> if (resid > boundary) {
> resid -= boundary;
> dout("%s con %p was sending front\n", __func__, con);
> - prepare_zero_front(con, resid);
> - if (middle_len(con->out_msg))
> - prepare_zero_middle(con, middle_len(con->out_msg));
> + prepare_zero_front(con, msg, resid);
> + if (middle_len(msg))
> + prepare_zero_middle(con, msg, middle_len(msg));
> con->v2.out_iter.count -= CEPH_EPILOGUE_PLAIN_LEN;
> - queue_zeros(con);
> + queue_zeros(con, msg);
> return;
> }
>
> @@ -3619,9 +3617,9 @@ static void revoke_at_finish_message(struct ceph_connection *con)
> if (resid > boundary) {
> resid -= boundary;
> dout("%s con %p was sending middle\n", __func__, con);
> - prepare_zero_middle(con, resid);
> + prepare_zero_middle(con, msg, resid);
> con->v2.out_iter.count -= CEPH_EPILOGUE_PLAIN_LEN;
> - queue_zeros(con);
> + queue_zeros(con, msg);
> return;
> }
>
> @@ -3629,7 +3627,7 @@ static void revoke_at_finish_message(struct ceph_connection *con)
> dout("%s con %p was sending epilogue - noop\n", __func__, con);
> }
>
> -void ceph_con_v2_revoke(struct ceph_connection *con)
> +void ceph_con_v2_revoke(struct ceph_connection *con, struct ceph_msg *msg)
> {
> WARN_ON(con->v2.out_zero);
>
> @@ -3642,13 +3640,13 @@ void ceph_con_v2_revoke(struct ceph_connection *con)
>
> switch (con->v2.out_state) {
> case OUT_S_QUEUE_DATA:
> - revoke_at_queue_data(con);
> + revoke_at_queue_data(con, msg);
> break;
> case OUT_S_QUEUE_DATA_CONT:
> - revoke_at_queue_data_cont(con);
> + revoke_at_queue_data_cont(con, msg);
> break;
> case OUT_S_FINISH_MESSAGE:
> - revoke_at_finish_message(con);
> + revoke_at_finish_message(con, msg);
> break;
> default:
> WARN(1, "bad out_state %d", con->v2.out_state);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg()
2025-08-06 9:48 ` [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg() Max Kellermann
@ 2025-08-08 17:41 ` Viacheslav Dubeyko
2025-08-11 23:29 ` Viacheslav Dubeyko
2025-10-09 11:18 ` Ilya Dryomov
1 sibling, 1 reply; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-08 17:41 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org, max.kellermann@ionos.com, Xiubo Li,
idryomov@gmail.com, linux-kernel@vger.kernel.org, Alex Markuze
On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> This moves the list_empty() checks from the two callers (v1 and v2)
> into the base messenger.c library. Now the v1/v2 specializations do
> not need to know about con->out_queue; that implementation detail is
> now hidden behind the ceph_con_get_out_msg() function.
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Looks good.
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
> ---
> net/ceph/messenger.c | 4 +++-
> net/ceph/messenger_v1.c | 15 ++++++++++-----
> net/ceph/messenger_v2.c | 4 ++--
> 3 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 424fb2769b71..8886c38a55d2 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2113,7 +2113,9 @@ struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con)
> {
> struct ceph_msg *msg;
>
> - BUG_ON(list_empty(&con->out_queue));
> + if (list_empty(&con->out_queue))
> + return NULL;
> +
> msg = list_first_entry(&con->out_queue, struct ceph_msg, list_head);
> WARN_ON(msg->con != con);
>
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 516f2eeb122a..5eb6cfdbc494 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -189,12 +189,18 @@ static void prepare_write_message_footer(struct ceph_connection *con, struct cep
>
> /*
> * Prepare headers for the next outgoing message.
> + *
> + * @return false if there are no outgoing messages
> */
> -static void prepare_write_message(struct ceph_connection *con)
> +static bool prepare_write_message(struct ceph_connection *con)
> {
> struct ceph_msg *m;
> u32 crc;
>
> + m = ceph_con_get_out_msg(con);
> + if (m == NULL)
> + return false;
> +
> con_out_kvec_reset(con);
> con->v1.out_msg_done = false;
>
> @@ -208,8 +214,6 @@ static void prepare_write_message(struct ceph_connection *con)
> &con->v1.out_temp_ack);
> }
>
> - m = ceph_con_get_out_msg(con);
> -
> dout("prepare_write_message %p seq %lld type %d len %d+%d+%zd\n",
> m, con->out_seq, le16_to_cpu(m->hdr.type),
> le32_to_cpu(m->hdr.front_len), le32_to_cpu(m->hdr.middle_len),
> @@ -256,6 +260,8 @@ static void prepare_write_message(struct ceph_connection *con)
> }
>
> ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
> +
> + return true;
> }
>
> /*
> @@ -1543,8 +1549,7 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
> goto more;
> }
> /* is anything else pending? */
> - if (!list_empty(&con->out_queue)) {
> - prepare_write_message(con);
> + if (prepare_write_message(con)) {
> goto more;
> }
> if (con->in_seq > con->in_seq_acked) {
> diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
> index 90109fa0fe60..e0b5f2e2582d 100644
> --- a/net/ceph/messenger_v2.c
> +++ b/net/ceph/messenger_v2.c
> @@ -3292,6 +3292,7 @@ static void finish_message(struct ceph_connection *con)
>
> static int populate_out_iter(struct ceph_connection *con)
> {
> + struct ceph_msg *msg;
> int ret;
>
> dout("%s con %p state %d out_state %d\n", __func__, con, con->state,
> @@ -3337,8 +3338,7 @@ static int populate_out_iter(struct ceph_connection *con)
> pr_err("prepare_keepalive2 failed: %d\n", ret);
> return ret;
> }
> - } else if (!list_empty(&con->out_queue)) {
> - struct ceph_msg *msg = ceph_con_get_out_msg(con);
> + } else if ((msg = ceph_con_get_out_msg(con)) != NULL) {
> ret = prepare_message(con, msg);
> if (ret) {
> pr_err("prepare_message failed: %d\n", ret);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg
2025-08-06 9:48 [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Max Kellermann
` (2 preceding siblings ...)
2025-08-06 9:48 ` [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg() Max Kellermann
@ 2025-08-08 17:43 ` Viacheslav Dubeyko
2025-08-11 17:05 ` Viacheslav Dubeyko
4 siblings, 0 replies; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-08 17:43 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org, max.kellermann@ionos.com, Xiubo Li,
idryomov@gmail.com, linux-kernel@vger.kernel.org, Alex Markuze
On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> These patches reduce reloads of con->out_msg by passing pointers that
> we already have in local variables (i.e. registers) as parameters.
>
> Access to con->out_queue is now gone completely from v1/v2 and only
> few references to con->out_msg remain. In the long run, I'd like to
> get rid of con->out_msg completely and instead send the whole
> con->out_queue in one kernel_sendmsg() call. This patch series helps
> with preparing that.
>
> Max Kellermann (3):
> net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
> net/ceph/messenger_v[12]: pass ceph_msg* instead of loading
> con->out_msg
> net/ceph/messenger: add empty check to ceph_con_get_out_msg()
>
> include/linux/ceph/messenger.h | 6 +-
> net/ceph/messenger.c | 12 +--
> net/ceph/messenger_v1.c | 59 ++++++------
> net/ceph/messenger_v2.c | 160 ++++++++++++++++-----------------
> 4 files changed, 119 insertions(+), 118 deletions(-)
I will apply this patchset on testing branch and do the testing today.
Thanks,
Slava.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg
2025-08-06 9:48 [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Max Kellermann
` (3 preceding siblings ...)
2025-08-08 17:43 ` [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Viacheslav Dubeyko
@ 2025-08-11 17:05 ` Viacheslav Dubeyko
2025-08-11 23:27 ` Viacheslav Dubeyko
4 siblings, 1 reply; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-11 17:05 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org, max.kellermann@ionos.com, Xiubo Li,
idryomov@gmail.com, linux-kernel@vger.kernel.org, Alex Markuze
On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> These patches reduce reloads of con->out_msg by passing pointers that
> we already have in local variables (i.e. registers) as parameters.
>
> Access to con->out_queue is now gone completely from v1/v2 and only
> few references to con->out_msg remain. In the long run, I'd like to
> get rid of con->out_msg completely and instead send the whole
> con->out_queue in one kernel_sendmsg() call. This patch series helps
> with preparing that.
>
> Max Kellermann (3):
> net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
> net/ceph/messenger_v[12]: pass ceph_msg* instead of loading
> con->out_msg
> net/ceph/messenger: add empty check to ceph_con_get_out_msg()
>
> include/linux/ceph/messenger.h | 6 +-
> net/ceph/messenger.c | 12 +--
> net/ceph/messenger_v1.c | 59 ++++++------
> net/ceph/messenger_v2.c | 160 ++++++++++++++++-----------------
> 4 files changed, 119 insertions(+), 118 deletions(-)
Unexpectedly, I can see xfstests failures with applied patchset:
Failures: generic/633 generic/644 generic/645 generic/689 generic/696
generic/697
Failed 6 of 610 tests
I will repeat xfestests run with and without patchset. Maybe, it is the glitch
on my side.
Thanks,
Slava.
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg
2025-08-11 17:05 ` Viacheslav Dubeyko
@ 2025-08-11 23:27 ` Viacheslav Dubeyko
0 siblings, 0 replies; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-11 23:27 UTC (permalink / raw)
To: Alex Markuze, max.kellermann@ionos.com,
ceph-devel@vger.kernel.org, Xiubo Li,
linux-kernel@vger.kernel.org, idryomov@gmail.com
On Mon, 2025-08-11 at 17:05 +0000, Viacheslav Dubeyko wrote:
> On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> > These patches reduce reloads of con->out_msg by passing pointers that
> > we already have in local variables (i.e. registers) as parameters.
> >
> > Access to con->out_queue is now gone completely from v1/v2 and only
> > few references to con->out_msg remain. In the long run, I'd like to
> > get rid of con->out_msg completely and instead send the whole
> > con->out_queue in one kernel_sendmsg() call. This patch series helps
> > with preparing that.
> >
> > Max Kellermann (3):
> > net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
> > net/ceph/messenger_v[12]: pass ceph_msg* instead of loading
> > con->out_msg
> > net/ceph/messenger: add empty check to ceph_con_get_out_msg()
> >
> > include/linux/ceph/messenger.h | 6 +-
> > net/ceph/messenger.c | 12 +--
> > net/ceph/messenger_v1.c | 59 ++++++------
> > net/ceph/messenger_v2.c | 160 ++++++++++++++++-----------------
> > 4 files changed, 119 insertions(+), 118 deletions(-)
>
> Unexpectedly, I can see xfstests failures with applied patchset:
>
> Failures: generic/633 generic/644 generic/645 generic/689 generic/696
> generic/697
> Failed 6 of 610 tests
>
> I will repeat xfestests run with and without patchset. Maybe, it is the glitch
> on my side.
>
>
I double checked the xfstests run. Everything works well. It was glitch on my
side.
Thanks,
Slava.
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg
2025-08-08 17:41 ` Viacheslav Dubeyko
@ 2025-08-11 23:28 ` Viacheslav Dubeyko
0 siblings, 0 replies; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-11 23:28 UTC (permalink / raw)
To: Alex Markuze, max.kellermann@ionos.com,
ceph-devel@vger.kernel.org, Xiubo Li,
linux-kernel@vger.kernel.org, idryomov@gmail.com
On Fri, 2025-08-08 at 17:41 +0000, Viacheslav Dubeyko wrote:
> On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> > This pointer is in a register anyway, so let's use that instead of
> > reloading from memory everywhere.
> >
> > Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
>
> Looks good.
>
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
>
>
Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer
2025-08-08 17:40 ` Viacheslav Dubeyko
@ 2025-08-11 23:29 ` Viacheslav Dubeyko
0 siblings, 0 replies; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-11 23:29 UTC (permalink / raw)
To: Alex Markuze, max.kellermann@ionos.com,
ceph-devel@vger.kernel.org, Xiubo Li,
linux-kernel@vger.kernel.org, idryomov@gmail.com
On Fri, 2025-08-08 at 17:40 +0000, Viacheslav Dubeyko wrote:
> On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> > The caller in messenger_v1.c loads it anyway, so let's keep the
> > pointer in the register instead of reloading it from memory. This
> > eliminates a tiny bit of unnecessary overhead.
> >
> > Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
>
> Looks good.
>
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
>
>
Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg()
2025-08-08 17:41 ` Viacheslav Dubeyko
@ 2025-08-11 23:29 ` Viacheslav Dubeyko
0 siblings, 0 replies; 16+ messages in thread
From: Viacheslav Dubeyko @ 2025-08-11 23:29 UTC (permalink / raw)
To: Alex Markuze, max.kellermann@ionos.com,
ceph-devel@vger.kernel.org, Xiubo Li,
linux-kernel@vger.kernel.org, idryomov@gmail.com
On Fri, 2025-08-08 at 17:41 +0000, Viacheslav Dubeyko wrote:
> On Wed, 2025-08-06 at 11:48 +0200, Max Kellermann wrote:
> > This moves the list_empty() checks from the two callers (v1 and v2)
> > into the base messenger.c library. Now the v1/v2 specializations do
> > not need to know about con->out_queue; that implementation detail is
> > now hidden behind the ceph_con_get_out_msg() function.
> >
> > Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
>
> Looks good.
>
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
>
>
Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg()
2025-08-06 9:48 ` [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg() Max Kellermann
2025-08-08 17:41 ` Viacheslav Dubeyko
@ 2025-10-09 11:18 ` Ilya Dryomov
2025-10-09 11:47 ` Max Kellermann
1 sibling, 1 reply; 16+ messages in thread
From: Ilya Dryomov @ 2025-10-09 11:18 UTC (permalink / raw)
To: Max Kellermann; +Cc: xiubli, amarkuze, ceph-devel, linux-kernel
On Wed, Aug 6, 2025 at 11:49 AM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> This moves the list_empty() checks from the two callers (v1 and v2)
> into the base messenger.c library. Now the v1/v2 specializations do
> not need to know about con->out_queue; that implementation detail is
> now hidden behind the ceph_con_get_out_msg() function.
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> net/ceph/messenger.c | 4 +++-
> net/ceph/messenger_v1.c | 15 ++++++++++-----
> net/ceph/messenger_v2.c | 4 ++--
> 3 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 424fb2769b71..8886c38a55d2 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2113,7 +2113,9 @@ struct ceph_msg *ceph_con_get_out_msg(struct ceph_connection *con)
> {
> struct ceph_msg *msg;
>
> - BUG_ON(list_empty(&con->out_queue));
> + if (list_empty(&con->out_queue))
> + return NULL;
> +
> msg = list_first_entry(&con->out_queue, struct ceph_msg, list_head);
> WARN_ON(msg->con != con);
>
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 516f2eeb122a..5eb6cfdbc494 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -189,12 +189,18 @@ static void prepare_write_message_footer(struct ceph_connection *con, struct cep
>
> /*
> * Prepare headers for the next outgoing message.
> + *
> + * @return false if there are no outgoing messages
> */
> -static void prepare_write_message(struct ceph_connection *con)
> +static bool prepare_write_message(struct ceph_connection *con)
> {
> struct ceph_msg *m;
> u32 crc;
>
> + m = ceph_con_get_out_msg(con);
> + if (m == NULL)
> + return false;
> +
> con_out_kvec_reset(con);
> con->v1.out_msg_done = false;
>
> @@ -208,8 +214,6 @@ static void prepare_write_message(struct ceph_connection *con)
> &con->v1.out_temp_ack);
> }
>
> - m = ceph_con_get_out_msg(con);
> -
> dout("prepare_write_message %p seq %lld type %d len %d+%d+%zd\n",
> m, con->out_seq, le16_to_cpu(m->hdr.type),
> le32_to_cpu(m->hdr.front_len), le32_to_cpu(m->hdr.middle_len),
> @@ -256,6 +260,8 @@ static void prepare_write_message(struct ceph_connection *con)
> }
>
> ceph_con_flag_set(con, CEPH_CON_F_WRITE_PENDING);
> +
> + return true;
> }
>
> /*
> @@ -1543,8 +1549,7 @@ int ceph_con_v1_try_write(struct ceph_connection *con)
> goto more;
> }
> /* is anything else pending? */
> - if (!list_empty(&con->out_queue)) {
> - prepare_write_message(con);
> + if (prepare_write_message(con)) {
Hi Max,
I made a change to net/ceph/messenger_v1.c hunks of this patch to
follow what is done for msgr2 where ceph_con_get_out_msg() is called
outside of the prepare helper and the new message is passed in.
prepare_write_message() doesn't need to return a bool anymore.
Let me know if you see something wrong there:
https://github.com/ceph/ceph-client/commit/6140f1d43ba9425dc55b12bdfd8877b0c5118d9a
Thanks,
Ilya
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg()
2025-10-09 11:18 ` Ilya Dryomov
@ 2025-10-09 11:47 ` Max Kellermann
2025-10-09 13:01 ` Ilya Dryomov
0 siblings, 1 reply; 16+ messages in thread
From: Max Kellermann @ 2025-10-09 11:47 UTC (permalink / raw)
To: Ilya Dryomov; +Cc: xiubli, amarkuze, ceph-devel, linux-kernel
On Thu, Oct 9, 2025 at 1:18 PM Ilya Dryomov <idryomov@gmail.com> wrote:
> I made a change to net/ceph/messenger_v1.c hunks of this patch to
> follow what is done for msgr2 where ceph_con_get_out_msg() is called
> outside of the prepare helper and the new message is passed in.
> prepare_write_message() doesn't need to return a bool anymore.
But ... why?
Your change is not bad, but I don't believe it belongs in this patch,
because it is out of this patch's scope. It would have been a good
follow-up patch.
There are lots of unnecessary (and sometimes confusing) differences
between the v1 and v2 messengers, but unifying these is out of the
scope of my patch. All my patch does is remove visibility of a
messenger.c implementation detail from the v1/v2 specializations.
(My end goal was to have unified multi-message send in one function
call to reduce overhead for sending bulk messages, but I did not yet
follow up on this idea yet. The Ceph kernel messenger, just like the
user-space messenger, leave a lot of room for optimizations -
unfortunately, my user-space optimizations that I submitted last year
were not merged by the Ceph project.)
Max
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg()
2025-10-09 11:47 ` Max Kellermann
@ 2025-10-09 13:01 ` Ilya Dryomov
0 siblings, 0 replies; 16+ messages in thread
From: Ilya Dryomov @ 2025-10-09 13:01 UTC (permalink / raw)
To: Max Kellermann; +Cc: xiubli, amarkuze, ceph-devel, linux-kernel
On Thu, Oct 9, 2025 at 1:47 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Thu, Oct 9, 2025 at 1:18 PM Ilya Dryomov <idryomov@gmail.com> wrote:
> > I made a change to net/ceph/messenger_v1.c hunks of this patch to
> > follow what is done for msgr2 where ceph_con_get_out_msg() is called
> > outside of the prepare helper and the new message is passed in.
> > prepare_write_message() doesn't need to return a bool anymore.
>
> But ... why?
> Your change is not bad, but I don't believe it belongs in this patch,
> because it is out of this patch's scope. It would have been a good
> follow-up patch.
Changing a function to return a bool only to immediately undo that
change in a follow-up patch (both touching the same 10-15 lines of
code) seemed like pointless churn to me.
>
> There are lots of unnecessary (and sometimes confusing) differences
> between the v1 and v2 messengers, but unifying these is out of the
> scope of my patch. All my patch does is remove visibility of a
> messenger.c implementation detail from the v1/v2 specializations.
ceph_con_get_out_msg() is a common helper and given that this series
changed its signature and how it's supposed to be used, I wouldn't say
that adjusting the specializations to do the same thing with it is out
of scope. This isn't unifying some completely unrelated aspect of v1
vs v2 and the resulting patch actually ended up being smaller.
Thanks,
Ilya
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2025-10-09 13:01 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-06 9:48 [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Max Kellermann
2025-08-06 9:48 ` [PATCH 1/3] net/ceph/messenger: ceph_con_get_out_msg() returns the message pointer Max Kellermann
2025-08-08 17:40 ` Viacheslav Dubeyko
2025-08-11 23:29 ` Viacheslav Dubeyko
2025-08-06 9:48 ` [PATCH 2/3] net/ceph/messenger_v[12]: pass ceph_msg* instead of loading con->out_msg Max Kellermann
2025-08-08 17:41 ` Viacheslav Dubeyko
2025-08-11 23:28 ` Viacheslav Dubeyko
2025-08-06 9:48 ` [PATCH 3/3] net/ceph/messenger: add empty check to ceph_con_get_out_msg() Max Kellermann
2025-08-08 17:41 ` Viacheslav Dubeyko
2025-08-11 23:29 ` Viacheslav Dubeyko
2025-10-09 11:18 ` Ilya Dryomov
2025-10-09 11:47 ` Max Kellermann
2025-10-09 13:01 ` Ilya Dryomov
2025-08-08 17:43 ` [PATCH 0/3] net/ceph/messenger: micro-optimizations for out_msg Viacheslav Dubeyko
2025-08-11 17:05 ` Viacheslav Dubeyko
2025-08-11 23:27 ` Viacheslav Dubeyko
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.