* [PATCH v2] libceph: reject monitor replies with oversized data segment
@ 2026-05-01 19:37 Dhiraj Mishra
2026-05-01 21:37 ` Viacheslav Dubeyko
2026-05-02 5:22 ` [PATCH v3] " Dhiraj Mishra
0 siblings, 2 replies; 7+ messages in thread
From: Dhiraj Mishra @ 2026-05-01 19:37 UTC (permalink / raw)
To: Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko
Cc: ceph-devel, linux-kernel, Greg Kroah-Hartman, Willy Tarreau,
security, Dhiraj Mishra
Monitor messages can be allocated from preallocated reply messages or
with ceph_msg_new(), both of which may provide only front-buffer storage
and no data items. The messenger receive path copies the wire header
into the selected ceph_msg and later uses hdr.data_len to decide whether
to initialize a data cursor.
If a malicious or compromised monitor advertises a non-zero data segment
for one of these front-only replies, the receive path can call
ceph_msg_data_cursor_init() with length greater than msg->data_length and
hit its BUG_ON() checks, crashing the client kernel.
I verified the issue against v7.1-rc1-123-ge75a43c7cec4. The msgr2
trigger path is present since commit cd1a677cad99 ("libceph, ceph:
implement msgr2.1 protocol (crc and secure modes)"), which is contained
in v5.11-rc1 and later. The monitor allocator pattern is older, but I
have not tested older msgr1-only kernels.
A concrete trigger is a monitor connection over msgr2 after
CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type
handled by mon_alloc_msg(), a valid front_len for that message type and
data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is
allocated with ceph_msg_new(), leaving msg->data_length and
msg->num_data_items as zero.
Reject monitor replies whose wire data segment is larger than the data
backing allocated for the selected ceph_msg, mirroring the existing OSD
reply hardening.
Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Signed-off-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com>
---
v2:
- Resend as an inline plain-text patch.
- Add full email address to the From and Signed-off-by identities.
- Add ceph-devel and LKML to the recipient list when sending.
net/ceph/mon_client.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index d2cdc8ee3155..7f2293122bba 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -712,6 +712,7 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con,
struct ceph_mon_client *monc = con->private;
struct ceph_mon_generic_request *req;
u64 tid = le64_to_cpu(hdr->tid);
+ u32 data_len = le32_to_cpu(hdr->data_len);
struct ceph_msg *m;
mutex_lock(&monc->mutex);
@@ -720,6 +721,16 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con,
dout("get_generic_reply %lld dne\n", tid);
*skip = 1;
m = NULL;
+ } else if (!req->reply) {
+ pr_warn("%s tid %llu missing reply buffer, skipping\n",
+ __func__, tid);
+ *skip = 1;
+ m = NULL;
+ } else if (data_len > req->reply->data_length) {
+ pr_warn("%s tid %llu data %u > preallocated %zu, skipping\n",
+ __func__, tid, data_len, req->reply->data_length);
+ *skip = 1;
+ m = NULL;
} else {
dout("get_generic_reply %lld got %p\n", tid, req->reply);
*skip = 0;
@@ -1499,6 +1510,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con,
struct ceph_mon_client *monc = con->private;
int type = le16_to_cpu(hdr->type);
int front_len = le32_to_cpu(hdr->front_len);
+ u32 data_len = le32_to_cpu(hdr->data_len);
struct ceph_msg *m = NULL;
*skip = 0;
@@ -1544,6 +1556,15 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con,
ceph_msg_put(m);
m = ceph_msg_new(type, front_len, GFP_NOFS, false);
}
+ if (m && data_len > m->data_length) {
+ pr_warn("%s data %u > prealloc %zu (%u#%llu), skipping\n",
+ __func__, data_len, m->data_length,
+ (unsigned int)con->peer_name.type,
+ le64_to_cpu(con->peer_name.num));
+ ceph_msg_put(m);
+ m = NULL;
+ *skip = 1;
+ }
return m;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v2] libceph: reject monitor replies with oversized data segment 2026-05-01 19:37 [PATCH v2] libceph: reject monitor replies with oversized data segment Dhiraj Mishra @ 2026-05-01 21:37 ` Viacheslav Dubeyko 2026-05-02 5:22 ` [PATCH v3] " Dhiraj Mishra 1 sibling, 0 replies; 7+ messages in thread From: Viacheslav Dubeyko @ 2026-05-01 21:37 UTC (permalink / raw) To: idryomov@gmail.com, Alex Markuze, slava@dubeyko.com, mishra.dhiraj95@gmail.com Cc: security@kernel.org, w@1wt.eu, ceph-devel@vger.kernel.org, gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org On Fri, 2026-05-01 at 23:37 +0400, Dhiraj Mishra wrote: > Monitor messages can be allocated from preallocated reply messages or > with ceph_msg_new(), both of which may provide only front-buffer storage > and no data items. The messenger receive path copies the wire header > into the selected ceph_msg and later uses hdr.data_len to decide whether > to initialize a data cursor. > > If a malicious or compromised monitor advertises a non-zero data segment > for one of these front-only replies, the receive path can call > ceph_msg_data_cursor_init() with length greater than msg->data_length and > hit its BUG_ON() checks, crashing the client kernel. > > I verified the issue against v7.1-rc1-123-ge75a43c7cec4. The msgr2 > trigger path is present since commit cd1a677cad99 ("libceph, ceph: > implement msgr2.1 protocol (crc and secure modes)"), which is contained > in v5.11-rc1 and later. The monitor allocator pattern is older, but I > have not tested older msgr1-only kernels. > > A concrete trigger is a monitor connection over msgr2 after > CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type > handled by mon_alloc_msg(), a valid front_len for that message type and > data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is > allocated with ceph_msg_new(), leaving msg->data_length and > msg->num_data_items as zero. > > Reject monitor replies whose wire data segment is larger than the data > backing allocated for the selected ceph_msg, mirroring the existing OSD > reply hardening. > > Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") > Signed-off-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com> > --- > v2: > - Resend as an inline plain-text patch. > - Add full email address to the From and Signed-off-by identities. > - Add ceph-devel and LKML to the recipient list when sending. > > net/ceph/mon_client.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c > index d2cdc8ee3155..7f2293122bba 100644 > --- a/net/ceph/mon_client.c > +++ b/net/ceph/mon_client.c > @@ -712,6 +712,7 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, > struct ceph_mon_client *monc = con->private; > struct ceph_mon_generic_request *req; > u64 tid = le64_to_cpu(hdr->tid); > + u32 data_len = le32_to_cpu(hdr->data_len); > struct ceph_msg *m; > > mutex_lock(&monc->mutex); > @@ -720,6 +721,16 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, > dout("get_generic_reply %lld dne\n", tid); > *skip = 1; > m = NULL; > + } else if (!req->reply) { The req->reply is set unconditionally when the generic request is allocated and submitted; it is never NULL when req is found in the tree. It looks like a dead code. > + pr_warn("%s tid %llu missing reply buffer, skipping\n", > + __func__, tid); > + *skip = 1; > + m = NULL; > + } else if (data_len > req->reply->data_length) { > + pr_warn("%s tid %llu data %u > preallocated %zu, skipping\n", > + __func__, tid, data_len, req->reply->data_length); As far as I can see, other pr_err() calls in this file never show function name. So, it's not consistent with common style of the module. A malicious monitor can send a flood of oversized frames at line rate. Probably, we need to consider pr_warn_ratelimited(). > + *skip = 1; > + m = NULL; > } else { > dout("get_generic_reply %lld got %p\n", tid, req->reply); > *skip = 0; > @@ -1499,6 +1510,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, > struct ceph_mon_client *monc = con->private; > int type = le16_to_cpu(hdr->type); > int front_len = le32_to_cpu(hdr->front_len); > + u32 data_len = le32_to_cpu(hdr->data_len); > struct ceph_msg *m = NULL; > > *skip = 0; > @@ -1544,6 +1556,15 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, > ceph_msg_put(m); > m = ceph_msg_new(type, front_len, GFP_NOFS, false); > } I would like to see an empty line between these code blocks. > + if (m && data_len > m->data_length) { > + pr_warn("%s data %u > prealloc %zu (%u#%llu), skipping\n", > + __func__, data_len, m->data_length, > + (unsigned int)con->peer_name.type, > + le64_to_cpu(con->peer_name.num)); > + ceph_msg_put(m); > + m = NULL; > + *skip = 1; > + } I don't quite follow to the purpose of this code block. As far as I can see, we have: case CEPH_MSG_STATFS_REPLY: case CEPH_MSG_MON_COMMAND_ACK: return get_generic_reply(con, hdr, skip); What is the purpose of this? Thanks, Slava. > > return m; > } ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3] libceph: reject monitor replies with oversized data segment 2026-05-01 19:37 [PATCH v2] libceph: reject monitor replies with oversized data segment Dhiraj Mishra 2026-05-01 21:37 ` Viacheslav Dubeyko @ 2026-05-02 5:22 ` Dhiraj Mishra 2026-05-04 8:34 ` Raphael Zimmer 1 sibling, 1 reply; 7+ messages in thread From: Dhiraj Mishra @ 2026-05-02 5:22 UTC (permalink / raw) To: Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko Cc: ceph-devel, linux-kernel, Greg Kroah-Hartman, Willy Tarreau, security, Dhiraj Mishra Monitor messages can be allocated from preallocated reply messages or with ceph_msg_new(), both of which may provide only front-buffer storage and no data items. The messenger receive path copies the wire header into the selected ceph_msg and later uses hdr.data_len to decide whether to initialize a data cursor. If a malicious or compromised monitor advertises a non-zero data segment for one of these front-only replies, the receive path can call ceph_msg_data_cursor_init() with length greater than msg->data_length and hit its BUG_ON() checks, crashing the client kernel. I verified the issue against v7.1-rc1-123-ge75a43c7cec4. The msgr2 trigger path is present since commit cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)"), which is contained in v5.11-rc1 and later. The monitor allocator pattern is older, but I have not tested older msgr1-only kernels. A concrete trigger is a monitor connection over msgr2 after CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type handled by mon_alloc_msg(), a valid front_len for that message type and data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is allocated with ceph_msg_new(), leaving msg->data_length and msg->num_data_items as zero. Reject monitor replies whose wire data segment is larger than the data backing allocated for the selected ceph_msg, mirroring the existing OSD reply hardening. Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Signed-off-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com> --- v3: - Remove the impossible !req->reply check for generic requests. - Use pr_warn_ratelimited() for malicious-monitor log spam resistance. - Avoid adding __func__ to the new mon_client warnings. - Add a blank line between the front_len and data_len checks. v2: - Resend as an inline plain-text patch. - Add full email address to the From and Signed-off-by identities. - Add ceph-devel and LKML to the recipient list when sending. net/ceph/mon_client.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c index d2cdc8ee3155..9f1c7ca42f36 100644 --- a/net/ceph/mon_client.c +++ b/net/ceph/mon_client.c @@ -712,6 +712,7 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, struct ceph_mon_client *monc = con->private; struct ceph_mon_generic_request *req; u64 tid = le64_to_cpu(hdr->tid); + u32 data_len = le32_to_cpu(hdr->data_len); struct ceph_msg *m; mutex_lock(&monc->mutex); @@ -720,6 +721,11 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, dout("get_generic_reply %lld dne\n", tid); *skip = 1; m = NULL; + } else if (data_len > req->reply->data_length) { + pr_warn_ratelimited("mon generic reply tid %llu data %u > preallocated %zu, skipping\n", + tid, data_len, req->reply->data_length); + *skip = 1; + m = NULL; } else { dout("get_generic_reply %lld got %p\n", tid, req->reply); *skip = 0; @@ -1499,6 +1505,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, struct ceph_mon_client *monc = con->private; int type = le16_to_cpu(hdr->type); int front_len = le32_to_cpu(hdr->front_len); + u32 data_len = le32_to_cpu(hdr->data_len); struct ceph_msg *m = NULL; *skip = 0; @@ -1545,5 +1552,15 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, m = ceph_msg_new(type, front_len, GFP_NOFS, false); } + if (m && data_len > m->data_length) { + pr_warn_ratelimited("mon message data %u > prealloc %zu (%u#%llu), skipping\n", + data_len, m->data_length, + (unsigned int)con->peer_name.type, + le64_to_cpu(con->peer_name.num)); + ceph_msg_put(m); + m = NULL; + *skip = 1; + } + return m; } ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v3] libceph: reject monitor replies with oversized data segment 2026-05-02 5:22 ` [PATCH v3] " Dhiraj Mishra @ 2026-05-04 8:34 ` Raphael Zimmer 2026-05-04 9:16 ` [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments Dhiraj Mishra 0 siblings, 1 reply; 7+ messages in thread From: Raphael Zimmer @ 2026-05-04 8:34 UTC (permalink / raw) To: Dhiraj Mishra, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko Cc: ceph-devel, linux-kernel, Greg Kroah-Hartman, Willy Tarreau, security On 02.05.26 7:22 AM, Dhiraj Mishra wrote: > Monitor messages can be allocated from preallocated reply messages or > with ceph_msg_new(), both of which may provide only front-buffer storage > and no data items. The messenger receive path copies the wire header > into the selected ceph_msg and later uses hdr.data_len to decide whether > to initialize a data cursor. > > If a malicious or compromised monitor advertises a non-zero data segment > for one of these front-only replies, the receive path can call > ceph_msg_data_cursor_init() with length greater than msg->data_length and > hit its BUG_ON() checks, crashing the client kernel. > > I verified the issue against v7.1-rc1-123-ge75a43c7cec4. The msgr2 > trigger path is present since commit cd1a677cad99 ("libceph, ceph: > implement msgr2.1 protocol (crc and secure modes)"), which is contained > in v5.11-rc1 and later. The monitor allocator pattern is older, but I > have not tested older msgr1-only kernels. > > A concrete trigger is a monitor connection over msgr2 after > CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type > handled by mon_alloc_msg(), a valid front_len for that message type and > data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is > allocated with ceph_msg_new(), leaving msg->data_length and > msg->num_data_items as zero. > > Reject monitor replies whose wire data segment is larger than the data > backing allocated for the selected ceph_msg, mirroring the existing OSD > reply hardening. > > Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") > Signed-off-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com> > --- > v3: > - Remove the impossible !req->reply check for generic requests. > - Use pr_warn_ratelimited() for malicious-monitor log spam resistance. > - Avoid adding __func__ to the new mon_client warnings. > - Add a blank line between the front_len and data_len checks. > > v2: > - Resend as an inline plain-text patch. > - Add full email address to the From and Signed-off-by identities. > - Add ceph-devel and LKML to the recipient list when sending. > > net/ceph/mon_client.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c > index d2cdc8ee3155..9f1c7ca42f36 100644 > --- a/net/ceph/mon_client.c > +++ b/net/ceph/mon_client.c > @@ -712,6 +712,7 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, > struct ceph_mon_client *monc = con->private; > struct ceph_mon_generic_request *req; > u64 tid = le64_to_cpu(hdr->tid); > + u32 data_len = le32_to_cpu(hdr->data_len); > struct ceph_msg *m; > > mutex_lock(&monc->mutex); > @@ -720,6 +721,11 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, > dout("get_generic_reply %lld dne\n", tid); > *skip = 1; > m = NULL; > + } else if (data_len > req->reply->data_length) { > + pr_warn_ratelimited("mon generic reply tid %llu data %u > preallocated %zu, skipping\n", > + tid, data_len, req->reply->data_length); > + *skip = 1; > + m = NULL; > } else { > dout("get_generic_reply %lld got %p\n", tid, req->reply); > *skip = 0; > @@ -1499,6 +1505,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, > struct ceph_mon_client *monc = con->private; > int type = le16_to_cpu(hdr->type); > int front_len = le32_to_cpu(hdr->front_len); > + u32 data_len = le32_to_cpu(hdr->data_len); > struct ceph_msg *m = NULL; > > *skip = 0; > @@ -1545,5 +1552,15 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, > m = ceph_msg_new(type, front_len, GFP_NOFS, false); > } > > + if (m && data_len > m->data_length) { > + pr_warn_ratelimited("mon message data %u > prealloc %zu (%u#%llu), skipping\n", > + data_len, m->data_length, > + (unsigned int)con->peer_name.type, > + le64_to_cpu(con->peer_name.num)); > + ceph_msg_put(m); > + m = NULL; > + *skip = 1; > + } > + > return m; > } Hi Dhiraj, as far as I can see, the same issue can also occur on MDS connections here: https://elixir.bootlin.com/linux/v7.0.1/source/fs/ceph/mds_client.c#L6477 I guess a similar check should also be added there. Maybe this can be done in one patch. Best regards, Raphael ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments 2026-05-04 8:34 ` Raphael Zimmer @ 2026-05-04 9:16 ` Dhiraj Mishra 2026-05-04 19:46 ` Viacheslav Dubeyko 0 siblings, 1 reply; 7+ messages in thread From: Dhiraj Mishra @ 2026-05-04 9:16 UTC (permalink / raw) To: Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko Cc: Raphael Zimmer, ceph-devel, linux-kernel, Greg Kroah-Hartman, Willy Tarreau, security, Dhiraj Mishra Monitor and MDS messages can be allocated with only front-buffer storage and no data items. The messenger receive path copies the wire header into the selected ceph_msg and later uses hdr.data_len to decide whether to initialize a data cursor. If a malicious or compromised peer advertises a non-zero data segment for one of these front-only messages, the receive path can call ceph_msg_data_cursor_init() with length greater than msg->data_length and hit its BUG_ON() checks, crashing the client kernel. I verified the monitor issue against v7.1-rc1-123-ge75a43c7cec4. The msgr2 trigger path is present since commit cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)"), which is contained in v5.11-rc1 and later. The allocator patterns are older, but I have not tested older msgr1-only kernels. A concrete monitor trigger is a monitor connection over msgr2 after CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type handled by mon_alloc_msg(), a valid front_len for that message type and data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is allocated with ceph_msg_new(), leaving msg->data_length and msg->num_data_items as zero. The MDS allocation path has the same issue: mds_alloc_msg() allocates the incoming message with ceph_msg_new(type, front_len, GFP_NOFS, false), leaving msg->data_length as zero for a front-only message selected from an attacker-controlled wire header. Reject monitor and MDS messages whose wire data segment is larger than the data backing allocated for the selected ceph_msg, mirroring the existing OSD reply hardening. Fixes: 53ded495c6ac ("libceph: define mds_alloc_msg() method") Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Signed-off-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com> --- v4: - Add the same oversized data_len guard to the MDS allocation path. v3: - Remove the impossible !req->reply check for generic requests. - Use pr_warn_ratelimited() for malicious-monitor log spam resistance. - Avoid adding __func__ to the new mon_client warnings. - Add a blank line between the front_len and data_len checks. v2: - Resend as an inline plain-text patch. - Add full email address to the From and Signed-off-by identities. - Add ceph-devel and LKML to the recipient list when sending. fs/ceph/mds_client.c | 13 +++++++++++++ net/ceph/mon_client.c | 17 +++++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index ed17e0023705..3b92d37983d1 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -6543,9 +6543,12 @@ static int mds_handle_auth_bad_method(struct ceph_connection *con, static struct ceph_msg *mds_alloc_msg(struct ceph_connection *con, struct ceph_msg_header *hdr, int *skip) { + struct ceph_mds_session *s = con->private; + struct ceph_client *cl = s->s_mdsc->fsc->client; struct ceph_msg *msg; int type = (int) le16_to_cpu(hdr->type); int front_len = (int) le32_to_cpu(hdr->front_len); + u32 data_len = le32_to_cpu(hdr->data_len); if (con->in_msg) return con->in_msg; @@ -6558,6 +6561,16 @@ static struct ceph_msg *mds_alloc_msg(struct ceph_connection *con, return NULL; } + if (data_len > msg->data_length) { + pr_warn_ratelimited_client(cl, + "mds%d message data %u > prealloc %zu, skipping\n", + s->s_mds, data_len, + msg->data_length); + ceph_msg_put(msg); + *skip = 1; + return NULL; + } + return msg; } diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c index d2cdc8ee3155..9f1c7ca42f36 100644 --- a/net/ceph/mon_client.c +++ b/net/ceph/mon_client.c @@ -712,6 +712,7 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, struct ceph_mon_client *monc = con->private; struct ceph_mon_generic_request *req; u64 tid = le64_to_cpu(hdr->tid); + u32 data_len = le32_to_cpu(hdr->data_len); struct ceph_msg *m; mutex_lock(&monc->mutex); @@ -720,6 +721,11 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, dout("get_generic_reply %lld dne\n", tid); *skip = 1; m = NULL; + } else if (data_len > req->reply->data_length) { + pr_warn_ratelimited("mon generic reply tid %llu data %u > preallocated %zu, skipping\n", + tid, data_len, req->reply->data_length); + *skip = 1; + m = NULL; } else { dout("get_generic_reply %lld got %p\n", tid, req->reply); *skip = 0; @@ -1499,6 +1505,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, struct ceph_mon_client *monc = con->private; int type = le16_to_cpu(hdr->type); int front_len = le32_to_cpu(hdr->front_len); + u32 data_len = le32_to_cpu(hdr->data_len); struct ceph_msg *m = NULL; *skip = 0; @@ -1545,5 +1552,15 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, m = ceph_msg_new(type, front_len, GFP_NOFS, false); } + if (m && data_len > m->data_length) { + pr_warn_ratelimited("mon message data %u > prealloc %zu (%u#%llu), skipping\n", + data_len, m->data_length, + (unsigned int)con->peer_name.type, + le64_to_cpu(con->peer_name.num)); + ceph_msg_put(m); + m = NULL; + *skip = 1; + } + return m; } ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments 2026-05-04 9:16 ` [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments Dhiraj Mishra @ 2026-05-04 19:46 ` Viacheslav Dubeyko [not found] ` <CAG8b5tTiRK3AfTRx0hLeTi5GLfQ6sxXxfgUT+xzEqgzzDUMtOg@mail.gmail.com> 0 siblings, 1 reply; 7+ messages in thread From: Viacheslav Dubeyko @ 2026-05-04 19:46 UTC (permalink / raw) To: idryomov@gmail.com, Alex Markuze, slava@dubeyko.com, mishra.dhiraj95@gmail.com Cc: raphael.zimmer@tu-ilmenau.de, w@1wt.eu, ceph-devel@vger.kernel.org, gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, security@kernel.org On Mon, 2026-05-04 at 13:16 +0400, Dhiraj Mishra wrote: > Monitor and MDS messages can be allocated with only front-buffer storage > and no data items. The messenger receive path copies the wire header > into the selected ceph_msg and later uses hdr.data_len to decide whether > to initialize a data cursor. > > If a malicious or compromised peer advertises a non-zero data segment for > one of these front-only messages, the receive path can call > ceph_msg_data_cursor_init() with length greater than msg->data_length and > hit its BUG_ON() checks, crashing the client kernel. > > I verified the monitor issue against v7.1-rc1-123-ge75a43c7cec4. The > msgr2 trigger path is present since commit cd1a677cad99 ("libceph, > ceph: implement msgr2.1 protocol (crc and secure modes)"), which is > contained in v5.11-rc1 and later. The allocator patterns are older, but > I have not tested older msgr1-only kernels. > > A concrete monitor trigger is a monitor connection over msgr2 after > CEPH_CON_S_OPEN where a FRAME_TAG_MESSAGE contains a monitor reply type > handled by mon_alloc_msg(), a valid front_len for that message type and > data_len = 1. CEPH_MSG_MON_MAP is one such example: the message is > allocated with ceph_msg_new(), leaving msg->data_length and > msg->num_data_items as zero. > > The MDS allocation path has the same issue: mds_alloc_msg() allocates the > incoming message with ceph_msg_new(type, front_len, GFP_NOFS, false), > leaving msg->data_length as zero for a front-only message selected from > an attacker-controlled wire header. > > Reject monitor and MDS messages whose wire data segment is larger than > the data backing allocated for the selected ceph_msg, mirroring the > existing OSD reply hardening. > > Fixes: 53ded495c6ac ("libceph: define mds_alloc_msg() method") > Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") > Signed-off-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com> > --- > v4: > - Add the same oversized data_len guard to the MDS allocation path. > > v3: > - Remove the impossible !req->reply check for generic requests. > - Use pr_warn_ratelimited() for malicious-monitor log spam resistance. > - Avoid adding __func__ to the new mon_client warnings. > - Add a blank line between the front_len and data_len checks. > > v2: > - Resend as an inline plain-text patch. > - Add full email address to the From and Signed-off-by identities. > - Add ceph-devel and LKML to the recipient list when sending. > > fs/ceph/mds_client.c | 13 +++++++++++++ > net/ceph/mon_client.c | 17 +++++++++++++++++ > 2 files changed, 30 insertions(+) > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c > index ed17e0023705..3b92d37983d1 100644 > --- a/fs/ceph/mds_client.c > +++ b/fs/ceph/mds_client.c > @@ -6543,9 +6543,12 @@ static int mds_handle_auth_bad_method(struct ceph_connection *con, > static struct ceph_msg *mds_alloc_msg(struct ceph_connection *con, > struct ceph_msg_header *hdr, int *skip) > { > + struct ceph_mds_session *s = con->private; > + struct ceph_client *cl = s->s_mdsc->fsc->client; > struct ceph_msg *msg; > int type = (int) le16_to_cpu(hdr->type); > int front_len = (int) le32_to_cpu(hdr->front_len); > + u32 data_len = le32_to_cpu(hdr->data_len); > > if (con->in_msg) > return con->in_msg; > @@ -6558,6 +6561,16 @@ static struct ceph_msg *mds_alloc_msg(struct ceph_connection *con, > return NULL; > } > > + if (data_len > msg->data_length) { > + pr_warn_ratelimited_client(cl, > + "mds%d message data %u > prealloc %zu, skipping\n", > + s->s_mds, data_len, > + msg->data_length); > + ceph_msg_put(msg); > + *skip = 1; > + return NULL; > + } > + > return msg; > } > > diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c > index d2cdc8ee3155..9f1c7ca42f36 100644 > --- a/net/ceph/mon_client.c > +++ b/net/ceph/mon_client.c > @@ -712,6 +712,7 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, > struct ceph_mon_client *monc = con->private; > struct ceph_mon_generic_request *req; > u64 tid = le64_to_cpu(hdr->tid); > + u32 data_len = le32_to_cpu(hdr->data_len); > struct ceph_msg *m; > > mutex_lock(&monc->mutex); > @@ -720,6 +721,11 @@ static struct ceph_msg *get_generic_reply(struct ceph_connection *con, > dout("get_generic_reply %lld dne\n", tid); > *skip = 1; > m = NULL; > + } else if (data_len > req->reply->data_length) { > + pr_warn_ratelimited("mon generic reply tid %llu data %u > preallocated %zu, skipping\n", > + tid, data_len, req->reply->data_length); > + *skip = 1; > + m = NULL; > } else { > dout("get_generic_reply %lld got %p\n", tid, req->reply); > *skip = 0; > @@ -1499,6 +1505,7 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, > struct ceph_mon_client *monc = con->private; > int type = le16_to_cpu(hdr->type); > int front_len = le32_to_cpu(hdr->front_len); > + u32 data_len = le32_to_cpu(hdr->data_len); > struct ceph_msg *m = NULL; > > *skip = 0; > @@ -1545,5 +1552,15 @@ static struct ceph_msg *mon_alloc_msg(struct ceph_connection *con, > m = ceph_msg_new(type, front_len, GFP_NOFS, false); > } > > + if (m && data_len > m->data_length) { > + pr_warn_ratelimited("mon message data %u > prealloc %zu (%u#%llu), skipping\n", > + data_len, m->data_length, > + (unsigned int)con->peer_name.type, > + le64_to_cpu(con->peer_name.num)); > + ceph_msg_put(m); > + m = NULL; > + *skip = 1; > + } > + I still don't see the answer on my question. We have: case CEPH_MSG_STATFS_REPLY: case CEPH_MSG_MON_COMMAND_ACK: return get_generic_reply(con, hdr, skip); I had impression that this code was related to get_generic_reply() case. But it looks like that I am not correct here. If it is not related to this, then we need to rework this code to integrate suggested new functionality into the whole logic: if (!m) { pr_info("alloc_msg unknown type %d\n", type); *skip = 1; } else if (front_len > m->front_alloc_len) { pr_warn("mon_alloc_msg front %d > prealloc %d (%u#%llu)\n", front_len, m->front_alloc_len, (unsigned int)con->peer_name.type, le64_to_cpu(con->peer_name.num)); ceph_msg_put(m); m = ceph_msg_new(type, front_len, GFP_NOFS, false); } Thanks, Slava. > return m; > } ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <CAG8b5tTiRK3AfTRx0hLeTi5GLfQ6sxXxfgUT+xzEqgzzDUMtOg@mail.gmail.com>]
* RE: [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments [not found] ` <CAG8b5tTiRK3AfTRx0hLeTi5GLfQ6sxXxfgUT+xzEqgzzDUMtOg@mail.gmail.com> @ 2026-05-05 18:48 ` Viacheslav Dubeyko 0 siblings, 0 replies; 7+ messages in thread From: Viacheslav Dubeyko @ 2026-05-05 18:48 UTC (permalink / raw) To: mishra.dhiraj95@gmail.com Cc: gregkh@linuxfoundation.org, raphael.zimmer@tu-ilmenau.de, slava@dubeyko.com, ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Alex Markuze, w@1wt.eu, idryomov@gmail.com, security@kernel.org On Tue, 2026-05-05 at 12:24 +0400, Dhiraj Mishra wrote: > > Hi Slava, > > You are correct that the new check at the tail of mon_alloc_msg() is not > for the get_generic_reply() cases. > > For CEPH_MSG_STATFS_REPLY and CEPH_MSG_MON_COMMAND_ACK we return early to > get_generic_reply(), so those generic replies are covered by the separate > data_len check there. > > The purpose of the mon_alloc_msg() check is the other monitor-reply path > that stays in mon_alloc_msg(). In particular, the concrete trigger I > tested is CEPH_MSG_MON_MAP: it does not use get_generic_reply(), it is > allocated directly with ceph_msg_new(type, front_len, GFP_NOFS, false), > and therefore can have msg->data_length == 0 while the wire header > advertises a non-zero data_len. > > That is also why I kept the data_len validation after this existing > logic: > > if (!m) { > ... > } else if (front_len > m->front_alloc_len) { > ... > ceph_msg_put(m); > m = ceph_msg_new(type, front_len, GFP_NOFS, false); > } > > The front_len branch may replace m with a newly allocated message, so the > data_len check needs to run on that final m as well. If it were folded > into the same else-if chain, the replacement m would not be validated. > > So the split is intentional: > > - get_generic_reply(): > generic replies backed by req->reply > - mon_alloc_msg() tail check: > non-generic monitor replies allocated directly in mon_alloc_msg() > (including the tested CEPH_MSG_MON_MAP case) > > I agree this separation was not obvious enough from the previous diff. > If you prefer, I can respin once more and move the validation into the > common receive-side path so it is checked in one place. > I think that likewise code flow makes more sense: if (!m) { pr_info("alloc_msg unknown type %d\n", type); *skip = 1; return m; } if (front_len > m->front_alloc_len) { pr_warn("mon_alloc_msg front %d > prealloc %d (%u#%llu)\n", front_len, m->front_alloc_len, (unsigned int)con->peer_name.type, le64_to_cpu(con->peer_name.num)); ceph_msg_put(m); m = ceph_msg_new(type, front_len, GFP_NOFS, false); } if (data_len > m->data_length) { pr_warn_ratelimited("mon message data %u > prealloc %zu (%u#%llu), skipping\n", data_len, m->data_length, (unsigned int)con->peer_name.type, le64_to_cpu(con->peer_name.num)); ceph_msg_put(m); m = NULL; *skip = 1; } Is it possible that we could have both issues? Why do we behave differently in both cases? Thanks, Slava. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-05-05 18:49 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-01 19:37 [PATCH v2] libceph: reject monitor replies with oversized data segment Dhiraj Mishra
2026-05-01 21:37 ` Viacheslav Dubeyko
2026-05-02 5:22 ` [PATCH v3] " Dhiraj Mishra
2026-05-04 8:34 ` Raphael Zimmer
2026-05-04 9:16 ` [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments Dhiraj Mishra
2026-05-04 19:46 ` Viacheslav Dubeyko
[not found] ` <CAG8b5tTiRK3AfTRx0hLeTi5GLfQ6sxXxfgUT+xzEqgzzDUMtOg@mail.gmail.com>
2026-05-05 18:48 ` Viacheslav Dubeyko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox