* [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Shivam Kumar @ 2026-03-19  1:26 UTC
To: gregkh; +Cc: security, hch, sagi, kch, linux-nvme, kumar.shivam43666
In-Reply-To: <2026031805-stretch-skid-ae5b@gregkh> [not found]

nvmet_tcp_map_data() reads the SGL length from the incoming NVMe command
and passes it to sgl_alloc() for scatter-gather allocation. The only
existing size check (len > inline_data_size) is gated behind
sgl->type == NVME_SGL_FMT_DATA_DESC | NVME_SGL_FMT_OFFSET, meaning any
other SGL type bypasses the validation entirely. A remote attacker can
send a CapsuleCmd with a non-offset SGL type and a large sgl->length
(e.g. ~3 GB) to trigger an excessive kernel allocation, resulting in a
page allocator WARNING and kernel panic when panic_on_warn is set.

Add an upper-bound check against NVMET_TCP_MAXH2CDATA before calling
sgl_alloc(). This value is already advertised to the host in the ICResp
maxdata field as the maximum data transfer size.

Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
---
 drivers/nvme/target/tcp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index a456dd2fd8bd..c7253221c342 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -431,6 +431,8 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
 			return NVME_SC_SGL_INVALID_OFFSET | NVME_STATUS_DNR;
 		cmd->pdu_len = len;
 	}
+	if (len > NVMET_TCP_MAXH2CDATA)
+		return NVME_SC_SGL_INVALID_DATA | NVME_STATUS_DNR;
 	cmd->req.transfer_len += len;
 
 	cmd->req.sg = sgl_alloc(len, GFP_KERNEL, &cmd->req.sg_cnt);
--
2.34.1
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Maurizio Lombardi @ 2026-03-19  7:59 UTC
To: Shivam Kumar, gregkh; +Cc: security, hch, sagi, kch, linux-nvme

On Thu Mar 19, 2026 at 2:26 AM CET, Shivam Kumar wrote:
> nvmet_tcp_map_data() reads the SGL length from the incoming NVMe command
> and passes it to sgl_alloc() for scatter-gather allocation. The only
> existing size check (len > inline_data_size) is gated behind
> sgl->type == NVME_SGL_FMT_DATA_DESC | NVME_SGL_FMT_OFFSET, meaning any
> other SGL type bypasses the validation entirely. A remote attacker can
> send a CapsuleCmd with a non-offset SGL type and a large sgl->length
> (e.g. ~3 GB) to trigger an excessive kernel allocation, resulting in a
> page allocator WARNING and kernel panic when panic_on_warn is set.
>
> Add an upper-bound check against NVMET_TCP_MAXH2CDATA before calling
> sgl_alloc(). This value is already advertised to the host in the ICResp
> maxdata field as the maximum data transfer size.
>
> Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
> ---
>  drivers/nvme/target/tcp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index a456dd2fd8bd..c7253221c342 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -431,6 +431,8 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
>  			return NVME_SC_SGL_INVALID_OFFSET | NVME_STATUS_DNR;
>  		cmd->pdu_len = len;
>  	}
> +	if (len > NVMET_TCP_MAXH2CDATA)
> +		return NVME_SC_SGL_INVALID_DATA | NVME_STATUS_DNR;
>  	cmd->req.transfer_len += len;
>
>  	cmd->req.sg = sgl_alloc(len, GFP_KERNEL, &cmd->req.sg_cnt);

I am not sure this patch is correct. The intent is valid, but you are
conflating the total transfer size with the maximum size of a single
H2CData PDU.

In other words, if I understood the code correctly:

  - MAXH2CDATA is the size limit for a single H2CData PDU.
  - sgl->length is the total amount of data to transfer.

Also, will the request be completed with the "NVME_SC_SGL_INVALID_DATA |
NVME_STATUS_DNR" status code? If yes, this would violate the spec, which
states that an H2CData PDU with a too-large size should trigger a fatal
transport error.

Maybe the correct fix is to set a sane default for MDTS (Maximum Data
Transfer Size); at the moment it is left at zero (no limit).

Maurizio
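[Editorial note: Maurizio's distinction between the per-PDU cap and the total
transfer size can be made concrete with a quick back-of-the-envelope sketch.
This is illustrative Python, not kernel code; only the MAXH2CDATA value is
taken from the patch under review. A large command is legal with respect to
the per-PDU cap because it is simply carried in many H2CData PDUs, which is
why comparing sgl->length against MAXH2CDATA rejects valid transfers.]

```python
import math

# Per-PDU cap from the quoted driver code (NVMET_TCP_MAXH2CDATA),
# advertised to the host in the ICResp maxdata field.
MAXH2CDATA = 0x400000

def h2cdata_pdus(total_len: int) -> int:
    """Number of H2CData PDUs needed to carry total_len bytes when
    each individual PDU is limited to MAXH2CDATA bytes."""
    return math.ceil(total_len / MAXH2CDATA)

# The ~3 GB command from the commit message is not rejected by the
# per-PDU limit; it just becomes many PDUs.
print(h2cdata_pdus(3 * 2**30))  # 768
```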
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Shivam Kumar @ 2026-03-19 18:00 UTC
To: Maurizio Lombardi; +Cc: gregkh, security, hch, sagi, kch, linux-nvme

Hi Maurizio,

Thank you for the review. You're right, I was conflating the per-PDU
H2CData limit with the total transfer size for the command, and the
error handling should trigger a fatal transport error rather than
completing the request with an NVMe status code.

Would setting a sane default for MDTS be the preferred approach here?
If so, I'm happy to send a v2 implementing that instead.

Thanks,
Shivam

On Thu, Mar 19, 2026 at 3:59 AM Maurizio Lombardi <mlombard@arkamax.eu> wrote:
>
> On Thu Mar 19, 2026 at 2:26 AM CET, Shivam Kumar wrote:
> > nvmet_tcp_map_data() reads the SGL length from the incoming NVMe command
> > and passes it to sgl_alloc() for scatter-gather allocation. The only
> > existing size check (len > inline_data_size) is gated behind
> > sgl->type == NVME_SGL_FMT_DATA_DESC | NVME_SGL_FMT_OFFSET, meaning any
> > other SGL type bypasses the validation entirely. A remote attacker can
> > send a CapsuleCmd with a non-offset SGL type and a large sgl->length
> > (e.g. ~3 GB) to trigger an excessive kernel allocation, resulting in a
> > page allocator WARNING and kernel panic when panic_on_warn is set.
> >
> > Add an upper-bound check against NVMET_TCP_MAXH2CDATA before calling
> > sgl_alloc(). This value is already advertised to the host in the ICResp
> > maxdata field as the maximum data transfer size.
> >
> > Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
> > ---
> >  drivers/nvme/target/tcp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> > index a456dd2fd8bd..c7253221c342 100644
> > --- a/drivers/nvme/target/tcp.c
> > +++ b/drivers/nvme/target/tcp.c
> > @@ -431,6 +431,8 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
> >  			return NVME_SC_SGL_INVALID_OFFSET | NVME_STATUS_DNR;
> >  		cmd->pdu_len = len;
> >  	}
> > +	if (len > NVMET_TCP_MAXH2CDATA)
> > +		return NVME_SC_SGL_INVALID_DATA | NVME_STATUS_DNR;
> >  	cmd->req.transfer_len += len;
> >
> >  	cmd->req.sg = sgl_alloc(len, GFP_KERNEL, &cmd->req.sg_cnt);
>
> I am not sure this patch is correct, the intent is valid but you are
> conflating the total transfer size with the H2CData maximum PDU size.
>
> In other words, if I understood the code correctly:
>
> MAXH2CDATA is the limit of a single H2CData packet's size.
> sgl->length is the total amount of data to transfer.
>
> Also, will the request be completed with "NVME_SC_SGL_INVALID_DATA |
> NVME_STATUS_DNR" code? If yes, this would be a violation of the spec
> that states that an H2CDATA with a too large size should trigger a
> fatal transport error.
>
> Maybe the correct fix is to set a sane default for MDTS
> (Maximum Data Transfer Size), at the moment it's left to zero (no limit).
>
> Maurizio
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Christoph Hellwig @ 2026-03-20  7:48 UTC
To: Shivam Kumar
Cc: Maurizio Lombardi, gregkh, security, hch, sagi, kch, linux-nvme

On Thu, Mar 19, 2026 at 02:00:27PM -0400, Shivam Kumar wrote:
> Hi Maurizio
>
> Thank you for the review. You're right, I was conflating the per-PDU
> H2CData limit with the total transfer size for the command.
> And the error handling should trigger a fatal transport error rather
> than completing the request with an NVMe status code.
>
> Would setting a sane default for MDTS be the preferred approach here?
> If so, I'm happy to send a v2 implementing that instead.

Yes, if we want to limit the I/O size we have to limit MDTS. Which all
other transports do anyway, so I'm kinda surprised TCP doesn't.
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Shivam Kumar @ 2026-04-05 19:46 UTC
To: Christoph Hellwig
Cc: Maurizio Lombardi, gregkh, security, sagi, kch, linux-nvme

Hi Christoph, Maurizio,

Just following up on this thread. Would you like me to send a v2 patch
that sets a sane default MDTS for the TCP transport? If so, is there a
preferred value I should use, or should I follow what the other
transports do?

Thanks,
Shivam Kumar

On Fri, Mar 20, 2026 at 3:48 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Mar 19, 2026 at 02:00:27PM -0400, Shivam Kumar wrote:
> > Thank you for the review. You're right, I was conflating the per-PDU
> > H2CData limit with the total transfer size for the command.
> > And the error handling should trigger a fatal transport error rather
> > than completing the request with an NVMe status code.
> >
> > Would setting a sane default for MDTS be the preferred approach here?
> > If so, I'm happy to send a v2 implementing that instead.
>
> Yes, if we want to limit the I/O size we have to limit MDTS. Which all
> other transports do anyway, so I'm kinda surprised TCP doesn't.
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Christoph Hellwig @ 2026-04-07  6:30 UTC
To: Shivam Kumar
Cc: Christoph Hellwig, Maurizio Lombardi, gregkh, security, sagi, kch, linux-nvme

On Sun, Apr 05, 2026 at 03:46:12PM -0400, Shivam Kumar wrote:
> Hi Christoph, Maurizio
>
> Just following up on this thread.
> Would you like me to send a v2 patch that sets a sane default MDTS for
> the TCP transport?

Yes, that would be great!

> If so, is there a preferred value I should use, or
> should I follow what the other transports do?

I'm honestly not sure. Maurizio/Sagi: any good idea what a TCP default
max MDTS might be?
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Maurizio Lombardi @ 2026-04-08  6:30 UTC
To: Christoph Hellwig, Shivam Kumar
Cc: Maurizio Lombardi, gregkh, security, sagi, kch, linux-nvme

On Tue Apr 7, 2026 at 8:30 AM CEST, Christoph Hellwig wrote:
> On Sun, Apr 05, 2026 at 03:46:12PM -0400, Shivam Kumar wrote:
>> Would you like me to send a v2 patch that sets a sane default MDTS for
>> the TCP transport?
>
> Yes, that would be great!
>
>> If so, is there a preferred value I should use, or
>> should I follow what the other transports do?
>
> I'm honestly not sure. Maurizio/Sagi: any good idea what a TCP default
> max MDTS might be?

I'm not sure there is a "correct" value here. It mostly comes down to
balancing performance against memory usage, rather than anything
TCP-specific. For reference, pci-epf allows a max MDTS of 2 MiB [1],
while RDMA uses 1 MiB.

Personally, I would pick a default of 2 MiB: it should be large enough
to avoid excessive splitting while still being reasonable from a memory
perspective.

Sagi? Any thoughts on that?

I have seen a patch on this mailing list that makes mdts configurable
via configfs, so a user could still change it to a preferred value if
needed.

Maurizio

[1] https://docs.kernel.org/nvme/nvme-pci-endpoint-target.html
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Christoph Hellwig @ 2026-04-09  6:08 UTC
To: Maurizio Lombardi
Cc: Christoph Hellwig, Shivam Kumar, gregkh, security, sagi, kch, linux-nvme

On Wed, Apr 08, 2026 at 08:30:12AM +0200, Maurizio Lombardi wrote:
> Personally, I would pick a default of 2 MiB, it should be large
> enough to avoid excessive splitting while still being reasonable from a
> memory perspective.
>
> Sagi? Any thought on that?
>
> I have seen a patch on this mailing list that makes mdts configurable
> via configfs, so in case of need a user could still change it
> to his preferred value.

Right now that just allows adjusting down, so picking the larger
value sounds sane. We can still increase it later and/or allow
adjusting above the default to an upper cap if needed.
* Re: [PATCH] nvmet-tcp: bound sgl->length check in nvmet_tcp_map_data()

From: Shivam Kumar @ 2026-04-26 23:57 UTC
To: Christoph Hellwig
Cc: Maurizio Lombardi, gregkh, security, sagi, kch, linux-nvme

Hi all,

Gentle ping on this. Is there anything else needed from my side? Or is
a consensus forming around the 2 MiB default? Happy to send an updated
patch if that helps move things along.

Thanks,
Shivam

On Thu, Apr 9, 2026 at 2:08 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Apr 08, 2026 at 08:30:12AM +0200, Maurizio Lombardi wrote:
> > Personally, I would pick a default of 2 MiB, it should be large
> > enough to avoid excessive splitting while still being reasonable from a
> > memory perspective.
> >
> > I have seen a patch on this mailing list that makes mdts configurable
> > via configfs, so in case of need a user could still change it
> > to his preferred value.
>
> Right now that just allows adjusting down, so picking the larger
> value sounds sane. We can still increase it later and/or allow
> adjusting above the default to an upper cap if needed.
* [PATCH v2] nvmet-tcp: set a default MDTS of 2 MiB for TCP transport

From: Shivam Kumar @ 2026-04-27  0:44 UTC
To: hch
Cc: mlombard, sagi, kch, linux-nvme, kbusch, gregkh, security, Shivam Kumar

Unlike other fabrics transports, the TCP target does not set a default
Maximum Data Transfer Size. With the configfs MDTS entry defaulting to 0
(no limit), a remote attacker can send a CapsuleCmd with an arbitrarily
large SGL length, causing sgl_alloc() in nvmet_tcp_map_data() to attempt
an excessive kernel allocation that triggers the OOM killer.

Set a default MDTS of 9 (2 MiB) for TCP. Admins can still adjust via
the configfs mdts attribute if needed.

Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
---
 drivers/nvme/target/tcp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index a456dd2fd8bd..d09c81d07a1d 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -25,6 +25,7 @@
 #define NVMET_TCP_DEF_INLINE_DATA_SIZE	(4 * PAGE_SIZE)
 #define NVMET_TCP_MAXH2CDATA		0x400000 /* 16M arbitrary limit */
 #define NVMET_TCP_BACKLOG		128
+#define NVMET_TCP_DEF_MDTS		9 /* 2 MiB (2^(12+9)) */
 
 static int param_store_val(const char *str, int *val, int min, int max)
 {
@@ -2067,6 +2068,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 	INIT_WORK(&port->accept_work, nvmet_tcp_accept_work);
 	if (port->nport->inline_data_size < 0)
 		port->nport->inline_data_size = NVMET_TCP_DEF_INLINE_DATA_SIZE;
+	if (nport->mdts < 0)
+		nport->mdts = NVMET_TCP_DEF_MDTS;
 
 	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
 				IPPROTO_TCP, &port->sock);
--
2.34.1
* Re: [PATCH v2] nvmet-tcp: set a default MDTS of 2 MiB for TCP transport

From: Maurizio Lombardi @ 2026-04-28  6:07 UTC
To: Shivam Kumar, hch
Cc: mlombard, sagi, kch, linux-nvme, kbusch, gregkh, security

On Mon Apr 27, 2026 at 2:44 AM CEST, Shivam Kumar wrote:
> Unlike other fabrics transports, the TCP target does not set a default
> Maximum Data Transfer Size. With the configfs MDTS entry defaulting to 0
> (no limit), a remote attacker can send a CapsuleCmd with an arbitrarily
> large SGL length, causing sgl_alloc() in nvmet_tcp_map_data() to attempt
> an excessive kernel allocation that triggers the OOM killer.
>
> Set a default MDTS of 9 (2 MiB) for TCP. Admins can still adjust via
> the configfs mdts attribute if needed.
>
> Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
> ---
>  drivers/nvme/target/tcp.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index a456dd2fd8bd..d09c81d07a1d 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -25,6 +25,7 @@
>  #define NVMET_TCP_DEF_INLINE_DATA_SIZE	(4 * PAGE_SIZE)
>  #define NVMET_TCP_MAXH2CDATA		0x400000 /* 16M arbitrary limit */
>  #define NVMET_TCP_BACKLOG		128
> +#define NVMET_TCP_DEF_MDTS		9 /* 2 MiB (2^(12+9)) */
>
>  static int param_store_val(const char *str, int *val, int min, int max)
>  {
> @@ -2067,6 +2068,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
>  	INIT_WORK(&port->accept_work, nvmet_tcp_accept_work);
>  	if (port->nport->inline_data_size < 0)
>  		port->nport->inline_data_size = NVMET_TCP_DEF_INLINE_DATA_SIZE;
> +	if (nport->mdts < 0)
> +		nport->mdts = NVMET_TCP_DEF_MDTS;
>
>  	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
>  				IPPROTO_TCP, &port->sock);

Ok, but what happens if a host doesn't respect the limit advertised by
the target? The spec explicitly says: "If a command is submitted that
exceeds this transfer size, then the command is aborted with a status
code of Invalid Field in Command."

Maurizio
* Re: [PATCH v2] nvmet-tcp: set a default MDTS of 2 MiB for TCP transport

From: Shivam Kumar @ 2026-04-29  1:05 UTC
To: Maurizio Lombardi; +Cc: hch, sagi, kch, linux-nvme, kbusch, gregkh, security

Hi Maurizio,

I'll add enforcement to reject commands exceeding the MDTS limit and
send a v3.

Thanks,
Shivam Kumar
* [PATCH v3] nvmet-tcp: set and enforce a default MDTS for TCP transport

From: Shivam Kumar @ 2026-05-08 20:39 UTC
To: hch
Cc: mlombard, sagi, kch, linux-nvme, kbusch, gregkh, security, Shivam Kumar

Unlike other fabrics transports, the TCP target does not set a default
Maximum Data Transfer Size. With the configfs MDTS entry defaulting to 0
(no limit), a remote attacker can send a CapsuleCmd with an arbitrarily
large SGL length, causing sgl_alloc() in nvmet_tcp_map_data() to attempt
an excessive kernel allocation that triggers the OOM killer.

Set a default MDTS of 9 (2 MiB) for TCP. Enforce the limit server-side
in nvmet_tcp_map_data() by rejecting commands whose SGL length exceeds
the configured MDTS, returning NVME_SC_INVALID_FIELD as required by the
NVMe specification. Admins can still adjust via the configfs mdts
attribute if needed.

Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
---
 drivers/nvme/target/tcp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 164a564ba3b4..098cf877c358 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -25,6 +25,7 @@
 #define NVMET_TCP_DEF_INLINE_DATA_SIZE	(4 * PAGE_SIZE)
 #define NVMET_TCP_MAXH2CDATA		0x400000 /* 16M arbitrary limit */
 #define NVMET_TCP_BACKLOG		128
+#define NVMET_TCP_DEF_MDTS		9 /* 2 MiB (2^(12+9)) */
 
 static int param_store_val(const char *str, int *val, int min, int max)
 {
@@ -422,6 +423,13 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
 	if (!len)
 		return 0;
 
+	/* Enforce MDTS: abort commands exceeding the advertised limit */
+	if (cmd->req.port->mdts) {
+		u8 mdts = cmd->req.port->mdts;
+		if (mdts < 20 && len > (1U << (12 + mdts)))
+			return NVME_SC_INVALID_FIELD | NVME_STATUS_DNR;
+	}
+
 	if (sgl->type == ((NVME_SGL_FMT_DATA_DESC << 4) |
 			NVME_SGL_FMT_OFFSET)) {
 		if (!nvme_is_write(cmd->req.cmd))
@@ -2077,6 +2085,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 	INIT_WORK(&port->accept_work, nvmet_tcp_accept_work);
 	if (port->nport->inline_data_size < 0)
 		port->nport->inline_data_size = NVMET_TCP_DEF_INLINE_DATA_SIZE;
+	if (nport->mdts < 0)
+		nport->mdts = NVMET_TCP_DEF_MDTS;
 
 	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
 				IPPROTO_TCP, &port->sock);
--
2.34.1
* Re: [PATCH v3] nvmet-tcp: set and enforce a default MDTS for TCP transport

From: Sagi Grimberg @ 2026-05-10 20:42 UTC
To: Shivam Kumar, hch
Cc: mlombard, kch, linux-nvme, kbusch, gregkh, security

On 08/05/2026 23:39, Shivam Kumar wrote:
> Unlike other fabrics transports, the TCP target does not set a default
> Maximum Data Transfer Size. With the configfs MDTS entry defaulting to 0
> (no limit), a remote attacker can send a CapsuleCmd with an arbitrarily
> large SGL length, causing sgl_alloc() in nvmet_tcp_map_data() to attempt
> an excessive kernel allocation that triggers the OOM killer.
>
> Set a default MDTS of 9 (2 MiB) for TCP. Enforce the limit server-side
> in nvmet_tcp_map_data() by rejecting commands whose SGL length exceeds
> the configured MDTS, returning NVME_SC_INVALID_FIELD as required by the
> NVMe specification. Admins can still adjust via the configfs mdts
> attribute if needed.
>
> Signed-off-by: Shivam Kumar <kumar.shivam43666@gmail.com>
> ---
>  drivers/nvme/target/tcp.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 164a564ba3b4..098cf877c358 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -25,6 +25,7 @@
>  #define NVMET_TCP_DEF_INLINE_DATA_SIZE	(4 * PAGE_SIZE)
>  #define NVMET_TCP_MAXH2CDATA		0x400000 /* 16M arbitrary limit */
>  #define NVMET_TCP_BACKLOG		128
> +#define NVMET_TCP_DEF_MDTS		9 /* 2 MiB (2^(12+9)) */
>
>  static int param_store_val(const char *str, int *val, int min, int max)
>  {
> @@ -422,6 +423,13 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
>  	if (!len)
>  		return 0;
>
> +	/* Enforce MDTS: abort commands exceeding the advertised limit */
> +	if (cmd->req.port->mdts) {
> +		u8 mdts = cmd->req.port->mdts;
> +		if (mdts < 20 && len > (1U << (12 + mdts)))
> +			return NVME_SC_INVALID_FIELD | NVME_STATUS_DNR;
> +	}
> +
>  	if (sgl->type == ((NVME_SGL_FMT_DATA_DESC << 4) |
>  			NVME_SGL_FMT_OFFSET)) {
>  		if (!nvme_is_write(cmd->req.cmd))
> @@ -2077,6 +2085,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
>  	INIT_WORK(&port->accept_work, nvmet_tcp_accept_work);
>  	if (port->nport->inline_data_size < 0)
>  		port->nport->inline_data_size = NVMET_TCP_DEF_INLINE_DATA_SIZE;
> +	if (nport->mdts < 0)
> +		nport->mdts = NVMET_TCP_DEF_MDTS;
>
>  	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
>  				IPPROTO_TCP, &port->sock);

Shivam,

I think what we want is to limit nvmet-tcp to a sane maximum, similar
to the nvmet-rdma driver. Also, we probably want it to be at least
consistent with the (rather arbitrary) MAXH2CDATA... e.g. something
like:

--
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 20f150d17a96..32f33e5dbfdb 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -24,6 +24,7 @@
 
 #define NVMET_TCP_DEF_INLINE_DATA_SIZE	(4 * PAGE_SIZE)
 #define NVMET_TCP_MAXH2CDATA		0x400000 /* 16M arbitrary limit */
+#define NVMET_TCP_MAX_MDTS		12
 #define NVMET_TCP_BACKLOG		128
 
 static int param_store_val(const char *str, int *val, int min, int max)
@@ -2220,6 +2221,11 @@ static ssize_t nvmet_tcp_host_port_addr(struct nvmet_ctrl *ctrl,
 			(struct sockaddr *)&queue->sockaddr_peer);
 }
 
+static u8 nvmet_tcp_get_mdts(const struct nvmet_ctrl *ctrl)
+{
+	return NVMET_TCP_MAX_MDTS;
+}
+
 static const struct nvmet_fabrics_ops nvmet_tcp_ops = {
 	.owner			= THIS_MODULE,
 	.type			= NVMF_TRTYPE_TCP,
@@ -2231,6 +2237,7 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops = {
 	.install_queue		= nvmet_tcp_install_queue,
 	.disc_traddr		= nvmet_tcp_disc_port_addr,
 	.host_traddr		= nvmet_tcp_host_port_addr,
+	.get_mdts		= nvmet_tcp_get_mdts,
 };
 
 static int __init nvmet_tcp_init(void)
--
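[Editorial note: as a quick cross-check of the values discussed in this
thread (RDMA's 1 MiB, the proposed 2 MiB default, and the 16 MiB implied by
NVMET_TCP_MAX_MDTS of 12), MDTS converts to bytes as a power-of-two multiple
of the controller's minimum memory page size (CAP.MPSMIN). A minimal sketch
in illustrative Python, assuming the 4 KiB page size (shift of 12) used by
the patches above:]

```python
# 4 KiB minimum memory page size, i.e. CAP.MPSMIN shift of 12,
# matching the 2^(12+mdts) expressions in the patches above.
MPSMIN_SHIFT = 12

def mdts_to_bytes(mdts: int) -> int:
    """Maximum per-command transfer size implied by a non-zero MDTS.
    An MDTS of 0 means 'no limit' per the NVMe specification."""
    if mdts == 0:
        raise ValueError("MDTS of 0 means unlimited")
    return 1 << (MPSMIN_SHIFT + mdts)

print(mdts_to_bytes(9))   # 2097152
print(mdts_to_bytes(12))  # 16777216
```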
* Re: [PATCH v3] nvmet-tcp: set and enforce a default MDTS for TCP transport

From: Christoph Hellwig @ 2026-05-11  6:36 UTC
To: Sagi Grimberg
Cc: Shivam Kumar, hch, mlombard, kch, linux-nvme, kbusch, gregkh, security

On Sun, May 10, 2026 at 11:42:48PM +0300, Sagi Grimberg wrote:
> I think what we want is to limit the tcp to a sane limit similar to the
> nvmet-rdma driver.

Agreed.

We might still need the early enforcement in Shivam's patch, though.
* Re: [PATCH v3] nvmet-tcp: set and enforce a default MDTS for TCP transport

From: Sagi Grimberg @ 2026-05-11  8:01 UTC
To: Christoph Hellwig
Cc: Shivam Kumar, mlombard, kch, linux-nvme, kbusch, gregkh, security

On 11/05/2026 9:36, Christoph Hellwig wrote:
> On Sun, May 10, 2026 at 11:42:48PM +0300, Sagi Grimberg wrote:
>> I think what we want is to limit the tcp to a sane limit similar to the
>> nvmet-rdma driver.
> Agreed.
>
> We might still need the early enforcement in Shivam's patch, though.

Agreed. I think that rdma is susceptible here as well (perhaps also fc?).

Perhaps we can have a nvmet_check_mdts() helper that would accept the
request after req->transfer_len was initialized.
* Re: [PATCH v3] nvmet-tcp: set and enforce a default MDTS for TCP transport

From: Christoph Hellwig @ 2026-05-11  8:04 UTC
To: Sagi Grimberg
Cc: Christoph Hellwig, Shivam Kumar, mlombard, kch, linux-nvme, kbusch, gregkh, security

On Mon, May 11, 2026 at 11:01:18AM +0300, Sagi Grimberg wrote:
> Agreed, I think that rdma is susceptible here as well (perhaps also fc?).
>
> Perhaps we can have a helper for nvmet_check_mdts() that would accept
> the request after req->transfer_len was initialized.

Sounds good.