* [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones
@ 2026-04-15 23:02 alistair23
2026-04-16 5:15 ` Christoph Hellwig
2026-04-16 6:16 ` Hannes Reinecke
0 siblings, 2 replies; 5+ messages in thread
From: alistair23 @ 2026-04-15 23:02 UTC (permalink / raw)
To: hare, hch, sagi, kch, kbusch, linux-nvme, linux-kernel, yi.zhang,
mlombard, linux-block
Cc: alistair23, shinichiro.kawasaki, Alistair Francis
From: Alistair Francis <alistair.francis@wdc.com>
Previously after the host sends a REPLACETLSPSK we freed the TLS keys as
part of calling nvmet_auth_sq_free() on success. A recent change ensured
we don't free the keys, allowing REPLACETLSPSK to work.
But that fix results in a kernel memory leak when running
```
nvme_trtype=loop ./check nvme/041 nvme/042 nvme/043 nvme/044 nvme/045 nvme/051 nvme/052
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
```
We can't free the keys on a successful DHCHAP operation, otherwise the
next REPLACETLSPSK will fail, so instead let's free them before we
replace them as part of nvmet_auth_challenge().
This ensures that REPLACETLSPSK works, while also avoiding any memory
leaks.
Fixes: 2e6eb6b277f59 ("nvmet-tcp: Don't free SQ on authentication success")
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
drivers/nvme/target/fabrics-cmd-auth.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c
index b9ab80c7a6941..58185184478a4 100644
--- a/drivers/nvme/target/fabrics-cmd-auth.c
+++ b/drivers/nvme/target/fabrics-cmd-auth.c
@@ -412,6 +412,13 @@ static int nvmet_auth_challenge(struct nvmet_req *req, void *d, int al)
int hash_len = nvme_auth_hmac_hash_len(ctrl->shash_id);
int data_size = sizeof(*d) + hash_len;
+ /*
+ * If replacing the keys then we have previous successful keys
+ * that might be leaked, so we need to free them here.
+ */
+ if (req->sq->dhchap_c1)
+ nvmet_auth_sq_free(req->sq);
+
if (ctrl->dh_tfm)
data_size += ctrl->dh_keysize;
if (al < data_size) {
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones 2026-04-15 23:02 [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones alistair23 @ 2026-04-16 5:15 ` Christoph Hellwig 2026-04-16 6:16 ` Hannes Reinecke 1 sibling, 0 replies; 5+ messages in thread From: Christoph Hellwig @ 2026-04-16 5:15 UTC (permalink / raw) To: alistair23 Cc: hare, hch, sagi, kch, kbusch, linux-nvme, linux-kernel, yi.zhang, mlombard, linux-block, shinichiro.kawasaki, Alistair Francis Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones 2026-04-15 23:02 [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones alistair23 2026-04-16 5:15 ` Christoph Hellwig @ 2026-04-16 6:16 ` Hannes Reinecke 2026-04-16 15:27 ` Chris Leech 1 sibling, 1 reply; 5+ messages in thread From: Hannes Reinecke @ 2026-04-16 6:16 UTC (permalink / raw) To: alistair23, hch, sagi, kch, kbusch, linux-nvme, linux-kernel, yi.zhang, mlombard, linux-block Cc: shinichiro.kawasaki, Alistair Francis On 4/16/26 01:02, alistair23@gmail.com wrote: > From: Alistair Francis <alistair.francis@wdc.com> > > Previously after the host sends a REPLACETLSPSK we freed the TLS keys as > part of calling nvmet_auth_sq_free() on success. A recent change ensured > we don't free the keys, allowing REPLACETLSPSK to work. > > But that fix results in a kernel memory leak when running > > ``` > nvme_trtype=loop ./check nvme/041 nvme/042 nvme/043 nvme/044 nvme/045 nvme/051 nvme/052 > echo scan > /sys/kernel/debug/kmemleak > cat /sys/kernel/debug/kmemleak > ``` > > We can't free the keys on a successful DHCHAP operation, otherwise the > next REPLACETLSPSK will fail, so instead let's free them before we > replace them as part of nvmet_auth_challenge(). > > This ensures that REPLACETLSPSK works, while also avoiding any memory > leaks. > > Fixes: 2e6eb6b277f59 ("nvmet-tcp: Don't free SQ on authentication success") > Signed-off-by: Alistair Francis <alistair.francis@wdc.com> > --- > drivers/nvme/target/fabrics-cmd-auth.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c > index b9ab80c7a6941..58185184478a4 100644 > --- a/drivers/nvme/target/fabrics-cmd-auth.c > +++ b/drivers/nvme/target/fabrics-cmd-auth.c > @@ -412,6 +412,13 @@ static int nvmet_auth_challenge(struct nvmet_req *req, void *d, int al) > int hash_len = nvme_auth_hmac_hash_len(ctrl->shash_id); > int data_size = sizeof(*d) + hash_len; > > + /* > + * If replacing the keys then we have previous successful keys > + * that might be leaked, so we need to free them here. > + */ > + if (req->sq->dhchap_c1) > + nvmet_auth_sq_free(req->sq); > + > if (ctrl->dh_tfm) > data_size += ctrl->dh_keysize; > if (al < data_size) { I am not sure. The authentication variables should be freed as soon as the authentication completes; the session key is ephemeral and should not be stored longer than necessary and will _never_ be used again once authentication completes. The TLS key, OTOH, is used throughout the session and needs to be present while the session is active As such, both sets have vastly different lifetimes, and I would argue that this void nvmet_auth_sq_free(struct nvmet_sq *sq) { cancel_delayed_work(&sq->auth_expired_work); #ifdef CONFIG_NVME_TARGET_TCP_TLS sq->tls_key = NULL; #endif kfree(sq->dhchap_c1); sq->dhchap_c1 = NULL; is actually wrong as we should not modify 'tls_key' here. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones 2026-04-16 6:16 ` Hannes Reinecke @ 2026-04-16 15:27 ` Chris Leech 2026-04-17 0:49 ` Alistair Francis 0 siblings, 1 reply; 5+ messages in thread From: Chris Leech @ 2026-04-16 15:27 UTC (permalink / raw) To: Hannes Reinecke Cc: alistair23, hch, sagi, kch, kbusch, linux-nvme, linux-kernel, yi.zhang, mlombard, linux-block, shinichiro.kawasaki, Alistair Francis On Thu, Apr 16, 2026 at 08:16:14AM +0200, Hannes Reinecke wrote: > On 4/16/26 01:02, alistair23@gmail.com wrote: > > From: Alistair Francis <alistair.francis@wdc.com> > > > > Previously after the host sends a REPLACETLSPSK we freed the TLS keys as > > part of calling nvmet_auth_sq_free() on success. A recent change ensured > > we don't free the keys, allowing REPLACETLSPSK to work. > > > > But that fix results in a kernel memory leak when running > > > > ``` > > nvme_trtype=loop ./check nvme/041 nvme/042 nvme/043 nvme/044 nvme/045 nvme/051 nvme/052 > > echo scan > /sys/kernel/debug/kmemleak > > cat /sys/kernel/debug/kmemleak > > ``` > > > > We can't free the keys on a successful DHCHAP operation, otherwise the > > next REPLACETLSPSK will fail, so instead let's free them before we > > replace them as part of nvmet_auth_challenge(). > > > > This ensures that REPLACETLSPSK works, while also avoiding any memory > > leaks. > > > > Fixes: 2e6eb6b277f59 ("nvmet-tcp: Don't free SQ on authentication success") > > Signed-off-by: Alistair Francis <alistair.francis@wdc.com> > > --- > > drivers/nvme/target/fabrics-cmd-auth.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c > > index b9ab80c7a6941..58185184478a4 100644 > > --- a/drivers/nvme/target/fabrics-cmd-auth.c > > +++ b/drivers/nvme/target/fabrics-cmd-auth.c > > @@ -412,6 +412,13 @@ static int nvmet_auth_challenge(struct nvmet_req *req, void *d, int al) > > int hash_len = nvme_auth_hmac_hash_len(ctrl->shash_id); > > int data_size = sizeof(*d) + hash_len; > > + /* > > + * If replacing the keys then we have previous successful keys > > + * that might be leaked, so we need to free them here. > > + */ > > + if (req->sq->dhchap_c1) > > + nvmet_auth_sq_free(req->sq); > > + > > if (ctrl->dh_tfm) > > data_size += ctrl->dh_keysize; > > if (al < data_size) { > I am not sure. > The authentication variables should be freed as soon as the authentication > completes; the session key is ephemeral and > should not be stored longer than necessary and will _never_ > be used again once authentication completes. > The TLS key, OTOH, is used throughout the session and needs > to be present while the session is active > As such, both sets have vastly different lifetimes, and > I would argue that this > > void nvmet_auth_sq_free(struct nvmet_sq *sq) > { > cancel_delayed_work(&sq->auth_expired_work); > #ifdef CONFIG_NVME_TARGET_TCP_TLS > sq->tls_key = NULL; > #endif > kfree(sq->dhchap_c1); > sq->dhchap_c1 = NULL; > > is actually wrong as we should not modify 'tls_key' here. I agree with Hannes, and was just about to respond with the same feedback. I think the freeing of the auth temporaries needs to be returned to fix the memleak, and the real problem is the setting of tls_key to NULL. That doesn't seem like the right lifetime for tls_key, and it looks to be a reference count leak as well. Is the presence of sq->tls_key the best check to see if the socket is currently in a kTLS mode? (it might be, I'm not as up on the target code) - Chris ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones 2026-04-16 15:27 ` Chris Leech @ 2026-04-17 0:49 ` Alistair Francis 0 siblings, 0 replies; 5+ messages in thread From: Alistair Francis @ 2026-04-17 0:49 UTC (permalink / raw) To: Chris Leech Cc: Hannes Reinecke, hch, sagi, kch, kbusch, linux-nvme, linux-kernel, yi.zhang, mlombard, linux-block, shinichiro.kawasaki, Alistair Francis On Fri, Apr 17, 2026 at 1:27 AM Chris Leech <cleech@redhat.com> wrote: > > On Thu, Apr 16, 2026 at 08:16:14AM +0200, Hannes Reinecke wrote: > > On 4/16/26 01:02, alistair23@gmail.com wrote: > > > From: Alistair Francis <alistair.francis@wdc.com> > > > > > > Previously after the host sends a REPLACETLSPSK we freed the TLS keys as > > > part of calling nvmet_auth_sq_free() on success. A recent change ensured > > > we don't free the keys, allowing REPLACETLSPSK to work. > > > > > > But that fix results in a kernel memory leak when running > > > > > > ``` > > > nvme_trtype=loop ./check nvme/041 nvme/042 nvme/043 nvme/044 nvme/045 nvme/051 nvme/052 > > > echo scan > /sys/kernel/debug/kmemleak > > > cat /sys/kernel/debug/kmemleak > > > ``` > > > > > > We can't free the keys on a successful DHCHAP operation, otherwise the > > > next REPLACETLSPSK will fail, so instead let's free them before we > > > replace them as part of nvmet_auth_challenge(). > > > > > > This ensures that REPLACETLSPSK works, while also avoiding any memory > > > leaks. > > > > > > Fixes: 2e6eb6b277f59 ("nvmet-tcp: Don't free SQ on authentication success") > > > Signed-off-by: Alistair Francis <alistair.francis@wdc.com> > > > --- > > > drivers/nvme/target/fabrics-cmd-auth.c | 7 +++++++ > > > 1 file changed, 7 insertions(+) > > > > > > diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c > > > index b9ab80c7a6941..58185184478a4 100644 > > > --- a/drivers/nvme/target/fabrics-cmd-auth.c > > > +++ b/drivers/nvme/target/fabrics-cmd-auth.c > > > @@ -412,6 +412,13 @@ static int nvmet_auth_challenge(struct nvmet_req *req, void *d, int al) > > > int hash_len = nvme_auth_hmac_hash_len(ctrl->shash_id); > > > int data_size = sizeof(*d) + hash_len; > > > + /* > > > + * If replacing the keys then we have previous successful keys > > > + * that might be leaked, so we need to free them here. > > > + */ > > > + if (req->sq->dhchap_c1) > > > + nvmet_auth_sq_free(req->sq); > > > + > > > if (ctrl->dh_tfm) > > > data_size += ctrl->dh_keysize; > > > if (al < data_size) { > > I am not sure. > > The authentication variables should be freed as soon as the authentication > > completes; the session key is ephemeral and > > should not be stored longer than necessary and will _never_ > > be used again once authentication completes. > > The TLS key, OTOH, is used throughout the session and needs > > to be present while the session is active > > As such, both sets have vastly different lifetimes, and > > I would argue that this > > > > void nvmet_auth_sq_free(struct nvmet_sq *sq) > > { > > cancel_delayed_work(&sq->auth_expired_work); > > #ifdef CONFIG_NVME_TARGET_TCP_TLS > > sq->tls_key = NULL; > > #endif > > kfree(sq->dhchap_c1); > > sq->dhchap_c1 = NULL; > > > > is actually wrong as we should not modify 'tls_key' here. > > I agree with Hannes, and was just about to respond with the same > feedback. I think the freeing of the auth temporaries needs to be > returned to fix the memleak, and the real problem is the setting of > tls_key to NULL. That doesn't seem like the right lifetime for tls_key, > and it looks to be a reference count leak as well. Yep, agreed. Patches sent to revert the free and remove the `sq->tls_key = NULL;` > > Is the presence of sq->tls_key the best check to see if the socket is > currently in a kTLS mode? (it might be, I'm not as up on the target > code) I think it is correct, the issue is just that we are setting sq->tls_key to NULL Alistair ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-17 0:49 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-04-15 23:02 [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones alistair23 2026-04-16 5:15 ` Christoph Hellwig 2026-04-16 6:16 ` Hannes Reinecke 2026-04-16 15:27 ` Chris Leech 2026-04-17 0:49 ` Alistair Francis
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox