[PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown

public inbox for linux-fsdevel@vger.kernel.org

* [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
@ 2025-10-21 21:33 Bernd Schubert
  2025-10-21 21:33 ` [PATCH 1/2] fuse: Move ring queues_refs decrement Bernd Schubert
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Bernd Schubert @ 2025-10-21 21:33 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Joanne Koong, linux-fsdevel, Jian Huang Li, Bernd Schubert,
	stable

Do not merge yet, the current series has not been tested.
The race is only easily reproducible with additional patches that
pin pages during FUSE_IO_URING_CMD_REGISTER. That slows registration
down, and xfstests' generic/001 then triggers it reliably. However, I
need to update these pin patches for linux master.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
---
Bernd Schubert (1):
      fuse: Move ring queues_refs decrement

Jian Huang Li (1):
      fs/fuse: fix potential memory leak from fuse_uring_cancel

 fs/fuse/dev_uring.c | 33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)
---
base-commit: 6548d364a3e850326831799d7e3ea2d7bb97ba08
change-id: 20251021-io-uring-fixes-cancel-mem-leak-820642677c37

Best regards,
-- 
Bernd Schubert <bschubert@ddn.com>



* [PATCH 1/2] fuse: Move ring queues_refs decrement
  2025-10-21 21:33 [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
@ 2025-10-21 21:33 ` Bernd Schubert
  2025-10-21 21:33 ` [PATCH 2/2] fs/fuse: fix potential memory leak from fuse_uring_cancel Bernd Schubert
  2026-04-09 11:02 ` [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
  2 siblings, 0 replies; 14+ messages in thread
From: Bernd Schubert @ 2025-10-21 21:33 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Joanne Koong, linux-fsdevel, Jian Huang Li, Bernd Schubert

This is just to avoid code duplication with an upcoming commit.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
---
 fs/fuse/dev_uring.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
index f6b12aebb8bbe7d255980593b75b5fb5af9c669e..e7c1095b83b11fe46080c24f539df17e70969e21 100644
--- a/fs/fuse/dev_uring.c
+++ b/fs/fuse/dev_uring.c
@@ -328,7 +328,7 @@ static void fuse_uring_entry_teardown(struct fuse_ring_ent *ent)
 {
 	struct fuse_req *req;
 	struct io_uring_cmd *cmd;
-
+	ssize_t queue_refs;
 	struct fuse_ring_queue *queue = ent->queue;
 
 	spin_lock(&queue->lock);
@@ -356,15 +356,16 @@ static void fuse_uring_entry_teardown(struct fuse_ring_ent *ent)
 
 	if (req)
 		fuse_uring_stop_fuse_req_end(req);
+
+	queue_refs = atomic_dec_return(&queue->ring->queue_refs);
+	WARN_ON_ONCE(queue_refs < 0);
 }
 
 static void fuse_uring_stop_list_entries(struct list_head *head,
 					 struct fuse_ring_queue *queue,
 					 enum fuse_ring_req_state exp_state)
 {
-	struct fuse_ring *ring = queue->ring;
 	struct fuse_ring_ent *ent, *next;
-	ssize_t queue_refs = SSIZE_MAX;
 	LIST_HEAD(to_teardown);
 
 	spin_lock(&queue->lock);
@@ -381,11 +382,8 @@ static void fuse_uring_stop_list_entries(struct list_head *head,
 	spin_unlock(&queue->lock);
 
 	/* no queue lock to avoid lock order issues */
-	list_for_each_entry_safe(ent, next, &to_teardown, list) {
+	list_for_each_entry_safe(ent, next, &to_teardown, list)
 		fuse_uring_entry_teardown(ent);
-		queue_refs = atomic_dec_return(&ring->queue_refs);
-		WARN_ON_ONCE(queue_refs < 0);
-	}
 }
 
 static void fuse_uring_teardown_entries(struct fuse_ring_queue *queue)

-- 
2.43.0



* [PATCH 2/2] fs/fuse: fix potential memory leak from fuse_uring_cancel
  2025-10-21 21:33 [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
  2025-10-21 21:33 ` [PATCH 1/2] fuse: Move ring queues_refs decrement Bernd Schubert
@ 2025-10-21 21:33 ` Bernd Schubert
  2026-04-09 11:02 ` [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
  2 siblings, 0 replies; 14+ messages in thread
From: Bernd Schubert @ 2025-10-21 21:33 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Joanne Koong, linux-fsdevel, Jian Huang Li, Bernd Schubert,
	stable

From: Jian Huang Li <ali@ddn.com>

This issue is sometimes observed during libfuse xfstests runs; dmesg
prints something like "kernel: WARNING: CPU: 4 PID: 0 at
fs/fuse/dev_uring.c:204 fuse_uring_destruct+0x1f5/0x200 [fuse]".

The cause: the fuse daemon has just submitted
FUSE_IO_URING_CMD_REGISTER SQEs when umount happens or the daemon quits
at this very early stage. After all uring queues have been stopped, one
or more not yet processed FUSE_IO_URING_CMD_REGISTER SQEs may still get
processed. New ring entries are then created and added to
ent_avail_queue, and fuse_uring_cancel() immediately moves them to
ent_in_userspace once the SQEs get canceled. These ring entries are
never moved to ent_released and are still on ent_in_userspace when
fuse_uring_destruct() is called.

One way to solve it would be to also free 'ent_in_userspace' in
fuse_uring_destruct(), but from the code alone it is hard to see why
that is needed. As suggested by Joanne, another solution is to avoid
moving entries to the 'ent_in_userspace' list in fuse_uring_cancel()
and to release them directly instead.

Fixes: b6236c8407cb ("fuse: {io-uring} Prevent mount point hang on fuse-server termination")
Cc: Joanne Koong <joannelkoong@gmail.com>
Cc: <stable@vger.kernel.org> # v6.14
Signed-off-by: Jian Huang Li <ali@ddn.com>
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
---
 fs/fuse/dev_uring.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
index e7c1095b83b11fe46080c24f539df17e70969e21..d88a0c05434a04668241f09f123d5e3a9cc1621d 100644
--- a/fs/fuse/dev_uring.c
+++ b/fs/fuse/dev_uring.c
@@ -324,7 +324,7 @@ static void fuse_uring_stop_fuse_req_end(struct fuse_req *req)
 /*
  * Release a request/entry on connection tear down
  */
-static void fuse_uring_entry_teardown(struct fuse_ring_ent *ent)
+static void fuse_uring_entry_teardown(struct fuse_ring_ent *ent, int issue_flags)
 {
 	struct fuse_req *req;
 	struct io_uring_cmd *cmd;
@@ -352,7 +352,7 @@ static void fuse_uring_entry_teardown(struct fuse_ring_ent *ent)
 	spin_unlock(&queue->lock);
 
 	if (cmd)
-		io_uring_cmd_done(cmd, -ENOTCONN, IO_URING_F_UNLOCKED);
+		io_uring_cmd_done(cmd, -ENOTCONN, issue_flags);
 
 	if (req)
 		fuse_uring_stop_fuse_req_end(req);
@@ -383,7 +383,7 @@ static void fuse_uring_stop_list_entries(struct list_head *head,
 
 	/* no queue lock to avoid lock order issues */
 	list_for_each_entry_safe(ent, next, &to_teardown, list)
-		fuse_uring_entry_teardown(ent);
+		fuse_uring_entry_teardown(ent, IO_URING_F_UNLOCKED);
 }
 
 static void fuse_uring_teardown_entries(struct fuse_ring_queue *queue)
@@ -499,7 +499,7 @@ static void fuse_uring_cancel(struct io_uring_cmd *cmd,
 {
 	struct fuse_ring_ent *ent = uring_cmd_to_ring_ent(cmd);
 	struct fuse_ring_queue *queue;
-	bool need_cmd_done = false;
+	bool teardown = false;
 
 	/*
 	 * direct access on ent - it must not be destructed as long as
@@ -508,17 +508,14 @@ static void fuse_uring_cancel(struct io_uring_cmd *cmd,
 	queue = ent->queue;
 	spin_lock(&queue->lock);
 	if (ent->state == FRRS_AVAILABLE) {
-		ent->state = FRRS_USERSPACE;
-		list_move_tail(&ent->list, &queue->ent_in_userspace);
-		need_cmd_done = true;
-		ent->cmd = NULL;
+		ent->state = FRRS_TEARDOWN;
+		list_del_init(&ent->list);
+		teardown = true;
 	}
 	spin_unlock(&queue->lock);
 
-	if (need_cmd_done) {
-		/* no queue lock to avoid lock order issues */
-		io_uring_cmd_done(cmd, -ENOTCONN, issue_flags);
-	}
+	if (teardown)
+		fuse_uring_entry_teardown(ent, issue_flags);
 }
 
 static void fuse_uring_prepare_cancel(struct io_uring_cmd *cmd, int issue_flags,

-- 
2.43.0



* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2025-10-21 21:33 [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
  2025-10-21 21:33 ` [PATCH 1/2] fuse: Move ring queues_refs decrement Bernd Schubert
  2025-10-21 21:33 ` [PATCH 2/2] fs/fuse: fix potential memory leak from fuse_uring_cancel Bernd Schubert
@ 2026-04-09 11:02 ` Bernd Schubert
  2026-04-09 23:09   ` Joanne Koong
  2 siblings, 1 reply; 14+ messages in thread
From: Bernd Schubert @ 2026-04-09 11:02 UTC (permalink / raw)
  To: Bernd Schubert, Miklos Szeredi
  Cc: Joanne Koong, linux-fsdevel, Jian Huang Li, stable,
	Horst Birthelmer



On 10/21/25 23:33, Bernd Schubert wrote:
> Do not merge yet, the current series has not been tested yet.

I'm glad that I was hesitating to apply it. The DDN branch had it
for ages and this patch actually introduced a possible fc->num_waiting
issue, because fc->uring->queue_refs might go down to 0 through
fuse_uring_cancel() and then fuse_uring_abort() would never stop and
flush the queues without another addition.

Thanks,
Bernd

> The race is only easily reproducible with additional patches that
> pin pages during FUSE_IO_URING_CMD_REGISTER - slows it down and then
> xfstest's generic/001 triggers it reliably. However, I need to update
> these pin patches for linux master.
> 
> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> ---
> Bernd Schubert (1):
>       fuse: Move ring queues_refs decrement
> 
> Jian Huang Li (1):
>       fs/fuse: fix potential memory leak from fuse_uring_cancel
> 
>  fs/fuse/dev_uring.c | 33 ++++++++++++++-------------------
>  1 file changed, 14 insertions(+), 19 deletions(-)
> ---
> base-commit: 6548d364a3e850326831799d7e3ea2d7bb97ba08
> change-id: 20251021-io-uring-fixes-cancel-mem-leak-820642677c37
> 
> Best regards,



* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-09 11:02 ` [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
@ 2026-04-09 23:09   ` Joanne Koong
  2026-04-10  7:21     ` Horst Birthelmer
  2026-04-10 11:26     ` Bernd Schubert
  0 siblings, 2 replies; 14+ messages in thread
From: Joanne Koong @ 2026-04-09 23:09 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Bernd Schubert, Miklos Szeredi, linux-fsdevel, Jian Huang Li,
	stable, Horst Birthelmer

On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>
>
>
> On 10/21/25 23:33, Bernd Schubert wrote:
> > Do not merge yet, the current series has not been tested yet.
>
> I'm glad that I was hesitating to apply it. The DDN branch had it
> for ages and this patch actually introduced a possible fc->num_waiting
> issue, because fc->uring->queue_refs might go down to 0 through
> fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> flush the queues without another addition.
>

Hi Bernd and Jian,

For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
from fuse_uring_cancel" email was never delivered to my inbox, so I am
just going to write my reply to that patch here instead, hope that's
ok.

Just to summarize, the race is that during unmount, fuse_abort() ->
fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
fuse_uring_entry_teardown() gets run but there may still be sqes that
are being registered, which results in new ents that are created (and
leaked) after the teardown logic has finished and the queues are
stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
never gets scheduled because at the time of teardown, queue->refs is 0
as those sqes have not fully created the ents and grabbed refs yet.
fuse_uring_destruct() runs during unmount, but this doesn't clean up
the created ents because those registered ents got put on the
ent_in_userspace list which fuse_uring_destruct() doesn't go through
to free, resulting in those ents being leaked.
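
The leak described above can be modelled in userspace. This is only a
minimal sketch, not kernel code: the counters stand in for the three
per-queue lists, all model_* names are made up for illustration, and it
assumes (as the thread describes) that destruct frees only entries on
ent_released and that teardown parks entries there.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-queue entry counts; each counter stands in for one kernel list. */
struct model_queue {
	int avail;        /* ent_avail_queue */
	int in_userspace; /* ent_in_userspace */
	int released;     /* ent_released */
};

/* Upstream behaviour: cancel parks the entry on ent_in_userspace. */
static void model_cancel_upstream(struct model_queue *q)
{
	if (q->avail > 0) {
		q->avail--;
		q->in_userspace++;
	}
}

/*
 * Behaviour of patch 2/2: cancel tears the entry down directly, which
 * is assumed here to put it on ent_released.
 */
static void model_cancel_patched(struct model_queue *q)
{
	if (q->avail > 0) {
		q->avail--;
		q->released++;
	}
}

/* Frees ent_released only; returns how many entries are left behind. */
static int model_destruct(struct model_queue *q)
{
	assert(q->released >= 0);
	q->released = 0;
	return q->avail + q->in_userspace;
}

/* One late REGISTER, then cancel and destruct; returns the leak count. */
static int model_late_register_leak(bool patched)
{
	struct model_queue q = { 0 };

	q.avail++; /* REGISTER processed after the queues were stopped */
	if (patched)
		model_cancel_patched(&q);
	else
		model_cancel_upstream(&q);
	return model_destruct(&q);
}
```

Calling model_late_register_leak(false) leaves one entry behind, while
model_late_register_leak(true) leaves none.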

The root cause of the race is that ents are being registered even when
the queue is already stopped/dead. I think if we at registration time
check the queue state before calling fuse_uring_prepare_cancel(), we
eliminate the race altogether. If we see that the abort path has
already triggered (eg queue->stopped == true), we manually free the
ent and return an error instead of adding it to a list, eg

diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
index d88a0c05434a..351c19150aae 100644
--- a/fs/fuse/dev_uring.c
+++ b/fs/fuse/dev_uring.c
@@ -969,7 +969,7 @@ static bool is_ring_ready(struct fuse_ring *ring, int current_qid)
 /*
  * fuse_uring_req_fetch command handling
  */
-static void fuse_uring_do_register(struct fuse_ring_ent *ent,
+static int fuse_uring_do_register(struct fuse_ring_ent *ent,
                                   struct io_uring_cmd *cmd,
                                   unsigned int issue_flags)
 {
@@ -978,6 +978,16 @@ static void fuse_uring_do_register(struct fuse_ring_ent *ent,
        struct fuse_conn *fc = ring->fc;
        struct fuse_iqueue *fiq = &fc->iq;

+       spin_lock(&queue->lock);
+       /* abort teardown path is running or has run */
+       if (queue->stopped) {
+               spin_unlock(&queue->lock);
+               atomic_dec(&ring->queue_refs);
+               kfree(ent);
+               return -ECONNABORTED;
+       }
+       spin_unlock(&queue->lock);
+
        fuse_uring_prepare_cancel(cmd, issue_flags, ent);

        spin_lock(&queue->lock);
@@ -994,6 +1004,7 @@ static void fuse_uring_do_register(struct fuse_ring_ent *ent,
                        wake_up_all(&fc->blocked_waitq);
                }
        }
+       return 0;
 }

 /*
@@ -1109,9 +1120,7 @@ static int fuse_uring_register(struct io_uring_cmd *cmd,
        if (IS_ERR(ent))
                return PTR_ERR(ent);

-       fuse_uring_do_register(ent, cmd, issue_flags);
-
-       return 0;
+       return fuse_uring_do_register(ent, cmd, issue_flags);
 }

There's the scenario where the abort path's "queue->stopped = true"
gets set right between when we drop the queue lock and before we call
fuse_uring_prepare_cancel(), but the fuse_uring_create_ring_ent()
logic that was called before fuse_uring_do_register() has already
grabbed the ref on ring->queue_refs, which means in the abort path,
the async teardown (fuse_uring_async_stop_queues()) work is guaranteed
to run and clean up / free the entry.
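
The register-time check can be sketched in userspace as follows. This
is a simplified model under stated assumptions, not the kernel
implementation: an atomic_flag spinlock stands in for queue->lock, a
counter stands in for the available list, and every model_* name is
hypothetical. As in the diff above, the reference is taken when the
entry is created and dropped again if the queue is already stopped.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_queue {
	atomic_flag lock;       /* stands in for queue->lock */
	bool stopped;           /* queue->stopped */
	int nr_avail;           /* length of the available list */
	atomic_int *queue_refs; /* ring->queue_refs */
};

struct model_ent {
	struct model_queue *queue;
};

static void model_lock(struct model_queue *q)
{
	while (atomic_flag_test_and_set(&q->lock))
		; /* spin */
}

static void model_unlock(struct model_queue *q)
{
	atomic_flag_clear(&q->lock);
}

/* Like entry creation in the thread: the ref is grabbed up front. */
static struct model_ent *model_create_ent(struct model_queue *q)
{
	struct model_ent *ent = calloc(1, sizeof(*ent));

	if (!ent)
		return NULL;
	ent->queue = q;
	atomic_fetch_add(q->queue_refs, 1);
	return ent;
}

/* Refuse registration once the abort path has marked the queue stopped. */
static int model_do_register(struct model_ent *ent)
{
	struct model_queue *q = ent->queue;

	model_lock(q);
	if (q->stopped) {
		model_unlock(q);
		atomic_fetch_sub(q->queue_refs, 1);
		free(ent);
		return -ECONNABORTED;
	}
	model_unlock(q);

	/* the kernel would call fuse_uring_prepare_cancel() here */

	model_lock(q);
	q->nr_avail++;
	model_unlock(q);
	return 0;
}

/* Runs one registration attempt; reports the refs that remain. */
static int model_try_register(bool stopped, int *refs_left)
{
	atomic_int refs = 0;
	struct model_queue q = {
		.lock = ATOMIC_FLAG_INIT,
		.stopped = stopped,
		.queue_refs = &refs,
	};
	struct model_ent *ent = model_create_ent(&q);
	int ret;

	if (!ent)
		return -ENOMEM;
	ret = model_do_register(ent);
	*refs_left = atomic_load(&refs);
	if (ret == 0)
		free(ent); /* the model has no teardown; avoid leaking here */
	return ret;
}
```

With a stopped queue the attempt fails with -ECONNABORTED and no
reference remains; otherwise the entry is registered and still holds
its reference, which is what keeps the async teardown running.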

Thanks,
Joanne

> Thanks,
> Bernd
>
> > The race is only easily reproducible with additional patches that
> > pin pages during FUSE_IO_URING_CMD_REGISTER - slows it down and then
> > xfstest's generic/001 triggers it reliably. However, I need to update
> > these pin patches for linux master.
> >
> > Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> > ---
> > Bernd Schubert (1):
> >       fuse: Move ring queues_refs decrement
> >
> > Jian Huang Li (1):
> >       fs/fuse: fix potential memory leak from fuse_uring_cancel
> >
> >  fs/fuse/dev_uring.c | 33 ++++++++++++++-------------------
> >  1 file changed, 14 insertions(+), 19 deletions(-)
> > ---
> > base-commit: 6548d364a3e850326831799d7e3ea2d7bb97ba08
> > change-id: 20251021-io-uring-fixes-cancel-mem-leak-820642677c37
> >
> > Best regards,
>


* Re: Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-09 23:09   ` Joanne Koong
@ 2026-04-10  7:21     ` Horst Birthelmer
  2026-04-10 17:09       ` Joanne Koong
  2026-04-10 11:26     ` Bernd Schubert
  1 sibling, 1 reply; 14+ messages in thread
From: Horst Birthelmer @ 2026-04-10  7:21 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Bernd Schubert, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer

On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
> On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> >
> >
> >
> > On 10/21/25 23:33, Bernd Schubert wrote:
> > > Do not merge yet, the current series has not been tested yet.
> >
> > I'm glad that I was hesitating to apply it. The DDN branch had it
> > for ages and this patch actually introduced a possible fc->num_waiting
> > issue, because fc->uring->queue_refs might go down to 0 through
> > fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> > flush the queues without another addition.
> >
> 
> Hi Bernd and Jian,
> 
> For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> from fuse_uring_cancel" email was never delivered to my inbox, so I am
> just going to write my reply to that patch here instead, hope that's
> ok.
> 
> Just to summarize, the race is that during unmount, fuse_abort() ->
> fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> fuse_uring_entry_teardown() gets run but there may still be sqes that
> are being registered, which results in new ents that are created (and
> leaked) after the teardown logic has finished and the queues are
> stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> never gets scheduled because at the time of teardown, queue->refs is 0
> as those sqes have not fully created the ents and grabbed refs yet.
> fuse_uring_destruct() runs during unmount, but this doesn't clean up
> the created ents because those registered ents got put on the
> ent_in_userspace list which fuse_uring_destruct() doesn't go through
> to free, resulting in those ents being leaked.
> 
> The root cause of the race is that ents are being registered even when
> the queue is already stopped/dead. I think if we at registration time
> check the queue state before calling fuse_uring_prepare_cancel(), we
> eliminate the race altogether. If we see that the abort path has
> already triggered (eg queue->stopped == true), we manually free the
> ent and return an error instead of adding it to a list, eg

In my case (Bernd mentioned that I was investigating a hang during umount)
there were a lot of requests created during teardown, so what happened
was very similar, but for exactly the opposite reason.
In fuse_uring_abort(), queue_refs was already 0 due to an optimization
where the ring teardown ran before fuse_abort_conn().
Thus the queue->stopped was never set.

How do we make sure that fuse_uring_teardown_entries() has not been
called by fuse_uring_async_stop_queues()?

Maybe I'm missing something?

My fix was to remove the check for queue_refs > 0 in fuse_uring_abort()
and make sure that even if the teardown was complete nothing bad happens
in fuse_uring_abort_end_requests() and fuse_uring_stop_queues().

Thanks,
Horst



* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-09 23:09   ` Joanne Koong
  2026-04-10  7:21     ` Horst Birthelmer
@ 2026-04-10 11:26     ` Bernd Schubert
  1 sibling, 0 replies; 14+ messages in thread
From: Bernd Schubert @ 2026-04-10 11:26 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Bernd Schubert, Miklos Szeredi, linux-fsdevel, Jian Huang Li,
	stable, Horst Birthelmer

Hi Joanne,

On 4/10/26 01:09, Joanne Koong wrote:
> On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>>
>>
>>
>> On 10/21/25 23:33, Bernd Schubert wrote:
>>> Do not merge yet, the current series has not been tested yet.
>>
>> I'm glad that I was hesitating to apply it. The DDN branch had it
>> for ages and this patch actually introduced a possible fc->num_waiting
>> issue, because fc->uring->queue_refs might go down to 0 through
>> fuse_uring_cancel() and then fuse_uring_abort() would never stop and
>> flush the queues without another addition.
>>
> 
> Hi Bernd and Jian,
> 
> For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> from fuse_uring_cancel" email was never delivered to my inbox, so I am
> just going to write my reply to that patch here instead, hope that's
> ok.
> 
> Just to summarize, the race is that during unmount, fuse_abort() ->
> fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> fuse_uring_entry_teardown() gets run but there may still be sqes that
> are being registered, which results in new ents that are created (and
> leaked) after the teardown logic has finished and the queues are
> stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> never gets scheduled because at the time of teardown, queue->refs is 0
> as those sqes have not fully created the ents and grabbed refs yet.
> fuse_uring_destruct() runs during unmount, but this doesn't clean up
> the created ents because those registered ents got put on the
> ent_in_userspace list which fuse_uring_destruct() doesn't go through
> to free, resulting in those ents being leaked.
> 
> The root cause of the race is that ents are being registered even when
> the queue is already stopped/dead. I think if we at registration time
> check the queue state before calling fuse_uring_prepare_cancel(), we
> eliminate the race altogether. If we see that the abort path has
> already triggered (eg queue->stopped == true), we manually free the
> ent and return an error instead of adding it to a list, eg
> 
> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> index d88a0c05434a..351c19150aae 100644
> --- a/fs/fuse/dev_uring.c
> +++ b/fs/fuse/dev_uring.c
> @@ -969,7 +969,7 @@ static bool is_ring_ready(struct fuse_ring *ring, int current_qid)
>  /*
>   * fuse_uring_req_fetch command handling
>   */
> -static void fuse_uring_do_register(struct fuse_ring_ent *ent,
> +static int fuse_uring_do_register(struct fuse_ring_ent *ent,
>                                    struct io_uring_cmd *cmd,
>                                    unsigned int issue_flags)
>  {
> @@ -978,6 +978,16 @@ static void fuse_uring_do_register(struct fuse_ring_ent *ent,
>         struct fuse_conn *fc = ring->fc;
>         struct fuse_iqueue *fiq = &fc->iq;
> 
> +       spin_lock(&queue->lock);
> +       /* abort teardown path is running or has run */
> +       if (queue->stopped) {
> +               spin_unlock(&queue->lock);
> +               atomic_dec(&ring->queue_refs);
> +               kfree(ent);
> +               return -ECONNABORTED;
> +       }
> +       spin_unlock(&queue->lock);
> +
>         fuse_uring_prepare_cancel(cmd, issue_flags, ent);
> 
>         spin_lock(&queue->lock);
> @@ -994,6 +1004,7 @@ static void fuse_uring_do_register(struct fuse_ring_ent *ent,
>                         wake_up_all(&fc->blocked_waitq);
>                 }
>         }
> +       return 0;
>  }
> 
>  /*
> @@ -1109,9 +1120,7 @@ static int fuse_uring_register(struct io_uring_cmd *cmd,
>         if (IS_ERR(ent))
>                 return PTR_ERR(ent);
> 
> -       fuse_uring_do_register(ent, cmd, issue_flags);
> -
> -       return 0;
> +       return fuse_uring_do_register(ent, cmd, issue_flags);
>  }
> 
> There's the scenario where the abort path's "queue->stopped = true"
> gets set right between when we drop the queue lock and before we call
> fuse_uring_prepare_cancel(), but the fuse_uring_create_ring_ent()
> logic that was called before fuse_uring_do_register() has already
> grabbed the ref on ring->queue_refs, which means in the abort path,
> the async teardown (fuse_uring_async_stop_queues()) work is guaranteed
> to run and clean up / free the entry.


I don't think your changes are needed, it should be handled by
IO_URING_F_CANCEL -> fuse_uring_cancel(). That is exactly where the
initial leak was - these commands came after abort and
fuse_uring_cancel() in linux upstream then puts the entries onto the
&queue->ent_in_userspace list.
The issue in master is that fuse_uring_stop_queues() might have run
already - entries then get leaked and fuse_uring_destruct() later might
give a warning. That part can be reproduced with xfstests: before it
starts any of the tests, it does some funny start/stop actions.

The initial *simple* patch was to either add a new list or to just remove
the warning and to also handle either that new list or the
queue->ent_in_userspace list in fuse_uring_destruct(). The comment
explaining why it is needed was much longer than the rest of the patch.
The hard part in the long term would be to transfer the knowledge about
that requirement.

You then asked to handle the release directly in fuse_uring_cancel()
without another list
https://lore.kernel.org/r/CAJnrk1YaRRKHA-jVPAKZYpydaKcdswLG0XO7pUQZZ4-pTewkHQ@mail.gmail.com

Yes, that is possible and this is what the next patch version does.
However, given that fuse_uring_cancel() runs outside of all the fuse
locks, it is racy and I therefore asked in the introduction patch not to
merge it yet.

https://lore.kernel.org/all/20251021-io-uring-fixes-cancel-mem-leak-v1-0-26b78b2c973c@ddn.com/


Turns out my suspicion was right ;)

Queue references might go to 0 when nothing is in flight, and then
fuse_uring_abort(), which _might_ race and come a little later,
might not do anything.

        if (atomic_read(&ring->queue_refs) > 0) {
                fuse_uring_abort_end_requests(ring);
                fuse_uring_stop_queues(ring);
        }

As Horst figured out, removing this check for queue_refs avoids the
issue. I'm rather sure that the check was needed during development and
avoided some null pointer derefs, as that is what I remember. But I
don't think it is needed anymore.
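
The effect of the queue_refs check can be illustrated with a small
userspace model. It is a sketch under assumptions, not the kernel code:
queued_reqs stands in for requests that abort would end (and that
fc->num_waiting accounts for), and all model_* names are invented for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_ring {
	atomic_int queue_refs; /* ring->queue_refs */
	bool stopped;          /* whether the queues were stopped */
	int queued_reqs;       /* requests still waiting to be ended */
};

/* Cancel of the last entry drops the final queue reference. */
static void model_cancel(struct model_ring *r)
{
	int refs = atomic_fetch_sub(&r->queue_refs, 1) - 1;

	assert(refs >= 0);
}

/* Stands in for abort_end_requests() plus stop_queues(). */
static void model_stop_and_flush(struct model_ring *r)
{
	r->queued_reqs = 0;
	r->stopped = true;
}

/* With check_refs set, abort is skipped once queue_refs reached 0. */
static void model_abort(struct model_ring *r, bool check_refs)
{
	if (check_refs && atomic_load(&r->queue_refs) <= 0)
		return;
	model_stop_and_flush(r);
}

/* Cancel runs first, abort comes later; returns requests left over. */
static int model_reqs_left_after_abort(bool check_refs)
{
	struct model_ring r = { .stopped = false, .queued_reqs = 3 };

	atomic_init(&r.queue_refs, 1);
	model_cancel(&r);
	model_abort(&r, check_refs);
	return r.queued_reqs;
}
```

With the check in place the three queued requests are never ended;
without it, abort always stops the queues and ends them.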


Thanks,
Bernd


* Re: Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10  7:21     ` Horst Birthelmer
@ 2026-04-10 17:09       ` Joanne Koong
  2026-04-10 17:18         ` Bernd Schubert
  2026-04-10 18:55         ` Re: " Horst Birthelmer
  0 siblings, 2 replies; 14+ messages in thread
From: Joanne Koong @ 2026-04-10 17:09 UTC (permalink / raw)
  To: Horst Birthelmer
  Cc: Bernd Schubert, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer

On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
> > On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> > >
> > >
> > >
> > > On 10/21/25 23:33, Bernd Schubert wrote:
> > > > Do not merge yet, the current series has not been tested yet.
> > >
> > > I'm glad that I was hesitating to apply it. The DDN branch had it
> > > for ages and this patch actually introduced a possible fc->num_waiting
> > > issue, because fc->uring->queue_refs might go down to 0 through
> > > fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> > > flush the queues without another addition.
> > >
> >
> > Hi Bernd and Jian,
> >
> > For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> > from fuse_uring_cancel" email was never delivered to my inbox, so I am
> > just going to write my reply to that patch here instead, hope that's
> > ok.
> >
> > Just to summarize, the race is that during unmount, fuse_abort() ->
> > fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> > fuse_uring_entry_teardown() gets run but there may still be sqes that
> > are being registered, which results in new ents that are created (and
> > leaked) after the teardown logic has finished and the queues are
> > stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> > never gets scheduled because at the time of teardown, queue->refs is 0
> > as those sqes have not fully created the ents and grabbed refs yet.
> > fuse_uring_destruct() runs during unmount, but this doesn't clean up
> > the created ents because those registered ents got put on the
> > ent_in_userspace list which fuse_uring_destruct() doesn't go through
> > to free, resulting in those ents being leaked.
> >
> > The root cause of the race is that ents are being registered even when
> > the queue is already stopped/dead. I think if we at registration time
> > check the queue state before calling fuse_uring_prepare_cancel(), we
> > eliminate the race altogether. If we see that the abort path has
> > already triggered (eg queue->stopped == true), we manually free the
> > ent and return an error instead of adding it to a list, eg
>
> In my case (Bernd mentioned that I was investigating a hang during umount)
> there were a lot of requests created during teardown, so what happened
> was very similar, but for exactly the opposite reason.
> In fuse_uring_abort() queue_refs was already 0 due to an optimization
> where the ring teardown ran before fuse_abort_conn().

Hi Horst,

Just to clarify, is this with running locally patched changes on your
ddn kernel? In the upstream code I'm seeing that teardown is only
called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
-> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
it's possible for teardown to run before fuse_abort_conn(). Is there
something I'm missing?

> Thus the queue->stopped was never set.
>
> How do we make sure that fuse_uring_teardown_entries() has not been
> called by fuse_uring_async_stop_queues()?

If I'm understanding you correctly, your question is what
ensures the teardown logic in fuse_uring_async_stop_queues() hasn't
already executed by the time we drop the queue lock after checking if
the queue has been stopped? In fuse_uring_async_stop_queues(), the
async teardown work gets continuously rescheduled so long as
queue_refs > 0. The ent holds a reference on the queue, so when the
queue lock is dropped that async teardown work will be continuously
running until it cleans up that (and any other) ents.
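
The rescheduling behaviour can be sketched in userspace as a loop. This
is only an illustration under assumptions: the kernel reschedules
delayed work rather than looping, ready_at_pass is a made-up stand-in
for "the entry is in a state where it can be torn down", and all
model_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_ent {
	int ready_at_pass; /* first pass at which teardown can release it */
	bool released;
};

/*
 * Repeats teardown passes while queue_refs > 0. Returns the number of
 * passes that ran, or -1 if max_passes was not enough.
 */
static int model_async_stop(struct model_ent *ents, size_t n,
			    atomic_int *queue_refs, int max_passes)
{
	for (int pass = 0; pass < max_passes; pass++) {
		for (size_t i = 0; i < n; i++) {
			if (ents[i].released || ents[i].ready_at_pass > pass)
				continue;
			ents[i].released = true;
			atomic_fetch_sub(queue_refs, 1);
		}
		/* the kernel would reschedule the work if refs remain */
		if (atomic_load(queue_refs) == 0)
			return pass + 1;
	}
	return -1;
}

/* Two entries are idle at once; one only becomes idle at pass 2. */
static int model_demo_passes(void)
{
	struct model_ent ents[] = {
		{ .ready_at_pass = 0 },
		{ .ready_at_pass = 0 },
		{ .ready_at_pass = 2 },
	};
	atomic_int refs = 3;
	int passes = model_async_stop(ents, 3, &refs, 10);

	assert(atomic_load(&refs) == 0 || passes == -1);
	return passes;
}
```

In model_demo_passes() the late entry keeps the loop alive for three
passes, which mirrors how a held reference keeps the teardown work
being rescheduled until that entry is cleaned up.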

>
> Maybe I'm missing something?
>
> My fix was to remove the check for queue_refs > 0 in fuse_uring_abort()
> and make sure that even if the teardown was complete nothing bad happens
> in fuse_uring_abort_end_requests() and fuse_uring_stop_queues().

I'll look more at this path today.

Thanks,
Joanne
>
> Thanks,
> Horst
>


* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10 17:09       ` Joanne Koong
@ 2026-04-10 17:18         ` Bernd Schubert
  2026-04-10 17:28           ` Joanne Koong
  2026-04-10 18:55         ` Re: " Horst Birthelmer
  1 sibling, 1 reply; 14+ messages in thread
From: Bernd Schubert @ 2026-04-10 17:18 UTC (permalink / raw)
  To: Joanne Koong, Horst Birthelmer
  Cc: Bernd Schubert, Miklos Szeredi, linux-fsdevel, Jian Huang Li,
	stable, Horst Birthelmer



On 4/10/26 19:09, Joanne Koong wrote:
> On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>
>> On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
>>> On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>>>>
>>>>
>>>>
>>>> On 10/21/25 23:33, Bernd Schubert wrote:
>>>>> Do not merge yet, the current series has not been tested yet.
>>>>
>>>> I'm glad that I was hesitating to apply it. The DDN branch had it
>>>> for ages and this patch actually introduced a possible fc->num_waiting
>>>> issue, because fc->uring->queue_refs might go down to 0 through
>>>> fuse_uring_cancel() and then fuse_uring_abort() would never stop and
>>>> flush the queues without another addition.
>>>>
>>>
>>> Hi Bernd and Jian,
>>>
>>> For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
>>> from fuse_uring_cancel" email was never delivered to my inbox, so I am
>>> just going to write my reply to that patch here instead, hope that's
>>> ok.
>>>
>>> Just to summarize, the race is that during unmount, fuse_abort() ->
>>> fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
>>> fuse_uring_entry_teardown() gets run but there may still be sqes that
>>> are being registered, which results in new ents that are created (and
>>> leaked) after the teardown logic has finished and the queues are
>>> stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
>>> never gets scheduled because at the time of teardown, queue->refs is 0
>>> as those sqes have not fully created the ents and grabbed refs yet.
>>> fuse_uring_destruct() runs during unmount, but this doesn't clean up
>>> the created ents because those registered ents got put on the
>>> ent_in_userspace list which fuse_uring_destruct() doesn't go through
>>> to free, resulting in those ents being leaked.
>>>
>>> The root cause of the race is that ents are being registered even when
>>> the queue is already stopped/dead. I think if we at registration time
>>> check the queue state before calling fuse_uring_prepare_cancel(), we
>>> eliminate the race altogether. If we see that the abort path has
>>> already triggered (eg queue->stopped == true), we manually free the
>>> ent and return an error instead of adding it to a list, eg
>>
>> In my case (Bernd mentioned that I was investigating a hang during umount)
>> there were a lot of requests created during teardown, so what happened
>> was very similar, but for exact the opposite reason.
>> In fuse_uring_abort() queue_refs was already 0 due to an optimization
>> where the ring teardown ran before fuse_abort_conn().
> 
> Hi Horst,
> 
> Just to clarify, is this with running locally patched changes on your
> ddn kernel? In the upstream code I'm seeing that teardown is only
> called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
> -> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
> it's possible for teardown to run before fuse_abort_conn(). Is there
> something I'm missing?

Please see my mail; it explains the history and shows the patch I had
posted to the list, which has not been applied yet. The DDN branches have
it applied.

Thanks,
Bernd


* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10 17:18         ` Bernd Schubert
@ 2026-04-10 17:28           ` Joanne Koong
  2026-04-10 17:32             ` Bernd Schubert
  0 siblings, 1 reply; 14+ messages in thread
From: Joanne Koong @ 2026-04-10 17:28 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Horst Birthelmer, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer

On Fri, Apr 10, 2026 at 10:18 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>
>
>
> On 4/10/26 19:09, Joanne Koong wrote:
> > On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >>
> >> On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
> >>> On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 10/21/25 23:33, Bernd Schubert wrote:
> >>>>> Do not merge yet, the current series has not been tested yet.
> >>>>
> >>>> I'm glad that that I was hesitating to apply it, the DDN branch had it
> >>>> for ages and this patch actually introduced a possible fc->num_waiting
> >>>> issue, because fc->uring->queue_refs might go down to 0 though
> >>>> fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> >>>> flush the queues without another addition.
> >>>>
> >>>
> >>> Hi Bernd and Jian,
> >>>
> >>> For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> >>> from fuse_uring_cancel" email was never delivered to my inbox, so I am
> >>> just going to write my reply to that patch here instead, hope that's
> >>> ok.
> >>>
> >>> Just to summarize, the race is that during unmount, fuse_abort() ->
> >>> fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> >>> fuse_uring_entry_teardown() gets run but there may still be sqes that
> >>> are being registered, which results in new ents that are created (and
> >>> leaked) after the teardown logic has finished and the queues are
> >>> stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> >>> never gets scheduled because at the time of teardown, queue->refs is 0
> >>> as those sqes have not fully created the ents and grabbed refs yet.
> >>> fuse_uring_destruct() runs during unmount, but this doesn't clean up
> >>> the created ents because those registered ents got put on the
> >>> ent_in_userspace list which fuse_uring_destruct() doesn't go through
> >>> to free, resulting in those ents being leaked.
> >>>
> >>> The root cause of the race is that ents are being registered even when
> >>> the queue is already stopped/dead. I think if we at registration time
> >>> check the queue state before calling fuse_uring_prepare_cancel(), we
> >>> eliminate the race altogether. If we see that the abort path has
> >>> already triggered (eg queue->stopped == true), we manually free the
> >>> ent and return an error instead of adding it to a list, eg
> >>
> >> In my case (Bernd mentioned that I was investigating a hang during umount)
> >> there were a lot of requests created during teardown, so what happened
> >> was very similar, but for exact the opposite reason.
> >> In fuse_uring_abort() queue_refs was already 0 due to an optimization
> >> where the ring teardown ran before fuse_abort_conn().
> >
> > Hi Horst,
> >
> > Just to clarify, is this with running locally patched changes on your
> > ddn kernel? In the upstream code I'm seeing that teardown is only
> > called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
> > -> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
> > it's possible for teardown to run before fuse_abort_conn(). Is there
> > something I'm missing?
>
> See my mail please it explains the history and shows the patch I had
> posted to the list and which is not applied yet. The DDN branches have
> it applied.

Hi Bernd,

Can you link to which mail you are referring to? Which patch are you
talking about?

Thanks,
Joanne

>
> Thanks,
> Bernd


* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10 17:28           ` Joanne Koong
@ 2026-04-10 17:32             ` Bernd Schubert
  2026-04-10 19:53               ` Joanne Koong
  0 siblings, 1 reply; 14+ messages in thread
From: Bernd Schubert @ 2026-04-10 17:32 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Horst Birthelmer, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer



On 4/10/26 19:28, Joanne Koong wrote:
> On Fri, Apr 10, 2026 at 10:18 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>>
>>
>>
>> On 4/10/26 19:09, Joanne Koong wrote:
>>> On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>>>
>>>> On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
>>>>> On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/21/25 23:33, Bernd Schubert wrote:
>>>>>>> Do not merge yet, the current series has not been tested yet.
>>>>>>
>>>>>> I'm glad that that I was hesitating to apply it, the DDN branch had it
>>>>>> for ages and this patch actually introduced a possible fc->num_waiting
>>>>>> issue, because fc->uring->queue_refs might go down to 0 though
>>>>>> fuse_uring_cancel() and then fuse_uring_abort() would never stop and
>>>>>> flush the queues without another addition.
>>>>>>
>>>>>
>>>>> Hi Bernd and Jian,
>>>>>
>>>>> For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
>>>>> from fuse_uring_cancel" email was never delivered to my inbox, so I am
>>>>> just going to write my reply to that patch here instead, hope that's
>>>>> ok.
>>>>>
>>>>> Just to summarize, the race is that during unmount, fuse_abort() ->
>>>>> fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
>>>>> fuse_uring_entry_teardown() gets run but there may still be sqes that
>>>>> are being registered, which results in new ents that are created (and
>>>>> leaked) after the teardown logic has finished and the queues are
>>>>> stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
>>>>> never gets scheduled because at the time of teardown, queue->refs is 0
>>>>> as those sqes have not fully created the ents and grabbed refs yet.
>>>>> fuse_uring_destruct() runs during unmount, but this doesn't clean up
>>>>> the created ents because those registered ents got put on the
>>>>> ent_in_userspace list which fuse_uring_destruct() doesn't go through
>>>>> to free, resulting in those ents being leaked.
>>>>>
>>>>> The root cause of the race is that ents are being registered even when
>>>>> the queue is already stopped/dead. I think if we at registration time
>>>>> check the queue state before calling fuse_uring_prepare_cancel(), we
>>>>> eliminate the race altogether. If we see that the abort path has
>>>>> already triggered (eg queue->stopped == true), we manually free the
>>>>> ent and return an error instead of adding it to a list, eg
>>>>
>>>> In my case (Bernd mentioned that I was investigating a hang during umount)
>>>> there were a lot of requests created during teardown, so what happened
>>>> was very similar, but for exact the opposite reason.
>>>> In fuse_uring_abort() queue_refs was already 0 due to an optimization
>>>> where the ring teardown ran before fuse_abort_conn().
>>>
>>> Hi Horst,
>>>
>>> Just to clarify, is this with running locally patched changes on your
>>> ddn kernel? In the upstream code I'm seeing that teardown is only
>>> called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
>>> -> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
>>> it's possible for teardown to run before fuse_abort_conn(). Is there
>>> something I'm missing?
>>
>> See my mail please it explains the history and shows the patch I had
>> posted to the list and which is not applied yet. The DDN branches have
>> it applied.
> 
> Hi Bernd,
> 
> Can you link to which mail you are referring to? Which patch are you
> talking about?

The mail I had sent earlier today, a few hours after Horst's. Somehow I
have the bad feeling that half of my mails are going into a spam folder.
I hope you get this one.

Here is the link to the message-id
https://lore.kernel.org/all/3eabbc7b-010f-4d4c-9145-30d69fe1aa79@bsbernd.com/


Thanks,
Bernd


* Re: Re: Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10 17:09       ` Joanne Koong
  2026-04-10 17:18         ` Bernd Schubert
@ 2026-04-10 18:55         ` Horst Birthelmer
  2026-04-10 20:09           ` Joanne Koong
  1 sibling, 1 reply; 14+ messages in thread
From: Horst Birthelmer @ 2026-04-10 18:55 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Bernd Schubert, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer

On Fri, Apr 10, 2026 at 10:09:36AM -0700, Joanne Koong wrote:
> On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >
> > On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
> > > On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> > > >
> > > >
> > > >
> > > > On 10/21/25 23:33, Bernd Schubert wrote:
> > > > > Do not merge yet, the current series has not been tested yet.
> > > >
> > > > I'm glad that that I was hesitating to apply it, the DDN branch had it
> > > > for ages and this patch actually introduced a possible fc->num_waiting
> > > > issue, because fc->uring->queue_refs might go down to 0 though
> > > > fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> > > > flush the queues without another addition.
> > > >
> > >
> > > Hi Bernd and Jian,
> > >
> > > For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> > > from fuse_uring_cancel" email was never delivered to my inbox, so I am
> > > just going to write my reply to that patch here instead, hope that's
> > > ok.
> > >
> > > Just to summarize, the race is that during unmount, fuse_abort() ->
> > > fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> > > fuse_uring_entry_teardown() gets run but there may still be sqes that
> > > are being registered, which results in new ents that are created (and
> > > leaked) after the teardown logic has finished and the queues are
> > > stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> > > never gets scheduled because at the time of teardown, queue->refs is 0
> > > as those sqes have not fully created the ents and grabbed refs yet.
> > > fuse_uring_destruct() runs during unmount, but this doesn't clean up
> > > the created ents because those registered ents got put on the
> > > ent_in_userspace list which fuse_uring_destruct() doesn't go through
> > > to free, resulting in those ents being leaked.
> > >
> > > The root cause of the race is that ents are being registered even when
> > > the queue is already stopped/dead. I think if we at registration time
> > > check the queue state before calling fuse_uring_prepare_cancel(), we
> > > eliminate the race altogether. If we see that the abort path has
> > > already triggered (eg queue->stopped == true), we manually free the
> > > ent and return an error instead of adding it to a list, eg
> >
> > In my case (Bernd mentioned that I was investigating a hang during umount)
> > there were a lot of requests created during teardown, so what happened
> > was very similar, but for exact the opposite reason.
> > In fuse_uring_abort() queue_refs was already 0 due to an optimization
> > where the ring teardown ran before fuse_abort_conn().
> 
> Hi Horst,
> 
> Just to clarify, is this with running locally patched changes on your
> ddn kernel? In the upstream code I'm seeing that teardown is only
> called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
> -> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
> it's possible for teardown to run before fuse_abort_conn(). Is there
> something I'm missing?

Yes and no ... ;-)
The original patch that started this whole discussion had a call to
the teardown of the entries, and I had that applied.
But even without it, queue_refs can still be 0
by the time fuse_abort_conn() is called.

> 
> > Thus the queue->stopped was never set.
> >
> > How do we make sure that fuse_uring_teardown_entries() has not been
> > called by fuse_uring_async_stop_queues()?
> 
> If i'm understanding your question correctly, your question is what
> ensures the teardown logic in fuse_uring_async_stop_queues() hasn't
> already executed by the time we drop the queue lock after checking if
> the queue has been stopped? In fuse_uring_async_stop_queues(), the
> async teardown work gets continuously rescheduled so long as
> queue_refs > 0. The ent holds a reference on the queue, so when the
> queue lock is dropped that async teardown work will be continuously
> running until it cleans up that (and any other) ents.
> 

You understand correctly.
If fuse_uring_async_stop_queues() runs, there is still a window where
we have queue_refs == 0. If fuse_abort_conn() runs in that window,
we never actually stop the queues, and we can accept requests that
will never be processed.
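
To make the window concrete, here is a minimal userspace sketch of the
guard. It is not kernel code: struct ring_model and both functions are
hypothetical names invented for illustration, and all locking is left
out. It only models an abort path that acts when queue_refs > 0, next
to a variant without that check (the fix mentioned further down in this
mail).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's ring structure. */
struct ring_model {
	int queue_refs;	/* references held by registered entries */
	bool stopped;	/* set once the queues have been stopped */
};

/* Models an abort path that only stops queues while refs remain. */
static inline bool model_abort_guarded(struct ring_model *ring)
{
	assert(ring->queue_refs >= 0);
	if (ring->queue_refs > 0) {
		ring->stopped = true;
		return true;
	}
	/* Nothing was stopped, so later requests could still be queued. */
	return false;
}

/* Models abort without the guard: stopping is unconditional. */
static inline bool model_abort_unguarded(struct ring_model *ring)
{
	assert(ring->queue_refs >= 0);
	ring->stopped = true;	/* safe to repeat, setting it twice is harmless */
	return true;
}
```

With queue_refs == 0, the guarded variant returns false and leaves
stopped unset, which is the state described above. The unguarded
variant sets stopped regardless of the reference count.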

I have never seen this happen without the patch mentioned above,
but with that 'optimization' it happens regularly when you are able to
kill the fuse server and the application using the file system more or
less at the same time, e.g. by an OOM event when the kernel tries to
free resources.

To me it looks like nothing will stop this from happening, though;
maybe I'm just not familiar enough with the uring code ...

> >
> > Maybe I'm missing something?
> >
> > My fix was to remove the check for queue_refs > 0 in fuse_uring_abort()
> > and make sure that even if the teardown was complete nothing bad happens
> > in fuse_uring_abort_end_requests() and fuse_uring_stop_queues().
> 
> I'll look more at this path today.
> 
> Thanks,
> Joanne

Thanks,
Horst


* Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10 17:32             ` Bernd Schubert
@ 2026-04-10 19:53               ` Joanne Koong
  0 siblings, 0 replies; 14+ messages in thread
From: Joanne Koong @ 2026-04-10 19:53 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Horst Birthelmer, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer

On Fri, Apr 10, 2026 at 10:32 AM Bernd Schubert <bernd@bsbernd.com> wrote:
>
>
>
> On 4/10/26 19:28, Joanne Koong wrote:
> > On Fri, Apr 10, 2026 at 10:18 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> >>
> >>
> >>
> >> On 4/10/26 19:09, Joanne Koong wrote:
> >>> On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> >>>>
> >>>> On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
> >>>>> On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 10/21/25 23:33, Bernd Schubert wrote:
> >>>>>>> Do not merge yet, the current series has not been tested yet.
> >>>>>>
> >>>>>> I'm glad that that I was hesitating to apply it, the DDN branch had it
> >>>>>> for ages and this patch actually introduced a possible fc->num_waiting
> >>>>>> issue, because fc->uring->queue_refs might go down to 0 though
> >>>>>> fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> >>>>>> flush the queues without another addition.
> >>>>>>
> >>>>>
> >>>>> Hi Bernd and Jian,
> >>>>>
> >>>>> For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> >>>>> from fuse_uring_cancel" email was never delivered to my inbox, so I am
> >>>>> just going to write my reply to that patch here instead, hope that's
> >>>>> ok.
> >>>>>
> >>>>> Just to summarize, the race is that during unmount, fuse_abort() ->
> >>>>> fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> >>>>> fuse_uring_entry_teardown() gets run but there may still be sqes that
> >>>>> are being registered, which results in new ents that are created (and
> >>>>> leaked) after the teardown logic has finished and the queues are
> >>>>> stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> >>>>> never gets scheduled because at the time of teardown, queue->refs is 0
> >>>>> as those sqes have not fully created the ents and grabbed refs yet.
> >>>>> fuse_uring_destruct() runs during unmount, but this doesn't clean up
> >>>>> the created ents because those registered ents got put on the
> >>>>> ent_in_userspace list which fuse_uring_destruct() doesn't go through
> >>>>> to free, resulting in those ents being leaked.
> >>>>>
> >>>>> The root cause of the race is that ents are being registered even when
> >>>>> the queue is already stopped/dead. I think if we at registration time
> >>>>> check the queue state before calling fuse_uring_prepare_cancel(), we
> >>>>> eliminate the race altogether. If we see that the abort path has
> >>>>> already triggered (eg queue->stopped == true), we manually free the
> >>>>> ent and return an error instead of adding it to a list, eg
> >>>>
> >>>> In my case (Bernd mentioned that I was investigating a hang during umount)
> >>>> there were a lot of requests created during teardown, so what happened
> >>>> was very similar, but for exact the opposite reason.
> >>>> In fuse_uring_abort() queue_refs was already 0 due to an optimization
> >>>> where the ring teardown ran before fuse_abort_conn().
> >>>
> >>> Hi Horst,
> >>>
> >>> Just to clarify, is this with running locally patched changes on your
> >>> ddn kernel? In the upstream code I'm seeing that teardown is only
> >>> called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
> >>> -> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
> >>> it's possible for teardown to run before fuse_abort_conn(). Is there
> >>> something I'm missing?
> >>
> >> See my mail please it explains the history and shows the patch I had
> >> posted to the list and which is not applied yet. The DDN branches have
> >> it applied.
> >
> > Hi Bernd,
> >
> > Can you link to which mail you are referring to? Which patch are you
> > talking about?
>
> The mail I had sent earlier today, a few hours after Horsts. Somehow I
> have the bad feeling that half of my mails are going into a spam folder.
> I hope you get this one.
>
> Here is the link to the message-id
> https://lore.kernel.org/all/3eabbc7b-010f-4d4c-9145-30d69fe1aa79@bsbernd.com/

Thanks for the link! To summarize, what Horst was saying in this
paragraph is that with this patchset applied, there is a
queue_refs == 0 problem: queue refs are now dropped in the
fuse_uring_cancel() path, so abort, which checks queue_refs > 0, is
never triggered. That is the same situation as in [1].

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/4b5a8040-b62c-4d75-a474-70d0b4759461@bsbernd.com/

>
>
> Thanks,
> Bernd


* Re: Re: Re: [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown
  2026-04-10 18:55         ` Re: " Horst Birthelmer
@ 2026-04-10 20:09           ` Joanne Koong
  0 siblings, 0 replies; 14+ messages in thread
From: Joanne Koong @ 2026-04-10 20:09 UTC (permalink / raw)
  To: Horst Birthelmer
  Cc: Bernd Schubert, Bernd Schubert, Miklos Szeredi, linux-fsdevel,
	Jian Huang Li, stable, Horst Birthelmer

On Fri, Apr 10, 2026 at 11:55 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Fri, Apr 10, 2026 at 10:09:36AM -0700, Joanne Koong wrote:
> > On Fri, Apr 10, 2026 at 12:21 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> > >
> > > On Thu, Apr 09, 2026 at 04:09:53PM -0700, Joanne Koong wrote:
> > > > On Thu, Apr 9, 2026 at 4:02 AM Bernd Schubert <bernd@bsbernd.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 10/21/25 23:33, Bernd Schubert wrote:
> > > > > > Do not merge yet, the current series has not been tested yet.
> > > > >
> > > > > I'm glad that that I was hesitating to apply it, the DDN branch had it
> > > > > for ages and this patch actually introduced a possible fc->num_waiting
> > > > > issue, because fc->uring->queue_refs might go down to 0 though
> > > > > fuse_uring_cancel() and then fuse_uring_abort() would never stop and
> > > > > flush the queues without another addition.
> > > > >
> > > >
> > > > Hi Bernd and Jian,
> > > >
> > > > For some reason the "[PATCH 2/2] fs/fuse: fix potential memory leak
> > > > from fuse_uring_cancel" email was never delivered to my inbox, so I am
> > > > just going to write my reply to that patch here instead, hope that's
> > > > ok.
> > > >
> > > > Just to summarize, the race is that during unmount, fuse_abort() ->
> > > > fuse_uring_abort() -> ... -> fuse_uring_teardown_entries() -> ... ->
> > > > fuse_uring_entry_teardown() gets run but there may still be sqes that
> > > > are being registered, which results in new ents that are created (and
> > > > leaked) after the teardown logic has finished and the queues are
> > > > stopped/dead. The async teardown work (fuse_uring_async_stop_queues())
> > > > never gets scheduled because at the time of teardown, queue->refs is 0
> > > > as those sqes have not fully created the ents and grabbed refs yet.
> > > > fuse_uring_destruct() runs during unmount, but this doesn't clean up
> > > > the created ents because those registered ents got put on the
> > > > ent_in_userspace list which fuse_uring_destruct() doesn't go through
> > > > to free, resulting in those ents being leaked.
> > > >
> > > > The root cause of the race is that ents are being registered even when
> > > > the queue is already stopped/dead. I think if we at registration time
> > > > check the queue state before calling fuse_uring_prepare_cancel(), we
> > > > eliminate the race altogether. If we see that the abort path has
> > > > already triggered (eg queue->stopped == true), we manually free the
> > > > ent and return an error instead of adding it to a list, eg
> > >
> > > In my case (Bernd mentioned that I was investigating a hang during umount)
> > > there were a lot of requests created during teardown, so what happened
> > > was very similar, but for exact the opposite reason.
> > > In fuse_uring_abort() queue_refs was already 0 due to an optimization
> > > where the ring teardown ran before fuse_abort_conn().
> >
> > Hi Horst,
> >
> > Just to clarify, is this with running locally patched changes on your
> > ddn kernel? In the upstream code I'm seeing that teardown is only
> > called by the abort path, eg fuse_abort_conn() -> fuse_uring_abort()
> > -> fuse_uring_stop_queues() -> teardown logic, so I'm not seeing how
> > it's possible for teardown to run before fuse_abort_conn(). Is there
> > something I'm missing?
>
> Yes and no ... ;-)
> The original patch this whole discussion was started by had a call to
> the teardown of the entries and I had that applied.
> But even without that the problem can still occur that queue_refs is 0
> by the time fuse_abort_conn() is called.

Gotcha, thanks for clarifying.

Without the original patch, can queue_refs still be 0 by the time
fuse_abort_conn() is called? The only case where I see that is when
the sqes are in the middle of being registered but haven't grabbed the
queue ref yet, and then the abort logic runs (I am going to write more
about this race in a reply to Bernd's other message in this thread).
Other than that, I don't see how we run into this case without the
original patch, since teardown -> queue ref decrement only happens in
fuse_uring_stop_list_entries(), which is only triggered on the abort
path. Are you talking about a subsequent fuse_abort_conn() call (the
one called from fuse_dev_release())?
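
For reference, the registration-time check proposed earlier in this
thread could look roughly like the userspace sketch below. This is only
a model under assumptions: struct queue_model, struct ent_model and
both functions are hypothetical names, the -ENOTCONN return value is an
illustrative choice, and the queue lock that the real check would need
is only noted in a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a ring entry and its queue. */
struct ent_model {
	struct ent_model *next;
};

struct queue_model {
	bool stopped;		/* set by the abort/teardown path */
	int refs;		/* one reference per registered entry */
	struct ent_model *ents;	/* registered entries */
};

/* Registers an entry unless the queue has already been stopped. */
static inline int model_register_ent(struct queue_model *q)
{
	struct ent_model *ent = calloc(1, sizeof(*ent));

	if (!ent)
		return -ENOMEM;

	/* In the kernel this check would have to sit under the queue lock. */
	if (q->stopped) {
		free(ent);		/* free instead of adding it to a list */
		return -ENOTCONN;	/* error code is an assumption */
	}

	ent->next = q->ents;
	q->ents = ent;
	q->refs++;
	return 0;
}

/* Marks the queue stopped, then frees every entry and drops its ref. */
static inline void model_teardown(struct queue_model *q)
{
	q->stopped = true;
	while (q->ents) {
		struct ent_model *ent = q->ents;

		q->ents = ent->next;
		free(ent);
		q->refs--;
	}
	assert(q->refs == 0);
}
```

In this model, a registration that arrives after model_teardown() is
rejected and frees its entry, so nothing is left behind on a list that
teardown has already walked.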


Thanks,
Joanne
>
> >
> > > Thus the queue->stopped was never set.
> > >
> > > How do we make sure that fuse_uring_teardown_entries() has not been
> > > called by fuse_uring_async_stop_queues()?
> >
> > If i'm understanding your question correctly, your question is what
> > ensures the teardown logic in fuse_uring_async_stop_queues() hasn't
> > already executed by the time we drop the queue lock after checking if
> > the queue has been stopped? In fuse_uring_async_stop_queues(), the
> > async teardown work gets continuously rescheduled so long as
> > queue_refs > 0. The ent holds a reference on the queue, so when the
> > queue lock is dropped that async teardown work will be continuously
> > running until it cleans up that (and any other) ents.
> >
>
> You understand correctly.
> If the fuse_async_stop_queues() runs there is still a window where
> we have queue_refs == 0. If in that window fuse_abort_conn() runs
> we never actually stop the queues and we can accept requests which
> will never be processed.
>
> I have never seen this happen without the patch mentioned above,
> but with that 'optimization' it happens regularly when you are able to
> kill the fuse server and the application using the file system more or
> less at the same time e.g. by an OOM event, when the kernel tries to
> free resources.
>
> To me this looks like nothing will stop this from happening, though,
> but maybe I'm just not familiar enough with the uring code ...
>
> > >
> > > Maybe I'm missing something?
> > >
> > > My fix was to remove the check for queue_refs > 0 in fuse_uring_abort()
> > > and make sure that even if the teardown was complete nothing bad happens
> > > in fuse_uring_abort_end_requests() and fuse_uring_stop_queues().
> >
> > I'll look more at this path today.
> >
> > Thanks,
> > Joanne
>
> Thanks,
> Horst


Thread overview: 14+ messages
2025-10-21 21:33 [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
2025-10-21 21:33 ` [PATCH 1/2] fuse: Move ring queues_refs decrement Bernd Schubert
2025-10-21 21:33 ` [PATCH 2/2] fs/fuse: fix potential memory leak from fuse_uring_cancel Bernd Schubert
2026-04-09 11:02 ` [PATCH 0/2] fuse: Fix possible memleak at startup with immediate teardown Bernd Schubert
2026-04-09 23:09   ` Joanne Koong
2026-04-10  7:21     ` Horst Birthelmer
2026-04-10 17:09       ` Joanne Koong
2026-04-10 17:18         ` Bernd Schubert
2026-04-10 17:28           ` Joanne Koong
2026-04-10 17:32             ` Bernd Schubert
2026-04-10 19:53               ` Joanne Koong
2026-04-10 18:55         ` Re: " Horst Birthelmer
2026-04-10 20:09           ` Joanne Koong
2026-04-10 11:26     ` Bernd Schubert
