* [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
@ 2025-08-01 16:09 Brian Song
2025-08-03 23:33 ` Brian Song
0 siblings, 1 reply; 7+ messages in thread
From: Brian Song @ 2025-08-01 16:09 UTC (permalink / raw)
To: bschubert, qemu-block; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel
Hi Bernd,
We are currently working on implementing termination support for
fuse-over-io_uring in QEMU, and right now we are focusing on how to
clean up in-flight SQEs properly. Our main question is about how well
the kernel supports robust cancellation for these fuse-over-io_uring
SQEs. Does it actually implement cancellation beyond destroying the
io_uring queue?
In QEMU FUSE export, we need a way to quickly and cleanly detach from
the event loop and cancel any pending SQEs when an export is no longer
in use. Ideally, we want to avoid the more drastic measure of having to
close the entire /dev/fuse fd just to gracefully terminate outstanding
operations.
We are not sure if there's an existing code path that supports async
cancel for these in-flight SQEs in the fuse-over-io_uring setup, or if
additional callbacks might be needed to fully integrate with the
kernel's async cancel mechanism. We also realized libfuse manages
shutdowns differently, typically by signaling a thread via eventfd
rather than relying on async cancel.
Would love to hear your thoughts or suggestions on this!
Thanks,
Brian
* Re: [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
2025-08-01 16:09 [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring Brian Song
@ 2025-08-03 23:33 ` Brian Song
2025-08-04 11:33 ` Bernd Schubert
0 siblings, 1 reply; 7+ messages in thread
From: Brian Song @ 2025-08-03 23:33 UTC (permalink / raw)
To: bschubert, qemu-block; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel
On 2025-08-01 12:09 p.m., Brian Song wrote:
> Hi Bernd,
>
> We are currently working on implementing termination support for fuse-
> over-io_uring in QEMU, and right now we are focusing on how to clean up
> in-flight SQEs properly. Our main question is about how well the kernel
> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
> actually implement cancellation beyond destroying the io_uring queue?
>
> In QEMU FUSE export, we need a way to quickly and cleanly detach from
> the event loop and cancel any pending SQEs when an export is no longer
> in use. Ideally, we want to avoid the more drastic measure of having to
> close the entire /dev/fuse fd just to gracefully terminate outstanding
> operations.
>
> We are not sure if there's an existing code path that supports async
> cancel for these in-flight SQEs in the fuse-over-io_uring setup, or if
> additional callbacks might be needed to fully integrate with the
> kernel's async cancel mechanism. We also realized libfuse manages
> shutdowns differently, typically by signaling a thread via eventfd
> rather than relying on async cancel.
>
> Would love to hear your thoughts or suggestions on this!
>
> Thanks,
> Brian
I looked into the kernel codebase and came up with some initial ideas,
which might not be entirely accurate:
The IORING_OP_ASYNC_CANCEL operation can only cancel io_uring ring
resources and a limited set of request types. It does not clean up
resources related to fuse-over-io_uring, such as in-use entries.
IORING_OP_ASYNC_CANCEL
  -> submit/enter
  -> io_uring/opdef.c:: .issue = io_async_cancel,
  -> __io_async_cancel
  -> io_try_cancel ==> can only cancel a few request types
Currently, full cleanup of both io_uring and FUSE data structures for
fuse-over-io_uring only happens in two cases (since we mark these SQEs
cancelable on every commit_and_fetch, as mentioned below):
1. When the FUSE daemon exits (exit syscall)
2. During execve, which triggers the kernel path:
io_uring_files_cancel =>
io_uring_try_cancel_uring_cmd =>
file->f_op->uring_cmd(cmd, IO_URING_F_CANCEL | IO_URING_F_COMPLETE_DEFER)
Below is a state diagram (mermaid graph) of a fuse_uring entry inside
the kernel:
graph TD
    A["Userspace daemon"] --> B["FUSE_IO_URING_CMD_REGISTER<br/>Register buffer"]
    B --> C["Create fuse_ring_ent"]
    C --> D["State: FRRS_AVAILABLE<br/>Added to ent_avail_queue"]

    E["FUSE filesystem operation"] --> F["Generate FUSE request"]
    F --> G["fuse_uring_queue_fuse_req()"]
    G --> H{"Check ent_avail_queue"}

    H -->|Entry available| I["Take entry from queue<br/>Assign to FUSE request"]
    H -->|No entry available| J["Request goes to fuse_req_queue and waits"]

    I --> K["fuse_uring_dispatch_ent()"]
    K --> L["State: FRRS_USERSPACE<br/>Move to ent_in_userspace"]
    L --> M["Notify userspace to process"]

    N["Process exit / daemon termination"] --> O["io_uring_try_cancel_uring_cmd()<br/>NOTE: since we marked the entry IORING_URING_CMD_CANCELABLE in the previous fuse_uring_cmd, try_cancel_uring_cmd calls fuse_uring_cmd to 'delete' it"]
    O --> P["fuse_uring_cancel()"]
    P --> Q{"Is entry state AVAILABLE?"}

    Q -->|Yes| R["Equivalent to 'delete': directly change to USERSPACE<br/>Move to ent_in_userspace"]
    Q -->|No| S["Do nothing"]

    R --> T["io_uring_cmd_done(-ENOTCONN)"]
    T --> U["Entry is 'disguised' as completed<br/>Will no longer handle new FUSE requests"]

    V["Practical effects of cancellation:"] --> W["1. Prevent new FUSE requests from using this entry<br/>2. Release io_uring command resources<br/>3. Does not affect already assigned FUSE requests"]
When the kernel is waiting for VFS requests and the corresponding entry
is idle, its state is FRRS_AVAILABLE. Once a request is handed off to
the userspace daemon, the entry's state transitions to FRRS_USERSPACE.
The fuse_uring_cmd function handles the COMMIT_AND_FETCH operation. If a
cmd call carries the IO_URING_F_CANCEL flag, fuse_uring_cancel is
invoked to mark the entry state as FRRS_USERSPACE, making it unavailable
for future requests from the VFS.
If the IORING_URING_CMD_CANCELABLE flag is not set, then before
committing and fetching we first call fuse_uring_prepare_cancel to mark
the entry as IORING_URING_CMD_CANCELABLE. This means that if the daemon
exits, or an execve happens during fetch, the kernel can call
io_uring_try_cancel_uring_cmd to safely clean up these SQEs/CQEs and the
related FUSE resources.
Back to our previous issue, when deleting a FUSE export in QEMU, we hit
a crash due to an invalid CQE handler. This happened because the SQEs we
previously submitted hadn't returned yet by the time we shut down and
deleted the export.
To avoid this, we need to ensure that no further CQEs are returned and
no CQE handler is triggered. We can either:
* Prevent any further user operations before calling blk_exp_close_all
or
* Require userspace to trigger a few specific operations that cause the
kernel to return all outstanding CQEs; the daemon can then send an
io_uring_cmd with the IO_URING_F_CANCEL flag to mark all entries as
unavailable (FRRS_USERSPACE, the "delete" operation), ensuring the
kernel won't assign them to future VFS requests.
* Re: [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
2025-08-03 23:33 ` Brian Song
@ 2025-08-04 11:33 ` Bernd Schubert
2025-08-04 12:29 ` Kevin Wolf
2025-08-05 4:11 ` Brian Song
0 siblings, 2 replies; 7+ messages in thread
From: Bernd Schubert @ 2025-08-04 11:33 UTC (permalink / raw)
To: Brian Song, qemu-block@nongnu.org
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel@nongnu.org
Hi Brian,
sorry for my late reply, just back from vacation and fighting through
my mails.
On 8/4/25 01:33, Brian Song wrote:
>
>
> On 2025-08-01 12:09 p.m., Brian Song wrote:
>> Hi Bernd,
>>
>> We are currently working on implementing termination support for fuse-
>> over-io_uring in QEMU, and right now we are focusing on how to clean up
>> in-flight SQEs properly. Our main question is about how well the kernel
>> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
>> actually implement cancellation beyond destroying the io_uring queue?
>>
>> In QEMU FUSE export, we need a way to quickly and cleanly detach from
>> the event loop and cancel any pending SQEs when an export is no longer
>> in use. Ideally, we want to avoid the more drastic measure of having to
>> close the entire /dev/fuse fd just to gracefully terminate outstanding
>> operations.
>>
>> We are not sure if there's an existing code path that supports async
>> cancel for these in-flight SQEs in the fuse-over-io_uring setup, or if
>> additional callbacks might be needed to fully integrate with the
>> kernel's async cancel mechanism. We also realized libfuse manages
>> shutdowns differently, typically by signaling a thread via eventfd
>> rather than relying on async cancel.
>>
>> Would love to hear your thoughts or suggestions on this!
>>
>> Thanks,
>> Brian
>
> I looked into the kernel codebase and came up with some initial ideas,
> which might not be entirely accurate:
>
> The IORING_OP_ASYNC_CANCEL operation can only cancel io_uring ring
> resources and a limited set of request types. It does not clean up
> resources related to fuse-over-io_uring, such as in-use entries.
> IORING_OP_ASYNC_CANCEL
> -> submit/enter
> -> io_uring/opdef.c:: .issue = io_async_cancel,
> -> __io_async_cancel
> -> io_try_cancel ==> Can only cancel few types of requests
>
>
> Currently, full cleanup of both io_uring and FUSE data structures for
> fuse-over-io_uring only happens in two cases: [since we have mark these
> SQEs cancelable when we commit_and_fetch everytime(mentioned below)]
> 1.When the FUSE daemon exits (exit syscall)
> 2.During execve, which triggers the kernel path:
>
> io_uring_files_cancel =>
> io_uring_try_cancel_uring_cmd =>
> file->f_op->uring_cmd(cmd, IO_URING_F_CANCEL | IO_URING_F_COMPLETE_DEFER)
>
>
>
> Below is a state diagram (mermaid graph) of a fuse_uring entry inside
> the kernel:
>
> graph TD
> A["Userspace daemon"] -->
> B["FUSE_IO_URING_CMD_REGISTER<br/>Register buffer"]
> B --> C["Create fuse_ring_ent"]
> C --> D["State: FRRS_AVAILABLE<br/>Added to ent_avail_queue"]
>
> E["FUSE filesystem operation"] --> F["Generate FUSE request"]
> F --> G["fuse_uring_queue_fuse_req()"]
> G --> H{"Check ent_avail_queue"}
>
> H -->|Entry available| I["Take entry from queue<br/>Assign to FUSE
> request"]
> H -->|No entry available| J["Request goes to fuse_req_queue and waits"]
>
> I --> K["fuse_uring_dispatch_ent()"]
> K --> L["State: FRRS_USERSPACE<br/>Move to ent_in_userspace"]
> L --> M["Notify userspace to process"]
>
> N["Process exit / daemon termination"] -->
> O["io_uring_try_cancel_uring_cmd() <br/> >> NOTE Since we marked the
> entry IORING_URING_CMD_CANCELABLE <br/> in the previous fuse_uring_cmd ,
> try_cancel_uring_cmd will call <br/> fuse_uring_cmd to 'delete' it <<"]
> O --> P["fuse_uring_cancel()"]
> P --> Q{"Is entry state AVAILABLE?"}
>
> Q -->|Yes| R[">> equivalent to 'delete' << Directly change to
> USERSPACE<br/>Move to ent_in_userspace"]
> Q -->|No| S["Do nothing"]
>
> R --> T["io_uring_cmd_done(-ENOTCONN)"]
> T --> U["Entry is 'disguised' as completed<br/>Will no longer
> handle new FUSE requests"]
>
> V["Practical effects of cancellation:"] --> W["1. Prevent new FUSE
> requests from using this entry<br/>2. Release io_uring command
> resources<br/>3. Does not affect already assigned FUSE requests"]
>
>
>
> When the kernel is waiting for VFS requests and the corresponding entry
> is idle, its state is FRRS_AVAILABLE. Once a request is handed off to
> the userspace daemon, the entry's state transitions to FRRS_USERSPACE.
>
> The fuse_uring_cmd function handles the COMMIT_AND_FETCH operation. If a
> cmd call carries the IO_URING_F_CANCEL flag, fuse_uring_cancel is
> invoked to mark the entry state as FRRS_USERSPACE, making it unavailable
> for future requests from the VFS.
>
> If the IORING_URING_CMD_CANCELABLE flag is not set, before committing
> and fetching, we first call fuse_uring_prepare_cancel to mark the entry
> as IORING_URING_CMD_CANCELABLE. This indicates that if the daemon exits
> or an execve happens during fetch, the kernel can call
> io_uring_try_cancel_uring_cmd to safely clean up these SQEs/CQEs and
> related fuse resource.
>
> Back to our previous issue, when deleting a FUSE export in QEMU, we hit
> a crash due to an invalid CQE handler. This happened because the SQEs we
> previously submitted hadn't returned yet by the time we shut down and
> deleted the export.
>
> To avoid this, we need to ensure that no further CQEs are returned and
> no CQE handler is triggered. We need to either:
>
> * Prevent any further user operations before calling blk_exp_close_all
>
> or
>
> * Require the userspace to trigger few specific operations that causes
> the kernel to return all outstanding CQEs, and then the daemon can send
> io_uring_cmd with the IO_URING_F_CANCEL flag to mark all entries as
> unavailable (FRRS_USERSPACE) "delete operation", ensuring the kernel
> won't assign them to future VFS requests.
>
>
>
I have to admit that I'm confused about why you can't use umount.
Isn't that the most graceful way to shut down a connection?
If you need another custom way for some reason, we probably need
to add it.
Thanks,
Bernd
* Re: [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
2025-08-04 11:33 ` Bernd Schubert
@ 2025-08-04 12:29 ` Kevin Wolf
2025-08-05 4:11 ` Brian Song
1 sibling, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2025-08-04 12:29 UTC (permalink / raw)
To: Bernd Schubert
Cc: Brian Song, qemu-block@nongnu.org, Stefan Hajnoczi,
qemu-devel@nongnu.org
Hi Bernd,
Am 04.08.2025 um 13:33 hat Bernd Schubert geschrieben:
> Hi Brian,
>
> sorry for my late reply, just back from vacation and fighting through
> my mails.
>
> On 8/4/25 01:33, Brian Song wrote:
> >
> >
> > On 2025-08-01 12:09 p.m., Brian Song wrote:
> >> Hi Bernd,
> >>
> >> We are currently working on implementing termination support for fuse-
> >> over-io_uring in QEMU, and right now we are focusing on how to clean up
> >> in-flight SQEs properly. Our main question is about how well the kernel
> >> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
> >> actually implement cancellation beyond destroying the io_uring queue?
> >>
> >> In QEMU FUSE export, we need a way to quickly and cleanly detach from
> >> the event loop and cancel any pending SQEs when an export is no longer
> >> in use. Ideally, we want to avoid the more drastic measure of having to
> >> close the entire /dev/fuse fd just to gracefully terminate outstanding
> >> operations.
> >> [...]
> I have to admit that I'm confused why you can't use umount, isn't that
> the most graceful way to shutdown a connection?
>
> If you need another custom way for some reasons, we probably need
> to add it.
Brian focussed on shutdown in his message because that is the scenario
he's seeing right now, but you're right that shutdown probably isn't
that bad and once we unmount the exported image, we can properly shut
down things on the QEMU side, too.
The more challenging part is that sometimes QEMU needs to quiesce an
export so that no new requests can be processed for a short time. Maybe
we're switching processing to a different iothread or something like
this. In this scenario, we don't actually want to unmount the image, but
just cancel any outstanding COMMIT_AND_FETCH request, and soon after
submit a new one to continue processing requests.
If it's impossible to cancel the request in the kernel and have new
requests queued for a little while (I suppose it would look a bit like
userspace being completely busy processing hypothetical NOP requests),
we would have to introduce some indirection in userspace to handle the
case that CQEs may be posted at times when we don't want to process
them, or even in the ring of the wrong thread (each iothread in QEMU
has its own io_uring instance).
Come to think of it, the next thing the user may want to do might even
be deleting the old thread, which would have to fail while it's still
busy. So I think we do need a way to get rid of requests that it started
and can't just wait until they are used up.
Kevin
* Re: [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
2025-08-04 11:33 ` Bernd Schubert
2025-08-04 12:29 ` Kevin Wolf
@ 2025-08-05 4:11 ` Brian Song
2025-08-07 9:05 ` Bernd Schubert
1 sibling, 1 reply; 7+ messages in thread
From: Brian Song @ 2025-08-05 4:11 UTC (permalink / raw)
To: Bernd Schubert, qemu-block@nongnu.org
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel@nongnu.org
On 2025-08-04 7:33 a.m., Bernd Schubert wrote:
> Hi Brian,
>
> sorry for my late reply, just back from vacation and fighting through
> my mails.
>
> On 8/4/25 01:33, Brian Song wrote:
>>
>>
>> On 2025-08-01 12:09 p.m., Brian Song wrote:
>>> Hi Bernd,
>>>
>>> We are currently working on implementing termination support for fuse-
>>> over-io_uring in QEMU, and right now we are focusing on how to clean up
>>> in-flight SQEs properly. Our main question is about how well the kernel
>>> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
>>> actually implement cancellation beyond destroying the io_uring queue?
>>> [...]
>>
>
> I have to admit that I'm confused why you can't use umount, isn't that
> the most graceful way to shutdown a connection?
>
> If you need another custom way for some reasons, we probably need
> to add it.
>
>
> Thanks,
> Bernd
Hi Bernd,
Thanks for your insights!
I think umount doesn't cancel any pending SQEs, right? From what I see,
the only way to cancel all pending SQEs and transition all entries to
the FRRS_USERSPACE state (unavailable for further fuse requests) in the
kernel is by calling io_uring_files_cancel in do_exit, or
io_uring_task_cancel in begin_new_exec.
From my understanding, QEMU follows an event-driven model. So if we
don't cancel the SQEs submitted by a connection when it ends, then
before QEMU exits — after the connection is closed and the associated
FUSE data structures have been freed — any CQE that comes back will
trigger QEMU to invoke a previously deleted CQE handler, leading to a
segfault.
So if the only way to make all pending entries unavailable in the kernel
is calling do_exit or begin_new_exec, I think we should do some
workarounds in QEMU.
Thanks,
Brian
* Re: [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
2025-08-05 4:11 ` Brian Song
@ 2025-08-07 9:05 ` Bernd Schubert
2025-08-07 15:15 ` Stefan Hajnoczi
0 siblings, 1 reply; 7+ messages in thread
From: Bernd Schubert @ 2025-08-07 9:05 UTC (permalink / raw)
To: Brian Song, qemu-block@nongnu.org
Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel@nongnu.org
Hi Brian,
sorry for the late replies. Totally swamped with work this week, and
I'll be off all of next week.
On 8/5/25 06:11, Brian Song wrote:
>
>
> On 2025-08-04 7:33 a.m., Bernd Schubert wrote:
>> Hi Brian,
>>
>> sorry for my late reply, just back from vacation and fighting through
>> my mails.
>>
>> On 8/4/25 01:33, Brian Song wrote:
>>>
>>>
>>> On 2025-08-01 12:09 p.m., Brian Song wrote:
>>>> Hi Bernd,
>>>>
>>>> We are currently working on implementing termination support for fuse-
>>>> over-io_uring in QEMU, and right now we are focusing on how to clean up
>>>> in-flight SQEs properly. Our main question is about how well the kernel
>>>> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
>>>> actually implement cancellation beyond destroying the io_uring queue?
>>>> [...]
>>>
>>
>> I have to admit that I'm confused why you can't use umount, isn't that
>> the most graceful way to shutdown a connection?
>>
>> If you need another custom way for some reasons, we probably need
>> to add it.
>>
>>
>> Thanks,
>> Bernd
>
> Hi Bernd,
>
> Thanks for your insights!
>
> I think umount doesn't cancel any pending SQEs, right? From what I see,
> the only way to cancel all pending SQEs and transition all entries to
> the FRRS_USERSPACE state (unavailable for further fuse requests) in the
> kernel is by calling io_uring_files_cancel in do_exit, or
> io_uring_task_cancel in begin_new_exec.
There are two umount forms:
- Forced umount - immediately cancels the connection and aborts
requests. That also immediately releases pending SQEs.
- Normal umount - destroys the connection and completes SQEs at the end
of umount.
>
> From my understanding, QEMU follows an event-driven model. So if we
> don't cancel the SQEs submitted by a connection when it ends, then
> before QEMU exits — after the connection is closed and the associated
> FUSE data structures have been freed — any CQE that comes back will
> trigger QEMU to invoke a previously deleted CQE handler, leading to a
> segfault.
>
> So if the only way to make all pending entries unavailable in the kernel
> is calling do_exit or begin_new_exec, I think we should do some
> workarounds in QEMU.
I guess if we find a good argument for why qemu needs to complete SQEs
before umount is complete, a kernel patch would be accepted. It doesn't
sound that difficult to create a patch for that, at least for entries
that are in state FRRS_AVAILABLE. I can prepare a patch, but at best
between Saturday and Monday.
Thanks,
Bernd
* Re: [QEMU/FUSE] Discussion on Proper Termination and Async Cancellation in fuse-over-io_uring
2025-08-07 9:05 ` Bernd Schubert
@ 2025-08-07 15:15 ` Stefan Hajnoczi
0 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-08-07 15:15 UTC (permalink / raw)
To: Bernd Schubert
Cc: Brian Song, qemu-block@nongnu.org, Kevin Wolf,
qemu-devel@nongnu.org
On Thu, Aug 07, 2025 at 09:05:25AM +0000, Bernd Schubert wrote:
> Hi Brian,
>
> sorry for late replies. Totally swamped in work this week and next week
> will be off another week.
>
> On 8/5/25 06:11, Brian Song wrote:
> >
> >
> > On 2025-08-04 7:33 a.m., Bernd Schubert wrote:
> >> Hi Brian,
> >>
> >> sorry for my late reply, just back from vacation and fighting through
> >> my mails.
> >>
> >> On 8/4/25 01:33, Brian Song wrote:
> >>>
> >>>
> >>> On 2025-08-01 12:09 p.m., Brian Song wrote:
> >>>> Hi Bernd,
> >>>>
> >>>> We are currently working on implementing termination support for fuse-
> >>>> over-io_uring in QEMU, and right now we are focusing on how to clean up
> >>>> in-flight SQEs properly. Our main question is about how well the kernel
> >>>> supports robust cancellation for these fuse-over-io_uring SQEs. Does it
> >>>> actually implement cancellation beyond destroying the io_uring queue?
> >>>> [...]
> >>>
> >>
> >> I have to admit that I'm confused why you can't use umount, isn't that
> >> the most graceful way to shutdown a connection?
> >>
> >> If you need another custom way for some reasons, we probably need
> >> to add it.
> >>
> >>
> >> Thanks,
> >> Bernd
> >
> > Hi Bernd,
> >
> > Thanks for your insights!
> >
> > I think umount doesn't cancel any pending SQEs, right? From what I see,
> > the only way to cancel all pending SQEs and transition all entries to
> > the FRRS_USERSPACE state (unavailable for further fuse requests) in the
> > kernel is by calling io_uring_files_cancel in do_exit, or
> > io_uring_task_cancel in begin_new_exec.
>
> There are two umount forms
>
> - Forced umount - immediately cancels the connection and aborts
> requests. That also immediately releases pending SQEs.
>
> - Normal umount, destroys the connection and completed SQEs at the end
> of umount.
>
> >
> > From my understanding, QEMU follows an event-driven model. So if we
> > don't cancel the SQEs submitted by a connection when it ends, then
> > before QEMU exits — after the connection is closed and the associated
> > FUSE data structures have been freed — any CQE that comes back will
> > trigger QEMU to invoke a previously deleted CQE handler, leading to a
> > segfault.
> >
> > So if the only way to make all pending entries unavailable in the kernel
> > is calling do_exit or begin_new_exec, I think we should do some
> > workarounds in QEMU.
>
> I guess if we find a good argument why qemu needs to complete SQEs
> before umount is complete a kernel patch would be accepted. Doesn't
> sound that difficult to create patch for that. At least for entries that
> are on state FRRS_AVAILABLE. I can prepare patch, but at best in between
> Saturday and Monday.
Hi Bernd,
QEMU quiesces I/O at certain points, like when the block driver graph is
reconfigured (kind of like changing the device-mapper table in the
kernel) or when threads are reconfigured. This is also used during
termination to stop accepting new I/O and wait until in-flight I/O has
completed.
Ideally io_uring's ASYNC_CANCEL would work on in-flight
FUSE-over-io_uring uring_cmd requests. The REGISTER or COMMIT_AND_FETCH
uring_cmds would complete with -ECANCELED and future FUSE requests would
be queued in the kernel until FUSE-over-io_uring becomes ready again.
If and when userspace becomes ready again, it submits REGISTER
uring_cmds again and queued FUSE requests are then delivered to
userspace.
Thanks for your help!
Stefan