* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released [not found] <79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr> @ 2023-07-19 21:16 ` Axel Rasmussen 2023-07-19 21:54 ` Axel Rasmussen 2025-03-07 7:21 ` Jinjiang Tu 1 sibling, 1 reply; 14+ messages in thread From: Axel Rasmussen @ 2023-07-19 21:16 UTC (permalink / raw) To: Dimitris Siakavaras Cc: viro, linux-fsdevel, linux-kernel, Peter Xu, linux-mm, Axel Rasmussen Thanks for the detailed report Dimitris! I've CCed the MM mailing list and some folks who work on userfaultfd. I took a look at this today, but I haven't quite come up with a solution. I thought it might be as easy as changing userfaultfd_release() to set released *after* taking the lock. But no such luck, the ordering is what it is to deal with another subtle case: WRITE_ONCE(ctx->released, true); if (!mmget_not_zero(mm)) goto wakeup; /* * Flush page faults out of all CPUs. NOTE: all page faults * must be retried without returning VM_FAULT_SIGBUS if * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx * changes while handle_userfault released the mmap_lock. So * it's critical that released is set to true (above), before * taking the mmap_lock for writing. */ mmap_write_lock(mm); I think perhaps the right thing to do is to have handle_userfault() release mmap_lock when it returns VM_FAULT_NOPAGE, and to have GUP deal with that appropriately? But, some investigation is required to be sure that's okay to do in the other non-GUP ways we can end up in handle_userfault(). ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2023-07-19 21:16 ` Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released Axel Rasmussen @ 2023-07-19 21:54 ` Axel Rasmussen 2023-07-20 10:35 ` Hillf Danton 2023-07-20 20:06 ` Peter Xu 0 siblings, 2 replies; 14+ messages in thread From: Axel Rasmussen @ 2023-07-19 21:54 UTC (permalink / raw) To: Dimitris Siakavaras; +Cc: viro, linux-fsdevel, linux-kernel, Peter Xu, linux-mm On Wed, Jul 19, 2023 at 2:16 PM Axel Rasmussen <axelrasmussen@google.com> wrote: > > Thanks for the detailed report Dimitris! I've CCed the MM mailing list and some > folks who work on userfaultfd. Apologies, I should have quoted the original message for the others I added to CC: https://lore.kernel.org/lkml/79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr/T/#u > > I took a look at this today, but I haven't quite come up with a solution. > > I thought it might be as easy as changing userfaultfd_release() to set released > *after* taking the lock. But no such luck, the ordering is what it is to deal > with another subtle case: > > > WRITE_ONCE(ctx->released, true); > > if (!mmget_not_zero(mm)) > goto wakeup; > > /* > * Flush page faults out of all CPUs. NOTE: all page faults > * must be retried without returning VM_FAULT_SIGBUS if > * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx > * changes while handle_userfault released the mmap_lock. So > * it's critical that released is set to true (above), before > * taking the mmap_lock for writing. > */ > mmap_write_lock(mm); > > I think perhaps the right thing to do is to have handle_userfault() release > mmap_lock when it returns VM_FAULT_NOPAGE, and to have GUP deal with that > appropriately? But, some investigation is required to be sure that's okay to do > in the other non-GUP ways we can end up in handle_userfault(). ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2023-07-19 21:54 ` Axel Rasmussen @ 2023-07-20 10:35 ` Hillf Danton 2023-07-20 20:07 ` Peter Xu 2023-07-20 20:06 ` Peter Xu 1 sibling, 1 reply; 14+ messages in thread From: Hillf Danton @ 2023-07-20 10:35 UTC (permalink / raw) To: Dimitris Siakavaras; +Cc: Axel Rasmussen, linux-kernel, Peter Xu, linux-mm On Wed, Jul 19, 2023 at 2:16 PM Axel Rasmussen <axelrasmussen@google.com> wrote: > > I think perhaps the right thing to do is to have handle_userfault() release > mmap_lock when it returns VM_FAULT_NOPAGE, and to have GUP deal with that > appropriately? But, some investigation is required to be sure that's okay to do > in the other non-GUP ways we can end up in handle_userfault(). See if making kworker special works. --- x/fs/userfaultfd.c +++ y/fs/userfaultfd.c @@ -457,6 +457,8 @@ vm_fault_t handle_userfault(struct vm_fa * close the uffd. */ ret = VM_FAULT_NOPAGE; + if (current->flags & PF_WQ_WORKER) + ret = VM_FAULT_OOM; goto out; } -- ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2023-07-20 10:35 ` Hillf Danton @ 2023-07-20 20:07 ` Peter Xu 0 siblings, 0 replies; 14+ messages in thread From: Peter Xu @ 2023-07-20 20:07 UTC (permalink / raw) To: Hillf Danton; +Cc: Dimitris Siakavaras, Axel Rasmussen, linux-kernel, linux-mm On Thu, Jul 20, 2023 at 06:35:34PM +0800, Hillf Danton wrote: > On Wed, Jul 19, 2023 at 2:16 PM Axel Rasmussen <axelrasmussen@google.com> wrote: > > > > I think perhaps the right thing to do is to have handle_userfault() release > > mmap_lock when it returns VM_FAULT_NOPAGE, and to have GUP deal with that > > appropriately? But, some investigation is required to be sure that's okay to do > > in the other non-GUP ways we can end up in handle_userfault(). > > See if making kworker special works. > > --- x/fs/userfaultfd.c > +++ y/fs/userfaultfd.c > @@ -457,6 +457,8 @@ vm_fault_t handle_userfault(struct vm_fa > * close the uffd. > */ > ret = VM_FAULT_NOPAGE; > + if (current->flags & PF_WQ_WORKER) > + ret = VM_FAULT_OOM; > goto out; > } Sorry this won't work - we need userfault to work with all forms of kworkers, especially including kvm async pf. Thanks. -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2023-07-19 21:54 ` Axel Rasmussen 2023-07-20 10:35 ` Hillf Danton @ 2023-07-20 20:06 ` Peter Xu 1 sibling, 0 replies; 14+ messages in thread From: Peter Xu @ 2023-07-20 20:06 UTC (permalink / raw) To: Axel Rasmussen Cc: Dimitris Siakavaras, viro, linux-fsdevel, linux-kernel, linux-mm Hello, Axel, Dimitris, On Wed, Jul 19, 2023 at 02:54:21PM -0700, Axel Rasmussen wrote: > On Wed, Jul 19, 2023 at 2:16 PM Axel Rasmussen <axelrasmussen@google.com> wrote: > > > > Thanks for the detailed report Dimitris! I've CCed the MM mailing list and some > > folks who work on userfaultfd. > > Apologies, I should have quoted the original message for the others I > added to CC: https://lore.kernel.org/lkml/79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr/T/#u > > > > > I took a look at this today, but I haven't quite come up with a solution. > > > > I thought it might be as easy as changing userfaultfd_release() to set released > > *after* taking the lock. But no such luck, the ordering is what it is to deal > > with another subtle case: > > > > > > WRITE_ONCE(ctx->released, true); > > > > if (!mmget_not_zero(mm)) > > goto wakeup; > > > > /* > > * Flush page faults out of all CPUs. NOTE: all page faults > > * must be retried without returning VM_FAULT_SIGBUS if > > * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx > > * changes while handle_userfault released the mmap_lock. So > > * it's critical that released is set to true (above), before > > * taking the mmap_lock for writing. > > */ > > mmap_write_lock(mm); > > > > I think perhaps the right thing to do is to have handle_userfault() release > > mmap_lock when it returns VM_FAULT_NOPAGE, and to have GUP deal with that > > appropriately? But, some investigation is required to be sure that's okay to do > > in the other non-GUP ways we can end up in handle_userfault(). Heh, this is also what I thought after reading. :) If we see in the very early commit from Andrea it seems that would not hang gup but just sigbus-ing it (see the comment that's mostly exactly the thing Dimitris hit here): commit 86039bd3b4e6a1129318cbfed4e0a6e001656635 Author: Andrea Arcangeli <aarcange@redhat.com> Date: Fri Sep 4 15:46:31 2015 -0700 userfaultfd: add new syscall to provide memory externalization + /* + * If it's already released don't get it. This avoids to loop + * in __get_user_pages if userfaultfd_release waits on the + * caller of handle_userfault to release the mmap_sem. + */ + if (unlikely(ACCESS_ONCE(ctx->released))) + return VM_FAULT_SIGBUS; + Then we switched over to the friendly way, assuming CRIU could close() the uffd during the monitee process running, in: commit 656710a60e3693911bee3a355d2f2bbae3faba33 Author: Andrea Arcangeli <aarcange@redhat.com> Date: Fri Sep 8 16:12:42 2017 -0700 userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS I had a feeling that after that we didn't test gup (I assume normal page fault path will still work). Let me copy Mike too for that just in case he has anything to say. Paste thread again: https://lore.kernel.org/lkml/79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr/T/#u My understanding is that releasing mmap lock here should work, but we need to move the code a bit. Dimitris, please feel free to try the patch attached here if you want. It's probably not a major use case of uffd over kvm (IIUC unregister before close() will also work?), but if it's trivial to fix we should proably fix it. Thanks, ===8<=== From 7e9ef050b487220463fa77a7aa97259ffe9bb15e Mon Sep 17 00:00:00 2001 From: Peter Xu <peterx@redhat.com> Date: Thu, 20 Jul 2023 15:33:55 -0400 Subject: [PATCH] mm/uffd: Fix release hang over GUP Signed-off-by: Peter Xu <peterx@redhat.com> --- fs/userfaultfd.c | 57 ++++++++++++++++++++++++++---------------------- 1 file changed, 31 insertions(+), 26 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index bbfaf5837a08..2358e6b00315 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -455,32 +455,6 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) if (!(vmf->flags & FAULT_FLAG_USER) && (ctx->flags & UFFD_USER_MODE_ONLY)) goto out; - /* - * If it's already released don't get it. This avoids to loop - * in __get_user_pages if userfaultfd_release waits on the - * caller of handle_userfault to release the mmap_lock. - */ - if (unlikely(READ_ONCE(ctx->released))) { - /* - * Don't return VM_FAULT_SIGBUS in this case, so a non - * cooperative manager can close the uffd after the - * last UFFDIO_COPY, without risking to trigger an - * involuntary SIGBUS if the process was starting the - * userfaultfd while the userfaultfd was still armed - * (but after the last UFFDIO_COPY). If the uffd - * wasn't already closed when the userfault reached - * this point, that would normally be solved by - * userfaultfd_must_wait returning 'false'. - * - * If we were to return VM_FAULT_SIGBUS here, the non - * cooperative manager would be instead forced to - * always call UFFDIO_UNREGISTER before it can safely - * close the uffd. - */ - ret = VM_FAULT_NOPAGE; - goto out; - } - /* * Check that we can return VM_FAULT_RETRY. * @@ -517,6 +491,37 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT) goto out; + /* + * If it's already released don't get it. This avoids to loop + * in __get_user_pages if userfaultfd_release waits on the + * caller of handle_userfault to release the mmap_lock. + */ + if (unlikely(READ_ONCE(ctx->released))) { + /* + * Don't return VM_FAULT_SIGBUS in this case, so a non + * cooperative manager can close the uffd after the + * last UFFDIO_COPY, without risking to trigger an + * involuntary SIGBUS if the process was starting the + * userfaultfd while the userfaultfd was still armed + * (but after the last UFFDIO_COPY). If the uffd + * wasn't already closed when the userfault reached + * this point, that would normally be solved by + * userfaultfd_must_wait returning 'false'. + * + * If we were to return VM_FAULT_SIGBUS here, the non + * cooperative manager would be instead forced to + * always call UFFDIO_UNREGISTER before it can safely + * close the uffd. + * + * We release the mmap lock in this special case, just in + * case we're in a gup to not dead loop, so the other uffd + * handler thread/process can have a chance to take the + * write lock and do the unregistration. + */ + release_fault_lock(vmf); + goto out; + } + /* take the reference before dropping the mmap_lock */ userfaultfd_ctx_get(ctx); -- 2.41.0 ===8<=== -- Peter Xu ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released [not found] <79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr> 2023-07-19 21:16 ` Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released Axel Rasmussen @ 2025-03-07 7:21 ` Jinjiang Tu 2025-03-07 8:07 ` Jinjiang Tu 1 sibling, 1 reply; 14+ messages in thread From: Jinjiang Tu @ 2025-03-07 7:21 UTC (permalink / raw) To: jimsiak Cc: linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang, tujinjiang Hi, I encountered the same issue too. In my scenario, GUP is called by mlockall() syscall. Is there a solution to fix it? Thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-07 7:21 ` Jinjiang Tu @ 2025-03-07 8:07 ` Jinjiang Tu 2025-03-07 13:11 ` jimsiak 0 siblings, 1 reply; 14+ messages in thread From: Jinjiang Tu @ 2025-03-07 8:07 UTC (permalink / raw) To: jimsiak, peterx Cc: linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang cc Peter Xu 在 2025/3/7 15:21, Jinjiang Tu 写道: > Hi, > > I encountered the same issue too. In my scenario, GUP is called by mlockall() > syscall. > > Is there a solution to fix it? > > Thanks. > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-07 8:07 ` Jinjiang Tu @ 2025-03-07 13:11 ` jimsiak 2025-03-07 22:41 ` Peter Xu 0 siblings, 1 reply; 14+ messages in thread From: jimsiak @ 2025-03-07 13:11 UTC (permalink / raw) To: Jinjiang Tu Cc: peterx, linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang Hi, From my side, I managed to avoid the freezing of processes with the following change in function userfaultfd_release() in file fs/userfaultfd.c (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842): I moved the following command from line 851: WRITE_ONCE(ctx->released, true); (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851) to line 905, that is exactly before the functions returns 0. That simple workaround worked for my use case but I am far from sure that is a correct/sufficient fix for the problem at hand. Best Regards, Dimitris Στις 07/03/2025 10:07, Jinjiang Tu έγραψε: > cc Peter Xu > > 在 2025/3/7 15:21, Jinjiang Tu 写道: >> Hi, >> >> I encountered the same issue too. In my scenario, GUP is called by >> mlockall() >> syscall. >> >> Is there a solution to fix it? >> >> Thanks. >> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-07 13:11 ` jimsiak @ 2025-03-07 22:41 ` Peter Xu 2025-03-10 6:40 ` Jinjiang Tu 0 siblings, 1 reply; 14+ messages in thread From: Peter Xu @ 2025-03-07 22:41 UTC (permalink / raw) To: jimsiak Cc: Jinjiang Tu, linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang On Fri, Mar 07, 2025 at 03:11:09PM +0200, jimsiak wrote: > Hi, > > From my side, I managed to avoid the freezing of processes with the > following change in function userfaultfd_release() in file fs/userfaultfd.c > (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842): > > I moved the following command from line 851: > WRITE_ONCE(ctx->released, true); > (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851) > > to line 905, that is exactly before the functions returns 0. > > That simple workaround worked for my use case but I am far from sure that is > a correct/sufficient fix for the problem at hand. Updating the field after userfaultfd_ctx_put() might mean UAF, afaict. Maybe it's possible to remove ctx->released but only rely on the mmap write lock. However that'll need some closer look and more thoughts. To me, the more straightforward way to fix it is to use the patch I mentioned in the other email: https://lore.kernel.org/all/ZLmT3BfcmltfFvbq@x1n/ Or does it mean it didn't work at all? Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-07 22:41 ` Peter Xu @ 2025-03-10 6:40 ` Jinjiang Tu 2025-03-10 18:50 ` Peter Xu 0 siblings, 1 reply; 14+ messages in thread From: Jinjiang Tu @ 2025-03-10 6:40 UTC (permalink / raw) To: Peter Xu, jimsiak Cc: linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang 在 2025/3/8 6:41, Peter Xu 写道: > On Fri, Mar 07, 2025 at 03:11:09PM +0200, jimsiak wrote: >> Hi, >> >> From my side, I managed to avoid the freezing of processes with the >> following change in function userfaultfd_release() in file fs/userfaultfd.c >> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842): >> >> I moved the following command from line 851: >> WRITE_ONCE(ctx->released, true); >> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851) >> >> to line 905, that is exactly before the functions returns 0. >> >> That simple workaround worked for my use case but I am far from sure that is >> a correct/sufficient fix for the problem at hand. > Updating the field after userfaultfd_ctx_put() might mean UAF, afaict. > > Maybe it's possible to remove ctx->released but only rely on the mmap write > lock. However that'll need some closer look and more thoughts. > > To me, the more straightforward way to fix it is to use the patch I > mentioned in the other email: > > https://lore.kernel.org/all/ZLmT3BfcmltfFvbq@x1n/ > > Or does it mean it didn't work at all? This patch works for me. mlock() syscall calls GUP with FOLL_UNLOCKABLE and allows to release mmap lock and retry. But other GUP call without FOLL_UNLOCKABLE will return VM_FAULT_SIGBUS, is it a regression for the below commit? commit 656710a60e3693911bee3a355d2f2bbae3faba33 Author: Andrea Arcangeli <aarcange@redhat.com> Date: Fri Sep 8 16:12:42 2017 -0700 userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS > > Thanks, > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-10 6:40 ` Jinjiang Tu @ 2025-03-10 18:50 ` Peter Xu 2025-03-11 8:14 ` Jinjiang Tu 0 siblings, 1 reply; 14+ messages in thread From: Peter Xu @ 2025-03-10 18:50 UTC (permalink / raw) To: Jinjiang Tu Cc: jimsiak, linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang On Mon, Mar 10, 2025 at 02:40:35PM +0800, Jinjiang Tu wrote: > > 在 2025/3/8 6:41, Peter Xu 写道: > > On Fri, Mar 07, 2025 at 03:11:09PM +0200, jimsiak wrote: > > > Hi, > > > > > > From my side, I managed to avoid the freezing of processes with the > > > following change in function userfaultfd_release() in file fs/userfaultfd.c > > > (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842): > > > > > > I moved the following command from line 851: > > > WRITE_ONCE(ctx->released, true); > > > (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851) > > > > > > to line 905, that is exactly before the functions returns 0. > > > > > > That simple workaround worked for my use case but I am far from sure that is > > > a correct/sufficient fix for the problem at hand. > > Updating the field after userfaultfd_ctx_put() might mean UAF, afaict. > > > > Maybe it's possible to remove ctx->released but only rely on the mmap write > > lock. However that'll need some closer look and more thoughts. > > > > To me, the more straightforward way to fix it is to use the patch I > > mentioned in the other email: > > > > https://lore.kernel.org/all/ZLmT3BfcmltfFvbq@x1n/ > > > > Or does it mean it didn't work at all? > > This patch works for me. mlock() syscall calls GUP with FOLL_UNLOCKABLE and > allows to release mmap lock and retry. > > But other GUP call without FOLL_UNLOCKABLE will return VM_FAULT_SIGBUS, > is it a regression for the below commit? Do you have an explicit reproducer / use case of such? AFAIU, below commit should only change it from SIGBUS to NOPAGE when "released" is set. I don't see how it can regress on !FOLL_UNLOCKABLE. Thanks, > > commit 656710a60e3693911bee3a355d2f2bbae3faba33 > Author: Andrea Arcangeli <aarcange@redhat.com> > Date: Fri Sep 8 16:12:42 2017 -0700 > > userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS > > > > > Thanks, > > > -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-10 18:50 ` Peter Xu @ 2025-03-11 8:14 ` Jinjiang Tu 2025-03-12 9:18 ` Jinjiang Tu 0 siblings, 1 reply; 14+ messages in thread From: Jinjiang Tu @ 2025-03-11 8:14 UTC (permalink / raw) To: Peter Xu Cc: jimsiak, linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang 在 2025/3/11 2:50, Peter Xu 写道: > On Mon, Mar 10, 2025 at 02:40:35PM +0800, Jinjiang Tu wrote: >> 在 2025/3/8 6:41, Peter Xu 写道: >>> On Fri, Mar 07, 2025 at 03:11:09PM +0200, jimsiak wrote: >>>> Hi, >>>> >>>> From my side, I managed to avoid the freezing of processes with the >>>> following change in function userfaultfd_release() in file fs/userfaultfd.c >>>> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842): >>>> >>>> I moved the following command from line 851: >>>> WRITE_ONCE(ctx->released, true); >>>> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851) >>>> >>>> to line 905, that is exactly before the functions returns 0. >>>> >>>> That simple workaround worked for my use case but I am far from sure that is >>>> a correct/sufficient fix for the problem at hand. >>> Updating the field after userfaultfd_ctx_put() might mean UAF, afaict. >>> >>> Maybe it's possible to remove ctx->released but only rely on the mmap write >>> lock. However that'll need some closer look and more thoughts. >>> >>> To me, the more straightforward way to fix it is to use the patch I >>> mentioned in the other email: >>> >>> https://lore.kernel.org/all/ZLmT3BfcmltfFvbq@x1n/ >>> >>> Or does it mean it didn't work at all? >> This patch works for me. mlock() syscall calls GUP with FOLL_UNLOCKABLE and >> allows to release mmap lock and retry. >> >> But other GUP call without FOLL_UNLOCKABLE will return VM_FAULT_SIGBUS, >> is it a regression for the below commit? > Do you have an explicit reproducer / use case of such? > > AFAIU, below commit should only change it from SIGBUS to NOPAGE when > "released" is set. I don't see how it can regress on !FOLL_UNLOCKABLE. > > Thanks, You are right, the below commit seems to only care about page fault from userspace (which has FAULT_FLAG_ALLOW_RETRY flag), and doesn't care about GUP from drivers (which may be !FOLL_UNLOCKABLE) Thanks. >> commit 656710a60e3693911bee3a355d2f2bbae3faba33 >> Author: Andrea Arcangeli <aarcange@redhat.com> >> Date: Fri Sep 8 16:12:42 2017 -0700 >> >> userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS >> >>> Thanks, >>> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-11 8:14 ` Jinjiang Tu @ 2025-03-12 9:18 ` Jinjiang Tu 2025-03-12 14:09 ` Peter Xu 0 siblings, 1 reply; 14+ messages in thread From: Jinjiang Tu @ 2025-03-12 9:18 UTC (permalink / raw) To: Peter Xu Cc: jimsiak, linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang, tujinjiang 在 2025/3/11 16:14, Jinjiang Tu 写道: > > 在 2025/3/11 2:50, Peter Xu 写道: >> On Mon, Mar 10, 2025 at 02:40:35PM +0800, Jinjiang Tu wrote: >>> 在 2025/3/8 6:41, Peter Xu 写道: >>>> On Fri, Mar 07, 2025 at 03:11:09PM +0200, jimsiak wrote: >>>>> Hi, >>>>> >>>>> From my side, I managed to avoid the freezing of processes with the >>>>> following change in function userfaultfd_release() in file >>>>> fs/userfaultfd.c >>>>> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L842): >>>>> >>>>> >>>>> I moved the following command from line 851: >>>>> WRITE_ONCE(ctx->released, true); >>>>> (https://elixir.bootlin.com/linux/v5.13/source/fs/userfaultfd.c#L851) >>>>> >>>>> to line 905, that is exactly before the functions returns 0. >>>>> >>>>> That simple workaround worked for my use case but I am far from >>>>> sure that is >>>>> a correct/sufficient fix for the problem at hand. >>>> Updating the field after userfaultfd_ctx_put() might mean UAF, afaict. >>>> >>>> Maybe it's possible to remove ctx->released but only rely on the >>>> mmap write >>>> lock. However that'll need some closer look and more thoughts. >>>> >>>> To me, the more straightforward way to fix it is to use the patch I >>>> mentioned in the other email: >>>> >>>> https://lore.kernel.org/all/ZLmT3BfcmltfFvbq@x1n/ >>>> >>>> Or does it mean it didn't work at all? >>> This patch works for me. mlock() syscall calls GUP with >>> FOLL_UNLOCKABLE and >>> allows to release mmap lock and retry. >>> >>> But other GUP call without FOLL_UNLOCKABLE will return VM_FAULT_SIGBUS, >>> is it a regression for the below commit? >> Do you have an explicit reproducer / use case of such? >> >> AFAIU, below commit should only change it from SIGBUS to NOPAGE when >> "released" is set. I don't see how it can regress on !FOLL_UNLOCKABLE. >> >> Thanks, > > You are right, the below commit seems to only care about page fault > from userspace (which has > FAULT_FLAG_ALLOW_RETRY flag), and doesn't care about GUP from drivers > (which may be !FOLL_UNLOCKABLE) > > Thanks. Hi Peter, Since this patch works, could you please send a formal patch to maillist? Thanks. > >>> commit 656710a60e3693911bee3a355d2f2bbae3faba33 >>> Author: Andrea Arcangeli <aarcange@redhat.com> >>> Date: Fri Sep 8 16:12:42 2017 -0700 >>> >>> userfaultfd: non-cooperative: closing the uffd without >>> triggering SIGBUS >>> >>>> Thanks, >>>> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released 2025-03-12 9:18 ` Jinjiang Tu @ 2025-03-12 14:09 ` Peter Xu 0 siblings, 0 replies; 14+ messages in thread From: Peter Xu @ 2025-03-12 14:09 UTC (permalink / raw) To: Jinjiang Tu Cc: jimsiak, linux-fsdevel, linux-kernel, viro, linux-mm, wangkefeng.wang On Wed, Mar 12, 2025 at 05:18:26PM +0800, Jinjiang Tu wrote: > Since this patch works, could you please send a formal patch to maillist? Will do it later today, thanks. -- Peter Xu ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-03-12 14:09 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr> 2023-07-19 21:16 ` Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released Axel Rasmussen 2023-07-19 21:54 ` Axel Rasmussen 2023-07-20 10:35 ` Hillf Danton 2023-07-20 20:07 ` Peter Xu 2023-07-20 20:06 ` Peter Xu 2025-03-07 7:21 ` Jinjiang Tu 2025-03-07 8:07 ` Jinjiang Tu 2025-03-07 13:11 ` jimsiak 2025-03-07 22:41 ` Peter Xu 2025-03-10 6:40 ` Jinjiang Tu 2025-03-10 18:50 ` Peter Xu 2025-03-11 8:14 ` Jinjiang Tu 2025-03-12 9:18 ` Jinjiang Tu 2025-03-12 14:09 ` Peter Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).