From: Mike Kravetz <mike.kravetz@oracle.com>
To: Liu Shixin <liushixin2@huawei.com>
Cc: Liu Zixian <liuzixian4@huawei.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH] mm: hugetlb: fix UAF in hugetlb_handle_userfault
Date: Wed, 21 Sep 2022 16:57:39 -0700
Message-ID: <Yyuk83B4VHh+pbFp@monkey>
In-Reply-To: <YytOYH1MSo5cNoB6@monkey>

On 09/21/22 10:48, Mike Kravetz wrote:
> On 09/21/22 16:34, Liu Shixin wrote:
> > The vma_lock and hugetlb_fault_mutex are dropped before handling
> > userfault and reacquired after handle_userfault(), but
> > reacquiring the vma_lock could lead to UAF[1] due to the following
> > race:
> > 
> > hugetlb_fault
> >   hugetlb_no_page
> >     /* unlock vma_lock */
> >     hugetlb_handle_userfault
> >       handle_userfault
> >         /* unlock mm->mmap_lock */
> >                                            vm_mmap_pgoff
> >                                              do_mmap
> >                                                mmap_region
> >                                                  munmap_vma_range
> >                                                    /* clean old vma */
> >         /* lock vma_lock again  <--- UAF */
> >     /* unlock vma_lock */
> > 
> > Since the vma_lock is unlocked immediately after hugetlb_handle_userfault(),
> > let's drop the unneeded lock and unlock in hugetlb_handle_userfault() to fix
> > the issue.
> 
> Thank you very much!
> 
> When I saw this report, the obvious fix was to do something like what you have
> done below.  That looks fine with a few minor comments.
> 
> One question I have not yet answered is, "Does this same issue apply to
> follow_hugetlb_page()?".  I believe it does.  follow_hugetlb_page calls
> hugetlb_fault which could result in the fault being processed by userfaultfd.
> If we experience the race above, then the associated vma could no longer be
> valid when returning from hugetlb_fault.  follow_hugetlb_page and callers
> have a flag (locked) to deal with dropping mmap lock.  However, I am not sure
> if it is handled correctly WRT userfaultfd.  I think this needs to be answered
> before fixing.  And, if the follow_hugetlb_page code needs to be fixed it
> should be done at the same time.
> 

To at least verify this code path, I added userfaultfd handling to the gup_test
program in the kernel selftests.  When doing a basic gup test on a hugetlb page
in a userfaultfd-registered range, I hit this warning:

[ 6939.867796] FAULT_FLAG_ALLOW_RETRY missing 1
[ 6939.871503] CPU: 2 PID: 5720 Comm: gup_test Not tainted 6.0.0-rc6-next-20220921+ #72
[ 6939.874562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 6939.877707] Call Trace:
[ 6939.878745]  <TASK>
[ 6939.879779]  dump_stack_lvl+0x6c/0x9f
[ 6939.881199]  handle_userfault.cold+0x14/0x1e
[ 6939.882830]  ? find_held_lock+0x2b/0x80
[ 6939.884370]  ? __mutex_unlock_slowpath+0x45/0x280
[ 6939.886145]  hugetlb_handle_userfault+0x90/0xf0
[ 6939.887936]  hugetlb_fault+0xb7e/0xda0
[ 6939.889409]  ? vprintk_emit+0x118/0x3a0
[ 6939.890903]  ? _printk+0x58/0x73
[ 6939.892279]  follow_hugetlb_page.cold+0x59/0x145
[ 6939.894116]  __get_user_pages+0x146/0x750
[ 6939.895580]  __gup_longterm_locked+0x3e9/0x680
[ 6939.897023]  ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0
[ 6939.898939]  ? lockdep_hardirqs_on+0x7d/0x100
[ 6939.901243]  gup_test_ioctl+0x320/0x6e0
[ 6939.902202]  __x64_sys_ioctl+0x87/0xc0
[ 6939.903220]  do_syscall_64+0x38/0x90
[ 6939.904233]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 6939.905423] RIP: 0033:0x7fbb53830f7b

This is because userfaultfd expects FAULT_FLAG_ALLOW_RETRY, which is not
set in this path.
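
The warning text can be read against a simplified model of the check.  The
sketch below is a userspace illustration, not the kernel source: the flag names
mirror the kernel's, but the numeric values, the result codes and the function
model_handle_userfault() are assumptions made for this example.  It shows why a
fault arriving without FAULT_FLAG_ALLOW_RETRY is rejected with a diagnostic
that prints the raw flags.

```c
#include <assert.h>
#include <stdio.h>

/* Flag values assumed for this sketch; check them against the tree in use. */
#define FAULT_FLAG_WRITE        (1u << 0)
#define FAULT_FLAG_ALLOW_RETRY  (1u << 2)
#define FAULT_FLAG_RETRY_NOWAIT (1u << 3)

/* Hypothetical result codes for the model. */
enum model_result { MODEL_FAULT_RETRY, MODEL_FAULT_SIGBUS };

/*
 * Hypothetical model of a userfault handler's entry check.  Waiting for
 * userspace means dropping mmap_lock, so the handler refuses to proceed
 * unless the caller has declared that it can cope with a retry.
 */
static enum model_result model_handle_userfault(unsigned int flags)
{
	if (!(flags & FAULT_FLAG_ALLOW_RETRY)) {
		/* "nowait" only makes sense together with "allow retry". */
		assert(!(flags & FAULT_FLAG_RETRY_NOWAIT));
		fprintf(stderr, "FAULT_FLAG_ALLOW_RETRY missing %x\n", flags);
		return MODEL_FAULT_SIGBUS;
	}
	/* The caller allows a retry, so the lock may be dropped while waiting. */
	return MODEL_FAULT_RETRY;
}
```

With these assumed values, model_handle_userfault(FAULT_FLAG_WRITE) prints
"FAULT_FLAG_ALLOW_RETRY missing 1" and returns MODEL_FAULT_SIGBUS, while
model_handle_userfault(FAULT_FLAG_WRITE | FAULT_FLAG_ALLOW_RETRY) returns
MODEL_FAULT_RETRY.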

Adding John, Peter and David on Cc: as they are much more fluent in all the
fault and FOLL combinations and might have immediate suggestions.  It is going
to take me a little while to figure out:
1) How to make sure we get the right flags passed to handle_userfault
2) How to modify follow_hugetlb_page, as userfaultfd can certainly drop
   mmap_lock.  So we cannot assume the vma still exists upon return.
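
Point 2 amounts to a general rule: once a callee may have dropped the lock that
protects a lookup result, the caller has to redo the lookup instead of reusing
the cached pointer.  A minimal userspace sketch of that rule follows.
Everything in it (struct model_mm, model_find_vma(), model_fault(),
model_follow_page() and the "locked" out-parameter) is hypothetical and only
imitates the shape of the problem; it is not a proposal for how
follow_hugetlb_page should be written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_VMAS 4

struct model_vma { unsigned long start, end; bool live; };

struct model_mm {
	struct model_vma vmas[MODEL_NR_VMAS];
	bool lock_held;			/* stand-in for mmap_lock */
};

/* Look up the live vma covering addr; the caller must hold the lock. */
static struct model_vma *model_find_vma(struct model_mm *mm, unsigned long addr)
{
	assert(mm->lock_held);
	for (size_t i = 0; i < MODEL_NR_VMAS; i++) {
		struct model_vma *v = &mm->vmas[i];

		if (v->live && addr >= v->start && addr < v->end)
			return v;
	}
	return NULL;
}

/*
 * Hypothetical fault handler.  With drop_lock set it releases the lock,
 * imitates another thread unmapping the faulting range in the meantime,
 * and reports through *locked that the caller no longer holds the lock.
 */
static void model_fault(struct model_mm *mm, unsigned long addr,
			bool drop_lock, bool *locked)
{
	if (!drop_lock)
		return;
	mm->lock_held = false;
	*locked = false;
	for (size_t i = 0; i < MODEL_NR_VMAS; i++) {
		struct model_vma *v = &mm->vmas[i];

		if (v->live && addr >= v->start && addr < v->end)
			v->live = false;	/* the "concurrent munmap" */
	}
}

/* Returns true if addr is still mapped once the fault has been handled. */
static bool model_follow_page(struct model_mm *mm, unsigned long addr,
			      bool drop_lock)
{
	bool locked = true;
	struct model_vma *vma;

	mm->lock_held = true;
	vma = model_find_vma(mm, addr);
	if (vma) {
		model_fault(mm, addr, drop_lock, &locked);
		if (!locked) {
			/* The cached vma may be stale: relock and look it up again. */
			mm->lock_held = true;
			vma = model_find_vma(mm, addr);
		}
	}
	mm->lock_held = false;
	return vma != NULL;
}
```

In this model a call with drop_lock set returns false for an address that was
mapped on entry, because the second lookup no longer finds a live vma; reusing
the first pointer would have been the analogue of the use-after-free.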

-- 
Mike Kravetz


