linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	kernel test robot <yujie.liu@intel.com>,
	"Fabio M. De Francesco" <fmdefrancesco@gmail.com>,
	lkp@lists.01.org, lkp@intel.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [shmem] 7a7256d5f5: WARNING:possible_recursive_locking_detected
Date: Fri, 21 Oct 2022 20:17:43 -0400	[thread overview]
Message-ID: <Y1M2p9OtBGnKwGUE@x1n> (raw)
In-Reply-To: <Y1MymJ/INb45AdaY@iweiny-desk3>

On Fri, Oct 21, 2022 at 05:00:24PM -0700, Ira Weiny wrote:
> On Fri, Oct 21, 2022 at 03:48:57PM -0700, Ira wrote:
> > On Fri, Oct 21, 2022 at 01:30:41PM -0700, Andrew Morton wrote:
> > > On Fri, 21 Oct 2022 14:09:16 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> > > 
> > > > On Fri, Oct 21, 2022 at 12:10:17PM +0800, kernel test robot wrote:
> > > > > FYI, we noticed WARNING:possible_recursive_locking_detected due to commit (built with gcc-11):
> > > > > 
> > > > > commit: 7a7256d5f512b6c17957df7f59cf5e281b3ddba3 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > Ummm.  Looks to me like this now occurs because of this part of the
> > > > change:
> > > > 
> > > >                 if (!zeropage) {        /* COPY */
> > > > -                       page_kaddr = kmap_atomic(page);
> > > > +                       page_kaddr = kmap_local_folio(folio, 0);
> > > >                         ret = copy_from_user(page_kaddr,
> > > >                                              (const void __user *)src_addr,
> > > >                                              PAGE_SIZE);
> > > > -                       kunmap_atomic(page_kaddr);
> > > > +                       kunmap_local(page_kaddr);
> > > > 
> > > > Should I be using __copy_from_user_inatomic() here?
> > 
> > I would say not.  I'm curious why copy_from_user() was safe (at least did not
> > fail the checkers).  :-/
> > 
> > > 
> > > Caller __mcopy_atomic() is holding mmap_read_lock(dst_mm) and this
> > > copy_from_user() calls
> > > might_fault()->might_lock_read(current->mm->mmap_lock).
> > > 
> > > And I guess might_lock_read() gets upset because we're holding another
> > > mm's mmap_lock.  Which sounds OK to me, unless a concurrent
> > > mmap_write_lock() could jam things up.
> > > 
> > > But I cannot see why your patch would suddenly trigger this warning -
> > > kmap_local_folio() and kmap_atomic() are basically the same thing.
> > 
> > It is related to your patch but I think what you did made sense on the surface.
> > 
> > On the surface copy_from_user() should not require pagefaults to be disabled.
> > But that side affect of kmap_atomic() was being used here because it looks like
> > the code is designed to fallback if the fault was not allowed:[1]
> > 
> > mm/shmem.c
> > ...
> >                         page_kaddr = kmap_local_folio(folio, 0);
> >                         ret = copy_from_user(page_kaddr,
> >                                              (const void __user *)src_addr,
> >                                              PAGE_SIZE);
> >                         kunmap_local(page_kaddr);
> > 
> >                         /* fallback to copy_from_user outside mmap_lock */
> >                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >                         if (unlikely(ret)) {
> >                                 *pagep = &folio->page;
> >                                 ret = -ENOENT;
> >                                 /* don't free the page */
> >                                 goto out_unacct_blocks;
> >                         }
> > ...
> > 
> > So this is one of those rare places where the kmap_atomic() side effects were
> > being depended on...  :-(
> > 
> > [1] might_fault() does not actually mean the code completes the fault.
> > 
> > mm/memory.c
> > ...
> > void __might_fault(const char *file, int line)
> > {
> >         if (pagefault_disabled())
> >                 return;
> > ...
> > 
> > > 
> > > I see that __mcopy_atomic() is using plain old kmap(), perhaps to work
> > > around this?  But that's 2015 code and I'm not sure we had such
> > > detailed lock checking in those days.
> > 
> > No kmap() can't work around this.  That works because the lock is released just
> > above that.
> > 
> > mm/userfaultfd.c
> > ...
> >                         mmap_read_unlock(dst_mm);
> >                         BUG_ON(!page);
> > 
> >                         page_kaddr = kmap(page);
> >                         err = copy_from_user(page_kaddr,
> >                                              (const void __user *) src_addr,
> >                                              PAGE_SIZE);
> >                         kunmap(page);
> > ...
> > 
> > So I think the correct solution is below because we want to prevent the page
> > fault.
> 
> I was about to get this patch ready to send when I found this:
> 
> commit b6ebaedb4cb1a18220ae626c3a9e184ee39dd248
> Author: Andrea Arcangeli <aarcange@redhat.com>
> Date:   Fri Sep 4 15:47:08 2015 -0700
> 
>     userfaultfd: avoid mmap_sem read recursion in mcopy_atomic
> 
>     If the rwsem starves writers it wasn't strictly a bug but lockdep
>     doesn't like it and this avoids depending on lowlevel implementation  
>     details of the lock.
>     
>     [akpm@linux-foundation.org: delete weird BUILD_BUG_ON()]
>     Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
>     Acked-by: Pavel Emelyanov <xemul@parallels.com>
> ...
> 
> So I wonder if the true fix is something to lockdep?

I think lockdep used to complain because we can be taking the same mmap_sem
twice in this case (the 2nd one during the useraddr page fault).  So to
answer the other question - yeah the current->mm and dest_mm can definitely
be the same one in this context.

> 
> Regardless I'll send the below patch because it will restore things to a
> working order.
> 
> But I'm CC'ing Andrea for comments.

Open-code disabling of pagefault sounds okay to me.  pagefault_disable()
used to be covering the kmap procedure too as done in kmap_atomic(), but
frankly I don't know whether there's a real difference.

Yeah, let's see whether we can get a confirmation from Andrea.

Thanks,

-- 
Peter Xu



      parent reply	other threads:[~2022-10-22  0:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-21  4:10 [shmem] 7a7256d5f5: WARNING:possible_recursive_locking_detected kernel test robot
2022-10-21 13:09 ` Matthew Wilcox
2022-10-21 20:30   ` Andrew Morton
2022-10-21 22:36     ` Matthew Wilcox
2022-10-21 22:48     ` Ira Weiny
2022-10-22  0:00       ` Ira Weiny
2022-10-22  0:02         ` Randy Dunlap
2022-10-22  0:08           ` Ira Weiny
2022-10-22  0:17         ` Peter Xu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1M2p9OtBGnKwGUE@x1n \
    --to=peterx@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=fmdefrancesco@gmail.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=lkp@lists.01.org \
    --cc=willy@infradead.org \
    --cc=yujie.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).