linux-mm.kvack.org archive mirror
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Felix.Kuehling@amd.com, linux-rdma@vger.kernel.org,
	linux-mm@kvack.org, Andrea Arcangeli <aarcange@redhat.com>,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
Date: Fri, 7 Jun 2019 23:12:45 -0300
Message-ID: <20190608021245.GD7844@ziepe.ca>
In-Reply-To: <61ea869d-43d2-d1e5-dc00-cf5e3e139169@nvidia.com>

On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote:
> 
> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > hmm_release() is called exactly once per hmm. ops->release() cannot
> > accidentally trigger any action that would recurse back onto
> > hmm->mirrors_sem.
> > 
> > This fixes a use-after-free race of the form:
> > 
> >         CPU0                                   CPU1
> >                                             hmm_release()
> >                                               up_write(&hmm->mirrors_sem);
> >   hmm_mirror_unregister(mirror)
> >    down_write(&hmm->mirrors_sem);
> >    up_write(&hmm->mirrors_sem);
> >    kfree(mirror)
> >                                               mirror->ops->release(mirror)
> > 
> > The only user we have today for ops->release is an empty function, so this
> > is unambiguously safe.
> > 
> > As a consequence of plugging this race drivers are not allowed to
> > register/unregister mirrors from within a release op.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> 
> I agree with the analysis above but I'm not sure that release() will
> always be an empty function. It might be more efficient to write back
> all data migrated to a device "in one pass" instead of relying
> on unmap_vmas() calling hmm_invalidate_range_start() per VMA.

Sure, but it should not be allowed to recurse back to
hmm_mirror_unregister.

> I think the bigger issue is potential deadlocks while calling
> sync_cpu_device_pagetables() and tasks calling hmm_mirror_unregister():
>
> Say you have three threads:
> - Thread A is in try_to_unmap(), either without holding mmap_sem or with
> mmap_sem held for read.
> - Thread B has some unrelated driver calling hmm_mirror_unregister().
> This doesn't require mmap_sem.
> - Thread C is about to call migrate_vma().
>
> Thread A                Thread B                 Thread C
> try_to_unmap            hmm_mirror_unregister    migrate_vma
> hmm_invalidate_range_start
> down_read(mirrors_sem)
>                         down_write(mirrors_sem)
>                         // Blocked on A
>                                                   device_lock
> device_lock
> // Blocked on C
>                                                   migrate_vma()
>                                                   hmm_invalidate_range_s
>                                                   down_read(mirrors_sem)
>                                                   // Blocked on B
>                                                   // Deadlock

Oh... I didn't know that rwsems in Linux have a fairness policy where
a waiting writer blocks future readers.
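
To make the ordering concrete, here is a minimal single-threaded sketch
of that fairness rule. It is a toy model and not the kernel's rwsem
implementation: struct toy_rwsem and every toy_* function are
hypothetical names made up for illustration, and "blocking" is modelled
as a false return value instead of actually sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded model of a writer-fair rwsem. Not kernel code. */
struct toy_rwsem {
	int active_readers;  /* readers currently holding the lock */
	bool writer_active;  /* a writer currently holds the lock */
	int queued_writers;  /* writers waiting for the readers to drain */
};

/* Returns true if the reader gets the lock now, false if it would block. */
static bool toy_down_read(struct toy_rwsem *sem)
{
	/* Fairness rule: a queued writer blocks readers arriving after it. */
	if (sem->writer_active || sem->queued_writers > 0)
		return false;
	sem->active_readers++;
	return true;
}

/* Returns true if the writer gets the lock now; otherwise it is queued. */
static bool toy_down_write(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->active_readers > 0) {
		sem->queued_writers++;
		return false;
	}
	sem->writer_active = true;
	return true;
}

/* Drops one reader's hold on the lock. */
static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->active_readers > 0);
	sem->active_readers--;
}

/* Replays the A/B/C ordering: true if C's down_read blocks behind B. */
static bool toy_late_reader_blocks(void)
{
	struct toy_rwsem mirrors_sem = { 0 };
	bool a_holds = toy_down_read(&mirrors_sem);   /* Thread A reads */
	bool b_holds = toy_down_write(&mirrors_sem);  /* Thread B queues */
	bool c_holds = toy_down_read(&mirrors_sem);   /* Thread C arrives */

	return a_holds && !b_holds && !c_holds;
}

/* Control case: with no writer queued, two readers share the lock. */
static bool toy_readers_share(void)
{
	struct toy_rwsem mirrors_sem = { 0 };
	bool first = toy_down_read(&mirrors_sem);
	bool second = toy_down_read(&mirrors_sem);

	toy_up_read(&mirrors_sem);
	toy_up_read(&mirrors_sem);
	return first && second && mirrors_sem.active_readers == 0;
}
```

In this model toy_late_reader_blocks() reports that thread C cannot
take mirrors_sem for read while B is queued, even though only a reader
(A) holds it. Combined with A waiting on C's device_lock, that closes
the cycle from the diagram above.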

Still, at least as things are designed, the driver cannot hold a lock
it obtains under sync_cpu_device_pagetables() and nest other things in
that lock. It certainly can't recurse back into any mmu notifiers
while holding that lock. (as you point out)

The lock in sync_cpu_device_pagetables() needs to be very narrowly
focused on updating device state only.
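
As a rough userspace illustration of "narrowly focused", the sketch
below uses a pthread mutex in place of a driver lock. Everything in it
is an assumption for illustration only: struct toy_device,
toy_sync_pagetables() and toy_take_pending() are hypothetical names,
not part of hmm or of any real driver. The callback only records the
invalidated range under the lock, and the slow path snapshots that
state and drops the lock before doing anything else.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct toy_device {
	pthread_mutex_t lock;  /* protects only the three fields below */
	unsigned long start;   /* start of the pending invalidation */
	unsigned long end;     /* end of the pending invalidation */
	bool pending;          /* true if start/end are valid */
};

/* Hypothetical stand-in for a driver's sync_cpu_device_pagetables(). */
static void toy_sync_pagetables(struct toy_device *dev,
				unsigned long start, unsigned long end)
{
	assert(start < end);

	pthread_mutex_lock(&dev->lock);
	/* Only device state is updated here; nothing else is called. */
	if (!dev->pending) {
		dev->start = start;
		dev->end = end;
		dev->pending = true;
	} else {
		/* Merge with the range that is already pending. */
		if (start < dev->start)
			dev->start = start;
		if (end > dev->end)
			dev->end = end;
	}
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Snapshot and clear the pending range. The caller does its slow work
 * (anything that may take other locks) after the lock is dropped.
 */
static bool toy_take_pending(struct toy_device *dev,
			     unsigned long *start, unsigned long *end)
{
	bool had_pending;

	pthread_mutex_lock(&dev->lock);
	had_pending = dev->pending;
	if (had_pending) {
		*start = dev->start;
		*end = dev->end;
		dev->pending = false;
	}
	pthread_mutex_unlock(&dev->lock);
	return had_pending;
}
```

The point of the split is that nothing which can reach an mmu notifier,
such as the migrate_vma() call in thread C, ever runs with the device
lock held.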

So, my first reaction is that the driver in thread C is wrong, and
needs a different locking scheme. I think you'd have to make a really
good case that there is no alternative for a driver.

> Perhaps we should consider using SRCU for walking the mirror->list?

It means the driver has to deal with races like in this patch
description. At that point there is almost no reason to insert hmm
here, just use mmu notifiers directly.

Drivers won't get this right, it is too hard.

Jason

