From: "Koenig, Christian" <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>
To: "Yang, Philip" <Philip.Yang-5C7GfCeVMHo@public.gmane.org>
Cc: Andrea Arcangeli
<aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Ralph Campbell
<rcampbell-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
John Hubbard <jhubbard-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>,
"Kuehling, Felix" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>,
"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>,
"linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org"
<linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>,
Jerome Glisse <jglisse-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Jason Gunthorpe <jgg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
"dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
<dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>,
Ben Skeggs <bskeggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking
Date: Thu, 17 Oct 2019 16:44:06 +0000
Message-ID: <da56ae92-1ddd-4f89-bb31-2a8a12eebc0a@email.android.com>

On 17.10.2019 at 18:26, "Yang, Philip" <Philip.Yang@amd.com> wrote:
On 2019-10-17 4:54 a.m., Christian König wrote:
> On 16.10.19 at 18:04, Jason Gunthorpe wrote:
>> On Wed, Oct 16, 2019 at 10:58:02AM +0200, Christian König wrote:
>>> On 15.10.19 at 20:12, Jason Gunthorpe wrote:
>>>> From: Jason Gunthorpe <jgg@mellanox.com>
>>>>
>>>> 8 of the mmu_notifier-using drivers (i915_gem, radeon_mn, umem_odp, hfi1,
>>>> scif_dma, vhost, gntdev, hmm) are using a common pattern where
>>>> they only use invalidate_range_start/end and immediately check the
>>>> invalidating range against some driver data structure to tell if the
>>>> driver is interested. Half of them use an interval_tree, the others are
>>>> simple linear search lists.
>>>>
>>>> Of the ones I checked they largely seem to have various kinds of races,
>>>> bugs and poor implementation. This is a result of the complexity in how
>>>> the notifier interacts with get_user_pages(). It is extremely
>>>> difficult to use it correctly.
>>>>
>>>> Consolidate all of this code together into the core mmu_notifier and
>>>> provide a locking scheme similar to hmm_mirror that allows the user to
>>>> safely use get_user_pages() and reliably know if the page list still
>>>> matches the mm.
>>> That sounds really good, but could you outline for a moment how that is
>>> achieved?
>> It uses the same basic scheme as hmm and rdma odp, outlined in the
>> revisions to hmm.rst later on.
>>
>> Basically,
>>
>> again:
>> seq = mmu_range_read_begin(&mrn);
>>
>> // This is a speculative region
>> .. get_user_pages()/hmm_range_fault() ..
>
> How do we enforce that this get_user_pages()/hmm_range_fault() doesn't
> see outdated page table information?
>
> In other words, how is the following race prevented:
>
> CPU A                           CPU B
> invalidate_range_start()
>                                 mmu_range_read_begin()
>                                 get_user_pages()/hmm_range_fault()
> Updating the ptes
> invalidate_range_end()
>
>
> I mean get_user_pages() tries to circumvent this issue by grabbing a
> reference to the pages in question, but that isn't sufficient for the
> SVM use case.
>
> That's the reason why we had this horrible solution with a r/w lock and
> a linked list of BOs in an interval tree.
>
> Regards,
> Christian.
get_user_pages/hmm_range_fault() and invalidate_range_start() both are
called while holding mm->mmap_sem, so they are always serialized.

Not even remotely.

For calling get_user_pages()/hmm_range_fault() you only need to hold the mmap_sem in read mode.
And IIRC invalidate_range_start() is sometimes called without holding the mmap_sem at all.

So again how are they serialized?
Regards,
Christian.
Philip
>
>> // Result cannot be dereferenced
>>
>> take_lock(driver->update);
>> if (mmu_range_read_retry(&mrn, seq)) {
>>     // collision! The results are not correct
>>     goto again;
>> }
>>
>> // no collision, and now under lock. Now we can de-reference the pages/etc
>> // program HW
>> // Now the invalidate callback is responsible to synchronize against changes
>> unlock(driver->update);
>>
>> Basically, anything that was using hmm_mirror correctly transitions
>> over fairly trivially, just with the modification to store a sequence
>> number to close that race described in the hmm commit.
>>
>> For something like AMD gpu I expect it to transition to use dma_fence
>> from the notifier for coherency right before it unlocks driver->update.
>>
>> Jason
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
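
For illustration only, here is a single-threaded userspace model of how a sequence number checked under the driver lock could reject the CPU A / CPU B interleaving asked about above. It is a sketch under stated assumptions, not the code from the patch series: every sketch_* name is hypothetical, a pthread mutex stands in for driver->update, and the "odd sequence means an invalidation is in flight" convention is an assumption of this model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; every sketch_* name is invented here. */
struct sketch_notifier {
	atomic_ulong seq;	/* odd while an invalidation is in flight */
	pthread_mutex_t update;	/* plays the role of driver->update */
};

static void sketch_init(struct sketch_notifier *n)
{
	int rc = pthread_mutex_init(&n->update, NULL);

	assert(rc == 0);
	(void)rc;
	atomic_init(&n->seq, 0);
}

/* Bump the sequence under the driver lock; used for both start and end. */
static void sketch_bump(struct sketch_notifier *n)
{
	pthread_mutex_lock(&n->update);
	atomic_fetch_add(&n->seq, 1);
	pthread_mutex_unlock(&n->update);
}

static void sketch_invalidate_start(struct sketch_notifier *n)
{
	sketch_bump(n);		/* seq becomes odd */
}

static void sketch_invalidate_end(struct sketch_notifier *n)
{
	sketch_bump(n);		/* seq becomes even again */
}

static unsigned long sketch_read_begin(struct sketch_notifier *n)
{
	return atomic_load(&n->seq);
}

/* Caller holds n->update. True means the speculative result must be discarded. */
static bool sketch_read_retry(struct sketch_notifier *n, unsigned long seq)
{
	return (seq & 1) || atomic_load(&n->seq) != seq;
}

/* Replays the interleaving on one thread; returns whether "CPU B" must retry. */
static bool sketch_replay(bool invalidate_during_read)
{
	struct sketch_notifier n;
	unsigned long seq;
	bool retry;

	sketch_init(&n);

	if (invalidate_during_read)
		sketch_invalidate_start(&n);	/* CPU A */

	seq = sketch_read_begin(&n);		/* CPU B */
	/* CPU B: get_user_pages()/hmm_range_fault() would run here. */

	if (invalidate_during_read)
		sketch_invalidate_end(&n);	/* CPU A, after updating the ptes */

	pthread_mutex_lock(&n.update);		/* take_lock(driver->update) */
	retry = sketch_read_retry(&n, seq);
	pthread_mutex_unlock(&n.update);	/* unlock(driver->update) */

	pthread_mutex_destroy(&n.update);
	return retry;
}
```

In this model sketch_replay(true) returns true, because the sequence read by the speculative side was odd, and sketch_replay(false) returns false. The ordering comes from the sequence check under the update lock rather than from mmap_sem.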