All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: intel-gfx <intel-gfx@lists.freedesktop.org>,
	"Christian König" <christian.koenig@amd.com>,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback
Date: Thu, 31 Aug 2017 09:59:53 -0400	[thread overview]
Message-ID: <20170831135953.GA9227@redhat.com> (raw)
In-Reply-To: <f46585ee-7a89-1d68-5698-b6de76152cde@amd.com>

[Adding Intel folks as they might be interested in this discussion]

On Wed, Aug 30, 2017 at 05:51:52PM -0400, Felix Kuehling wrote:
> Hi Jérôme,
> 
> I have some questions about the potential range-start-end race you
> mentioned.
> 
> On 2017-08-29 07:54 PM, Jérôme Glisse wrote:
> > Note that a lot of existing user feels broken in respect to range_start/
> > range_end. Many user only have range_start() callback but there is nothing
> > preventing them to undo what was invalidated in their range_start() callback
> > after it returns but before any CPU page table update take place.
> >
> > The code pattern use in kvm or umem odp is an example on how to properly
> > avoid such race. In a nutshell use some kind of sequence number and active
> > range invalidation counter to block anything that might undo what the
> > range_start() callback did.
> What happens when we start monitoring an address range after
> invaligate_range_start was called? Sounds like we have to keep track of
> all active invalidations for just such a case, even in address ranges
> that we don't currently care about.
> 
> What are the things we cannot do between invalidate_range_start and
> invalidate_range_end? amdgpu calls get_user_pages to re-validate our
> userptr mappings after the invalidate_range_start notifier invalidated
> it. Do we have to wait for invalidate_range_end before we can call
> get_user_pages safely?

Well the whole userptr bo object is somewhat broken from the start.
You never defined the semantic of it ie what is expected. I can
think of 2 differents semantics:
  A) a uptr buffer object is a snapshot of a memory at the time of
     uptr buffer object creation
  B) a uptr buffer object allow GPU to access a range of virtual
     address of a process an share coherent view of that range
     between CPU and GPU

As it was implemented it is more inline with B but it is not defined
anywhere AFAICT.

Anyway getting back to your questions, it kind of doesn't matter as
you are using GUP ie you are pinning pages except for one scenario
(at least i can only think of one).

Problematic case is race between CPU write to zero page or COW and
GPU driver doing read only GUP:

    CPU thread 1                       | CPU thread 2
    ---------------------------------------------------------------------
                                       |
                                       | uptr covering addr A read only
                                       | .... do stuff with A
    write fault to addr A              |
    invalidate_range_start([A, A+1])   | unbind_ttm -> unpin
                                       | validate bo -> GUP -> zero page
    lock page table                    |
    replace zero pfn/COW with new page |
    unlock page table                  |
    invalidate_range_end([A, A+1])     |

So here the GPU would be using wrong page for the address. How bad
is it is undefined as the semantic of uptr is undefine. Given how it
as been use so far this race is unlikely (i don't think we have many
userspace that use that feature and do fork).


So i would first define the semantic of uptr bo and then i would fix
accordingly the code. Semantic A is easier to implement and you could
just drop the whole mmu_notifier. Maybe it is better to create uptr
buffer object everytime you want to snapshot a range of address. I
don't think the overhead of buffer creation would matter.


If you want to close the race for COW and zero page in case of read
only GUP there is no other way than what KVM or ODP is doing. I had
patchset to simplify all this but i need to bring it back to life.


Note that other thing might race but as you pin the pages they do
not matter. It just mean that if you GUP after range_start() but
before range_end() and before CPU page table update then you pinned
the same old page again and nothing will happen (migrate will fail,
MADV_FREE will nop, ...). So you just did the range_start() callback
for nothing in those cases.

(Sorry for taking so long to answer i forgot your mail yesterday with
 all the other discussion going on).

Cheers,
Jérôme
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2017-08-31 13:59 UTC|newest]

Thread overview: 161+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-29 23:54 [PATCH 00/13] mmu_notifier kill invalidate_page callback Jérôme Glisse
2017-08-29 23:54 ` Jérôme Glisse
2017-08-29 23:54 ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 01/13] dax: update to new mmu_notifier semantic Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 02/13] mm/rmap: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-30  2:46   ` Nadav Amit
2017-08-30  2:46     ` Nadav Amit
2017-08-30  2:59     ` Jerome Glisse
2017-08-30  2:59       ` Jerome Glisse
2017-08-30  3:16       ` Nadav Amit
2017-08-30  3:16         ` Nadav Amit
2017-08-30  3:18         ` Nadav Amit
2017-08-30  3:18           ` Nadav Amit
2017-08-30 17:27     ` Andrea Arcangeli
2017-08-30 17:27       ` Andrea Arcangeli
2017-08-30 18:00       ` Nadav Amit
2017-08-30 18:00         ` Nadav Amit
2017-08-30 21:25         ` Andrea Arcangeli
2017-08-30 21:25           ` Andrea Arcangeli
2017-08-30 23:25           ` Nadav Amit
2017-08-30 23:25             ` Nadav Amit
2017-08-31  0:47             ` Jerome Glisse
2017-08-31  0:47               ` Jerome Glisse
2017-08-31  0:47               ` Jerome Glisse
     [not found]               ` <20170831004719.GF9445-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-31 17:12                 ` Andrea Arcangeli
2017-08-31 17:12                   ` Andrea Arcangeli
2017-08-31 17:12                   ` Andrea Arcangeli
2017-08-31 19:15                   ` Nadav Amit
2017-08-31 19:15                     ` Nadav Amit
2017-08-30 18:20       ` Jerome Glisse
2017-08-30 18:20         ` Jerome Glisse
2017-08-30 18:40         ` Nadav Amit
2017-08-30 18:40           ` Nadav Amit
2017-08-30 20:45           ` Jerome Glisse
2017-08-30 20:45             ` Jerome Glisse
2017-08-30 22:17             ` Andrea Arcangeli
2017-08-30 22:17               ` Andrea Arcangeli
2017-08-30 20:55           ` Andrea Arcangeli
2017-08-30 20:55             ` Andrea Arcangeli
2017-08-30 16:52   ` Andrea Arcangeli
2017-08-30 16:52     ` Andrea Arcangeli
2017-08-30 17:48     ` Jerome Glisse
2017-08-30 17:48       ` Jerome Glisse
2017-08-30 21:53     ` Linus Torvalds
2017-08-30 21:53       ` Linus Torvalds
2017-08-30 23:01       ` Andrea Arcangeli
2017-08-30 23:01         ` Andrea Arcangeli
2017-08-31 18:25         ` Jerome Glisse
2017-08-31 18:25           ` Jerome Glisse
2017-08-31 19:40           ` Linus Torvalds
2017-08-31 19:40             ` Linus Torvalds
2017-08-29 23:54 ` [PATCH 03/13] powerpc/powernv: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 04/13] drm/amdgpu: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-30  6:18   ` Christian König
2017-08-30  6:18     ` Christian König
2017-08-29 23:54 ` [PATCH 05/13] IB/umem: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-30  6:13   ` Leon Romanovsky
2017-08-29 23:54 ` [PATCH 06/13] IB/hfi1: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-09-06 14:08   ` Arumugam, Kamenee
2017-08-29 23:54 ` [PATCH 08/13] iommu/intel: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 09/13] misc/mic/scif: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 10/13] sgi-gru: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 11/13] xen/gntdev: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-30 19:32   ` Boris Ostrovsky
2017-08-30 19:32   ` Boris Ostrovsky
2017-08-29 23:54 ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 12/13] KVM: " Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-29 23:54 ` [PATCH 13/13] mm/mmu_notifier: kill invalidate_page Jérôme Glisse
2017-08-29 23:54   ` Jérôme Glisse
2017-08-30  0:11 ` [PATCH 00/13] mmu_notifier kill invalidate_page callback Linus Torvalds
     [not found] ` <20170829235447.10050-1-jglisse-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-29 23:54   ` [PATCH 07/13] iommu/amd: update to new mmu_notifier semantic Jérôme Glisse
2017-08-29 23:54     ` Jérôme Glisse
2017-08-29 23:54     ` Jérôme Glisse
2017-08-30  0:11   ` [PATCH 00/13] mmu_notifier kill invalidate_page callback Linus Torvalds
2017-08-30  0:11     ` Linus Torvalds
2017-08-30  0:11     ` Linus Torvalds
2017-08-30  0:11     ` Linus Torvalds
2017-08-30  0:56     ` Jerome Glisse
2017-08-30  0:56       ` Jerome Glisse
2017-08-30  0:56       ` Jerome Glisse
2017-08-30  8:40       ` Mike Galbraith
2017-08-30  8:40         ` Mike Galbraith
2017-08-30  8:40         ` Mike Galbraith
2017-08-30  8:40         ` Mike Galbraith
2017-08-30  8:40       ` Mike Galbraith
2017-08-30 14:57       ` Adam Borowski
2017-08-30 14:57         ` Adam Borowski
2017-08-30 14:57         ` Adam Borowski
2017-09-01 14:47         ` Jeff Cook
2017-09-01 14:47         ` Jeff Cook
2017-09-01 14:47           ` Jeff Cook
2017-09-01 14:47           ` Jeff Cook
2017-09-01 14:47           ` Jeff Cook
2017-09-01 14:50           ` taskboxtester
2017-09-01 14:50             ` taskboxtester
2017-08-30 14:57       ` Adam Borowski
2017-08-30  0:56     ` Jerome Glisse
2017-08-30 21:51   ` Felix Kuehling
2017-08-31 13:59     ` Jerome Glisse [this message]
     [not found]       ` <20170831135953.GA9227-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-31 14:14         ` Christian König
2017-08-31 18:39         ` Felix Kuehling
2017-08-31 19:00           ` Jerome Glisse
     [not found]             ` <20170831190021.GG9227-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-08-31 23:19               ` Felix Kuehling
2017-08-31 23:29                 ` Jerome Glisse
2017-11-30  9:33 ` BSOD with " Fabian Grünbichler
2017-11-30  9:33 ` Fabian Grünbichler
2017-11-30  9:33   ` Fabian Grünbichler
     [not found]   ` <20171130093320.66cxaoj45g2ttzoh-aVfaTQcAavps8ZkLLAvlZtBPR1lH4CV8@public.gmane.org>
2017-11-30 11:20     ` Paolo Bonzini
2017-11-30 11:20       ` Paolo Bonzini
2017-11-30 11:20       ` Paolo Bonzini
2017-11-30 16:19       ` Radim Krčmář
2017-11-30 16:19         ` Radim Krčmář
2017-11-30 18:05         ` [PATCH 1/2] KVM: x86: fix APIC page invalidation Radim Krčmář
2017-11-30 18:05           ` Radim Krčmář
2017-11-30 18:05           ` Radim Krčmář
2017-11-30 18:05           ` [PATCH 2/2] TESTING! KVM: x86: add invalidate_range mmu notifier Radim Krčmář
2017-11-30 18:05             ` Radim Krčmář
2017-12-01 15:15             ` Paolo Bonzini
2017-12-01 15:15               ` Paolo Bonzini
2017-12-01 15:15               ` Paolo Bonzini
2017-12-03 17:24               ` Andrea Arcangeli
2017-12-03 17:24                 ` Andrea Arcangeli
2017-12-03 17:24                 ` Andrea Arcangeli
2017-12-01 12:21           ` [PATCH 1/2] KVM: x86: fix APIC page invalidation Fabian Grünbichler
2017-12-01 12:21             ` Fabian Grünbichler
2017-12-01 15:27           ` Paolo Bonzini
2017-12-01 15:27             ` Paolo Bonzini
2017-12-03 17:28           ` Andrea Arcangeli
2017-12-03 17:28             ` Andrea Arcangeli
2017-12-03 17:28             ` Andrea Arcangeli
2017-12-06  2:32           ` Wanpeng Li
2017-12-06  2:32             ` Wanpeng Li
2017-12-06  9:50             ` 王金浦
2017-12-06  9:50               ` 王金浦
2017-12-06 10:00               ` Paolo Bonzini
2017-12-06 10:00                 ` Paolo Bonzini
2017-12-06 10:00                 ` Paolo Bonzini
2017-12-06  8:15           ` Fabian Grünbichler
2017-12-06  8:15             ` Fabian Grünbichler
2017-12-06  8:15             ` Fabian Grünbichler
2017-12-13 12:54           ` Richard Purdie
2017-12-13 12:54             ` Richard Purdie
2017-12-13 12:54             ` Richard Purdie
2017-11-30 16:19       ` BSOD with [PATCH 00/13] mmu_notifier kill invalidate_page callback Radim Krčmář
2017-11-30 11:20   ` Paolo Bonzini
  -- strict thread matches above, loose matches on Subject: below --
2017-08-29 23:54 Jérôme Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170831135953.GA9227@redhat.com \
    --to=jglisse@redhat.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=felix.kuehling@amd.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.