All of lore.kernel.org
 help / color / mirror / Atom feed
From: Haggai Eran <haggaie@mellanox.com>
To: Jerome Glisse <j.glisse@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Mike Rapoport <mike.rapoport@ravellosystems.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Izik Eidus <izik.eidus@ravellosystems.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	Shachar Raindel <raindel@mellanox.com>
Subject: Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics
Date: Wed, 2 Apr 2014 15:52:45 +0300	[thread overview]
Message-ID: <533C081D.9050202@mellanox.com> (raw)
In-Reply-To: <20140330203328.GA4859@gmail.com>

On 03/30/2014 11:33 PM, Jerome Glisse wrote:
> On Wed, Jan 22, 2014 at 04:01:15PM +0200, Haggai Eran wrote:
>> I'm worried about the following scenario:
>>
>> Given a read-only page, suppose one host thread (thread 1) writes to
>> that page, and performs COW, but before it calls the
>> mmu_notifier_invalidate_page_if_missing_change_pte function another host
>> thread (thread 2) writes to the same page (this time without a page
>> fault). Then we have a valid entry in the secondary page table to a
>> stale page, and someone (thread 3) may read stale data from there.
>>
>> Here's a diagram that shows this scenario:
>>
>> Thread 1                                | Thread 2        | Thread 3
>> ========================================================================
>> do_wp_page(page 1)                      |                 |
>>    ...                                   |                 |
>>    set_pte_at_notify                     |                 |
>>    ...                                   | write to page 1 |
>>                                          |                 | stale access
>>    pte_unmap_unlock                      |                 |
>>    invalidate_page_if_missing_change_pte |                 |
>>
>> This is currently prevented by the use of the range start and range end
>> notifiers.
>>
>> Do you agree that this scenario is possible with the new patch, or am I
>> missing something?
>>
> I believe you are right, but of all the upstream user of the mmu_notifier
> API only xen would suffer from this ie any user that do not have a proper
> change_pte callback can see the bogus scenario you describe above.
Yes. I sent our RDMA paging RFC patch-set on linux-rdma [1] last month, 
and it would also suffer from this scenario, but it's not upstream yet.
> The issue i see is with user that want to/or might sleep when they are
> invalidation the secondary page table. The issue being that change_pte is
> call with the cpu page table locked (well at least for the affected pmd).
>
> I would rather keep the invalidate_range_start/end bracket around change_pte
> and invalidate page. I think we can fix the kvm regression by other means.
Perhaps another possibility would be to do the 
invalidate_range_start/end bracket only when the mmu_notifier is missing 
a change_pte implementation.

Best regards,
Haggai

[1] http://www.spinics.net/lists/linux-rdma/msg18906.html

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Haggai Eran <haggaie@mellanox.com>
To: Jerome Glisse <j.glisse@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Mike Rapoport <mike.rapoport@ravellosystems.com>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	Izik Eidus <izik.eidus@ravellosystems.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	Shachar Raindel <raindel@mellanox.com>
Subject: Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics
Date: Wed, 2 Apr 2014 15:52:45 +0300	[thread overview]
Message-ID: <533C081D.9050202@mellanox.com> (raw)
In-Reply-To: <20140330203328.GA4859@gmail.com>

On 03/30/2014 11:33 PM, Jerome Glisse wrote:
> On Wed, Jan 22, 2014 at 04:01:15PM +0200, Haggai Eran wrote:
>> I'm worried about the following scenario:
>>
>> Given a read-only page, suppose one host thread (thread 1) writes to
>> that page, and performs COW, but before it calls the
>> mmu_notifier_invalidate_page_if_missing_change_pte function another host
>> thread (thread 2) writes to the same page (this time without a page
>> fault). Then we have a valid entry in the secondary page table to a
>> stale page, and someone (thread 3) may read stale data from there.
>>
>> Here's a diagram that shows this scenario:
>>
>> Thread 1                                | Thread 2        | Thread 3
>> ========================================================================
>> do_wp_page(page 1)                      |                 |
>>    ...                                   |                 |
>>    set_pte_at_notify                     |                 |
>>    ...                                   | write to page 1 |
>>                                          |                 | stale access
>>    pte_unmap_unlock                      |                 |
>>    invalidate_page_if_missing_change_pte |                 |
>>
>> This is currently prevented by the use of the range start and range end
>> notifiers.
>>
>> Do you agree that this scenario is possible with the new patch, or am I
>> missing something?
>>
> I believe you are right, but of all the upstream user of the mmu_notifier
> API only xen would suffer from this ie any user that do not have a proper
> change_pte callback can see the bogus scenario you describe above.
Yes. I sent our RDMA paging RFC patch-set on linux-rdma [1] last month, 
and it would also suffer from this scenario, but it's not upstream yet.
> The issue i see is with user that want to/or might sleep when they are
> invalidation the secondary page table. The issue being that change_pte is
> call with the cpu page table locked (well at least for the affected pmd).
>
> I would rather keep the invalidate_range_start/end bracket around change_pte
> and invalidate page. I think we can fix the kvm regression by other means.
Perhaps another possibility would be to do the 
invalidate_range_start/end bracket only when the mmu_notifier is missing 
a change_pte implementation.

Best regards,
Haggai

[1] http://www.spinics.net/lists/linux-rdma/msg18906.html

  reply	other threads:[~2014-04-02 12:52 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-15  9:40 [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics Mike Rapoport
2014-01-15  9:40 ` Mike Rapoport
2014-01-22 13:10 ` Andrea Arcangeli
2014-01-22 13:10   ` Andrea Arcangeli
2014-01-22 14:01   ` Haggai Eran
2014-01-22 14:01     ` Haggai Eran
2014-03-30 20:33     ` Jerome Glisse
2014-03-30 20:33       ` Jerome Glisse
2014-04-02 12:52       ` Haggai Eran [this message]
2014-04-02 12:52         ` Haggai Eran
2014-04-02 15:18         ` Jerome Glisse
2014-04-02 15:18           ` Jerome Glisse
2014-04-02 16:43           ` Andrea Arcangeli
2014-04-02 16:43             ` Andrea Arcangeli
2014-01-22 21:54 ` Andrew Morton
2014-01-22 21:54   ` Andrew Morton
2014-01-22 22:19   ` Andrea Arcangeli
2014-01-22 22:19     ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=533C081D.9050202@mellanox.com \
    --to=haggaie@mellanox.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aarcange@redhat.com \
    --cc=izik.eidus@ravellosystems.com \
    --cc=j.glisse@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.rapoport@ravellosystems.com \
    --cc=ogerlitz@mellanox.com \
    --cc=raindel@mellanox.com \
    --cc=sagig@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.