From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Ram Pai <linuxram@us.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kvm-ppc@vger.kernel.org, npiggin@gmail.com, paulus@ozlabs.org,
	leonardo@linux.ibm.com, kirill@shutemov.name,
	Ram Pai <linuxram@linux.ibm.com>
Subject: Re: [PATCH v2 01/22] powerpc/pkeys: Avoid using lockless page table walk
Date: Sun, 05 Apr 2020 13:49:40 +0000
Message-ID: <87h7xyjbob.fsf@linux.ibm.com>
In-Reply-To: <20200403002649.GB22412@oc0525413822.ibm.com>

Ram Pai <linuxram@us.ibm.com> writes:

> On Thu, Mar 19, 2020 at 09:25:48AM +0530, Aneesh Kumar K.V wrote:
>> Fetch pkey from vma instead of linux page table. Also document the fact that in
>> some cases the pkey returned in siginfo won't be the same as the one we took
>> keyfault on. Even with linux page table walk, we can end up in a similar scenario.
>
> There is no way to correctly ensure that the key returned through
> siginfo is actually the key that took the fault.  Either get it
> from page table or get it from the corresponding vma.

That is correct.

>
> So we had to choose the lesser evil. Getting it from the page table was
> faster, and did not involve taking any locks.

That is because you are skipping the locks that need to be held on a page
table walk.

> Getting it from the vma
> was slower, since it needed locks.  Also I faintly recall, there
> is a scenario where the address that gets a key fault, has no
> corresponding VMA associated with it yet.

I would be interested in this. For now, IIUC, even x86 fetches the key from
the VMA.
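
To illustrate what "fetching the key from the VMA" amounts to, here is a
minimal user-space sketch. It is not kernel code: the struct, the helper
names, and the bit layout (a 5-bit key stored from bit 32 of a 64-bit flags
word) are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only: 5 key bits starting at bit 32. */
#define MODEL_PKEY_SHIFT 32
#define MODEL_PKEY_BITS  5
#define MODEL_PKEY_MASK \
	(((UINT64_C(1) << MODEL_PKEY_BITS) - 1) << MODEL_PKEY_SHIFT)

/* Hypothetical stand-in for a VMA; only the flags word is modelled. */
struct model_vma_flags {
	uint64_t vm_flags;
};

/* Read the key back out of the flags word. No page table is consulted. */
static int model_vma_pkey(const struct model_vma_flags *vma)
{
	return (int)((vma->vm_flags & MODEL_PKEY_MASK) >> MODEL_PKEY_SHIFT);
}

/* Store a key in the flags word, leaving all other flag bits untouched. */
static void model_vma_set_pkey(struct model_vma_flags *vma, int pkey)
{
	assert(pkey >= 0 && pkey < (1 << MODEL_PKEY_BITS));
	vma->vm_flags &= ~MODEL_PKEY_MASK;
	vma->vm_flags |= (uint64_t)pkey << MODEL_PKEY_SHIFT;
}
```

The point of the sketch is only that the lookup depends on the VMA alone, so
it needs whatever protects the VMA but no page table walk.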

>
> Hence the logic used was --
> 	if it is key-fault, then procure the key quickly
> 	from the page table.  In the unlikely event that the fault is
> 	something else, but still has a non-permissive key associated
> 	with it, get the key from the vma.


I am fixing that logic further in the next patch. I do have a test case
attached for that. We always check for the key in the vma and if it
allows access, then we retry.
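
A rough user-space model of that check, to make the intent concrete. This is
a sketch and not the patch itself: struct model_vma, key_denies_access(), and
the denied_mask bitmask (standing in for the thread's current key
permissions) are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a VMA: an address range plus its protection key. */
struct model_vma {
	unsigned long start;
	unsigned long end;
	int pkey;
};

enum fault_action {
	FAULT_RETRY,		/* key in the vma allows access now: retry */
	FAULT_SIGSEGV_PKUERR	/* key still denies access: report to user */
};

/* Bit i set in denied_mask means key i currently denies access (assumed). */
static bool key_denies_access(unsigned int denied_mask, int pkey)
{
	assert(pkey >= 0 && pkey < 32);
	return (denied_mask >> pkey) & 1u;
}

/*
 * On a key fault, always consult the key stored in the vma. If that key
 * allows access, the fault was stale (the key changed in between), so
 * retry. Otherwise report the vma's key, which may differ from the key
 * that actually took the fault.
 */
static enum fault_action handle_key_fault(const struct model_vma *vma,
					  unsigned int denied_mask,
					  int *reported_pkey)
{
	if (!key_denies_access(denied_mask, vma->pkey))
		return FAULT_RETRY;

	*reported_pkey = vma->pkey;
	return FAULT_SIGSEGV_PKUERR;
}
```

In this model a fault on a vma whose key has since become permissive returns
FAULT_RETRY and leaves reported_pkey untouched.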


>
> A well written application should avoid changing the key of an address
> space without synchronizing the corresponding threads that operate in
> that address range.  However, if the application neglects to do so, then
> it is vulnerable to undefined behavior. There is no way to prove that
> the reported key is correct or incorrect, since there is no provable
> order between the two events; the key-fault event and the key-change
> event.
>
> Hence I think the change proposed in this patch may not be necessary.
> RP

The change is needed so that we can make the page table walk safer.


-aneesh
