public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86: skip check for spurious faults for non-present faults
@ 2014-05-12 10:29 David Vrabel
  2014-05-15 20:50 ` Dave Hansen
  0 siblings, 1 reply; 4+ messages in thread
From: David Vrabel @ 2014-05-12 10:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, David Vrabel

If a fault on a kernel address is due to a non-present page, then it
cannot be the result of stale TLB entry from a protection change (RO
to RW or NX to X).  Thus the pagetable walk in spurious_fault() can be
skipped.

This avoids spurious_fault() oopsing in some cases if the pagetables
it attempts to walk are not accessible.  This obscures the location of
the original fault.

This also fixes a crash with Xen PV guests when they access entries in
the M2P corresponding to device MMIO regions.  The M2P is mapped
(read-only) by Xen into the kernel address space of the guest and this
mapping may contains holes for non-RAM regions.  Read faults will
result in calls to spurious_fault(), but because the page tables for
the M2P mappings are not accessible by the guest the pagetable walk
would fault.

This was not normally a problem as MMIO mappings would not normally
result in a M2P lookup because of the use of the _PAGE_IOMAP bit the
PTE.  However, removing the _PAGE_IOMAP bit requires M2P lookups for
MMIO mappings as well.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
x86 maintainers, this is a prerequisite for removing Xen's usage of
_PAGE_IOMAP so I think this is best merged via the Xen tree.

See the full series at https://lkml.org/lkml/2014/4/15/390
---
 arch/x86/mm/fault.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 8e57229..c39e249 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -936,8 +936,10 @@ spurious_fault(unsigned long error_code, unsigned long address)
 	pte_t *pte;
 	int ret;
 
-	/* Reserved-bit violation or user access to kernel space? */
-	if (error_code & (PF_USER | PF_RSVD))
+	/* Only check for spurious faults on supervisor write or
+	   instruction faults. */
+	if (error_code != (PF_WRITE | PF_PROT)
+	    && error_code != (PF_INSTR | PF_PROT))
 		return 0;
 
 	pgd = init_mm.pgd + pgd_index(address);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86: skip check for spurious faults for non-present faults
  2014-05-12 10:29 [PATCH] x86: skip check for spurious faults for non-present faults David Vrabel
@ 2014-05-15 20:50 ` Dave Hansen
  2014-05-15 21:20   ` H. Peter Anvin
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Hansen @ 2014-05-15 20:50 UTC (permalink / raw)
  To: David Vrabel, linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86

On 05/12/2014 03:29 AM, David Vrabel wrote:
> -	/* Reserved-bit violation or user access to kernel space? */
> -	if (error_code & (PF_USER | PF_RSVD))
> +	/* Only check for spurious faults on supervisor write or
> +	   instruction faults. */
> +	if (error_code != (PF_WRITE | PF_PROT)
> +	    && error_code != (PF_INSTR | PF_PROT))
>  		return 0;

This changes the semantics a bit too much for me to feel happy about it.
 This is at best missing quite a bit of detail from the changelog.

 1. 'return 0' means "this was not a spurious fault"
 2. We used to check for the presence of PF_USER|PF_RSVD
 3. This patch checks now for two _explicit_ conditions, which
    implicitly check for the _absence_ of the two bits we checked for
    before.

I do believe your patch is correct, but it took me a bit to convince
myself that it was the right thing.  Please be explicit (in the comment)
about the exact PTE transitions that you expect to get you here.

Also, I have to wonder if you can just leave the original if() in there.
 You're making this _more_ restrictive than it was before, and I wonder
if it might just be more clear if you have both checks.  The compiler
might even compile it down to the same code, just changing the immediate
that was generated for the mask that you're checking.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86: skip check for spurious faults for non-present faults
  2014-05-15 20:50 ` Dave Hansen
@ 2014-05-15 21:20   ` H. Peter Anvin
  2014-05-16 16:54     ` Dave Hansen
  0 siblings, 1 reply; 4+ messages in thread
From: H. Peter Anvin @ 2014-05-15 21:20 UTC (permalink / raw)
  To: Dave Hansen, David Vrabel, linux-kernel; +Cc: Thomas Gleixner, Ingo Molnar, x86

I'm trying to wrap my head around any forward compatibility concerns... if we misidentify a fault as spurious that would be bad.  

On May 15, 2014 1:50:13 PM PDT, Dave Hansen <dave.hansen@intel.com> wrote:
>On 05/12/2014 03:29 AM, David Vrabel wrote:
>> -	/* Reserved-bit violation or user access to kernel space? */
>> -	if (error_code & (PF_USER | PF_RSVD))
>> +	/* Only check for spurious faults on supervisor write or
>> +	   instruction faults. */
>> +	if (error_code != (PF_WRITE | PF_PROT)
>> +	    && error_code != (PF_INSTR | PF_PROT))
>>  		return 0;
>
>This changes the semantics a bit too much for me to feel happy about
>it.
> This is at best missing quite a bit of detail from the changelog.
>
> 1. 'return 0' means "this was not a spurious fault"
> 2. We used to check for the presence of PF_USER|PF_RSVD
> 3. This patch checks now for two _explicit_ conditions, which
>    implicitly check for the _absence_ of the two bits we checked for
>    before.
>
>I do believe your patch is correct, but it took me a bit to convince
>myself that it was the right thing.  Please be explicit (in the
>comment)
>about the exact PTE transitions that you expect to get you here.
>
>Also, I have to wonder if you can just leave the original if() in
>there.
> You're making this _more_ restrictive than it was before, and I wonder
>if it might just be more clear if you have both checks.  The compiler
>might even compile it down to the same code, just changing the
>immediate
>that was generated for the mask that you're checking.

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86: skip check for spurious faults for non-present faults
  2014-05-15 21:20   ` H. Peter Anvin
@ 2014-05-16 16:54     ` Dave Hansen
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Hansen @ 2014-05-16 16:54 UTC (permalink / raw)
  To: H. Peter Anvin, David Vrabel, linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, x86

On 05/15/2014 02:20 PM, H. Peter Anvin wrote:
> I'm trying to wrap my head around any forward compatibility
> concerns... if we misidentify a fault as spurious that would be bad.

FWIW, I did go through the bullets in the SDM's "4.10.4.3 Optional
Invalidation".  The only condition not covered by this patch is the one
that talks about "CR4.SMEP = 0...".

But, I don't know of any case where we take a userspace page and make it
supervisor-only, so I don't think this is possible now.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-05-16 16:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-12 10:29 [PATCH] x86: skip check for spurious faults for non-present faults David Vrabel
2014-05-15 20:50 ` Dave Hansen
2014-05-15 21:20   ` H. Peter Anvin
2014-05-16 16:54     ` Dave Hansen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox