From: David Vrabel <david.vrabel@citrix.com>
To: <linux-kernel@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
<x86@kernel.org>, Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCHv2] x86: skip check for spurious faults for non-present faults
Date: Wed, 18 Jun 2014 13:38:05 +0100 [thread overview]
Message-ID: <53A1882D.5030201@citrix.com> (raw)
In-Reply-To: <1400251458-14891-1-git-send-email-david.vrabel@citrix.com>
Peter,
Someone else asking about what _PAGE_IOMAP was for reminded me that this
was still outstanding. Could you review and ack if appropriate?
Thanks.
David
On 16/05/14 15:44, David Vrabel wrote:
> If a fault on a kernel address is due to a non-present page, then it
> cannot be the result of stale TLB entry from a protection change (RO
> to RW or NX to X). Thus the pagetable walk in spurious_fault() can be
> skipped.
>
> See the initial if in spurious_fault() and the tests in
> spurious_fault_check()) for the set of possible error codes checked
> for spurious faults. These are:
>
> IRUWP
> Before x00xx && ( 1xxxx || xxx1x )
> After ( 10001 || 00011 ) && ( 1xxxx || xxx1x )
>
> Thus the new condition is a subset of the previous one, excluding only
> non-present faults (I == 1 and W == 1 are mutually exclusive).
>
> This avoids spurious_fault() oopsing in some cases if the pagetables
> it attempts to walk are not accessible. This obscures the location of
> the original fault.
>
> This also fixes a crash with Xen PV guests when they access entries in
> the M2P corresponding to device MMIO regions. The M2P is mapped
> (read-only) by Xen into the kernel address space of the guest and this
> mapping may contains holes for non-RAM regions. Read faults will
> result in calls to spurious_fault(), but because the page tables for
> the M2P mappings are not accessible by the guest the pagetable walk
> would fault.
>
> This was not normally a problem as MMIO mappings would not normally
> result in a M2P lookup because of the use of the _PAGE_IOMAP bit the
> PTE. However, removing the _PAGE_IOMAP bit requires M2P lookups for
> MMIO mappings as well.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> ---
> x86 maintainers, this is a prerequisite for removing Xen's usage of
> _PAGE_IOMAP so I think this is best merged via the Xen tree.
>
> v2:
> - improve comments
> ---
> arch/x86/mm/fault.c | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 8e57229..7f790e4 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -924,8 +924,17 @@ static int spurious_fault_check(unsigned long error_code, pte_t *pte)
> * cross-processor TLB flush, even if no stale TLB entries exist
> * on other processors.
> *
> + * Spurious faults may only occur if the TLB contains an entry with
> + * fewer permission than the page table entry. Non-present (P = 0)
> + * and reserved bit (R = 1) faults are never spurious.
> + *
> * There are no security implications to leaving a stale TLB when
> * increasing the permissions on a page.
> + *
> + * Returns non-zero if a spurious fault was handled, zero otherwise.
> + *
> + * See Intel Developer's Manual Vol 3 Section 4.10.4.3, bullet 3
> + * (Optional Invalidation).
> */
> static noinline __kprobes int
> spurious_fault(unsigned long error_code, unsigned long address)
> @@ -936,8 +945,17 @@ spurious_fault(unsigned long error_code, unsigned long address)
> pte_t *pte;
> int ret;
>
> - /* Reserved-bit violation or user access to kernel space? */
> - if (error_code & (PF_USER | PF_RSVD))
> + /*
> + * Only writes to RO or instruction fetches from NX may cause
> + * spurious faults.
> + *
> + * These could be from user or supervisor accesses but the TLB
> + * is only lazily flushed after a kernel mapping protection
> + * change, so user accesses are not expected to cause spurious
> + * faults.
> + */
> + if (error_code != (PF_WRITE | PF_PROT)
> + && error_code != (PF_INSTR | PF_PROT))
> return 0;
>
> pgd = init_mm.pgd + pgd_index(address);
>
prev parent reply other threads:[~2014-06-18 12:38 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-16 14:44 [PATCHv2] x86: skip check for spurious faults for non-present faults David Vrabel
2014-05-16 16:45 ` Dave Hansen
2014-06-18 12:38 ` David Vrabel [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53A1882D.5030201@citrix.com \
--to=david.vrabel@citrix.com \
--cc=dave.hansen@intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.