Re: [PATCH 0/4] [RFC][v4] Workaround for Xeon Phi PTE A/D bits erratum

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Dave Hansen <dave@sr71.net>, linux-kernel@vger.kernel.org
Cc: x86@kernel.org, linux-mm@kvack.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	bp@alien8.de, ak@linux.intel.com, mhocko@suse.com
Subject: Re: [PATCH 0/4] [RFC][v4] Workaround for Xeon Phi PTE A/D bits erratum
Date: Sat, 02 Jul 2016 08:28:12 +1000
Message-ID: <1467412092.7422.56.camel@kernel.crashing.org>
In-Reply-To: <20160701174658.6ED27E64@viggo.jf.intel.com>

On Fri, 2016-07-01 at 10:46 -0700, Dave Hansen wrote:
> The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
> Landing) has an erratum where a processor thread setting the Accessed
> or Dirty bits may not do so atomically against its checks for the
> Present bit.  This may cause a thread (which is about to page fault)
> to set A and/or D, even though the Present bit had already been
> atomically cleared.

Interesting... I always wondered where the Intel docs specify that
Present is tested atomically with the setting of A and D. I couldn't
find it.

Isn't there a more fundamental issue, however: that you may actually
lose those bits? For example, if we do an munmap, in zap_pte_range()
we first exchange all the PTEs with 0 using ptep_get_and_clear_full()
and then transfer the D bit that we just read into the struct page.

We rely on the fact that D will never be set again, i.e. that what we
got is a "final" D bit. That is, we rely on the fact that a processor
either:

   - Has a cached PTE in its TLB with D set, in which case it can still
write to the page until we flush the TLB or

   - Doesn't have a cached PTE in its TLB with D set and so will fail
to do so due to the atomic P check, thus never writing.

With the erratum, don't you have a situation where a processor in the
second category will write and set D despite P having been cleared (due
to the race), thus causing us to miss the transfer of that D to the
struct page and essentially completely miss that the physical page is
dirty?

(Leading to memory corruption).
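
To make the ordering concrete, here is a rough single-threaded model
of the sequence I'm worried about. It is only a sketch: every toy_*
name is made up for illustration, the PTE is a bare 64-bit word, and
the "racing CPU" is just a function called after the clear. The bit
positions (Present = bit 0, Dirty = bit 6) are the x86 ones. It does
not claim the hardware really behaves this way, only what we would
lose if it did.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTE_PRESENT (UINT64_C(1) << 0)  /* x86 Present bit */
#define TOY_PTE_DIRTY   (UINT64_C(1) << 6)  /* x86 Dirty bit */

/* Hypothetical stand-in for struct page: only tracks dirtiness. */
struct toy_page {
	bool dirty;
};

/* Models ptep_get_and_clear_full(): atomically swap the PTE with 0. */
static uint64_t toy_get_and_clear(_Atomic uint64_t *ptep)
{
	return atomic_exchange(ptep, 0);
}

/* Models the zap step: clear the PTE, then transfer the D bit we read. */
static void toy_zap_one(_Atomic uint64_t *ptep, struct toy_page *page)
{
	uint64_t old = toy_get_and_clear(ptep);

	if (old & TOY_PTE_DIRTY)
		page->dirty = true;
}

/* Models the suspected erratum: D gets set without rechecking Present. */
static void toy_stray_dirty(_Atomic uint64_t *ptep)
{
	atomic_fetch_or(ptep, TOY_PTE_DIRTY);
}

/* Returns true if D ended up set in the PTE but never reached the page. */
static bool toy_lost_dirty_scenario(void)
{
	_Atomic uint64_t pte = TOY_PTE_PRESENT;  /* clean, present mapping */
	struct toy_page page = { .dirty = false };

	toy_zap_one(&pte, &page);   /* kernel reads D=0 and clears the PTE */
	assert(atomic_load(&pte) == 0);
	toy_stray_dirty(&pte);      /* racing CPU sets D after the clear */

	return !page.dirty && (atomic_load(&pte) & TOY_PTE_DIRTY);
}
```

If the stray D comes with an actual store to the page, then
toy_lost_dirty_scenario() returning true is exactly the case where
the page has been modified but nobody will ever write it back.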

> If the PTE is used for storing a swap index or a NUMA migration index,
> the A bit could be misinterpreted as part of the swap type.  The stray
> bits being set cause a software-cleared PTE to be interpreted as a
> swap entry.  In some cases (like when the swap index ends up being
> for a non-existent swapfile), the kernel detects the stray value
> and WARN()s about it, but there is no guarantee that the kernel can
> always detect it.
> 
> This patch changes the kernel to attempt to ignore those stray bits
> when they get set.  We do this by making our swap PTE format
> completely ignore the A/D bits, and also by ignoring them in our
> pte_none() checks.
> 
> Andi Kleen wrote the original version of this patch.  Dave Hansen
> wrote the later ones.
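
For reference, the pte_none() part of the workaround described above
can be pictured roughly as follows. This is a toy sketch and not the
code from the series: the toy_* names and TOY_STRAY_MASK are invented
for illustration, and only the x86 bit positions (Accessed = bit 5,
Dirty = bit 6) are taken as given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTE_PRESENT  (UINT64_C(1) << 0)  /* x86 Present bit */
#define TOY_PTE_ACCESSED (UINT64_C(1) << 5)  /* x86 Accessed bit */
#define TOY_PTE_DIRTY    (UINT64_C(1) << 6)  /* x86 Dirty bit */

/* Bits that may be set spuriously on a PTE software already cleared. */
#define TOY_STRAY_MASK   (TOY_PTE_ACCESSED | TOY_PTE_DIRTY)

/* Strict check: any set bit at all makes the PTE look non-empty. */
static bool toy_pte_none_strict(uint64_t pte)
{
	return pte == 0;
}

/* Tolerant check: ignore stray A/D bits when deciding "empty". */
static bool toy_pte_none_tolerant(uint64_t pte)
{
	return (pte & ~TOY_STRAY_MASK) == 0;
}

/* Small self-check showing where the two definitions disagree. */
static void toy_pte_none_demo(void)
{
	uint64_t stray = TOY_PTE_ACCESSED | TOY_PTE_DIRTY;

	assert(!toy_pte_none_strict(stray));    /* looks like a real entry */
	assert(toy_pte_none_tolerant(stray));   /* treated as empty */
	assert(!toy_pte_none_tolerant(TOY_PTE_PRESENT));
}
```

Note that this only keeps the kernel from misreading the cleared
entry. It says nothing about what happens to a D bit that lands there
after the fact, which is the concern above.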

