Message-ID: <53751403.1010109@gmail.com>
Date: Thu, 15 May 2014 20:22:43 +0100
From: Keir Fraser
To: "H. Peter Anvin"
CC: David Vrabel, xen-devel@lists.xenproject.org, x86@kernel.org, linux-kernel@vger.kernel.org, Dave Hansen, Ingo Molnar, Mel Gorman, Boris Ostrovsky, Thomas Gleixner
Subject: Re: [Xen-devel] [PATCH 7/9] x86: skip check for spurious faults for non-present faults
References: <1397571337-20409-1-git-send-email-david.vrabel@citrix.com> <1397571337-20409-8-git-send-email-david.vrabel@citrix.com> <53750A96.2020201@zytor.com>
In-Reply-To: <53750A96.2020201@zytor.com>

H. Peter Anvin wrote:
> On 04/15/2014 07:15 AM, David Vrabel wrote:
>> If a fault on a kernel address is due to a non-present page, then it
>> cannot be the result of stale TLB entry from a protection change (RO
>> to RW or NX to X). Thus the pagetable walk in spurious_fault() can be
>> skipped.
>
> Erk... this code is screaming WTF to me. The x86 architecture is such
> that the CPU is responsible for avoiding these faults.

Not in this case...

> 5b727a3b0158a129827c21ce3bfb0ba997e8ddd0
>
>     x86: ignore spurious faults
>
>     When changing a kernel page from RO->RW, it's OK to leave stale TLB
>     entries around, since doing a global flush is expensive and they
>     pose no security problem.
>     They can, however, generate a spurious
>     fault, which we should catch and simply return from (which will
>     have the side-effect of reloading the TLB to the current PTE).
>
>     This can occur when running under Xen, because it frequently changes
>     kernel pages from RW->RO->RW to implement Xen's pagetable semantics.
>     It could also occur when using CONFIG_DEBUG_PAGEALLOC, since it
>     avoids doing a global TLB flush after changing page permissions.
>
>     Signed-off-by: Jeremy Fitzhardinge
>     Cc: Harvey Harrison
>     Signed-off-by: Ingo Molnar
>     Signed-off-by: Thomas Gleixner
>
> Again WTF?
>
> Are we chasing hardware errata here? Or did someone go off and *assume*
> that the x86 hardware architecture work a certain way? Or is there
> something way more subtle going on?

See Intel Developer's Manual Vol 3 Section 4.10.4.3, 3rd bullet... This
is expected behaviour, probably to make copy-on-write faults faster.

 -- Keir

> I guess next step is mailing list archaeology...
>
> Does anyone still have contacts with Jeremy, and if so, could they poke
> him perhaps?
>
> -hpa
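
For anyone reading this thread later, the check described in the quoted commit message, together with the shortcut David's patch proposes, can be sketched roughly as below. This is a simplified userspace model and not the kernel's spurious_fault(): it looks at a single leaf PTE value instead of walking the pagetables, and the function name and the PF_*/PTE_* macro names are made up for illustration. Only the bit positions are taken from the x86 page-fault error code and PTE layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (macro names are illustrative). */
#define PF_PROT   (1ul << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE  (1ul << 1)  /* access was a write */
#define PF_USER   (1ul << 2)  /* access came from user mode */
#define PF_RSVD   (1ul << 3)  /* reserved bit set in a paging entry */
#define PF_INSTR  (1ul << 4)  /* access was an instruction fetch */

/* x86-64 PTE bits (macro names are illustrative). */
#define PTE_PRESENT (UINT64_C(1) << 0)
#define PTE_RW      (UINT64_C(1) << 1)
#define PTE_NX      (UINT64_C(1) << 63)

/*
 * Hypothetical model: return true if a kernel fault with this error
 * code is explained by a stale TLB entry, i.e. the *current* PTE
 * already permits the access that faulted.
 */
bool fault_is_spurious(unsigned long error_code, uint64_t current_pte)
{
    /* Only kernel-mode faults without reserved-bit errors qualify. */
    if (error_code & (PF_USER | PF_RSVD))
        return false;

    /*
     * The shortcut from the patch under discussion: a not-present
     * fault cannot come from a stale RO->RW or NX->X entry, so there
     * is no need to look at the pagetables at all.
     */
    if (!(error_code & PF_PROT))
        return false;

    /* Only write and instruction-fetch faults can be spurious. */
    if (!(error_code & (PF_WRITE | PF_INSTR)))
        return false;

    /* Check the faulting access against the current PTE. */
    if (!(current_pte & PTE_PRESENT))
        return false;
    if ((error_code & PF_WRITE) && !(current_pte & PTE_RW))
        return false;
    if ((error_code & PF_INSTR) && (current_pte & PTE_NX))
        return false;

    return true;
}
```

In this model a write protection fault against a PTE that is now present and writable is reported as spurious, so the handler can simply return and let the access retry. A fault whose error code has the protection bit clear is rejected before the PTE is examined.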