From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rik van Riel
Subject: Re: [PATCH] mm: fix up a spurious page fault whenever it happens
Date: Wed, 22 May 2013 14:43:11 -0400
Message-ID: <519D11BF.5000604@redhat.com>
References: <5195ED8B.7060002@meduna.org>
 <1369183168.6828.168.camel@gandalf.local.home>
 <519CBB30.3060200@redhat.com>
 <20130522134111.33a695c5@cuia.bos.redhat.com>
 <519D08B0.8050707@meduna.org>
 <1369246316.6828.176.camel@gandalf.local.home>
 <519D0CAB.7020800@meduna.org>
 <519D0FF8.5080200@redhat.com>
 <519D118B.6010306@zytor.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Stanislav Meduna, Steven Rostedt, Linus Torvalds,
 "linux-rt-users@vger.kernel.org", "linux-kernel@vger.kernel.org",
 Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, Hai Huang
To: "H. Peter Anvin"
Return-path:
In-Reply-To: <519D118B.6010306@zytor.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-rt-users.vger.kernel.org

On 05/22/2013 02:42 PM, H. Peter Anvin wrote:
> On 05/22/2013 11:35 AM, Rik van Riel wrote:
>> On 05/22/2013 02:21 PM, Stanislav Meduna wrote:
>>> On 22.05.2013 20:11, Steven Rostedt wrote:
>>>
>>>> Did you apply both patches? Without the first one, this one is
>>>> meaningless.
>>>
>>> Sure.
>>>
>>> BTW, back when I tried to pinpoint it I also tried adding
>>>    flush_tlb_page(vma, address)
>>> at the beginning of handle_pte_fault, which as I read should
>>> be basically the same. It did not change anything.
>>
>> I'm stumped.
>>
>> If the Geode knows how to flush single TLB entries, it
>> should do that when flush_tlb_page is called.
>>
>> If it does not know, it should throw an invalid instruction
>> exception, and not quietly complete the instruction without
>> doing anything.
>>
>
> Some CPUs have had errata when it comes to flushing large pages that
> have been split into small pages by hardware, e.g. due to MTRR
> conflicts.
> In that case, fragments of the large page may have been left
> in the TLB.
>
> Could that explain what you are seeing?

That would be testable by changing __native_flush_tlb_single()
to call __flush_tlb(), instead of doing an invlpg instruction.

In other words, make the code look like this, for testing:

static inline void __native_flush_tlb_single(unsigned long addr)
{
	__flush_tlb();
}

This on top of the other two patches.
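
To make clear what that test would distinguish, here is a toy userspace
model of the hypothesis. It is only a sketch, not kernel code and not a
description of what the Geode actually does: the toy_* names, the
"fragment" flag and the assumption that a single-entry flush skips
fragments of a split large page are all made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_TLB_SLOTS  4
#define TOY_PAGE_SHIFT 12

/* One cached translation in the toy TLB (hypothetical, not real hardware). */
struct toy_tlb_entry {
	bool valid;
	bool fragment;      /* assumed: left over from a split large page */
	unsigned long vpn;  /* virtual page number, addr >> TOY_PAGE_SHIFT */
};

static struct toy_tlb_entry toy_tlb[TOY_TLB_SLOTS];

/* Cache a translation for addr in the first free slot; false if full. */
static bool toy_tlb_fill(unsigned long addr, bool fragment)
{
	for (int i = 0; i < TOY_TLB_SLOTS; i++) {
		if (!toy_tlb[i].valid) {
			toy_tlb[i].valid = true;
			toy_tlb[i].fragment = fragment;
			toy_tlb[i].vpn = addr >> TOY_PAGE_SHIFT;
			return true;
		}
	}
	return false;
}

/* True if any valid entry still translates addr. */
static bool toy_tlb_hit(unsigned long addr)
{
	for (int i = 0; i < TOY_TLB_SLOTS; i++) {
		if (toy_tlb[i].valid &&
		    toy_tlb[i].vpn == (addr >> TOY_PAGE_SHIFT))
			return true;
	}
	return false;
}

/*
 * Stand-in for a single-entry flush on a CPU with the assumed erratum:
 * ordinary entries for addr are dropped, fragment entries survive.
 */
static void toy_flush_single(unsigned long addr)
{
	for (int i = 0; i < TOY_TLB_SLOTS; i++) {
		if (toy_tlb[i].valid && !toy_tlb[i].fragment &&
		    toy_tlb[i].vpn == (addr >> TOY_PAGE_SHIFT))
			toy_tlb[i].valid = false;
	}
}

/* Stand-in for a full flush: every entry is dropped, fragments included. */
static void toy_flush_all(void)
{
	memset(toy_tlb, 0, sizeof(toy_tlb));
}
```

In this model, filling an entry with fragment set and then calling
toy_flush_single() leaves toy_tlb_hit() returning true, while
toy_flush_all() makes it return false. If the real system behaves the
same way with the test change above, that would point at an erratum of
this kind; if the spurious faults persist even with the full flush, the
cause is somewhere else.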