From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga18.intel.com ([134.134.136.126]:54953 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725790AbeKMFC2 (ORCPT ); Tue, 13 Nov 2018 00:02:28 -0500 Subject: Re: [PATCH] x86/mm/pat: Fix missing preemption disable for __native_flush_tlb() To: Dan Williams , Andy Lutomirski Cc: Thomas Gleixner , Sebastian Andrzej Siewior , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Borislav Petkov , stable , X86 ML , Linux Kernel Mailing List , "Kirill A. Shutemov" References: <154180834787.2060925.7738215365584115230.stgit@dwillia2-desk3.amr.corp.intel.com> <7590EF40-B0CF-40BD-9D29-FB731A2A2E3A@amacapital.net> <9DFD717D-857D-493D-A606-B635D72BAC21@amacapital.net> From: Dave Hansen Message-ID: <749919a4-cdb1-48a3-adb4-adb81a5fa0b5@intel.com> Date: Mon, 12 Nov 2018 11:07:55 -0800 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org List-ID: On 11/10/18 4:31 PM, Dan Williams wrote: >> If it indeed can run late in boot or after boot, then it sure looks >> buggy. Either the __flush_tlb_all() should be removed or it should >> be replaced with flush_tlb_kernel_range(). It’s unclear to me why a >> flush is needed at all, but if it’s needed, surely all CPUs need >> flushing. > Yeah, I don't think __flush_tlb_all() is needed at > kernel_physical_mapping_init() time, and at > kernel_physical_mapping_remove() time we do a full flush_tlb_all(). It doesn't look strictly necessary to me. I _think_ we're only ever populating previously non-present entries, and those never need TLB flushes. I didn't look too deeply, so I'd appreciate anyone else double-checking me on this. The __flush_tlb_all() actually appears to predate git and it was originally entirely intended for early-boot-only. It probably lasted this long because it looks really important. :) It was even next to where we set MMU features in CR4, which is *really* early in boot: > + asm volatile("movq %%cr4,%0" : "=r" (mmu_cr4_features)); > + __flush_tlb_all(); I also totally agree with Andy that if it were needed on the local CPU, this code would be buggy because it doesn't initiate any *remote* TLB flushes. So, let's remove it, but also add some comments about not being allowed to *change* page table entries, only populate them. We could even add some warnings to keep this enforced.