From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matt Fleming
Subject: Re: [PATCH 2/2] acpi, apei: use appropriate pgprot_t to map GHES memory
Date: Fri, 4 Sep 2015 12:28:20 +0100
Message-ID: <20150904112820.GB2737@codeblueprint.co.uk>
References: <1440523642-31373-1-git-send-email-zjzhang@codeaurora.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1440523642-31373-1-git-send-email-zjzhang-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
Sender: linux-efi-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: "Jonathan (Zhixiong) Zhang"
Cc: Will Deacon, Ingo Molnar, Thomas Gleixner, "H. Peter Anvin",
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 Matt Fleming, Borislav Petkov, Ard Biesheuvel, Catalin Marinas
List-Id: linux-efi@vger.kernel.org

On Tue, 25 Aug, at 10:27:22AM, Jonathan (Zhixiong) Zhang wrote:
> From: "Jonathan (Zhixiong) Zhang"
>
> If the ACPI APEI firmware handles hardware error first (called "firmware
> first handling"), the firmware updates the GHES memory region with hardware
> error record (called "generic hardware error record"). Essentially the
> firmware writes hardware error records in the GHES memory region, triggers
> an NMI/interrupt, then the GHES driver goes off and grabs the error record
> from the GHES region.
>
> The kernel currently maps the GHES memory region as cacheable
> (PAGE_KERNEL) for all architectures. However, on some arm64 platforms,
> there is a mismatch between how the kernel maps the GHES region
> (PAGE_KERNEL) and how the firmware maps it (EFI_MEMORY_UC, ie.
> uncacheable), leading to the possibility of the kernel GHES driver
> reading stale data from the cache when it receives the interrupt.
>
> With stale data being read, the kernel is unaware there is new hardware
> error to be handled when there actually is; this may lead to further damage
> in various scenarios, such as error propagation caused data corruption.
> If uncorrected error (such as double bit ECC error) happened in memory
> operation and if the kernel is unaware of such event happening, erroneous
> data may be propagated to the disk.
>
> Instead GHES memory region should be mapped with page protection type
> according to what is returned from arch_apei_get_mem_attribute().
>
> Reviewed-by: Matt Fleming
> Acked-by: Borislav Petkov
> Signed-off-by: Jonathan (Zhixiong) Zhang
> ---
>  drivers/acpi/apei/ghes.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

This patch message looks fine to me. Ingo?

-- 
Matt Fleming, Intel Open Source Technology Center