From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH 2/2] acpi, apei: use appropriate pgprot_t to map GHES memory
Date: Fri, 4 Sep 2015 13:36:00 +0200
Message-ID: <20150904113559.GA21396@gmail.com>
References: <1440523642-31373-1-git-send-email-zjzhang@codeaurora.org>
 <20150904112820.GB2737@codeblueprint.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <20150904112820.GB2737-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org>
Sender: linux-efi-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Matt Fleming
Cc: "Jonathan (Zhixiong) Zhang", Will Deacon, Thomas Gleixner,
 "H. Peter Anvin", linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Matt Fleming,
 Borislav Petkov, Ard Biesheuvel, Catalin Marinas
List-Id: linux-efi@vger.kernel.org

* Matt Fleming wrote:

> On Tue, 25 Aug, at 10:27:22AM, Jonathan (Zhixiong) Zhang wrote:
> > From: "Jonathan (Zhixiong) Zhang"
> >
> > If the ACPI APEI firmware handles hardware errors first (called "firmware
> > first handling"), the firmware updates the GHES memory region with a
> > hardware error record (called a "generic hardware error record").
> > Essentially, the firmware writes hardware error records in the GHES memory
> > region, triggers an NMI/interrupt, then the GHES driver goes off and grabs
> > the error record from the GHES region.
> >
> > The kernel currently maps the GHES memory region as cacheable
> > (PAGE_KERNEL) for all architectures. However, on some arm64 platforms,
> > there is a mismatch between how the kernel maps the GHES region
> > (PAGE_KERNEL) and how the firmware maps it (EFI_MEMORY_UC, i.e.
> > uncacheable), leading to the possibility of the kernel GHES driver
> > reading stale data from the cache when it receives the interrupt.
> >
> > With stale data being read, the kernel is unaware that there is a new
> > hardware error to be handled when there actually is; this may lead to
> > further damage in various scenarios, such as data corruption caused by
> > error propagation. If an uncorrected error (such as a double-bit ECC
> > error) occurs in a memory operation and the kernel is unaware of it,
> > erroneous data may be propagated to the disk.
> >
> > Instead, the GHES memory region should be mapped with the page protection
> > type returned by arch_apei_get_mem_attribute().
> >
> > Reviewed-by: Matt Fleming
> > Acked-by: Borislav Petkov
> > Signed-off-by: Jonathan (Zhixiong) Zhang
> > ---
> >  drivers/acpi/apei/ghes.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
>
> This patch message looks fine to me. Ingo?

Looks good to me too!

Thanks,

	Ingo