From mboxrd@z Thu Jan 1 00:00:00 1970
From: Borislav Petkov
Subject: Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
Date: Tue, 22 Jan 2019 11:51:43 +0100
Message-ID: <20190122105143.GB26587@zn.tnic>
References: <20181203180613.228133-1-james.morse@arm.com>
 <20181203180613.228133-23-james.morse@arm.com>
 <9d153a07-aa7a-6e0c-3bd3-994a66f9639a@huawei.com>
 <5c775aa9-ea57-dea7-6083-c1e3fc160b29@arm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <5c775aa9-ea57-dea7-6083-c1e3fc160b29@arm.com>
Errors-To: kvmarm-bounces@lists.cs.columbia.edu
Sender: kvmarm-bounces@lists.cs.columbia.edu
To: James Morse
Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-acpi@vger.kernel.org,
 Marc Zyngier, Catalin Marinas, Will Deacon, Dongjiu Geng,
 Wang Xiongfeng, linux-mm@kvack.org, Naoya Horiguchi,
 kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org,
 Len Brown
List-Id: linux-acpi@vger.kernel.org

On Mon, Dec 10, 2018 at 07:15:13PM +0000, James Morse wrote:
> What happens if we miss MF_ACTION_REQUIRED?

AFAICU, the logic is to force-send a signal to the user process, i.e.,
force_sig_info() which cannot be ignored. IOW, an "enlightened" process
would know how to do recovery action from a memory error.

VS the action optional thing which you can handle at your leisure.

So the question boils down to what kind of severity do the errors
reported through SEA have? I mean, if the hw would go to the trouble to
do the synchronous reporting, then something important must've happened
and it wants us to know about it and handle it.

> Surely the page still gets unmapped as its PG_Poisoned, an AO signal
> may be pending, but if user-space touches the page it will get an AR
> signal. Is this just about removing an extra AO signal to user-space?
>
> If we do need this, I'd like to pick it up from the CPER records, as x86's
> NOTIFY_NMI looks like it covers both AO/AR cases. (as does NOTIFY_SDEI). The
> Master/Target abort or Invalid-address types in the memory-error-section CPER
> records look like the best bet.

Right, and we do all kinds of severity mapping there aka ghes_severity()
so that'll be a good start, methinks.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
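
For illustration only, the kind of severity mapping referred to above
could be sketched like this. It is a stand-alone user-space sketch, not
the kernel code: every SKETCH_* name, the enum values, and
sketch_mf_flags() are hypothetical and exist only for this example. The
rule "recoverable and synchronous means action required" is just the
assumption being discussed in this thread, not settled behaviour.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the severities carried in CPER records. */
enum sketch_cper_sev {
	SKETCH_CPER_SEV_RECOVERABLE,
	SKETCH_CPER_SEV_FATAL,
	SKETCH_CPER_SEV_CORRECTED,
	SKETCH_CPER_SEV_INFORMATIONAL,
};

/* Hypothetical stand-ins for the severities the GHES code works with. */
enum sketch_ghes_sev {
	SKETCH_GHES_SEV_NO,
	SKETCH_GHES_SEV_CORRECTED,
	SKETCH_GHES_SEV_RECOVERABLE,
	SKETCH_GHES_SEV_PANIC,
};

/* Hypothetical flag bit standing in for MF_ACTION_REQUIRED. */
#define SKETCH_MF_ACTION_REQUIRED 0x1

/* Map a record severity to a handling severity; unknown values panic. */
static int sketch_ghes_severity(int cper_sev)
{
	switch (cper_sev) {
	case SKETCH_CPER_SEV_INFORMATIONAL:
		return SKETCH_GHES_SEV_NO;
	case SKETCH_CPER_SEV_CORRECTED:
		return SKETCH_GHES_SEV_CORRECTED;
	case SKETCH_CPER_SEV_RECOVERABLE:
		return SKETCH_GHES_SEV_RECOVERABLE;
	case SKETCH_CPER_SEV_FATAL:
	default:
		return SKETCH_GHES_SEV_PANIC;
	}
}

/*
 * Assumption under discussion: only a recoverable error that was
 * reported synchronously asks for the "action required" treatment.
 * Everything else is left as "action optional" (no flag set).
 */
static int sketch_mf_flags(int ghes_sev, bool sync)
{
	if (ghes_sev == SKETCH_GHES_SEV_RECOVERABLE && sync)
		return SKETCH_MF_ACTION_REQUIRED;
	return 0;
}
```

Whether the "sync" input should come from the notification type or from
the error types in the CPER memory-error section is exactly the open
question above; the sketch leaves that to the caller.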