From: James Morse <james.morse@arm.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Rafael Wysocki <rjw@rjwysocki.net>,
Tony Luck <tony.luck@intel.com>, Xie XiuQi <xiexiuqi@huawei.com>,
linux-mm@kvack.org, Marc Zyngier <marc.zyngier@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Punit Agrawal <punit.agrawal@arm.com>,
Will Deacon <will.deacon@arm.com>,
Tyler Baicar <tbaicar@codeaurora.org>,
Dongjiu Geng <gengdongjiu@huawei.com>,
linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
kvmarm@lists.cs.columbia.edu,
Christoffer Dall <christoffer.dall@linaro.org>,
Len Brown <lenb@kernel.org>
Subject: Re: [PATCH 02/11] ACPI / APEI: Generalise the estatus queue's add/remove and notify code
Date: Mon, 19 Mar 2018 14:29:13 +0000
Message-ID: <5AAFC939.3010309@arm.com>
In-Reply-To: <20180308104408.GB21166@pd.tnic>

Hi Borislav,

On 08/03/18 10:44, Borislav Petkov wrote:
> On Wed, Mar 07, 2018 at 06:15:02PM +0000, James Morse wrote:
>> Today its just x86 and arm64. arm64 doesn't have a hook to do this. I'm happy to
>> add an empty declaration or leave it under an ifdef until someone complains
>> about any behaviour I missed!
>
> So I did some more staring at the code and I think oops_begin() is
> needed mainly, as you point out, to prevent two oops messages from
> interleaving. And yap, the other stuff with printk() is not true anymore
> because the commit which added oops_begin():
>
> 81e88fdc432a ("ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support")
>
> still saw an NMI-unsafe printk. Which is long taken care of now.
>
> So only the interleaving issue remains.
>
> Which begs the question: how are you guys preventing the interleaving on
> arm64? Because arch/arm64/kernel/traps.c:200 grabs the die_lock too, so
> interleaving can happen on arm64 too, AFAICT.

die() messages are stopped from interleaving with each other by that die_lock.
panic()'s atomic_cmpxchg() followed by panic_smp_self_stop() means panic() is
first-past-the-post.
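
To illustrate the first-past-the-post behaviour, here is a small userspace model. It is only a sketch using C11 atomics, not the kernel's panic() code: panic_owner, NO_CPU and try_claim_panic() are made-up names, and where the kernel would park the losing CPU in panic_smp_self_stop(), the sketch simply returns false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_CPU (-1)

/* Hypothetical stand-in for the variable panic() races on. */
static atomic_int panic_owner = NO_CPU;

/*
 * Returns true if 'cpu' won the race and may carry on with the panic
 * path. A loser gets false back; in the kernel it would stop itself.
 */
static bool try_claim_panic(int cpu)
{
	int expected = NO_CPU;

	/* Only the first caller still sees NO_CPU and swaps its id in. */
	return atomic_compare_exchange_strong(&panic_owner, &expected, cpu);
}

/* Usage: CPU 2 arrives first and wins, CPU 5 arrives later and loses. */
static void example_race(void)
{
	assert(try_claim_panic(2));
	assert(!try_claim_panic(5));
	assert(atomic_load(&panic_owner) == 2);
}
```

The point of the compare-and-exchange is that there is no lock to wait on: whoever gets there first owns the panic, everyone else is turned away immediately.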
So our problem is interleaving of the two. The sequence is roughly:
1. oops_begin(); // I'm going to panic()
2. printk(some stuff);
3. panic();
Everything we print at (2) gets batched up by vprintk_nmi(), and is only printed
from (3) when we call printk_safe_flush_on_panic().

... and now I spot there are two calls to printk_safe_flush_on_panic(), one of
which happens before any smp_send_stop() calls.

This means we can get interleaving with panic() as we flush the printk_safe
buffer before smp_send_stop(), and even if we change that, a remote CPU may
refuse to die. (Both x86 and arm64 have timeouts in their smp_send_stop() code.)
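
The ordering problem can be shown with a toy model. This is a sketch under simplifying assumptions, not kernel code: the "console" is just an array, there is one remote CPU that prints a single unrelated line, it is assumed to stop when asked, and all names (emit(), simulate(), the message strings) are invented for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_LINES 8

/* A toy console: lines appear in the order they were emitted. */
struct console {
	const char *line[MAX_LINES];
	int n;
};

static void emit(struct console *c, const char *s)
{
	if (c->n < MAX_LINES)
		c->line[c->n++] = s;
}

/* Position of a line on the console, or -1 if it never appeared. */
static int where(const struct console *c, const char *s)
{
	for (int i = 0; i < c->n; i++)
		if (strcmp(c->line[i], s) == 0)
			return i;
	return -1;
}

/*
 * CPU0's message from step (2) is held back and flushed either before
 * or after the remote CPU has been stopped. CPU1 prints one unrelated
 * line while it is still running.
 */
static void simulate(struct console *c, int flush_before_stop)
{
	const char *deferred = "cpu0: error details";

	if (flush_before_stop)
		emit(c, deferred);		/* early flush */
	emit(c, "cpu1: unrelated oops");	/* remote CPU not stopped yet */
	/* remote CPU is stopped here (assuming it obeys) */
	if (!flush_before_stop)
		emit(c, deferred);		/* late flush */
	emit(c, "cpu0: rest of panic output");
}

/* Non-zero if the error details sit directly before the panic output. */
static int details_adjacent_to_panic(int flush_before_stop)
{
	struct console c = { .n = 0 };

	simulate(&c, flush_before_stop);
	return where(&c, "cpu0: rest of panic output") -
	       where(&c, "cpu0: error details") == 1;
}

static void example_flush_order(void)
{
	assert(!details_adjacent_to_panic(1));	/* early flush: split up */
	assert(details_adjacent_to_panic(0));	/* late flush: kept together */
}
```

The late-flush case only looks good because the model assumes the remote CPU really stops. If it ignores the stop request, its output can still land anywhere.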
> And by that logic, you should technically grab that lock here too in
> _in_nmi_notify_one().

I don't think the die_lock really helps here. Do we really want to wait for a
remote CPU to finish printing an oops about user-space's bad memory accesses
before we bring the machine down due to this system-wide fatal RAS error? The
presence of firmware-first means we know about this error, and any other oopses
are unrelated.
Grabbing the die_lock doesn't stop remote CPUs printing messages via a mechanism
other than die()/_in_nmi_notify_one(). I think oops_begin() is just plastering
over a problem. (How come this exclusion isn't done by oops_enter()/oops_exit()?)
Isn't oops_begin() trying to guarantee any messages printk()d by this CPU appear
'with' the subsequent panic()? I can't see any way to stop a remote CPU from
messing this up by printk()ing in a loop with interrupts masked, preventing us
from smp_send_stop()ing it, and making it difficult to take the lock.

I'd like to leave this under the x86-ifdef for now. For arm64 it would be an
APEI-specific arch hook to stop the arch code from printing some messages,
while the rest of the kernel is unaffected. I suspect this sort of thing
really needs support from printk(). (Maybe some printk() severity that mutes
other CPUs, or redirects them to the printk_safe buffer.)
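
For comparison, the "empty declaration" alternative mentioned in the quoted text at the top would look roughly like the sketch below. Everything in it is hypothetical: ARCH_HAS_APEI_OOPS_HOOK, arch_apei_oops_begin() and notify_fatal_error() are names invented for this example, not existing kernel symbols.

```c
#include <assert.h>

/*
 * Hypothetical switch: an architecture with something to do before the
 * error messages are printed would define this. It is left undefined
 * here, which models arm64 in this discussion.
 */
/* #define ARCH_HAS_APEI_OOPS_HOOK */

static int hook_calls;	/* counts real hook invocations, for illustration */

#ifdef ARCH_HAS_APEI_OOPS_HOOK
static void arch_apei_oops_begin(void)
{
	hook_calls++;	/* an arch would serialise its oops output here */
}
#else
/* Empty declaration: callers need no ifdef of their own. */
static inline void arch_apei_oops_begin(void) { }
#endif

/* Generic caller, identical for every architecture. */
static void notify_fatal_error(void)
{
	arch_apei_oops_begin();
	/* ... print the error record, then panic ... */
}

static void example_stub(void)
{
	notify_fatal_error();
	assert(hook_calls == 0);	/* the stub did nothing */
}
```

The stub keeps the generic code free of ifdefs, at the cost of adding an arch hook that only one architecture fills in.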
Thanks,
James