From: Sumit Garg <sumit.garg@linaro.org>
To: Doug Anderson <dianders@chromium.org>
Cc: "Mark Rutland" <mark.rutland@arm.com>,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Will Deacon" <will@kernel.org>,
"Daniel Thompson" <daniel.thompson@linaro.org>,
"Marc Zyngier" <maz@kernel.org>,
ito-yuichi@fujitsu.com, kgdb-bugreport@lists.sourceforge.net,
"Chen-Yu Tsai" <wens@csie.org>,
"Masayoshi Mizuma" <msys.mizuma@gmail.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Ard Biesheuvel" <ardb@kernel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
linux-arm-kernel@lists.infradead.org,
"Stephen Boyd" <swboyd@chromium.org>,
"Lecopzer Chen" <lecopzer.chen@mediatek.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
linux-perf-users@vger.kernel.org,
"Alexandru Elisei" <alexandru.elisei@arm.com>,
"Andrey Konovalov" <andreyknvl@gmail.com>,
"Ben Dooks" <ben-linux@fluff.org>,
"Borislav Petkov" <bp@alien8.de>,
"Christophe Leroy" <christophe.leroy@csgroup.eu>,
"Darrick J. Wong" <djwong@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"David S. Miller" <davem@davemloft.net>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"Frederic Weisbecker" <frederic@kernel.org>,
"Gaosheng Cui" <cuigaosheng1@huawei.com>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Guilherme G. Piccoli" <gpiccoli@igalia.com>,
"Guo Ren" <guoren@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
"Huacai Chen" <chenhuacai@kernel.org>,
"Ingo Molnar" <mingo@kernel.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Jason A. Donenfeld" <Jason@zx2c4.com>,
"Jason Wessel" <jason.wessel@windriver.com>,
"Jianmin Lv" <lvjianmin@loongson.cn>,
"Jiaxun Yang" <jiaxun.yang@flygoat.com>,
"Jinyang He" <hejinyang@loongson.cn>,
"Joey Gouly" <joey.gouly@arm.com>,
"Kees Cook" <keescook@chromium.org>,
"Laurent Dufour" <ldufour@linux.ibm.com>,
"Masahiro Yamada" <masahiroy@kernel.org>,
"Masayoshi Mizuma" <m.mizuma@jp.fujitsu.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
"Pierre Gondois" <Pierre.Gondois@arm.com>,
"Qing Zhang" <zhangqing@loongson.cn>,
"Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>,
"Russell King" <linux@armlinux.org.uk>,
"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
"Ulf Hansson" <ulf.hansson@linaro.org>,
"WANG Xuerui" <kernel@xen0n.name>,
linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
sparclinux@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
Date: Tue, 16 May 2023 15:39:21 +0530
Message-ID: <CAFA6WYOU8HW2JVBfCeFEkn-5cd81TM-x=ArUKeaSi3NzxgKaGQ@mail.gmail.com>
In-Reply-To: <CAD=FV=WjX-XD6tX3hZq0GOh9e+Pc1jMMYP8DCc=u1YWQ2E5hYw@mail.gmail.com>

On Wed, 10 May 2023 at 22:20, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Wed, May 10, 2023 at 9:30 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > > Hi,
> >
> > Hi Doug,
> >
> > > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > > could find was v7, so I called this series v8. I haven't copied all of
> > > > > his old changelogs here, but you can find them from the link.
> > > >

Thanks Doug for picking up this work and for all your additions and improvements.

> > > > Since v7, I have:
> > > > * Addressed the small amount of feedback that was there for v7.
> > > > * Rebased.
> > > > * Added a new patch that prevents us from spamming the logs with idle
> > > > tasks.
> > > > * Added an extra patch to gracefully fall back to regular IPIs if
> > > > pseudo-NMIs aren't there.
> > > >
> > > > > Since there appear to be a few different patch series related to
> > > > being able to use NMIs to get stack traces of crashed systems, let me
> > > > try to organize them to the best of my understanding:
> > > >
> > > > a) This series. On its own, a) will (among other things) enable stack
> > > > traces of all running processes with the soft lockup detector if
> > > > you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > > > its own, a) doesn't give a hard lockup detector.
> > > >
> > > > b) A different recently-posted series [2] that adds a hard lockup
> > > > detector based on perf. On its own, b) gives a stack crawl of the
> > > > locked up CPU but no stack crawls of other CPUs (even if they're
> > > > > locked too). Together, a) + b) give us everything (full lockup
> > > > > detection, full ability to get stack crawls).
> > > >
> > > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > > > considering trying to upstream. If b) lands then I believe c) would
> > > > be redundant (at least for arm64). c) on its own is really only
> > > > useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > > > (see [4]). a) + c) is roughly as good as a) + b).
> >
> > > It's been 3 weeks and I haven't heard a peep on this series. That
> > > means nobody has any objections and it's all good to land, right?
> > > Right? :-P

For me it was months of waiting without any feedback, so I think you are
lucky :) or at least better than me at poking the arm64 maintainers.

> >
> > FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> > support (and fixing that requires an overhaul of our DAIF / IRQ flag
> > management, which I've been chipping away at for a number of releases), so I
> > hadn't looked at this in detail yet because the foundations are still somewhat
> > dodgy.
> >
> > I appreciate that this has been around for a while, and it's on my queue to
> > look at.
>
> Ah, thanks for the heads up! We've been thinking about turning this on
> in production in ChromeOS because it will help us track down a whole
> class of field-generated crash reports that are otherwise opaque to
> us. It sounds as if maybe that's not a good idea quite yet? Do you
> have any idea of how much farther along this needs to go? ...of
> course, we've also run into issues with Mediatek devices because they
> don't save/restore GICR registers properly [1]. In theory, we might be
> able to work around that in the kernel.
>
> In any case, even if there are bugs that would prevent turning this on
> for production, it seems like we could still land this series.
> It simply wouldn't do anything until someone turned on pseudo NMIs,
> which wouldn't happen till the kinks are worked out.

I agree here. We should be able to make the foundations robust later
on. IMHO, until we turn on features surrounding pseudo NMIs, I am not
sure how we can have true confidence in the underlying robustness.

-Sumit

>
> ...actually, I guess I should say that if all the patches of the
> current series do land then it actually _would_ still do something,
> even without pseudo-NMI. Assuming the last patch looks OK, it would at
> least start falling back to using regular IPIs to do backtraces. That
> wouldn't get backtraces on hard locked up CPUs but it would be better
> than what we have today where we don't get any backtraces. This would
> get arm64 on par with arm32...
>
> [1] https://issuetracker.google.com/281831288
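
The fallback behaviour Doug describes above can be sketched roughly as follows. This is a minimal userspace model under stated assumptions, not code from the series: the names bt_delivery, pick_delivery() and target_responds() are hypothetical, and "hard locked" is modelled simply as a CPU stuck with interrupts masked.

```c
#include <stdbool.h>

/* Hypothetical model, not kernel code: how a backtrace request is delivered. */
enum bt_delivery {
	BT_NONE,        /* no cross-CPU backtrace support at all */
	BT_REGULAR_IPI, /* maskable: ignored while the target has IRQs masked */
	BT_PSEUDO_NMI,  /* still delivered while the target has IRQs masked */
};

/* Prefer pseudo-NMI; fall back to a regular IPI when it is not enabled. */
static enum bt_delivery pick_delivery(bool series_applied, bool pseudo_nmi_enabled)
{
	if (!series_applied)
		return BT_NONE;
	return pseudo_nmi_enabled ? BT_PSEUDO_NMI : BT_REGULAR_IPI;
}

/* Would the target CPU produce a backtrace? */
static bool target_responds(enum bt_delivery d, bool hard_locked)
{
	switch (d) {
	case BT_PSEUDO_NMI:
		return true;          /* reaches even a hard locked CPU */
	case BT_REGULAR_IPI:
		return !hard_locked;  /* only CPUs that still take interrupts */
	default:
		return false;
	}
}
```

In this model, a kernel with the series applied but pseudo-NMI disabled still gets backtraces from CPUs that are taking interrupts, which is the improvement over the current state described in the quoted text; only the pseudo-NMI path reaches a hard locked CPU.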