From: Petr Mladek <pmladek@suse.com>
To: Bradley Morgan <include@grrlz.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Feng Tang <feng.tang@linux.alibaba.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
Christophe Leroy <chleroy@kernel.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Douglas Anderson <dianders@chromium.org>,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
stable@vger.kernel.org
Subject: Re: [PATCH v3 4/4] panic: use sys_info_with_filter() to avoid duplicate backtraces
Date: Fri, 26 Jun 2026 16:47:12 +0200
Message-ID: <aj6Q8JcogaIaQit4@pathway.suse.cz>
In-Reply-To: <688433ED-A478-43F7-9103-995398A6BF63@grrlz.net>
On Fri 2026-06-26 15:35:19, Bradley Morgan wrote:
> On June 26, 2026 3:26:11 PM GMT+01:00, Petr Mladek <pmladek@suse.com> wrote:
> >On Fri 2026-06-26 13:32:38, Bradley Morgan wrote:
> >> On June 26, 2026 1:17:13 PM GMT+01:00, Bradley Morgan <include@grrlz.net> wrote:
> >> >On June 26, 2026 1:14:14 PM GMT+01:00, Petr Mladek <pmladek@suse.com> wrote:
> >> >>On Fri 2026-06-26 12:23:50, Petr Mladek wrote:
> >> >>> On Thu 2026-06-25 15:25:58, Bradley Morgan wrote:
> >> >>> But it all becomes very hairy. We have several levels:
> >> >>>
> >> >>> + watchdog-all_bt-specific option, e.g. sysctl_hardlockup_all_cpu_backtrace
> >> >>>
> >> >>> + watchdog-specific si_info preferences, e.g. hardlockup_si_mask
> >> >>>
> >> >>> + panic-specific si_info: panic_print
> >> >>>
> >> >>> + universal fallback for any layer: kernel_si_info
> >> >>>
> >> >>> Now, we try to check all these variables back and forth to
> >> >>> trigger all backtraces or to avoid triggering them.
> >> >>> And it clearly does not work well and the code is more and more
> >> >>> hairy.
> >> >>>
> >> >>> I think about another approach. The word "waterfall" comes to my mind.
> >> >>> Instead of checking all the settings back and forth, let's process
> >> >>> each setting one by one and just remember what has been done and
> >> >>> skip this in the next level.
> >> >>>
> >> >>> All the si_info actions seem to dump a global system state.
> >> >>> So, it would make sense to remember the state in a global variable
> >> >>> even when it might be modified by more CPUs in parallel.
> >> >>>
> >> Hmm.. new idea
> >>
> >> kernel/dump_filter.c ?
> >>
> >> What this file could do is to handle a generic lockup state machine
> >> so any subsystem can log what it already dumped?
> >>
> >> I know it may bloat, but it's better than cramming fixes in.
> >
> >I am not sure what exactly you would like to achieve but it sounds
> >a bit scary ;-)
> >
> >Anyway, we should not synchronize the watchdog reports against
> >each other, definitely. They are running in non-compatible contexts
> >(task vs interrupt vs NMI). Also we should not add any locking
> >because they usually print something when the system has enough
> >troubles.
> >
> >Also I think that it is not worth preventing duplicated backtraces
> >or reports from a single CPU. IMHO, it is not a big problem
> >in practice.
> >
> >So, we are down to large reports, like backtraces from all CPUs,
> >timers, locks, ... which are handled by sys_info(). So, I think
> >that it should be enough to handle this inside the sys_info() API.
> >
> >I do not want to say that my proposal was the best solution.
> >I am sure that there are better ones. But we need to consider
> >the gain vs. complexity.
> >
> >Honestly, I am already a bit scared by the complexity which
> >the sys_info() API added. And it is hard to imagine that
> >adding another API would make it easier. But I might be wrong.
> >
> >Instead, it might make sense to integrate the conflicting
> >subsystem-specific calls under the sys_info() API.
> >I mean that, for example watchdog_hardlockup_check() won't
> >call trigger_allbutcpu_cpu_backtrace() directly but
> >it would call it via sys_info() API so that sys_info()
> >could keep track of it. Something like:
> >
> >void sys_info_allbutcpu_bt(int cpu)
> >{
> >	trigger_allbutcpu_cpu_backtrace(cpu);
> >	/*
> >	 * The caller likely printed backtrace of the given @cpu
> >	 * on its own. Prevent duplicate backtraces from all
> >	 * CPUs with potential next sys_info() call.
> >	 */
> >	sys_info_done(SYS_INFO_ALL_BT);
> >}
> >
> >But I am not sure if it is really easier to follow
> >than calling sys_info_done() from the watchdog code.
> >
> >Some watchdogs try to optimize the output and print backtraces
> >only from CPUs which are relevant for the given lockup.
> >We should keep the logic for selecting the set of CPUs
> >in the watchdog code. We just need to solve how to elegantly
> >make sys_info() aware of it or at least about the more massive
> >reports.
> >
> >Anyway, I would prefer to keep it simple until we see some problems
> >in practice.
> >
> >Best Regards,
> >Petr
> >
>
>
> I understand that making a new file in the first place is scary.
>
> But I was a bit vague about what I wanted, and I'm sorry.
>
> So, the reason why I'd suggest a new file is that if any subsystem
> theoretically bypasses sys_info() to log a lockup, it completely misses
> the filter and duplicates the dump.
>
> My file would act as a generic lockless state machine that any
> subsystem can update regardless of how they dump logs.
>
> If you have any questions, feel absolutely free to ask! :)
>
> Discussion is a way to make everyone happy!
Honestly, I am more and more wondering whether you are a real person
or an AI bot.
Best Regards,
Petr
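
For readers following the thread, the "waterfall" idea quoted above (handle
each level in turn, remember what has already been dumped in a global
variable, skip it at the next level) could be sketched in user-space C as
shown here. This is only an illustration under stated assumptions, not the
code from the patch set: every demo_* name and DEMO_* flag is hypothetical,
the real SYS_INFO_* values and sys_info() internals are not reproduced, and
using a C11 atomic OR to keep the global mask lockless is an assumption on
top of what the thread says.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical flags; they only stand in for the kernel's SYS_INFO_* bits. */
#define DEMO_SYS_INFO_ALL_BT	(1UL << 0)
#define DEMO_SYS_INFO_TIMERS	(1UL << 1)
#define DEMO_SYS_INFO_LOCKS	(1UL << 2)
#define DEMO_SYS_INFO_ALL	(DEMO_SYS_INFO_ALL_BT | DEMO_SYS_INFO_TIMERS | \
				 DEMO_SYS_INFO_LOCKS)

/* Global "already dumped" state, updated without any lock. */
static atomic_ulong demo_done_mask;

/* Record that the caller printed the reports in @mask on its own. */
static void demo_sys_info_done(unsigned long mask)
{
	atomic_fetch_or(&demo_done_mask, mask);
}

/*
 * Print the reports in @mask that no earlier level has claimed.
 * The bits are claimed before printing, so a racing caller skips them.
 * Returns the subset that was actually printed by this call.
 */
static unsigned long demo_sys_info(unsigned long mask)
{
	unsigned long old, todo;

	assert((mask & ~DEMO_SYS_INFO_ALL) == 0);

	old = atomic_fetch_or(&demo_done_mask, mask);
	todo = mask & ~old;

	if (todo & DEMO_SYS_INFO_ALL_BT)
		puts("dump: backtraces from all CPUs");
	if (todo & DEMO_SYS_INFO_TIMERS)
		puts("dump: timers");
	if (todo & DEMO_SYS_INFO_LOCKS)
		puts("dump: locks");

	return todo;
}

/* Modelled on the sys_info_allbutcpu_bt() sketch quoted in the mail. */
static void demo_sys_info_allbutcpu_bt(int cpu)
{
	printf("dump: backtraces from all CPUs except CPU %d\n", cpu);
	demo_sys_info_done(DEMO_SYS_INFO_ALL_BT);
}
```

In this sketch a watchdog would call demo_sys_info_allbutcpu_bt() first, and
a later panic-level demo_sys_info() call asking for backtraces and timers
would print only the timers. Claiming the bits before printing means a report
interrupted halfway is not retried; whether that trade-off is acceptable is
exactly the kind of gain vs. complexity question raised in the thread.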