From: Chuck Lever <cel@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: kdevops@lists.linux.dev, Daniel Gomez <da.gomez@kruces.com>
Subject: Re: [PATCH 0/5] crash: provide a crash watchdog
Date: Sun, 20 Apr 2025 11:19:28 -0400
Message-ID: <e65ed3c5-ce07-4109-a37c-5ecd1a8c4e51@kernel.org>
In-Reply-To: <20250420054822.533987-1-mcgrof@kernel.org>
On 4/20/25 1:48 AM, Luis Chamberlain wrote:
> One of the biggest pains we've suffered with CIs has been
> crashes, filesystem corruptions. Although we have a feature
> which is *supposed* to do that, it's obviously not working.
> Fix this by adding support for this stuff, and leveraging for
> our CIs. We will start with our fstests CIs and can scale this
> out to the other ones easily now.
Hi Luis -
I haven't looked at the crash detection/reporting logic before.
Where can I find documentation for developers who want to add this
support to their workflows, or for test engineers who want to enable it?
Once I'm a little more educated, I'll review this with an eye toward
how it works for cloud kdevops.
> Luis Chamberlain (5):
> systemd-remote: use ip address for systemd-remote journal
> crash: add kernel crash watchdog library
> fstests_watchdog.py: use the new crash watchdog library
> crash_watchdog.py: add generic crash watchdog
> crash_report.py: add a crash report
>
> scripts/workflows/fstests/fstests_watchdog.py | 89 ++-
> scripts/workflows/generic/crash_report.py | 109 +++
> scripts/workflows/generic/crash_watchdog.py | 186 +++++
> scripts/workflows/generic/get_console.py | 1 +
> scripts/workflows/generic/lib | 1 +
> scripts/workflows/lib/crash.py | 724 ++++++++++++++++++
> scripts/workflows/lib/systemd_remote.py | 19 +-
> 7 files changed, 1081 insertions(+), 48 deletions(-)
> create mode 100755 scripts/workflows/generic/crash_report.py
> create mode 100755 scripts/workflows/generic/crash_watchdog.py
> create mode 120000 scripts/workflows/generic/get_console.py
> create mode 120000 scripts/workflows/generic/lib
> create mode 100755 scripts/workflows/lib/crash.py
>
--
Chuck Lever
Thread overview: 9+ messages
2025-04-20 5:48 [PATCH 0/5] crash: provide a crash watchdog Luis Chamberlain
2025-04-20 5:48 ` [PATCH 1/5] systemd-remote: use ip address for systemd-remote journal Luis Chamberlain
2025-04-20 5:48 ` [PATCH 2/5] crash: add kernel crash watchdog library Luis Chamberlain
2025-04-20 5:48 ` [PATCH 3/5] fstests_watchdog.py: use the new " Luis Chamberlain
2025-04-20 5:48 ` [PATCH 4/5] crash_watchdog.py: add generic crash watchdog Luis Chamberlain
2025-04-20 5:48 ` [PATCH 5/5] crash_report.py: add a crash report Luis Chamberlain
2025-04-20 15:19 ` Chuck Lever [this message]
2025-04-21 23:16 ` [PATCH 0/5] crash: provide a crash watchdog Luis Chamberlain
2025-04-22 2:38 ` Luis Chamberlain