* [PATCHSET 3/3] fstests: integrate with coredump capturing
@ 2025-07-29 20:08 Darrick J. Wong
  2025-07-29 20:10 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
  2025-07-29 20:11 ` [PATCH 2/2] check: collect core dumps from systemd-coredump Darrick J. Wong
  0 siblings, 2 replies; 8+ messages in thread
From: Darrick J. Wong @ 2025-07-29 20:08 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, linux-xfs

Hi all,

Integrate fstests with coredump capturing tools such as
systemd-coredump.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.

--D

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=coredump-capture
---
Commits in this patchset:
 * fsstress: don't abort when stat(".") returns EIO
 * check: collect core dumps from systemd-coredump
---
 check          |    2 ++
 common/rc      |   44 ++++++++++++++++++++++++++++++++++++++++++++
 ltp/fsstress.c |   15 ++++++++++++++-
 3 files changed, 60 insertions(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO
  2025-07-29 20:08 [PATCHSET 3/3] fstests: integrate with coredump capturing Darrick J. Wong
@ 2025-07-29 20:10 ` Darrick J. Wong
  2025-07-30 14:23   ` Christoph Hellwig
  2025-07-29 20:11 ` [PATCH 2/2] check: collect core dumps from systemd-coredump Darrick J. Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2025-07-29 20:10 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

First, start with the premise that fstests is run with a nonzero limit
on the size of core dumps so that we can capture the state of
misbehaving fs utilities like fsck and scrub if they crash.

When fsstress is compiled with DEBUG defined (which is the default), it
will periodically call check_cwd to ensure that the current working
directory hasn't changed out from underneath it.  If the filesystem is
XFS and it shuts down, the stat64() calls will start returning EIO.  In
this case, we follow the out: label and call abort() to exit the
program.

Historically this did not produce any core dumps because $PWD is on the
dead filesystem and the write fails.  However, modern systems are often
configured to capture coredumps using some external mechanism, e.g.
abrt/systemd-coredump.  In this case, the capture tool will succeed in
capturing every crashed process, which fills the crash dump directory
with a lot of useless junk.  Worse, if the capture tool is configured
to pass the dumps to fstests, it will flag the test as failed because
something dumped core.

This is really silly, because basic stat requests for the current
working directory can be satisfied from the inode cache without a disk
access.  In this narrow situation, EIO only happens when the fs has
shut down, so just exit the program.

We really should have a way to query if a filesystem is shut down that
isn't conflated with (possibly transient) EIO errors.  But for now this
is what we have to do. :(

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 ltp/fsstress.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)


diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index 8dbfb81f95a538..d4abe561787f19 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -1049,8 +1049,21 @@ check_cwd(void)
 
 	ret = stat64(".", &statbuf);
 	if (ret != 0) {
+		int error = errno;
+
 		fprintf(stderr,
 			"fsstress: check_cwd stat64() returned %d with errno: %d (%s)\n",
-			ret, errno, strerror(errno));
+			ret, error, strerror(error));
+
+		/*
+		 * The current working directory is pinned in memory, which
+		 * means that stat should not have had to do any disk accesses
+		 * to retrieve stat information.  Treat an EIO as an indication
+		 * that the filesystem shut down and exit instead of dumping
+		 * core like the abort() below does.
+		 */
+		if (error == EIO)
+			exit(1);
+
 		goto out;
 	}
* Re: [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO
  2025-07-29 20:10 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
@ 2025-07-30 14:23   ` Christoph Hellwig
  2025-07-30 14:55     ` Darrick J. Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2025-07-30 14:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: zlang, fstests, linux-xfs

On Tue, Jul 29, 2025 at 01:10:50PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> First, start with the premise that fstests is run with a nonzero limit
> on the size of core dumps so that we can capture the state of
> misbehaving fs utilities like fsck and scrub if they crash.

Can you explain what this has to do with core dumping?

I'm just really confused between this patch content and the subject of
this patch and the entire series..

> This is really silly, because basic stat requests for the current
> working directory can be satisfied from the inode cache without a disk
> access.  In this narrow situation, EIO only happens when the fs has
> shut down, so just exit the program.

If we think it's silly we can trivially drop the xfs_is_shutdown check
in xfs_vn_getattr.  But is it really silly?  We've tried to basically
make every file system operation consistently fail on shut down
file systems.

> We really should have a way to query if a filesystem is shut down that
> isn't conflated with (possibly transient) EIO errors.  But for now this
> is what we have to do. :(

Well, a new STATX_ flag would work, assuming stat doesn't actually
fail :)  Otherwise a new ioctl/fcntl would make sense, especially as
the shutdown concept has spread beyond XFS.
* Re: [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO
  2025-07-30 14:23   ` Christoph Hellwig
@ 2025-07-30 14:55     ` Darrick J. Wong
  0 siblings, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2025-07-30 14:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: zlang, fstests, linux-xfs

On Wed, Jul 30, 2025 at 07:23:24AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 29, 2025 at 01:10:50PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > First, start with the premise that fstests is run with a nonzero limit
> > on the size of core dumps so that we can capture the state of
> > misbehaving fs utilities like fsck and scrub if they crash.
> 
> Can you explain what this has to do with core dumping?
> 
> I'm just really confused between this patch content and the subject of
> this patch and the entire series..

It's a bugfix ahead of new behaviors introduced in patch 2.  I clearly
didn't explain this well enough, so I'll try again.

Before abrt/systemd-coredump, FS_IOC_SHUTDOWN fsstress tests would do
something like the following:

1. start fsstress, which chdirs to $TEST_DIR
2. shut down the filesystem
3. fsstress tries to stat($TEST_DIR), fails, and calls abort
4. abort triggers coredump
5. kernel fails to write "core" to $TEST_DIR (because fs is shut down)
6. test finishes, no core files written to $here, test passes

Once you install systemd-coredump, that changes to:

same 1-4 above
5. kernel pipes core file to coredumpctl, which writes it to /var/crash
6. test finishes, no core files written to $here, test passes

And then with patch 2 of this series, that becomes:

same 1-4 above
5. kernel pipes core file to coredumpctl, which writes it to /var/crash
6. test finishes, ./check queries coredumpctl for any new coredumps,
   and copies them to $here
7. ./check finds core files written to $here, test fails

Now we've caused a test failure where there was none before, simply
because the crash reporting improved.

Therefore this patch changes fsstress not to call abort() from
check_cwd when it has a reasonable suspicion that the fs has died.

(Did that help?  /me is still pre-coffee...)

> > This is really silly, because basic stat requests for the current
> > working directory can be satisfied from the inode cache without a disk
> > access.  In this narrow situation, EIO only happens when the fs has
> > shut down, so just exit the program.
> 
> If we think it's silly we can trivially drop the xfs_is_shutdown check
> in xfs_vn_getattr.  But is it really silly?  We've tried to basically
> make every file system operation consistently fail on shut down
> file systems,

No no, "really silly" refers to failing tests that we didn't use to
fail.

> > We really should have a way to query if a filesystem is shut down that
> > isn't conflated with (possibly transient) EIO errors.  But for now this
> > is what we have to do. :(
> 
> Well, a new STATX_ flag would work, assuming stat doesn't actually
> fail :)  Otherwise a new ioctl/fcntl would make sense, especially as
> the shutdown concept has spread beyond XFS.

I think we ought to add a new ioctl or something so that callers can
positively identify a shut down filesystem.  bfoster I think was asking
about that for fstests some years back, and ended up coding a bunch of
grep heuristics to work around the lack of a real call.

I think we can't drop the "stat{,x} returns EIO on shutdown fs"
behavior because I know of a few, uh, users whose heartbeat monitor
periodically queries statx($PWD) and reboots the node if it returns
errno.

--D
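[Editorial aside: whether a crashed process leaves "core" in $PWD (the pre-systemd-coredump timeline above) or is piped to a capture helper (the coredumpctl timelines) is decided by the kernel.core_pattern sysctl; a leading '|' means the kernel pipes the dump to a helper. A minimal, illustrative check, not part of this series and with made-up wording, might look like:]

```shell
# Read the active core dump disposition; a pattern starting with '|'
# means dumps are piped to a helper such as systemd-coredump instead of
# being written as a file in the crashing process's cwd.
ulimit -c unlimited 2>/dev/null || true
pattern="$(cat /proc/sys/kernel/core_pattern 2>/dev/null)"
case "$pattern" in
'|'*)	echo "core dumps piped to: ${pattern#|}" ;;	# e.g. systemd-coredump
*)	echo "core dumps written as: $pattern" ;;	# e.g. "core" in $PWD
esac
```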
* [PATCH 2/2] check: collect core dumps from systemd-coredump
  2025-07-29 20:08 [PATCHSET 3/3] fstests: integrate with coredump capturing Darrick J. Wong
  2025-07-29 20:10 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
@ 2025-07-29 20:11 ` Darrick J. Wong
  2025-08-02 13:47   ` Zorro Lang
  2025-08-13 15:18   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 2 replies; 8+ messages in thread
From: Darrick J. Wong @ 2025-07-29 20:11 UTC (permalink / raw)
  To: djwong, zlang; +Cc: fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

On modern RHEL (>=8) and Debian KDE systems, systemd-coredump can be
installed to capture core dumps from crashed programs.  If this is the
case, we would like to capture core dumps from programs that crash
during the test.  Set up an (admittedly overwrought) pipeline to
extract dumps created during the test and then capture them the same
way that we pick up "core" and "core.$pid" files.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 check     |    2 ++
 common/rc |   44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)


diff --git a/check b/check
index ce7eacb7c45d9e..77581e438c46b9 100755
--- a/check
+++ b/check
@@ -924,6 +924,7 @@ function run_section()
 			$1 == "'$seqnum'" {lasttime=" " $2 "s ... "; exit} \
 			END {printf "%s", lasttime}' "$check.time"
 		rm -f core $seqres.notrun
+		_start_coredumpctl_collection
 
 		start=`_wallclock`
 		$timestamp && _timestamp
@@ -957,6 +958,7 @@ function run_section()
 		# just "core".  Use globbing to find the most common patterns,
 		# assuming there are no other coredump capture packages set up.
 		local cores=0
+		_finish_coredumpctl_collection
 		for i in core core.*; do
 			test -f "$i" || continue
 			if ((cores++ == 0)); then
diff --git a/common/rc b/common/rc
index 04b721b7318a7e..e4c4d05387f44e 100644
--- a/common/rc
+++ b/common/rc
@@ -5034,6 +5034,50 @@ _check_kmemleak()
 	fi
 }
 
+# Current timestamp, in a format that systemd likes
+_systemd_now() {
+	timedatectl show --property=TimeUSec --value
+}
+
+# Do what we need to do to capture core dumps from coredumpctl
+_start_coredumpctl_collection() {
+	command -v coredumpctl &>/dev/null || return
+	command -v timedatectl &>/dev/null || return
+	command -v jq &>/dev/null || return
+
+	sysctl kernel.core_pattern | grep -q systemd-coredump || return
+	COREDUMPCTL_START_TIMESTAMP="$(_systemd_now)"
+}
+
+# Capture core dumps from coredumpctl.
+#
+# coredumpctl list only supports json output as a machine-readable format.  The
+# human-readable format intermingles spaces from the timestamp with actual
+# column separators, so we cannot parse that sanely.  The json output is an
+# array of:
+# {
+#         "time" : 1749744847150926,
+#         "pid" : 2297,
+#         "uid" : 0,
+#         "gid" : 0,
+#         "sig" : 6,
+#         "corefile" : "present",
+#         "exe" : "/run/fstests/e2fsprogs/fuse2fs",
+#         "size" : 47245
+# },
+# So we use jq to filter out lost corefiles, then print the pid and exe
+# separated by a pipe and hope that nobody ever puts a pipe in an executable
+# name.
+_finish_coredumpctl_collection() {
+	test -n "$COREDUMPCTL_START_TIMESTAMP" || return
+
+	coredumpctl list --since="$COREDUMPCTL_START_TIMESTAMP" --json=short 2>/dev/null | \
+		jq --raw-output 'map(select(.corefile == "present")) | map("\(.pid)|\(.exe)") | .[]' | while IFS='|' read pid exe; do
+		test -e "core.$pid" || coredumpctl dump --output="core.$pid" "$pid" "$exe" &>> $seqres.full
+	done
+	unset COREDUMPCTL_START_TIMESTAMP
+}
+
 # don't check dmesg log after test
 _disable_dmesg_check()
 {
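[Editorial aside: one subtlety in the `for i in core core.*` loop above is that when nothing matches, POSIX shells leave the unmatched pattern as the literal word "core.*", which is exactly what the `test -f "$i" || continue` guard absorbs. A quick illustration in an empty scratch directory (hypothetical, not from the patch):]

```shell
# In a directory with no core files, the glob "core.*" stays literal,
# so without the test -f guard the loop body would run once on the
# bogus name "core.*".
tmp="$(mktemp -d)" && cd "$tmp"
found=0
for i in core core.*; do
	test -f "$i" || continue	# skips "core" and the literal "core.*"
	found=$((found + 1))
done
echo "cores found: $found"
```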
* Re: [PATCH 2/2] check: collect core dumps from systemd-coredump
  2025-07-29 20:11 ` [PATCH 2/2] check: collect core dumps from systemd-coredump Darrick J. Wong
@ 2025-08-02 13:47   ` Zorro Lang
  2025-08-12 18:14     ` Darrick J. Wong
  2025-08-13 15:18   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Zorro Lang @ 2025-08-02 13:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, linux-xfs

On Tue, Jul 29, 2025 at 01:11:06PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> On modern RHEL (>=8) and Debian KDE systems, systemd-coredump can be
> installed to capture core dumps from crashed programs.  If this is the
> case, we would like to capture core dumps from programs that crash
> during the test.  Set up an (admittedly overwrought) pipeline to
> extract dumps created during the test and then capture them the same
> way that we pick up "core" and "core.$pid" files.
> 
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
>  check     |    2 ++
>  common/rc |   44 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 46 insertions(+)
> 
> 
> diff --git a/check b/check
> index ce7eacb7c45d9e..77581e438c46b9 100755
> --- a/check
> +++ b/check
> @@ -924,6 +924,7 @@ function run_section()
>  			$1 == "'$seqnum'" {lasttime=" " $2 "s ... "; exit} \
>  			END {printf "%s", lasttime}' "$check.time"
>  		rm -f core $seqres.notrun
> +		_start_coredumpctl_collection
>  
>  		start=`_wallclock`
>  		$timestamp && _timestamp
> @@ -957,6 +958,7 @@ function run_section()
>  		# just "core".  Use globbing to find the most common patterns,
>  		# assuming there are no other coredump capture packages set up.
>  		local cores=0
> +		_finish_coredumpctl_collection
>  		for i in core core.*; do
>  			test -f "$i" || continue
>  			if ((cores++ == 0)); then
> diff --git a/common/rc b/common/rc
> index 04b721b7318a7e..e4c4d05387f44e 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -5034,6 +5034,50 @@ _check_kmemleak()
>  	fi
>  }
>  
> +# Current timestamp, in a format that systemd likes
> +_systemd_now() {
> +	timedatectl show --property=TimeUSec --value
> +}
> +
> +# Do what we need to do to capture core dumps from coredumpctl
> +_start_coredumpctl_collection() {
> +	command -v coredumpctl &>/dev/null || return
> +	command -v timedatectl &>/dev/null || return
> +	command -v jq &>/dev/null || return
> +
> +	sysctl kernel.core_pattern | grep -q systemd-coredump || return

# rpm -qf `which coredumpctl`
systemd-udev-252-53.el9.x86_64
# rpm -qf `which timedatectl`
systemd-252-53.el9.x86_64
# rpm -qf `which jq`
jq-1.6-17.el9.x86_64
# rpm -qf /usr/lib/systemd/systemd-coredump
systemd-udev-252-53.el9.x86_64

So we have 3 optional runtime dependencies; how about mentioning that
in the README?

Thanks,
Zorro

> +	COREDUMPCTL_START_TIMESTAMP="$(_systemd_now)"
> +}
> +
> +# Capture core dumps from coredumpctl.
> +#
> +# coredumpctl list only supports json output as a machine-readable format.  The
> +# human-readable format intermingles spaces from the timestamp with actual
> +# column separators, so we cannot parse that sanely.  The json output is an
> +# array of:
> +# {
> +#         "time" : 1749744847150926,
> +#         "pid" : 2297,
> +#         "uid" : 0,
> +#         "gid" : 0,
> +#         "sig" : 6,
> +#         "corefile" : "present",
> +#         "exe" : "/run/fstests/e2fsprogs/fuse2fs",
> +#         "size" : 47245
> +# },
> +# So we use jq to filter out lost corefiles, then print the pid and exe
> +# separated by a pipe and hope that nobody ever puts a pipe in an executable
> +# name.
> +_finish_coredumpctl_collection() {
> +	test -n "$COREDUMPCTL_START_TIMESTAMP" || return
> +
> +	coredumpctl list --since="$COREDUMPCTL_START_TIMESTAMP" --json=short 2>/dev/null | \
> +		jq --raw-output 'map(select(.corefile == "present")) | map("\(.pid)|\(.exe)") | .[]' | while IFS='|' read pid exe; do
> +		test -e "core.$pid" || coredumpctl dump --output="core.$pid" "$pid" "$exe" &>> $seqres.full
> +	done
> +	unset COREDUMPCTL_START_TIMESTAMP
> +}
> +
>  # don't check dmesg log after test
>  _disable_dmesg_check()
>  {
* Re: [PATCH 2/2] check: collect core dumps from systemd-coredump
  2025-08-02 13:47   ` Zorro Lang
@ 2025-08-12 18:14     ` Darrick J. Wong
  0 siblings, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2025-08-12 18:14 UTC (permalink / raw)
  To: Zorro Lang; +Cc: fstests, linux-xfs

On Sat, Aug 02, 2025 at 09:47:00PM +0800, Zorro Lang wrote:
> On Tue, Jul 29, 2025 at 01:11:06PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > On modern RHEL (>=8) and Debian KDE systems, systemd-coredump can be
> > installed to capture core dumps from crashed programs.  If this is the
> > case, we would like to capture core dumps from programs that crash
> > during the test.  Set up an (admittedly overwrought) pipeline to
> > extract dumps created during the test and then capture them the same
> > way that we pick up "core" and "core.$pid" files.
> > 
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > ---
> >  check     |    2 ++
> >  common/rc |   44 ++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 46 insertions(+)
> > 
> > 
> > diff --git a/check b/check
> > index ce7eacb7c45d9e..77581e438c46b9 100755
> > --- a/check
> > +++ b/check
> > @@ -924,6 +924,7 @@ function run_section()
> >  			$1 == "'$seqnum'" {lasttime=" " $2 "s ... "; exit} \
> >  			END {printf "%s", lasttime}' "$check.time"
> >  		rm -f core $seqres.notrun
> > +		_start_coredumpctl_collection
> >  
> >  		start=`_wallclock`
> >  		$timestamp && _timestamp
> > @@ -957,6 +958,7 @@ function run_section()
> >  		# just "core".  Use globbing to find the most common patterns,
> >  		# assuming there are no other coredump capture packages set up.
> >  		local cores=0
> > +		_finish_coredumpctl_collection
> >  		for i in core core.*; do
> >  			test -f "$i" || continue
> >  			if ((cores++ == 0)); then
> > diff --git a/common/rc b/common/rc
> > index 04b721b7318a7e..e4c4d05387f44e 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -5034,6 +5034,50 @@ _check_kmemleak()
> >  	fi
> >  }
> >  
> > +# Current timestamp, in a format that systemd likes
> > +_systemd_now() {
> > +	timedatectl show --property=TimeUSec --value
> > +}
> > +
> > +# Do what we need to do to capture core dumps from coredumpctl
> > +_start_coredumpctl_collection() {
> > +	command -v coredumpctl &>/dev/null || return
> > +	command -v timedatectl &>/dev/null || return
> > +	command -v jq &>/dev/null || return
> > +
> > +	sysctl kernel.core_pattern | grep -q systemd-coredump || return
> 
> # rpm -qf `which coredumpctl`
> systemd-udev-252-53.el9.x86_64
> # rpm -qf `which timedatectl`
> systemd-252-53.el9.x86_64
> # rpm -qf `which jq`
> jq-1.6-17.el9.x86_64
> # rpm -qf /usr/lib/systemd/systemd-coredump
> systemd-udev-252-53.el9.x86_64
> 
> So we have 3 optional runtime dependencies; how about mentioning that
> in the README?

Done.

--D

> Thanks,
> Zorro
> 
> > +	COREDUMPCTL_START_TIMESTAMP="$(_systemd_now)"
> > +}
> > +
> > +# Capture core dumps from coredumpctl.
> > +#
> > +# coredumpctl list only supports json output as a machine-readable format.  The
> > +# human-readable format intermingles spaces from the timestamp with actual
> > +# column separators, so we cannot parse that sanely.  The json output is an
> > +# array of:
> > +# {
> > +#         "time" : 1749744847150926,
> > +#         "pid" : 2297,
> > +#         "uid" : 0,
> > +#         "gid" : 0,
> > +#         "sig" : 6,
> > +#         "corefile" : "present",
> > +#         "exe" : "/run/fstests/e2fsprogs/fuse2fs",
> > +#         "size" : 47245
> > +# },
> > +# So we use jq to filter out lost corefiles, then print the pid and exe
> > +# separated by a pipe and hope that nobody ever puts a pipe in an executable
> > +# name.
> > +_finish_coredumpctl_collection() {
> > +	test -n "$COREDUMPCTL_START_TIMESTAMP" || return
> > +
> > +	coredumpctl list --since="$COREDUMPCTL_START_TIMESTAMP" --json=short 2>/dev/null | \
> > +		jq --raw-output 'map(select(.corefile == "present")) | map("\(.pid)|\(.exe)") | .[]' | while IFS='|' read pid exe; do
> > +		test -e "core.$pid" || coredumpctl dump --output="core.$pid" "$pid" "$exe" &>> $seqres.full
> > +	done
> > +	unset COREDUMPCTL_START_TIMESTAMP
> > +}
> > +
> >  # don't check dmesg log after test
> >  _disable_dmesg_check()
> >  {
* [PATCH v2 2/2] check: collect core dumps from systemd-coredump
  2025-07-29 20:11 ` [PATCH 2/2] check: collect core dumps from systemd-coredump Darrick J. Wong
  2025-08-02 13:47   ` Zorro Lang
@ 2025-08-13 15:18   ` Darrick J. Wong
  1 sibling, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2025-08-13 15:18 UTC (permalink / raw)
  To: zlang; +Cc: fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

On modern RHEL (>=8) and Debian KDE systems, systemd-coredump can be
installed to capture core dumps from crashed programs.  If this is the
case, we would like to capture core dumps from programs that crash
during the test.  Set up an (admittedly overwrought) pipeline to
extract dumps created during the test and then capture them the same
way that we pick up "core" and "core.$pid" files.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
v2: update README
---
 README    |   20 ++++++++++++++++++++
 check     |    2 ++
 common/rc |   44 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+)

diff --git a/README b/README
index de452485af87a3..14e54a00c9e1a2 100644
--- a/README
+++ b/README
@@ -109,6 +109,11 @@ Ubuntu or Debian
    $ sudo apt-get install exfatprogs f2fs-tools ocfs2-tools udftools xfsdump \
 	xfslibs-dev
 
+3. Install packages for optional features:
+
+   systemd coredump capture:
+   $ sudo apt install systemd-coredump systemd jq
+
 Fedora
 ------
 
@@ -124,6 +129,11 @@ Fedora
    $ sudo yum install btrfs-progs exfatprogs f2fs-tools ocfs2-tools xfsdump \
 	xfsprogs-devel
 
+3. Install packages for optional features:
+
+   systemd coredump capture:
+   $ sudo yum install systemd systemd-udev jq
+
 RHEL or CentOS
 --------------
 
@@ -159,6 +169,11 @@ RHEL or CentOS
    For ocfs2 build and install:
    - see https://github.com/markfasheh/ocfs2-tools
 
+5. Install packages for optional features:
+
+   systemd coredump capture:
+   $ sudo yum install systemd systemd-udev jq
+
 SUSE Linux Enterprise or openSUSE
 ---------------------------------
 
@@ -176,6 +191,11 @@ SUSE Linux Enterprise or openSUSE
    For XFS install:
    $ sudo zypper install xfsdump xfsprogs-devel
 
+3. Install packages for optional features:
+
+   systemd coredump capture:
+   $ sudo zypper install systemd systemd-coredump jq
+
 Build and install test, libs and utils
 --------------------------------------
 
diff --git a/check b/check
index 7ef6c9b3d69df5..37f733d0f2afb2 100755
--- a/check
+++ b/check
@@ -924,6 +924,7 @@ function run_section()
 			$1 == "'$seqnum'" {lasttime=" " $2 "s ... "; exit} \
 			END {printf "%s", lasttime}' "$check.time"
 		rm -f core $seqres.notrun
+		_start_coredumpctl_collection
 
 		start=`_wallclock`
 		$timestamp && _timestamp
@@ -957,6 +958,7 @@ function run_section()
 		# just "core".  Use globbing to find the most common patterns,
 		# assuming there are no other coredump capture packages set up.
 		local cores=0
+		_finish_coredumpctl_collection
 		for i in core core.*; do
 			test -f "$i" || continue
 			if ((cores++ == 0)); then
diff --git a/common/rc b/common/rc
index 3b853a913bee44..335d995909f74c 100644
--- a/common/rc
+++ b/common/rc
@@ -5053,6 +5053,50 @@ _check_kmemleak()
 	fi
 }
 
+# Current timestamp, in a format that systemd likes
+_systemd_now() {
+	timedatectl show --property=TimeUSec --value
+}
+
+# Do what we need to do to capture core dumps from coredumpctl
+_start_coredumpctl_collection() {
+	command -v coredumpctl &>/dev/null || return
+	command -v timedatectl &>/dev/null || return
+	command -v jq &>/dev/null || return
+
+	sysctl kernel.core_pattern | grep -q systemd-coredump || return
+	COREDUMPCTL_START_TIMESTAMP="$(_systemd_now)"
+}
+
+# Capture core dumps from coredumpctl.
+#
+# coredumpctl list only supports json output as a machine-readable format.  The
+# human-readable format intermingles spaces from the timestamp with actual
+# column separators, so we cannot parse that sanely.  The json output is an
+# array of:
+# {
+#         "time" : 1749744847150926,
+#         "pid" : 2297,
+#         "uid" : 0,
+#         "gid" : 0,
+#         "sig" : 6,
+#         "corefile" : "present",
+#         "exe" : "/run/fstests/e2fsprogs/fuse2fs",
+#         "size" : 47245
+# },
+# So we use jq to filter out lost corefiles, then print the pid and exe
+# separated by a pipe and hope that nobody ever puts a pipe in an executable
+# name.
+_finish_coredumpctl_collection() {
+	test -n "$COREDUMPCTL_START_TIMESTAMP" || return
+
+	coredumpctl list --since="$COREDUMPCTL_START_TIMESTAMP" --json=short 2>/dev/null | \
+		jq --raw-output 'map(select(.corefile == "present")) | map("\(.pid)|\(.exe)") | .[]' | while IFS='|' read pid exe; do
+		test -e "core.$pid" || coredumpctl dump --output="core.$pid" "$pid" "$exe" &>> $seqres.full
+	done
+	unset COREDUMPCTL_START_TIMESTAMP
+}
+
 # don't check dmesg log after test
 _disable_dmesg_check()
 {
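[Editorial aside: the pid/exe handoff from jq to the shell at the end of `_finish_coredumpctl_collection` can be exercised on its own with one canned record; the pid and path below are made up, mirroring the sample JSON in the comment above:]

```shell
# Feed one fake "pid|exe" record through the same IFS='|' read parsing
# that the pipeline uses, and print the coredumpctl command it would run
# instead of actually running it.
printf '%s\n' '2297|/run/fstests/e2fsprogs/fuse2fs' | \
	while IFS='|' read -r pid exe; do
		echo "would run: coredumpctl dump --output=core.$pid $pid $exe"
	done
```

This prints: would run: coredumpctl dump --output=core.2297 2297 /run/fstests/e2fsprogs/fuse2fs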