From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: zlang@redhat.com, fstests@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO
Date: Wed, 30 Jul 2025 07:55:09 -0700 [thread overview]
Message-ID: <20250730145509.GX2672039@frogsfrogsfrogs> (raw)
In-Reply-To: <aIoq3LgL2ODgENFy@infradead.org>
On Wed, Jul 30, 2025 at 07:23:24AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 29, 2025 at 01:10:50PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > First, start with the premise that fstests is run with a nonzero limit
> > on the size of core dumps so that we can capture the state of
> > misbehaving fs utilities like fsck and scrub if they crash.
>
> Can you explain what this has to do with core dumping?
>
> I'm just really confused between this patch content and the subject of
> this patch and the entire series..
It's a bugfix ahead of new behaviors introduced in patch 2. I clearly
didn't explain this well enough, so I'll try again.
Before abrt/systemd-coredump, FS_IOC_SHUTDOWN fsstress tests would do
something like the following:
1. start fsstress, which chdirs to $TEST_DIR
2. shut down the filesystem
3. fsstress tries to stat($TEST_DIR), fails, and calls abort
4. abort triggers coredump
5. kernel fails to write "core" to $TEST_DIR (because fs is shut down)
6. test finishes, no core files written to $here, test passes
Once you install systemd-coredump, that changes to:
same 1-4 above
5. kernel pipes core file to coredumpctl, which writes it to /var/crash
6. test finishes, no core files written to $here, test passes
And then with patch 2 of this series, that becomes:
same 1-4 above
5. kernel pipes core file to coredumpctl, which writes it to /var/crash
6. test finishes, ./check queries coredumpctl for any new coredumps,
and copies them to $here
7. ./check finds core files written to $here, test fails
Now we've caused a test failure where there was none before, simply
because the crash reporting improved.
Therefore this patch changes fsstress not to call abort() from check_cwd
when it has a reasonable suspicion that the fs has died.
(Did that help? /me is still pre-coffee...)
> > This is really silly, because basic stat requests for the current
> > working directory can be satisfied from the inode cache without a disk
> > access. In this narrow situation, EIO only happens when the fs has shut
> > down, so just exit the program.
>
> If we think it's silly we can trivially drop the xfs_is_shutdown check
> in xfs_vn_getattr. But is it really silly? We've tried to basically
> make every file system operation consistently fail on shut down
> file systems,
No no, "really silly" refers to failing tests that we didn't used to
fail.
> > We really should have a way to query if a filesystem is shut down that
> > isn't conflated with (possibly transient) EIO errors. But for now this
> > is what we have to do. :(
>
> Well, a new STATX_ flag would work, assuming stat doesn't actually
> fail :) Otherwise a new ioctl/fcntl would make sense, especially as
> the shutdown concept has spread beyond XFS.
I think we ought to add a new ioctl or something so that callers can
positively identify a shut down filesystem. bfoster I think was asking
about that for fstests some years back, and ended up coding a bunch of
grep heuristics to work around the lack of a real call.
I think we can't drop the "stat{,x} returns EIO on shutdown fs" behavior
because I know of a few, uh, users whose heartbeat monitor periodically
queries statx($PWD) and reboots the node if it returns errno.
--D
next prev parent reply other threads:[~2025-07-30 14:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-29 20:08 [PATCHSET 3/3] fstests: integrate with coredump capturing Darrick J. Wong
2025-07-29 20:10 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
2025-07-30 14:23 ` Christoph Hellwig
2025-07-30 14:55 ` Darrick J. Wong [this message]
2025-07-29 20:11 ` [PATCH 2/2] check: collect core dumps from systemd-coredump Darrick J. Wong
2025-08-02 13:47 ` Zorro Lang
2025-08-12 18:14 ` Darrick J. Wong
2025-08-13 15:18 ` [PATCH v2 " Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2025-10-21 18:39 [PATCHSET 2/2] fstests: integrate with coredump capturing Darrick J. Wong
2025-10-21 18:42 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
2025-10-22 4:22 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250730145509.GX2672039@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=fstests@vger.kernel.org \
--cc=hch@infradead.org \
--cc=linux-xfs@vger.kernel.org \
--cc=zlang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox