public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: zlang@redhat.com, fstests@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO
Date: Wed, 30 Jul 2025 07:55:09 -0700	[thread overview]
Message-ID: <20250730145509.GX2672039@frogsfrogsfrogs> (raw)
In-Reply-To: <aIoq3LgL2ODgENFy@infradead.org>

On Wed, Jul 30, 2025 at 07:23:24AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 29, 2025 at 01:10:50PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > First, start with the premise that fstests is run with a nonzero limit
> > on the size of core dumps so that we can capture the state of
> > misbehaving fs utilities like fsck and scrub if they crash.
> 
> Can you explain what this has to do with core dumping?
> 
> I'm just really confused between this patch content and the subject of
> this patch and the entire series..

It's a bugfix ahead of new behaviors introduced in patch 2.  I clearly
didn't explain this well enough, so I'll try again.

Before abrt/systemd-coredump, FS_IOC_SHUTDOWN fsstress tests would do
something like the following:

1. start fsstress, which chdirs to $TEST_DIR
2. shut down the filesystem
3. fsstress tries to stat($TEST_DIR), fails, and calls abort
4. abort triggers coredump
5. kernel fails to write "core" to $TEST_DIR (because fs is shut down)
6. test finishes, no core files written to $here, test passes

Once you install systemd-coredump, that changes to:

same 1-4 above
5. kernel pipes core file to coredumpctl, which writes it to /var/crash
6. test finishes, no core files written to $here, test passes

And then with patch 2 of this series, that becomes:

same 1-4 above
5. kernel pipes core file to coredumpctl, which writes it to /var/crash
6. test finishes, ./check queries coredumpctl for any new coredumps,
   and copies them to $here
7. ./check finds core files written to $here, test fails

Now we've caused a test failure where there was none before, simply
because the crash reporting improved.

Therefore this patch changes fsstress not to call abort() from check_cwd
when it has a reasonable suspicion that the fs has died.

(Did that help?  /me is still pre-coffee...)

> > This is really silly, because basic stat requests for the current
> > working directory can be satisfied from the inode cache without a disk
> > access.  In this narrow situation, EIO only happens when the fs has shut
> > down, so just exit the program.
> 
> If we think it's silly we can trivially drop the xfs_is_shutdown check
> in xfs_vn_getattr.  But is it really silly?  We've tried to basically
> make every file system operation consistently fail on shut down
> file systems,

No no, "really silly" refers to failing tests that we didn't used to
fail.

> > We really should have a way to query if a filesystem is shut down that
> > isn't conflated with (possibly transient) EIO errors.  But for now this
> > is what we have to do. :(
> 
> Well, a new STATX_ flag would work, assuming stat doesn't actually
> fail :)  Otherwise a new ioctl/fcntl would make sense, especially as
> the shutdown concept has spread beyond XFS.

I think we ought to add a new ioctl or something so that callers can
positively identify a shut down filesystem.  bfoster I think was asking
about that for fstests some years back, and ended up coding a bunch of
grep heuristics to work around the lack of a real call.

I think we can't drop the "stat{,x} returns EIO on shutdown fs" behavior
because I know of a few, uh, users whose heartbeat monitor periodically
queries statx($PWD) and reboots the node if it returns errno.

--D

  reply	other threads:[~2025-07-30 14:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-29 20:08 [PATCHSET 3/3] fstests: integrate with coredump capturing Darrick J. Wong
2025-07-29 20:10 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
2025-07-30 14:23   ` Christoph Hellwig
2025-07-30 14:55     ` Darrick J. Wong [this message]
2025-07-29 20:11 ` [PATCH 2/2] check: collect core dumps from systemd-coredump Darrick J. Wong
2025-08-02 13:47   ` Zorro Lang
2025-08-12 18:14     ` Darrick J. Wong
2025-08-13 15:18   ` [PATCH v2 " Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2025-10-21 18:39 [PATCHSET 2/2] fstests: integrate with coredump capturing Darrick J. Wong
2025-10-21 18:42 ` [PATCH 1/2] fsstress: don't abort when stat(".") returns EIO Darrick J. Wong
2025-10-22  4:22   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250730145509.GX2672039@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=fstests@vger.kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox