From: "Darrick J. Wong" <djwong@kernel.org>
To: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: John Garry <john.g.garry@oracle.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"ojaswin@linux.ibm.com" <ojaswin@linux.ibm.com>
Subject: Re: [bug report] fstests generic/774 hang
Date: Thu, 6 Nov 2025 20:28:40 -0800
Message-ID: <20251107042840.GK196370@frogsfrogsfrogs>
In-Reply-To: <cc5yndgo6enxwtnwvcc26wdoxg3wdnnzie6lvn2mttrzkeez24@6sk5qlhlrozp>

On Fri, Nov 07, 2025 at 02:27:50AM +0000, Shinichiro Kawasaki wrote:
> On Nov 06, 2025 / 08:53, John Garry wrote:
> > > > > >
> > > > > > Shinichiro, do the other atomic writes tests run ok, like 775, 767? You
> > > > > > can check group "atomicwrites" to know which tests they are.
> > > > > >
> > > > > > 774 is the fio test.
> > >
> > > I tried the other "atomicwrites" tests. I found that g778 took a very long
> > > time, which suggests that g778 may have a similar problem to g774.
> > >
> > > g765: [not run] write atomic not supported by this block device
> > > g767: 11s
> > > g768: 13s
> > > g769: 13s
> > > g770: 35s
> > > g773: [not run] write atomic not supported by this block device
> > > g774: did not complete after a 3-hour run (and the kernel reported the INFO messages)
> > > g775: 48s
> > > g776: [not run] write atomic not supported by this block device
> > > g778: did not complete after a 50-minute run
> > > x838: [not run] External volumes not in use, skipped this test
> > > x839: [not run] XFS error injection requires CONFIG_XFS_DEBUG
> > > x840: [not run] write atomic not supported by this block device
> >
> > This is testing software-based atomic writes, and they are just slow. Very
> > slow, relative to HW-based atomic writes. And having bs=1M will make things
> > worse, as we are locking out other threads for longer (when doing the
> > write).
>
> I see, thanks for the explanation.
>
> > So I think that we should limit the file size which we try to write.
>
> This sounds reasonable, and it will make maintaining fstests runs easier.
>
> >
> > >
> > > > > >
> > > > > > Some things to try:
> > > > > > - use a physical disk for the TEST_DEV
> > >
> > > I tried using a real HDD for TEST_DEV, but still observed the hang and INFO
> > > messages at g774.
> > >
> > > > > > - Don't set LOAD_FACTOR (if you were setting it). If not, bodge 774 to
> > > > > > reduce $threads to a low value, say, 2
> > >
> > > I do not set LOAD_FACTOR. I changed the g775 script to set threads=2, and
> > > the test case then completed quickly, within a few minutes. I suspect that
> > > this short test time might hide the hang/INFO problem.
> > >
> > > > > > - try turning on XFS_DEBUG config
> > >
> > > I turned on XFS_DEBUG, and still observed the hang and the INFO messages.
> > >
> >
> > I don't think that this will help.
> >
> > > > > >
> > > > > > BTW, Darrick has posted some xfs atomics fixes @
> > > > > > https://lore.kernel.org/linux-xfs/20251105001200.GV196370@frogsfrogsfrogs/T/#t
> > > > > > . I doubt that they will help this, but worth trying.
> > >
> > > I have not yet tried this. Will try it tomorrow.
> >
> > Nor this.
>
> I confirmed it. I applied the patches to the v6.18-rc4 kernel. With this
> kernel, the hang and the INFO messages still reproduce.
>
> >
> > Even under the conditions set, the test should not hang. I
> > can check on whether we can improve the software-based atomic writes in xfs
> > to avoid this.
>
> Thanks. Would sysrq-t output help? If so, I can capture it on the hanging
> test node and share it.

Yes, anything you can share would be helpful. FWIW the test runs in 51
seconds here, but I only have 4 CPUs in the VM and fast storage, so its
file size is "only" 800MB.

--D