From: Brian Foster <bfoster@redhat.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: [BUG] bcachefs fio lockup via generic/703
Date: Wed, 28 Feb 2024 15:13:04 -0500 [thread overview]
Message-ID: <Zd+T0EeeodCXKqQt@bfoster> (raw)
In-Reply-To: <izlhsoadddmmfyc64pc557tx3lk3vtu7ycwu3ahryh2r3o6nj5@eaqsftsh7kv3>
On Wed, Feb 28, 2024 at 03:03:06PM -0500, Kent Overstreet wrote:
> On Wed, Feb 28, 2024 at 02:47:26PM -0500, Brian Foster wrote:
> > Hi Kent,
> >
> > Firstly, I confirmed that today's master seems to avoid the splat I sent
> > previously (re: your comment about a reverse journal replay patch or
> > some such).
> >
> > I still reproduce the stall issue on this system. After peeling away at
> > it, I was eventually able to reproduce without the drop writes
> > (dm-flakey) behavior from the test, and with fio using either the libaio
> > or sync I/O engine options. The sync I/O mode fortunately provides a
> > more useful stack trace:
> >
> > # cat /proc/177747/stack
> > [<0>] bch2_dio_write_flush+0x122/0x160 [bcachefs]
> > [<0>] bch2_direct_write+0xb53/0xce0 [bcachefs]
> > [<0>] bch2_write_iter+0x142/0xc70 [bcachefs]
> > [<0>] vfs_write+0x29b/0x470
> > [<0>] ksys_write+0x6f/0xf0
> > [<0>] do_syscall_64+0x86/0x170
> > [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
> >
> > ... which resolves down to the closure_sync() call in
> > bch2_dio_write_flush(). The problem seems to go away if I remove the
> > preceding journal flush from that function. This seems to rule out
> > io_uring/aio and instead suggest that we're getting stuck somehow
> > waiting on a journal flush.
> >
> > Based on that I went back to the first commit before 746a33c96b7a0
> > ("bcachefs: better journal pipelining"). With that, I can run hundreds
> > of iterations of generic/703 without a problem, so this appears to be a
> > regression associated with the journal pipeline improvements. I'm
> > currently re-running on the last known good commit with my test tweaks
> > backed out (i.e. so back to io_uring and drop writes) just to
> > corroborate that it's the same problem, but so far it's running as
> > expected...
>
> So I suppose the journal must be getting stuck, and a journal write
> isn't completing - what does sysfs internal/journal_debug say when it happens?
>
dirty journal entries: 0/32768
seq: 11
seq_ondisk: 11
last_seq: 12
last_seq_ondisk: 11
flushed_seq_ondisk: 11
watermark: stripe
each entry reserved: 346
nr flush writes: 10
nr noflush writes: 1
average write size: 10.9 KiB
nr direct reclaim: 0
nr background reclaim: 25
reclaim kicked: 0
reclaim runs in: 0 ms
current entry sectors: 512
current entry error: ok
current entry: closed
unwritten entries:
last buf closed
replay done: 1
space:
discarded 512:244736
clean ondisk 512:244736
clean 512:244736
total 512:245760
dev 0:
nr 480
bucket size 512
available 478:266
discard_idx 1
dirty_ondisk 1 (seq 11)
dirty_idx 1 (seq 11)
cur_idx 1 (seq 11)
next prev parent reply other threads:[~2024-02-28 20:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-28 19:47 [BUG] bcachefs fio lockup via generic/703 Brian Foster
2024-02-28 20:03 ` Kent Overstreet
2024-02-28 20:13 ` Brian Foster [this message]
2024-02-28 23:43 ` Kent Overstreet
2024-02-29 0:02 ` Kent Overstreet
2024-02-29 2:59 ` Kent Overstreet
2024-02-29 15:55 ` Brian Foster
2024-02-29 16:24 ` Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zd+T0EeeodCXKqQt@bfoster \
--to=bfoster@redhat.com \
--cc=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.