From: Christoph Hellwig <hch@infradead.org>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/4] xfs: force all buffers to be written during btree bulk load
Date: Fri, 24 Nov 2023 21:49:28 -0800 [thread overview]
Message-ID: <ZWGK6Ig752wdBvwF@infradead.org> (raw)
In-Reply-To: <170086926593.2770816.5504104328549141972.stgit@frogsfrogsfrogs>
The code makes me feel we're fixing the wrong thing, but I have to
admin I don't fully understand it. Let me go step by step.
> While stress-testing online repair of btrees, I noticed periodic
> assertion failures from the buffer cache about buffer readers
> encountering buffers with DELWRI_Q set, even though the btree bulk load
> had already committed and the buffer itself wasn't on any delwri list.
What assert do these buffer reader hit? The two I can spot that are in
the read path in the broader sense are:
1) in xfs_buf_find_lock for stale buffers.
2) in __xfs_buf_submit just before I/O submission
> I traced this to a misunderstanding of how the delwri lists work,
> particularly with regards to the AIL's buffer list. If a buffer is
> logged and committed, the buffer can end up on that AIL buffer list. If
> btree repairs are run twice in rapid succession, it's possible that the
> first repair will invalidate the buffer and free it before the next time
> the AIL wakes up. This clears DELWRI_Q from the buffer state.
Where "this clears" is xfs_buf_stale called from xfs_btree_free_block
via xfs_trans_binval?
> If the second repair allocates the same block, it will then recycle the
> buffer to start writing the new btree block.
If my above theory is correct: how do we end up reusing a stale buffer?
If not, what am I misunderstanding above?
> Meanwhile, if the AIL
> wakes up and walks the buffer list, it will ignore the buffer because it
> can't lock it, and go back to sleep.
And I think this is where the trouble starts - we have a buffer that
is left on some delwri list, but with the _XBF_DELWRI_Q flag cleared,
it is stale and we then reuse it. I don't think we just need to kick
it off the delwri list just for btree staging, but in general.
>
> When the second repair calls delwri_queue to put the buffer on the
> list of buffers to write before committing the new btree, it will set
> DELWRI_Q again, but since the buffer hasn't been removed from the AIL's
> buffer list, it won't add it to the bulkload buffer's list.
>
> This is incorrect, because the bulkload caller relies on delwri_submit
> to ensure that all the buffers have been sent to disk /before/
> committing the new btree root pointer. This ordering requirement is
> required for data consistency.
>
> Worse, the AIL won't clear DELWRI_Q from the buffer when it does finally
> drop it, so the next thread to walk through the btree will trip over a
> debug assertion on that flag.
Where do it finally drop it?
> To fix this, create a new function that waits for the buffer to be
> removed from any other delwri lists before adding the buffer to the
> caller's delwri list. By waiting for the buffer to clear both the
> delwri list and any potential delwri wait list, we can be sure that
> repair will initiate writes of all buffers and report all write errors
> back to userspace instead of committing the new structure.
If my understanding above is correct this just papers over the bug
that a buffer that is marked stale and can be reused for something
else is left on a delwri list. I've entirely thought about all the
consequence, but here is what I'd try:
- if xfs_buf_find_lock finds a stale buffer with _XBF_DELWRI_Q
call your new wait code instead of asserting (probably only
for the !trylock case)
- make sure we don't leak DELWRI_Q
next prev parent reply other threads:[~2023-11-25 5:49 UTC|newest]
Thread overview: 152+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-24 23:39 [MEGAPATCHSET v28] xfs: online repair, second part of part 1 Darrick J. Wong
2023-11-24 23:44 ` [PATCHSET v28.0 0/1] xfs: prevent livelocks in xchk_iget Darrick J. Wong
2023-11-24 23:46 ` [PATCH 1/1] xfs: make xchk_iget safer in the presence of corrupt inode btrees Darrick J. Wong
2023-11-25 4:57 ` Christoph Hellwig
2023-11-27 21:55 ` Darrick J. Wong
2023-11-24 23:45 ` [PATCHSET v28.0 0/7] xfs: reserve disk space for online repairs Darrick J. Wong
2023-11-24 23:47 ` [PATCH 1/7] xfs: don't append work items to logged xfs_defer_pending objects Darrick J. Wong
2023-11-25 5:04 ` Christoph Hellwig
2023-11-24 23:47 ` [PATCH 2/7] xfs: allow pausing of pending deferred work items Darrick J. Wong
2023-11-25 5:05 ` Christoph Hellwig
2023-11-24 23:47 ` [PATCH 3/7] xfs: remove __xfs_free_extent_later Darrick J. Wong
2023-11-25 5:05 ` Christoph Hellwig
2023-11-24 23:47 ` [PATCH 4/7] xfs: automatic freeing of freshly allocated unwritten space Darrick J. Wong
2023-11-25 5:06 ` Christoph Hellwig
2023-11-24 23:48 ` [PATCH 5/7] xfs: implement block reservation accounting for btrees we're staging Darrick J. Wong
2023-11-26 13:14 ` Christoph Hellwig
2023-11-27 22:34 ` Darrick J. Wong
2023-11-28 5:41 ` Christoph Hellwig
2023-11-28 17:02 ` Darrick J. Wong
2023-11-24 23:48 ` [PATCH 6/7] xfs: log EFIs for all btree blocks being used to stage a btree Darrick J. Wong
2023-11-26 13:15 ` Christoph Hellwig
2023-11-24 23:48 ` [PATCH 7/7] xfs: force small EFIs for reaping btree extents Darrick J. Wong
2023-11-25 5:13 ` Christoph Hellwig
2023-11-27 22:46 ` Darrick J. Wong
2023-11-24 23:45 ` [PATCHSET v28.0 0/4] xfs: prepare repair for bulk loading Darrick J. Wong
2023-11-24 23:48 ` [PATCH 1/4] xfs: force all buffers to be written during btree bulk load Darrick J. Wong
2023-11-25 5:49 ` Christoph Hellwig [this message]
2023-11-28 1:50 ` Darrick J. Wong
2023-11-28 7:13 ` Christoph Hellwig
2023-11-28 15:18 ` Christoph Hellwig
2023-11-28 17:07 ` Darrick J. Wong
2023-11-30 4:33 ` Christoph Hellwig
2023-11-24 23:49 ` [PATCH 2/4] xfs: add debug knobs to control btree bulk load slack factors Darrick J. Wong
2023-11-25 5:50 ` Christoph Hellwig
2023-11-28 1:44 ` Darrick J. Wong
2023-11-28 5:42 ` Christoph Hellwig
2023-11-28 17:07 ` Darrick J. Wong
2023-11-24 23:49 ` [PATCH 3/4] xfs: move btree bulkload record initialization to ->get_record implementations Darrick J. Wong
2023-11-25 5:51 ` Christoph Hellwig
2023-11-24 23:49 ` [PATCH 4/4] xfs: constrain dirty buffers while formatting a staged btree Darrick J. Wong
2023-11-25 5:53 ` Christoph Hellwig
2023-11-27 22:56 ` Darrick J. Wong
2023-11-28 20:11 ` Darrick J. Wong
2023-11-29 5:50 ` Christoph Hellwig
2023-11-29 5:57 ` Darrick J. Wong
2023-11-24 23:45 ` [PATCHSET v28.0 0/5] xfs: online repair of AG btrees Darrick J. Wong
2023-11-24 23:50 ` [PATCH 1/5] xfs: create separate structures and code for u32 bitmaps Darrick J. Wong
2023-11-25 5:57 ` Christoph Hellwig
2023-11-28 1:34 ` Darrick J. Wong
2023-11-28 5:43 ` Christoph Hellwig
2023-11-24 23:50 ` [PATCH 2/5] xfs: roll the scrub transaction after completing a repair Darrick J. Wong
2023-11-25 6:05 ` Christoph Hellwig
2023-11-28 1:29 ` Darrick J. Wong
2023-11-24 23:50 ` [PATCH 3/5] xfs: repair free space btrees Darrick J. Wong
2023-11-25 6:11 ` Christoph Hellwig
2023-11-28 1:05 ` Darrick J. Wong
2023-11-28 15:10 ` Christoph Hellwig
2023-11-28 21:13 ` Darrick J. Wong
2023-11-29 5:56 ` Christoph Hellwig
2023-11-29 6:18 ` Darrick J. Wong
2023-11-29 6:24 ` Christoph Hellwig
2023-11-29 6:26 ` Darrick J. Wong
2023-11-24 23:50 ` [PATCH 4/5] xfs: repair inode btrees Darrick J. Wong
2023-11-25 6:12 ` Christoph Hellwig
2023-11-28 1:09 ` Darrick J. Wong
2023-11-28 15:57 ` Christoph Hellwig
2023-11-28 21:37 ` Darrick J. Wong
2023-11-24 23:51 ` [PATCH 5/5] xfs: repair refcount btrees Darrick J. Wong
2023-11-28 16:07 ` Christoph Hellwig
2023-11-24 23:45 ` [PATCHSET v28.0 0/7] xfs: online repair of inodes and forks Darrick J. Wong
2023-11-24 23:51 ` [PATCH 1/7] xfs: disable online repair quota helpers when quota not enabled Darrick J. Wong
2023-11-25 6:13 ` Christoph Hellwig
2023-11-24 23:51 ` [PATCH 2/7] xfs: try to attach dquots to files before repairing them Darrick J. Wong
2023-11-25 6:14 ` Christoph Hellwig
2023-11-24 23:51 ` [PATCH 3/7] xfs: repair inode records Darrick J. Wong
2023-11-28 17:08 ` Christoph Hellwig
2023-11-28 23:08 ` Darrick J. Wong
2023-11-29 6:02 ` Christoph Hellwig
2023-12-05 23:08 ` Darrick J. Wong
2023-12-06 5:16 ` Christoph Hellwig
2023-11-24 23:52 ` [PATCH 4/7] xfs: zap broken inode forks Darrick J. Wong
2023-11-30 4:44 ` Christoph Hellwig
2023-11-30 21:08 ` Darrick J. Wong
2023-12-04 4:39 ` Christoph Hellwig
2023-12-04 20:43 ` Darrick J. Wong
2023-12-05 4:28 ` Christoph Hellwig
2023-11-24 23:52 ` [PATCH 5/7] xfs: abort directory parent scrub scans if we encounter a zapped directory Darrick J. Wong
2023-11-30 4:47 ` Christoph Hellwig
2023-11-30 21:37 ` Darrick J. Wong
2023-12-04 4:41 ` Christoph Hellwig
2023-12-04 20:44 ` Darrick J. Wong
2023-11-24 23:52 ` [PATCH 6/7] xfs: skip the rmapbt search on an empty attr fork unless we know it was zapped Darrick J. Wong
2023-11-24 23:52 ` [PATCH 7/7] xfs: repair obviously broken inode modes Darrick J. Wong
2023-11-30 4:49 ` Christoph Hellwig
2023-11-30 21:18 ` Darrick J. Wong
2023-12-04 4:42 ` Christoph Hellwig
2023-11-24 23:46 ` [PATCHSET v28.0 0/5] xfs: online repair of file fork mappings Darrick J. Wong
2023-11-24 23:53 ` [PATCH 1/5] xfs: reintroduce reaping of file metadata blocks to xrep_reap_extents Darrick J. Wong
2023-11-30 4:53 ` Christoph Hellwig
2023-11-30 21:48 ` Darrick J. Wong
2023-12-04 4:42 ` Christoph Hellwig
2023-11-24 23:53 ` [PATCH 2/5] xfs: repair inode fork block mapping data structures Darrick J. Wong
2023-11-30 5:07 ` Christoph Hellwig
2023-12-01 1:38 ` Darrick J. Wong
2023-11-24 23:53 ` [PATCH 3/5] xfs: refactor repair forcing tests into a repair.c helper Darrick J. Wong
2023-11-28 14:20 ` Christoph Hellwig
2023-11-29 5:42 ` Darrick J. Wong
2023-11-29 6:03 ` Christoph Hellwig
2023-11-24 23:53 ` [PATCH 4/5] xfs: create a ranged query function for refcount btrees Darrick J. Wong
2023-11-28 13:59 ` Christoph Hellwig
2023-11-24 23:54 ` [PATCH 5/5] xfs: repair problems in CoW forks Darrick J. Wong
2023-11-30 5:10 ` Christoph Hellwig
2023-11-24 23:46 ` [PATCHSET v28.0 0/6] xfs: online repair of rt bitmap file Darrick J. Wong
2023-11-24 23:54 ` [PATCH 1/6] xfs: check rt bitmap file geometry more thoroughly Darrick J. Wong
2023-11-28 14:04 ` Christoph Hellwig
2023-11-28 23:27 ` Darrick J. Wong
2023-11-29 6:05 ` Christoph Hellwig
2023-11-29 6:20 ` Darrick J. Wong
2023-11-24 23:54 ` [PATCH 2/6] xfs: check rt summary " Darrick J. Wong
2023-11-28 14:05 ` Christoph Hellwig
2023-11-28 23:30 ` Darrick J. Wong
2023-11-29 1:23 ` Darrick J. Wong
2023-11-29 6:05 ` Christoph Hellwig
2023-11-29 6:21 ` Darrick J. Wong
2023-11-29 6:23 ` Christoph Hellwig
2023-11-30 0:10 ` Darrick J. Wong
2023-11-24 23:54 ` [PATCH 3/6] xfs: always check the rtbitmap and rtsummary files Darrick J. Wong
2023-11-28 14:06 ` Christoph Hellwig
2023-11-24 23:55 ` [PATCH 4/6] xfs: repair the inode core and forks of a metadata inode Darrick J. Wong
2023-11-30 5:12 ` Christoph Hellwig
2023-11-24 23:55 ` [PATCH 5/6] xfs: create a new inode fork block unmap helper Darrick J. Wong
2023-11-25 6:17 ` Christoph Hellwig
2023-11-24 23:55 ` [PATCH 6/6] xfs: online repair of realtime bitmaps Darrick J. Wong
2023-11-30 5:16 ` Christoph Hellwig
2023-11-24 23:46 ` [PATCHSET v28.0 0/5] xfs: online repair of quota and rt metadata files Darrick J. Wong
2023-11-24 23:56 ` [PATCH 1/5] xfs: check the ondisk space mapping behind a dquot Darrick J. Wong
2023-11-30 5:17 ` Christoph Hellwig
2023-11-24 23:56 ` [PATCH 2/5] xfs: check dquot resource timers Darrick J. Wong
2023-11-30 5:17 ` Christoph Hellwig
2023-11-24 23:56 ` [PATCH 3/5] xfs: pull xfs_qm_dqiterate back into scrub Darrick J. Wong
2023-11-30 5:22 ` Christoph Hellwig
2023-11-24 23:56 ` [PATCH 4/5] xfs: improve dquot iteration for scrub Darrick J. Wong
2023-11-30 5:25 ` Christoph Hellwig
2023-11-24 23:57 ` [PATCH 5/5] xfs: repair quotas Darrick J. Wong
2023-11-30 5:33 ` Christoph Hellwig
2023-11-30 22:10 ` Darrick J. Wong
2023-12-04 4:48 ` Christoph Hellwig
2023-12-04 20:52 ` Darrick J. Wong
2023-12-05 4:27 ` Christoph Hellwig
2023-12-05 5:20 ` Darrick J. Wong
2023-12-04 4:49 ` Christoph Hellwig
-- strict thread matches above, loose matches on Subject: below --
2023-09-26 23:29 [PATCHSET v27.0 0/4] xfs: prepare repair for bulk loading Darrick J. Wong
2023-09-26 23:33 ` [PATCH 1/4] xfs: force all buffers to be written during btree bulk load Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZWGK6Ig752wdBvwF@infradead.org \
--to=hch@infradead.org \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).