From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, "Darrick J . Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@infradead.org>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>
Subject: Re: [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll
Date: Tue, 07 May 2024 12:52:49 +0530 [thread overview]
Message-ID: <87h6fanl12.fsf@gmail.com> (raw)
Dave Chinner <david@fromorbit.com> writes:
> On Tue, May 07, 2024 at 10:43:07AM +0530, Ritesh Harjani (IBM) wrote:
>> An async dio write to a sparse file can generate a lot of extents,
>> and when we unlink this file (using rm), the kernel can be busy
>> unmapping and freeing those extents as part of transaction processing.
>>
>> Similarly, the xfs reflink remapping path also goes through
>> xfs_defer_finish_noroll() for transaction processing. Since we can
>> busy loop in this function, let's add cond_resched() to avoid
>> softlockup messages like these.
>
> You proposed this change two weeks ago: I said:
>
> "The problem is the number of extents being processed without
> yielding, not the time spent processing each individual deferred
> work chain to free the extent. Hence the explicit rescheduling
> should be at the top level loop where it can be easily explained and
> understand, not hidden deep inside the defer chain mechanism...."
>
> https://lore.kernel.org/linux-xfs/ZiWjbWrD60W%2F0s%2FF@dread.disaster.area/
>
>> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435]
>> CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S L 6.9.0-rc5-0-default #1
>> Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker
>> NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290
>> LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290
>> Call Trace:
>> xfs_alloc_get_rec+0x54/0x1b0 (unreliable)
>> xfs_alloc_compute_aligned+0x5c/0x144
>> xfs_alloc_ag_vextent_size+0x238/0x8d4
>> xfs_alloc_fix_freelist+0x540/0x694
>> xfs_free_extent_fix_freelist+0x84/0xe0
>> __xfs_free_extent+0x74/0x1ec
>> xfs_extent_free_finish_item+0xcc/0x214
>> xfs_defer_finish_one+0x194/0x388
>> xfs_defer_finish_noroll+0x1b4/0x5c8
>> xfs_defer_finish+0x2c/0xc4
>> xfs_bunmapi_range+0xa4/0x100
>> xfs_itruncate_extents_flags+0x1b8/0x2f4
>> xfs_inactive_truncate+0xe0/0x124
>> xfs_inactive+0x30c/0x3e0
>
> This is the same issue as you originally reported two weeks ago. My
> comments about it still stand.
>
>> xfs_inodegc_worker+0x140/0x234
>> process_scheduled_works+0x240/0x57c
>> worker_thread+0x198/0x468
>> kthread+0x138/0x140
>> start_kernel_thread+0x14/0x18
>>
>> run fstests generic/175 at 2024-02-02 04:40:21
>> [ C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>> watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>> CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0
>> NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs]
>> LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs]
>> Call Trace:
>> 0xc000000014107c00 (unreliable)
>> __xfs_btree_updkeys+0x8c/0x2c0 [xfs]
>> xfs_btree_update_keys+0x150/0x170 [xfs]
>> xfs_btree_lshift+0x534/0x660 [xfs]
>> xfs_btree_make_block_unfull+0x19c/0x240 [xfs]
>> xfs_btree_insrec+0x4e4/0x630 [xfs]
>> xfs_btree_insert+0x104/0x2d0 [xfs]
>> xfs_rmap_insert+0xc4/0x260 [xfs]
>> xfs_rmap_map_shared+0x228/0x630 [xfs]
>> xfs_rmap_finish_one+0x2d4/0x350 [xfs]
>> xfs_rmap_update_finish_item+0x44/0xc0 [xfs]
>> xfs_defer_finish_noroll+0x2e4/0x740 [xfs]
>> __xfs_trans_commit+0x1f4/0x400 [xfs]
>> xfs_reflink_remap_extent+0x2d8/0x650 [xfs]
>> xfs_reflink_remap_blocks+0x154/0x320 [xfs]
>
> And this is the same case of iterating millions of extents without
> ever blocking on a lock, a buffer cache miss, transaction
> reservation, etc. IOWs, xfs_reflink_remap_blocks() is another high
> level extent iterator that needs the cond_resched() calls.
Ok. I was thinking that if we added this to xfs_defer_finish_noroll(),
we could take care of all such cases in one place. But I agree with your
point here. I will test adding cond_resched() to the
xfs_reflink_remap_blocks() loop and re-submit this patch.
>
> I can't wait for the kernel to always be pre-emptible so this whole
> class of nasty hacks can go away forever. But in the mean time, they
> still need to be placed appropriately.
Sure Dave. Thanks for the review!
-ritesh
Thread overview: 3+ messages
2024-05-07 7:22 Ritesh Harjani [this message]
2024-05-07 5:13 [PATCHv3 0/1] xfs: soft lockups in unmapping and reflink remapping path Ritesh Harjani (IBM)
2024-05-07 5:13 ` [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll Ritesh Harjani (IBM)
2024-05-07 6:06 ` Dave Chinner