From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, "Darrick J . Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@infradead.org>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>
Subject: Re: [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll
Date: Tue, 07 May 2024 12:52:49 +0530 [thread overview]
Message-ID: <87h6fanl12.fsf@gmail.com> (raw)
Dave Chinner <david@fromorbit.com> writes:
> On Tue, May 07, 2024 at 10:43:07AM +0530, Ritesh Harjani (IBM) wrote:
>> An async dio write to a sparse file can generate a lot of extents
>> and when we unlink this file (using rm), the kernel can be busy in umapping
>> and freeing those extents as part of transaction processing.
>>
>> Similarly xfs reflink remapping path also goes through
>> xfs_defer_finish_noroll() for transaction processing. Since we can busy loop
>> in this function, so let's add cond_resched() to avoid softlockup messages
>> like these.
>
> You proposed this change two weeks ago: I said:
>
> "The problem is the number of extents being processed without
> yielding, not the time spent processing each individual deferred
> work chain to free the extent. Hence the explicit rescheduling
> should be at the top level loop where it can be easily explained and
> understand, not hidden deep inside the defer chain mechanism...."
>
> https://lore.kernel.org/linux-xfs/ZiWjbWrD60W%2F0s%2FF@dread.disaster.area/
>
>> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435]
>> CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S L 6.9.0-rc5-0-default #1
>> Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker
>> NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290
>> LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290
>> Call Trace:
>> xfs_alloc_get_rec+0x54/0x1b0 (unreliable)
>> xfs_alloc_compute_aligned+0x5c/0x144
>> xfs_alloc_ag_vextent_size+0x238/0x8d4
>> xfs_alloc_fix_freelist+0x540/0x694
>> xfs_free_extent_fix_freelist+0x84/0xe0
>> __xfs_free_extent+0x74/0x1ec
>> xfs_extent_free_finish_item+0xcc/0x214
>> xfs_defer_finish_one+0x194/0x388
>> xfs_defer_finish_noroll+0x1b4/0x5c8
>> xfs_defer_finish+0x2c/0xc4
>> xfs_bunmapi_range+0xa4/0x100
>> xfs_itruncate_extents_flags+0x1b8/0x2f4
>> xfs_inactive_truncate+0xe0/0x124
>> xfs_inactive+0x30c/0x3e0
>
> This is the same issue as you originally reported two weeks ago. My
> comments about it still stand.
>
>> xfs_inodegc_worker+0x140/0x234
>> process_scheduled_works+0x240/0x57c
>> worker_thread+0x198/0x468
>> kthread+0x138/0x140
>> start_kernel_thread+0x14/0x18
>>
>> run fstests generic/175 at 2024-02-02 04:40:21
>> [ C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>> watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679]
>> CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0
>> NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs]
>> LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs]
>> Call Trace:
>> 0xc000000014107c00 (unreliable)
>> __xfs_btree_updkeys+0x8c/0x2c0 [xfs]
>> xfs_btree_update_keys+0x150/0x170 [xfs]
>> xfs_btree_lshift+0x534/0x660 [xfs]
>> xfs_btree_make_block_unfull+0x19c/0x240 [xfs]
>> xfs_btree_insrec+0x4e4/0x630 [xfs]
>> xfs_btree_insert+0x104/0x2d0 [xfs]
>> xfs_rmap_insert+0xc4/0x260 [xfs]
>> xfs_rmap_map_shared+0x228/0x630 [xfs]
>> xfs_rmap_finish_one+0x2d4/0x350 [xfs]
>> xfs_rmap_update_finish_item+0x44/0xc0 [xfs]
>> xfs_defer_finish_noroll+0x2e4/0x740 [xfs]
>> __xfs_trans_commit+0x1f4/0x400 [xfs]
>> xfs_reflink_remap_extent+0x2d8/0x650 [xfs]
>> xfs_reflink_remap_blocks+0x154/0x320 [xfs]
>
> And this is the same case of iterating millions of extents without
> ever blocking on a lock, a buffer cache miss, transaction
> reservation, etc. IOWs, xfs_reflink_remap_blocks() is another high
> level extent iterator that needs the cond_resched() calls.
ok. I was thinking if we add this to xfs_defer_finish_noroll(), then we
can just take care of all such cases. But I agree with your point here.
I will test adding cond_resched() to xfs_relink_remap_blocks() loop
and will re-submit this patch.
>
> I can't wait for the kernel to always be pre-emptible so this whole
> class of nasty hacks can go away forever. But in the mean time, they
> still need to be placed appropriately.
Sure Dave. Thanks for the review!
-ritesh
next reply other threads:[~2024-05-07 7:22 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-07 7:22 Ritesh Harjani [this message]
-- strict thread matches above, loose matches on Subject: below --
2024-05-07 5:13 [PATCHv3 0/1] xfs: soft lockups in unmapping and reflink remapping path Ritesh Harjani (IBM)
2024-05-07 5:13 ` [PATCHv3 1/1] xfs: Add cond_resched to xfs_defer_finish_noroll Ritesh Harjani (IBM)
2024-05-07 6:06 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h6fanl12.fsf@gmail.com \
--to=ritesh.list@gmail.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=hch@infradead.org \
--cc=linux-xfs@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.