From: "Darrick J. Wong" <djwong@kernel.org>
To: Jian Wen <wenjianhn@gmail.com>
Cc: linux-xfs@vger.kernel.org, hch@lst.de, dchinner@redhat.com,
Jian Wen <wenjian1@xiaomi.com>
Subject: Re: [PATCH] xfs: explicitly call cond_resched in xfs_itruncate_extents_flags
Date: Wed, 10 Jan 2024 09:45:01 -0800 [thread overview]
Message-ID: <20240110174501.GH722975@frogsfrogsfrogs> (raw)
In-Reply-To: <20240110071347.3711925-1-wenjian1@xiaomi.com>
On Wed, Jan 10, 2024 at 03:13:47PM +0800, Jian Wen wrote:
> From: Jian Wen <wenjianhn@gmail.com>
>
> Deleting a file with lots of extents may cause a soft lockup if the
> preemption model is none(CONFIG_PREEMPT_NONE=y or preempt=none is set
> in the kernel cmdline). Alibaba cloud kernel and Oracle UEK container
> kernel are affected by the issue, since they select CONFIG_PREEMPT_NONE=y.
>
> Explicitly call cond_resched in xfs_itruncate_extents_flags avoid
> the below softlockup warning.
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:13:139]
Wowwee, how many extents does this file have, anyway?
> CPU: 0 PID: 139 Comm: kworker/0:13 Not tainted 6.7.0-rc8-g610a9b8f49fb #23
> Workqueue: xfs-inodegc/vda1 xfs_inodegc_worker
> Call Trace:
> _raw_spin_lock+0x30/0x80
> ? xfs_extent_busy_trim+0x38/0x200
> xfs_extent_busy_trim+0x38/0x200
> xfs_alloc_compute_aligned+0x38/0xd0
> xfs_alloc_ag_vextent_size+0x1f1/0x870
> xfs_alloc_fix_freelist+0x58a/0x770
> xfs_free_extent_fix_freelist+0x60/0xa0
> __xfs_free_extent+0x66/0x1d0
> xfs_trans_free_extent+0xac/0x290
> xfs_extent_free_finish_item+0xf/0x40
> xfs_defer_finish_noroll+0x1db/0x7f0
> xfs_defer_finish+0x10/0xa0
> xfs_itruncate_extents_flags+0x169/0x4c0
> xfs_inactive_truncate+0xba/0x140
> xfs_inactive+0x239/0x2a0
> xfs_inodegc_worker+0xa3/0x210
> ? process_scheduled_works+0x273/0x550
> process_scheduled_works+0x2da/0x550
> worker_thread+0x180/0x350
>
> Most of the Linux distributions default to voluntary preemption,
> might_sleep() would yield CPU if needed. Thus they are not affected.
> kworker/0:24+xf 298 [000] 7294.810021: probe:__cond_resched:
> __cond_resched+0x5 ([kernel.kallsyms])
> __kmem_cache_alloc_node+0x17c ([kernel.kallsyms])
> __kmalloc+0x4d ([kernel.kallsyms])
> kmem_alloc+0x7a ([kernel.kallsyms])
> xfs_extent_busy_insert_list+0x2e ([kernel.kallsyms])
> __xfs_free_extent+0xd3 ([kernel.kallsyms])
> xfs_trans_free_extent+0x93 ([kernel.kallsyms])
> xfs_extent_free_finish_item+0xf ([kernel.kallsyms])
>
> kworker/0:24+xf 298 [000] 7294.810045: probe:__cond_resched:
> __cond_resched+0x5 ([kernel.kallsyms])
> down+0x11 ([kernel.kallsyms])
> xfs_buf_lock+0x2c ([kernel.kallsyms])
> xfs_buf_find_lock+0x62 ([kernel.kallsyms])
> xfs_buf_get_map+0x1fd ([kernel.kallsyms])
> xfs_buf_read_map+0x51 ([kernel.kallsyms])
> xfs_trans_read_buf_map+0x1c5 ([kernel.kallsyms])
> xfs_btree_read_buf_block.constprop.0+0xa1 ([kernel.kallsyms])
> xfs_btree_lookup_get_block+0x97 ([kernel.kallsyms])
> xfs_btree_lookup+0xc5 ([kernel.kallsyms])
> xfs_alloc_lookup_eq+0x18 ([kernel.kallsyms])
> xfs_free_ag_extent+0x63f ([kernel.kallsyms])
> __xfs_free_extent+0xa7 ([kernel.kallsyms])
> xfs_trans_free_extent+0x93 ([kernel.kallsyms])
> xfs_extent_free_finish_item+0xf ([kernel.kallsyms])
>
> Signed-off-by: Jian Wen <wenjian1@xiaomi.com>
> ---
> fs/xfs/xfs_inode.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index c0f1c89786c2..194381e10472 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -4,6 +4,7 @@
> * All Rights Reserved.
> */
> #include <linux/iversion.h>
> +#include <linux/sched.h>
>
> #include "xfs.h"
> #include "xfs_fs.h"
> @@ -1383,6 +1384,8 @@ xfs_itruncate_extents_flags(
> error = xfs_defer_finish(&tp);
> if (error)
> goto out;
> +
> + cond_resched();
Is there a particular reason why truncation uses a giant chained
transaction for all extents (with xfs_defer_finish) instead of, say, one
chain per extent?
(If you've actually studied this and know why then I guess cond_resched
is a reasonable bandaid, but I don't want to play cond_resched
whackamole here.)
--D
> }
>
> if (whichfork == XFS_DATA_FORK) {
> --
> 2.25.1
>
>
next prev parent reply other threads:[~2024-01-10 17:45 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-10 7:13 [PATCH] xfs: explicitly call cond_resched in xfs_itruncate_extents_flags Jian Wen
2024-01-10 17:45 ` Darrick J. Wong [this message]
2024-01-11 12:40 ` Jian Wen
2024-01-11 20:16 ` Dave Chinner
2024-01-10 21:38 ` Dave Chinner
2024-01-11 12:52 ` Jian Wen
2024-01-11 20:27 ` Dave Chinner
2024-01-12 13:01 ` Thomas Gleixner
2024-01-23 7:01 ` Ankur Arora
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240110174501.GH722975@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=dchinner@redhat.com \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
--cc=wenjian1@xiaomi.com \
--cc=wenjianhn@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox