public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Ye Bin <yebin@huaweicloud.com>
Cc: viro@zeniv.linux.org.uk, jack@suse.cz,
	linux-fsdevel@vger.kernel.org,  axboe@kernel.dk,
	linux-block@vger.kernel.org, agruenba@redhat.com,
	 gfs2@lists.linux.dev, amir73il@gmail.com, mic@digikod.net,
	gnoack@google.com,  paul@paul-moore.com, jmorris@namei.org,
	serge@hallyn.com,  linux-security-module@vger.kernel.org,
	yebin10@huawei.com, zhangxiaoxu5@huawei.com
Subject: Re: [PATCH 10/11] fs: fix hungtask due to repeated traversal of inodes list
Date: Wed, 4 Dec 2024 12:17:49 +0100	[thread overview]
Message-ID: <20241204-worden-tontechnik-3ce77e9f3bad@brauner> (raw)
In-Reply-To: <20241118114508.1405494-11-yebin@huaweicloud.com>

On Mon, Nov 18, 2024 at 07:45:07PM +0800, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
> 
> There's a issue when remove scsi disk, the invalidate_inodes() function
> cannot exit for a long time, then trigger hungtask:
> INFO: task kworker/56:0:1391396 blocked for more than 122 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Workqueue: events_freezable virtscsi_handle_event [virtio_scsi]
> Call Trace:
>  __schedule+0x33c/0x7f0
>  schedule+0x46/0xb0
>  schedule_preempt_disabled+0xa/0x10
>  __mutex_lock.constprop.0+0x22b/0x490
>  mutex_lock+0x52/0x70
>  scsi_scan_target+0x6d/0xf0
>  virtscsi_handle_event+0x152/0x1a0 [virtio_scsi]
>  process_one_work+0x1b2/0x350
>  worker_thread+0x49/0x310
>  kthread+0xfb/0x140
>  ret_from_fork+0x1f/0x30
> 
> PID: 540499  TASK: ffff9b15e504c080  CPU: 44  COMMAND: "kworker/44:0"
> Call trace:
>  invalidate_inodes at ffffffff8f3b4784
>  __invalidate_device at ffffffff8f3dfea3
>  invalidate_partition at ffffffff8f526b49
>  del_gendisk at ffffffff8f5280fb
>  sd_remove at ffffffffc0186455 [sd_mod]
>  __device_release_driver at ffffffff8f738ab2
>  device_release_driver at ffffffff8f738bc4
>  bus_remove_device at ffffffff8f737f66
>  device_del at ffffffff8f73341b
>  __scsi_remove_device at ffffffff8f780340
>  scsi_remove_device at ffffffff8f7803a2
>  virtscsi_handle_event at ffffffffc017204f [virtio_scsi]
>  process_one_work at ffffffff8f1041f2
>  worker_thread at ffffffff8f104789
>  kthread at ffffffff8f109abb
>  ret_from_fork at ffffffff8f001d6f
> 
> As commit 04646aebd30b ("fs: avoid softlockups in s_inodes iterators")
> introduces the retry logic. In the problem environment, the 'i_count'
> of millions of files is not zero. As a result, the time slice for each
> traversal to the matching inode process is almost used up, and then the
> traversal is started from scratch. The worst-case scenario is that only
> one inode can be processed after each wakeup. Because this process holds
> a lock, other processes will be stuck for a long time, causing a series
> of problems.
> To solve the problem of repeated traversal from the beginning, each time
> the CPU needs to be freed, a cursor is inserted into the linked list, and
> the traversal continues from the cursor next time.
> 
> Fixes: 04646aebd30b ("fs: avoid softlockups in s_inodes iterators")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> ---
>  fs/inode.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index dc966990bda6..b78895af8779 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -857,11 +857,16 @@ static void dispose_list(struct list_head *head)
>  void evict_inodes(struct super_block *sb)
>  {
>  	struct inode *inode, *next;
> +	struct inode cursor;

It seems pretty adventurous to me to just add in a random inode whose
only fiels that is initialized is i_state. That would need a proper
analysis and argument that this is safe to do and won't cause trouble
for any filesystem.

Jan, do you have thoughts on this?

>  	LIST_HEAD(dispose);
>  
> +	cursor.i_state = I_CURSOR;
> +	INIT_LIST_HEAD(&cursor.i_sb_list);
> +	inode = list_entry(&sb->s_inodes, typeof(*inode), i_sb_list);
> +
>  again:
>  	spin_lock(&sb->s_inode_list_lock);
> -	sb_for_each_inodes_safe(inode, next, &sb->s_inodes) {
> +	sb_for_each_inodes_continue_safe(inode, next, &sb->s_inodes) {
>  		if (atomic_read(&inode->i_count))
>  			continue;
>  
> @@ -886,12 +891,16 @@ void evict_inodes(struct super_block *sb)
>  		 * bit so we don't livelock.
>  		 */
>  		if (need_resched()) {
> +			list_del(&cursor.i_sb_list);
> +			list_add(&cursor.i_sb_list, &inode->i_sb_list);
> +			inode = &cursor;
>  			spin_unlock(&sb->s_inode_list_lock);
>  			cond_resched();
>  			dispose_list(&dispose);
>  			goto again;
>  		}
>  	}
> +	list_del(&cursor.i_sb_list);
>  	spin_unlock(&sb->s_inode_list_lock);
>  
>  	dispose_list(&dispose);
> @@ -907,11 +916,16 @@ EXPORT_SYMBOL_GPL(evict_inodes);
>  void invalidate_inodes(struct super_block *sb)
>  {
>  	struct inode *inode, *next;
> +	struct inode cursor;
>  	LIST_HEAD(dispose);
>  
> +	cursor.i_state = I_CURSOR;
> +	INIT_LIST_HEAD(&cursor.i_sb_list);
> +	inode = list_entry(&sb->s_inodes, typeof(*inode), i_sb_list);
> +
>  again:
>  	spin_lock(&sb->s_inode_list_lock);
> -	sb_for_each_inodes_safe(inode, next, &sb->s_inodes) {
> +	sb_for_each_inodes_continue_safe(inode, next, &sb->s_inodes) {
>  		spin_lock(&inode->i_lock);
>  		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
>  			spin_unlock(&inode->i_lock);
> @@ -927,12 +941,16 @@ void invalidate_inodes(struct super_block *sb)
>  		spin_unlock(&inode->i_lock);
>  		list_add(&inode->i_lru, &dispose);
>  		if (need_resched()) {
> +			list_del(&cursor.i_sb_list);
> +			list_add(&cursor.i_sb_list, &inode->i_sb_list);
> +			inode = &cursor;
>  			spin_unlock(&sb->s_inode_list_lock);
>  			cond_resched();
>  			dispose_list(&dispose);
>  			goto again;
>  		}
>  	}
> +	list_del(&cursor.i_sb_list);
>  	spin_unlock(&sb->s_inode_list_lock);
>  
>  	dispose_list(&dispose);
> -- 
> 2.34.1
> 

  reply	other threads:[~2024-12-04 11:17 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-18 11:44 [PATCH 00/11] fix hungtask due to repeated traversal of inodes list Ye Bin
2024-11-18 11:44 ` [PATCH 01/11] fs: introduce I_CURSOR flag for inode Ye Bin
2024-11-18 11:44 ` [PATCH 02/11] block: use sb_for_each_inodes API Ye Bin
2024-11-18 11:45 ` [PATCH 03/11] fs: " Ye Bin
2024-11-18 11:45 ` [PATCH 04/11] gfs2: " Ye Bin
2024-11-18 11:45 ` [PATCH 05/11] fs: use sb_for_each_inodes_safe API Ye Bin
2024-11-18 11:45 ` [PATCH 06/11] fsnotify: use sb_for_each_inodes API Ye Bin
2024-11-18 11:45 ` [PATCH 07/11] quota: " Ye Bin
2024-11-18 11:45 ` [PATCH 08/11] fs/super.c: " Ye Bin
2024-11-18 11:45 ` [PATCH 09/11] landlock: " Ye Bin
2024-11-18 11:45 ` [PATCH 10/11] fs: fix hungtask due to repeated traversal of inodes list Ye Bin
2024-12-04 11:17   ` Christian Brauner [this message]
2024-12-04 14:16     ` Jan Kara
2024-11-18 11:45 ` [PATCH 11/11] fs: fix potential soft lockup when 'sb->s_inodes' has large number of inodes Ye Bin
2024-12-04  7:04 ` [PATCH 00/11] fix hungtask due to repeated traversal of inodes list yebin (H)
2024-12-04 11:04 ` Mateusz Guzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241204-worden-tontechnik-3ce77e9f3bad@brauner \
    --to=brauner@kernel.org \
    --cc=agruenba@redhat.com \
    --cc=amir73il@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=gfs2@lists.linux.dev \
    --cc=gnoack@google.com \
    --cc=jack@suse.cz \
    --cc=jmorris@namei.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=mic@digikod.net \
    --cc=paul@paul-moore.com \
    --cc=serge@hallyn.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yebin10@huawei.com \
    --cc=yebin@huaweicloud.com \
    --cc=zhangxiaoxu5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox