From: Chao Yu via Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net>
To: ruipengqi <ruipengqi3@gmail.com>, jaegeuk@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH v3] f2fs: fix potential deadlock in f2fs_balance_fs()
Date: Sun, 3 May 2026 17:47:59 +0800
Message-ID: <e4b82e9f-c863-4d3a-8077-b56df9faa491@kernel.org>
In-Reply-To: <20260502124157.3406780-1-ruipengqi3@gmail.com>

On 5/2/26 20:41, ruipengqi wrote:
> From: Ruipeng Qi <ruipengqi3@gmail.com>
> 
> When the f2fs filesystem is nearly out of space, we hit a deadlock, as
> the hung-task reports below show:
> 
> INFO: task A:1890 blocked for more than 120 seconds.
>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:A    state:D stack:0     pid:1890  tgid:1626  ppid:1153   flags:0x00000204
> Call trace:
>   __switch_to+0xf4/0x158
>   __schedule+0x27c/0x908
>   schedule+0x3c/0x118
>   io_schedule+0x44/0x68
>   folio_wait_bit_common+0x174/0x370
>   folio_wait_bit+0x20/0x38
>   folio_wait_writeback+0x54/0xc8
>   truncate_inode_partial_folio+0x70/0x1e0
>   truncate_inode_pages_range+0x1b0/0x450
>   truncate_pagecache+0x54/0x88
>   f2fs_file_write_iter+0x3e8/0xb80
>   do_iter_readv_writev+0xf0/0x1e0
>   vfs_writev+0x138/0x2c8
>   do_writev+0x88/0x130
>   __arm64_sys_writev+0x28/0x40
>   invoke_syscall+0x50/0x120
>   el0_svc_common.constprop.0+0xc8/0xf0
>   do_el0_svc+0x24/0x38
>   el0_svc+0x30/0xf8
>   el0t_64_sync_handler+0x120/0x130
>   el0t_64_sync+0x190/0x198
> 
> INFO: task kworker/u8:11:2680853 blocked for more than 120 seconds.
>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u8:11   state:D stack:0     pid:2680853 tgid:2680853 ppid:2      flags:0x00000208
> Workqueue: writeback wb_workfn (flush-254:0)
> Call trace:
>   __switch_to+0xf4/0x158
>   __schedule+0x27c/0x908
>   schedule+0x3c/0x118
>   io_schedule+0x44/0x68
>   folio_wait_bit_common+0x174/0x370
>   __filemap_get_folio+0x214/0x348
>   pagecache_get_page+0x20/0x70
>   f2fs_get_read_data_page+0x150/0x3e8
>   f2fs_get_lock_data_page+0x2c/0x160
>   move_data_page+0x50/0x478
>   do_garbage_collect+0xd38/0x1528
>   f2fs_gc+0x240/0x7e0
>   f2fs_balance_fs+0x1a0/0x208
>   f2fs_write_single_data_page+0x6e4/0x730
>   f2fs_write_cache_pages+0x378/0x9b0
>   f2fs_write_data_pages+0x2e4/0x388
>   do_writepages+0x8c/0x2c8
>   __writeback_single_inode+0x4c/0x498
>   writeback_sb_inodes+0x234/0x4a8
>   __writeback_inodes_wb+0x58/0x118
>   wb_writeback+0x2f8/0x3c0
>   wb_workfn+0x2c4/0x508
>   process_one_work+0x180/0x408
>   worker_thread+0x258/0x368
>   kthread+0x118/0x128
>   ret_from_fork+0x10/0x20
> 
> INFO: task kworker/u8:8:2641297 blocked for more than 120 seconds.
>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u8:8    state:D stack:0     pid:2641297 tgid:2641297 ppid:2      flags:0x00000208
> Workqueue: writeback wb_workfn (flush-254:0)
> Call trace:
>   __switch_to+0xf4/0x158
>   __schedule+0x27c/0x908
>   rt_mutex_schedule+0x30/0x60
>   __rt_mutex_slowlock_locked.constprop.0+0x460/0x8a8
>   rwbase_write_lock+0x24c/0x378
>   down_write+0x1c/0x30
>   f2fs_balance_fs+0x184/0x208
>   f2fs_write_inode+0xf4/0x328
>   __writeback_single_inode+0x370/0x498
>   writeback_sb_inodes+0x234/0x4a8
>   __writeback_inodes_wb+0x58/0x118
>   wb_writeback+0x2f8/0x3c0
>   wb_workfn+0x2c4/0x508
>   process_one_work+0x180/0x408
>   worker_thread+0x258/0x368
>   kthread+0x118/0x128
>   ret_from_fork+0x10/0x20
> 
> INFO: task B:1902 blocked for more than 120 seconds.
>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:B     state:D stack:0     pid:1902  tgid:1626  ppid:1153   flags:0x0000020c
> Call trace:
>   __switch_to+0xf4/0x158
>   __schedule+0x27c/0x908
>   rt_mutex_schedule+0x30/0x60
>   __rt_mutex_slowlock_locked.constprop.0+0x460/0x8a8
>   rwbase_write_lock+0x24c/0x378
>   down_write+0x1c/0x30
>   f2fs_balance_fs+0x184/0x208
>   f2fs_map_blocks+0x94c/0x1110
>   f2fs_file_write_iter+0x228/0xb80
>   do_iter_readv_writev+0xf0/0x1e0
>   vfs_writev+0x138/0x2c8
>   do_writev+0x88/0x130
>   __arm64_sys_writev+0x28/0x40
>   invoke_syscall+0x50/0x120
>   el0_svc_common.constprop.0+0xc8/0xf0
>   do_el0_svc+0x24/0x38
>   el0_svc+0x30/0xf8
>   el0t_64_sync_handler+0x120/0x130
>   el0t_64_sync+0x190/0x198
> 
> INFO: task sync:2769849 blocked for more than 120 seconds.
>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:sync            state:D stack:0     pid:2769849 tgid:2769849 ppid:736    flags:0x0000020c
> Call trace:
>   __switch_to+0xf4/0x158
>   __schedule+0x27c/0x908
>   schedule+0x3c/0x118
>   wb_wait_for_completion+0xb0/0xe8
>   sync_inodes_sb+0xc8/0x2b0
>   sync_inodes_one_sb+0x24/0x38
>   iterate_supers+0xa8/0x138
>   ksys_sync+0x54/0xc8
>   __arm64_sys_sync+0x18/0x30
>   invoke_syscall+0x50/0x120
>   el0_svc_common.constprop.0+0xc8/0xf0
>   do_el0_svc+0x24/0x38
>   el0_svc+0x30/0xf8
>   el0t_64_sync_handler+0x120/0x130
>   el0t_64_sync+0x190/0x198
> 
> The root cause is a potential deadlock between the following tasks:
> 
> kworker/u8:11				Thread A
> - f2fs_write_single_data_page
>   - f2fs_do_write_data_page
>    - folio_start_writeback(X)
>    - f2fs_outplace_write_data
>     - bio_add_folio(X)
>   - folio_unlock(X)
> 					- truncate_inode_pages_range
> 					 - __filemap_get_folio(X, FGP_LOCK)
> 					 - truncate_inode_partial_folio(X)
> 					  - folio_wait_writeback(X)
>   - f2fs_balance_fs
>    - f2fs_gc
>     - do_garbage_collect
>      - move_data_page
>       - f2fs_get_lock_data_page
>        - __filemap_get_folio(X, FGP_LOCK)
> 
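> Both tasks contend for folio X. Thread A holds the folio lock and waits
> for writeback of X to finish, but X sits in a cached bio that has not been
> submitted yet; kworker, which enters GC before that bio is committed,
> waits for the folio lock. Neither can make progress, so they deadlock.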
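
The circular wait described above can be pictured as a two-node wait-for graph. The following is only a toy model, not kernel code: the task names and the `find_cycle` helper are hypothetical, and it assumes that the bio holding X stays cached until kworker itself commits it.

```python
# Toy wait-for graph. An entry "a: b" means task a cannot proceed
# until task b makes progress. All names here are illustrative.

def find_cycle(wait_for):
    """Return the tasks forming a cycle, or None if there is none."""
    for start in wait_for:
        path = [start]
        seen = {start}
        node = wait_for.get(start)
        while node is not None:
            if node == start:
                return path          # walked back to where we began
            if node in seen:
                break                # loop that does not include start
            path.append(node)
            seen.add(node)
            node = wait_for.get(node)
    return None

# Thread A holds the lock on folio X and waits for writeback of X,
# which (by assumption) only ends after kworker commits the cached bio.
# kworker is in GC and waits for the lock on folio X, held by Thread A.
reported = {"thread_A": "kworker", "kworker": "thread_A"}

# If the cached bio is committed before GC, Thread A no longer depends
# on kworker; only kworker's short wait for the folio lock remains.
committed_first = {"kworker": "thread_A"}

print(find_cycle(reported))
print(find_cycle(committed_first))
```

In this model the first graph contains a cycle and the second does not, which matches the reasoning in the commit message.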
> 
> Other threads also enter D state, waiting for locks such as gc_lock and
> writepages.
> 
> Both OPU and IPU DATA folios are affected by this issue. To avoid such
> potential deadlocks, always commit these cached folios before
> triggering f2fs_gc() in f2fs_balance_fs().
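
As a rough illustration of why the ordering matters, the sketch below replays the interleaving from the diagram step by step, with and without committing the cached folios before GC. It is a simplified model under stated assumptions, not the patch itself: the `Folio` class, the `replay` function and its `flush_before_gc` flag are hypothetical, and I/O completion is treated as instantaneous once the bio is committed.

```python
class Folio:
    """Minimal stand-in for a page-cache folio (illustrative only)."""
    def __init__(self):
        self.locked_by = None    # task currently holding the folio lock
        self.writeback = False   # True while writeback is in flight


def replay(flush_before_gc):
    x = Folio()
    cached_bio = []

    # kworker: start writeback on X, add it to a cached bio, unlock X.
    x.writeback = True
    cached_bio.append(x)

    # Thread A: truncate path locks X, then must wait for writeback.
    x.locked_by = "thread_A"

    # kworker: balance step. With the fix, commit cached folios first.
    # Assumption: committing the bio lets writeback finish immediately.
    if flush_before_gc:
        for folio in cached_bio:
            folio.writeback = False
        cached_bio.clear()

    # Thread A finishes and unlocks X only once writeback has ended.
    if not x.writeback:
        x.locked_by = None

    # kworker: GC needs the lock on X to move the data page.
    if x.locked_by is None:
        x.locked_by = "kworker"
        return "gc proceeds"
    return "deadlock"


print(replay(flush_before_gc=False))
print(replay(flush_before_gc=True))
```

The model covers only this one interleaving; it says nothing about other paths that may take the folio lock or about the locking details of the real bio list.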
> 
> v2:
> - Commit cached OPU/IPU folios, not just OPU folios as in v1.
> 
> v3:
> - Fix minor grammatical issues
> - Add a comment on the lockless list_empty() check to explain why it is
>    safe without holding bio_list_lock
> 
> Suggested-by: Chao <chao@kernel.org>

Chao Yu <chao@kernel.org>, :)

> Signed-off-by: Ruipeng Qi <ruipengqi3@gmail.com>

Reviewed-by: Chao Yu <chao@kernel.org>

Thanks,



