Message-ID: <70254f4c-80ce-4c53-ba60-be023d0cd6fc@gmail.com>
Date: Wed, 29 Apr 2026 11:39:29 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] f2fs: fix potential deadlock in f2fs_balance_fs()
To: Chao Yu, jaegeuk@kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
References: <20260426093239.165767-1-ruipengqi3@gmail.com>
From: Ruipeng Qi

On 2026/4/27 16:38, Chao Yu wrote:
> On 4/26/26 17:32, ruipengqi wrote:
>> From: Ruipeng Qi
>>
>> When the f2fs filesystem space is nearly exhausted, we encounter deadlock
>> issues as below:
>>
>> INFO: task A:1890 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:A    state:D stack:0     pid:1890  tgid:1626  ppid:1153   flags:0x00000204
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   schedule+0x3c/0x118
>>   io_schedule+0x44/0x68
>>   folio_wait_bit_common+0x174/0x370
>>   folio_wait_bit+0x20/0x38
>>   folio_wait_writeback+0x54/0xc8
>>   truncate_inode_partial_folio+0x70/0x1e0
>>   truncate_inode_pages_range+0x1b0/0x450
>>   truncate_pagecache+0x54/0x88
>>   f2fs_file_write_iter+0x3e8/0xb80
>>   do_iter_readv_writev+0xf0/0x1e0
>>   vfs_writev+0x138/0x2c8
>>   do_writev+0x88/0x130
>>   __arm64_sys_writev+0x28/0x40
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0xc8/0xf0
>>   do_el0_svc+0x24/0x38
>>   el0_svc+0x30/0xf8
>>   el0t_64_sync_handler+0x120/0x130
>>   el0t_64_sync+0x190/0x198
>>
>> INFO: task kworker/u8:11:2680853 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/u8:11   state:D stack:0     pid:2680853 tgid:2680853 ppid:2      flags:0x00000208
>> Workqueue: writeback wb_workfn (flush-254:0)
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   schedule+0x3c/0x118
>>   io_schedule+0x44/0x68
>>   folio_wait_bit_common+0x174/0x370
>>   __filemap_get_folio+0x214/0x348
>>   pagecache_get_page+0x20/0x70
>>   f2fs_get_read_data_page+0x150/0x3e8
>>   f2fs_get_lock_data_page+0x2c/0x160
>>   move_data_page+0x50/0x478
>>   do_garbage_collect+0xd38/0x1528
>>   f2fs_gc+0x240/0x7e0
>>   f2fs_balance_fs+0x1a0/0x208
>>   f2fs_write_single_data_page+0x6e4/0x730  //0xfffffe0d6ca08300
>>   f2fs_write_cache_pages+0x378/0x9b0
>>   f2fs_write_data_pages+0x2e4/0x388
>>   do_writepages+0x8c/0x2c8
>>   __writeback_single_inode+0x4c/0x498
>>   writeback_sb_inodes+0x234/0x4a8
>>   __writeback_inodes_wb+0x58/0x118
>>   wb_writeback+0x2f8/0x3c0
>>   wb_workfn+0x2c4/0x508
>>   process_one_work+0x180/0x408
>>   worker_thread+0x258/0x368
>>   kthread+0x118/0x128
>>   ret_from_fork+0x10/0x200
>>
>> INFO: task kworker/u8:8:2641297 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/u8:8    state:D stack:0     pid:2641297 tgid:2641297 ppid:2      flags:0x00000208
>> Workqueue: writeback wb_workfn (flush-254:0)
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   rt_mutex_schedule+0x30/0x60
>>   __rt_mutex_slowlock_locked.constprop.0+0x460/0x8a8
>>   rwbase_write_lock+0x24c/0x378
>>   down_write+0x1c/0x30
>>   f2fs_balance_fs+0x184/0x208
>>   f2fs_write_inode+0xf4/0x328
>>   __writeback_single_inode+0x370/0x498
>>   writeback_sb_inodes+0x234/0x4a8
>>   __writeback_inodes_wb+0x58/0x118
>>   wb_writeback+0x2f8/0x3c0
>>   wb_workfn+0x2c4/0x508
>>   process_one_work+0x180/0x408
>>   worker_thread+0x258/0x368
>>   kthread+0x118/0x128
>>   ret_from_fork+0x10/0x20
>>
>> INFO: task B:1902 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:B     state:D stack:0     pid:1902  tgid:1626  ppid:1153   flags:0x0000020c
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   rt_mutex_schedule+0x30/0x60
>>   __rt_mutex_slowlock_locked.constprop.0+0x460/0x8a8
>>   rwbase_write_lock+0x24c/0x378
>>   down_write+0x1c/0x30
>>   f2fs_balance_fs+0x184/0x208
>>   f2fs_map_blocks+0x94c/0x1110
>>   f2fs_file_write_iter+0x228/0xb80
>>   do_iter_readv_writev+0xf0/0x1e0
>>   vfs_writev+0x138/0x2c8
>>   do_writev+0x88/0x130
>>   __arm64_sys_writev+0x28/0x40
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0xc8/0xf0
>>   do_el0_svc+0x24/0x38
>>   el0_svc+0x30/0xf8
>>   el0t_64_sync_handler+0x120/0x130
>>   el0t_64_sync+0x190/0x198
>>
>> INFO: task sync:2769849 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:sync            state:D stack:0     pid:2769849 tgid:2769849 ppid:736    flags:0x0000020c
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   schedule+0x3c/0x118
>>   wb_wait_for_completion+0xb0/0xe8
>>   sync_inodes_sb+0xc8/0x2b0
>>   sync_inodes_one_sb+0x24/0x38
>>   iterate_supers+0xa8/0x138
>>   ksys_sync+0x54/0xc8
>>   __arm64_sys_sync+0x18/0x30
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0xc8/0xf0
>>   do_el0_svc+0x24/0x38
>>   el0_svc+0x30/0xf8
>>   el0t_64_sync_handler+0x120/0x130
>>   el0t_64_sync+0x190/0x198
>>
>> The root cause is a potential deadlock between the following tasks:
>>
>> kworker/u8:11                Thread A
>> - f2fs_write_single_data_page
>>   - f2fs_do_write_data_page
>>    - folio_start_writeback(X)
>>    - f2fs_outplace_write_data
>>     - bio_add_folio(X)
>>   - folio_unlock(X)
>>                     - truncate_inode_pages_range
>>                      - __filemap_get_folio(X, FGP_LOCK)
>>                      - truncate_inode_partial_folio(X)
>>                       - folio_wait_writeback(X)
>>   - f2fs_balance_fs
>>    - f2fs_gc
>>     - do_garbage_collect
>>      - move_data_page
>>       - f2fs_get_lock_data_page
>>        - __filemap_get_folio(X, FGP_LOCK)
>>
>> Both threads try to access folio X. Thread A holds the lock but waits
>> for writeback, while kworker waits for the lock. This causes a deadlock.
>>
>> Other threads also enter D state, waiting for locks such as gc_lock and
>> writepages.
>>
>> OPU/IPU DATA folio are all affected by this issue. To avoid such
>> potential deadlocks, always commit these cached folios before
>> triggering f2fs_gc() in f2fs_balance_fs().
>>
>> v2:
>> - Commit cached OPU/IPU folios, not just OPU folios as in v1.
>>
>> Suggested-by: Chao
>> Signed-off-by: Ruipeng Qi
>> ---
>>   fs/f2fs/data.c    | 26 ++++++++++++++++++++++++++
>>   fs/f2fs/f2fs.h    |  1 +
>>   fs/f2fs/segment.c |  9 +++++++++
>>   3 files changed, 36 insertions(+)
>>
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 338df7a2aea6..fd03366b3228 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -939,6 +939,32 @@ void f2fs_submit_merged_ipu_write(struct f2fs_sb_info *sbi,
>>       }
>>   }
>>
>> +void f2fs_submit_all_merged_ipu_writes(struct f2fs_sb_info *sbi)
>> +{
>> +    struct bio_entry *be, *tmp;
>> +    struct f2fs_bio_info *io;
>> +    enum temp_type temp;
>> +    LIST_HEAD(list);
>> +
>> +    for (temp = HOT; temp < NR_TEMP_TYPE; temp++) {
>> +        io = sbi->write_io[DATA] + temp;
>> +
>> +        if (list_empty(&io->bio_list))
>> +            continue;
>
> Needs to be covered w/ bio_list_lock to avoid race condition.

Hi Chao,

The lockless list_empty() here is intentional, and I believe it is
acceptable. If list_empty() returns true but a bio is added to the list
right afterwards (i.e. we lose the race), that bio will be submitted by
the subsequent write path, so no bio will be lost.

Similar patterns exist in the kernel, e.g.:

  net/rfkill/core.c: rfkill_fop_read()
    /* since we re-check and it just compares pointers,
     * using !list_empty() without locking isn't a problem
     */

  fs/f2fs/data.c: f2fs_submit_merged_ipu_write()
    list_empty() is also used there as a lockless pre-check,
    without holding bio_list_lock.

If you'd prefer, we can add a comment to make the intent clear:

    /* list_empty() without the lock is safe here: it reads the
     * pointer with READ_ONCE(), so the read is atomic. A stale
     * "empty" result is acceptable since any bio added
     * concurrently will be submitted by the next write path.
     */
    if (list_empty(&io->bio_list))
        continue;

>
>> +
>> +        f2fs_down_write(&io->bio_list_lock);
>> +        list_splice_init(&io->bio_list, &list);
>> +        f2fs_up_write(&io->bio_list_lock);
>> +
>> +        list_for_each_entry_safe(be, tmp, &list, list) {
>> +            f2fs_submit_write_bio(sbi, be->bio, DATA);
>> +            del_bio_entry(be);
>> +        }
>> +
>
> Unnecessary blank line.
>
> Thanks,

Thanks for the correction. Will fix in v3.

    v3:
    - Fix minor grammatical issues
    - Add a comment on the lockless list_empty() to explain why it is
      safe without holding bio_list_lock

Thanks,

>
>> +    }
>> +
>> +}
>> +
>>   int f2fs_merge_page_bio(struct f2fs_io_info *fio)
>>   {
>>       struct bio *bio = *fio->bio;
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index bb34e864d0ef..e9038ab1b2bd 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -4148,6 +4148,7 @@ void f2fs_submit_merged_write_folio(struct f2fs_sb_info *sbi,
>>                   struct folio *folio, enum page_type type);
>>   void f2fs_submit_merged_ipu_write(struct f2fs_sb_info *sbi,
>>                       struct bio **bio, struct folio *folio);
>> +void f2fs_submit_all_merged_ipu_writes(struct f2fs_sb_info *sbi);
>>   void f2fs_flush_merged_writes(struct f2fs_sb_info *sbi);
>>   int f2fs_submit_page_bio(struct f2fs_io_info *fio);
>>   int f2fs_merge_page_bio(struct f2fs_io_info *fio);
>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>> index 6a97fe76712b..856ffe91b94f 100644
>> --- a/fs/f2fs/segment.c
>> +++ b/fs/f2fs/segment.c
>> @@ -454,6 +454,15 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
>>           io_schedule();
>>           finish_wait(&sbi->gc_thread->fggc_wq, &wait);
>>       } else {
>> +
>> +        /*
>> +         * Submit all cached OPU/IPU DATA bios before triggering
>> +         * foreground GC to avoid potential deadlocks.
>> +         */
>> +
>> +        f2fs_submit_merged_write(sbi, DATA);
>> +        f2fs_submit_all_merged_ipu_writes(sbi);
>> +
>>           struct f2fs_gc_control gc_control = {
>>               .victim_segno = NULL_SEGNO,
>>               .init_gc_type = f2fs_sb_has_blkzoned(sbi) ?
>
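
For reference, here is a minimal userspace sketch of the pattern discussed
above: a lockless emptiness pre-check, then a splice under the lock, with
the entries processed outside the lock. This is not f2fs code. struct queue,
queue_add() and queue_drain() are hypothetical names, the toy spinlock only
stands in for bio_list_lock, and the relaxed atomic load is assumed to be a
reasonable userspace analogue of the READ_ONCE() inside list_empty().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node {
	_Atomic(struct node *) next;	/* read locklessly by the pre-check */
	struct node *prev;
};

/* Hypothetical stand-in for f2fs_bio_info: a list plus the lock guarding it. */
struct queue {
	struct node head;
	atomic_flag lock;		/* toy spinlock, stands in for bio_list_lock */
};

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;			/* spin until the flag was clear */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static void queue_init(struct queue *q)
{
	q->head.next = &q->head;
	q->head.prev = &q->head;
	atomic_flag_clear(&q->lock);
}

/* Lockless pre-check: one atomic pointer load compared against the head. */
static int queue_empty_lockless(struct queue *q)
{
	return atomic_load_explicit(&q->head.next, memory_order_relaxed) == &q->head;
}

/* Producer side: insert at the tail while holding the lock. */
static void queue_add(struct queue *q, struct node *n)
{
	queue_lock(q);
	n->prev = q->head.prev;
	n->next = &q->head;
	q->head.prev->next = n;
	q->head.prev = n;
	queue_unlock(q);
}

/*
 * Consumer side: skip the lock when the list looks empty, otherwise detach
 * every entry under the lock and walk them outside of it.
 * Returns the number of entries drained.
 */
static int queue_drain(struct queue *q)
{
	struct node local, *n;
	int count = 0;

	if (queue_empty_lockless(q))
		return 0;		/* may be stale; a later drain sees new entries */

	queue_lock(q);
	if (q->head.next == &q->head) {	/* re-check: someone else drained it */
		queue_unlock(q);
		return 0;
	}
	local.next = q->head.next;	/* splice the whole list onto 'local' */
	local.prev = q->head.prev;
	local.next->prev = &local;
	local.prev->next = &local;
	q->head.next = &q->head;	/* reinitialise the shared list as empty */
	q->head.prev = &q->head;
	queue_unlock(q);

	for (n = local.next; n != &local; n = n->next)
		count++;		/* a real caller would submit the entry here */
	return count;
}
```

In this sketch the only consequence of a stale "empty" answer is that
queue_drain() returns 0 and the entry stays queued until the next drain.
Whether that is acceptable depends on some later caller being guaranteed to
drain the queue, which is the assumption being made for the write path above.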