Message-ID: <47128c90-86e3-48a6-aa11-3da39218ae7c@gmail.com>
Date: Sun, 26 Apr 2026 17:30:15 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] f2fs: fix potential deadlock in f2fs_balance_fs()
To: Chao Yu, jaegeuk@kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
References: <20260325133749.1053541-1-ruipengqi3@gmail.com> <607d2d34-9d58-42d1-8436-e85d9c73eb7a@kernel.org>
From: Ruipeng Qi
In-Reply-To: <607d2d34-9d58-42d1-8436-e85d9c73eb7a@kernel.org>

On 2026/4/20 15:35, Chao Yu wrote:
> Hi Ruipeng,
>
> Sorry, I missed your patch.
>
> On 3/25/2026 9:37 PM, ruipengqi wrote:
>> From: Ruipeng Qi
>>
>> When the f2fs filesystem space is nearly exhausted, we encounter deadlock
>> issues as below:
>>
>> INFO: task A:1890 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:A    state:D stack:0     pid:1890  tgid:1626  ppid:1153 flags:0x00000204
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   schedule+0x3c/0x118
>>   io_schedule+0x44/0x68
>>   folio_wait_bit_common+0x174/0x370
>>   folio_wait_bit+0x20/0x38
>>   folio_wait_writeback+0x54/0xc8
>>   truncate_inode_partial_folio+0x70/0x1e0
>>   truncate_inode_pages_range+0x1b0/0x450
>>   truncate_pagecache+0x54/0x88
>>   f2fs_file_write_iter+0x3e8/0xb80
>>   do_iter_readv_writev+0xf0/0x1e0
>>   vfs_writev+0x138/0x2c8
>>   do_writev+0x88/0x130
>>   __arm64_sys_writev+0x28/0x40
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0xc8/0xf0
>>   do_el0_svc+0x24/0x38
>>   el0_svc+0x30/0xf8
>>   el0t_64_sync_handler+0x120/0x130
>>   el0t_64_sync+0x190/0x198
>>
>> INFO: task kworker/u8:11:2680853 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/u8:11   state:D stack:0     pid:2680853 tgid:2680853 ppid:2      flags:0x00000208
>> Workqueue: writeback wb_workfn (flush-254:0)
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   schedule+0x3c/0x118
>>   io_schedule+0x44/0x68
>>   folio_wait_bit_common+0x174/0x370
>>   __filemap_get_folio+0x214/0x348
>>   pagecache_get_page+0x20/0x70
>>   f2fs_get_read_data_page+0x150/0x3e8
>>   f2fs_get_lock_data_page+0x2c/0x160
>>   move_data_page+0x50/0x478
>>   do_garbage_collect+0xd38/0x1528
>>   f2fs_gc+0x240/0x7e0
>>   f2fs_balance_fs+0x1a0/0x208
>>   f2fs_write_single_data_page+0x6e4/0x730  //0xfffffe0d6ca08300
>>   f2fs_write_cache_pages+0x378/0x9b0
>>   f2fs_write_data_pages+0x2e4/0x388
>>   do_writepages+0x8c/0x2c8
>>   __writeback_single_inode+0x4c/0x498
>>   writeback_sb_inodes+0x234/0x4a8
>>   __writeback_inodes_wb+0x58/0x118
>>   wb_writeback+0x2f8/0x3c0
>>   wb_workfn+0x2c4/0x508
>>   process_one_work+0x180/0x408
>>   worker_thread+0x258/0x368
>>   kthread+0x118/0x128
>>   ret_from_fork+0x10/0x200
>>
>> INFO: task kworker/u8:8:2641297 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/u8:8    state:D stack:0     pid:2641297 tgid:2641297 ppid:2      flags:0x00000208
>> Workqueue: writeback wb_workfn (flush-254:0)
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   rt_mutex_schedule+0x30/0x60
>>   __rt_mutex_slowlock_locked.constprop.0+0x460/0x8a8
>>   rwbase_write_lock+0x24c/0x378
>>   down_write+0x1c/0x30
>>   f2fs_balance_fs+0x184/0x208
>>   f2fs_write_inode+0xf4/0x328
>>   __writeback_single_inode+0x370/0x498
>>   writeback_sb_inodes+0x234/0x4a8
>>   __writeback_inodes_wb+0x58/0x118
>>   wb_writeback+0x2f8/0x3c0
>>   wb_workfn+0x2c4/0x508
>>   process_one_work+0x180/0x408
>>   worker_thread+0x258/0x368
>>   kthread+0x118/0x128
>>   ret_from_fork+0x10/0x20
>>
>> INFO: task B:1902 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:B     state:D stack:0     pid:1902  tgid:1626  ppid:1153 flags:0x0000020c
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   rt_mutex_schedule+0x30/0x60
>>   __rt_mutex_slowlock_locked.constprop.0+0x460/0x8a8
>>   rwbase_write_lock+0x24c/0x378
>>   down_write+0x1c/0x30
>>   f2fs_balance_fs+0x184/0x208
>>   f2fs_map_blocks+0x94c/0x1110
>>   f2fs_file_write_iter+0x228/0xb80
>>   do_iter_readv_writev+0xf0/0x1e0
>>   vfs_writev+0x138/0x2c8
>>   do_writev+0x88/0x130
>>   __arm64_sys_writev+0x28/0x40
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0xc8/0xf0
>>   do_el0_svc+0x24/0x38
>>   el0_svc+0x30/0xf8
>>   el0t_64_sync_handler+0x120/0x130
>>   el0t_64_sync+0x190/0x198
>>
>> INFO: task sync:2769849 blocked for more than 120 seconds.
>>        Tainted: G           O       6.12.41-g3fe07ddf05ab #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:sync            state:D stack:0     pid:2769849 tgid:2769849 ppid:736    flags:0x0000020c
>> Call trace:
>>   __switch_to+0xf4/0x158
>>   __schedule+0x27c/0x908
>>   schedule+0x3c/0x118
>>   wb_wait_for_completion+0xb0/0xe8
>>   sync_inodes_sb+0xc8/0x2b0
>>   sync_inodes_one_sb+0x24/0x38
>>   iterate_supers+0xa8/0x138
>>   ksys_sync+0x54/0xc8
>>   __arm64_sys_sync+0x18/0x30
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0xc8/0xf0
>>   do_el0_svc+0x24/0x38
>>   el0_svc+0x30/0xf8
>>   el0t_64_sync_handler+0x120/0x130
>>   el0t_64_sync+0x190/0x198
>>
>> The root cause is a potential deadlock between the following tasks:
>>
>> kworker/u8:11                Thread A
>> - f2fs_write_single_data_page
>>   - f2fs_do_write_data_page
>>    - folio_start_writeback(X)
>>    - f2fs_outplace_write_data
>>     - bio_add_folio(X)
>>   - folio_unlock(X)
>>                     - truncate_inode_pages_range
>>                      - __filemap_get_folio(X, FGP_LOCK)
>>                      - truncate_inode_partial_folio(X)
>>                       - folio_wait_writeback(X)
>>   - f2fs_balance_fs
>>    - f2fs_gc
>>     - do_garbage_collect
>>      - move_data_page
>>       - f2fs_get_lock_data_page
>>        - __filemap_get_folio(X, FGP_LOCK)
>>
>> Both threads try to access folio X. Thread A holds the lock but waits
>> for writeback, while kworker waits for the lock. This causes a deadlock.
>>
>> Other threads also enter D state, waiting for locks such as gc_lock and
>> writepages.
>>
>> To avoid this potential deadlock, always call f2fs_submit_merged_write
>> before triggering f2fs_gc in f2fs_balance_fs.
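
For readers following the thread, the interleaving quoted above can be
replayed in a small userspace model. This is only a sketch under stated
assumptions: struct toy_folio, toy_deadlocks() and toy_self_check() are
hypothetical names, none of this is kernel code, and bio completion is
modelled as happening immediately once the cached bio is submitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state that matters in this race. */
struct toy_folio {
	bool locked;        /* models the folio lock */
	bool writeback;     /* models PG_writeback */
	bool in_cached_bio; /* queued in a merged write bio, not yet submitted */
};

/*
 * Replay the interleaving from the commit message on a single folio X.
 * Returns true if Thread A and the kworker end up waiting on each other.
 */
bool toy_deadlocks(bool flush_before_gc)
{
	struct toy_folio x = { false, false, false };
	bool a_waiting, gc_waiting;

	/* kworker: start writeback on X, add it to the cached bio, unlock X. */
	x.writeback = true;
	x.in_cached_bio = true;
	x.locked = false;

	/* Thread A: truncate locks X, then waits for writeback to clear. */
	x.locked = true;
	a_waiting = x.writeback;

	/* kworker: the proposed fix submits the cached bio before GC. */
	if (flush_before_gc && x.in_cached_bio) {
		x.in_cached_bio = false;
		x.writeback = false; /* assumption: I/O completes at once */
	}

	/* Thread A: once writeback has cleared, finish and unlock X. */
	if (a_waiting && !x.writeback) {
		a_waiting = false;
		x.locked = false;
	}

	/* kworker: GC's move_data_page() needs the lock on X. */
	gc_waiting = x.locked;

	return a_waiting && gc_waiting;
}

/* Self-check: the model deadlocks without the flush and not with it. */
void toy_self_check(void)
{
	assert(toy_deadlocks(false) == true);
	assert(toy_deadlocks(true) == false);
}
```

Calling toy_self_check() from a test program exercises both cases. The
model deliberately ignores gc_lock and the other blocked tasks, since
they are only waiting behind the two tasks shown here.
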
>>
>> Signed-off-by: Ruipeng Qi
>> ---
>>   fs/f2fs/segment.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>> index 6a97fe76712b..b58299e49c23 100644
>> --- a/fs/f2fs/segment.c
>> +++ b/fs/f2fs/segment.c
>> @@ -454,6 +454,20 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
>>           io_schedule();
>>           finish_wait(&sbi->gc_thread->fggc_wq, &wait);
>>       } else {
>> +
>> +        /*
>> +         * Before triggering foreground GC, submit all cached DATA
>> +         * write bios. During writeback, pages may be added to
>> +         * write_io[DATA].bio with PG_writeback set but the bio not
>> +         * yet submitted. If GC's move_data_page() blocks on
>> +         * __folio_lock() for such a folio, and the lock holder waits
>> +         * for PG_writeback to clear via VFS folio_wait_writeback()
>> +         * neither thread can make progress. Flushing here ensures
>> +         * the bio completion callback can clear PG_writeback.
>> +         */
>> +
>> +        f2fs_submit_merged_write(sbi, DATA);
>
> Do we need to call f2fs_submit_merged_ipu_write(sbi, bio, NULL) to commit
> cached IPU folios as well?
>
> Not sure, this race condition will happen for node folio.
>
> Thanks,
>

Hi Chao,

Thanks for your suggestion. After deeper analysis, this race condition
also applies to IPU folios, but not to node folios: node folios are
unlikely to go through this flow. I will send a corrected version
shortly.

v2:
- Commit cached OPU and IPU folios, not just OPU folios as in v1.

BTW, do you think it would be possible to add an optional
->wait_folio_writeback() callback to address_space_operations? When it
is provided, truncate_inode_partial_folio() would call it, so f2fs could
use f2fs_wait_on_page_writeback() instead of the generic
folio_wait_writeback(), which would also fix this race condition.

Thanks,

>> +
>>           struct f2fs_gc_control gc_control = {
>>               .victim_segno = NULL_SEGNO,
>>               .init_gc_type = f2fs_sb_has_blkzoned(sbi) ?