From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F267DC433FE for ; Mon, 14 Nov 2022 12:59:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237639AbiKNM76 (ORCPT ); Mon, 14 Nov 2022 07:59:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53578 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237640AbiKNM7z (ORCPT ); Mon, 14 Nov 2022 07:59:55 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EEED25E94 for ; Mon, 14 Nov 2022 04:59:52 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 02B16B80EB9 for ; Mon, 14 Nov 2022 12:59:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E3140C433C1; Mon, 14 Nov 2022 12:59:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1668430789; bh=3/nS9cBNPeZvjK8qdD5rfSxsGVWJQO7Isn6O6l3dF/c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uh/uhqdJB9Hqpocx5XlizyaFzTtwXwLlplnl2FiuMNC3/yBoFYLtDfrl95dXm3SZA uBWYMqpRhXpxUbyLmsyaio2OoxvzCE+NiuOfeDLhEmUd/TJtfLwkuU0Dku3QOFp84n vANPF18DFcco+lmeliam1hCfSjY4erfyxjhwtSsU= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Ryusuke Konishi , syzbot+45d6ce7b7ad7ef455d03@syzkaller.appspotmail.com, Andrew Morton Subject: [PATCH 5.15 098/131] nilfs2: fix deadlock in nilfs_count_free_blocks() Date: Mon, 14 Nov 2022 13:46:07 +0100 Message-Id: <20221114124452.854647161@linuxfoundation.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221114124448.729235104@linuxfoundation.org> References: <20221114124448.729235104@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ryusuke Konishi commit 8ac932a4921a96ca52f61935dbba64ea87bbd5dc upstream. A semaphore deadlock can occur if nilfs_get_block() detects metadata corruption while locating data blocks and a superblock writeback occurs at the same time: task 1 task 2 ------ ------ * A file operation * nilfs_truncate() nilfs_get_block() down_read(rwsem A) <-- nilfs_bmap_lookup_contig() ... generic_shutdown_super() nilfs_put_super() * Prepare to write superblock * down_write(rwsem B) <-- nilfs_cleanup_super() * Detect b-tree corruption * nilfs_set_log_cursor() nilfs_bmap_convert_error() nilfs_count_free_blocks() __nilfs_error() down_read(rwsem A) <-- nilfs_set_error() down_write(rwsem B) <-- *** DEADLOCK *** Here, nilfs_get_block() readlocks rwsem A (= NILFS_MDT(dat_inode)->mi_sem) and then calls nilfs_bmap_lookup_contig(), but if it fails due to metadata corruption, __nilfs_error() is called from nilfs_bmap_convert_error() inside the lock section. Since __nilfs_error() calls nilfs_set_error() unless the filesystem is read-only and nilfs_set_error() attempts to writelock rwsem B (= nilfs->ns_sem) to write back superblock exclusively, hierarchical lock acquisition occurs in the order rwsem A -> rwsem B. Now, if another task starts updating the superblock, it may writelock rwsem B during the lock sequence above, and can deadlock trying to readlock rwsem A in nilfs_count_free_blocks(). However, there is actually no need to take rwsem A in nilfs_count_free_blocks() because it, within the lock section, only reads a single integer data on a shared struct with nilfs_sufile_get_ncleansegs(). This has been the case after commit aa474a220180 ("nilfs2: add local variable to cache the number of clean segments"), that is, even before this bug was introduced. So, this resolves the deadlock problem by just not taking the semaphore in nilfs_count_free_blocks(). Link: https://lkml.kernel.org/r/20221029044912.9139-1-konishi.ryusuke@gmail.com Fixes: e828949e5b42 ("nilfs2: call nilfs_error inside bmap routines") Signed-off-by: Ryusuke Konishi Reported-by: syzbot+45d6ce7b7ad7ef455d03@syzkaller.appspotmail.com Tested-by: Ryusuke Konishi Cc: [2.6.38+ Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman --- fs/nilfs2/the_nilfs.c | 2 -- 1 file changed, 2 deletions(-) --- a/fs/nilfs2/the_nilfs.c +++ b/fs/nilfs2/the_nilfs.c @@ -690,9 +690,7 @@ int nilfs_count_free_blocks(struct the_n { unsigned long ncleansegs; - down_read(&NILFS_MDT(nilfs->ns_dat)->mi_sem); ncleansegs = nilfs_sufile_get_ncleansegs(nilfs->ns_sufile); - up_read(&NILFS_MDT(nilfs->ns_dat)->mi_sem); *nblocks = (sector_t)ncleansegs * nilfs->ns_blocks_per_segment; return 0; }