From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 40BB918D64F; Thu, 15 Aug 2024 13:55:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723730117; cv=none; b=tEz6wuHaV6moRZSgtJlM5VFsdt1iaKQ6MAAaDC3jyrCH2GARPe8rK+kdk5sZ+sFMBupThC+yw3/fXZCXbeifD+QiAzCqLI4mH9iM4xoJfLClCrHKUYrvZD0/Xf+07b5SQdIc5KOWY+gYEBtYCtHCnyl8/++kxj6426UV+tu4q+c= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723730117; c=relaxed/simple; bh=6kB96IIRnD/TiQ4SYVo6uH4dEaD27+N6rahOj6VdSHI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=om+G36LBPkP7g+pQ+04qafj4lVK0XxzbzTx66apfu/q1dM1z7dddw2SXtVp8n3EBIPk3Q0ahtufIzUQmIlyQ5uHPsH7Svc6TevynffLRWR986Tyq7DFgJLqf+kVKMmbqZ3np8jNMKM7a2JypNObnFTvyGwGGqA+dXJ5YbrunLqQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=VWeJsRUh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="VWeJsRUh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BB9D8C32786; Thu, 15 Aug 2024 13:55:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1723730117; bh=6kB96IIRnD/TiQ4SYVo6uH4dEaD27+N6rahOj6VdSHI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VWeJsRUh3VlZgNK7qcGgYwIIg3BSKk0cPN4bAvlyasgAiw61SlC1qClmHa0NYpVtp mrxhpJOinzgWZSeDictSAvi1PazFFEbciPMuNhY/pukZ2MMrLMF7Qsx6J28900r4fQ QCo5SEWc4BAFS9dB6pFK6pYBKVvKqOhBZ4zCy4Sk= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Zhang Yi , Jan Kara , Theodore Tso , Sasha Levin Subject: [PATCH 5.15 309/484] ext4: check the extent status again before inserting delalloc block Date: Thu, 15 Aug 2024 15:22:47 +0200 Message-ID: <20240815131953.345044793@linuxfoundation.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240815131941.255804951@linuxfoundation.org> References: <20240815131941.255804951@linuxfoundation.org> User-Agent: quilt/0.67 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 5.15-stable review patch. If anyone has any objections, please let me know. ------------------ From: Zhang Yi [ Upstream commit 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c ] ext4_da_map_blocks looks up for any extent entry in the extent status tree (w/o i_data_sem) and then the looks up for any ondisk extent mapping (with i_data_sem in read mode). If it finds a hole in the extent status tree or if it couldn't find any entry at all, it then takes the i_data_sem in write mode to add a da entry into the extent status tree. This can actually race with page mkwrite & fallocate path. Note that this is ok between 1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the folio lock 2. ext4 buffered write path v/s ext4 fallocate because of the inode lock. But this can race between ext4_page_mkwrite() & ext4 fallocate path ext4_page_mkwrite() ext4_fallocate() block_page_mkwrite() ext4_da_map_blocks() //find hole in extent status tree ext4_alloc_file_blocks() ext4_map_blocks() //allocate block and unwritten extent ext4_insert_delayed_block() ext4_da_reserve_space() //reserve one more block ext4_es_insert_delayed_block() //drop unwritten extent and add delayed extent by mistake Then, the delalloc extent is wrong until writeback and the extra reserved block can't be released any more and it triggers below warning: EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared! Fix the problem by looking up extent status tree again while the i_data_sem is held in write mode. If it still can't find any entry, then we insert a new da entry into the extent status tree. Cc: stable@vger.kernel.org Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin --- fs/ext4/inode.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 403f88662bc30..e765c0d05fea2 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1744,6 +1744,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, if (ext4_es_is_hole(&es)) goto add_delayed; +found: /* * Delayed extent could be allocated by fallocate. * So we need to check it. @@ -1788,6 +1789,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, add_delayed: down_write(&EXT4_I(inode)->i_data_sem); + /* + * Page fault path (ext4_page_mkwrite does not take i_rwsem) + * and fallocate path (no folio lock) can race. Make sure we + * lookup the extent status tree here again while i_data_sem + * is held in write mode, before inserting a new da entry in + * the extent status tree. + */ + if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) { + if (!ext4_es_is_hole(&es)) { + up_write(&EXT4_I(inode)->i_data_sem); + goto found; + } + } else if (!ext4_has_inline_data(inode)) { + retval = ext4_map_query_blocks(NULL, inode, map); + if (retval) { + up_write(&EXT4_I(inode)->i_data_sem); + return retval; + } + } + retval = ext4_insert_delayed_block(inode, map->m_lblk); up_write(&EXT4_I(inode)->i_data_sem); if (retval) -- 2.43.0