From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id E05591170E
	for ; Mon, 21 Aug 2023 19:51:30 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5B5A7C433C7;
	Mon, 21 Aug 2023 19:51:30 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1692647490;
	bh=g8XEdtCxs+Z6p6Xs8PAIa7tF9d7ZLbj8Zaoh6QjuRvk=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=WJQd2t3V4KqvNVltfAzr2RmVpAFc5h4jxMBrVklbY4O5YYIikIz8Z7HCEM5F5G/gW
	 MpatM98XzxMs0krbk1nrmgJ0cdGqffDE9NVysa8QTPrSxi53q1/oC+EkJ92uRWG2r+
	 zjTSsZhUfRtX/Gd9E6S01oTNYdTROEX+/QMqcJTU=
From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman,
	patches@lists.linux.dev,
	Andrew Yang,
	Sergey Senozhatsky,
	AngeloGioacchino Del Regno,
	Matthias Brugger,
	Minchan Kim,
	Sebastian Andrzej Siewior,
	Andrew Morton,
	Sasha Levin
Subject: [PATCH 6.1 005/194] zsmalloc: fix races between modifications of fullness and isolated
Date: Mon, 21 Aug 2023 21:39:44 +0200
Message-ID: <20230821194122.943487705@linuxfoundation.org>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230821194122.695845670@linuxfoundation.org>
References: <20230821194122.695845670@linuxfoundation.org>
User-Agent: quilt/0.67
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Andrew Yang

[ Upstream commit 4b5d1e47b69426c0f7491d97d73ad0152d02d437 ]

We encountered many kernel exceptions of VM_BUG_ON(zspage->isolated ==
0) in dec_zspage_isolation() and BUG_ON(!pages[1]) in zs_unmap_object()
lately. This issue only occurs when migration and reclamation occur at
the same time.

With our memory stress test, we can reproduce this issue several times
a day. We have no idea why no one else encountered this issue. BTW, we
switched to the new kernel version with this defect a few months ago.

Since fullness and isolated share the same unsigned int, modifications of
them should be protected by the same lock.

[andrew.yang@mediatek.com: move comment]
  Link: https://lkml.kernel.org/r/20230727062910.6337-1-andrew.yang@mediatek.com
Link: https://lkml.kernel.org/r/20230721063705.11455-1-andrew.yang@mediatek.com
Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration")
Signed-off-by: Andrew Yang
Reviewed-by: Sergey Senozhatsky
Cc: AngeloGioacchino Del Regno
Cc: Matthias Brugger
Cc: Minchan Kim
Cc: Sebastian Andrzej Siewior
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Sasha Levin
---
 mm/zsmalloc.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 326faa751f0a4..2f5e6d35b03bd 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1816,6 +1816,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
 
 static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 {
+	struct zs_pool *pool;
 	struct zspage *zspage;
 
 	/*
@@ -1826,9 +1827,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 	VM_BUG_ON_PAGE(PageIsolated(page), page);
 
 	zspage = get_zspage(page);
-	migrate_write_lock(zspage);
+	pool = zspage->pool;
+	spin_lock(&pool->lock);
 	inc_zspage_isolation(zspage);
-	migrate_write_unlock(zspage);
+	spin_unlock(&pool->lock);
 
 	return true;
 }
@@ -1895,12 +1897,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	kunmap_atomic(s_addr);
 
 	replace_sub_page(class, zspage, newpage, page);
+	dec_zspage_isolation(zspage);
 	/*
 	 * Since we complete the data copy and set up new zspage structure,
 	 * it's okay to release the pool's lock.
 	 */
 	spin_unlock(&pool->lock);
-	dec_zspage_isolation(zspage);
 	migrate_write_unlock(zspage);
 
 	get_page(newpage);
@@ -1917,15 +1919,17 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 
 static void zs_page_putback(struct page *page)
 {
+	struct zs_pool *pool;
 	struct zspage *zspage;
 
 	VM_BUG_ON_PAGE(!PageMovable(page), page);
 	VM_BUG_ON_PAGE(!PageIsolated(page), page);
 
 	zspage = get_zspage(page);
-	migrate_write_lock(zspage);
+	pool = zspage->pool;
+	spin_lock(&pool->lock);
 	dec_zspage_isolation(zspage);
-	migrate_write_unlock(zspage);
+	spin_unlock(&pool->lock);
 }
 
 static const struct movable_operations zsmalloc_mops = {
-- 
2.40.1
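
Illustration only, not part of the patch: the changelog's point that fullness and isolated share one unsigned int can be shown with a small user-space sketch. Updating a bitfield is a read-modify-write of the whole storage unit, so a writer of one field can overwrite a concurrent update to its neighbour unless both writers hold the same lock. Everything below is hypothetical (struct demo_zspage, struct demo_pool, the field widths, and a pthread mutex standing in for pool->lock); it is not the kernel code, only a model of the locking rule the patch enforces.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical stand-in for struct zspage. The widths are assumptions
 * chosen for the demo; only the fact that both fields live in the same
 * unsigned int matters.
 */
struct demo_zspage {
	unsigned int fullness:2;
	unsigned int isolated:3;
};

/* On common ABIs both bitfields are packed into one unsigned int. */
static_assert(sizeof(struct demo_zspage) == sizeof(unsigned int),
	      "fullness and isolated are expected to share one word");

struct demo_pool {
	pthread_mutex_t lock;		/* plays the role of pool->lock */
	struct demo_zspage zspage;
};

#define DEMO_ITERATIONS 100000

/* Models isolate + putback: raise and drop the isolated count under the lock. */
static void *demo_isolate_worker(void *arg)
{
	struct demo_pool *pool = arg;

	for (int i = 0; i < DEMO_ITERATIONS; i++) {
		pthread_mutex_lock(&pool->lock);
		pool->zspage.isolated++;
		pthread_mutex_unlock(&pool->lock);

		pthread_mutex_lock(&pool->lock);
		pool->zspage.isolated--;
		pthread_mutex_unlock(&pool->lock);
	}
	return NULL;
}

/* Models fullness updates, taken under the SAME lock as the isolated updates. */
static void *demo_fullness_worker(void *arg)
{
	struct demo_pool *pool = arg;

	for (int i = 0; i < DEMO_ITERATIONS; i++) {
		pthread_mutex_lock(&pool->lock);
		pool->zspage.fullness = (unsigned int)i & 3u;
		pthread_mutex_unlock(&pool->lock);
	}
	return NULL;
}

/*
 * Runs both writers concurrently and returns the final isolated count,
 * or -1 if the threads could not be set up. With a single shared lock
 * no increment or decrement can be lost, so the count ends at 0.
 */
static int demo_run(void)
{
	struct demo_pool pool = { .zspage = { 0, 0 } };
	pthread_t isolate_thread, fullness_thread;

	if (pthread_mutex_init(&pool.lock, NULL) != 0)
		return -1;
	if (pthread_create(&isolate_thread, NULL, demo_isolate_worker, &pool) != 0)
		return -1;
	if (pthread_create(&fullness_thread, NULL, demo_fullness_worker, &pool) != 0) {
		pthread_join(isolate_thread, NULL);
		return -1;
	}

	pthread_join(isolate_thread, NULL);
	pthread_join(fullness_thread, NULL);
	pthread_mutex_destroy(&pool.lock);

	return (int)pool.zspage.isolated;
}
```

If the two workers took different locks, as the pre-patch code effectively did with migrate_write_lock() on one side and pool->lock on the other, a stale read-modify-write of the shared word could drop an isolated update, which would be consistent with the VM_BUG_ON(zspage->isolated == 0) reports described in the changelog.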