Date: Mon, 25 Mar 2024 08:01:24 +0800
From: Chengming Zhou
Subject: Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
To: Johannes Weiner, Andrew Morton
Cc: Zhongkun He, Chengming Zhou, Yosry Ahmed, Barry Song <21cnbao@gmail.com>, Chris Li, Nhat Pham, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240324210447.956973-1-hannes@cmpxchg.org>
In-Reply-To: <20240324210447.956973-1-hannes@cmpxchg.org>

On 2024/3/25 05:04, Johannes Weiner wrote:
> Zhongkun He reports data corruption when combining zswap with zram.
>
> The issue is the exclusive loads we're doing in zswap. They assume
> that all reads are going into the swapcache, which can assume
> authoritative ownership of the data and so the zswap copy can go.
>
> However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try
> to bypass the swapcache. This results in an optimistic read of the
> swap data into a page that will be dismissed if the fault fails due to
> races. In this case, zswap mustn't drop its authoritative copy.
>
> Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
> Reported-by: Zhongkun He
> Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> Cc: stable@vger.kernel.org [6.5+]
> Signed-off-by: Johannes Weiner
> Tested-by: Zhongkun He

Very nice solution!

Reviewed-by: Chengming Zhou

Thanks.

> ---
>  mm/zswap.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 535c907345e0..41a1170f7cfe 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1622,6 +1622,7 @@ bool zswap_load(struct folio *folio)
>  	swp_entry_t swp = folio->swap;
>  	pgoff_t offset = swp_offset(swp);
>  	struct page *page = &folio->page;
> +	bool swapcache = folio_test_swapcache(folio);
>  	struct zswap_tree *tree = swap_zswap_tree(swp);
>  	struct zswap_entry *entry;
>  	u8 *dst;
> @@ -1634,7 +1635,20 @@ bool zswap_load(struct folio *folio)
>  		spin_unlock(&tree->lock);
>  		return false;
>  	}
> -	zswap_rb_erase(&tree->rbroot, entry);
> +	/*
> +	 * When reading into the swapcache, invalidate our entry. The
> +	 * swapcache can be the authoritative owner of the page and
> +	 * its mappings, and the pressure that results from having two
> +	 * in-memory copies outweighs any benefits of caching the
> +	 * compression work.
> +	 *
> +	 * (Most swapins go through the swapcache. The notable
> +	 * exception is the singleton fault on SWP_SYNCHRONOUS_IO
> +	 * files, which reads into a private page and may free it if
> +	 * the fault fails. We remain the primary owner of the entry.)
> +	 */
> +	if (swapcache)
> +		zswap_rb_erase(&tree->rbroot, entry);
>  	spin_unlock(&tree->lock);
>
>  	if (entry->length)
> @@ -1649,9 +1663,10 @@ bool zswap_load(struct folio *folio)
>  	if (entry->objcg)
>  		count_objcg_event(entry->objcg, ZSWPIN);
>
> -	zswap_entry_free(entry);
> -
> -	folio_mark_dirty(folio);
> +	if (swapcache) {
> +		zswap_entry_free(entry);
> +		folio_mark_dirty(folio);
> +	}
>
>  	return true;
>  }
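
For anyone following the thread, the ownership rule in the patch can be
modeled in a few lines of userspace C. This is only a toy sketch under
loud assumptions: toy_entry, toy_store() and toy_load() are hypothetical
names, not kernel APIs, a single static slot stands in for the zswap
tree, and there is no locking or compression. It only shows that a load
drops the stored copy when the destination is in the swapcache and keeps
it otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a zswap entry; not the kernel structure. */
struct toy_entry {
	bool present;
	char data[16];
};

/* A single slot stands in for the per-swapfile tree (assumption). */
static struct toy_entry toy_slot;

/* Store a copy of the data, truncating to the slot size. */
static void toy_store(const char *data)
{
	strncpy(toy_slot.data, data, sizeof(toy_slot.data) - 1);
	toy_slot.data[sizeof(toy_slot.data) - 1] = '\0';
	toy_slot.present = true;
}

/*
 * Copy the stored data into dst. The entry is invalidated only when
 * the caller reads into the swapcache, which then owns the data.
 * A read that bypasses the swapcache leaves the entry in place, so a
 * failed fault can retry and still find it.
 */
static bool toy_load(char *dst, size_t len, bool swapcache)
{
	if (!toy_slot.present || len == 0)
		return false;

	strncpy(dst, toy_slot.data, len - 1);
	dst[len - 1] = '\0';

	if (swapcache)
		toy_slot.present = false;

	return true;
}
```

In this model, calling toy_load() with swapcache == false any number of
times keeps succeeding, while the first call with swapcache == true is
the last one that finds the entry.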