From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24C43C54E58 for ; Mon, 25 Mar 2024 18:41:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B82D56B0092; Mon, 25 Mar 2024 14:41:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B33A56B0093; Mon, 25 Mar 2024 14:41:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A239A6B0095; Mon, 25 Mar 2024 14:41:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 9064B6B0092 for ; Mon, 25 Mar 2024 14:41:45 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 6275C1A0297 for ; Mon, 25 Mar 2024 18:41:45 +0000 (UTC) X-FDA: 81936430170.19.6F52247 Received: from mail-ej1-f50.google.com (mail-ej1-f50.google.com [209.85.218.50]) by imf20.hostedemail.com (Postfix) with ESMTP id 8D6701C001E for ; Mon, 25 Mar 2024 18:41:43 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=QEzL9PVF; spf=pass (imf20.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.50 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1711392103; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=rKvUlcpLk63ACsjKDDZfWCX8E5d/fciiBf5WKGyFmQs=; b=T4J6hjVxV9Buafo+uOwsKavajhL9vFpmKf6KxdD9bCQlXFGALHACtBZ8e3iHAik+LjNTDD ZlqGo0rVD8GfRTbXlJBqHNAKHf1J4A4CcnN0tiKT6YnofFigI1we/4P9cEinFulrK8a41V QouyeXE66xBch6uIbVIhX4UNw40+eHM= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1711392103; a=rsa-sha256; cv=none; b=1wB5jqxfG4lemZP8mcz6ZYX19V+xydC3MIz1jpJcJtwOui7VGVmH4qNnti2o/PqchULcWT C3J+D1gKNqCnmOgXb/cw4tTkWtizF9/fqp8MyqhiSF+No263IERNkaaVV43pfye16m62CR Gg6sTdaSgugLYuFqLL3dvfTvLWm9O38= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=QEzL9PVF; spf=pass (imf20.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.50 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-ej1-f50.google.com with SMTP id a640c23a62f3a-a466e53f8c0so615195166b.1 for ; Mon, 25 Mar 2024 11:41:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1711392102; x=1711996902; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=rKvUlcpLk63ACsjKDDZfWCX8E5d/fciiBf5WKGyFmQs=; b=QEzL9PVFr6vPo6xM5jPZcxZ1BS1Z1DNL9RnDqsVsnEp9TfkrEX2O4dO9C92Pyh1J9c BCAmlDCzF0ALa82DDYzy949I0pEPThVr9jy2ZE6epmzy6Pw3vjNI51CYPEocdVToF6Qg xtn10NeZL5mDMr62VVrVn1nkdkYKiWFL/8/YutF1WnpKF2APgecX5bJLarWG89YvoVi5 oJhNQ5ABvOGo3CW9BsIV71IhPffa0A5GRx0kWV9L0L3l+spzuBKiWS6iuxGCrBh+XF7l 0Krj9clRgwi6SHaWsa2FcUA8iQTbTjS/5/Fitk6p6t809H3BGOYB9kdsmcLzLbbF54C9 bEJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711392102; x=1711996902; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rKvUlcpLk63ACsjKDDZfWCX8E5d/fciiBf5WKGyFmQs=; b=g/9u6sJrDlSzEDUqILKJBNsYVQEmsDjtMRo0PJAS+yDLdp88u9tdgDrMNRuvGIoP2W J5w/Pwc29/gWXX/4bDOF7iKXkW1grFGldIXO9ChacHESbOEwFbkkQxp2LDDFTv8bpGPu h/j33DVYaev7NlMjMzlioquRaEdryidMcBxEsT5Scm6xUc+uMSQ3bSuQfSl5FBHwUfG/ HZSllvpQV9YPLHpWytIZHOM1mUDxLjJ65WqE6bhWDOESk2zrf8qRB6acwnAB9kA/VwQ2 Xk19hlLFol7fgAv8rTZowO0n7vdSoMEKyYLJ9N7kLvXvzsxMIS3heeOtKnMx8XOM/HSg 9sWw== X-Forwarded-Encrypted: i=1; AJvYcCUoIXXuomRvfXOw8D0pRWMmMVk94vmrb/0G+C/Vhr5ZL+0JM/S8IFFHV2S4TBcfNLRJUmAQFb+8uwIPldc501OTYQg= X-Gm-Message-State: AOJu0YxJjzzrYLdWqqwSkNi4U02d8xOjtNUvfEBQedFFIaujdrqwVEEV LUJG2nWR+roblKUZXHFQ2STqEddiLreHD3cZjo9+UHM60ww75labaJeCFAzWyqtL6vJQoRxFnH9 QHbZFl8aqn/dQ3wAzr24osx03ZDbfH/5UrEPTm98G6RvHygmvXVVL X-Google-Smtp-Source: AGHT+IH7U9MjyZWBEh4R0acUddTJ1Je9zTxp/hmSsCVvU8DHvMTrz/lloebRfL6eyCoYpamLylTm3XvKe8aG84lVarU= X-Received: by 2002:a17:906:3993:b0:a46:8c9f:f783 with SMTP id h19-20020a170906399300b00a468c9ff783mr5661833eje.67.1711392101766; Mon, 25 Mar 2024 11:41:41 -0700 (PDT) MIME-Version: 1.0 References: <20240324210447.956973-1-hannes@cmpxchg.org> <20240325163003.GA42450@cmpxchg.org> In-Reply-To: <20240325163003.GA42450@cmpxchg.org> From: Yosry Ahmed Date: Mon, 25 Mar 2024 11:41:05 -0700 Message-ID: Subject: Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices To: Johannes Weiner Cc: Andrew Morton , Zhongkun He , Chengming Zhou , Barry Song <21cnbao@gmail.com>, Chris Li , Nhat Pham , linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: zwscmi6wi4p7ac88tmz7zwaz4oehxsn4 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 8D6701C001E X-Rspam-User: X-HE-Tag: 1711392103-961428 X-HE-Meta: U2FsdGVkX19cBCeuq/oKasguQB3S8VYrON6Dv/5B2+7FeGuVNlqi0k8tVZZWJnFFkRfMFIolte71RyL8Vz63jof/zKRdhM4CJL4FKeqS4VRhPkxVd+TVjweJ3TReC+3dQtpnz6YDuvyn/5Rl/UzfLIqNqdR9QlkhpDmO+HWemkfv8CxnpS1OuoFlFPMejSA3ZLGFpx/k3cVrHXPEUSS/dziUck6+PXNMOAN6MmSEYB1jupAAwEIEK04GntVZpbHeqYIzdMR3zDPUgdis3sjMAwJlhV0xXPJFGfcP87qzwQfdIYtVzfX5JVYbeIW2arUfM9bdNwMeG9lyGcP+7Du69AdGc3ISDnNfas+To5aIfvgD7JBddMWn3qXRsczOQWXxksgT196MnrQytbXFpgFHj+pKOJT7mEXGNtmjZzfvWrqJZe3ttjjsXItBvOX4uO3cQZW1K7joggRgGkLHKYLipQ+qmN7q2bbtw8Usvu7ZQR37V7CnI2xZ40izO+4S5B0TQlyuFwd6R4X0sfOl5TP4cMvb4YJqOCzPgG64EMK5dB+U0Oi/py4z0FCSrMd/FvczU7aqNDFjNZ/oQ9YHOwVn0MfGvUlpo3iZtX5119QMfT9VM5Z6JS0S+Ut1BaliMSa7VWxxCx/2mvZrpr7SSp72VkDQ93JhZBMA3gh0I1UhIbAIPSQuSLdrWEzLpIg/uUjrmKriqKem7ulUmAP63o4zcrClFvRzd8KriEaUC9TQTigEgOoZw/3W2Cm9xDyguMtmlqVNSwYmliQ/Vad3kJCGMV5DC/o38q2R1JEZyTlS5lEu7cOJmURz0NFS1S5XyRHw+y4FfIrxfhsFq8QVGqQ/T2HL6WZ/TkeOaZd69NG1h06zYuaTiKJx/eEv6EvlwRrJBFCddY3+k0kD7dqhaERFgO3QAp4qKHujr4MA/bYJqAhtIjknnDe++0s2S38+IbSZQUpHQjDkoI+Lv8apl7x 7MNcOWDD i8ikXE6/4n0ukX6PFBhkpJT3oT2RHwgM6YIR9qJkhOulsuTLe/a28QvJFF5rTZYlZGgqGUvzfubzPpOl758TvanNerZDA3PSpcKThX3ZWFKPUtijV2I12ztD8jrb/2PQfdzm8NNIYBrPXasdst7TV0dWXaDmU1itjfIAtrjL/1bGHsyIKt0sFD1YNwRrFq6Y7HvT6Tqeaul0XWhTFkrSNJ8bbbjjwjPM/8bYzOScTjJjKQqZ86S4wlZzCmZPDqw4h5ElfrG3axzT+ueQlWooexBwswRRgq15vt4KLLVoLYir+BwRa1wAMXZT2knygb1HIwomjB6pCudvtgOw0Blm2LJqDP9bADlFVxekV4ubEkDWBb9l1I9Bzh6l0nrdtghzLALV0ZnlyuXTa+fY9j4TMjpspM6DNC6zarGpFcXqRlpSOm3mFYrf1onLA/3ArgE0Nkr5+Q8Kj/G8OvMOm5C+akaw/IonYKicuhCYw X-Bogosity: Ham, tests=bogofilter, spamicity=0.000005, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Mar 25, 2024 at 9:30=E2=80=AFAM Johannes Weiner wrote: > > On Sun, Mar 24, 2024 at 02:22:46PM -0700, Yosry Ahmed wrote: > > On Sun, Mar 24, 2024 at 2:04=E2=80=AFPM Johannes Weiner wrote: > > > > > > Zhongkun He reports data corruption when combining zswap with zram. > > > > > > The issue is the exclusive loads we're doing in zswap. They assume > > > that all reads are going into the swapcache, which can assume > > > authoritative ownership of the data and so the zswap copy can go. > > > > > > However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will tr= y > > > to bypass the swapcache. This results in an optimistic read of the > > > swap data into a page that will be dismissed if the fault fails due t= o > > > races. In this case, zswap mustn't drop its authoritative copy. > > > > > > Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=3DzV9P691B9bVq33erw= OXNTmEaUbi9DrDeJzw@mail.gmail.com/ > > > Reported-by: Zhongkun He > > > Fixes: b9c91c43412f ("mm: zswap: support exclusive loads") > > > Cc: stable@vger.kernel.org [6.5+] > > > Signed-off-by: Johannes Weiner > > > Tested-by: Zhongkun He > > > > Do we also want to mention somewhere (commit log or comment) that > > keeping the entry in the tree is fine because we are still protected > > from concurrent loads/invalidations/writeback by swapcache_prepare() > > setting SWAP_HAS_CACHE or so? > > I don't think it's necessary, as zswap isn't doing anything special > here. It's up to the caller to follow the generic swap exclusion > protocol that zswap also adheres to. So IMO the relevant comment > should be, and is, above that swapcache_prepare() in do_swap_page(). >From the perspective of someone looking at the zswap code, it isn't immediately clear what protects the zswap entry in the non-exclusive load case from being freed from under us. At some point we had a refcount, then we used to remove it from the tree under lock so others wouldn't have access to it. Now it's less clear because we rely on protection outside of zswap code. We also document other places where we rely on the swapcache for synchronization, so I think it may be worth briefly mentioning this here as well, especially that in this code we explicitly check for the folio not being in the swapcache. That said, I don't feel strongly about it. Tracking down the SWP_SYNCHRONOUS_IO code should eventually make it clear. Also, the commit log will end up having a link to this thread anyway so the details are not completely unfindable :)