Date: Wed, 25 Sep 2024 15:20:06 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Kanchana P Sridhar, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com,
    shakeel.butt@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com,
    21cnbao@gmail.com, akpm@linux-foundation.org, nanhai.zou@intel.com,
    wajdi.k.feghali@intel.com, vinodh.gopal@intel.com
Subject: Re: [PATCH v7 6/8] mm: zswap: Support mTHP swapout in zswap_store().
Message-ID: <20240925192006.GB876370@cmpxchg.org>
References: <20240924011709.7037-1-kanchana.p.sridhar@intel.com>
    <20240924011709.7037-7-kanchana.p.sridhar@intel.com>
    <20240925134008.GA875661@cmpxchg.org>

On Wed, Sep 25, 2024 at 11:30:34AM -0700, Yosry Ahmed wrote:
> Johannes wrote:
> > If this ever becomes an issue, we can handle it in a fastpath-slowpath
> > scheme: check the limit up front for fast-path failure if we're
> > already maxed out, just like now; then make obj_cgroup_charge_zswap()
> > atomically charge against zswap.max and unwind the store if we raced.
> >
> > For now, I would just keep the simple version we currently have: check
> > once in zswap_store() and then just go ahead for the whole folio.
>
> I am not totally against this but I feel like this is too optimistic.
> I think we can keep it simple-ish by maintaining an ewma for the
> compression ratio, we already have primitives for this (see
> DECLARE_EWMA).
>
> Then in zswap_store(), we can use the ewma to estimate the compressed
> size and use it to do the memcg and global limit checks once, like we
> do today. Instead of just checking if we are below the limits, we
> check if we have enough headroom for the estimated compressed size.
> Then we call zswap_store_page() to do the per-page stuff, then do
> batched charging and stats updates.

I'm not sure what you gain from making a non-atomic check precise. You
can get a hundred threads each determining precisely that *their*
store will fit exactly into the last 800kB before the limit.

> If you think that's an overkill we can keep doing the limit checks as
> we do today,

I just don't see how it would make a practical difference.

What would make a difference is atomic transactional charging of the
compressed size, and unwinding on failure - with the upfront check to
avoid pointlessly compressing (outside of race conditions).

And I'm not against doing that in general, I am just against doing it
by default.
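
To illustrate the fastpath-slowpath scheme described above, here is a
minimal userspace sketch in C11. It is not kernel code: struct
zswap_budget, budget_has_room(), budget_try_charge(), budget_uncharge()
and store_folio() are hypothetical names invented for this example, and
the per-page compressed sizes are simply passed in as an array. It only
shows the shape of the idea: a cheap racy check up front, an atomic
charge per compressed page, and an unwind if a charge fails.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup's zswap usage and its zswap.max. */
struct zswap_budget {
	atomic_size_t used;	/* compressed bytes currently charged */
	size_t max;		/* limit in bytes */
};

/* Racy up-front check: only avoids compressing when already maxed out. */
static bool budget_has_room(struct zswap_budget *b)
{
	return atomic_load(&b->used) < b->max;
}

/* Atomically charge nbytes; leave usage untouched if it would exceed max. */
static bool budget_try_charge(struct zswap_budget *b, size_t nbytes)
{
	size_t old = atomic_load(&b->used);

	do {
		if (nbytes > b->max || old > b->max - nbytes)
			return false;
	} while (!atomic_compare_exchange_weak(&b->used, &old, old + nbytes));

	return true;
}

/* Give back a previous charge. */
static void budget_uncharge(struct zswap_budget *b, size_t nbytes)
{
	size_t old = atomic_fetch_sub(&b->used, nbytes);

	assert(old >= nbytes);
	(void)old;
}

/*
 * Store a folio whose pages compressed to sizes[0..nr_pages-1] bytes.
 * Either every page is charged, or everything charged so far is unwound.
 */
static bool store_folio(struct zswap_budget *b, const size_t *sizes,
			size_t nr_pages)
{
	size_t i, charged = 0;

	if (!budget_has_room(b))
		return false;	/* fast-path failure, nothing compressed */

	for (i = 0; i < nr_pages; i++) {
		if (!budget_try_charge(b, sizes[i])) {
			budget_uncharge(b, charged);	/* we raced or ran out */
			return false;
		}
		charged += sizes[i];
	}
	return true;
}
```

With this shape, a hundred concurrent callers can all pass
budget_has_room(), but only those whose compare-and-swap succeeds get
to keep their store; the rest unwind. That extra machinery is the
complexity in question.
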

It's a lot of complexity, and like I said, the practical usecase for
limiting zswap memory to begin with is quite unclear to me. Zswap is
not a limited resource. It's just memory. And you already had the
memory for the uncompressed copy. So it's a bit strange to me to say
"you have compressed your memory enough, so now you get sent to disk
(or we declare OOM)". What would be a reason to limit it?

It sort of makes sense as a binary switch, but I don't get the usecase
for a granular limit. (And I blame my own cowardice for making the
cgroup knob a limit, to keep options open, instead of a switch.)

All that to say, this would be better in a follow-up patch. We allow
overshooting now, it's not clear how overshooting by a larger amount
makes a categorical difference.

> but I would still like to see batching of all the limit checks,
> charging, and stats updates. It makes little sense otherwise.

Definitely. One check, one charge, one stat update per folio.
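
As a rough illustration of "one check, one charge, one stat update per
folio", here is a small single-threaded userspace sketch in C. The
struct zswap_acct and store_folio_batched() names, and the nr_*
counters, are made up for this example and do not correspond to kernel
symbols. The limit is checked once before any per-page work, so a
folio that starts below the limit may overshoot it, as the simple
version does today.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accounting state; the nr_* counters only exist to show
 * how often each accounting step runs. */
struct zswap_acct {
	size_t charged_bytes;	/* compressed bytes charged so far */
	size_t max_bytes;	/* limit in bytes */
	size_t stored_pages;	/* statistic: pages stored */
	unsigned long nr_checks;
	unsigned long nr_charges;
	unsigned long nr_stat_updates;
};

/*
 * Store a folio whose pages compressed to sizes[0..nr_pages-1] bytes.
 * The limit is checked once up front; charging and the stat update are
 * each done once for the whole folio after the per-page loop.
 */
static bool store_folio_batched(struct zswap_acct *a, const size_t *sizes,
				size_t nr_pages)
{
	size_t i, total = 0;

	/* One check per folio: refuse only if already at or over the limit. */
	a->nr_checks++;
	if (a->charged_bytes >= a->max_bytes)
		return false;

	/* Per-page work (compression in the real thing); just sum sizes here. */
	for (i = 0; i < nr_pages; i++)
		total += sizes[i];

	/* One charge per folio; this may overshoot max_bytes. */
	a->charged_bytes += total;
	a->nr_charges++;

	/* One stat update per folio. */
	a->stored_pages += nr_pages;
	a->nr_stat_updates++;

	return true;
}
```

The overshoot is bounded by the compressed size of a single folio per
concurrent store, which is the "larger amount" being weighed here
against the cost of atomic charging.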