From: Yosry Ahmed
Date: Wed, 4 Sep 2024 13:21:36 -0700
Subject: Re: [PATCH v5 00/21] mm/zsmalloc: add zpdesc memory descriptor for zswap.zpool
To: Vishal Moola
Cc: Alex Shi, Sergey Senozhatsky, Matthew Wilcox, Andrew Morton, alexs@kernel.org, Vitaly Wool, Miaohe Lin, linux-kernel@vger.kernel.org, linux-mm@kvack.org, minchan@kernel.org, david@redhat.com, 42.hyeyoo@gmail.com, nphamcs@gmail.com, Dan Streetman, Seth Jennings

On Wed, Sep 4, 2024 at 12:51 PM Vishal Moola wrote:
>
> On Thu, Aug 29, 2024 at 05:42:06PM +0800, Alex Shi wrote:
> >
> >
> > On 8/28/24 7:19 AM, Vishal Moola wrote:
> > > On Wed, Aug 14, 2024 at 03:03:54PM +0900, Sergey Senozhatsky wrote:
> > >> On (24/08/08 04:37), Matthew Wilcox wrote:
> > >> [..]
> > >>>> So I guess if we have something
> > >>>>
> > >>>> struct zspage {
> > >>>> ..
> > >>>> struct zpdesc *first_desc;
> > >>>> ..
> > >>>> }
> > >>>>
> > >>>> and we "chain" zpdesc-s to form a zspage, and make each of them point to
> > >>>> a corresponding struct page (memdesc -> *page), then it'll resemble current
> > >>>> zsmalloc and should work for everyone? I also assume for zpdesc-s zsmalloc
> > >>>> will need to maintain a dedicated kmem_cache?
> > >>>
> > >>> Right, we could do that. Each memdesc has to be a multiple of 16 bytes,
> > >>> so we'd be doing something like allocating 32 bytes for each page.
> > >>> Is there really 32 bytes of information that we want to store for
> > >>> each page? Or could we store all of the information in (a somewhat
> > >>> larger) zspage? Assuming we allocate 3 pages per zspage, if we allocate
> > >>> an extra 64 bytes in the zspage, we've saved 32 bytes per zspage.
> > >>
> > >> I certainly like (and appreciate) the approach that saves us
> > >> some bytes here and there. zsmalloc page can consist of 1 to
> > >> up to CONFIG_ZSMALLOC_CHAIN_SIZE (max 16) physical pages. I'm
> > >> trying to understand (in pseudo-C code) what does a "somewhat larger
> > >> zspage" mean. A fixed size array (given that we know the max number
> > >> of physical pages) per-zspage?
> > >
> > > I haven't had the opportunity to respond until now as I was on vacation.
> > >
> > > With the current approach in a memdesc world, we would do the following:
> > >
> > > 1) kmem_cache_alloc() every single Zpdesc
> > > 2) Allocate a memdesc/page that points to its own Zpdesc
> > > 3) Access/Track Zpdescs directly
> > > 4) Use those Zpdescs to build a Zspage
> > >
> > > An alternative approach would move more metadata storage from a Zpdesc
> > > into a Zspage instead.
> > > That extreme would leave us with:
> > >
> > > 1) kmem_cache_alloc() once for a Zspage
> > > 2) Allocate a memdesc/page that points to the Zspage
> > > 3) Use the Zspage to access/track its own subpages (through some magic
> > > we would have to figure out)
> > > 4) Zpdescs are just Zspages (since all the information would be in a Zspage)
> > >
> > > IMO, we should introduce zpdescs first, then start to shift
> > > metadata from "struct zpdesc" into "struct zspage" until we no longer
> > > need "struct zpdesc". My big concern is whether or not this patchset works
> > > towards those goals. Will it make consolidating the metadata easier? And are
> > > these goals feasible (while maintaining the wins of zsmalloc)? Or should we
> > > aim to leave zsmalloc as it is currently implemented?
> >
> > Uh, correct me if I am wrong.
> >
> > IMHO, regarding what this patchset does, it abstracts the memory descriptor usage
> > for zswap/zram.
>
> Sorry, I misunderstood the patchset. I thought it was creating a
> descriptor specifically for zsmalloc, when it seems like this is supposed to
> be a generic descriptor for all zpool allocators. The code comments and commit
> subjects are misleading and should be changed to reflect that.
>
> I'm onboard for using zpdesc for zbud and z3fold as well (or we'd have to come
> up with some other plan for them as well). Once we have a plan all the
> maintainers agree on we can all be on our merry way :)
>
> The questions for all the zpool allocator maintainers are:
> 1) Does your allocator need the space it's using in struct page (aka
> would it need a descriptor in a memdesc world)?
>
> 2) Is it feasible to store the information elsewhere (outside of struct
> page)? And how much effort would that code conversion be?
>
> Thoughts? Seth/Dan, Vitaly/Miaohe, and Sergey?

I would advise against spending effort on z3fold and zbud tbh; we want
to deprecate them.
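
For reference, the byte counting quoted in this thread (per-page descriptors
vs. a somewhat larger zspage) can be sketched in plain userspace C. This is
only an illustration using the numbers mentioned in the discussion (16-byte
memdesc granularity, 32 bytes per page, CONFIG_ZSMALLOC_CHAIN_SIZE capped at
16). The macro and function names below are made up for this sketch and are
not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Numbers quoted in the thread; names are hypothetical, not from the kernel. */
#define MEMDESC_ALIGN       16  /* a memdesc must be a multiple of 16 bytes */
#define PER_PAGE_DESC_BYTES 32  /* "allocating 32 bytes for each page" */
#define MAX_CHAIN_SIZE      16  /* upper bound of CONFIG_ZSMALLOC_CHAIN_SIZE */

/* Round a size up to the memdesc granularity. */
static size_t memdesc_round_up(size_t bytes)
{
	return (bytes + MEMDESC_ALIGN - 1) / MEMDESC_ALIGN * MEMDESC_ALIGN;
}

/* Layout A: one separately allocated descriptor per physical page. */
static size_t chained_overhead(unsigned int nr_pages)
{
	assert(nr_pages >= 1 && nr_pages <= MAX_CHAIN_SIZE);
	return (size_t)nr_pages * memdesc_round_up(PER_PAGE_DESC_BYTES);
}

/* Layout B: per-page metadata folded into the zspage as extra bytes. */
static size_t embedded_overhead(size_t extra_zspage_bytes)
{
	return memdesc_round_up(extra_zspage_bytes);
}

/* Positive result: layout B is smaller. Negative: layout B is larger. */
static long bytes_saved(unsigned int nr_pages, size_t extra_zspage_bytes)
{
	return (long)chained_overhead(nr_pages) -
	       (long)embedded_overhead(extra_zspage_bytes);
}
```

With 3 pages per zspage and 64 extra bytes, bytes_saved(3, 64) evaluates to
32, which matches the figure quoted above. For a single-page zspage the same
64 extra bytes would cost 32 bytes more than one per-page descriptor, so under
these assumptions the trade-off depends on the chain size.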