From: Yosry Ahmed <yosryahmed@google.com>
Date: Thu, 3 Nov 2022 16:31:51 -0700
Subject: Re: [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks
References: <20221026200613.1031261-1-nphamcs@gmail.com> <20221026200613.1031261-3-nphamcs@gmail.com>
To: Minchan Kim
Cc: Johannes Weiner, Sergey Senozhatsky, Nhat Pham,
 akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, ngupta@vflare.org, sjenning@redhat.com,
 ddstreet@ieee.org,
 vitaly.wool@konsulko.com

On Thu, Nov 3, 2022 at 2:43 PM Minchan Kim wrote:
>
> On Thu, Nov 03, 2022 at 01:46:56PM -0700, Yosry Ahmed wrote:
> > On Thu, Nov 3, 2022 at 1:37 PM Minchan Kim wrote:
> > >
> > > On Thu, Nov 03, 2022 at 11:10:47AM -0700, Yosry Ahmed wrote:
> > > < snip >
> > > > > > > > I am also worried that the LRU stuff should be part of the
> > > > > > > > allocator instead of a higher level.
> > > > > > >
> > > > > > > I'm sorry, but that's not a reasonable objection.
> > > > > > >
> > > > > > > These patches implement a core feature of being a zswap backend,
> > > > > > > using standard LRU and locking techniques established by the
> > > > > > > other backends.
> > > > > > >
> > > > > > > I don't disagree that it would be nicer if zswap had a strong
> > > > > > > abstraction for backend pages and a generalized LRU. But that is
> > > > > > > major surgery on a codebase of over 6,500 lines. It's not a
> > > > > > > reasonable ask to change all that first before implementing a
> > > > > > > basic feature that's useful now.
> > > > > >
> > > > > > By the same logic, folks added the LRU logic to their allocators
> > > > > > without considering the effort of moving the LRU into an upper
> > > > > > layer.
> > > > > >
> > > > > > And the trend is still going on: I have seen multiple times that
> > > > > > people are trying to add more allocators. So if it's not a
> > > > > > reasonable ask to consider, we couldn't stop the trend in the end.
> > > > >
> > > > > So there is actually an ongoing effort to do that. Yosry and I have
> > > > > spent quite some time, over email and at Plumbers, coming up with an
> > > > > LRU design that's independent from compression policy.
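To make the above a bit more concrete: the rough shape we have been
discussing is something like the sketch below. Purely illustrative --
every name in it is invented, and it is not an actual proposal:

    /* One shared LRU above the backends; backends only store/load. */
    struct swap_lru_entry {
            struct list_head lru;      /* position on the shared LRU */
            swp_entry_t swpentry;      /* which swap slot this is */
            void *backend_handle;      /* opaque zsmalloc/zbud/z3fold handle */
    };

    struct swap_lru {
            struct list_head list;     /* LRU order, coldest at the tail */
            spinlock_t lock;
    };

    /* Aging policy lives here, not in the allocator. */
    static void swap_lru_touch(struct swap_lru *lru,
                               struct swap_lru_entry *entry)
    {
            spin_lock(&lru->lock);
            /* move to the head: most recently used */
            list_move(&entry->lru, &lru->list);
            spin_unlock(&lru->lock);
    }

The point being that aging and writeback policy would live in one place
above the allocators, so none of them grows its own list again.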
> > > > >
> > > > > My concern is more about the order of doing things:
> > > > >
> > > > > 1. The missing writeback support is a gaping hole in zsmalloc, which
> > > > >    affects production systems. A generalized LRU list is a good
> > > > >    idea, but it's a huge task that, from a user POV, really is not
> > > > >    critical. Even from a kernel dev / maintainer POV, there are
> > > > >    bigger fish to fry in the zswap code base and the backends than
> > > > >    this.
> > > > >
> > > > > 2. Refactoring existing functionality is much easier than writing
> > > > >    generalized code that simultaneously enables new behavior.
> > > > >    zsmalloc is the most complex of our backends. To make its LRU
> > > > >    writeback work we had to patch zswap's ->map ordering to
> > > > >    accommodate it, for example. Such tricky changes are easier to
> > > > >    make and test incrementally.
> > > > >
> > > > > The generalized LRU project will hugely benefit from already having
> > > > > a proven writeback implementation in zsmalloc, because then all the
> > > > > requirements in zswap and zsmalloc will be in black and white.
> > > > >
> > > > > > > I get that your main interest is zram, and so this feature isn't
> > > > > > > of interest to you. But zram isn't the only user, nor is it the
> > > > > > > primary
> > > > > >
> > > > > > I am interested in the feature, but my interest is more in a
> > > > > > general swap layer managing the LRU, so that it could support any
> > > > > > hierarchy among swap devices, not only zswap.
> > > > >
> > > > > I think we're on the same page about the longer term goals.
> > > >
> > > > Yeah. As Johannes said, I was also recently looking into this. It can
> > > > also help solve problems other than consolidating implementations.
> > > > Currently, if zswap rejects a page, it goes into swap, which is
> > > > more-or-less a violation of the page LRUs: hotter pages that are more
> > > > recently reclaimed end up in swap (slow), while colder pages that
> > > > were reclaimed earlier are in zswap. Having a separate layer managing
> > > > the LRU of swap pages can also make sure this doesn't happen.
> > >
> > > True.
> > >
> > > > More broadly, making zswap a separate layer from swap enables other
> > > > improvements, such as using zswap regardless of the presence of a
> > > > backing swapfile, and not consuming space in swapfiles when a page is
> > > > in zswap. Of course, this is a much larger surgery.
> > >
> > > If we could decouple the LRU writeback from zswap and support
> > > compression without a backing swapfile, that sounds like it's becoming
> > > more of a zram. ;-)
> >
> > That's a little bit grey. Maybe we can consolidate them one day :)
> >
> > We have been using zswap without a swapfile at Google for a while; this
> > gives us the ability to reject pages that do not compress well enough
> > for us, which I suspect zram would not support, given that it is
> > designed to be the final destination of the page. Also, having the
>
> zram could do this with little change, but in the current implementation
> it will print a swapout failure message (not a big deal, since we could
> suppress it). Rather than that, though, I have thought we need to move
> the page to the unevictable LRU list while marking it CoW, so we can
> catch the right time to move it back to the evictable LRU list and give
> it a second chance to be compressed. Just off-topic.

Right. We do something similar-ish today. However, this does not work
for zswap if there is a backing swapfile, as the page needs to remain
evictable to the swapfile. A decoupled LRU can also manage this
appropriately.
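For the record, a second-chance scheme along the lines you describe
could look roughly like the below. Pure pseudocode -- every helper in
it is invented, and it is not what we actually run:

    static void incompressible_second_chance(struct page *page)
    {
            /* remember why the page was parked */
            set_page_incompressible(page);       /* invented */
            /* park it where reclaim stops retrying it */
            move_page_to_unevictable(page);      /* invented */
            /* write-protect so the next store faults */
            wrprotect_for_second_chance(page);   /* invented */
            /*
             * A fault handler would then clear the flag and move the
             * page back to the evictable LRU, so the rewritten data
             * gets another shot at compression.
             */
    }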
> > same configuration and code running on machines whether or not they
> > have a swapfile is nice; otherwise one would need to use zram if there
> > is no swapfile and switch to zswap if there is one.
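And for context, the "compress well enough" check I mentioned is
conceptually nothing more than a ratio threshold. Sketch only -- the
helper and the ~25% cutoff here are made up:

    static bool worth_compressing(unsigned int compressed_len)
    {
            /* refuse anything that doesn't save at least ~25% */
            return compressed_len <= 3 * PAGE_SIZE / 4;
    }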