Message-ID: <8fa07929-ed41-b716-c888-0368f883a020@gmail.com>
Date: Thu, 14 May 2026 16:13:03 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] mm/zswap: Implement proactive writeback
To: Nhat Pham, Yosry Ahmed, hannes@cmpxchg.org, mhocko@kernel.org,
 tj@kernel.org
Cc: akpm@linux-foundation.org, shakeel.butt@linux.dev, mkoutny@suse.com,
 chengming.zhou@linux.dev, muchun.song@linux.dev,
 roman.gushchin@linux.dev, cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Hao Jia,
 Alexandre Ghiti
References: <20260511105149.75584-1-jiahao.kernel@gmail.com>
 <20260511105149.75584-3-jiahao.kernel@gmail.com>
 <12e4784e-2add-d849-7e54-bde8abfa6e78@gmail.com>
 <6fc7fdf0-368c-5129-038e-623f9db2aa88@gmail.com>
From: Hao Jia

On 2026/5/14 04:53, Nhat Pham wrote:
> On Wed, May 13, 2026 at 11:55 AM Yosry Ahmed wrote:
>>
>>>> Zswap objects are organized into LRU and exposed to the shrinker
>>>> interface. Echo-ing to memory.reclaim should also offload some zswap
>>>> entries, correct? Are there still cold zswap entries that escape this,
>>>> somehow?
>>>>
>>>
>>> Yes, the memory.reclaim path does drive some zswap writeback, but
>>> it is not enough for our case.
>>>
>>> 1. For a memcg that has reached steady state (a common case being
>>>    when memory.current is below the policy target), the userspace
>>>    reclaimer may not invoke memory.reclaim on it for a long time,
>>>    and so no second-level offloading happens through
>>>    memory.reclaim. In this state we want
>>>    memory.zswap.proactive_writeback to write back entries that
>>>    have sat in zswap past an age threshold, to further reclaim
>>>    the DRAM still held by the compressed data.
>>>
>>> 2. Even when memory.reclaim is running, the fraction of zswap
>>>    residency that ends up reaching the backing swap device is
>>>    still very small for many of our workloads, and the userspace
>>>    reclaimer has no way to participate in or control the
>>>    granularity of zswap writeback. So in our deployment we prefer
>>>    to leave the zswap shrinker disabled, decouple LRU -> zswap
>>>    from zswap -> swap, and use a dedicated proactive-writeback
>>>    interface that lifts the writeback policy into userspace where
>>>    it can evolve independently of the kernel.
>>
>> To be honest I see the point of proactively reclaiming compressed
>> memory in zswap. If you use memory.reclaim, you are also reclaiming
>> hotter memory in the process, and you are not necessarily getting as
>> much writeback as you want. The memory in zswap is a more conservative
>> choice for proactive reclaim because it's memory that's guaranteed to
>> be cold(ish) and not being accessed.
>>
>> That being said, the interface is not great any way you cut it :/
>>
>> I don't like the 'memory.zswap.proactive_writeback' name, maybe we can
>> stay consistent by doing 'memory.zswap.reclaim', but that just as
>> easily reads as "reclaim using zswap". Maybe
>> 'memory.zswap.do_writeback' or something, idk.
>>
>> I also don't like having two proactive reclaim interfaces, so a voice
>> in my head wants to tie this into 'memory.reclaim' somehow, but that
>> includes adding a pretty specific argument (e.g. 'memory.reclaim
>> zswap_writeback_only=1'.
>>
>> I don't like any of these options, and we also need to consider what
>> the memcg maintainers think. I see the use case of proactive writeback
>> but I am struggling to come up with a clean interface.
>>
>> I also think we should take the 'age' aspect out of the conversation
>> for now, it can be a separate discussion. Well, unless we decide to
>> tie it to memory.reclaim. If memory.reclaim broadly supports age-based
>> reclaim then zswap writeback can be a natural part of that without
>> requiring a specific interface.
>
> Yeah perhaps extending memory.reclaim is best... Sort of analogous to
> the way we have swappiness to balance file v.s anon....

Thanks for the suggestions, Yosry and Nhat.

My only concern is that if we eventually need to add more parameters to
zswap_writeback (such as age or others) in the future, would it make the
parameter parsing and the functionality of memory.reclaim overly complex?

As you mentioned, if the memcg maintainers have no objections, I will
attempt to implement it in v2. How about something like this?

    echo "100M zswap_writeback_only" > memory.reclaim

Thanks,
Hao
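To make the parsing-complexity question concrete, here is a rough
userspace sketch of how an input like the one above could be tokenized.
It is not kernel code and not taken from the patch: struct reclaim_req,
parse_size() and parse_reclaim() are hypothetical names, parse_size() is
only a simplified stand-in for the kernel's memparse(), and the
"zswap_writeback_only" keyword is just the spelling proposed in this
thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of one write to memory.reclaim. */
struct reclaim_req {
	unsigned long long nr_bytes;	/* amount requested */
	bool zswap_writeback_only;	/* keyword proposed in this thread */
};

/* Parse "<digits>[K|M|G]"; simplified userspace stand-in for memparse(). */
static int parse_size(const char *tok, unsigned long long *out)
{
	char *end;
	unsigned long long v;

	if (*tok < '0' || *tok > '9')
		return -1;
	v = strtoull(tok, &end, 10);
	switch (*end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	if (*end != '\0')
		return -1;	/* trailing garbage after the size */
	*out = v;
	return 0;
}

/* Parse "<size> [keyword ...]"; returns 0 on success, -1 on bad input. */
static int parse_reclaim(const char *input, struct reclaim_req *req)
{
	char buf[128];
	char *save, *tok;

	assert(input != NULL && req != NULL);
	if (strlen(input) >= sizeof(buf))
		return -1;
	strcpy(buf, input);
	memset(req, 0, sizeof(*req));

	/* The first token is always the size. */
	tok = strtok_r(buf, " \t\n", &save);
	if (!tok || parse_size(tok, &req->nr_bytes))
		return -1;

	/* Each remaining token must be a known keyword. */
	while ((tok = strtok_r(NULL, " \t\n", &save))) {
		if (!strcmp(tok, "zswap_writeback_only"))
			req->zswap_writeback_only = true;
		else
			return -1;	/* unknown keyword */
	}
	return 0;
}
```

With this shape, each additional parameter (an age threshold, for
example) would be one more branch in the keyword loop plus one more
field in the struct, so the parsing cost grows linearly. Whether the
semantics of memory.reclaim stay manageable is a separate question that
the sketch does not address.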