Date: Fri, 24 Apr 2026 16:00:35 +0100
From: Matt Fleming
To: Shakeel Butt
Cc: Barry Song, Andrew Morton, Christoph Hellwig, Jens Axboe,
	Sergey Senozhatsky, Roman Gushchin, Minchan Kim,
	kernel-team@cloudflare.com, Matt Fleming, Johannes Weiner,
	Chris Li, Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He,
	Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Zi Yan, Axel Rasmussen, Yuanchu Xie, Wei Xu,
	David Hildenbrand, Qi Zheng, Lorenzo Stoakes,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Require LRU reclaim progress before retrying direct reclaim
References: <20260410101550.2930139-1-matt@readmodwrite.com>

On Thu, Apr 16, 2026 at 02:58:30PM -0700, Shakeel Butt wrote:
> On Thu, Apr 16, 2026 at 09:44:55AM +0800, Barry Song wrote:
> >
> > I am still struggling to understand when zram-backed
> > reclamation cannot make progress. Is it because zram is
> > full, or because folio_alloc_swap() fails?
> >
> > Or does zs_malloc() fail, causing pageout() to fail?
> > Even incompressible pages are still written as
> > ZRAM_HUGE pages and reclaimed successfully.
>
> We should have counters for these, right?

Let me try and provide some more data for this. It's hard to replicate on
our production systems so I've resorted to creating a minimal Qemu repro
that has 1GiB RAM and zram disk = 1GiB. The workload is a simple anon
memory mapper that allocs 900MiB of memory and touches all pages for 60s.

zs_malloc
---------

None of the zs_malloc() calls failed and we made ~1.2M of them during
the test. Here's a breakdown of allocation sizes:

@hist_zs_malloc_size:
[32, 64)   4831015 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[64, 128)      409 |                                                    |
[128, 256)    1090 |                                                    |
[256, 512)    2334 |                                                    |
[512, 1K)     5069 |                                                    |
[1K, 2K)     11174 |                                                    |
[2K, 4K)      2395 |                                                    |
[4K, 8K)       237 |                                                    |

During direct reclaim only:

@hist_zs_malloc_size_in_dr:
[32, 64)   1268042 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[64, 128)       52 |                                                    |
[128, 256)     149 |                                                    |
[256, 512)     292 |                                                    |
[512, 1K)     1234 |                                                    |
[1K, 2K)      3539 |                                                    |
[2K, 4K)      1156 |                                                    |
[4K, 8K)       135 |                                                    |

/sys/block/zram0/mm_stat
--------------------------

before: 4096 74 12288 0 12288 0 0 0 0
after:  42622976 9412667 10985472 0 34131968 0 1962 0 237

trace_mm_vmscan_lru_shrink_inactive
-----------------------------------

Anon LRU shrink events: 397,949
  sum(args->nr_scanned):    11,837,216
  sum(args->nr_reclaimed):  4,871,775
  sum(args->nr_dirty):      0
  sum(args->nr_writeback):  0
  sum(args->nr_congested):  0
  sum(args->nr_immediate):  0
  sum(args->nr_ref_keep):   5,200,896
  sum(args->nr_unmap_fail): 0

File LRU shrink events: 2,632
  sum(args->nr_scanned):    26,048
  sum(args->nr_reclaimed):  12,681
  sum(args->nr_dirty):      0
  sum(args->nr_writeback):  0
  sum(args->nr_congested):  0
  sum(args->nr_immediate):  0
  sum(args->nr_ref_keep):   476
  sum(args->nr_unmap_fail): 0

> > I would rather detect what causes the lack of progress
> > and implement a better fallback.
>
> This is a good question. I think we have appropriate counters in /proc/vmstat
> for cases where pages keep getting recycled in the LRUs instead of reclaim.

Here's the output of /proc/vmstat before and after the test runs.

nr_free_pages                  210,825 -> 206,742 (delta=-4,083)
nr_free_pages_blocks           209,920 -> 65,536 (delta=-144,384)
nr_zone_inactive_anon          1,685 -> 136 (delta=-1,549)
nr_zone_active_anon            15 -> 3,774 (delta=3,759)
nr_zone_inactive_file          329 -> 591 (delta=262)
nr_zone_active_file            673 -> 504 (delta=-169)
nr_zspages                     3 -> 2,716 (delta=2,713)
nr_inactive_anon               1,685 -> 136 (delta=-1,549)
nr_active_anon                 15 -> 3,774 (delta=3,759)
nr_inactive_file               329 -> 591 (delta=262)
nr_active_file                 673 -> 504 (delta=-169)
nr_slab_reclaimable            1,352 -> 2,037 (delta=685)
nr_slab_unreclaimable          9,581 -> 11,689 (delta=2,108)
nr_anon_pages                  1,526 -> 262 (delta=-1,264)
nr_mapped                      912 -> 442 (delta=-470)
nr_file_pages                  1,132 -> 4,760 (delta=3,628)
nr_shmem                       162 -> 3,608 (delta=3,446)
nr_swapcached                  0 -> 19 (delta=19)
nr_vmscan_write                0 -> 4,872,846 (delta=4,872,846)
nr_written                     1 -> 4,853,727 (delta=4,853,726)
pgpgin                         1,200 -> 19,035,312 (delta=19,034,112)
pgpgout                        4 -> 19,414,908 (delta=19,414,904)
pswpin                         0 -> 4,758,528 (delta=4,758,528)
pswpout                        0 -> 4,853,726 (delta=4,853,726)
pgalloc_dma                    32 -> 84,262 (delta=84,230)
pgalloc_dma32                  45,989 -> 5,095,307 (delta=5,049,318)
pgfree                         269,896 -> 5,415,629 (delta=5,145,733)
pgactivate                     2,820 -> 14,490 (delta=11,670)
pgdeactivate                   10 -> 10,924 (delta=10,914)
pgfault                        29,321 -> 5,088,427 (delta=5,059,106)
pgmajfault                     3,750 -> 4,794,781 (delta=4,791,031)
pgrefill                       0 -> 13,733 (delta=13,733)
pgreuse                        3,333 -> 5,852 (delta=2,519)
pgsteal_kswapd                 0 -> 3,605,552 (delta=3,605,552)
pgsteal_direct                 0 -> 1,280,091 (delta=1,280,091)
pgscan_kswapd                  0 -> 6,579,240 (delta=6,579,240)
pgscan_direct                  0 -> 5,290,778 (delta=5,290,778)
pgscan_anon                    0 -> 11,843,970 (delta=11,843,970)
pgscan_file                    0 -> 26,048 (delta=26,048)
pgsteal_anon                   0 -> 4,872,962 (delta=4,872,962)
pgsteal_file                   0 -> 12,681 (delta=12,681)
allocstall_normal              0 -> 110 (delta=110)
allocstall_movable             0 -> 32,088 (delta=32,088)
oom_kill                       0 -> 0 (delta=0)
workingset_nodes               0 -> 302 (delta=302)
workingset_refault_anon        0 -> 4,777,591 (delta=4,777,591)
workingset_refault_file        0 -> 870 (delta=870)
workingset_activate_anon       0 -> 487 (delta=487)
kswapd_low_wmark_hit_quickly   0 -> 35 (delta=35)
kswapd_high_wmark_hit_quickly  0 -> 99 (delta=99)
pageoutrun                     0 -> 135 (delta=135)
pgmigrate_success              0 -> 21,317 (delta=21,317)
compact_migrate_scanned        0 -> 98,848 (delta=98,848)
compact_free_scanned           0 -> 136,667 (delta=136,667)
swpin_zero                     0 -> 19,069 (delta=19,069)
swpout_zero                    0 -> 19,120 (delta=19,120)
swap_ra                        0 -> 63 (delta=63)
swap_ra_hit                    0 -> 26 (delta=26)

Happy to do any other tests or pull any other data for you to help.

Thanks,
Matt
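
For anyone wanting to produce a similar before/after comparison on their
own repro, here is a minimal sketch of diffing two /proc/vmstat snapshots.
It is not the script used for the numbers above. The function names
(parse_vmstat, vmstat_delta, format_delta) and the "always" parameter are
hypothetical, and it assumes each snapshot was saved as plain
"name value" lines, which is how /proc/vmstat is laid out.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def vmstat_delta(before, after, always=()):
    """Return {name: (before, after, delta)} for counters that changed.

    Names listed in `always` (a hypothetical convenience) are reported
    even when their value did not change.
    """
    out = {}
    for name, new in after.items():
        old = before.get(name, 0)
        if new != old or name in always:
            out[name] = (old, new, new - old)
    return out


def format_delta(name, old, new, delta):
    """Render one row in the 'before -> after (delta=N)' style."""
    return f"{name:<30} {old:,} -> {new:,} (delta={delta:,})"


if __name__ == "__main__":
    # Tiny inline snapshots for illustration; in practice read two saved
    # copies of /proc/vmstat taken before and after the workload.
    before = parse_vmstat("pswpin 0\npswpout 0\noom_kill 0\n")
    after = parse_vmstat("pswpin 4758528\npswpout 4853726\noom_kill 0\n")
    rows = vmstat_delta(before, after, always=("oom_kill",))
    for name, row in rows.items():
        print(format_delta(name, *row))
```

Counters that are absent from the "before" snapshot are treated as 0,
which is a simplification and may not be what you want if the two
snapshots come from different kernels.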