All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Yosry Ahmed <yosry.ahmed@linux.dev>, Zi Yan <ziy@nvidia.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Usama Arif <usama.arif@linux.dev>,
	Kiryl Shutsemau <kas@kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org,
	Alexander Egorenkov <egorenar@linux.ibm.com>
Subject: Re: [PATCH v3 7/7] mm: switch deferred split shrinker to list_lru - [s390] panic in __memcg_list_lru_alloc
Date: Mon, 30 Mar 2026 16:56:56 -0400	[thread overview]
Message-ID: <acrjmBx_hX-Aa_SH@cmpxchg.org> (raw)
In-Reply-To: <acrf5_rS7hO5Ec0Q@cmpxchg.org>

On Mon, Mar 30, 2026 at 04:41:16PM -0400, Johannes Weiner wrote:
> Hello Mikhail,
> 
> On Mon, Mar 30, 2026 at 06:37:01PM +0200, Mikhail Zaslonko wrote:
> > with this series in linux-next (since next-20260324) I see a reproducible panic on s390 in the
> > dump kernel when running NVMe standalone dump (ngdump).
> > This only happens in the 'capture kernel', normal boot of the same kernel works fine.
> > 
> > [   14.350676] Unable to handle kernel pointer dereference in virtual kernel address space
> > [   14.350682] Failing address: 4000000000000000 TEID: 4000000000000803 ESOP-2 FSI
> > [   14.350686] Fault in home space mode while using kernel ASCE.
> > [   14.350689] AS:0000000002798007 R3:000000002d2c4007 S:000000002d2c3001 P:000000000000013d 
> > [   14.350730] Oops: 0038 ilc:3 [#1]SMP 
> > [   14.350735] Modules linked in: dm_service_time zfcp scsi_transport_fc uvdevice diag288_wdt nvme prng aes_s390 nvme_core des_s390 libdes zcrypt_cex4 dm_mirror dm_region_hash dm_log scsi_dh_rdac scsi_dh_emc scsi_dh_alua paes_s390 crypto_engine pkey_cca pkey_ep11 zcrypt rng_core pkey_pckmo pkey dm_multipath autofs4
> > [   14.350760] CPU: 0 UID: 0 PID: 32 Comm: khugepaged Not tainted 7.0.0-rc5-next-20260324 
> > [   14.350762] Hardware name: IBM 3931 A01 704 (LPAR)
> > [   14.350764] Krnl PSW : 0704d00180000000 000003ffe0443a82 (__memcg_list_lru_alloc+0x52/0x1d0)
> > [   14.350774]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
> > [   14.350776] Krnl GPRS: 0000000000000402 00000000000bece0 0000000000000000 000003ffe1c17928
> > [   14.350778]            00000000001c24ca 0000000000000000 0000000000000000 000003ffe1c17948
> > [   14.350780]            0000000000000000 00000000000824c0 0000037200098000 4000000000000000
> > [   14.350782]            0000000000782400 0000000000000001 0000037fe00f39b8 0000037fe00f3918
> > [   14.350788] Krnl Code: 000003ffe0443a72: a7690000            lghi    %r6,0
> > [   14.350788]            000003ffe0443a76: e380f0a00004        lg      %r8,160(%r15)
> > [   14.350788]           *000003ffe0443a7c: e3b080b80004        lg      %r11,184(%r8)
> > [   14.350788]           >000003ffe0443a82: e330b9400012        lt      %r3,2368(%r11)
> > [   14.350788]            000003ffe0443a88: a7a40065            brc     10,000003ffe0443b52
> > [   14.350788]            000003ffe0443a8c: e3b0f0a00004        lg      %r11,160(%r15)
> > [   14.350788]            000003ffe0443a92: ec68006f007c        cgij    %r6,0,8,000003ffe0443b70
> > [   14.350788]            000003ffe0443a98: e300b9400014        lgf     %r0,2368(%r11)
> > [   14.350825] Call Trace:
> > [   14.350826]  [<000003ffe0443a82>] __memcg_list_lru_alloc+0x52/0x1d0 
> > [   14.350831]  [<000003ffe044529a>] folio_memcg_list_lru_alloc+0xba/0x150 
> > [   14.350834]  [<000003ffe04f279a>] alloc_charge_folio+0x18a/0x250 
> > [   14.350839]  [<000003ffe04f34dc>] collapse_huge_page+0x8c/0x890 
> > [   14.350841]  [<000003ffe04f4222>] collapse_scan_pmd+0x542/0x690 
> > [   14.350844]  [<000003ffe04f65b4>] collapse_single_pmd+0x144/0x240 
> > [   14.350847]  [<000003ffe04f69ce>] collapse_scan_mm_slot.constprop.0+0x31e/0x480 
> > [   14.350849]  [<000003ffe04f6d3c>] khugepaged+0x20c/0x210 
> > [   14.350852]  [<000003ffe019b0a8>] kthread+0x148/0x170 
> > [   14.350856]  [<000003ffe0119fec>] __ret_from_fork+0x3c/0x240 
> > [   14.350860]  [<000003ffe0ffa4b2>] ret_from_fork+0xa/0x30 
> > [   14.350865] Last Breaking-Event-Address:
> > [   14.350865]  [<000003ffe0445294>] folio_memcg_list_lru_alloc+0xb4/0x150
> > [   14.350870] Kernel panic - not syncing: Fatal exception: panic_on_oops

Can you verify whether the kdump kernel boots with
cgroup_disable=memory?

I think there is an issue with how we call __list_lru_init(). The
existing callsites had their own memcg_kmem_online() guards. But the
THP one does not, so we're creating a memcg-aware list_lru, but the
do-while hierarchy walk in __memcg_list_lru_alloc() runs into a NULL
memcg.

Can you try the below on top of that -next checkout?

diff --git a/mm/list_lru.c b/mm/list_lru.c
index 1ccdd45b1d14..7c7024e33653 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -637,7 +637,7 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware, struct shrinker *shr
 	else
 		lru->shrinker_id = -1;
 
-	if (mem_cgroup_kmem_disabled())
+	if (mem_cgroup_disabled() || mem_cgroup_kmem_disabled())
 		memcg_aware = false;
 #endif


  reply	other threads:[~2026-03-30 20:57 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-18 19:53 [PATCH v3 0/7] mm: switch THP shrinker to list_lru Johannes Weiner
2026-03-18 19:53 ` [PATCH v3 1/7] mm: list_lru: lock_list_lru_of_memcg() cannot return NULL if !skip_empty Johannes Weiner
2026-03-18 20:12   ` Shakeel Butt
2026-03-24 11:30   ` Lorenzo Stoakes (Oracle)
2026-03-18 19:53 ` [PATCH v3 2/7] mm: list_lru: deduplicate unlock_list_lru() Johannes Weiner
2026-03-24 11:32   ` Lorenzo Stoakes (Oracle)
2026-03-18 19:53 ` [PATCH v3 3/7] mm: list_lru: move list dead check to lock_list_lru_of_memcg() Johannes Weiner
2026-03-18 20:20   ` Shakeel Butt
2026-03-24 11:34   ` Lorenzo Stoakes (Oracle)
2026-03-18 19:53 ` [PATCH v3 4/7] mm: list_lru: deduplicate lock_list_lru() Johannes Weiner
2026-03-18 20:22   ` Shakeel Butt
2026-03-24 11:36   ` Lorenzo Stoakes (Oracle)
2026-03-18 19:53 ` [PATCH v3 5/7] mm: list_lru: introduce caller locking for additions and deletions Johannes Weiner
2026-03-18 20:51   ` Shakeel Butt
2026-03-20 16:18     ` Johannes Weiner
2026-03-24 11:55   ` Lorenzo Stoakes (Oracle)
2026-03-18 19:53 ` [PATCH v3 6/7] mm: list_lru: introduce folio_memcg_list_lru_alloc() Johannes Weiner
2026-03-18 20:52   ` Shakeel Butt
2026-03-18 21:01   ` Shakeel Butt
2026-03-24 12:01   ` Lorenzo Stoakes (Oracle)
2026-03-30 16:54     ` Johannes Weiner
2026-04-01 14:43       ` Lorenzo Stoakes (Oracle)
2026-03-18 19:53 ` [PATCH v3 7/7] mm: switch deferred split shrinker to list_lru Johannes Weiner
2026-03-18 20:26   ` David Hildenbrand (Arm)
2026-03-18 23:18   ` Shakeel Butt
2026-03-24 13:48   ` Lorenzo Stoakes (Oracle)
2026-03-30 16:40     ` Johannes Weiner
2026-04-01 17:33       ` Lorenzo Stoakes (Oracle)
2026-04-06 21:37         ` Johannes Weiner
2026-04-07  9:55           ` Lorenzo Stoakes (Oracle)
2026-03-27  7:51   ` Kairui Song
2026-03-30 16:51     ` Johannes Weiner
2026-03-30 16:37   ` [PATCH v3 7/7] mm: switch deferred split shrinker to list_lru - [s390] panic in __memcg_list_lru_alloc Mikhail Zaslonko
2026-03-30 19:03     ` Andrew Morton
2026-03-30 20:41     ` Johannes Weiner
2026-03-30 20:56       ` Johannes Weiner [this message]
2026-03-30 22:46         ` Vasily Gorbik
2026-03-31  8:04         ` Mikhail Zaslonko
2026-03-18 21:00 ` [PATCH v3 0/7] mm: switch THP shrinker to list_lru Lorenzo Stoakes (Oracle)
2026-03-18 22:31   ` Johannes Weiner
2026-03-19  8:47     ` Lorenzo Stoakes (Oracle)
2026-03-19  8:52       ` David Hildenbrand (Arm)
2026-03-19 11:45         ` Lorenzo Stoakes (Oracle)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acrjmBx_hX-Aa_SH@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=david@kernel.org \
    --cc=egorenar@linux.ibm.com \
    --cc=kas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=usama.arif@linux.dev \
    --cc=yosry.ahmed@linux.dev \
    --cc=zaslonko@linux.ibm.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.