From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-181.mta0.migadu.com (out-181.mta0.migadu.com [91.218.175.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B95613DA5BD for ; Tue, 28 Apr 2026 10:20:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.181 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777371648; cv=none; b=YXdykpEibpsZcbI8n/WcTdanbHWYC4XxkxiC307JuhCp/v0F5Tn3n5tMS58gOh/ZHxoAhQjmSY2AxZsVEgjh8rBVND3dxm0DgJKxC/+HN6vJltlCyDQvrzyLFO0IjKKDVb0vGGmTkmbiewp6z+6sttOqazAewjko44C0nDO23lU= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777371648; c=relaxed/simple; bh=/D40+Kj5C6EgJHgc5lHn5/AssBmdhcktb1Du5CL3res=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=R2Fukf4tN7KZphSuIDZckW75ctLKny8FY16x9RdUZjVD9zJxZiHGEUPpxkY5qntxd5iIC1HbedbLxAbRdvmNxVbz9RMVUnGa4NWtzXhgE6cJQI4LJdOD6+z6olQ5QO60cShW2iw7EXQhTknR1GVx3YDeNTFwY5TUzh8iq/xSk6o= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=sCOv82sH; arc=none smtp.client-ip=91.218.175.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="sCOv82sH" Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1777371644; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ir9gkX6ap7Dl25Nv6DGNL7wm/7HC7xIHOVSKox27TU0=; b=sCOv82sHo8j5DwFPY6zfrT/QR99Up68p9w56NxjMqvWU9MoK4t3UBvo2YJ2SEk/LeK24W9 /MOahleMNZtzPj5D/yJlyhVe0QZRZ7/5KWImhYzt5xp3XcOPOtXgrfpDJ0FCC3ZeQdPmgK afJq29LB6ZxvF5mhNAo2cF4pPog8lDE= Date: Tue, 28 Apr 2026 18:19:53 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [PATCH] mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end() To: Shakeel Butt Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev, muchun.song@linux.dev, yosry@kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng References: <20260428030621.94470-1-qi.zheng@linux.dev> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Qi Zheng In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Migadu-Flow: FLOW_OUT On 4/28/26 5:59 PM, Shakeel Butt wrote: > On Tue, Apr 28, 2026 at 11:06:21AM +0800, Qi Zheng wrote: >> From: Qi Zheng >> >> Currently, get_non_dying_memcg_start() and get_non_dying_memcg_end() both >> evaluate cgroup_subsys_on_dfl(memory_cgrp_subsys) independently to >> determine whether to acquire or release the RCU read lock. >> >> However, the result of cgroup_subsys_on_dfl() can change dynamically at >> runtime due to cgroup hierarchy rebinding (e.g., when the memory >> controller is moved between cgroup v1 and v2 hierarchies). This can cause >> the following warning: >> >> ===================================== >> WARNING: bad unlock balance detected! >> 7.0.0-next-20260420+ #83 Tainted: G W >> ------------------------------------- >> memcg-repro/270 is trying to release lock (rcu_read_lock) at: >> [] rcu_read_unlock+0x17/0x60 >> but there are no more locks to release! >> >> other info that might help us debug this: >> 1 lock held by memcg-repro/270: >> #0: ffff888102fa2088 (vm_lock){++++}-{0:0}, at: do_user_addr_fault+0x285/0x880 >> >> stack backtrace: >> CPU: 0 UID: 0 PID: 270 Comm: memcg-repro Tainted: G W 7.0.0-next-20260420+ # >> Tainted: [W]=WARN >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 >> Call Trace: >> >> ? rcu_read_unlock+0x17/0x60 >> dump_stack_lvl+0x77/0xb0 >> print_unlock_imbalance_bug+0xe0/0xf0 >> ? rcu_read_unlock+0x17/0x60 >> lock_release+0x21d/0x2a0 >> rcu_read_unlock+0x1c/0x60 >> do_pte_missing+0x233/0xb40 >> __handle_mm_fault+0x80e/0xcd0 >> handle_mm_fault+0x146/0x310 >> do_user_addr_fault+0x303/0x880 >> exc_page_fault+0x9b/0x270 >> asm_exc_page_fault+0x26/0x30 >> RIP: 0033:0x5590e4eb41ea >> Code: 61 cc 66 0f 6f e0 66 0f 61 c2 66 0f db cd 66 0f 69 e2 66 0f 6f d0 66 0f 69 d4 66 0f 61 0 >> RSP: 002b:00007ffcad25f030 EFLAGS: 00010202 >> RAX: 00005590e4eb8010 RBX: 00007ffcad260f7d RCX: 00007f73c474d44d >> RDX: 00005590e4eb80a0 RSI: 00005590e4eb503c RDI: 000000000000000f >> RBP: 00005590e4eb70a0 R08: 0000000000000000 R09: 00007f73c483a680 >> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >> R13: 00007ffcad25f180 R14: 00005590e4eb6dd8 R15: 00007f73c4869020 >> >> ------------[ cut here ]------------ >> >> Fix this by explicitly tracking the RCU lock state, ensuring that >> rcu_read_unlock() in get_non_dying_memcg_end() is strictly paired with >> the lock acquisition, regardless of any runtime rebinding events. >> >> Fixes: 8285917d6f38 ("mm: memcontrol: prepare for reparenting non-hierarchical stats") >> Signed-off-by: Qi Zheng >> --- >> mm/memcontrol.c | 31 +++++++++++++++++++++---------- >> 1 file changed, 21 insertions(+), 10 deletions(-) >> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index c3d98ab41f1f1..38f48a45b7ae5 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -805,12 +805,17 @@ static long memcg_state_val_in_pages(int idx, long val) >> * Used in mod_memcg_state() and mod_memcg_lruvec_state() to avoid race with >> * reparenting of non-hierarchical state_locals. >> */ >> -static inline struct mem_cgroup *get_non_dying_memcg_start(struct mem_cgroup *memcg) >> +static inline struct mem_cgroup *get_non_dying_memcg_start(struct mem_cgroup *memcg, >> + bool *rcu_locked) >> { >> - if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) >> + /* Rebinding can cause this value to be changed at runtime */ >> + if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) { >> + *rcu_locked = false; >> return memcg; >> + } >> >> rcu_read_lock(); >> + *rcu_locked = true; >> >> while (memcg_is_dying(memcg)) >> memcg = parent_mem_cgroup(memcg); >> @@ -818,20 +823,23 @@ static inline struct mem_cgroup *get_non_dying_memcg_start(struct mem_cgroup *me >> return memcg; >> } >> >> -static inline void get_non_dying_memcg_end(void) >> +static inline void get_non_dying_memcg_end(bool rcu_locked) >> { >> - if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) >> + if (!rcu_locked) >> return; >> >> rcu_read_unlock(); >> } >> #else >> -static inline struct mem_cgroup *get_non_dying_memcg_start(struct mem_cgroup *memcg) >> +static inline struct mem_cgroup *get_non_dying_memcg_start(struct mem_cgroup *memcg, >> + bool *rcu_locked) >> { >> + *rcu_locked = false; > > We don't need to set rcu_locked to false here as we don't access in !V1 build > option. Will do. > > With that fixed, you can add: > > Acked-by: Shakeel Butt Thanks!