All of lore.kernel.org
 help / color / mirror / Atom feed
From: Muchun Song <muchun.song@linux.dev>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Muchun Song <songmuchun@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	vbabka@kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm: kmem: add lockdep assertion to obj_cgroup_memcg
Date: Wed, 31 Jul 2024 15:02:40 +0800	[thread overview]
Message-ID: <B54AF0F4-6CE8-4791-868E-62C7704AB832@linux.dev> (raw)
In-Reply-To: <3c4b978b-b1fe-42d2-b1a7-a58609433f3c@samsung.com>



> On Jul 31, 2024, at 02:52, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> On 25.07.2024 11:43, Muchun Song wrote:
>> The obj_cgroup_memcg() is supposed to safe to prevent the returned
>> memory cgroup from being freed only when the caller is holding the
>> rcu read lock or objcg_lock or cgroup_mutex. It is very easy to
>> ignore thoes conditions when users call some upper APIs which call
>> obj_cgroup_memcg() internally like mem_cgroup_from_slab_obj() (See
>> the link below). So it is better to add lockdep assertion to
>> obj_cgroup_memcg() to find those issues ASAP.
>> 
>> Because there is no user of obj_cgroup_memcg() holding objcg_lock
>> to make the returned memory cgroup safe, do not add objcg_lock
>> assertion (We should export objcg_lock if we really want to do).
>> Additionally, this is some internal implementation detail of memcg
>> and should not be accessible outside memcg code.
>> 
>> Some users like __mem_cgroup_uncharge() do not care the lifetime
>> of the returned memory cgroup, which just want to know if the
>> folio is charged to a memory cgroup, therefore, they do not need
>> to hold the needed locks. In which case, introduce a new helper
>> folio_memcg_charged() to do this. Compare it to folio_memcg(), it
>> could eliminate a memory access of objcg->memcg for kmem, actually,
>> a really small gain.
>> 
>> Link: https://lore.kernel.org/all/20240718083607.42068-1-songmuchun@bytedance.com/
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> 
> This patch landed in today's linux-next as commit 230b2f1f31b9 ("mm: 
> kmem: add lockdep assertion to obj_cgroup_memcg"). I my tests I found 
> that it triggers the following warning on Debian bookworm/sid system 
> image running under QEMU RISCV64:

Thanks for your report.

I'd like to say excellent since it indeed indicates this patch works
well. Your report is actually a bug that I fixed it in [1] but not
related to this patch.

[1] https://lore.kernel.org/all/20240718083607.42068-1-songmuchun@bytedance.com/

> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at include/linux/memcontrol.h:373 
> mem_cgroup_from_slab_obj+0x13e/0x1ea
> Modules linked in:
> CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 6.10.0+ #15154
> Hardware name: riscv-virtio,qemu (DT)
> epc : mem_cgroup_from_slab_obj+0x13e/0x1ea
>  ra : mem_cgroup_from_slab_obj+0x13c/0x1ea
> ...
> [<ffffffff80257256>] mem_cgroup_from_slab_obj+0x13e/0x1ea
> [<ffffffff801f0b3e>] list_lru_del_obj+0xa6/0xc2
> [<ffffffff8027c6c6>] d_lru_del+0x8c/0xa4
> [<ffffffff8027da60>] __dentry_kill+0x15e/0x17a
> [<ffffffff8027ec3c>] dput.part.0+0x242/0x3e6
> [<ffffffff8027edee>] dput+0xe/0x18
> [<ffffffff8027324c>] lookup_fast+0x80/0xce
> [<ffffffff80273e28>] walk_component+0x20/0x13c
> [<ffffffff802747e2>] path_lookupat+0x64/0x16c
> [<ffffffff80274bf4>] filename_lookup+0x76/0x122
> [<ffffffff80274d80>] user_path_at+0x30/0x4a
> [<ffffffff802d12bc>] __riscv_sys_name_to_handle_at+0x52/0x1d8
> [<ffffffff80b60324>] do_trap_ecall_u+0x14e/0x1da
> [<ffffffff80b6c546>] handle_exception+0xca/0xd6
> irq event stamp: 198187
> hardirqs last  enabled at (198187): [<ffffffff8028ca9e>] 
> lookup_mnt+0x186/0x308
> hardirqs last disabled at (198186): [<ffffffff8028ca74>] 
> lookup_mnt+0x15c/0x308
> softirqs last  enabled at (198172): [<ffffffff800e34f6>] 
> cgroup_apply_control_enable+0x1f6/0x2fc
> softirqs last disabled at (198170): [<ffffffff800e34d8>] 
> cgroup_apply_control_enable+0x1d8/0x2fc
> ---[ end trace 0000000000000000 ]---
> 
> Similar warning appears on ARM64 Debian bookworm system. Reverting it on 
> top of linux-next hides the issue, but I assume this is not the best way 
> to fix it.
> 
> I'm testing kernel built from riscv/defconfig with PROVE_LOCKING, 
> DEBUG_ATOMIC_SLEEP, DEBUG_DRIVER and DEBUG_DEVRES options enabled.
> 
>> ---
>> v3:
>>  - Use lockdep_assert_once(Vlastimil).
>> 
>> v2:
>>  - Remove mention of objcg_lock in obj_cgroup_memcg()(Shakeel Butt).
>> 
>>  include/linux/memcontrol.h | 20 +++++++++++++++++---
>>  mm/memcontrol.c            |  6 +++---
>>  2 files changed, 20 insertions(+), 6 deletions(-)
>> 
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index fc94879db4dff..95f823deafeca 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -360,11 +360,11 @@ static inline bool folio_memcg_kmem(struct folio *folio);
>>   * After the initialization objcg->memcg is always pointing at
>>   * a valid memcg, but can be atomically swapped to the parent memcg.
>>   *
>> - * The caller must ensure that the returned memcg won't be released:
>> - * e.g. acquire the rcu_read_lock or css_set_lock.
>> + * The caller must ensure that the returned memcg won't be released.
>>   */
>>  static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg)
>>  {
>> + lockdep_assert_once(rcu_read_lock_held() || lockdep_is_held(&cgroup_mutex));
>>   return READ_ONCE(objcg->memcg);
>>  }
>> 
>> @@ -438,6 +438,19 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio)
>>   return __folio_memcg(folio);
>>  }
>> 
>> +/*
>> + * folio_memcg_charged - If a folio is charged to a memory cgroup.
>> + * @folio: Pointer to the folio.
>> + *
>> + * Returns true if folio is charged to a memory cgroup, otherwise returns false.
>> + */
>> +static inline bool folio_memcg_charged(struct folio *folio)
>> +{
>> + if (folio_memcg_kmem(folio))
>> + return __folio_objcg(folio) != NULL;
>> + return __folio_memcg(folio) != NULL;
>> +}
>> +
>>  /**
>>   * folio_memcg_rcu - Locklessly get the memory cgroup associated with a folio.
>>   * @folio: Pointer to the folio.
>> @@ -454,7 +467,6 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio)
>>   unsigned long memcg_data = READ_ONCE(folio->memcg_data);
>> 
>>   VM_BUG_ON_FOLIO(folio_test_slab(folio), folio);
>> - WARN_ON_ONCE(!rcu_read_lock_held());
>> 
>>   if (memcg_data & MEMCG_DATA_KMEM) {
>>   struct obj_cgroup *objcg;
>> @@ -463,6 +475,8 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio)
>>   return obj_cgroup_memcg(objcg);
>>   }
>> 
>> + WARN_ON_ONCE(!rcu_read_lock_held());
>> +
>>   return (struct mem_cgroup *)(memcg_data & ~OBJEXTS_FLAGS_MASK);
>>  }
>> 
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 622d4544edd24..3da0284573857 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2366,7 +2366,7 @@ void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
>> 
>>  static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
>>  {
>> - VM_BUG_ON_FOLIO(folio_memcg(folio), folio);
>> + VM_BUG_ON_FOLIO(folio_memcg_charged(folio), folio);
>>   /*
>>    * Any of the following ensures page's memcg stability:
>>    *
>> @@ -4617,7 +4617,7 @@ void __mem_cgroup_uncharge(struct folio *folio)
>>   struct uncharge_gather ug;
>> 
>>   /* Don't touch folio->lru of any random page, pre-check: */
>> - if (!folio_memcg(folio))
>> + if (!folio_memcg_charged(folio))
>>   return;
>> 
>>   uncharge_gather_clear(&ug);
>> @@ -4662,7 +4662,7 @@ void mem_cgroup_replace_folio(struct folio *old, struct folio *new)
>>   return;
>> 
>>   /* Page cache replacement: new folio already charged? */
>> - if (folio_memcg(new))
>> + if (folio_memcg_charged(new))
>>   return;
>> 
>>   memcg = folio_memcg(old);
> 
> Best regards
> -- 
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland



  parent reply	other threads:[~2024-07-31  7:03 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20240730185206eucas1p28b14a1d9802ce2703bd13edc75e1b55d@eucas1p2.samsung.com>
2024-07-25  9:43 ` [PATCH v3] mm: kmem: add lockdep assertion to obj_cgroup_memcg Muchun Song
2024-07-25 15:58   ` Shakeel Butt
2024-07-25 17:39   ` Roman Gushchin
2024-07-26  8:09   ` Vlastimil Babka (SUSE)
2024-07-30 18:52   ` Marek Szyprowski
2024-07-30 20:18     ` Andrew Morton
2024-07-31  6:53       ` Muchun Song
2024-07-31  7:02     ` Muchun Song [this message]
2024-07-31  8:09       ` Marek Szyprowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B54AF0F4-6CE8-4791-868E-62C7704AB832@linux.dev \
    --to=muchun.song@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mhocko@kernel.org \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=songmuchun@bytedance.com \
    --cc=vbabka@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.