public inbox for linux-mm@kvack.org
From: Miaohe Lin <linmiaohe@huawei.com>
To: Muchun Song <muchun.song@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Ying Huang <huang.ying.caritas@gmail.com>,
	"Dan Williams" <djbw@kernel.org>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>, <linux-mm@kvack.org>,
	<linux-cxl@vger.kernel.org>, <driver-core@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>,
	David Hildenbrand <david@kernel.org>,
	"Oscar Salvador" <osalvador@suse.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Rafael J Wysocki <rafael@kernel.org>,
	"Danilo Krummrich" <dakr@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
Date: Wed, 29 Apr 2026 11:08:51 +0800
Message-ID: <c6e1df1e-be5e-2468-d46e-453985ba1e79@huawei.com>
In-Reply-To: <94F5B89A-008A-4EDB-920F-31B4895C2699@linux.dev>

On 2026/4/28 21:52, Muchun Song wrote:
> 
> 
> 
>> On Apr 28, 2026, at 20:34, Miaohe Lin <linmiaohe@huawei.com> wrote:
>> On 2026/4/28 19:40, Muchun Song wrote:
>>>
>>>
>>>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>>>> On 2026/4/28 16:52, Muchun Song wrote:
>>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>>>> serialize the xarray lookup against memory block removal.
>>>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>>>> the memory block cannot disappear between xa_load() and get_device().
>>>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>>>> Cc: stable@vger.kernel.org
>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>> Thanks for the update.
>>>>> ---
>>>>> drivers/base/memory.c | 10 ++++++++--
>>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>>>> index 6981b55d582a..f76aee29e9a5 100644
>>>>> --- a/drivers/base/memory.c
>>>>> +++ b/drivers/base/memory.c
>>>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>>>> void memblk_nr_poison_inc(unsigned long pfn)
>>>>> {
>>>>>    const unsigned long block_id = pfn_to_block_id(pfn);
>>>>> -    struct memory_block *mem = find_memory_block_by_id(block_id);
>>>>> +    struct memory_block *mem;
>>>>> +    lock_device_hotplug();
>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>>>> I'm afraid that if memory_failure() is triggered while device_hotplug_lock is held, it will lead to
>>>> a deadlock. Or am I missing something?
>>>
>>> I am curious: is there any place where memory_failure() is called while holding device_hotplug_lock?
>>
>> Sorry for the dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:
>>
>> remove_memory
>>  lock_device_hotplug <-- first called here
>>  try_remove_memory
>>    remove_memory_block_devices
>>      num_poisoned_pages_sub
> 
> Passing pfn = -1 here.
> 
>>        memblk_nr_poison_sub
>>          lock_device_hotplug <-- deadlock here
> 
> No. Can’t reach here. No deadlock.

Right, I missed that. Thanks. But I'm still worried that there might be potential issues.
For example, this function could be called while the page lock is held (lock_page()). Acquiring
device_hotplug_lock while already holding the page lock might cause problems, though I haven't seen
any specific issue yet. There might also be other scenarios that haven't been considered. Hope I'm
just overthinking it. :)
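
To make the pfn = -1 point above concrete, here is a rough userspace model of the remove_memory() call chain. It is only a sketch under assumptions: it assumes num_poisoned_pages_sub() skips the per-block update when pfn is -1 (as the reply above implies), it uses a pthread mutex as a stand-in for device_hotplug_lock, and every model_* name and counter below is hypothetical, not kernel code.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for device_hotplug_lock: a plain, non-recursive mutex. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

static long total_poisoned = 5;	/* models the global poisoned-page count */
static long block_poisoned = 5;	/* models one block's nr_hwpoison */
static int self_deadlocks;	/* times the lock was requested while already held */

/* Models memblk_nr_poison_sub() after the patch: it takes the lock itself. */
static void model_memblk_nr_poison_sub(unsigned long pfn, long i)
{
	(void)pfn;	/* the model has a single block, so the pfn is unused */

	/* trylock, so a would-be self-deadlock is counted instead of hanging */
	if (pthread_mutex_trylock(&hotplug_lock) != 0) {
		self_deadlocks++;
		return;
	}
	block_poisoned -= i;
	pthread_mutex_unlock(&hotplug_lock);
}

/* Assumed guard: pfn == -1 means "global counter only, no block lookup". */
static void model_num_poisoned_pages_sub(unsigned long pfn, long i)
{
	total_poisoned -= i;
	if (pfn != -1UL)
		model_memblk_nr_poison_sub(pfn, i);
}

/* Models the removal path: the lock is already held when the counter drops. */
static void model_remove_memory(long nr)
{
	pthread_mutex_lock(&hotplug_lock);
	model_num_poisoned_pages_sub(-1UL, nr);
	/* The pfn == -1 path must not have tried to take the lock again. */
	assert(self_deadlocks == 0);
	pthread_mutex_unlock(&hotplug_lock);
}
```

Calling model_remove_memory(2) only lowers total_poisoned, because the guard returns before the locked per-block path is reached. Passing a real pfn to model_num_poisoned_pages_sub() while hotplug_lock is held makes the model bump self_deadlocks instead, which is the case the call chain quoted above was worried about.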

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.

