From: Miaohe Lin <linmiaohe@huawei.com>
To: Muchun Song <muchun.song@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ying Huang <huang.ying.caritas@gmail.com>,
"Dan Williams" <djbw@kernel.org>,
Naoya Horiguchi <nao.horiguchi@gmail.com>, <linux-mm@kvack.org>,
<linux-cxl@vger.kernel.org>, <driver-core@lists.linux.dev>,
<linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>,
David Hildenbrand <david@kernel.org>,
"Oscar Salvador" <osalvador@suse.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Rafael J Wysocki <rafael@kernel.org>,
"Danilo Krummrich" <dakr@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
Date: Wed, 29 Apr 2026 11:08:51 +0800
Message-ID: <c6e1df1e-be5e-2468-d46e-453985ba1e79@huawei.com>
In-Reply-To: <94F5B89A-008A-4EDB-920F-31B4895C2699@linux.dev>
On 2026/4/28 21:52, Muchun Song wrote:
>
>
>
>> On Apr 28, 2026, at 20:34, Miaohe Lin <linmiaohe@huawei.com> wrote:
>> On 2026/4/28 19:40, Muchun Song wrote:
>>>
>>>
>>>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>>>> On 2026/4/28 16:52, Muchun Song wrote:
>>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>>>> serialize the xarray lookup against memory block removal.
>>>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>>>> the memory block cannot disappear between xa_load() and get_device().
>>>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>>>> Cc: stable@vger.kernel.org
>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>> Thanks for the update.
>>>>> ---
>>>>> drivers/base/memory.c | 10 ++++++++--
>>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>>>> index 6981b55d582a..f76aee29e9a5 100644
>>>>> --- a/drivers/base/memory.c
>>>>> +++ b/drivers/base/memory.c
>>>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>>>> void memblk_nr_poison_inc(unsigned long pfn)
>>>>> {
>>>>> const unsigned long block_id = pfn_to_block_id(pfn);
>>>>> - struct memory_block *mem = find_memory_block_by_id(block_id);
>>>>> + struct memory_block *mem;
>>>>> + lock_device_hotplug();
>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>>>> I'm afraid that if memory_failure() is triggered while device_hotplug_lock is held, it will lead to
>>>> a deadlock. Or am I missing something?
>>>
>>> I am curious: is there any place where memory_failure() is called while holding device_hotplug_lock?
>>
>> Sorry for the dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:
>>
>> remove_memory
>>   lock_device_hotplug <-- first called here
>>   try_remove_memory
>>     remove_memory_block_devices
>>       num_poisoned_pages_sub
>
> Passing pfn = -1 here.
>
>>         memblk_nr_poison_sub
>>           lock_device_hotplug <-- deadlock here
>
> No. Can’t reach here. No deadlock.
Right, I missed that. Thanks. But I'm still worried that there might be potential issues.
For example, this function could be called while lock_page is held. Acquiring device_hotplug_lock
while already holding the page lock might cause problems, though I haven't seen any specific issue yet.
There might also be other scenarios that haven't been considered. Hope I'm just
overthinking it. :)
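
For reference, here is a minimal user-space sketch of why the remove_memory() path quoted above stops short of the second lock_device_hotplug(). It is a model, not the kernel code: every model_* name, the depth counter and the NO_PFN macro are hypothetical. It assumes that num_poisoned_pages_sub() skips the per-block update when it is passed the pfn = -1 sentinel, as the exchange above implies.

```c
/* User-space model only; all model_* names and NO_PFN are hypothetical. */
#include <assert.h>

#define NO_PFN (~0UL)            /* stands in for the pfn = -1 sentinel */

static int  lock_depth;          /* models device_hotplug_lock ownership */
static int  relock_attempts;     /* times the lock was taken while already held */
static long total_poisoned;      /* models the global poisoned-page counter */
static long block_poisoned;      /* models one memory block's nr_hwpoison */

void model_lock_hotplug(void)
{
	if (lock_depth > 0)
		relock_attempts++;   /* a real non-recursive mutex would hang here */
	lock_depth++;
}

void model_unlock_hotplug(void)
{
	lock_depth--;
}

/* With the patch applied, the per-block update takes the hotplug lock. */
void model_memblk_nr_poison_sub(unsigned long pfn, long i)
{
	(void)pfn;                   /* the model tracks a single block */
	model_lock_hotplug();
	block_poisoned -= i;
	model_unlock_hotplug();
}

/* Assumed behaviour: the sentinel skips the per-block path entirely. */
void model_num_poisoned_pages_sub(unsigned long pfn, long i)
{
	total_poisoned -= i;
	if (pfn != NO_PFN)
		model_memblk_nr_poison_sub(pfn, i);
}

/* Removal path: the lock is already held when the counter is adjusted. */
void model_remove_memory(long poisoned_in_block)
{
	model_lock_hotplug();
	model_num_poisoned_pages_sub(NO_PFN, poisoned_in_block);
	model_unlock_hotplug();
}
```

Calling model_remove_memory() leaves relock_attempts at zero because the sentinel short-circuits before the nested lock. Calling model_num_poisoned_pages_sub() with a real pfn from an unlocked context takes the lock exactly once. The model says nothing about the lock_page ordering concern, which would need a separate analysis.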
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Thanks.