* [PATCH v2 1/3] mm/memory_hotplug: fix memory block reference leak on remove
[not found] <20260428085219.1316047-1-songmuchun@bytedance.com>
@ 2026-04-28 8:52 ` Muchun Song
2026-04-28 8:52 ` [PATCH v2 2/3] drivers/base/memory: fix memory block reference leak in poison accounting Muchun Song
2026-04-28 8:52 ` [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup Muchun Song
2 siblings, 0 replies; 9+ messages in thread
From: Muchun Song @ 2026-04-28 8:52 UTC
To: David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
Cc: Vishal Verma, Ying Huang, Dan Williams, Miaohe Lin,
Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
Muchun Song, stable, muchun.song
remove_memory_blocks_and_altmaps() looks up each memory block with
find_memory_block(), which acquires a reference to the memory block
device.
That reference is never dropped on this path, resulting in a leaked
device reference when removing memory blocks and their altmaps. Drop
the reference after retrieving mem->altmap and clearing mem->altmap,
before removing the memory block device.
Fixes: 6b8f0798b85a ("mm/memory_hotplug: split memmap_on_memory requests across memblocks")
Cc: stable@vger.kernel.org
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
---
v1->v2:
- Add Acked-by from Oscar.
- I didn't add memory_block_get_by_id/memory_block_put because this
is a pure bugfix series. I will send that separate cleanup after
the bugfixes have been merged.
---
mm/memory_hotplug.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2a943ec57c85..40c7915dabe0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1422,6 +1422,8 @@ static void remove_memory_blocks_and_altmaps(u64 start, u64 size)
 		altmap = mem->altmap;
 		mem->altmap = NULL;
+		/* drop the ref. we got via find_memory_block() */
+		put_device(&mem->dev);
 		remove_memory_block_devices(cur_start, memblock_size);
--
2.20.1
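Editorial sketch, not part of the patch: the leak described above can be modeled in userspace C. Every name prefixed with toy_ is hypothetical and only imitates the get/put pairing that find_memory_block() and put_device() follow; nothing here is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device: only a reference count. */
struct toy_device {
	int refcount;
};

/* Hypothetical stand-in for struct memory_block. */
struct toy_memory_block {
	struct toy_device dev;
	void *altmap;
};

static void toy_get_device(struct toy_device *dev)
{
	dev->refcount++;
}

static void toy_put_device(struct toy_device *dev)
{
	dev->refcount--;
}

/* Models find_memory_block(): a successful lookup returns with an extra reference. */
static struct toy_memory_block *toy_find_memory_block(struct toy_memory_block *table,
							size_t nr, size_t id)
{
	if (id >= nr)
		return NULL;
	toy_get_device(&table[id].dev);
	return &table[id];
}

/*
 * Models the altmap handling in the removal path. drop_ref selects the fixed
 * behaviour (1) or the leaky one (0). Returns the reference count afterwards,
 * or -1 if the block was not found.
 */
static int toy_detach_altmap(struct toy_memory_block *table, size_t nr,
			     size_t id, int drop_ref)
{
	struct toy_memory_block *mem = toy_find_memory_block(table, nr, id);

	if (!mem)
		return -1;
	mem->altmap = NULL;
	if (drop_ref)
		toy_put_device(&mem->dev);	/* pairs with the lookup */
	return mem->dev.refcount;
}
```

Starting from a count of 1, the leaky variant leaves the count at 2 after one call, while the fixed variant returns it to 1.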
* [PATCH v2 2/3] drivers/base/memory: fix memory block reference leak in poison accounting
[not found] <20260428085219.1316047-1-songmuchun@bytedance.com>
2026-04-28 8:52 ` [PATCH v2 1/3] mm/memory_hotplug: fix memory block reference leak on remove Muchun Song
@ 2026-04-28 8:52 ` Muchun Song
2026-04-28 9:13 ` Oscar Salvador
2026-04-28 8:52 ` [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup Muchun Song
2 siblings, 1 reply; 9+ messages in thread
From: Muchun Song @ 2026-04-28 8:52 UTC
To: David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
Cc: Vishal Verma, Ying Huang, Dan Williams, Miaohe Lin,
Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
Muchun Song, stable, muchun.song
memblk_nr_poison_inc() and memblk_nr_poison_sub() look up a memory
block via find_memory_block_by_id(), which acquires a reference to the
memory block device.
Both helpers use the returned memory block without dropping that
reference, leaking the device reference on each successful lookup. Drop
the reference after updating nr_hwpoison.
Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
Cc: stable@vger.kernel.org
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
---
v1->v2:
- Add Reviewed-by from Miaohe.
- Add device_hotplug_lock in the next patch.
---
drivers/base/memory.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index f806a683b767..6981b55d582a 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -1230,8 +1230,10 @@ void memblk_nr_poison_inc(unsigned long pfn)
 	const unsigned long block_id = pfn_to_block_id(pfn);
 	struct memory_block *mem = find_memory_block_by_id(block_id);

-	if (mem)
+	if (mem) {
 		atomic_long_inc(&mem->nr_hwpoison);
+		put_device(&mem->dev);
+	}
 }

 void memblk_nr_poison_sub(unsigned long pfn, long i)
@@ -1239,8 +1241,10 @@ void memblk_nr_poison_sub(unsigned long pfn, long i)
 	const unsigned long block_id = pfn_to_block_id(pfn);
 	struct memory_block *mem = find_memory_block_by_id(block_id);

-	if (mem)
+	if (mem) {
 		atomic_long_sub(i, &mem->nr_hwpoison);
+		put_device(&mem->dev);
+	}
 }

 static unsigned long memblk_nr_poison(struct memory_block *mem)
--
2.20.1
* [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
[not found] <20260428085219.1316047-1-songmuchun@bytedance.com>
2026-04-28 8:52 ` [PATCH v2 1/3] mm/memory_hotplug: fix memory block reference leak on remove Muchun Song
2026-04-28 8:52 ` [PATCH v2 2/3] drivers/base/memory: fix memory block reference leak in poison accounting Muchun Song
@ 2026-04-28 8:52 ` Muchun Song
2026-04-28 9:17 ` Oscar Salvador
2026-04-28 11:37 ` Miaohe Lin
2 siblings, 2 replies; 9+ messages in thread
From: Muchun Song @ 2026-04-28 8:52 UTC
To: David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
Cc: Vishal Verma, Ying Huang, Dan Williams, Miaohe Lin,
Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
Muchun Song, stable, muchun.song
memblk_nr_poison_inc() and memblk_nr_poison_sub() call
find_memory_block_by_id(), which requires device_hotplug_lock to
serialize the xarray lookup against memory block removal.
Take device_hotplug_lock around the lookup and nr_hwpoison update so
the memory block cannot disappear between xa_load() and get_device().
Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
Cc: stable@vger.kernel.org
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
drivers/base/memory.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 6981b55d582a..f76aee29e9a5 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
 void memblk_nr_poison_inc(unsigned long pfn)
 {
 	const unsigned long block_id = pfn_to_block_id(pfn);
-	struct memory_block *mem = find_memory_block_by_id(block_id);
+	struct memory_block *mem;

+	lock_device_hotplug();
+	mem = find_memory_block_by_id(block_id);
 	if (mem) {
 		atomic_long_inc(&mem->nr_hwpoison);
 		put_device(&mem->dev);
 	}
+	unlock_device_hotplug();
 }

 void memblk_nr_poison_sub(unsigned long pfn, long i)
 {
 	const unsigned long block_id = pfn_to_block_id(pfn);
-	struct memory_block *mem = find_memory_block_by_id(block_id);
+	struct memory_block *mem;

+	lock_device_hotplug();
+	mem = find_memory_block_by_id(block_id);
 	if (mem) {
 		atomic_long_sub(i, &mem->nr_hwpoison);
 		put_device(&mem->dev);
 	}
+	unlock_device_hotplug();
 }

 static unsigned long memblk_nr_poison(struct memory_block *mem)
--
2.20.1
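Editorial sketch, not part of the patch: the window between xa_load() and get_device() that the commit message refers to can be illustrated with a deterministic, single-threaded model. All toy_ names are hypothetical, removal is assumed to free the block immediately, and a plain flag stands in for device_hotplug_lock.

```c
#include <assert.h>
#include <stddef.h>

struct toy_block {
	int refcount;
	int freed;		/* set once the model "frees" the block */
};

static struct toy_block *toy_slot;	/* models a single xarray entry */
static int toy_lock_held;		/* models device_hotplug_lock as a flag */

/* Models removal: it can only proceed while the lock is free. Returns 0 if it ran. */
static int toy_try_remove(void)
{
	if (toy_lock_held || !toy_slot)
		return -1;
	toy_slot->freed = 1;
	toy_slot = NULL;
	return 0;
}

/*
 * Models a lookup where removal is attempted in the window between loading
 * the pointer and taking the reference.
 * Returns 1 if the reference was taken on a live block, 0 if the block had
 * already been freed when the reference would have been taken, and -1 if no
 * block was found.
 */
static int toy_lookup(int take_lock)
{
	struct toy_block *b;
	int live;

	if (take_lock)
		toy_lock_held = 1;
	b = toy_slot;			/* the xa_load() step */
	if (!b) {
		if (take_lock)
			toy_lock_held = 0;
		return -1;
	}
	toy_try_remove();		/* removal racing into the window */
	live = !b->freed;
	if (live)
		b->refcount++;		/* the get_device() step */
	if (take_lock)
		toy_lock_held = 0;
	return live;
}
```

With take_lock set to 0 the simulated removal wins the race and the lookup reports a freed block; with take_lock set to 1 the removal is held off and the reference is taken on a live block.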
* Re: [PATCH v2 2/3] drivers/base/memory: fix memory block reference leak in poison accounting
2026-04-28 8:52 ` [PATCH v2 2/3] drivers/base/memory: fix memory block reference leak in poison accounting Muchun Song
@ 2026-04-28 9:13 ` Oscar Salvador
0 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2026-04-28 9:13 UTC
To: Muchun Song
Cc: David Hildenbrand, Greg Kroah-Hartman, Rafael J . Wysocki,
Danilo Krummrich, Andrew Morton, Vishal Verma, Ying Huang,
Dan Williams, Miaohe Lin, Naoya Horiguchi, linux-mm, linux-cxl,
driver-core, linux-kernel, stable, muchun.song
On Tue, Apr 28, 2026 at 04:52:18PM +0800, Muchun Song wrote:
> memblk_nr_poison_inc() and memblk_nr_poison_sub() look up a memory
> block via find_memory_block_by_id(), which acquires a reference to the
> memory block device.
>
> Both helpers use the returned memory block without dropping that
> reference, leaking the device reference on each successful lookup. Drop
> the reference after updating nr_hwpoison.
>
> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
> Cc: stable@vger.kernel.org
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
--
Oscar Salvador
SUSE Labs
* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
2026-04-28 8:52 ` [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup Muchun Song
@ 2026-04-28 9:17 ` Oscar Salvador
2026-04-28 9:21 ` Muchun Song
2026-04-28 11:37 ` Miaohe Lin
1 sibling, 1 reply; 9+ messages in thread
From: Oscar Salvador @ 2026-04-28 9:17 UTC
To: Muchun Song
Cc: David Hildenbrand, Greg Kroah-Hartman, Rafael J . Wysocki,
Danilo Krummrich, Andrew Morton, Vishal Verma, Ying Huang,
Dan Williams, Miaohe Lin, Naoya Horiguchi, linux-mm, linux-cxl,
driver-core, linux-kernel, stable, muchun.song
On Tue, Apr 28, 2026 at 04:52:19PM +0800, Muchun Song wrote:
> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
> find_memory_block_by_id(), which requires device_hotplug_lock to
> serialize the xarray lookup against memory block removal.
>
> Take device_hotplug_lock around the lookup and nr_hwpoison update so
> the memory block cannot disappear between xa_load() and get_device().
>
> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
> Cc: stable@vger.kernel.org
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
It might have made sense to join both patches? Anyway:
Acked-by: Oscar Salvador <osalvador@suse.de>
--
Oscar Salvador
SUSE Labs
* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
2026-04-28 9:17 ` Oscar Salvador
@ 2026-04-28 9:21 ` Muchun Song
0 siblings, 0 replies; 9+ messages in thread
From: Muchun Song @ 2026-04-28 9:21 UTC
To: Oscar Salvador
Cc: Muchun Song, David Hildenbrand, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton, Vishal Verma,
Ying Huang, Dan Williams, Miaohe Lin, Naoya Horiguchi, linux-mm,
linux-cxl, driver-core, linux-kernel, stable
> On Apr 28, 2026, at 17:17, Oscar Salvador <osalvador@suse.de> wrote:
>
> On Tue, Apr 28, 2026 at 04:52:19PM +0800, Muchun Song wrote:
>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>> find_memory_block_by_id(), which requires device_hotplug_lock to
>> serialize the xarray lookup against memory block removal.
>>
>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>> the memory block cannot disappear between xa_load() and get_device().
>>
>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> It might have made sense to join both patches? Anyway:
Either way works for me. I’ve been following the 'one thing per
patch' principle. If I still need to update v3, I can merge them;
otherwise, I’d prefer to keep it as is. I'm a little lazy. :)
>
> Acked-by: Oscar Salvador <osalvador@suse.de>
Thanks.
>
>
> --
> Oscar Salvador
> SUSE Labs
* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
2026-04-28 8:52 ` [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup Muchun Song
2026-04-28 9:17 ` Oscar Salvador
@ 2026-04-28 11:37 ` Miaohe Lin
2026-04-28 11:40 ` Muchun Song
1 sibling, 1 reply; 9+ messages in thread
From: Miaohe Lin @ 2026-04-28 11:37 UTC
To: Muchun Song
Cc: Vishal Verma, Ying Huang, Dan Williams, Naoya Horiguchi, linux-mm,
linux-cxl, driver-core, linux-kernel, stable, muchun.song,
David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
On 2026/4/28 16:52, Muchun Song wrote:
> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
> find_memory_block_by_id(), which requires device_hotplug_lock to
> serialize the xarray lookup against memory block removal.
>
> Take device_hotplug_lock around the lookup and nr_hwpoison update so
> the memory block cannot disappear between xa_load() and get_device().
>
> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
> Cc: stable@vger.kernel.org
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Thanks for the update.
> ---
> drivers/base/memory.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 6981b55d582a..f76aee29e9a5 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
> void memblk_nr_poison_inc(unsigned long pfn)
> {
> const unsigned long block_id = pfn_to_block_id(pfn);
> - struct memory_block *mem = find_memory_block_by_id(block_id);
> + struct memory_block *mem;
>
> + lock_device_hotplug();
memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
I'm afraid that if memory_failure() is triggered while device_hotplug_lock is held, it will lead to
a deadlock. Or am I missing something?
Thanks.
.
* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
2026-04-28 11:37 ` Miaohe Lin
@ 2026-04-28 11:40 ` Muchun Song
2026-04-28 12:34 ` Miaohe Lin
0 siblings, 1 reply; 9+ messages in thread
From: Muchun Song @ 2026-04-28 11:40 UTC
To: Miaohe Lin
Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
stable, David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2026/4/28 16:52, Muchun Song wrote:
>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>> find_memory_block_by_id(), which requires device_hotplug_lock to
>> serialize the xarray lookup against memory block removal.
>>
>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>> the memory block cannot disappear between xa_load() and get_device().
>>
>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> Thanks for the update.
>
>> ---
>> drivers/base/memory.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 6981b55d582a..f76aee29e9a5 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>> void memblk_nr_poison_inc(unsigned long pfn)
>> {
>> const unsigned long block_id = pfn_to_block_id(pfn);
>> - struct memory_block *mem = find_memory_block_by_id(block_id);
>> + struct memory_block *mem;
>>
>> + lock_device_hotplug();
>
> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
> I'm afraid that if memory_failure() is triggered while device_hotplug_lock is held, it will lead to
> a deadlock. Or am I missing something?
I am curious: is there any place where memory_failure() is called while holding device_hotplug_lock?
Thanks.
>
> Thanks.
> .
* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
2026-04-28 11:40 ` Muchun Song
@ 2026-04-28 12:34 ` Miaohe Lin
0 siblings, 0 replies; 9+ messages in thread
From: Miaohe Lin @ 2026-04-28 12:34 UTC
To: Muchun Song
Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
stable, David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
On 2026/4/28 19:40, Muchun Song wrote:
>
>
>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> On 2026/4/28 16:52, Muchun Song wrote:
>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>> serialize the xarray lookup against memory block removal.
>>>
>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>> the memory block cannot disappear between xa_load() and get_device().
>>>
>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>
>> Thanks for the update.
>>
>>> ---
>>> drivers/base/memory.c | 10 ++++++++--
>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>> index 6981b55d582a..f76aee29e9a5 100644
>>> --- a/drivers/base/memory.c
>>> +++ b/drivers/base/memory.c
>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>> void memblk_nr_poison_inc(unsigned long pfn)
>>> {
>>> const unsigned long block_id = pfn_to_block_id(pfn);
>>> - struct memory_block *mem = find_memory_block_by_id(block_id);
>>> + struct memory_block *mem;
>>>
>>> + lock_device_hotplug();
>>
>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>> I'm afraid that if memory_failure() is triggered while device_hotplug_lock is held, it will lead to
>> a deadlock. Or am I missing something?
>
> I am curious: is there any place where memory_failure() is called while holding device_hotplug_lock?
Sorry for the dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:
remove_memory
  lock_device_hotplug <-- first called here
  try_remove_memory
    remove_memory_block_devices
      num_poisoned_pages_sub
        memblk_nr_poison_sub
          lock_device_hotplug <-- deadlock here
Hope I'm not mistaken again. :)
Thanks.
.
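Editorial sketch, not part of the thread: the self-deadlock pattern raised above can be modeled in single-threaded userspace C. It follows the call chain exactly as reported in the trace; whether the real removal path actually reaches memblk_nr_poison_sub() is not verified here. All toy_ names are hypothetical, and the lock is modeled as a flag that reports an error where a real non-recursive mutex would block forever.

```c
#include <assert.h>

/* Hypothetical model of a non-recursive lock such as a kernel mutex. */
struct toy_lock {
	int held;
};

static struct toy_lock toy_hotplug_lock;

/* Returns 0 on success, -1 where a real non-recursive lock would never return. */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -1;
	l->held = 1;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = 0;
}

/* Models the accounting helper once it takes the lock itself. */
static int toy_poison_sub(long *counter, long i)
{
	if (toy_lock_acquire(&toy_hotplug_lock))
		return -1;		/* would deadlock */
	*counter -= i;
	toy_lock_release(&toy_hotplug_lock);
	return 0;
}

/* Models a removal path that already holds the lock when it reaches the helper. */
static int toy_remove_memory(long *counter, long i)
{
	int ret;

	if (toy_lock_acquire(&toy_hotplug_lock))
		return -1;
	ret = toy_poison_sub(counter, i);	/* second acquisition by the same caller */
	toy_lock_release(&toy_hotplug_lock);
	return ret;
}
```

Calling toy_poison_sub() directly succeeds, while toy_remove_memory() reports -1 and leaves the counter untouched, which is the model's equivalent of the hang.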