public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
       [not found] <20260428085219.1316047-1-songmuchun@bytedance.com>
@ 2026-04-28  8:52 ` Muchun Song
  2026-04-28  9:17   ` Oscar Salvador
  2026-04-28 11:37   ` Miaohe Lin
  0 siblings, 2 replies; 10+ messages in thread
From: Muchun Song @ 2026-04-28  8:52 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
	Rafael J . Wysocki, Danilo Krummrich, Andrew Morton
  Cc: Vishal Verma, Ying Huang, Dan Williams, Miaohe Lin,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	Muchun Song, stable, muchun.song

memblk_nr_poison_inc() and memblk_nr_poison_sub() call
find_memory_block_by_id(), which requires device_hotplug_lock to
serialize the xarray lookup against memory block removal.

Take device_hotplug_lock around the lookup and nr_hwpoison update so
the memory block cannot disappear between xa_load() and get_device().

Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
Cc: stable@vger.kernel.org
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 drivers/base/memory.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 6981b55d582a..f76aee29e9a5 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
 void memblk_nr_poison_inc(unsigned long pfn)
 {
 	const unsigned long block_id = pfn_to_block_id(pfn);
-	struct memory_block *mem = find_memory_block_by_id(block_id);
+	struct memory_block *mem;
 
+	lock_device_hotplug();
+	mem = find_memory_block_by_id(block_id);
 	if (mem) {
 		atomic_long_inc(&mem->nr_hwpoison);
 		put_device(&mem->dev);
 	}
+	unlock_device_hotplug();
 }
 
 void memblk_nr_poison_sub(unsigned long pfn, long i)
 {
 	const unsigned long block_id = pfn_to_block_id(pfn);
-	struct memory_block *mem = find_memory_block_by_id(block_id);
+	struct memory_block *mem;
 
+	lock_device_hotplug();
+	mem = find_memory_block_by_id(block_id);
 	if (mem) {
 		atomic_long_sub(i, &mem->nr_hwpoison);
 		put_device(&mem->dev);
 	}
+	unlock_device_hotplug();
 }
 
 static unsigned long memblk_nr_poison(struct memory_block *mem)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-28  8:52 ` Muchun Song
@ 2026-04-28  9:17   ` Oscar Salvador
  2026-04-28  9:21     ` Muchun Song
  2026-04-28 11:37   ` Miaohe Lin
  1 sibling, 1 reply; 10+ messages in thread
From: Oscar Salvador @ 2026-04-28  9:17 UTC (permalink / raw)
  To: Muchun Song
  Cc: David Hildenbrand, Greg Kroah-Hartman, Rafael J . Wysocki,
	Danilo Krummrich, Andrew Morton, Vishal Verma, Ying Huang,
	Dan Williams, Miaohe Lin, Naoya Horiguchi, linux-mm, linux-cxl,
	driver-core, linux-kernel, stable, muchun.song

On Tue, Apr 28, 2026 at 04:52:19PM +0800, Muchun Song wrote:
> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
> find_memory_block_by_id(), which requires device_hotplug_lock to
> serialize the xarray lookup against memory block removal.
> 
> Take device_hotplug_lock around the lookup and nr_hwpoison update so
> the memory block cannot disappear between xa_load() and get_device().
> 
> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
> Cc: stable@vger.kernel.org
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

It might have made sense to join both patches? Anyway:

Acked-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-28  9:17   ` Oscar Salvador
@ 2026-04-28  9:21     ` Muchun Song
  0 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2026-04-28  9:21 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Muchun Song, David Hildenbrand, Greg Kroah-Hartman,
	Rafael J . Wysocki, Danilo Krummrich, Andrew Morton, Vishal Verma,
	Ying Huang, Dan Williams, Miaohe Lin, Naoya Horiguchi, linux-mm,
	linux-cxl, driver-core, linux-kernel, stable



> On Apr 28, 2026, at 17:17, Oscar Salvador <osalvador@suse.de> wrote:
> 
> On Tue, Apr 28, 2026 at 04:52:19PM +0800, Muchun Song wrote:
>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>> find_memory_block_by_id(), which requires device_hotplug_lock to
>> serialize the xarray lookup against memory block removal.
>> 
>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>> the memory block cannot disappear between xa_load() and get_device().
>> 
>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> 
> It might have made sense to join both patches? Anyway:

Either way works for me. I’ve been following the 'one thing per
patch' principle. If I still need to update v3, I can merge them;
otherwise, I’d prefer to keep it as is. I'm a little lazy. :)

> 
> Acked-by: Oscar Salvador <osalvador@suse.de>

Thanks.

> 
> 
> -- 
> Oscar Salvador
> SUSE Labs


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-28  8:52 ` Muchun Song
  2026-04-28  9:17   ` Oscar Salvador
@ 2026-04-28 11:37   ` Miaohe Lin
  2026-04-28 11:40     ` Muchun Song
  1 sibling, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2026-04-28 11:37 UTC (permalink / raw)
  To: Muchun Song
  Cc: Vishal Verma, Ying Huang, Dan Williams, Naoya Horiguchi, linux-mm,
	linux-cxl, driver-core, linux-kernel, stable, muchun.song,
	David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
	Rafael J . Wysocki, Danilo Krummrich, Andrew Morton

On 2026/4/28 16:52, Muchun Song wrote:
> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
> find_memory_block_by_id(), which requires device_hotplug_lock to
> serialize the xarray lookup against memory block removal.
> 
> Take device_hotplug_lock around the lookup and nr_hwpoison update so
> the memory block cannot disappear between xa_load() and get_device().
> 
> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
> Cc: stable@vger.kernel.org
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Thanks for update.

> ---
>  drivers/base/memory.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 6981b55d582a..f76aee29e9a5 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>  void memblk_nr_poison_inc(unsigned long pfn)
>  {
>  	const unsigned long block_id = pfn_to_block_id(pfn);
> -	struct memory_block *mem = find_memory_block_by_id(block_id);
> +	struct memory_block *mem;
>  
> +	lock_device_hotplug();

memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
I'm afraid if memory_failure() is triggered while lock_device_hotplug is held, it will lead to
deadlock. Or am I miss something?

Thanks.
.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-28 11:37   ` Miaohe Lin
@ 2026-04-28 11:40     ` Muchun Song
  2026-04-28 12:34       ` Miaohe Lin
  0 siblings, 1 reply; 10+ messages in thread
From: Muchun Song @ 2026-04-28 11:40 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	stable, David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
	Rafael J . Wysocki, Danilo Krummrich, Andrew Morton



> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
> 
> On 2026/4/28 16:52, Muchun Song wrote:
>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>> find_memory_block_by_id(), which requires device_hotplug_lock to
>> serialize the xarray lookup against memory block removal.
>> 
>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>> the memory block cannot disappear between xa_load() and get_device().
>> 
>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> 
> Thanks for update.
> 
>> ---
>> drivers/base/memory.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 6981b55d582a..f76aee29e9a5 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>> void memblk_nr_poison_inc(unsigned long pfn)
>> {
>> 	const unsigned long block_id = pfn_to_block_id(pfn);
>> - 	struct memory_block *mem = find_memory_block_by_id(block_id);
>> + 	struct memory_block *mem;
>> 
>> + 	lock_device_hotplug();
> 
> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
> I'm afraid if memory_failure() is triggered while lock_device_hotplug is held, it will lead to
> deadlock. Or am I miss something?

I am curious is there any place where memory_failure() is called with holding lock_device_hotplug?

Thanks.

> 
> Thanks.
> .



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-28 11:40     ` Muchun Song
@ 2026-04-28 12:34       ` Miaohe Lin
  0 siblings, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2026-04-28 12:34 UTC (permalink / raw)
  To: Muchun Song
  Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	stable, David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
	Rafael J . Wysocki, Danilo Krummrich, Andrew Morton

On 2026/4/28 19:40, Muchun Song wrote:
> 
> 
>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> On 2026/4/28 16:52, Muchun Song wrote:
>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>> serialize the xarray lookup against memory block removal.
>>>
>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>> the memory block cannot disappear between xa_load() and get_device().
>>>
>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>
>> Thanks for update.
>>
>>> ---
>>> drivers/base/memory.c | 10 ++++++++--
>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>> index 6981b55d582a..f76aee29e9a5 100644
>>> --- a/drivers/base/memory.c
>>> +++ b/drivers/base/memory.c
>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>> void memblk_nr_poison_inc(unsigned long pfn)
>>> {
>>> 	const unsigned long block_id = pfn_to_block_id(pfn);
>>> - 	struct memory_block *mem = find_memory_block_by_id(block_id);
>>> + 	struct memory_block *mem;
>>>
>>> + 	lock_device_hotplug();
>>
>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>> I'm afraid if memory_failure() is triggered while lock_device_hotplug is held, it will lead to
>> deadlock. Or am I miss something?
> 
> I am curious is there any place where memory_failure() is called with holding lock_device_hotplug?

Sorry for dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:

remove_memory
  lock_device_hotplug <-- first called here
  try_remove_memory
    remove_memory_block_devices
      num_poisoned_pages_sub
        memblk_nr_poison_sub
          lock_device_hotplug <-- deadlock here

Hope I'm not mistaken again. :)

Thank.
.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
@ 2026-04-28 13:52 Muchun Song
  2026-04-29  3:08 ` Miaohe Lin
  0 siblings, 1 reply; 10+ messages in thread
From: Muchun Song @ 2026-04-28 13:52 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	stable, David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
	Rafael J Wysocki, Danilo Krummrich, Andrew Morton




> On Apr 28, 2026, at 20:34, Miaohe Lin <linmiaohe@huawei.com> wrote:
> On 2026/4/28 19:40, Muchun Song wrote:
>> 
>> 
>>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>>> On 2026/4/28 16:52, Muchun Song wrote:
>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>>> serialize the xarray lookup against memory block removal.
>>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>>> the memory block cannot disappear between xa_load() and get_device().
>>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>>> Cc: stable@vger.kernel.org
>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> Thanks for update.
>>>> ---
>>>> drivers/base/memory.c | 10 ++++++++--
>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>>> index 6981b55d582a..f76aee29e9a5 100644
>>>> --- a/drivers/base/memory.c
>>>> +++ b/drivers/base/memory.c
>>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>>> void memblk_nr_poison_inc(unsigned long pfn)
>>>> {
>>>>    const unsigned long block_id = pfn_to_block_id(pfn);
>>>> -    struct memory_block *mem = find_memory_block_by_id(block_id);
>>>> +    struct memory_block *mem;
>>>> +    lock_device_hotplug();
>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>>> I'm afraid if memory_failure() is triggered while lock_device_hotplug is held, it will lead to
>>> deadlock. Or am I miss something?
>> 
>> I am curious is there any place where memory_failure() is called with holding lock_device_hotplug?
> 
> Sorry for dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:
> 
> remove_memory
>  lock_device_hotplug <-- first called here
>  try_remove_memory
>    remove_memory_block_devices
>      num_poisoned_pages_sub

Passing pfn = -1 here.

>        memblk_nr_poison_sub
>          lock_device_hotplug <-- deadlock here

No. Can’t reach here. No deadlock.

Thanks.

> 
> Hope I'm not mistaken again. :)
> 
> Thank.
> .

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-28 13:52 [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup Muchun Song
@ 2026-04-29  3:08 ` Miaohe Lin
  2026-04-29  3:32   ` Oscar Salvador
  0 siblings, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2026-04-29  3:08 UTC (permalink / raw)
  To: Muchun Song
  Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	stable, David Hildenbrand, Oscar Salvador, Greg Kroah-Hartman,
	Rafael J Wysocki, Danilo Krummrich, Andrew Morton

On 2026/4/28 21:52, Muchun Song wrote:
> 
> 
> 
>> On Apr 28, 2026, at 20:34, Miaohe Lin <linmiaohe@huawei.com> wrote:
>> On 2026/4/28 19:40, Muchun Song wrote:
>>>
>>>
>>>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@huawei.com> wrote:
>>>> On 2026/4/28 16:52, Muchun Song wrote:
>>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>>>> serialize the xarray lookup against memory block removal.
>>>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>>>> the memory block cannot disappear between xa_load() and get_device().
>>>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>>>> Cc: stable@vger.kernel.org
>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>> Thanks for update.
>>>>> ---
>>>>> drivers/base/memory.c | 10 ++++++++--
>>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>>>> index 6981b55d582a..f76aee29e9a5 100644
>>>>> --- a/drivers/base/memory.c
>>>>> +++ b/drivers/base/memory.c
>>>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>>>> void memblk_nr_poison_inc(unsigned long pfn)
>>>>> {
>>>>>    const unsigned long block_id = pfn_to_block_id(pfn);
>>>>> -    struct memory_block *mem = find_memory_block_by_id(block_id);
>>>>> +    struct memory_block *mem;
>>>>> +    lock_device_hotplug();
>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>>>> I'm afraid if memory_failure() is triggered while lock_device_hotplug is held, it will lead to
>>>> deadlock. Or am I miss something?
>>>
>>> I am curious is there any place where memory_failure() is called with holding lock_device_hotplug?
>>
>> Sorry for dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:
>>
>> remove_memory
>>  lock_device_hotplug <-- first called here
>>  try_remove_memory
>>    remove_memory_block_devices
>>      num_poisoned_pages_sub
> 
> Passing pfn = -1 here.
> 
>>        memblk_nr_poison_sub
>>          lock_device_hotplug <-- deadlock here
> 
> No. Can’t reach here. No deadlock.

Right, I missed that. Thanks. But I'm still worried that there might be potential issues.
For example, this function could be called while lock_page is held. Acquiring lock_device_hotplug
while already holding lock_page might cause problems, though I haven't seen any specific issues yet.
Also there might be some other potential scenarios that haven't been considered. Hope I'm just
overthinking it. :)

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.
.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-29  3:08 ` Miaohe Lin
@ 2026-04-29  3:32   ` Oscar Salvador
  2026-04-29  4:18     ` Muchun Song
  0 siblings, 1 reply; 10+ messages in thread
From: Oscar Salvador @ 2026-04-29  3:32 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Muchun Song, Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	stable, David Hildenbrand, Greg Kroah-Hartman, Rafael J Wysocki,
	Danilo Krummrich, Andrew Morton

On Wed, Apr 29, 2026 at 11:08:51AM +0800, Miaohe Lin wrote:
> Right, I missed that. Thanks. But I'm still worried that there might be potential issues.
> For example, this function could be called while lock_page is held. Acquiring lock_device_hotplug
> while already holding lock_page might cause problems, though I haven't seen any specific issues yet.
> Also there might be some other potential scenarios that haven't been considered. Hope I'm just
> overthinking it. :)

lock_device_hotplug is a mutex lock, and we already take other mutex locks while
holding lock_folio in other paths, so I am not sure I see what should be special
in this case.


-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
  2026-04-29  3:32   ` Oscar Salvador
@ 2026-04-29  4:18     ` Muchun Song
  0 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2026-04-29  4:18 UTC (permalink / raw)
  To: Oscar Salvador, Miaohe Lin
  Cc: Muchun Song, Vishal Verma, Ying Huang, Dan Williams,
	Naoya Horiguchi, linux-mm, linux-cxl, driver-core, linux-kernel,
	stable, David Hildenbrand, Greg Kroah-Hartman, Rafael J Wysocki,
	Danilo Krummrich, Andrew Morton



> On Apr 29, 2026, at 11:32, Oscar Salvador <osalvador@suse.de> wrote:
> 
> On Wed, Apr 29, 2026 at 11:08:51AM +0800, Miaohe Lin wrote:
>> Right, I missed that. Thanks. But I'm still worried that there might be potential issues.
>> For example, this function could be called while lock_page is held. Acquiring lock_device_hotplug
>> while already holding lock_page might cause problems, though I haven't seen any specific issues yet.
>> Also there might be some other potential scenarios that haven't been considered. Hope I'm just
>> overthinking it. :)
> 
> lock_device_hotplug is a mutex lock, and we already take other mutex locks while
> holding lock_folio in other paths, so I am not sure I see what should be special
> in this case.

Hi Oscar and Miaohe,

I saw sashiko's report [1] related to folio lock and lock_device_hotplug.
Seems it is possible. You can correct me if I am wrong.

[1] https://sashiko.dev/#/patchset/20260428085219.1316047-1-songmuchun%40bytedance.com

We could fix this by calling action_result() without holding folio lock.
What do you think?

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ee42d4361309..bc76066547ce 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2534,8 +2534,9 @@ int memory_failure(unsigned long pfn, int flags)
         * Abort on fail: __filemap_remove_folio() assumes unmapped page.
         */
        if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
+               folio_unlock(folio);
                res = action_result(pfn, MF_MSG_UNMAP_FAILED, MF_FAILED);
-               goto unlock_page;
+               goto unlock_mutex;
        }

        /*
@@ -2543,16 +2544,15 @@ int memory_failure(unsigned long pfn, int flags)
         */
        if (folio_test_lru(folio) && !folio_test_swapcache(folio) &&
            folio->mapping == NULL) {
+               folio_unlock(folio);
                res = action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED);
-               goto unlock_page;
+               goto unlock_mutex;
        }

 identify_page_state:
        res = identify_page_state(pfn, p, page_flags);
        mutex_unlock(&mf_mutex);
        return res;
-unlock_page:
-       folio_unlock(folio);
 unlock_mutex:
        mutex_unlock(&mf_mutex);
        return res;

> 
> 
> -- 
> Oscar Salvador
> SUSE Labs


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-04-29  4:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 13:52 [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup Muchun Song
2026-04-29  3:08 ` Miaohe Lin
2026-04-29  3:32   ` Oscar Salvador
2026-04-29  4:18     ` Muchun Song
     [not found] <20260428085219.1316047-1-songmuchun@bytedance.com>
2026-04-28  8:52 ` Muchun Song
2026-04-28  9:17   ` Oscar Salvador
2026-04-28  9:21     ` Muchun Song
2026-04-28 11:37   ` Miaohe Lin
2026-04-28 11:40     ` Muchun Song
2026-04-28 12:34       ` Miaohe Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox