Re: [PATCH v2 2/2] mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range

From: David Hildenbrand <david@redhat.com>
To: Jinjiang Tu <tujinjiang@huawei.com>, Oscar Salvador <osalvador@suse.de>
Cc: akpm@linux-foundation.org, linmiaohe@huawei.com,
	mhocko@kernel.org, linux-mm@kvack.org,
	wangkefeng.wang@huawei.com
Subject: Re: [PATCH v2 2/2] mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range
Date: Thu, 3 Jul 2025 09:57:50 +0200
Message-ID: <c46e6ba9-edfe-44ea-8d2a-dea812b6d3b5@redhat.com>
In-Reply-To: <373d02c5-2b62-8543-b786-8fd591ad56eb@huawei.com>

On 03.07.25 09:46, Jinjiang Tu wrote:
> 
> On 2025/7/1 22:21, Oscar Salvador wrote:
>> On Fri, Jun 27, 2025 at 08:57:47PM +0800, Jinjiang Tu wrote:
>>> In do_migrate_range(), the hwpoisoned folio may be a large folio, which
>>> can't be handled by unmap_poisoned_folio().
>>>
>>> I can reproduce this issue in qemu after adding a delay in memory_failure():
>>>
>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>>> Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>>> RIP: 0010:try_to_unmap_one+0x16a/0xfc0
>>>    <TASK>
>>>    rmap_walk_anon+0xda/0x1f0
>>>    try_to_unmap+0x78/0x80
>>>    ? __pfx_try_to_unmap_one+0x10/0x10
>>>    ? __pfx_folio_not_mapped+0x10/0x10
>>>    ? __pfx_folio_lock_anon_vma_read+0x10/0x10
>>>    unmap_poisoned_folio+0x60/0x140
>>>    do_migrate_range+0x4d1/0x600
>>>    ? slab_memory_callback+0x6a/0x190
>>>    ? notifier_call_chain+0x56/0xb0
>>>    offline_pages+0x3e6/0x460
>>>    memory_subsys_offline+0x130/0x1f0
>>>    device_offline+0xba/0x110
>>>    acpi_bus_offline+0xb7/0x130
>>>    acpi_scan_hot_remove+0x77/0x290
>>>    acpi_device_hotplug+0x1e0/0x240
>>>    acpi_hotplug_work_fn+0x1a/0x30
>>>    process_one_work+0x186/0x340
>>>
>>> In this case, just make offline_pages() fail.
>>>
>>> Besides, do_migrate_range() may be called between memory_failure() setting
>>> the hwpoison flag and isolating the folio from the LRU, so remove the
>>> WARN_ON(). In other places, unmap_poisoned_folio() is called only once the
>>> folio is isolated; follow that convention in do_migrate_range() too.
>>>
>>> Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
>>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
>> ...
>>> @@ -2041,11 +2048,9 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>>>    
>>>    			ret = scan_movable_pages(pfn, end_pfn, &pfn);
>>>    			if (!ret) {
>>> -				/*
>>> -				 * TODO: fatal migration failures should bail
>>> -				 * out
>>> -				 */
>>> -				do_migrate_range(pfn, end_pfn);
>>> +				ret = do_migrate_range(pfn, end_pfn);
>>> +				if (ret)
>>> +					break;
>> I am not really sure about this one.
>> I get the reason you're adding it, but note that migrate_pages() can also return
>> "fatal" errors and we don't propagate that.
>>
>> The motto has always been to migrate as much as possible, and this changes
>> that behaviour.
>>    
> If we just skip to the next pfn, offline_pages() will loop pointlessly
> until it receives a signal.

Yeah, that's also not good.

> It seems there is no documentation guaranteeing that memory offlining has
> to migrate as much as possible.

We should try as hard as we can to offline the memory. But if there is
something we simply cannot migrate, there is no point in retrying.

Now, could we run into this case here because we are racing with other 
code, and actually retrying again could make it work?

Remind me again: how exactly do we arrive at this point of having a 
large folio that is hwpoisoned but still mapped?

In memory_failure(), for a large folio, we do:

1) folio_set_has_hwpoisoned
2) try_to_split_thp_page
3) if splitting fails, kill_procs_now

So given that, couldn't we just retry the migration until the race is 
over and we are good?
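
To make the question concrete, here is a minimal userspace sketch of that
retry idea. It is a toy model under stated assumptions, not kernel code:
struct model_folio, model_memory_failure_progress(), model_migrate_one() and
model_offline() are hypothetical names, and the "race window" is reduced to
a simple counter of offline passes. It only illustrates the difference
between a transient condition (worth retrying) and a fatal one (bail out).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio's state; NOT the kernel's struct folio. */
struct model_folio {
	bool hwpoisoned;
	bool large;
	bool mapped;
	bool unmovable;        /* assumption: something we can never migrate */
	bool split_succeeds;   /* assumption: outcome of the split attempt */
	int race_ticks_left;   /* passes until memory_failure() is done */
};

enum migrate_result { MIGRATE_OK, MIGRATE_RETRY, MIGRATE_FATAL };

/*
 * Model memory_failure() making progress between two offline passes.
 * When the window closes, either the split succeeded (folio is no longer
 * large) or it failed and the users were killed (folio is no longer mapped).
 */
static void model_memory_failure_progress(struct model_folio *f)
{
	if (f->race_ticks_left > 0 && --f->race_ticks_left == 0) {
		if (f->split_succeeds)
			f->large = false;
		else
			f->mapped = false;
	}
}

/* One migration attempt on the modelled folio. */
static enum migrate_result model_migrate_one(const struct model_folio *f)
{
	if (f->unmovable)
		return MIGRATE_FATAL;   /* retrying cannot help */
	if (f->hwpoisoned && f->large && f->mapped)
		return MIGRATE_RETRY;   /* racing with memory_failure() */
	return MIGRATE_OK;
}

/*
 * Returns the number of passes needed on success, -1 if max_passes was
 * exhausted, and -2 if a fatal condition was hit.
 */
static int model_offline(struct model_folio *f, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		enum migrate_result r = model_migrate_one(f);

		if (r == MIGRATE_OK)
			return pass;
		if (r == MIGRATE_FATAL)
			return -2;
		model_memory_failure_progress(f);
	}
	return -1;
}
```

In this model, a poisoned, mapped large folio whose window closes after two
passes is offlined on the third pass, while an unmovable folio makes
model_offline() fail immediately instead of looping.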

-- 
Cheers,

David / dhildenb
