public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, osalvador@suse.de,
	nao.horiguchi@gmail.com, mhocko@suse.com,
	Wupeng Ma <mawupeng1@huawei.com>
Subject: Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
Date: Tue, 21 Jan 2025 08:58:41 +0100	[thread overview]
Message-ID: <2fc40750-2ff7-4e73-ba52-c4d9caaa4f4f@redhat.com> (raw)
In-Reply-To: <aab326a0-968f-fbdf-be30-78e771f6fda2@huawei.com>

On 21.01.25 04:20, Miaohe Lin wrote:
> On 2025/1/20 16:46, David Hildenbrand wrote:
>> On 20.01.25 08:49, David Hildenbrand wrote:
>>>
>>>>>         if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>>>             struct address_space *mapping;
>>>>>     @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>             if (!mapping) {
>>>>>                 pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>>>                     folio_pfn(folio));
>>>>> -            return;
>>>>> +            return -EBUSY;
>>>>>             }
>>>>>                try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>         } else {
>>>>>             try_to_unmap(folio, ttu);
>>>>>         }
>>>>> +
>>>>> +    return folio_mapped(folio) ? -EBUSY : 0;
>>>>
>>>> Do we really need this return value? It's unused in do_migrate_range().
>>>
>>> I suggested it, because the folio_mapped() is nowadays extremely cheap.
>>> It cleans up hwpoison_user_mappings() quite nicely.
>>
>> I'm also wondering, if in do_migrate_range(), we want to pr_warn_ratelimit() in case still mapped after the call. IIUC, we don't really expect this to happen with SYNC set.
> 
> Do you mean TTU_SYNC? It seems it's not set.

With your patch it will be now, which is the right thing to do I think.

> 
> There might be a race will hit the proposed pr_warn_ratelimit():
> 
> /* Assume folio is isolated for reclaim, so memory_failure failed to handle it at first time. Then it's put back to LRU. */
> do_migrate_range
>   folio_test_hwpoison
>    folio_mapped
>    <folio is isolated for reclaim again.>
>     unmap_poisoned_folio
>    <folio is put buck.>
>      pr_warn_ratelimit(folio_mapped)
> 
> But I might be miss something. And even this race is possible, it should be really hard to hit.

Does try_to_unmap() care about isolation? Skimming over the code, I 
don't think so. I assume once we take the folio lock, races with reclaim 
are impossible.

In any case, the race is unexpected, so pr_warn_() would be helpful and 
not harmful.

Memory offlining code will later simply skip all PageHWPoison() pages, 
independent of the refcount as it seems. Failing to unmap might not be 
handled correctly at all ... I think this might be problematic in other 
regard (e.g., GUP references), but failing to unmap is "obviously" bad I 
think :)

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2025-01-21  7:58 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-16  6:16 [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly Wupeng Ma
2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
2025-01-17  3:57   ` kernel test robot
2025-01-17  4:16     ` mawupeng
2025-01-17  4:39   ` kernel test robot
2025-01-17  4:49   ` kernel test robot
2025-01-20  6:24   ` Miaohe Lin
2025-01-20  7:49     ` David Hildenbrand
2025-01-20  8:46       ` David Hildenbrand
2025-01-21  3:20         ` Miaohe Lin
2025-01-21  7:58           ` David Hildenbrand [this message]
2025-01-22  7:38             ` Miaohe Lin
2025-01-21  2:46       ` Miaohe Lin
2025-01-20  9:06     ` mawupeng
2025-01-20  7:55   ` David Hildenbrand
2025-01-16  6:16 ` [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Wupeng Ma
2025-01-20  9:25   ` David Hildenbrand
2025-01-16  6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
2025-01-20  6:32   ` Miaohe Lin
2025-01-21  2:17     ` mawupeng
2025-01-20  8:01   ` David Hildenbrand
2025-01-20  9:11     ` mawupeng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2fc40750-2ff7-4e73-ba52-c4d9caaa4f4f@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mawupeng1@huawei.com \
    --cc=mhocko@suse.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=osalvador@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox