From: David Hildenbrand <david@redhat.com>
To: Zi Yan <ziy@nvidia.com>
Cc: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
Matthew Wilcox <willy@infradead.org>,
Luis Chamberlain <mcgrof@kernel.org>,
Jinjiang Tu <tujinjiang@huawei.com>,
Oscar Salvador <osalvador@suse.de>,
akpm@linux-foundation.org, linmiaohe@huawei.com,
mhocko@kernel.org, linux-mm@kvack.org,
wangkefeng.wang@huawei.com
Subject: Re: [PATCH v2 2/2] mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range
Date: Mon, 14 Jul 2025 17:33:40 +0200
Message-ID: <c0f5492c-f9fe-48c8-98bc-d8cc8e7e00b3@redhat.com>
In-Reply-To: <A1B06D4B-5B39-47C3-94E9-93FE3F0A758D@nvidia.com>

On 14.07.25 17:28, Zi Yan wrote:
> On 14 Jul 2025, at 11:25, Zi Yan wrote:
>
>> On 14 Jul 2025, at 11:14, David Hildenbrand wrote:
>>
>>> On 14.07.25 17:09, Pankaj Raghav (Samsung) wrote:
>>>>>>>> So we will need to take care of madvise cold or pageout case?
>>>>>>>>
>>>>>>>> Hi Matthew, Pankaj, and Luis,
>>>>>>>>
>>>>>>>> Is it possible to partially map a min-order folio in a fs with LBS? Based on my
>>>>>>>
>>>>>>> Typically, FSs match the min order with the blocksize of the filesystem.
>>>>>>> As a filesystem block is the smallest unit of data that the filesystem uses
>>>>>>> to store file data on the disk, we cannot partially map them.
>>>>>>>
>>>>>>> So if I understand your question correctly, the answer is no.
>>>>>
>>>>> I'm confused. Shouldn't this be trivially possible?
>>>>>
>>>> Hmm, maybe I misunderstood the question?
>>>>
>>>>> E.g., just mmap() a single page of such a file? Who would make that fail?
>>>>>
>>>>
>>>> My point was, even if you try to mmap a single page of a file, page
>>>> cache will read the whole block (that corresponds to min order folio).
>>>>
>>>> Technically we can mmap a single page of file, but FS will always read
>>>> and write **at least** in min folio order chunks.
>>>
>>> Okay, so it can be partially mapped into page tables :) What happens in the background (page cache management) is a different story.
>>
>> David, thanks for getting to the bottom of this.
>>
>> OK. So we will see deadlock looping in madvise cold or pageout case.
>> I wonder how to proceed with this. Since the folio is seen as a whole
>> by fs, it should be marked cold/paged out as a whole. Maybe we should
>> skip the partially mapped region?
>
> Actually, it is skipped, since split_folio() bumps new_order to the min
> order, and if the folio order is already at min order, the split code
> returns -EINVAL. This makes the madvise cold or pageout code move to the
> next address.

But what if the folio order is 2x min_order etc?
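
To make the case in question concrete, here is a minimal userspace sketch of the behaviour described in the quoted text. It is a toy model, not the kernel's code: model_split_folio() and its parameters are hypothetical names, and it assumes only what was stated above (the requested order is bumped to the min order, and a split to an order that is not smaller than the folio's current order fails with -EINVAL).

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of the split decision discussed above (not kernel code).
 *
 * folio_order: current order of the folio
 * min_order:   minimum folio order of the mapping (e.g. with LBS)
 *
 * Returns the order of the folios produced by the split, or -EINVAL
 * if the split is refused.
 */
static int model_split_folio(unsigned int folio_order, unsigned int min_order)
{
	unsigned int new_order = 0;	/* the caller would like order-0 pages */

	/* Assumption: a folio in such a mapping is never below the min order. */
	assert(folio_order >= min_order);

	/* The requested order is bumped up to the mapping's min order. */
	if (new_order < min_order)
		new_order = min_order;

	/* Splitting to the same (or a larger) order is refused. */
	if (new_order >= folio_order)
		return -EINVAL;

	return (int)new_order;
}
```

Under these assumptions, with min_order = 2, model_split_folio(2, 2) yields -EINVAL (the case where madvise moves on to the next address), while model_split_folio(4, 2) succeeds and yields order-2 folios. Whether those resulting folios, which can still be partially mapped, are then handled correctly is the open point.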
--
Cheers,
David / dhildenb