From: David Hildenbrand <david@redhat.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, rppt@kernel.org, linux-mm@kvack.org,
	Nathan Zimmer <nzimmer@sgi.com>
Subject: Re: [PATCH 2/4] mm: not __SetPageReserved on initializing hot-plugged memory
Date: Sat, 29 Jun 2024 16:38:47 +0200
Message-ID: <b0adbb0c-ad59-4bc5-ba0b-0af464b94557@redhat.com>
In-Reply-To: <20240629083222.35mebt7kqxiepsgg@master>

On 29.06.24 10:32, Wei Yang wrote:
> On Sat, Jun 29, 2024 at 08:19:49AM +0200, David Hildenbrand wrote:
>> On 29.06.24 03:33, Wei Yang wrote:
>>> Initializing all pages as reserved is an ancient behavior.
>>>
>>> Since commit 92923ca3aace ("mm: meminit: only set page reserved in the
>>> memblock region"), SetPageReserved is removed from
>>> __init_single_page(). Only those reserved pages are marked PG_reserved.
>>>
>>> But we still set PG_reserved on offline and check it on online.
>>>
>>> The following two commits removed both of them:
>>>
>>> * Commit 0ee5f4f31d36 ("mm/page_alloc.c: don't set pages PageReserved()
>>>     when offlining") removed the set on offline.
>>> * Commit 5ecae6359e3a ("mm/memory_hotplug: drop PageReserved() check in
>>>     online_pages_range()") removed the check on online.
>>>
>>> This means that setting PG_reserved for hot-plugged memory at
>>> initialization is no longer helpful and differs slightly from the
>>> bootmem initialization path. Now we can remove it.
>>
>> It's not that easy for ZONE_DEVICE.
>>
>> Also, see mm/mm-stable
>>
>> commit 3dadec1babf9eee0c67c967df931d6f0cb124a04
>> Author: David Hildenbrand <david@redhat.com>
>> Date:   Fri Jun 7 11:09:36 2024 +0200
>>
>>     mm: pass meminit_context to __free_pages_core()
>>
>>     Patch series "mm/memory_hotplug: use PageOffline() instead of
>>     PageReserved() for !ZONE_DEVICE".
>>
>>
>> commit b873faaa609ab44c223b2327f55d2b6a2ba4ca9c
>> Author: David Hildenbrand <david@redhat.com>
>> Date:   Fri Jun 7 11:09:37 2024 +0200
>>
>>     mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline()
>>     instead of PageReserved()
>>
> 
> Let me try to understand this.
> 
> You are also trying to get rid of PG_reserved, but you want PG_offline
> instead because this benefits virtio-mem, right?

We now make proper use of PG_offline. All hotplugged pages start out
PG_offline once we turn the section online. Only the ones that actually
get exposed to the buddy -- actually get onlined -- get PG_offline
cleared. A side effect of that is fewer hacks for virtio-mem, and more
natural handling for the other ballooning drivers that hotplug memory.
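
To illustrate that lifecycle, here is a minimal userspace sketch. It is
not kernel code: "struct hp_page" and the hp_*() helpers are made-up
names, and a plain bool stands in for the real PG_offline page type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for "struct page"; not the kernel layout. */
struct hp_page {
	bool offline;	/* models PG_offline */
	bool in_buddy;	/* models "exposed to the buddy" */
};

/* Models turning a section online: every page starts out offline. */
static void hp_section_online(struct hp_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		pages[i].offline = true;
		pages[i].in_buddy = false;
	}
}

/* Models onlining one page: only now does the offline marker go away. */
static void hp_expose_to_buddy(struct hp_page *page)
{
	assert(page->offline && !page->in_buddy);
	page->offline = false;
	page->in_buddy = true;
}

/* Counts the pages that are still logically offline. */
static size_t hp_count_offline(const struct hp_page *pages, size_t nr)
{
	size_t count = 0;

	for (size_t i = 0; i < nr; i++)
		if (pages[i].offline)
			count++;
	return count;
}
```

In this model, a driver that onlines only part of a section leaves the
remaining pages marked offline without any extra bookkeeping.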

In the future, I'm planning on moving more fake-offlining code from
virtio-mem to the core, making more use of PG_offline.

For now, it stops the PG_reserved use while maintaining the same
semantics as before: the page content and "struct page" are not to be
touched by anybody except the "owner".

> 
> But I don't get why PG_offline is wrong for ZONE_DEVICE. I may be missing
> some background on it.

I suggest you take a look at the PG_offline documentation. ZONE_DEVICE
pages are certainly not logically offline pages. They will never be
considered online as part of online sections. But they will also never
be handed to the buddy.

Maybe we want a dedicated page type for them in the future, not sure. We
can right now identify them reliably using the zone idx.

Using a page type right now is very likely not possible, because we 
might be using the page->_mapcount in rmap code when mapping some of 
them to user space.
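
A minimal userspace sketch of that conflict, assuming (as I recall for
kernels of this era) that the page type lives in a field overlaying
page->_mapcount. "struct mc_page", MC_TYPE_DEVICE and the mc_*()
helpers are made-up names, and the type encoding is invented for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up type encoding; the real kernel values differ. */
#define MC_TYPE_DEVICE 0xf0000800u

/* Toy stand-in for the part of "struct page" that matters here. */
struct mc_page {
	union {
		int mapcount;		/* models page->_mapcount */
		unsigned int page_type;	/* models page->page_type */
	};
};

/* Untyped, unmapped page: mapcount starts at -1. */
static void mc_page_init(struct mc_page *page)
{
	page->mapcount = -1;
}

/* Tag the page with the hypothetical device type. */
static void mc_set_device_type(struct mc_page *page)
{
	page->page_type = MC_TYPE_DEVICE;
}

static bool mc_has_device_type(const struct mc_page *page)
{
	return page->page_type == MC_TYPE_DEVICE;
}

/* Models rmap accounting when the page gets mapped to user space. */
static void mc_map_to_user(struct mc_page *page)
{
	page->mapcount++;	/* same storage as page_type */
}
```

Once mc_map_to_user() runs on a typed page, the type check no longer
matches, which is the reason a page type cannot simply be added to
pages that may end up mapped into user space.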

If we want to get rid of the PG_reserved for them right now, we'll have
to make sure all existing PageReserved checks won't be degraded. For
example, drivers/vfio/vfio_iommu_type1.c might need some work (not sure).

The KVM one in kvm_pfn_to_refcounted_page() should already be fine, 
because they really want to refcount them.

A lot of other ones, like can_gather_numa_stats(), already refuse
is_zone_device_page() manually, and maybe we want to factor both checks
out into a separate function like "is_special_reserved_page()" or
something like that.
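
A rough userspace sketch of what such a combined check could look
like. Everything here is hypothetical: "struct rsv_page", the
rsv_*() helpers and the zone enum only model PageReserved() and
is_zone_device_page(); no such helper exists in the kernel today.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy zone index; only the ZONE_DEVICE distinction matters here. */
enum rsv_zone { RSV_ZONE_NORMAL, RSV_ZONE_MOVABLE, RSV_ZONE_DEVICE };

/* Toy stand-in for "struct page"; not the kernel layout. */
struct rsv_page {
	bool reserved;		/* models PG_reserved */
	enum rsv_zone zone;	/* models the zone idx of the page */
};

/* Models PageReserved(). */
static bool rsv_page_reserved(const struct rsv_page *page)
{
	return page->reserved;
}

/* Models is_zone_device_page(): identify the page via its zone idx. */
static bool rsv_is_zone_device_page(const struct rsv_page *page)
{
	return page->zone == RSV_ZONE_DEVICE;
}

/* Hypothetical combined helper: callers refuse pages for which it is true. */
static bool rsv_is_special_reserved_page(const struct rsv_page *page)
{
	return rsv_page_reserved(page) || rsv_is_zone_device_page(page);
}
```

With a helper of this shape, a ZONE_DEVICE page would still be refused
by callers even if PG_reserved were no longer set on it.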

-- 
Cheers,

David / dhildenb



Thread overview: 14+ messages
2024-06-29  1:33 [PATCH 1/4] mm: use zonelist_zone() to get zone Wei Yang
2024-06-29  1:33 ` [PATCH 2/4] mm: not __SetPageReserved on initializing hot-plugged memory Wei Yang
2024-06-29  6:19   ` David Hildenbrand
2024-06-29  8:32     ` Wei Yang
2024-06-29 14:38       ` David Hildenbrand [this message]
2024-06-30  7:32         ` Wei Yang
2024-06-29  1:33 ` [PATCH 3/4] mm/page_alloc: put __free_pages_core() in __meminit section Wei Yang
2024-06-29  1:33 ` [PATCH 4/4] mm/page_alloc: no need to ClearPageReserved on giving page to buddy system Wei Yang
2024-06-29  3:21   ` Matthew Wilcox
2024-06-29  8:44     ` Wei Yang
2024-06-29 16:28       ` Matthew Wilcox
2024-06-29 16:45         ` Matthew Wilcox
2024-06-30  7:30           ` Wei Yang
     [not found]   ` <4a93f7b7-8ba8-4877-99c7-1048674d074d@redhat.com>
     [not found]     ` <299a4d6a-6b76-49b7-be2e-573cd66fd46f@redhat.com>
2024-06-29  8:48       ` Wei Yang
