linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Zi Yan <ziy@nvidia.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	virtualization@lists.linux.dev,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Michal Hocko" <mhocko@suse.com>,
	"Brendan Jackman" <jackmanb@google.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: Re: [PATCH v1 1/2] mm/memory_hotplug: PG_offline_skippable for offlining memory blocks with PageOffline pages
Date: Wed, 14 May 2025 21:51:52 +0200	[thread overview]
Message-ID: <fb12fab1-f2df-4b4f-ae83-1b45e2a7d6bd@redhat.com> (raw)
In-Reply-To: <7E5BD96D-971F-4AFC-AC09-503310BE8E68@nvidia.com>

Thanks a bucnh for the review!

>> +
>> +		if (PageOffline(page) && PageOfflineSkippable(page))
>> +			continue;
>> +
> 
> Some comment like "Skippable PageOffline() pages are not migratable but are
> skipped during memory offline" might help understand the change.

I can add a comment like for the other cases.

> 
> Some thoughts after reading the related code:
> 
> offline_pages() is a little convoluted, since it has two steps to make
> sure a range of memory can be offlined:
> 1. start_isolate_page_range() checks unmovable (maybe not migratable
> is more precise) pages using has_unmovable_pages(), but leaves unmovable
> PageOffline() page handling to the driver;

Right, it's best-effort. For ZONE_MOVABLE we're skipping the checks 
completely.

I was wondering for a longer time that -- with the isolate flag being a 
separate bit soon :) -- we could simply isolate the whole range, and 
then fail if we stumble over an unmovable page during migration. That 
is, get rid of has_unmovable_pages() entirely. Un-doing the isolation 
would then properly preserve the migratetype -- no harm done?

Certainly worth a look. What do you think about that?

> 2. scan_movable_pages() and do_migrate_range() migrate pages and handle
> different types of PageOffline() pages.

Right, migrate what we can, skip these special once, and bail out on any 
others (at least in this patch, patch #2 restores the pre-virtio-mem 
behavior).

> 
> It might make the logic cleaner if start_isolate_page_range() can
> have a callback to allow driver to handle PageOffline() pages.

We have to identify them repeadetly, so start_isolate_page_range() would 
not be sufficient. Also, callbacks are rather tricky for the case where 
we cannot really stabilize the page.

> 
> Hmm, Skippable PageOffline() pages are virtio-mem specific, I wonder
> why offline_pages() takes care of them. Shouldn't virtio-mem driver
> handle them?
 > I also realize that the two steps in offline_pages()> are very 
similar to alloc_contig_range() and virtio-mem is using
> alloc_contig_range(). I wonder if the two steps in offline_pages()
> can be abstracted out and shared with virtio-mem.Yes, offline_pages()> operates at memory section granularity, different from the granularity,
> pageblock size, of alloc_contig_range() used in virtio-mem, but
> both are trying to check unmovable pages and migrate movable pages.

Not sure I get completely what you mean. virtio-mem really wants to use 
alloc_contig_range() to allocate ranges it wants to unplug, because it 
will fail fast and allows for smaller ranges.

offline_pages() is primarily also about offlining the memory section, 
which virtio-mem must do in order to remove the Linux memory block.

> 
> 
>>   		folio = page_folio(page);
>>
>>   		if (!folio_try_get(folio))
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index a6fe1e9b95941..325b51c905076 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -7023,12 +7023,12 @@ unsigned long __offline_isolated_pages(unsigned long start_pfn,
>>   			continue;
>>   		}
>>   		/*
>> -		 * At this point all remaining PageOffline() pages have a
>> -		 * reference count of 0 and can simply be skipped.
>> +		 * At this point all remaining PageOffline() pages must be
>> +		 * "skippable" and have exactly one reference.
>>   		 */
>>   		if (PageOffline(page)) {
>> -			BUG_ON(page_count(page));
>> -			BUG_ON(PageBuddy(page));
>> +			WARN_ON_ONCE(!PageOfflineSkippable(page));
>> +			WARN_ON_ONCE(page_count(page) != 1);
>>   			already_offline++;
>>   			pfn++;
>>   			continue;
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index b2fc5266e3d26..debd898b794ea 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -121,16 +121,11 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
>>   			continue;
>>
>>   		/*
>> -		 * We treat all PageOffline() pages as movable when offlining
>> -		 * to give drivers a chance to decrement their reference count
>> -		 * in MEM_GOING_OFFLINE in order to indicate that these pages
>> -		 * can be offlined as there are no direct references anymore.
>> -		 * For actually unmovable PageOffline() where the driver does
>> -		 * not support this, we will fail later when trying to actually
>> -		 * move these pages that still have a reference count > 0.
>> -		 * (false negatives in this function only)
>> +		 * PageOffline() pages that are marked as "skippable"
>> +		 * are treated like movable pages for memory offlining.
>>   		 */
>> -		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
>> +		if ((flags & MEMORY_OFFLINE) && PageOffline(page) &&
>> +		    PageOfflineSkippable(page))
>>   			continue;
> 
> With this change, we are no longer give non-virtio-mem driver a chance
> to decrease PageOffline(page) refcount? Or virtio-mem is the only driver
> doing this?

The only in-tree driver making use of this so far, yes.


There is one tweak I'll have to perform in the virtio-mem driver to 
cover one corner case: when force-unloading the virtio-mem driver, we 
have to make sure that memory blocks with fake-offline pages cannot get 
offlined (re-onlining would be bad). I'll fix that up.

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2025-05-14 19:52 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-14 11:15 [PATCH v1 0/2] mm/memory_hotplug: introduce and use PG_offline_skippable David Hildenbrand
2025-05-14 11:15 ` [PATCH v1 1/2] mm/memory_hotplug: PG_offline_skippable for offlining memory blocks with PageOffline pages David Hildenbrand
2025-05-14 19:00   ` Zi Yan
2025-05-14 19:51     ` David Hildenbrand [this message]
2025-05-14 20:30       ` Zi Yan
2025-05-19 14:39         ` David Hildenbrand
2025-05-14 11:15 ` [PATCH v1 2/2] mm/memory_hotplug: remove -EBUSY handling from scan_movable_pages() David Hildenbrand
2025-05-14 19:57   ` David Hildenbrand
2025-05-14 13:45 ` [PATCH v1 0/2] mm/memory_hotplug: introduce and use PG_offline_skippable Zi Yan
2025-05-14 14:12   ` David Hildenbrand
2025-05-14 15:49     ` Zi Yan
2025-05-14 17:28       ` David Hildenbrand
2025-05-14 17:43         ` Zi Yan
2025-05-14 17:46           ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fb12fab1-f2df-4b4f-ae83-1b45e2a7d6bd@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=eperezma@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=jackmanb@google.com \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mst@redhat.com \
    --cc=osalvador@suse.de \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=virtualization@lists.linux.dev \
    --cc=willy@infradead.org \
    --cc=xuanzhuo@linux.alibaba.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).