All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	corbet@lwn.net, akpm@linux-foundation.org, paulmck@kernel.org,
	mike.kravetz@oracle.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	duanxiongchun@bytedance.com, smuchun@gmail.com
Subject: Re: [PATCH v2 2/2] mm: memory_hotplug: introduce SECTION_CANNOT_OPTIMIZE_VMEMMAP
Date: Fri, 17 Jun 2022 11:25:20 +0200	[thread overview]
Message-ID: <751b003d-e7c2-dfcc-82b3-e80612cdc104@redhat.com> (raw)
In-Reply-To: <YqxE/yJ1srzpegPb@FVFYT0MHHV2J.usts.net>

On 17.06.22 11:10, Muchun Song wrote:
> On Fri, Jun 17, 2022 at 09:39:27AM +0200, David Hildenbrand wrote:
>> On 17.06.22 09:28, Muchun Song wrote:
>>> On Fri, Jun 17, 2022 at 07:46:53AM +0200, Oscar Salvador wrote:
>>>> On Thu, Jun 16, 2022 at 09:30:33AM +0200, David Hildenbrand wrote:
>>>>> IIRC, that was used to skip these patches on the offlining path before
>>>>> we provided the ranges to offline_pages().
>>>>
>>>> Yeah, it was designed for that purpose back then.
>>>>
>>>>> I'd not mess with PG_reserved, and give them a clearer name, to not
>>>>> confuse them with other, ordinary, vmemmap pages that are not
>>>>> self-hosted (maybe in the future we might want to flag all vmemmap pages
>>>>> with a new type?).
>>>>
>>>> Not sure whether a new type is really needed, or to put it another way, I
>>>> cannot see the benefit.
>>>>
>>>>>
>>>>> I'd just try reusing the flag PG_owner_priv_1. And eventually, flag all
>>>>> (v)memmap pages with a type PG_memmap. However, the latter would be
>>>>> optional and might not be strictly required
>>>>>
>>>>>
>>>>> So what think could make sense is
>>>>>
>>>>> /* vmemmap pages that are self-hosted and cannot be optimized/freed. */
>>>>> PG_vmemmap_self_hosted = PG_owner_priv_1,
>>>>
>>>> Sure, I just lightly tested the below, and seems to work, but not sure
>>>> whether that is what you are referring to.
>>>> @Munchun: thoughts?
>>>>
>>>
>>> I think it works and fits my requirement.
>>>
>>>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>>>> index e66f7aa3191d..a4556afd7bda 100644
>>>> --- a/include/linux/page-flags.h
>>>> +++ b/include/linux/page-flags.h
>>>> @@ -193,6 +193,11 @@ enum pageflags {
>>>>  
>>>>  	/* Only valid for buddy pages. Used to track pages that are reported */
>>>>  	PG_reported = PG_uptodate,
>>>> +
>>>> +#ifdef CONFIG_MEMORY_HOTPLUG
>>>> +	/* For self-hosted memmap pages */
>>>> +	PG_vmemmap_self_hosted = PG_owner_priv_1,
>>>> +#endif
>>>>  };
>>>>  
>>>>  #define PAGEFLAGS_MASK		((1UL << NR_PAGEFLAGS) - 1)
>>>> @@ -628,6 +633,10 @@ PAGEFLAG_FALSE(SkipKASanPoison, skip_kasan_poison)
>>>>   */
>>>>  __PAGEFLAG(Reported, reported, PF_NO_COMPOUND)
>>>>  
>>>> +#ifdef CONFIG_MEMORY_HOTPLUG
>>>> +PAGEFLAG(Vmemmap_self_hosted, vmemmap_self_hosted, PF_ANY)
>>>> +#endif
>>>> +
>>>>  /*
>>>>   * On an anonymous page mapped into a user virtual memory area,
>>>>   * page->mapping points to its anon_vma, not to a struct address_space;
>>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>>> index 1089ea8a9c98..e2de7ed27e9e 100644
>>>> --- a/mm/hugetlb_vmemmap.c
>>>> +++ b/mm/hugetlb_vmemmap.c
>>>> @@ -101,6 +101,14 @@ void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
>>>>  {
>>>>  	unsigned long vmemmap_addr = (unsigned long)head;
>>>>  	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;
>>>> +	struct mem_section *ms = __pfn_to_section(page_to_pfn(head));
>>>> +	struct page *memmap;
>>>> +
>>>> +	memmap = sparse_decode_mem_map(ms->section_mem_map,
>>>> +				       pfn_to_section_nr(page_to_pfn(head)));
>>>> +
>>>> +	if (PageVmemmap_self_hosted(memmap))
>>>> +		return;
>>>
>>> I think here needs a loop if it is a 1GB page (spans multiple sections).
>>> Right?  Here is an implementation based on another approach. But I think
>>> your implementation is more simpler and efficient.  Would you mind me
>>> squash your diff into my patch and with your "Co-developed-by"?
>>
>> Due to hugtlb alignment requirements, and the vmemmap pages being at the
>> start of the hotplugged memory region, I think that cannot currently
>> happen. Checking the first vmemmap page might be good enough for now,
>> and probably for the future.
>>
> 
> If the memory block size is 128MB, then a 1GB huge page spans 8 blocks.
> Is it possible that some blocks of them are vmemmap-hosted?

No, don't think so. If you think about it, a huge/gigantic page can only
start in a memmap-on-memory region but never end in on (or overlap one)
-- because the reserved memmap part of the memory block always precedes
actually usable data.

So even with variable-size memory blocks and weird address alignment,
checking the first memmap of a huge page for vmemmp-on-memory should be
sufficient.

Unless I am missing something.

-- 
Thanks,

David / dhildenb


  reply	other threads:[~2022-06-17  9:25 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-20  2:55 [PATCH v2 0/2] make hugetlb_optimize_vmemmap compatible with memmap_on_memory Muchun Song
2022-05-20  2:55 ` [PATCH v2 1/2] mm: memory_hotplug: enumerate all supported section flags Muchun Song
2022-06-15  9:35   ` David Hildenbrand
2022-06-15 13:02     ` Muchun Song
2022-05-20  2:55 ` [PATCH v2 2/2] mm: memory_hotplug: introduce SECTION_CANNOT_OPTIMIZE_VMEMMAP Muchun Song
2022-06-15  9:51   ` David Hildenbrand
2022-06-16  2:45     ` Muchun Song
2022-06-16  7:21       ` David Hildenbrand
2022-06-16 10:16         ` Muchun Song
2022-06-16  3:57     ` Oscar Salvador
2022-06-16  7:30       ` David Hildenbrand
2022-06-17  5:46         ` Oscar Salvador
2022-06-17  7:28           ` Muchun Song
2022-06-17  7:39             ` David Hildenbrand
2022-06-17  9:10               ` Muchun Song
2022-06-17  9:25                 ` David Hildenbrand [this message]
2022-06-17  9:40                   ` Muchun Song
2022-06-17  9:48             ` Oscar Salvador
2022-06-17  7:43           ` David Hildenbrand
2022-06-17  9:54             ` Oscar Salvador
2022-06-17 10:14               ` David Hildenbrand
2022-06-17 10:49                 ` Muchun Song
2022-06-17 11:19                   ` Muchun Song
2022-06-18  5:49           ` Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=751b003d-e7c2-dfcc-82b3-e80612cdc104@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=duanxiongchun@bytedance.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=osalvador@suse.de \
    --cc=paulmck@kernel.org \
    --cc=smuchun@gmail.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.