linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Usama Arif <usama.arif@bytedance.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: linux-mm@kvack.org, muchun.song@linux.dev,
	mike.kravetz@oracle.com, linux-kernel@vger.kernel.org,
	fam.zheng@bytedance.com, liangma@liangbit.com,
	simon.evans@bytedance.com, punit.agrawal@bytedance.com
Subject: Re: [External] Re: [RFC 2/4] mm/memblock: Add hugepage_size member to struct memblock_region
Date: Thu, 27 Jul 2023 21:56:28 +0100	[thread overview]
Message-ID: <383a72f7-b9d3-963f-e5fb-b0036376399e@bytedance.com> (raw)
In-Reply-To: <20230727043002.GA1901145@kernel.org>



On 27/07/2023 05:30, Mike Rapoport wrote:
> On Wed, Jul 26, 2023 at 04:02:21PM +0100, Usama Arif wrote:
>>
>> On 26/07/2023 12:01, Mike Rapoport wrote:
>>> On Mon, Jul 24, 2023 at 02:46:42PM +0100, Usama Arif wrote:
>>>> This propagates the hugepage size from the memblock APIs
>>>> (memblock_alloc_try_nid_raw and memblock_alloc_range_nid)
>>>> so that it can be stored in struct memblock region. This does not
>>>> introduce any functional change and hugepage_size is not used in
>>>> this commit. It is just a setup for the next commit where huge_pagesize
>>>> is used to skip initialization of struct pages that will be freed later
>>>> when HVO is enabled.
>>>>
>>>> Signed-off-by: Usama Arif <usama.arif@bytedance.com>
>>>> ---
>>>>    arch/arm64/mm/kasan_init.c                   |  2 +-
>>>>    arch/powerpc/platforms/pasemi/iommu.c        |  2 +-
>>>>    arch/powerpc/platforms/pseries/setup.c       |  4 +-
>>>>    arch/powerpc/sysdev/dart_iommu.c             |  2 +-
>>>>    include/linux/memblock.h                     |  8 ++-
>>>>    mm/cma.c                                     |  4 +-
>>>>    mm/hugetlb.c                                 |  6 +-
>>>>    mm/memblock.c                                | 60 ++++++++++++--------
>>>>    mm/mm_init.c                                 |  2 +-
>>>>    mm/sparse-vmemmap.c                          |  2 +-
>>>>    tools/testing/memblock/tests/alloc_nid_api.c |  2 +-
>>>>    11 files changed, 56 insertions(+), 38 deletions(-)
>>>>
>>>
>>> [ snip ]
>>>
>>>> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
>>>> index f71ff9f0ec81..bb8019540d73 100644
>>>> --- a/include/linux/memblock.h
>>>> +++ b/include/linux/memblock.h
>>>> @@ -63,6 +63,7 @@ struct memblock_region {
>>>>    #ifdef CONFIG_NUMA
>>>>    	int nid;
>>>>    #endif
>>>> +	phys_addr_t hugepage_size;
>>>>    };
>>>>    /**
>>>> @@ -400,7 +401,8 @@ phys_addr_t memblock_phys_alloc_range(phys_addr_t size, phys_addr_t align,
>>>>    				      phys_addr_t start, phys_addr_t end);
>>>>    phys_addr_t memblock_alloc_range_nid(phys_addr_t size,
>>>>    				      phys_addr_t align, phys_addr_t start,
>>>> -				      phys_addr_t end, int nid, bool exact_nid);
>>>> +				      phys_addr_t end, int nid, bool exact_nid,
>>>> +				      phys_addr_t hugepage_size);
>>>
>>> Rather than adding yet another parameter to memblock_phys_alloc_range() we
>>> can have an API that sets a flag on the reserved regions.
>>> With this the hugetlb reservation code can set a flag when HVO is
>>> enabled and memmap_init_reserved_pages() will skip regions with this flag
>>> set.
>>>
>>
>> Hi,
>>
>> Thanks for the review.
>>
>> I think you meant memblock_alloc_range_nid/memblock_alloc_try_nid_raw and
>> not memblock_phys_alloc_range?
> 
> Yes.
>   
>> My initial approach was to use flags, but I think it looks worse than what I
>> have done in this RFC (I have pushed the flags prototype at
>> https://github.com/uarif1/linux/commits/flags_skip_prep_init_gigantic_HVO,
>> top 4 commits for reference (the main difference is patch 2 and 4 from
>> RFC)). The major points are (the bigger issue is in patch 4):
>>
>> - (RFC vs flags patch 2 comparison) In the RFC, hugepage_size is propagated
>> from memblock_alloc_try_nid_raw through function calls. When using flags,
>> the "no_init" boolean is propogated from memblock_alloc_try_nid_raw through
>> function calls until the region flags are available in memblock_add_range
>> and the new MEMBLOCK_NOINIT flag is set. I think its a bit more tricky to
>> introduce a new function to set the flag in the region AFTER the call to
>> memblock_alloc_try_nid_raw has finished as the memblock_region can not be
>> found.
>> So something (hugepage_size/flag information) still has to be propagated
>> through function calls and a new argument needs to be added.
> 
> Sorry if I wasn't clear. I didn't mean to add flags parameter, I meant to
> add a flag and a function that sets this flag for a range. So for
> MEMBLOCK_NOINIT there would be
> 
> int memblock_mark_noinit(phys_addr_t base, phys_addr_t size);
> 
> I'd just name this flag MEMBLOCK_RSRV_NOINIT to make it clear it controls
> the reserved regions.
> 
> This won't require updating all call sites of memblock_alloc_range_nid()
> and memblock_alloc_try_nid_raw() but only a small refactoring of
> memblock_setclr_flag() and its callers.
> 

Thanks for this, its much cleaner doing the way you described. I have 
sent v1 implementing this 
https://lore.kernel.org/all/20230727204624.1942372-1-usama.arif@bytedance.com/.

Regards,
Usama



  reply	other threads:[~2023-07-27 20:56 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-24 13:46 [RFC 0/4] mm/memblock: Skip prep and initialization of struct pages freed later by HVO Usama Arif
2023-07-24 13:46 ` [RFC 1/4] mm/hugetlb: Skip prep of tail pages when HVO is enabled Usama Arif
2023-07-24 13:46 ` [RFC 2/4] mm/memblock: Add hugepage_size member to struct memblock_region Usama Arif
2023-07-26 11:01   ` Mike Rapoport
2023-07-26 15:02     ` [External] " Usama Arif
2023-07-27  4:30       ` Mike Rapoport
2023-07-27 20:56         ` Usama Arif [this message]
2023-07-24 13:46 ` [RFC 3/4] mm/hugetlb_vmemmap: Use nid of the head page to reallocate it Usama Arif
2023-07-24 13:46 ` [RFC 4/4] mm/memblock: Skip initialization of struct pages freed later by HVO Usama Arif
2023-07-26 10:34 ` [RFC 0/4] mm/memblock: Skip prep and " Usama Arif

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=383a72f7-b9d3-963f-e5fb-b0036376399e@bytedance.com \
    --to=usama.arif@bytedance.com \
    --cc=fam.zheng@bytedance.com \
    --cc=liangma@liangbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=punit.agrawal@bytedance.com \
    --cc=rppt@kernel.org \
    --cc=simon.evans@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).