From: David Hildenbrand <david@redhat.com>
To: Zi Yan <ziy@nvidia.com>, Vlastimil Babka <vbabka@suse.cz>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linuxppc-dev@lists.ozlabs.org,
Andrew Morton <akpm@linux-foundation.org>,
Oscar Salvador <osalvador@suse.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Naveen N Rao <naveen@kernel.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>
Subject: Re: [PATCH RESEND v2 4/6] mm/page_alloc: sort out the alloc_contig_range() gfp flags mess
Date: Tue, 3 Dec 2024 20:07:36 +0100
Message-ID: <ae0b70e7-bc7f-4f43-82af-0e0c1a02f735@redhat.com>
In-Reply-To: <498871B1-D26C-4934-8E89-C6C8ECE8872A@nvidia.com>
On 03.12.24 16:49, Zi Yan wrote:
> On 3 Dec 2024, at 9:24, Vlastimil Babka wrote:
>
>> On 12/3/24 15:12, David Hildenbrand wrote:
>>> On 03.12.24 14:55, Vlastimil Babka wrote:
>>>> On 12/3/24 10:47, David Hildenbrand wrote:
>>>>> It's all a bit complicated for alloc_contig_range(). For example, we don't
>>>>> support many flags, so let's start bailing out on unsupported
>>>>> ones -- ignoring the placement hints, as we are already given the range
>>>>> to allocate.
>>>>>
>>>>> While we currently set cc.gfp_mask, in __alloc_contig_migrate_range() we
>>>>> simply create yet another GFP mask whereby we ignore the reclaim flags
>>>>> specified by the caller. That looks very inconsistent.
>>>>>
>>>>> Let's clean it up, constructing the gfp flags used for
>>>>> compaction/migration exactly once. Update the documentation of the
>>>>> gfp_mask parameter for alloc_contig_range() and alloc_contig_pages().
>>>>>
>>>>> Acked-by: Zi Yan <ziy@nvidia.com>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>
>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>
>>>>> + /*
>>>>> + * Flags to control page compaction/migration/reclaim, to free up our
>>>>> + * page range. Migratable pages are movable, __GFP_MOVABLE is implied
>>>>> + * for them.
>>>>> + *
>>>>> + * Traditionally we always had __GFP_HARDWALL|__GFP_RETRY_MAYFAIL set,
>>>>> + * keep doing that to not degrade callers.
>>>>> + */
>>>>
>>>> Wonder if we could revisit that eventually. Why limit migration targets by
>>>> cpuset via __GFP_HARDWALL if we were not called with __GFP_HARDWALL? And why
>>>> weaken the attempts with __GFP_RETRY_MAYFAIL if we didn't specify it?
>>>
>>> See below.
>>>
>>>>
>>>> Unless I'm missing something, cc->gfp is only checked for __GFP_FS and
>>>> __GFP_NOWARN in a few places, so it's mostly migration_target_control that
>>>> the callers could meaningfully influence.
>>>
>>> Note the first change in the file, where we now use the mask instead of coming up
>>> with another one out of the blue. :)
>>
>> I know. What I wanted to say is: cc->gfp is on its own only checked in a few
>> places, but now, since we also translate it to migration_target_control's
>> gfp_mask, it's mostly that part the caller might influence with the passed
>> flags. But we still impose our own additions to it, limiting that influence.
>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index ce7589a4ec01..54594cc4f650 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -6294,7 +6294,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>>> int ret = 0;
>>> struct migration_target_control mtc = {
>>> .nid = zone_to_nid(cc->zone),
>>> - .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
>>> + .gfp_mask = cc->gfp_mask,
>>> .reason = MR_CONTIG_RANGE,
>>> };
>>>
>>> GFP_USER contains __GFP_HARDWALL. I am not sure if that matters here, but
>>
>> Yeah, I wonder if GFP_USER was used specifically for that part, or just randomly :)
>>
>>> likely the thing we are assuming here is that we are migrating a page, and
>>> usually these are user allocations (except maybe balloon and some other
>>> non-LRU movable things).
>>
>> Yeah, and user allocations obey cpusets, mempolicies, etc. But these are
>> likely somebody else's allocations that were done according to their
>> policies. With our migration we might actually be violating those, which
>> probably can't be helped (is at least migration within the same node
>> preferred? hmm). But it doesn't seem to me that our caller's restrictions
>> (if they exist, they would be enforced by __GFP_HARDWALL) are that relevant
>> for somebody else's pages?
>
> Yeah, I was wondering why current_gfp_context() is used to adjust gfp_mask,
> since current context might not be relevant. But I see it is used in
> the original code, so I did not ask. If the current context is irrelevant
> w.r.t. the to-be-migrated pages, should the current_gfp_context() part be
> removed?
Please see how current_gfp_context() is only concerned with reclaim flags
(excluding the __GFP_MOVABLE thing we unconditionally set ...).
This part makes absolute sense to respect here.
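For reference, roughly what it does (a simplified sketch of
current_gfp_context() from include/linux/sched/mm.h, details omitted):

	static inline gfp_t current_gfp_context(gfp_t flags)
	{
		unsigned int pflags = READ_ONCE(current->flags);

		/* Scoped memalloc_noio_save()/memalloc_nofs_save() contexts
		 * strip the corresponding reclaim flags ... */
		if (pflags & PF_MEMALLOC_NOIO)
			flags &= ~(__GFP_IO | __GFP_FS);
		else if (pflags & PF_MEMALLOC_NOFS)
			flags &= ~__GFP_FS;
		/* ... and PF_MEMALLOC_PIN is the __GFP_MOVABLE exception
		 * mentioned above. */
		if (pflags & PF_MEMALLOC_PIN)
			flags &= ~__GFP_MOVABLE;
		return flags;
	}

Apart from the PF_MEMALLOC_PIN case, everything it touches is about how we
may reclaim, not where we may allocate.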
So that is something different from __GFP_HARDWALL, which *we so far
unconditionally set* and which is not a "reclaim" flag.
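
To illustrate, a simplified sketch of the construction (not the exact hunk
from this patch): only the caller's reclaim flags are taken over, while the
traditional baseline keeps getting set unconditionally:

	/*
	 * Respect the caller's (context-adjusted) reclaim flags, but keep
	 * the traditional baseline so existing callers don't degrade.
	 */
	gfp_mask = current_gfp_context(gfp_mask);
	cc.gfp_mask = (gfp_mask & (__GFP_IO | __GFP_FS | __GFP_RECLAIM)) |
		      __GFP_HARDWALL | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;

Whether we should keep forcing __GFP_HARDWALL there is exactly the question
raised above.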
--
Cheers,
David / dhildenb