public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Joonwon Kang <joonwonkang@google.com>
To: dennis@kernel.org
Cc: akpm@linux-foundation.org, cl@gentwo.org, dodam@google.com,
	 joonwonkang@google.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org,  tj@kernel.org
Subject: Re: [PATCH v3 3/3] percpu: Fix hint invariant breakage
Date: Wed, 29 Apr 2026 15:46:23 +0000	[thread overview]
Message-ID: <20260429154623.2389935-1-joonwonkang@google.com> (raw)
In-Reply-To: <aegQgyf3KuIZMK9x@palisades.local>

Appreciate your insightful reviews!

> On Fri, Apr 10, 2026 at 05:44:17PM +0000, Joonwon Kang wrote:
> > The invariant "scan_hint_start > contig_hint_start if and only if
> > scan_hint == contig_hint" should be kept for hint management. However,
> > it could be broken in some cases:
> > 
> >   - if (new contig == contig_hint == scan_hint) && (contig_hint_start <
> >     scan_hint_start < new contig start) && the new contig is to become a
> >     new contig_hint due to its better alignment, then scan_hint should
> >     be invalidated instead of keeping the old value.
> > 
> >   - if (new contig == contig_hint > scan_hint) && (new contig start <
> >     contig_hint_start) && the new contig is not to become a new
> >     contig_hint, then scan_hint should be not updated to the new contig.
> > 
> > This commit mainly fixes this invariant breakage and includes more:
> > 
> >   - Refactor the percpu block update code to make it more visible on
> >     what to consider, e.g. when the new contig overlaps with the old
> >     contig_hint or scan_hint.
> > 
> 
> Could you please split this up between the fixing the scan_hint issues
> and then refactoring. That way this commit isn't doing too much.
> 

Agree that this commit is doing much of work. However, when I tried only
fixing the scan_hint issues without refactoring earlier, I realized that
the overlapping cases were also not being handled properly in this
function. Without handling the overlapping cases, there could be other
hidden scan_hint breakage issues. Handling the overlapping cases required
restructuring of the function, which led to the changes to the code that
were also required for resolving the scan_hint issues. Let me see if I
can decently split them.

> >   - Merge the new contig with other hints when it overlaps with them and
> >     treat it as a whole free region instead of a separate small region.
> > 
> >   - Fix the invariant breakage and also optimizes scan_hint further.
> >     Some of the optimization cases when no overlap occurs are:
> > 
> >     - if (new contig > contig_hint > scan_hint) && (scan_hint_start < new
> >       contig start < contig_hint_start), then keep scan_hint instead of
> >       invalidating it.
> > 
> >     - if (new contig > contig_hint == scan_hint) && (contig_hint_start <
> >       new contig start < scan_hint_start), then update scan_hint to the
> >       old contig_hint instead of invalidating it.
> > 
> >     - if (new contig == contig_hint > scan_hint) && (new contig start <
> >       contig_hint_start) && the new contig is to become a new contig_hint
> >       due to its better alignment, then update scan_hint to the old
> >       contig_hint instead of invalidating or keeping it.
> > 
> > Signed-off-by: Joonwon Kang <joonwonkang@google.com>
> > ---
> > v2 -> v3: Merge the new contig with other hints when it overlaps with
> >   them and treat it as a whole free region instead of a separate small
> >   region.
> > 
> > v1 -> v2: Consider cases where the new contig overlaps with the existing
> >   contig_hint or scan_hint.
> > 
> >  mm/percpu.c | 130 ++++++++++++++++++++++++++++++++++++----------------
> >  1 file changed, 90 insertions(+), 40 deletions(-)
> > 
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index f16533ed4a49..d5b0b4863ffe 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -629,7 +629,27 @@ static inline bool pcpu_region_overlap(int a, int b, int x, int y)
> >   */
> >  static void pcpu_block_update(struct pcpu_block_md *block, int start, int end)
> >  {
> > -	int contig = end - start;
> > +	int contig;
> > +	int scan_hint_cand_1 = 0;
> > +	int scan_hint_cand_1_start = 0;
> > +	int scan_hint_cand_2 = 0;
> > +	int scan_hint_cand_2_start = 0;
> 
> I'm not really a fan of this cand_1 and cand_2. Re-reading all this
> code. I think I really should have introduced something like:
> 
> struct pcpu_region {
>     int start;
>     int size;
> }
> 
> It's really easy to mix contig_hint and contig_hint_start. If you don't
> mind, it would be ideal if we could introduce it but it might be a
> non-trivial amount of refactor.
> 
> I'd say the shortest path forward would be to do just to fix the
> scan_hint issues.
> 

It will be great if we introduce the struct. I will do it as a separate
patch in v4.

> > +	bool overlap_with_contig_hint = pcpu_region_overlap(start, end,
> > +		block->contig_hint_start,
> > +		block->contig_hint_start + block->contig_hint);
> > +	bool overlap_with_scan_hint = pcpu_region_overlap(start, end,
> > +		block->scan_hint_start,
> > +		block->scan_hint_start + block->scan_hint);
> > +
> 
> This one isn't used again so we should probably just inline it.
> 

Acknowledged.

> > +	if (block->contig_hint && overlap_with_contig_hint) {
> > +		start = min(start, block->contig_hint_start);
> > +		end = max(end, block->contig_hint_start + block->contig_hint);
> > +	}
> > +	if (block->scan_hint && overlap_with_scan_hint) {
> > +		start = min(start, block->scan_hint_start);
> > +		end = max(end, block->scan_hint_start + block->scan_hint);
> > +	}
> > +	contig = end - start;
> >  
> >  	block->first_free = min(block->first_free, start);
> >  	if (start == 0)
> > @@ -646,56 +666,86 @@ static void pcpu_block_update(struct pcpu_block_md *block, int start, int end)
> >  	}
> >  
> >  	if (contig > block->contig_hint) {
> > -		/* promote the old contig_hint to be the new scan_hint */
> > -		if (start > block->contig_hint_start) {
> > -			if (block->contig_hint > block->scan_hint) {
> > -				block->scan_hint_start =
> > -					block->contig_hint_start;
> > -				block->scan_hint = block->contig_hint;
> > -			} else if (start < block->scan_hint_start) {
> > -				/*
> > -				 * The old contig_hint == scan_hint.  But, the
> > -				 * new contig is larger so hold the invariant
> > -				 * scan_hint_start < contig_hint_start.
> > -				 */
> > -				block->scan_hint = 0;
> > -			}
> > -		} else {
> > -			block->scan_hint = 0;
> > +		if (!overlap_with_contig_hint) {
> > +			scan_hint_cand_1 = block->contig_hint;
> > +			scan_hint_cand_1_start = block->contig_hint_start;
> >  		}
> > -		block->contig_hint_start = start;
> > +
> >  		block->contig_hint = contig;
> > +		block->contig_hint_start = start;
> >  	} else if (contig == block->contig_hint) {
> >  		if (block->contig_hint_start &&
> >  		    (!start ||
> >  		     __ffs(start) > __ffs(block->contig_hint_start))) {
> > -			/* start has a better alignment so use it */
> > +			scan_hint_cand_1 = block->contig_hint;
> > +			scan_hint_cand_1_start = block->contig_hint_start;
> > +
> > +			/* Start has a better alignment so use it. */
> >  			block->contig_hint_start = start;
> > -			if (start < block->scan_hint_start &&
> > -			    block->contig_hint > block->scan_hint)
> > -				block->scan_hint = 0;
> > -		} else if (start > block->scan_hint_start ||
> > -			   block->contig_hint > block->scan_hint) {
> > -			/*
> > -			 * Knowing contig == contig_hint, update the scan_hint
> > -			 * if it is farther than or larger than the current
> > -			 * scan_hint.
> > -			 */
> > -			block->scan_hint_start = start;
> > -			block->scan_hint = contig;
> > +		} else {
> > +			if (!overlap_with_contig_hint) {
> > +				scan_hint_cand_1 = contig;
> > +				scan_hint_cand_1_start = start;
> > +			}
> >  		}
> >  	} else {
> >  		/*
> > -		 * The region is smaller than the contig_hint.  So only update
> > -		 * the scan_hint if it is larger than or equal and farther than
> > -		 * the current scan_hint.
> > +		 * Consider only when the new contig is larger than or equal to
> > +		 * the old scan hint.
> >  		 */
> > -		if ((start < block->contig_hint_start &&
> > -		     (contig > block->scan_hint ||
> > -		      (contig == block->scan_hint &&
> > -		       start > block->scan_hint_start)))) {
> > -			block->scan_hint_start = start;
> > -			block->scan_hint = contig;
> > +		if (contig >= block->scan_hint) {
> > +			scan_hint_cand_1 = contig;
> > +			scan_hint_cand_1_start = start;
> > +		}
> > +	}
> > +
> > +	if (block->scan_hint &&
> > +	    !pcpu_region_overlap(start, end, block->scan_hint_start,
> > +				 block->scan_hint_start + block->scan_hint)) {
> > +		scan_hint_cand_2 = block->scan_hint;
> > +		scan_hint_cand_2_start = block->scan_hint_start;
> > +	}
> > +
> > +	/* Make scan_hint_cand_1 be the best candidate for the new scan hint. */
> > +	if ((scan_hint_cand_2 > scan_hint_cand_1) ||
> > +	    (scan_hint_cand_2 == scan_hint_cand_1 &&
> > +	     scan_hint_cand_2_start > scan_hint_cand_1_start)) {
> > +		int tmp_hint = scan_hint_cand_1;
> > +		int tmp_hint_start = scan_hint_cand_1_start;
> > +
> > +		scan_hint_cand_1 = scan_hint_cand_2;
> > +		scan_hint_cand_1_start = scan_hint_cand_2_start;
> > +		scan_hint_cand_2 = tmp_hint;
> > +		scan_hint_cand_2_start = tmp_hint_start;
> > +	}
> > +
> > +	/*
> > +	 * At this point, it is guaranteed that none of the scan hint
> > +	 * candidates overlaps with the new contig hint while they may overlap
> > +	 * with the old scan hint, and that the first candidate is larger in
> > +	 * size or, it equal, farther than the second one.
> > +	 */
> > +
> > +	if (block->contig_hint > scan_hint_cand_1) {
> > +		if (scan_hint_cand_1_start < block->contig_hint_start) {
> > +			block->scan_hint = scan_hint_cand_1;
> > +			block->scan_hint_start = scan_hint_cand_1_start;
> > +		} else if (scan_hint_cand_2_start < block->contig_hint_start) {
> > +			block->scan_hint = scan_hint_cand_2;
> > +			block->scan_hint_start = scan_hint_cand_2_start;
> > +		} else {
> > +			block->scan_hint = 0;
> > +		}
> > +	} else if (block->contig_hint == scan_hint_cand_1) {
> > +		if (scan_hint_cand_1_start > block->contig_hint_start) {
> > +			block->scan_hint = scan_hint_cand_1;
> > +			block->scan_hint_start = scan_hint_cand_1_start;
> > +		} else if (scan_hint_cand_2 < block->contig_hint &&
> > +			   scan_hint_cand_2_start < scan_hint_cand_1_start) {
> > +			block->scan_hint = scan_hint_cand_2;
> > +			block->scan_hint_start = scan_hint_cand_2_start;
> > +		} else {
> > +			block->scan_hint = 0;
> >  		}
> 
> 
> This seems easier to invert and do something like this:
> 
> Probably something like:
> 
> bool overlap_contig_hint = pcpu_region_overlap(new_region, scan_hint);
> bool overlap_scan_hint = pcpu_region_overlap(new_region, scan_hint);
> 
> if (overlap_scan_hint) {
>     scan_hint = 0;
> }
> 
> if (overlap_contig_hint) {
>     contig_hint = new_region;
>     return;
> }
> 
> // Now the new_region is distinct from the contig_hint and then can do
> // the standard logic here.
> 
> if (new_region > contig_hint) { }
> else if (new_region == contig_hint) {}
> else {}
> 
> What do you think? I think makes cand_1 and cand_2 not necessary.
> 

Hmm, with this, if contig_hint == scan_hint && new_region overlaps only
with contig_hint && the entire overlapped region > contig_hint, then we
will have a wrong scan_hint, i.e. new contig_hint_start < scan_hint_start
even though new contig_hint > scan_hint.

But, I can see your point here. Let me see if I can get rid of cand_1 and
cand_2.

> >  	}
> >  }
> > -- 
> > 2.53.0.1213.gd9a14994de-goog
> > 

Thanks,
Joonwon Kang

  reply	other threads:[~2026-04-29 15:46 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10 17:44 [PATCH v3 1/3] percpu: Fix wrong chunk hints update Joonwon Kang
2026-04-10 17:44 ` [PATCH v3 2/3] percpu: Do not trust hint starts when they are not set Joonwon Kang
2026-04-21 23:28   ` Dennis Zhou
2026-04-29 13:03     ` Joonwon Kang
2026-04-10 17:44 ` [PATCH v3 3/3] percpu: Fix hint invariant breakage Joonwon Kang
2026-04-22  0:04   ` Dennis Zhou
2026-04-29 15:46     ` Joonwon Kang [this message]
2026-04-21 23:26 ` [PATCH v3 1/3] percpu: Fix wrong chunk hints update Dennis Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260429154623.2389935-1-joonwonkang@google.com \
    --to=joonwonkang@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@gentwo.org \
    --cc=dennis@kernel.org \
    --cc=dodam@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox