From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 58EE72E718B for ; Wed, 29 Apr 2026 15:46:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777477587; cv=none; b=bp72isyP6np2U9Zs72MxYIGBxklyPGFankAVT9+oxGCL8f12e9JDY6RH5Azy/4J0x98uLWM3YMTCL7u8eU00XdKi6gd3nKAnu9D9wtMTLc7c7oPMc5KQ7vRdeqRuNrw78uhdEROnTEDfH0toN0IzYbGlJW//Cvc8p+nHt8Ijpfs= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777477587; c=relaxed/simple; bh=Co30pJpyLq/gHwEH42TA9nVCeF6IRKPicZj8inleQ40=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ftIXn1Ip4ZB9PnaKGctNjvs0UuBuCaF1EI4KjU1GMUeALKVx8tRiOy2/eAYYfXefXdVPIkxVWPnn9ABcPJ4GvREj/SFW+9t+Lqkx/F6LVdZftMwPmAitpLHJJhLpeQkkGxkWy+1jE0qd0w2iGo2JvzFsrlwiaCBU6+HsPkRW3GI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--joonwonkang.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=kw/OWNzd; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--joonwonkang.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="kw/OWNzd" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-35da99b90f6so15062202a91.1 for ; Wed, 29 Apr 2026 08:46:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1777477585; x=1778082385; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Wv7l/d7stfmZDhBxwEoQyea5vyEpHne3OgvUxiVZSCI=; b=kw/OWNzdHoYgBssNPCvZWcnGuggp51fXL+e+r3I4hnbB4Agbf+4klHy2vwFoYHTi9B XvXWrptqU0RjAE1qx7BswfozXifbm4zHtFMyhQd8Tun+IhnY4L0ZfE6Yfx/tTAVY3QH1 UrZZ9pSKvwhUWZZ3MXWdbiGABn0cWH95TS/nSoX079vtghZfK1a8H9YHGiCxDtKFwMCx kpvbduutDMKOkhBbnMRPkvXsM0+76Wq/ZXnseXcYu2zbrMTrC2ITCqeuRQhe5UjbNaI4 /qOhvMYit8nsSCnLDYTHdPhiLkL3Bc3AjoKcU44aeZeEhVwkkAdHcitP0THChym3zkC7 I7zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777477585; x=1778082385; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Wv7l/d7stfmZDhBxwEoQyea5vyEpHne3OgvUxiVZSCI=; b=P7eU4XzhaC0wTprogjmBdnKAWS54NUbTryI5mkMThZKccOEtjxCLm/gBRkuH13zC6S 7+bkOpQGNWhCR11NstiWzlcs40Jane3K56Ud9nMF+CqhLnkBqYANtaF8HgupCd6sWWZY s7ObQ8iOxt0asmMHX6aOkqK93n94ADHkw93NrwZMC6wepZaxGeGg1R0VHgzGJP1C6Sxs UtnqRXeHdl3DHSBhxaVLBqXlRFg7/CRwD/pk/gxLirT0yW42wfSmMCa8R6kzjMpXCZ5/ tKIIvOJDrLzh2a/GQ4w3+0pNrI/iUcuFd3xTJVpc2uMKjLImOIJyu5VI89snanJ5A5fY js8Q== X-Forwarded-Encrypted: i=1; AFNElJ/JLLTj3YzURA9xXbQLrGKFdu6IwRrNu+I6nf/ZWQk4E6duVUhejawm20Cgk270DJ9af4uVhYg6F4HWMzg=@vger.kernel.org X-Gm-Message-State: AOJu0YxK5yrGXQqSQRQQYEnU8RC2YmVLMUtlENm3KJ1TO/2X0iHjGN1Y DHqmJTf4oI48yUgfLTfAV6ZZg8wzYyP5rQAP9hLbMfLYCNxMzB6pXWUy0jNisrhLWNGnKryqn7F gBQgs6xkzGT9FSF4hVCrba11mew== X-Received: from pjbfa16.prod.google.com ([2002:a17:90a:f0d0:b0:35f:bf4f:6d9b]) (user=joonwonkang job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:264c:b0:35b:a418:698c with SMTP id 98e67ed59e1d1-364a0b82c88mr4365221a91.12.1777477585304; Wed, 29 Apr 2026 08:46:25 -0700 (PDT) Date: Wed, 29 Apr 2026 15:46:23 +0000 In-Reply-To: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.54.0.545.g6539524ca2-goog Message-ID: <20260429154623.2389935-1-joonwonkang@google.com> Subject: Re: [PATCH v3 3/3] percpu: Fix hint invariant breakage From: Joonwon Kang To: dennis@kernel.org Cc: akpm@linux-foundation.org, cl@gentwo.org, dodam@google.com, joonwonkang@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, tj@kernel.org Content-Type: text/plain; charset="UTF-8" Appreciate your insightful reviews! > On Fri, Apr 10, 2026 at 05:44:17PM +0000, Joonwon Kang wrote: > > The invariant "scan_hint_start > contig_hint_start if and only if > > scan_hint == contig_hint" should be kept for hint management. However, > > it could be broken in some cases: > > > > - if (new contig == contig_hint == scan_hint) && (contig_hint_start < > > scan_hint_start < new contig start) && the new contig is to become a > > new contig_hint due to its better alignment, then scan_hint should > > be invalidated instead of keeping the old value. > > > > - if (new contig == contig_hint > scan_hint) && (new contig start < > > contig_hint_start) && the new contig is not to become a new > > contig_hint, then scan_hint should be not updated to the new contig. > > > > This commit mainly fixes this invariant breakage and includes more: > > > > - Refactor the percpu block update code to make it more visible on > > what to consider, e.g. when the new contig overlaps with the old > > contig_hint or scan_hint. > > > > Could you please split this up between the fixing the scan_hint issues > and then refactoring. That way this commit isn't doing too much. > Agree that this commit is doing much of work. However, when I tried only fixing the scan_hint issues without refactoring earlier, I realized that the overlapping cases were also not being handled properly in this function. Without handling the overlapping cases, there could be other hidden scan_hint breakage issues. Handling the overlapping cases required restructuring of the function, which led to the changes to the code that were also required for resolving the scan_hint issues. Let me see if I can decently split them. > > - Merge the new contig with other hints when it overlaps with them and > > treat it as a whole free region instead of a separate small region. > > > > - Fix the invariant breakage and also optimizes scan_hint further. > > Some of the optimization cases when no overlap occurs are: > > > > - if (new contig > contig_hint > scan_hint) && (scan_hint_start < new > > contig start < contig_hint_start), then keep scan_hint instead of > > invalidating it. > > > > - if (new contig > contig_hint == scan_hint) && (contig_hint_start < > > new contig start < scan_hint_start), then update scan_hint to the > > old contig_hint instead of invalidating it. > > > > - if (new contig == contig_hint > scan_hint) && (new contig start < > > contig_hint_start) && the new contig is to become a new contig_hint > > due to its better alignment, then update scan_hint to the old > > contig_hint instead of invalidating or keeping it. > > > > Signed-off-by: Joonwon Kang > > --- > > v2 -> v3: Merge the new contig with other hints when it overlaps with > > them and treat it as a whole free region instead of a separate small > > region. > > > > v1 -> v2: Consider cases where the new contig overlaps with the existing > > contig_hint or scan_hint. > > > > mm/percpu.c | 130 ++++++++++++++++++++++++++++++++++++---------------- > > 1 file changed, 90 insertions(+), 40 deletions(-) > > > > diff --git a/mm/percpu.c b/mm/percpu.c > > index f16533ed4a49..d5b0b4863ffe 100644 > > --- a/mm/percpu.c > > +++ b/mm/percpu.c > > @@ -629,7 +629,27 @@ static inline bool pcpu_region_overlap(int a, int b, int x, int y) > > */ > > static void pcpu_block_update(struct pcpu_block_md *block, int start, int end) > > { > > - int contig = end - start; > > + int contig; > > + int scan_hint_cand_1 = 0; > > + int scan_hint_cand_1_start = 0; > > + int scan_hint_cand_2 = 0; > > + int scan_hint_cand_2_start = 0; > > I'm not really a fan of this cand_1 and cand_2. Re-reading all this > code. I think I really should have introduced something like: > > struct pcpu_region { > int start; > int size; > } > > It's really easy to mix contig_hint and contig_hint_start. If you don't > mind, it would be ideal if we could introduce it but it might be a > non-trivial amount of refactor. > > I'd say the shortest path forward would be to do just to fix the > scan_hint issues. > It will be great if we introduce the struct. I will do it as a separate patch in v4. > > + bool overlap_with_contig_hint = pcpu_region_overlap(start, end, > > + block->contig_hint_start, > > + block->contig_hint_start + block->contig_hint); > > + bool overlap_with_scan_hint = pcpu_region_overlap(start, end, > > + block->scan_hint_start, > > + block->scan_hint_start + block->scan_hint); > > + > > This one isn't used again so we should probably just inline it. > Acknowledged. > > + if (block->contig_hint && overlap_with_contig_hint) { > > + start = min(start, block->contig_hint_start); > > + end = max(end, block->contig_hint_start + block->contig_hint); > > + } > > + if (block->scan_hint && overlap_with_scan_hint) { > > + start = min(start, block->scan_hint_start); > > + end = max(end, block->scan_hint_start + block->scan_hint); > > + } > > + contig = end - start; > > > > block->first_free = min(block->first_free, start); > > if (start == 0) > > @@ -646,56 +666,86 @@ static void pcpu_block_update(struct pcpu_block_md *block, int start, int end) > > } > > > > if (contig > block->contig_hint) { > > - /* promote the old contig_hint to be the new scan_hint */ > > - if (start > block->contig_hint_start) { > > - if (block->contig_hint > block->scan_hint) { > > - block->scan_hint_start = > > - block->contig_hint_start; > > - block->scan_hint = block->contig_hint; > > - } else if (start < block->scan_hint_start) { > > - /* > > - * The old contig_hint == scan_hint. But, the > > - * new contig is larger so hold the invariant > > - * scan_hint_start < contig_hint_start. > > - */ > > - block->scan_hint = 0; > > - } > > - } else { > > - block->scan_hint = 0; > > + if (!overlap_with_contig_hint) { > > + scan_hint_cand_1 = block->contig_hint; > > + scan_hint_cand_1_start = block->contig_hint_start; > > } > > - block->contig_hint_start = start; > > + > > block->contig_hint = contig; > > + block->contig_hint_start = start; > > } else if (contig == block->contig_hint) { > > if (block->contig_hint_start && > > (!start || > > __ffs(start) > __ffs(block->contig_hint_start))) { > > - /* start has a better alignment so use it */ > > + scan_hint_cand_1 = block->contig_hint; > > + scan_hint_cand_1_start = block->contig_hint_start; > > + > > + /* Start has a better alignment so use it. */ > > block->contig_hint_start = start; > > - if (start < block->scan_hint_start && > > - block->contig_hint > block->scan_hint) > > - block->scan_hint = 0; > > - } else if (start > block->scan_hint_start || > > - block->contig_hint > block->scan_hint) { > > - /* > > - * Knowing contig == contig_hint, update the scan_hint > > - * if it is farther than or larger than the current > > - * scan_hint. > > - */ > > - block->scan_hint_start = start; > > - block->scan_hint = contig; > > + } else { > > + if (!overlap_with_contig_hint) { > > + scan_hint_cand_1 = contig; > > + scan_hint_cand_1_start = start; > > + } > > } > > } else { > > /* > > - * The region is smaller than the contig_hint. So only update > > - * the scan_hint if it is larger than or equal and farther than > > - * the current scan_hint. > > + * Consider only when the new contig is larger than or equal to > > + * the old scan hint. > > */ > > - if ((start < block->contig_hint_start && > > - (contig > block->scan_hint || > > - (contig == block->scan_hint && > > - start > block->scan_hint_start)))) { > > - block->scan_hint_start = start; > > - block->scan_hint = contig; > > + if (contig >= block->scan_hint) { > > + scan_hint_cand_1 = contig; > > + scan_hint_cand_1_start = start; > > + } > > + } > > + > > + if (block->scan_hint && > > + !pcpu_region_overlap(start, end, block->scan_hint_start, > > + block->scan_hint_start + block->scan_hint)) { > > + scan_hint_cand_2 = block->scan_hint; > > + scan_hint_cand_2_start = block->scan_hint_start; > > + } > > + > > + /* Make scan_hint_cand_1 be the best candidate for the new scan hint. */ > > + if ((scan_hint_cand_2 > scan_hint_cand_1) || > > + (scan_hint_cand_2 == scan_hint_cand_1 && > > + scan_hint_cand_2_start > scan_hint_cand_1_start)) { > > + int tmp_hint = scan_hint_cand_1; > > + int tmp_hint_start = scan_hint_cand_1_start; > > + > > + scan_hint_cand_1 = scan_hint_cand_2; > > + scan_hint_cand_1_start = scan_hint_cand_2_start; > > + scan_hint_cand_2 = tmp_hint; > > + scan_hint_cand_2_start = tmp_hint_start; > > + } > > + > > + /* > > + * At this point, it is guaranteed that none of the scan hint > > + * candidates overlaps with the new contig hint while they may overlap > > + * with the old scan hint, and that the first candidate is larger in > > + * size or, it equal, farther than the second one. > > + */ > > + > > + if (block->contig_hint > scan_hint_cand_1) { > > + if (scan_hint_cand_1_start < block->contig_hint_start) { > > + block->scan_hint = scan_hint_cand_1; > > + block->scan_hint_start = scan_hint_cand_1_start; > > + } else if (scan_hint_cand_2_start < block->contig_hint_start) { > > + block->scan_hint = scan_hint_cand_2; > > + block->scan_hint_start = scan_hint_cand_2_start; > > + } else { > > + block->scan_hint = 0; > > + } > > + } else if (block->contig_hint == scan_hint_cand_1) { > > + if (scan_hint_cand_1_start > block->contig_hint_start) { > > + block->scan_hint = scan_hint_cand_1; > > + block->scan_hint_start = scan_hint_cand_1_start; > > + } else if (scan_hint_cand_2 < block->contig_hint && > > + scan_hint_cand_2_start < scan_hint_cand_1_start) { > > + block->scan_hint = scan_hint_cand_2; > > + block->scan_hint_start = scan_hint_cand_2_start; > > + } else { > > + block->scan_hint = 0; > > } > > > This seems easier to invert and do something like this: > > Probably something like: > > bool overlap_contig_hint = pcpu_region_overlap(new_region, scan_hint); > bool overlap_scan_hint = pcpu_region_overlap(new_region, scan_hint); > > if (overlap_scan_hint) { > scan_hint = 0; > } > > if (overlap_contig_hint) { > contig_hint = new_region; > return; > } > > // Now the new_region is distinct from the contig_hint and then can do > // the standard logic here. > > if (new_region > contig_hint) { } > else if (new_region == contig_hint) {} > else {} > > What do you think? I think makes cand_1 and cand_2 not necessary. > Hmm, with this, if contig_hint == scan_hint && new_region overlaps only with contig_hint && the entire overlapped region > contig_hint, then we will have a wrong scan_hint, i.e. new contig_hint_start < scan_hint_start even though new contig_hint > scan_hint. But, I can see your point here. Let me see if I can get rid of cand_1 and cand_2. > > } > > } > > -- > > 2.53.0.1213.gd9a14994de-goog > > Thanks, Joonwon Kang