Re: [PATCH 3/3] x86/sbm: Derive leaf granularity from LLC cacheinfo instead of topology domain

The Linux Kernel Mailing List
 help / color / mirror / Atom feed

From: K Prateek Nayak <kprateek.nayak@amd.com>
To: Chen Yu <yu.c.chen@intel.com>, <tim.c.chen@linux.intel.com>,
	<peterz@infradead.org>
Cc: <pan.deng@intel.com>, <mingo@kernel.org>,
	<linux-kernel@vger.kernel.org>, <tianyou.li@intel.com>
Subject: Re: [PATCH 3/3] x86/sbm: Derive leaf granularity from LLC cacheinfo instead of topology domain
Date: Mon, 11 May 2026 13:18:41 +0530	[thread overview]
Message-ID: <d8bf62c3-8f4a-43f3-b7f5-a3213722f6dd@amd.com> (raw)
In-Reply-To: <20260510155920.2587431-4-yu.c.chen@intel.com>

Hello Chenyu, Tim,

Thank you for working on it!

I got distracted seeing if there was a way to make the sbitmap work here
with minimal overhead but that data structure is too over engineered for
block layer specific use cases (separate cacheline to track clearing
with delayed clearing, alloc_hints, and other fancy bells and whistles)
and it took me a while to get something going with that.

Perhaps keeping the minimal sbm functionality out separate is a good way
forward.

On 5/10/2026 9:29 PM, Chen Yu wrote:
> @@ -317,12 +319,16 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id)
>  	if (c->x86 < 0x17) {
>  		/* Pre-Zen: LLC is at the node level */
>  		c->topo.llc_id = die_id;
> +		if (c == &boot_cpu_data)
> +			arch_sbm_shift = topology_get_domain_shift(TOPO_DIE_DOMAIN);
>  	} else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
>  		/*
>  		 * Family 17h up to 1F models: LLC is at the core
>  		 * complex level.  Core complex ID is ApicId[3].
>  		 */
>  		c->topo.llc_id = c->topo.apicid >> 3;
> +		if (c == &boot_cpu_data)
> +			arch_sbm_shift = 3;
>  	} else {
>  		/*
>  		 * Newer families: LLC ID is calculated from the number
> @@ -331,8 +337,11 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id)
>  		u32 llc_index = find_num_cache_leaves(c) - 1;
>  		struct _cpuid4_info id4 = {};
>  
> -		if (!amd_fill_cpuid4_info(llc_index, &id4))
> +		if (!amd_fill_cpuid4_info(llc_index, &id4)) {
>  			c->topo.llc_id = get_cache_id(c->topo.apicid, &id4);
> +			if (c == &boot_cpu_data)
> +				arch_sbm_shift = get_count_order(1 + id4.eax.split.num_threads_sharing);
> +		}

So I'm slightly skeptical on AMD's heterogenous processors based systems
getting this right but I have to get my hands on one to confirm.

Either ways, it seems like an AMD specific problem that I'll chase down
if it exists but this should be fine from testing perspective on your
system.

>  struct sbm *sbm_alloc(void)
>  {
> -	unsigned int nr = arch_sbm_leafs;
> -	unsigned int nbits = 1U << arch_sbm_shift;
> -	unsigned int nlongs = BITS_TO_LONGS(nbits);
> -	struct sbm_root *root = kzalloc_flex(*root, leafs, nr);
> +	unsigned int nr;
> +	unsigned int nbits;
> +	unsigned int nlongs;
> +	struct sbm_root *root;
>  	struct sbm_leaf *leaf;
> +
> +	if (!arch_sbm_shift) {
> +		unsigned int max_idx = num_possible_cpus();
> +
> +		/*
> +		 * unsigned long is the base unit for bitmap in sbm_leaf.
> +		 * Use that for default bitmap size for compact bitmap
> +		 * without unused bits.
> +		 */
> +		arch_sbm_shift = BYTES_TO_BITS(sizeof(unsigned long));
> +		arch_sbm_leafs = 1 + (max_idx >> arch_sbm_shift);
> +		arch_sbm_mask = (1 << arch_sbm_shift) - 1;
> +		arch_sbm_bits = arch_sbm_shift;

Side note:

So while chasing sbitmap, I realized there are some users of sbitmap out
there that are essentially using its minimal functionality that smb
provides and can be converted over to save an extra cacheline worth of
overhead.

Does it make sense to keep the arch_sbm_* stuff specific to the
scheduler and allow wider use of sbm for any sparse bitmap usage?

> +	}
> +
> +	nr = arch_sbm_leafs;
> +	nbits = 1U << arch_sbm_shift;
> +	nlongs = BITS_TO_LONGS(nbits);
> +	root = kzalloc_flex(*root, leafs, nr);
>  	if (!root)
>  		return NULL;
>  

My QEMU has suddenly refused to boot after the conversion to cache
properties leaf changes so I'll try to see why that is the case.

-- 
Thanks and Regards,
Prateek

next prev parent reply	other threads:[~2026-05-11  7:48 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <729726b9-c669-41e2-887d-bdf9da703034@amd.com>
2026-05-10 15:59 ` [PATCH v2 1/4] sched/rt: Optimize cpupri_vec layout to mitigate cache line contention Chen Yu
2026-05-10 15:59   ` [PATCH 1/3] x86/sbm: Fix domain shift calculation and sbm_find_next_bit() Chen Yu
2026-05-10 15:59   ` [PATCH 2/3] lib/sbm: Use dynamically sized bitmap in sbm_leaf Chen Yu
2026-05-10 15:59   ` [PATCH 3/3] x86/sbm: Derive leaf granularity from LLC cacheinfo instead of topology domain Chen Yu
2026-05-11  7:48     ` K Prateek Nayak [this message]
2026-05-12  9:29       ` Chen, Yu C

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d8bf62c3-8f4a-43f3-b7f5-a3213722f6dd@amd.com \
    --to=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=pan.deng@intel.com \
    --cc=peterz@infradead.org \
    --cc=tianyou.li@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox