linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] platform/x86/intel-uncore-freq: Present unique domain ID per package
@ 2025-08-25 21:43 Srinivas Pandruvada
  2025-08-28 10:36 ` Ilpo Järvinen
  0 siblings, 1 reply; 3+ messages in thread
From: Srinivas Pandruvada @ 2025-08-25 21:43 UTC (permalink / raw)
  To: hdegoede, ilpo.jarvinen
  Cc: platform-driver-x86, linux-kernel, Srinivas Pandruvada

In partitioned systems, the domain ID is unique in the partition and a
package can have multiple partitions.

Some user-space tools, such as turbostat, assume the domain ID is unique
per package. These tools map CPU power domains, which are unique to a
package. However, this approach does not work in partitioned systems.

There is no architectural definition of "partition" to present to user
space.

To support these tools, set the domain_id to be unique per package. For
compute die IDs, uniqueness can be achieved using the platform info
cdie_mask, mirroring the behavior observed in non-partitioned systems.

For IO dies, which lack a direct CPU relationship, any unique logical
ID can be assigned. Here domain IDs for IO dies are configured after all
compute domain IDs. During the probe, keep the index of the next IO
domain ID after the last IO domain ID of the current partition. Since
CPU packages are symmetric, partition information is same for all
packages.

The Intel Speed Select driver has already implemented a similar change
to make the domain ID unique, with compute dies listed first, followed
by I/O dies.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
v2:
- Add some comments
- Change update_domain_id() to set_domian_id() to set domain_id instead of update
- cluster_info->uncore_data.domain_id += * is changed to add multiple steps to
get to this equation
- Handle case when only when no compute dies in partition 

 .../uncore-frequency/uncore-frequency-tpmi.c  | 76 ++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
index cb4905bad89b..a30a99048db9 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
@@ -369,6 +369,79 @@ static void uncore_set_agent_type(struct tpmi_uncore_cluster_info *cluster_info)
 	cluster_info->uncore_data.agent_type_mask = FIELD_GET(UNCORE_AGENT_TYPES, status);
 }
 
+#define MAX_PARTITIONS	2
+
+/* IO domain ID start index for a partition */
+static u8 io_die_start[MAX_PARTITIONS];
+
+/* Next IO domain ID index after the current partition IO die IDs */
+static u8 io_die_index_next;
+
+/* Lock to protect io_die_start, io_die_index_next */
+static DEFINE_MUTEX(domain_lock);
+
+static void set_domain_id(int id,  int num_resources,
+			  struct oobmsm_plat_info *plat_info,
+			  struct tpmi_uncore_cluster_info *cluster_info)
+{
+	u8 part_io_index = 0, cdie_range, pkg_io_index, max_dies;
+
+	if (plat_info->partition >= MAX_PARTITIONS) {
+		cluster_info->uncore_data.domain_id = id;
+		return;
+	}
+
+	if (cluster_info->uncore_data.agent_type_mask & AGENT_TYPE_CORE) {
+		cluster_info->uncore_data.domain_id = cluster_info->cdie_id;
+		return;
+	}
+
+	/* Unlikely but cdie_mask may have holes, so take range */
+	cdie_range = fls(plat_info->cdie_mask) - ffs(plat_info->cdie_mask) + 1;
+	max_dies = topology_max_dies_per_package();
+
+	/*
+	 * If the CPU doesn't enumerate dies, then just current cdie range
+	 * the max.
+	 */
+	if (cdie_range > max_dies)
+		max_dies = cdie_range;
+
+	guard(mutex)(&domain_lock);
+
+	if (!io_die_index_next)
+		io_die_index_next = max_dies;
+
+	if (!io_die_start[plat_info->partition]) {
+		io_die_start[plat_info->partition] = io_die_index_next;
+		/*
+		 * number of IO dies = num_resources - cdie_range. Hence
+		 * next partition io_die_index_next is set after IO dies
+		 * in the current partition.
+		 */
+		io_die_index_next += (num_resources - cdie_range);
+	}
+
+	/*
+	 * Index from IO die start within the partition:
+	 * This is the first valid domain after the cdies. If there are
+	 * no cdies in a partition just start from 0.
+	 * For example the current resource index 5 and cdies end at
+	 * index 3 (cdie_cnt = 4). Then the io only index 5 - 4 = 1.
+	 */
+	if (cdie_range)
+		part_io_index = id - cdie_range;
+
+	/*
+	 * Add to the IO die start index for this partition in this package
+	 * to make unique in the package.
+	 */
+	pkg_io_index = io_die_start[plat_info->partition] + part_io_index;
+
+	/* Assign this to domain ID */
+	cluster_info->uncore_data.domain_id = pkg_io_index;
+}
+
 /* Callback for sysfs read for TPMI uncore values. Called under mutex locks. */
 static int uncore_read(struct uncore_data *data, unsigned int *value, enum uncore_index index)
 {
@@ -605,11 +678,12 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
 			cluster_info->uncore_data.package_id = pkg;
 			/* There are no dies like Cascade Lake */
 			cluster_info->uncore_data.die_id = 0;
-			cluster_info->uncore_data.domain_id = i;
 			cluster_info->uncore_data.cluster_id = j;
 
 			set_cdie_id(i, cluster_info, plat_info);
 
+			set_domain_id(i, num_resources, plat_info, cluster_info);
+
 			cluster_info->uncore_root = tpmi_uncore;
 
 			if (TPMI_MINOR_VERSION(pd_info->ufs_header_ver) >= UNCORE_ELC_SUPPORTED_VERSION)
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] platform/x86/intel-uncore-freq: Present unique domain ID per package
  2025-08-25 21:43 [PATCH v2] platform/x86/intel-uncore-freq: Present unique domain ID per package Srinivas Pandruvada
@ 2025-08-28 10:36 ` Ilpo Järvinen
  2025-08-28 12:40   ` srinivas pandruvada
  0 siblings, 1 reply; 3+ messages in thread
From: Ilpo Järvinen @ 2025-08-28 10:36 UTC (permalink / raw)
  To: Srinivas Pandruvada; +Cc: Hans de Goede, platform-driver-x86, LKML

On Mon, 25 Aug 2025, Srinivas Pandruvada wrote:

> In partitioned systems, the domain ID is unique in the partition and a
> package can have multiple partitions.
> 
> Some user-space tools, such as turbostat, assume the domain ID is unique
> per package. These tools map CPU power domains, which are unique to a
> package. However, this approach does not work in partitioned systems.
> 
> There is no architectural definition of "partition" to present to user
> space.
> 
> To support these tools, set the domain_id to be unique per package. For
> compute die IDs, uniqueness can be achieved using the platform info
> cdie_mask, mirroring the behavior observed in non-partitioned systems.
> 
> For IO dies, which lack a direct CPU relationship, any unique logical
> ID can be assigned. Here domain IDs for IO dies are configured after all
> compute domain IDs. During the probe, keep the index of the next IO
> domain ID after the last IO domain ID of the current partition. Since
> CPU packages are symmetric, partition information is same for all
> packages.
> 
> The Intel Speed Select driver has already implemented a similar change
> to make the domain ID unique, with compute dies listed first, followed
> by I/O dies.
> 
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> ---
> v2:
> - Add some comments
> - Change update_domain_id() to set_domian_id() to set domain_id instead of update
> - cluster_info->uncore_data.domain_id += * is changed to add multiple steps to
> get to this equation
> - Handle case when only when no compute dies in partition 
> 
>  .../uncore-frequency/uncore-frequency-tpmi.c  | 76 ++++++++++++++++++-
>  1 file changed, 75 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> index cb4905bad89b..a30a99048db9 100644
> --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> @@ -369,6 +369,79 @@ static void uncore_set_agent_type(struct tpmi_uncore_cluster_info *cluster_info)
>  	cluster_info->uncore_data.agent_type_mask = FIELD_GET(UNCORE_AGENT_TYPES, status);
>  }
>  
> +#define MAX_PARTITIONS	2
> +
> +/* IO domain ID start index for a partition */
> +static u8 io_die_start[MAX_PARTITIONS];
> +
> +/* Next IO domain ID index after the current partition IO die IDs */
> +static u8 io_die_index_next;
> +
> +/* Lock to protect io_die_start, io_die_index_next */
> +static DEFINE_MUTEX(domain_lock);
> +
> +static void set_domain_id(int id,  int num_resources,
> +			  struct oobmsm_plat_info *plat_info,
> +			  struct tpmi_uncore_cluster_info *cluster_info)
> +{
> +	u8 part_io_index = 0, cdie_range, pkg_io_index, max_dies;
> +
> +	if (plat_info->partition >= MAX_PARTITIONS) {
> +		cluster_info->uncore_data.domain_id = id;
> +		return;
> +	}
> +
> +	if (cluster_info->uncore_data.agent_type_mask & AGENT_TYPE_CORE) {
> +		cluster_info->uncore_data.domain_id = cluster_info->cdie_id;
> +		return;
> +	}
> +
> +	/* Unlikely but cdie_mask may have holes, so take range */
> +	cdie_range = fls(plat_info->cdie_mask) - ffs(plat_info->cdie_mask) + 1;
> +	max_dies = topology_max_dies_per_package();
> +
> +	/*
> +	 * If the CPU doesn't enumerate dies, then just current cdie range
> +	 * the max.

This sound broken grammar to my non-native ear. Did you mean:

..., then just use current cdie range as the max.

?


> +	 */
> +	if (cdie_range > max_dies)
> +		max_dies = cdie_range;
> +
> +	guard(mutex)(&domain_lock);
> +
> +	if (!io_die_index_next)
> +		io_die_index_next = max_dies;
> +
> +	if (!io_die_start[plat_info->partition]) {
> +		io_die_start[plat_info->partition] = io_die_index_next;
> +		/*
> +		 * number of IO dies = num_resources - cdie_range. Hence
> +		 * next partition io_die_index_next is set after IO dies
> +		 * in the current partition.
> +		 */
> +		io_die_index_next += (num_resources - cdie_range);
> +	}
> +
> +	/*
> +	 * Index from IO die start within the partition:
> +	 * This is the first valid domain after the cdies. If there are
> +	 * no cdies in a partition just start from 0.
> +	 * For example the current resource index 5 and cdies end at
> +	 * index 3 (cdie_cnt = 4). Then the io only index 5 - 4 = 1.
> +	 */
> +	if (cdie_range)

"start from 0" sounds a bit alarming to me as if that condition could 
happen also after starting, that is, more than once within a partition 
which would result in using part_io_index = 0 twice?

> +		part_io_index = id - cdie_range;
> +
> +	/*
> +	 * Add to the IO die start index for this partition in this package
> +	 * to make unique in the package.
> +	 */
> +	pkg_io_index = io_die_start[plat_info->partition] + part_io_index;
> +
> +	/* Assign this to domain ID */
> +	cluster_info->uncore_data.domain_id = pkg_io_index;
> +}
> +
>  /* Callback for sysfs read for TPMI uncore values. Called under mutex locks. */
>  static int uncore_read(struct uncore_data *data, unsigned int *value, enum uncore_index index)
>  {
> @@ -605,11 +678,12 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
>  			cluster_info->uncore_data.package_id = pkg;
>  			/* There are no dies like Cascade Lake */
>  			cluster_info->uncore_data.die_id = 0;
> -			cluster_info->uncore_data.domain_id = i;
>  			cluster_info->uncore_data.cluster_id = j;
>  
>  			set_cdie_id(i, cluster_info, plat_info);
>  
> +			set_domain_id(i, num_resources, plat_info, cluster_info);
> +
>  			cluster_info->uncore_root = tpmi_uncore;
>  
>  			if (TPMI_MINOR_VERSION(pd_info->ufs_header_ver) >= UNCORE_ELC_SUPPORTED_VERSION)
> 

-- 
 i.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] platform/x86/intel-uncore-freq: Present unique domain ID per package
  2025-08-28 10:36 ` Ilpo Järvinen
@ 2025-08-28 12:40   ` srinivas pandruvada
  0 siblings, 0 replies; 3+ messages in thread
From: srinivas pandruvada @ 2025-08-28 12:40 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Hans de Goede, platform-driver-x86, LKML

On Thu, 2025-08-28 at 13:36 +0300, Ilpo Järvinen wrote:
> On Mon, 25 Aug 2025, Srinivas Pandruvada wrote:
> 
> > In partitioned systems, the domain ID is unique in the partition
> > and a
> > package can have multiple partitions.
> > 
> > Some user-space tools, such as turbostat, assume the domain ID is
> > unique
> > per package. These tools map CPU power domains, which are unique to
> > a
> > package. However, this approach does not work in partitioned
> > systems.
> > 
> > There is no architectural definition of "partition" to present to
> > user
> > space.
> > 
> > To support these tools, set the domain_id to be unique per package.
> > For
> > compute die IDs, uniqueness can be achieved using the platform info
> > cdie_mask, mirroring the behavior observed in non-partitioned
> > systems.
> > 
> > For IO dies, which lack a direct CPU relationship, any unique
> > logical
> > ID can be assigned. Here domain IDs for IO dies are configured
> > after all
> > compute domain IDs. During the probe, keep the index of the next IO
> > domain ID after the last IO domain ID of the current partition.
> > Since
> > CPU packages are symmetric, partition information is same for all
> > packages.
> > 
> > The Intel Speed Select driver has already implemented a similar
> > change
> > to make the domain ID unique, with compute dies listed first,
> > followed
> > by I/O dies.
> > 
> > Signed-off-by: Srinivas Pandruvada
> > <srinivas.pandruvada@linux.intel.com>
> > ---
> > v2:
> > - Add some comments
> > - Change update_domain_id() to set_domian_id() to set domain_id
> > instead of update
> > - cluster_info->uncore_data.domain_id += * is changed to add
> > multiple steps to
> > get to this equation
> > - Handle case when only when no compute dies in partition 
> > 
> >  .../uncore-frequency/uncore-frequency-tpmi.c  | 76
> > ++++++++++++++++++-
> >  1 file changed, 75 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-
> > frequency-tpmi.c b/drivers/platform/x86/intel/uncore-
> > frequency/uncore-frequency-tpmi.c
> > index cb4905bad89b..a30a99048db9 100644
> > --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > tpmi.c
> > +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-
> > tpmi.c
> > @@ -369,6 +369,79 @@ static void uncore_set_agent_type(struct
> > tpmi_uncore_cluster_info *cluster_info)
> >  	cluster_info->uncore_data.agent_type_mask =
> > FIELD_GET(UNCORE_AGENT_TYPES, status);
> >  }
> >  
> > +#define MAX_PARTITIONS	2
> > +
> > +/* IO domain ID start index for a partition */
> > +static u8 io_die_start[MAX_PARTITIONS];
> > +
> > +/* Next IO domain ID index after the current partition IO die IDs
> > */
> > +static u8 io_die_index_next;
> > +
> > +/* Lock to protect io_die_start, io_die_index_next */
> > +static DEFINE_MUTEX(domain_lock);
> > +
> > +static void set_domain_id(int id,  int num_resources,
> > +			  struct oobmsm_plat_info *plat_info,
> > +			  struct tpmi_uncore_cluster_info
> > *cluster_info)
> > +{
> > +	u8 part_io_index = 0, cdie_range, pkg_io_index, max_dies;
> > +
> > +	if (plat_info->partition >= MAX_PARTITIONS) {
> > +		cluster_info->uncore_data.domain_id = id;
> > +		return;
> > +	}
> > +
> > +	if (cluster_info->uncore_data.agent_type_mask &
> > AGENT_TYPE_CORE) {
> > +		cluster_info->uncore_data.domain_id =
> > cluster_info->cdie_id;
> > +		return;
> > +	}
> > +
> > +	/* Unlikely but cdie_mask may have holes, so take range */
> > +	cdie_range = fls(plat_info->cdie_mask) - ffs(plat_info-
> > >cdie_mask) + 1;
> > +	max_dies = topology_max_dies_per_package();
> > +
> > +	/*
> > +	 * If the CPU doesn't enumerate dies, then just current
> > cdie range
> > +	 * the max.
> 
> This sound broken grammar to my non-native ear. Did you mean:
> 
> ..., then just use current cdie range as the max.
> 
> ?
Yes, I have swallowed "as" in my non native year!

> 
> 
> > +	 */
> > +	if (cdie_range > max_dies)
> > +		max_dies = cdie_range;
> > +
> > +	guard(mutex)(&domain_lock);
> > +
> > +	if (!io_die_index_next)
> > +		io_die_index_next = max_dies;
> > +
> > +	if (!io_die_start[plat_info->partition]) {
> > +		io_die_start[plat_info->partition] =
> > io_die_index_next;
> > +		/*
> > +		 * number of IO dies = num_resources - cdie_range.
> > Hence
> > +		 * next partition io_die_index_next is set after
> > IO dies
> > +		 * in the current partition.
> > +		 */
> > +		io_die_index_next += (num_resources - cdie_range);
> > +	}
> > +
> > +	/*
> > +	 * Index from IO die start within the partition:
> > +	 * This is the first valid domain after the cdies. If
> > there are
> > +	 * no cdies in a partition just start from 0.
> > +	 * For example the current resource index 5 and cdies end
> > at
> > +	 * index 3 (cdie_cnt = 4). Then the io only index 5 - 4 =
> > 1.
> > +	 */
> > +	if (cdie_range)
> 
> "start from 0" sounds a bit alarming to me as if that condition could
> happen also after starting, that is, more than once within a
> partition 
> which would result in using part_io_index = 0 twice?
You are correct. This is possible to create configuration like this. I
will remove this check.

Thanks,
Srinivas

> 
> > +		part_io_index = id - cdie_range;
> > +
> > +	/*
> > +	 * Add to the IO die start index for this partition in
> > this package
> > +	 * to make unique in the package.
> > +	 */
> > +	pkg_io_index = io_die_start[plat_info->partition] +
> > part_io_index;
> > +
> > +	/* Assign this to domain ID */
> > +	cluster_info->uncore_data.domain_id = pkg_io_index;
> > +}
> > +
> >  /* Callback for sysfs read for TPMI uncore values. Called under
> > mutex locks. */
> >  static int uncore_read(struct uncore_data *data, unsigned int
> > *value, enum uncore_index index)
> >  {
> > @@ -605,11 +678,12 @@ static int uncore_probe(struct
> > auxiliary_device *auxdev, const struct auxiliary_
> >  			cluster_info->uncore_data.package_id =
> > pkg;
> >  			/* There are no dies like Cascade Lake */
> >  			cluster_info->uncore_data.die_id = 0;
> > -			cluster_info->uncore_data.domain_id = i;
> >  			cluster_info->uncore_data.cluster_id = j;
> >  
> >  			set_cdie_id(i, cluster_info, plat_info);
> >  
> > +			set_domain_id(i, num_resources, plat_info,
> > cluster_info);
> > +
> >  			cluster_info->uncore_root = tpmi_uncore;
> >  
> >  			if (TPMI_MINOR_VERSION(pd_info-
> > >ufs_header_ver) >= UNCORE_ELC_SUPPORTED_VERSION)
> > 
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-08-28 12:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-25 21:43 [PATCH v2] platform/x86/intel-uncore-freq: Present unique domain ID per package Srinivas Pandruvada
2025-08-28 10:36 ` Ilpo Järvinen
2025-08-28 12:40   ` srinivas pandruvada

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).