linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
@ 2023-09-06  0:06 Srinivas Pandruvada
  2023-09-06  2:20 ` Zhang, Rui
  0 siblings, 1 reply; 4+ messages in thread
From: Srinivas Pandruvada @ 2023-09-06  0:06 UTC (permalink / raw)
  To: rafael, rui.zhang
  Cc: linux-pm, sumeet.r.pawnikar, linux-kernel, Srinivas Pandruvada,
	stable

System runs at minimum performance, once powercap RAPL package domain
"enabled" flag is toggled.

Setting RAPL package domain enabled flag to 0, results in setting of
power limit 4 (PL4) MSR 0x601 to 0. This implies disabling PL4 limit.
The PL4 limit controls the peak power. This can significantly change
the performance. Even worse, when the enabled flag is set to 1 again.
This will set PL4 MSR value to 0x01, which means reduce peak power to
0.125W. This will force the system to run at the lowest possible
performance.

This is caused by a change which assumes that there is an enable bit
in the PL4 MSR like other power limits.

In functions set_floor_freq_default() and rapl_remove_package(), call
rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit 1
and 2. Similarly don't read PL_ENABLE for PL4 to check the presence of
power limit 4. Power limit 4 support is based on CPU model in this
driver. No additional checks can be done.

Fixes: 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits support")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org # v6.5+
---
 drivers/powercap/intel_rapl_common.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 5c2e6d5eea2a..0afedf7ad872 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -184,8 +184,6 @@ static int get_pl_prim(struct rapl_domain *rd, int pl, enum pl_prims prim)
 	case POWER_LIMIT4:
 		if (prim == PL_LIMIT)
 			return POWER_LIMIT4;
-		if (prim == PL_ENABLE)
-			return PL4_ENABLE;
 		/* PL4 would be around two times PL2, use same prim as PL2. */
 		if (prim == PL_MAX_POWER)
 			return MAX_POWER;
@@ -1033,17 +1031,13 @@ static void package_power_limit_irq_restore(struct rapl_package *rp)
 
 static void set_floor_freq_default(struct rapl_domain *rd, bool mode)
 {
-	int i;
-
 	/* always enable clamp such that p-state can go below OS requested
 	 * range. power capping priority over guranteed frequency.
 	 */
 	rapl_write_pl_data(rd, POWER_LIMIT1, PL_CLAMP, mode);
 
-	for (i = POWER_LIMIT2; i < NR_POWER_LIMITS; i++) {
-		rapl_write_pl_data(rd, i, PL_ENABLE, mode);
-		rapl_write_pl_data(rd, i, PL_CLAMP, mode);
-	}
+	rapl_write_pl_data(rd, POWER_LIMIT2, PL_ENABLE, mode);
+	rapl_write_pl_data(rd, POWER_LIMIT2, PL_CLAMP, mode);
 }
 
 static void set_floor_freq_atom(struct rapl_domain *rd, bool enable)
@@ -1458,7 +1452,7 @@ static void rapl_detect_powerlimit(struct rapl_domain *rd)
 			}
 		}
 
-		if (rapl_read_pl_data(rd, i, PL_ENABLE, false, &val64))
+		if (i != POWER_LIMIT4 && rapl_read_pl_data(rd, i, PL_ENABLE, false, &val64))
 			rd->rpl[i].name = NULL;
 	}
 }
@@ -1510,7 +1504,7 @@ void rapl_remove_package(struct rapl_package *rp)
 	for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++) {
 		int i;
 
-		for (i = POWER_LIMIT1; i < NR_POWER_LIMITS; i++) {
+		for (i = POWER_LIMIT1; i <= POWER_LIMIT2; i++) {
 			rapl_write_pl_data(rd, i, PL_ENABLE, 0);
 			rapl_write_pl_data(rd, i, PL_CLAMP, 0);
 		}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
  2023-09-06  0:06 [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0 Srinivas Pandruvada
@ 2023-09-06  2:20 ` Zhang, Rui
  2023-09-06 14:36   ` srinivas pandruvada
  0 siblings, 1 reply; 4+ messages in thread
From: Zhang, Rui @ 2023-09-06  2:20 UTC (permalink / raw)
  To: srinivas.pandruvada@linux.intel.com, rafael@kernel.org
  Cc: linux-pm@vger.kernel.org, Pawnikar, Sumeet R,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org

Hi, Srinivas,

Thanks for the patch.

On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote:
> System runs at minimum performance, once powercap RAPL package domain
> "enabled" flag is toggled.
> 
> Setting RAPL package domain enabled flag to 0, results in setting of
> power limit 4 (PL4) MSR 0x601 to 0.

This is the bug. Setting enabled flag should only affect the Enable bit
but not the other bits.

>  This implies disabling PL4 limit.
> The PL4 limit controls the peak power. This can significantly change
> the performance. Even worse, when the enabled flag is set to 1 again.
> This will set PL4 MSR value to 0x01, which means reduce peak power to
> 0.125W. This will force the system to run at the lowest possible
> performance.
> 
> This is caused by a change which assumes that there is an enable bit
> in the PL4 MSR like other power limits.

MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per power
limit enable bit.

> 
> In functions set_floor_freq_default() and rapl_remove_package(), call
> rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit 1
> and 2. Similarly don't read PL_ENABLE for PL4 to check the presence
> of
> power limit 4.

IMO, the rootcause is that we expose an non-exits PL4_ENABLE primitive
for MSR Interface, and even worse we expose it wrongly that the
primitive uses the power limit bits.

Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits
support") pokes the MSR interface PL4_ENABLE primitive and exposes this
bug.

Sumeet is testing the below fix right now, and I suppose he will give
some update soon.

thanks,
rui

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 8fac57b28f8a..d407f918876f 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -658,8 +658,6 @@ static struct rapl_primitive_info rpi_msr[NR_RAPL_PRIMITIVES] = {
 			    RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, 0),
 	[PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP, POWER_LIMIT2_CLAMP, 48,
 			    RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, 0),
-	[PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE, POWER_LIMIT4_MASK, 0,
-				RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT, 0),
 	[TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1, TIME_WINDOW1_MASK, 17,
 			    RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0),
 	[TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2, TIME_WINDOW2_MASK, 49,




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
  2023-09-06  2:20 ` Zhang, Rui
@ 2023-09-06 14:36   ` srinivas pandruvada
  2023-09-06 19:07     ` Pawnikar, Sumeet R
  0 siblings, 1 reply; 4+ messages in thread
From: srinivas pandruvada @ 2023-09-06 14:36 UTC (permalink / raw)
  To: Zhang, Rui, rafael@kernel.org
  Cc: linux-pm@vger.kernel.org, Pawnikar, Sumeet R,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org

Hi Rui,

On Wed, 2023-09-06 at 02:20 +0000, Zhang, Rui wrote:
> Hi, Srinivas,
> 
> Thanks for the patch.
> 
> On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote:
> > System runs at minimum performance, once powercap RAPL package
> > domain
> > "enabled" flag is toggled.
> > 
> > Setting RAPL package domain enabled flag to 0, results in setting
> > of
> > power limit 4 (PL4) MSR 0x601 to 0.
> 
> This is the bug. Setting enabled flag should only affect the Enable
> bit
> but not the other bits.
> 
> >  This implies disabling PL4 limit.
> > The PL4 limit controls the peak power. This can significantly
> > change
> > the performance. Even worse, when the enabled flag is set to 1
> > again.
> > This will set PL4 MSR value to 0x01, which means reduce peak power
> > to
> > 0.125W. This will force the system to run at the lowest possible
> > performance.
> > 
> > This is caused by a change which assumes that there is an enable
> > bit
> > in the PL4 MSR like other power limits.
> 
> MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per
> power
> limit enable bit.
That is correct, but not sure that in practice used or not.

> 
> > 
> > In functions set_floor_freq_default() and rapl_remove_package(),
> > call
> > rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit
> > 1
> > and 2. Similarly don't read PL_ENABLE for PL4 to check the presence
> > of
> > power limit 4.
> 
> IMO, the rootcause is that we expose an non-exits PL4_ENABLE
> primitive
> for MSR Interface, and even worse we expose it wrongly that the
> primitive uses the power limit bits.
> 
> Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits
> support") pokes the MSR interface PL4_ENABLE primitive and exposes
> this
> bug.
> 
> Sumeet is testing the below fix right now, and I suppose he will give
> some update soon.
> 
> thanks,
> rui
> 
> diff --git a/drivers/powercap/intel_rapl_common.c
> b/drivers/powercap/intel_rapl_common.c
> index 8fac57b28f8a..d407f918876f 100644
> --- a/drivers/powercap/intel_rapl_common.c
> +++ b/drivers/powercap/intel_rapl_common.c
> @@ -658,8 +658,6 @@ static struct rapl_primitive_info
> rpi_msr[NR_RAPL_PRIMITIVES] = {
>                             RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT,
> 0),
>         [PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP,
> POWER_LIMIT2_CLAMP, 48,
>                             RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT,
> 0),
> -       [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE,
> POWER_LIMIT4_MASK, 0,
> -                               RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT,
> 0),
>         [TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1,
> TIME_WINDOW1_MASK, 17,
>                             RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0),
>         [TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2,
> TIME_WINDOW2_MASK, 49,
> 
> 
Let me try this, but this is not enough. The enable/disable is also
gets checked for presence of PL4.

Thanks,
Srinivas

> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
  2023-09-06 14:36   ` srinivas pandruvada
@ 2023-09-06 19:07     ` Pawnikar, Sumeet R
  0 siblings, 0 replies; 4+ messages in thread
From: Pawnikar, Sumeet R @ 2023-09-06 19:07 UTC (permalink / raw)
  To: srinivas pandruvada, Zhang, Rui, rafael@kernel.org
  Cc: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org



> -----Original Message-----
> From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
> Sent: Wednesday, September 6, 2023 8:06 PM
> To: Zhang, Rui <rui.zhang@intel.com>; rafael@kernel.org
> Cc: linux-pm@vger.kernel.org; Pawnikar, Sumeet R
> <sumeet.r.pawnikar@intel.com>; linux-kernel@vger.kernel.org;
> stable@vger.kernel.org
> Subject: Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0
> 
> Hi Rui,
> 
> On Wed, 2023-09-06 at 02:20 +0000, Zhang, Rui wrote:
> > Hi, Srinivas,
> >
> > Thanks for the patch.
> >
> > On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote:
> > > System runs at minimum performance, once powercap RAPL package
> > > domain "enabled" flag is toggled.
> > >
> > > Setting RAPL package domain enabled flag to 0, results in setting of
> > > power limit 4 (PL4) MSR 0x601 to 0.
> >
> > This is the bug. Setting enabled flag should only affect the Enable
> > bit but not the other bits.
> >
> > >  This implies disabling PL4 limit.
> > > The PL4 limit controls the peak power. This can significantly change
> > > the performance. Even worse, when the enabled flag is set to 1
> > > again.
> > > This will set PL4 MSR value to 0x01, which means reduce peak power
> > > to 0.125W. This will force the system to run at the lowest possible
> > > performance.
> > >
> > > This is caused by a change which assumes that there is an enable bit
> > > in the PL4 MSR like other power limits.
> >
> > MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per
> > power limit enable bit.
> That is correct, but not sure that in practice used or not.
> 
> >
> > >
> > > In functions set_floor_freq_default() and rapl_remove_package(),
> > > call
> > > rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit
> > > 1
> > > and 2. Similarly don't read PL_ENABLE for PL4 to check the presence
> > > of
> > > power limit 4.
> >
> > IMO, the rootcause is that we expose an non-exits PL4_ENABLE
> > primitive
> > for MSR Interface, and even worse we expose it wrongly that the
> > primitive uses the power limit bits.
> >
> > Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits
> > support") pokes the MSR interface PL4_ENABLE primitive and exposes
> > this
> > bug.
> >
> > Sumeet is testing the below fix right now, and I suppose he will give
> > some update soon.
> > 

Testing still going on. 

> > thanks,
> > rui
> >
> > diff --git a/drivers/powercap/intel_rapl_common.c
> > b/drivers/powercap/intel_rapl_common.c
> > index 8fac57b28f8a..d407f918876f 100644
> > --- a/drivers/powercap/intel_rapl_common.c
> > +++ b/drivers/powercap/intel_rapl_common.c
> > @@ -658,8 +658,6 @@ static struct rapl_primitive_info
> > rpi_msr[NR_RAPL_PRIMITIVES] = {
> >                             RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT,
> > 0),
> >         [PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP,
> > POWER_LIMIT2_CLAMP, 48,
> >                             RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT,
> > 0),
> > -       [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE,
> > POWER_LIMIT4_MASK, 0,
> > -                               RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT,
> > 0),
> >         [TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1,
> > TIME_WINDOW1_MASK, 17,
> >                             RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0),
> >         [TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2,
> > TIME_WINDOW2_MASK, 49,
> >
> >
> Let me try this, but this is not enough. The enable/disable is also
> gets checked for presence of PL4.
> 

Yes, we need the check as well.

Thanks,
Sumeet. 

> Thanks,
> Srinivas
> 
> >


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-09-06 19:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-06  0:06 [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0 Srinivas Pandruvada
2023-09-06  2:20 ` Zhang, Rui
2023-09-06 14:36   ` srinivas pandruvada
2023-09-06 19:07     ` Pawnikar, Sumeet R

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).