Linux Power Management development
 help / color / mirror / Atom feed
* [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification
@ 2025-11-16 12:31 Rafael J. Wysocki
  2025-11-16 12:34 ` [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update() Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-11-16 12:31 UTC (permalink / raw)
  To: Linux PM; +Cc: Christian Loehle, LKML

Hi,

This series is based on

https://lore.kernel.org/linux-pm/6228387.lOV4Wx5bFT@rafael.j.wysocki/

and the previous teo updates sent recently.

It fixes a reverse condition in teo_update() (first patch) and quite
dramatically simplifies the "intercepts" logic in teo_select() (second
patch).

Thanks!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()
  2025-11-16 12:31 [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Rafael J. Wysocki
@ 2025-11-16 12:34 ` Rafael J. Wysocki
  2025-11-17  9:06   ` Christian Loehle
  2025-11-16 12:35 ` [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup Rafael J. Wysocki
  2025-11-28 23:45 ` [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Christian Loehle
  2 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-11-16 12:34 UTC (permalink / raw)
  To: Linux PM; +Cc: Christian Loehle, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The condition deciding whether or not to increase cpu_data->tick_intercepts
in teo_update() is reverse, so fix it.

Fixes: d619b5cc6780 ("cpuidle: teo: Simplify counting events used for tick management")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

I'm planning to apply this for 6.19 on top of

https://lore.kernel.org/linux-pm/6228387.lOV4Wx5bFT@rafael.j.wysocki/

because that patch (indirectly) depends on commit d619b5cc6780.

---
 drivers/cpuidle/governors/teo.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/cpuidle/governors/teo.c
+++ b/drivers/cpuidle/governors/teo.c
@@ -251,7 +251,7 @@ static void teo_update(struct cpuidle_dr
 		cpu_data->state_bins[idx_timer].hits += PULSE;
 	} else {
 		cpu_data->state_bins[idx_duration].intercepts += PULSE;
-		if (TICK_NSEC <= measured_ns)
+		if (measured_ns <= TICK_NSEC)
 			cpu_data->tick_intercepts += PULSE;
 	}
 }




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup
  2025-11-16 12:31 [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Rafael J. Wysocki
  2025-11-16 12:34 ` [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update() Rafael J. Wysocki
@ 2025-11-16 12:35 ` Rafael J. Wysocki
  2025-11-19 21:13   ` Rafael J. Wysocki
  2025-11-20  8:45   ` Christian Loehle
  2025-11-28 23:45 ` [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Christian Loehle
  2 siblings, 2 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-11-16 12:35 UTC (permalink / raw)
  To: Linux PM; +Cc: Christian Loehle, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Simplify the loop looking up a candidate idle state in the case when an
intercept is likely to occur by adding a search for the state index limit
if the tick is stopped before it.

First, call tick_nohz_tick_stopped() just once and if it returns true,
look for the shallowest state index below the current candidate one with
target residency at least equal to the tick period length.

Next, simply look for a state that is not shallower than the one found
in the previous step and, ideally, satisfies the intercepts majority
condition.

Since teo_state_ok() has no callers any more after the above changes,
drop it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/teo.c |   62 ++++++++++------------------------------
 1 file changed, 16 insertions(+), 46 deletions(-)

--- a/drivers/cpuidle/governors/teo.c
+++ b/drivers/cpuidle/governors/teo.c
@@ -256,12 +256,6 @@ static void teo_update(struct cpuidle_dr
 	}
 }
 
-static bool teo_state_ok(int i, struct cpuidle_driver *drv)
-{
-	return !tick_nohz_tick_stopped() ||
-		drv->states[i].target_residency_ns >= TICK_NSEC;
-}
-
 /**
  * teo_find_shallower_state - Find shallower idle state matching given duration.
  * @drv: cpuidle driver containing state data.
@@ -383,7 +377,18 @@ static int teo_select(struct cpuidle_dri
 	 * better choice.
 	 */
 	if (2 * idx_intercept_sum > cpu_data->total - idx_hit_sum) {
-		int first_suitable_idx = idx;
+		int min_idx = idx0;
+
+		if (tick_nohz_tick_stopped()) {
+			/*
+			 * Look for the shallowest idle state below the current
+			 * candidate one whose target residency is not below the
+			 * tick period length.
+			 */
+			while (min_idx < idx &&
+			       drv->states[min_idx].target_residency_ns < TICK_NSEC)
+				min_idx++;
+		}
 
 		/*
 		 * Look for the deepest idle state whose target residency had
@@ -393,49 +398,14 @@ static int teo_select(struct cpuidle_dri
 		 * Take the possible duration limitation present if the tick
 		 * has been stopped already into account.
 		 */
-		intercept_sum = 0;
-
-		for (i = idx - 1; i >= 0; i--) {
-			struct teo_bin *bin = &cpu_data->state_bins[i];
-
-			intercept_sum += bin->intercepts;
-
-			if (2 * intercept_sum > idx_intercept_sum) {
-				/*
-				 * Use the current state unless it is too
-				 * shallow or disabled, in which case take the
-				 * first enabled state that is deep enough.
-				 */
-				if (teo_state_ok(i, drv) &&
-				    !dev->states_usage[i].disable) {
-					idx = i;
-					break;
-				}
-				idx = first_suitable_idx;
-				break;
-			}
+		for (i = idx - 1, intercept_sum = 0; i >= min_idx; i--) {
+			intercept_sum += cpu_data->state_bins[i].intercepts;
 
 			if (dev->states_usage[i].disable)
 				continue;
 
-			if (teo_state_ok(i, drv)) {
-				/*
-				 * The current state is deep enough, but still
-				 * there may be a better one.
-				 */
-				first_suitable_idx = i;
-				continue;
-			}
-
-			/*
-			 * The current state is too shallow, so if no suitable
-			 * states other than the initial candidate have been
-			 * found, give up (the remaining states to check are
-			 * shallower still), but otherwise the first suitable
-			 * state other than the initial candidate may turn out
-			 * to be preferable.
-			 */
-			if (first_suitable_idx == idx)
+			idx = i;
+			if (2 * intercept_sum > idx_intercept_sum)
 				break;
 		}
 	}




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()
  2025-11-16 12:34 ` [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update() Rafael J. Wysocki
@ 2025-11-17  9:06   ` Christian Loehle
  2025-11-17 16:15     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Christian Loehle @ 2025-11-17  9:06 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM; +Cc: LKML, Aboorva Devarajan

On 11/16/25 12:34, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The condition deciding whether or not to increase cpu_data->tick_intercepts
> in teo_update() is reverse, so fix it.
> 
> Fixes: d619b5cc6780 ("cpuidle: teo: Simplify counting events used for tick management")
> Cc: All applicable <stable@vger.kernel.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> I'm planning to apply this for 6.19 on top of
> 
> https://lore.kernel.org/linux-pm/6228387.lOV4Wx5bFT@rafael.j.wysocki/
> 
> because that patch (indirectly) depends on commit d619b5cc6780.
> 
> ---
>  drivers/cpuidle/governors/teo.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -251,7 +251,7 @@ static void teo_update(struct cpuidle_dr
>  		cpu_data->state_bins[idx_timer].hits += PULSE;
>  	} else {
>  		cpu_data->state_bins[idx_duration].intercepts += PULSE;
> -		if (TICK_NSEC <= measured_ns)
> +		if (measured_ns <= TICK_NSEC)

nit: Why <= instead of <?
I guess it really doesn't matter with measured_ns only being a rough approximation
with an error in the order of wakeup-latency.

Reviewed-by:
Christian Loehle <christian.loehle@arm.com>

Let me go write some tests for all these edge cases :/

IIRC Aboorva's power systems have no idle state deeper than TICK_NSEC, so
this might make a big difference here, hence CCed.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()
  2025-11-17  9:06   ` Christian Loehle
@ 2025-11-17 16:15     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-11-17 16:15 UTC (permalink / raw)
  To: Christian Loehle; +Cc: Rafael J. Wysocki, Linux PM, LKML, Aboorva Devarajan

On Mon, Nov 17, 2025 at 10:06 AM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 11/16/25 12:34, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > The condition deciding whether or not to increase cpu_data->tick_intercepts
> > in teo_update() is reverse, so fix it.
> >
> > Fixes: d619b5cc6780 ("cpuidle: teo: Simplify counting events used for tick management")
> > Cc: All applicable <stable@vger.kernel.org>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > I'm planning to apply this for 6.19 on top of
> >
> > https://lore.kernel.org/linux-pm/6228387.lOV4Wx5bFT@rafael.j.wysocki/
> >
> > because that patch (indirectly) depends on commit d619b5cc6780.
> >
> > ---
> >  drivers/cpuidle/governors/teo.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --- a/drivers/cpuidle/governors/teo.c
> > +++ b/drivers/cpuidle/governors/teo.c
> > @@ -251,7 +251,7 @@ static void teo_update(struct cpuidle_dr
> >               cpu_data->state_bins[idx_timer].hits += PULSE;
> >       } else {
> >               cpu_data->state_bins[idx_duration].intercepts += PULSE;
> > -             if (TICK_NSEC <= measured_ns)
> > +             if (measured_ns <= TICK_NSEC)
>
> nit: Why <= instead of <?

Because it was <= before.

> I guess it really doesn't matter with measured_ns only being a rough approximation
> with an error in the order of wakeup-latency.

Right and moreover, TICK_NSEC is an upper bound for tick wakeups, they
occur earlier as a rule.

> Reviewed-by:
> Christian Loehle <christian.loehle@arm.com>
>
> Let me go write some tests for all these edge cases :/
>
> IIRC Aboorva's power systems have no idle state deeper than TICK_NSEC, so
> this might make a big difference here, hence CCed.

Like x86 systems with HZ < 1000 which are the majority nowadays AFAICS.

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup
  2025-11-16 12:35 ` [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup Rafael J. Wysocki
@ 2025-11-19 21:13   ` Rafael J. Wysocki
  2025-11-20  8:45   ` Christian Loehle
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-11-19 21:13 UTC (permalink / raw)
  To: Christian Loehle, Linux PM; +Cc: LKML

On Sun, Nov 16, 2025 at 1:35 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Simplify the loop looking up a candidate idle state in the case when an
> intercept is likely to occur by adding a search for the state index limit
> if the tick is stopped before it.
>
> First, call tick_nohz_tick_stopped() just once and if it returns true,
> look for the shallowest state index below the current candidate one with
> target residency at least equal to the tick period length.
>
> Next, simply look for a state that is not shallower than the one found
> in the previous step and, ideally, satisfies the intercepts majority
> condition.
>
> Since teo_state_ok() has no callers any more after the above changes,
> drop it.
>
> No intentional functional impact.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpuidle/governors/teo.c |   62 ++++++++++------------------------------
>  1 file changed, 16 insertions(+), 46 deletions(-)
>
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -256,12 +256,6 @@ static void teo_update(struct cpuidle_dr
>         }
>  }
>
> -static bool teo_state_ok(int i, struct cpuidle_driver *drv)
> -{
> -       return !tick_nohz_tick_stopped() ||
> -               drv->states[i].target_residency_ns >= TICK_NSEC;
> -}
> -
>  /**
>   * teo_find_shallower_state - Find shallower idle state matching given duration.
>   * @drv: cpuidle driver containing state data.
> @@ -383,7 +377,18 @@ static int teo_select(struct cpuidle_dri
>          * better choice.
>          */
>         if (2 * idx_intercept_sum > cpu_data->total - idx_hit_sum) {
> -               int first_suitable_idx = idx;
> +               int min_idx = idx0;
> +
> +               if (tick_nohz_tick_stopped()) {
> +                       /*
> +                        * Look for the shallowest idle state below the current
> +                        * candidate one whose target residency is not below the
> +                        * tick period length.
> +                        */
> +                       while (min_idx < idx &&
> +                              drv->states[min_idx].target_residency_ns < TICK_NSEC)
> +                               min_idx++;
> +               }
>
>                 /*
>                  * Look for the deepest idle state whose target residency had
> @@ -393,49 +398,14 @@ static int teo_select(struct cpuidle_dri
>                  * Take the possible duration limitation present if the tick
>                  * has been stopped already into account.
>                  */
> -               intercept_sum = 0;
> -
> -               for (i = idx - 1; i >= 0; i--) {
> -                       struct teo_bin *bin = &cpu_data->state_bins[i];
> -
> -                       intercept_sum += bin->intercepts;
> -
> -                       if (2 * intercept_sum > idx_intercept_sum) {
> -                               /*
> -                                * Use the current state unless it is too
> -                                * shallow or disabled, in which case take the
> -                                * first enabled state that is deep enough.
> -                                */
> -                               if (teo_state_ok(i, drv) &&
> -                                   !dev->states_usage[i].disable) {
> -                                       idx = i;
> -                                       break;
> -                               }
> -                               idx = first_suitable_idx;
> -                               break;
> -                       }
> +               for (i = idx - 1, intercept_sum = 0; i >= min_idx; i--) {
> +                       intercept_sum += cpu_data->state_bins[i].intercepts;
>
>                         if (dev->states_usage[i].disable)
>                                 continue;
>
> -                       if (teo_state_ok(i, drv)) {
> -                               /*
> -                                * The current state is deep enough, but still
> -                                * there may be a better one.
> -                                */
> -                               first_suitable_idx = i;
> -                               continue;
> -                       }
> -
> -                       /*
> -                        * The current state is too shallow, so if no suitable
> -                        * states other than the initial candidate have been
> -                        * found, give up (the remaining states to check are
> -                        * shallower still), but otherwise the first suitable
> -                        * state other than the initial candidate may turn out
> -                        * to be preferable.
> -                        */
> -                       if (first_suitable_idx == idx)
> +                       idx = i;
> +                       if (2 * intercept_sum > idx_intercept_sum)
>                                 break;
>                 }
>         }

I'd appreciate feedback on this one.

I regard it as a significant improvement even though it is not
expected to change functionality.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup
  2025-11-16 12:35 ` [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup Rafael J. Wysocki
  2025-11-19 21:13   ` Rafael J. Wysocki
@ 2025-11-20  8:45   ` Christian Loehle
  2025-11-20 11:07     ` Rafael J. Wysocki
  1 sibling, 1 reply; 10+ messages in thread
From: Christian Loehle @ 2025-11-20  8:45 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM; +Cc: LKML

On 11/16/25 12:35, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Simplify the loop looking up a candidate idle state in the case when an
> intercept is likely to occur by adding a search for the state index limit
> if the tick is stopped before it.
> 
> First, call tick_nohz_tick_stopped() just once and if it returns true,
> look for the shallowest state index below the current candidate one with
> target residency at least equal to the tick period length.
> 
> Next, simply look for a state that is not shallower than the one found
> in the previous step and, ideally, satisfies the intercepts majority
> condition.

NIT: The ideally is a bit handwavy, maybe:
Next, look for the deepest state that satisfies the intercepts majority
condition but select no shallower state than the one from the previous step.

Sounds a bit verbose I guess.

> 
> Since teo_state_ok() has no callers any more after the above changes,
> drop it.
> 
> No intentional functional impact.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpuidle/governors/teo.c |   62 ++++++++++------------------------------
>  1 file changed, 16 insertions(+), 46 deletions(-)
> 
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -256,12 +256,6 @@ static void teo_update(struct cpuidle_dr
>  	}
>  }
>  
> -static bool teo_state_ok(int i, struct cpuidle_driver *drv)
> -{
> -	return !tick_nohz_tick_stopped() ||
> -		drv->states[i].target_residency_ns >= TICK_NSEC;
> -}
> -
>  /**
>   * teo_find_shallower_state - Find shallower idle state matching given duration.
>   * @drv: cpuidle driver containing state data.
> @@ -383,7 +377,18 @@ static int teo_select(struct cpuidle_dri
>  	 * better choice.
>  	 */
>  	if (2 * idx_intercept_sum > cpu_data->total - idx_hit_sum) {
> -		int first_suitable_idx = idx;
> +		int min_idx = idx0;
> +
> +		if (tick_nohz_tick_stopped()) {
> +			/*
> +			 * Look for the shallowest idle state below the current
> +			 * candidate one whose target residency is not below the
> +			 * tick period length.
> +			 */

NIT: s/not below/above
or just use >= in the comment?

> +			while (min_idx < idx &&
> +			       drv->states[min_idx].target_residency_ns < TICK_NSEC)
> +				min_idx++;
> +		}
>  
>  		/*
>  		 * Look for the deepest idle state whose target residency had
> @@ -393,49 +398,14 @@ static int teo_select(struct cpuidle_dri
>  		 * Take the possible duration limitation present if the tick
>  		 * has been stopped already into account.
>  		 */
> -		intercept_sum = 0;
> -
> -		for (i = idx - 1; i >= 0; i--) {
> -			struct teo_bin *bin = &cpu_data->state_bins[i];
> -
> -			intercept_sum += bin->intercepts;
> -
> -			if (2 * intercept_sum > idx_intercept_sum) {
> -				/*
> -				 * Use the current state unless it is too
> -				 * shallow or disabled, in which case take the
> -				 * first enabled state that is deep enough.
> -				 */
> -				if (teo_state_ok(i, drv) &&
> -				    !dev->states_usage[i].disable) {
> -					idx = i;
> -					break;
> -				}
> -				idx = first_suitable_idx;
> -				break;
> -			}
> +		for (i = idx - 1, intercept_sum = 0; i >= min_idx; i--) {
> +			intercept_sum += cpu_data->state_bins[i].intercepts;
>  
>  			if (dev->states_usage[i].disable)
>  				continue;
>  
> -			if (teo_state_ok(i, drv)) {
> -				/*
> -				 * The current state is deep enough, but still
> -				 * there may be a better one.
> -				 */
> -				first_suitable_idx = i;
> -				continue;
> -			}
> -
> -			/*
> -			 * The current state is too shallow, so if no suitable
> -			 * states other than the initial candidate have been
> -			 * found, give up (the remaining states to check are
> -			 * shallower still), but otherwise the first suitable
> -			 * state other than the initial candidate may turn out
> -			 * to be preferable.
> -			 */
> -			if (first_suitable_idx == idx)
> +			idx = i;
> +			if (2 * intercept_sum > idx_intercept_sum)
>  				break;
>  		}
>  	}

Thanks, that is indeed a nice simplification. I'll get test results out on Monday,
sorry!

Reviewed-by: Christian Loehle <christian.loehle@arm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup
  2025-11-20  8:45   ` Christian Loehle
@ 2025-11-20 11:07     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-11-20 11:07 UTC (permalink / raw)
  To: Christian Loehle; +Cc: Rafael J. Wysocki, Linux PM, LKML

On Thu, Nov 20, 2025 at 9:45 AM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 11/16/25 12:35, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Simplify the loop looking up a candidate idle state in the case when an
> > intercept is likely to occur by adding a search for the state index limit
> > if the tick is stopped before it.
> >
> > First, call tick_nohz_tick_stopped() just once and if it returns true,
> > look for the shallowest state index below the current candidate one with
> > target residency at least equal to the tick period length.
> >
> > Next, simply look for a state that is not shallower than the one found
> > in the previous step and, ideally, satisfies the intercepts majority
> > condition.
>
> NIT: The ideally is a bit handwavy, maybe:
> Next, look for the deepest state that satisfies the intercepts majority
> condition but select no shallower state than the one from the previous step.
>
> Sounds a bit verbose I guess.

I'll figure out something suitable.

> >
> > Since teo_state_ok() has no callers any more after the above changes,
> > drop it.
> >
> > No intentional functional impact.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/cpuidle/governors/teo.c |   62 ++++++++++------------------------------
> >  1 file changed, 16 insertions(+), 46 deletions(-)
> >
> > --- a/drivers/cpuidle/governors/teo.c
> > +++ b/drivers/cpuidle/governors/teo.c
> > @@ -256,12 +256,6 @@ static void teo_update(struct cpuidle_dr
> >       }
> >  }
> >
> > -static bool teo_state_ok(int i, struct cpuidle_driver *drv)
> > -{
> > -     return !tick_nohz_tick_stopped() ||
> > -             drv->states[i].target_residency_ns >= TICK_NSEC;
> > -}
> > -
> >  /**
> >   * teo_find_shallower_state - Find shallower idle state matching given duration.
> >   * @drv: cpuidle driver containing state data.
> > @@ -383,7 +377,18 @@ static int teo_select(struct cpuidle_dri
> >        * better choice.
> >        */
> >       if (2 * idx_intercept_sum > cpu_data->total - idx_hit_sum) {
> > -             int first_suitable_idx = idx;
> > +             int min_idx = idx0;
> > +
> > +             if (tick_nohz_tick_stopped()) {
> > +                     /*
> > +                      * Look for the shallowest idle state below the current
> > +                      * candidate one whose target residency is not below the
> > +                      * tick period length.
> > +                      */
>
> NIT: s/not below/above
> or just use >= in the comment?

Well, I can just say "equal to or greater than" or "at least equal to"
(slightly preferred).

> > +                     while (min_idx < idx &&
> > +                            drv->states[min_idx].target_residency_ns < TICK_NSEC)
> > +                             min_idx++;
> > +             }
> >
> >               /*
> >                * Look for the deepest idle state whose target residency had
> > @@ -393,49 +398,14 @@ static int teo_select(struct cpuidle_dri
> >                * Take the possible duration limitation present if the tick
> >                * has been stopped already into account.
> >                */
> > -             intercept_sum = 0;
> > -
> > -             for (i = idx - 1; i >= 0; i--) {
> > -                     struct teo_bin *bin = &cpu_data->state_bins[i];
> > -
> > -                     intercept_sum += bin->intercepts;
> > -
> > -                     if (2 * intercept_sum > idx_intercept_sum) {
> > -                             /*
> > -                              * Use the current state unless it is too
> > -                              * shallow or disabled, in which case take the
> > -                              * first enabled state that is deep enough.
> > -                              */
> > -                             if (teo_state_ok(i, drv) &&
> > -                                 !dev->states_usage[i].disable) {
> > -                                     idx = i;
> > -                                     break;
> > -                             }
> > -                             idx = first_suitable_idx;
> > -                             break;
> > -                     }
> > +             for (i = idx - 1, intercept_sum = 0; i >= min_idx; i--) {
> > +                     intercept_sum += cpu_data->state_bins[i].intercepts;
> >
> >                       if (dev->states_usage[i].disable)
> >                               continue;
> >
> > -                     if (teo_state_ok(i, drv)) {
> > -                             /*
> > -                              * The current state is deep enough, but still
> > -                              * there may be a better one.
> > -                              */
> > -                             first_suitable_idx = i;
> > -                             continue;
> > -                     }
> > -
> > -                     /*
> > -                      * The current state is too shallow, so if no suitable
> > -                      * states other than the initial candidate have been
> > -                      * found, give up (the remaining states to check are
> > -                      * shallower still), but otherwise the first suitable
> > -                      * state other than the initial candidate may turn out
> > -                      * to be preferable.
> > -                      */
> > -                     if (first_suitable_idx == idx)
> > +                     idx = i;
> > +                     if (2 * intercept_sum > idx_intercept_sum)
> >                               break;
> >               }
> >       }
>
> Thanks, that is indeed a nice simplification. I'll get test results out on Monday,
> sorry!

No worries.

> Reviewed-by: Christian Loehle <christian.loehle@arm.com>

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification
  2025-11-16 12:31 [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Rafael J. Wysocki
  2025-11-16 12:34 ` [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update() Rafael J. Wysocki
  2025-11-16 12:35 ` [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup Rafael J. Wysocki
@ 2025-11-28 23:45 ` Christian Loehle
  2025-12-02 17:54   ` Doug Smythies
  2 siblings, 1 reply; 10+ messages in thread
From: Christian Loehle @ 2025-11-28 23:45 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM; +Cc: LKML

On 11/16/25 12:31, Rafael J. Wysocki wrote:
> Hi,
> 
> This series is based on
> 
> https://lore.kernel.org/linux-pm/6228387.lOV4Wx5bFT@rafael.j.wysocki/
> 
> and the previous teo updates sent recently.
> 
> It fixes a reverse condition in teo_update() (first patch) and quite
> dramatically simplifies the "intercepts" logic in teo_select() (second
> patch).
> 

FWIW nothing note worthy in my tests, as expected.
Tested-by: Christian Loehle <christian.loehle@arm.com>

teo-0 is the with the mentioned baseline:
device     gov        iter     iops      idles  idle_miss      ratio     belows     aboves
mmcblk1    teo-0      0      2331     643796         88      0.000         56         32
mmcblk1    teo-0      1      2327     624534        120      0.000         36         84
mmcblk1    teo-0      2      2352     642172         33      0.000         22         11
mmcblk1    teo-1      0      2330     626270         89      0.000         41         48
mmcblk1    teo-1      1      2330     641274         16      0.000         15          1
mmcblk1    teo-1      2      2344     644533         58      0.000         44         14
mmcblk1    teo-2      0      2331     639104         44      0.000         31         13
mmcblk1    teo-2      1      2327     606128        110      0.000         33         77
mmcblk1    teo-2      2      2352     582654         50      0.000         35         15
mmcblk2    teo-0      0      5704     903238        260      0.000        245         15
mmcblk2    teo-0      1      5605     717534       4586      0.006       4572         14
mmcblk2    teo-0      2      5669     798154         61      0.000         34         27
mmcblk2    teo-1      0      5662     748450         78      0.000         36         42
mmcblk2    teo-1      1      5646     688848        105      0.000         60         45
mmcblk2    teo-1      2      5687     836774         32      0.000         27          5
mmcblk2    teo-2      0      5695     911770         35      0.000         23         12
mmcblk2    teo-2      1      5669     762958         85      0.000         50         35
mmcblk2    teo-2      2      5639     625948        150      0.000         98         52
nvme0n1    teo-0      0     11274     823656         62      0.000         44         18
nvme0n1    teo-0      1     11761     846600         70      0.000         41         29
nvme0n1    teo-0      2     10488     779408        121      0.000         56         65
nvme0n1    teo-1      0     11822     859692         71      0.000         34         37
nvme0n1    teo-1      1     10486     773748        223      0.000         44        179
nvme0n1    teo-1      2     10624     782348        222      0.000         45        177
nvme0n1    teo-2      0     10488     765694        122      0.000         31         91
nvme0n1    teo-2      1     10661     785510        150      0.000         47        103
nvme0n1    teo-2      2     10519     775320        173      0.000         44        129
nullb0     teo-0      0    103101     111844         50      0.000         28         22
nullb0     teo-0      1    104242     111172        270      0.002        156        114
nullb0     teo-0      2    103724     115100        359      0.003        203        156
nullb0     teo-1      0    103315     113054        142      0.001         49         93
nullb0     teo-1      1    103155     114067        143      0.001         41        102
nullb0     teo-1      2    103910     112446        182      0.002         32        150
nullb0     teo-2      0    104451     113320        118      0.001         40         78
nullb0     teo-2      1    103130     112782        190      0.002         38        152
nullb0     teo-2      2    102745     114172        147      0.001         45        102
mtdblock3  teo-0      0       250     931002      12700      0.014      12694          6
mtdblock3  teo-0      1       257     937274        220      0.000        207         13
mtdblock3  teo-0      2       261     300176         91      0.000         41         50
mtdblock3  teo-1      0       257     912068        109      0.000         65         44
mtdblock3  teo-1      1       257     848038        161      0.000        135         26
mtdblock3  teo-1      2       261     311794         74      0.000         36         38
mtdblock3  teo-2      0       255     953540       3440      0.004       3434          6
mtdblock3  teo-2      1       260     514246         60      0.000         37         23
mtdblock3  teo-2      2       260     427200        682      0.002        645         37

test       gov        i     score  %change    idles  idle_miss  miss_rt   belows   aboves
schbench   teo-0      0    194.07    +0.00%    25308        6    0.000        5        1
schbench   teo-0      1    194.53    +0.24%    25490        7    0.000        6        1
schbench   teo-0      2    194.37    +0.15%    25040        6    0.000        6        0
schbench   teo-0      3    194.00    -0.04%    25040        4    0.000        4        0
schbench   teo-0      4    207.80    +7.07%    26324        6    0.000        6        0
schbench   teo-1      0    189.80    -2.20%    23776        2    0.000        2        0
schbench   teo-1      1    188.13    -3.06%    24790        5    0.000        5        0
schbench   teo-1      2    189.63    -2.29%    24870       19    0.001       18        1
schbench   teo-1      3    187.40    -3.44%    24606        6    0.000        6        0
schbench   teo-1      4    187.17    -3.56%    24214        2    0.000        2        0
schbench   teo-2      0    188.70    -2.77%    24336        5    0.000        5        0
schbench   teo-2      1    185.23    -4.56%    24029        3    0.000        3        0
schbench   teo-2      2    188.87    -2.68%    24904        3    0.000        3        0
schbench   teo-2      3    186.97    -3.66%    24478        4    0.000        4        0
schbench   teo-2      4    210.07    +8.24%    27196        6    0.000        6        0
ebizzy     teo-0      0  10688.00    +0.00%     1096        7    0.006        6        1
ebizzy     teo-0      1  10668.00    -0.19%     1132       11    0.010        9        2
ebizzy     teo-0      2  10696.00    +0.07%     1066        6    0.006        5        1
ebizzy     teo-0      3  10702.00    +0.13%     1056        7    0.007        4        3
ebizzy     teo-0      4  10635.00    -0.50%     1206        8    0.007        7        1
ebizzy     teo-1      0  10724.00    +0.34%     1216       11    0.009        8        3
ebizzy     teo-1      1  10747.00    +0.55%     1240        7    0.006        5        2
ebizzy     teo-1      2  10715.00    +0.25%     1118        7    0.006        6        1
ebizzy     teo-1      3  10698.00    +0.09%     1144        6    0.005        4        2
ebizzy     teo-1      4  10735.00    +0.44%     1203        9    0.007        7        2
ebizzy     teo-2      0  10678.00    -0.09%     1114        9    0.008        8        1
ebizzy     teo-2      1  10697.00    +0.08%     1186        6    0.005        5        1
ebizzy     teo-2      2  10713.00    +0.23%     1162       11    0.009        9        2
ebizzy     teo-2      3  10512.00    -1.65%     1190        8    0.007        8        0
ebizzy     teo-2      4   9890.00    -7.47%     1150       13    0.011       11        2
adrestia   teo-0      0     12.00    +0.00%   103584       15    0.000       14        1
adrestia   teo-0      1     12.00    +0.00%   103298       20    0.000       10       10
adrestia   teo-0      2     12.00    +0.00%   103248       29    0.000       18       11
adrestia   teo-0      3     12.00    +0.00%   103484       23    0.000       15        8
adrestia   teo-0      4     12.00    +0.00%   103362       29    0.000       20        9
adrestia   teo-1      0     12.00    +0.00%   103942       14    0.000       13        1
adrestia   teo-1      1     12.00    +0.00%   103521       18    0.000        9        9
adrestia   teo-1      2     12.00    +0.00%   103642       19    0.000       10        9
adrestia   teo-1      3     12.00    +0.00%   103534       17    0.000       11        6
adrestia   teo-1      4     12.00    +0.00%   103214       14    0.000        6        8
adrestia   teo-2      0     12.00    +0.00%   103270        8    0.000        6        2
adrestia   teo-2      1     12.00    +0.00%   103364       22    0.000       13        9
adrestia   teo-2      2     12.00    +0.00%   103588       17    0.000        6       11
adrestia   teo-2      3     12.00    +0.00%   103906       22    0.000       17        5
adrestia   teo-2      4     12.00    +0.00%   103218       12    0.000        3        9
hackbench  teo-0      0     21.38    +0.00%    14764        8    0.001        5        3
hackbench  teo-0      1     21.57    +0.86%    17512       16    0.001       13        3
hackbench  teo-0      2     21.60    +1.02%    15664       16    0.001       15        1
hackbench  teo-0      3     21.76    +1.74%    20614       12    0.001       11        1
hackbench  teo-0      4     21.61    +1.05%    18614       10    0.001       10        0
hackbench  teo-1      0     21.59    +0.95%    15952       11    0.001        8        3
hackbench  teo-1      1     21.73    +1.64%    16618        9    0.001        8        1
hackbench  teo-1      2     21.46    +0.34%    15084       12    0.001        9        3
hackbench  teo-1      3     21.57    +0.86%    13264       16    0.001       14        2
hackbench  teo-1      4     21.53    +0.69%    12514       10    0.001        9        1
hackbench  teo-2      0     21.63    +1.16%    15236       12    0.001        8        4
hackbench  teo-2      1     21.74    +1.66%    15748       12    0.001       11        1
hackbench  teo-2      2     21.72    +1.59%    17514       11    0.001       11        0
hackbench  teo-2      3     21.67    +1.32%    14972       11    0.001        9        2
hackbench  teo-2      4     21.85    +2.19%    18396       14    0.001       11        3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification
  2025-11-28 23:45 ` [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Christian Loehle
@ 2025-12-02 17:54   ` Doug Smythies
  0 siblings, 0 replies; 10+ messages in thread
From: Doug Smythies @ 2025-12-02 17:54 UTC (permalink / raw)
  To: 'Christian Loehle', 'Rafael J. Wysocki'
  Cc: 'LKML', 'Linux PM', Doug Smythies

[-- Attachment #1: Type: text/plain, Size: 3951 bytes --]

On 2025.11.29 15:46 Christian Loehle wrote:
> On 11/16/25 12:31, Rafael J. Wysocki wrote:
>> Hi,
>> 
>> This series is based on
>> 
>> https://lore.kernel.org/linux-pm/6228387.lOV4Wx5bFT@rafael.j.wysocki/
>> 
>> and the previous teo updates sent recently.
>> 
>> It fixes a reverse condition in teo_update() (first patch) and quite
>> dramatically simplifies the "intercepts" logic in teo_select() (second
>> patch).
>> 
>
> FWIW nothing note worthy in my tests, as expected.
> Tested-by: Christian Loehle <christian.loehle@arm.com>

Same here.
Most results were within +/- 3%.

I made a mistake and didn't include one patch, and had to go back and re-do
many tests.

No need to read further unless you want to.

System:
Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz, 6 cores 12 CPUs.
CPU frequency scaling driver: intel_pstate; Governor performance.
HWP: Enabled.

Kernel: 6.18-rc4, to be able to re-use some previous test results.

Patch legend:
0ed8b4d13e03 (HEAD -> ver-2) cpuidle: governors: teo: Rework the handling of tick wakeups   <<< ver3
2ae756ea9507 sched/idle: disable tick in idle=poll idle entry    <<< ver2
5a0b904e467b cpuidle: governors: teo: Simplify intercepts-based state lookup
390ef7a94166 cpuidle: governors: teo: Fix tick_intercepts handling in teo_update()
fcfd905b2d92 cpuidle: governors: teo: Decay metrics below DECAY_SHIFT threshold   <<< rc4-rjw
870005165dd8 cpuidle: governors: teo: Use s64 consistently in teo_update()
538e62c2304b cpuidle: governors: teo: Drop redundant function parameter
4c5e9f0c553b cpuidle: governors: teo: Drop incorrect target residency check
6146a0f1dfae (tag: v6.18-rc4, origin/master, origin/HEAD, master) Linux 6.18-rc4   <<< rc4-teo

Interesting result 1: 6 ping pong pairs, no work per token stop, 105,000,000 loops:
rc4-teo:	Average: 5.53143 uSec per loop
rc4-rjw:	Average: 5.50305 uSec per loop -0.51% (Better)
ver2:	Average: 4.71545 uSec per loop -14.75%
ver3:	Average: 4.75843 uSec per loop -13.97%
Idle state 0 residency goes from about <1% for the first 2 tests to about 18% for the last 2 tests.
Idle state 1 residency goes from about 27% for the first 2 tests to about 0% for the last 2 tests.

Interesting result 2: 12 CPU token passing ring with a specific work packet per token stop:
Times are uSec/loop.

rc4-teo:
test runs: 286
average	1407.054587 reference
max	1504.4721
min	1191.2274
range	313.2447

rc4-rjw:
test runs: 606
average	1436.159564 +2.07% (worse)
max	1528.4379
min	1197.0521
range	331.3858

ver2:
test runs: 256
average	1477.551961 +5.01% (worse)
max	1537.4085
min	1295.7111
range	241.6974

ver3:
test runs: 156
average	1161.19326 -17.47% (better)
max	1241.5409
min	1152.9638
range	88.5771 

Idle state 1 Residency is about 9% for ver3 verses 3.5% for ver2
" cpuidle: governors: teo: Rework the handling of tick wakeups" pushes out the idle state transition as a function of the work packet per stop.
This operating point was cherry picked from a test that varied work per token stop over a wide range.

Interesting result 3: adrestia test:
While the actual test results were uneventful, looking at the tail of the wakeup time histogram revealed a tightening with ver3.
Such an improvement wouldn't show up in the 90th percentile, witch is declared as the wakeup latency.
See the attached detail graph.

For completeness, results (times are uSec):
rc4-teo-2.txt:	entries:	2000000	min: 2417	ave: 2655	90th percentile:	2739	max:	347648
rc4-teo.txt:	entries:	2000000	min: 2433	ave: 2649	90th percentile:	2745	max:	12339
ver2.txt:	entries:	2000000	min: 2302	ave: 2630	90th percentile:	2724	max:	12382
ver3.txt		entries:	2000000	min: 2450	ave: 2652	90th percentile:	2727	max:	12297
ver3-2.txt:	entries:	2000000	min: 2409	ave: 2668	90th percentile:	2751	max:	13663
ver3-3.txt:	entries:	2000000	min: 2391	ave: 2644	90th percentile:	2714	max:	11554

... Doug


[-- Attachment #2: histogram-detail-b.png --]
[-- Type: image/png, Size: 70070 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-12-02 17:54 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-16 12:31 [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Rafael J. Wysocki
2025-11-16 12:34 ` [PATCH v1 1/2] cpuidle: governors: teo: Fix tick_intercepts handling in teo_update() Rafael J. Wysocki
2025-11-17  9:06   ` Christian Loehle
2025-11-17 16:15     ` Rafael J. Wysocki
2025-11-16 12:35 ` [PATCH v1 2/2] cpuidle: governors: teo: Simplify intercepts-based state lookup Rafael J. Wysocki
2025-11-19 21:13   ` Rafael J. Wysocki
2025-11-20  8:45   ` Christian Loehle
2025-11-20 11:07     ` Rafael J. Wysocki
2025-11-28 23:45 ` [PATCH v1 0/2] cpuidle: governors: teo: Fix and simplification Christian Loehle
2025-12-02 17:54   ` Doug Smythies

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox