All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: tglx@linutronix.de, peterz@infradead.org, rafael@kernel.org,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	vincent.guittot@linaro.org
Subject: Re: [RFC PATCH 2/2] sched: idle: IRQ based next prediction for idle period
Date: Thu, 7 Jan 2016 16:42:58 +0100	[thread overview]
Message-ID: <568E8782.9020401@linaro.org> (raw)
In-Reply-To: <alpine.LFD.2.20.1601061156120.1652@knanqh.ubzr>

On 01/06/2016 06:40 PM, Nicolas Pitre wrote:
> On Wed, 6 Jan 2016, Daniel Lezcano wrote:
>
>> Many IRQs are quiet most of the time, or they tend to come in bursts of
>> fairly equal time intervals within each burst. It is therefore possible
>> to detect those IRQs with stable intervals and guestimate when the next
>> IRQ event is most likely to happen.
>>
>> Examples of such IRQs may include audio related IRQs where the FIFO size
>> and/or DMA descriptor size with the sample rate create stable intervals,
>> block devices during large data transfers, etc.  Even network streaming
>> of multimedia content creates patterns of periodic network interface IRQs
>> in some cases.
>>
>> This patch adds code to track the mean interval and variance for each IRQ
>> over a window of time intervals between IRQ events. Those statistics can
>> be used to assist cpuidle in selecting the most appropriate sleep state
>> by predicting the most likely time for the next interrupt.
>>
>> Because the stats are gathered in interrupt context, the core computation
>> is as light as possible.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>
> There are still a few problems with this patch.
>
> Please see comments below.

[ ... ]

>> +/**
>> + * stats_variance - compute the variance
>> + *
>> + * @s: the statistic structure
>> + *
>> + * Returns an u64 corresponding to the variance, or zero if there is
>> + * no data
>> + */
>> +static u64 stats_variance(struct stats *s, u32 mean)
>> +{
>> +	int i;
>> +	u64 variance = 0;
>> +
>> +	/*
>> +	 * The variance is the sum of the squared difference to the
>> +	 * average divided by the number of elements.
>> +	 */
>> +	for (i = 0; i < STATS_NR_VALUES; i++) {
>> +		s32 diff = s->values[i] - mean;
>> +		variance += (u64)diff * diff;
>
> Strictly speaking, diff should be casted to s64 as it is a signed value
> that may actually be negative.  Because of the strange C type promotion
> rules, the generated code appears correct (at least on ARM), but it
> would be clearer to use s64 anyway.

I don't get the connection in your explanation of why it should be a 
s64. It is already a signed s32, s->values[] are s32 and mean is u32.

What would be the benefit to convert diff to s64 ?

> The product will end up being positive in all cases of course so
> variance may remain as a u64.
>

[ ... ]

>> +static ktime_t next_irq_event(void)
>> +{
>> +	unsigned int irq, cpu = raw_smp_processor_id();
>> +	ktime_t diff, next, min = { .tv64 = KTIME_MAX };
>
> To avoid exposing the ktime_t definition, you should use
> ktime_set(KTIME_SEC_MAX, 0) instead of { .tv64 = KTIME_MAX }.

Ok.

>> +	struct wakeup *w;
>> +	u32 interval, mean;
>> +	u64 variance;
>> +
>> +	/*
>> +	 * Lookup the interrupt array for this cpu and search for the
>> +	 * earlier expected interruption.
>> +	 */
>> +        for (irq = 0; irq < NR_IRQS; irq++) {
>> +
>> +		ktime_t now = ktime_get();
>
> ktime_get() is potentially non-trivial. It should be plenty good enough
> to call it only once outside the loop.

Ok.

>> +		w = per_cpu(wakeups[irq], cpu);
>> +
>> +		/*
>> +		 * The interrupt was not setup as a source of a wakeup
>> +		 * or the wakeup source is not considered at this
>> +		 * moment stable enough to do a prediction.
>> +		 */
>> +		if (!w)
>> +			continue;
>> +
>> +		/*
>> +		 * No statistics available yet.
>> +		 */
>> +		if (ktime_equal(w->timestamp, ktime_set(0, 0)))
>> +			continue;
>> +
>> +		diff = ktime_sub(now, w->timestamp);
>> +
>> +		/*
>> +		 * There is no point attempting predictions on interrupts more
>> +		 * than 1 second apart. This has no benefit for sleep state
>> +		 * selection and increases the risk of overflowing our variance
>> +		 * computation. Reset all stats in that case.
>> +		 */
>> +		if (unlikely(ktime_after(diff, ktime_set(1, 0)))) {
>> +			stats_reset(&w->stats);
>> +			continue;
>> +		}
>
> The above is wrong. It is not computing the interval between successive
> interruts but rather the interval between the last interrupt occurrence
> and the present time (i.e. when we're about to go idle).  This won't
> prevent interrupt intervals greater than one second from being summed
> and potentially overflowing the variance if this code is executed less
> than a second after one such IRQ interval.  This test should rather be
> performed in sched_idle_irq().

Ok, I will move it.

>> +		 * If the mean value is null, just ignore this wakeup
>> +		 * source.
>> +		 */
>> +		mean = stats_mean(&w->stats);
>> +		if (!mean)
>> +			continue;
>> +
>> +		variance = stats_variance(&w->stats, mean);
>> +		/*
>> +		 * We want to check the last interval is:
>> +		 *
>> +		 *  mean - stddev < interval < mean + stddev
>> +		 *
>> +		 * That simplifies to:
>> +		 *
>> +		 * -stddev < interval - mean < stddev
>> +		 *
>> +		 * abs(interval - mean) < stddev
>> +		 *
>> +		 * The standard deviation is the sqrt of the variance:
>> +		 *
>> +		 * abs(interval - mean) < sqrt(variance)
>> +		 *
>> +		 * and we want to prevent to do an sqrt, so we square
>> +		 * the equation:
>> +		 *
>> +		 * (interval - mean)^2 < variance
>> +		 *
>> +		 * So if the latest value of the stats complies with
>> +		 * this condition, then the wakeup source is
>> +		 * considered predictable and can be used to predict
>> +		 * the next event.
>> +		 */
>> +		interval = w->stats.values[(w->stats.w_ptr + 1) % STATS_NR_VALUES];
>
> But here (w->stats.w_ptr + 1) % STATS_NR_VALUES does not point at the
> last interval. It rather points at the second oldest.
> To make things simpler, you should rather pre-increment the pointer
> before updating the array in stats_add(), and here the value you want
> will directly be accessible with w->stats.values[w->stats.w_ptr].

Yeah, there is a glitch here. I will look at it.

>> +		if ((u64)((interval - mean) * (interval - mean)) > variance)
>> +			continue;
>
> Same comment as in stats_variance(): this should be s64.
>
>> +		/*
>> +		 * Let's compute the next event: the wakeup source is
>> +		 * considered predictable, we add the average interval
>> +		 * time added to the latest interruption event time.
>> +		 */
>> +		next = ktime_add_us(w->timestamp, stats_mean(&w->stats));
>> +
>> +		/*
>> +		 * If the interrupt is supposed to happen before the
>> +		 * minimum time, then it becomes the minimum.
>> +		 */
>> +		if (ktime_before(next, min))
>> +			min = next;
>> +	}
>> +
>> +	/*
>> +	 * At this point, we have our prediction but the caller is
>> +	 * expecting the remaining time before the next event, so
>> +	 * compute the expected sleep length.
>> +	 */
>> +	diff = ktime_sub(min, ktime_get());
>
> You should use the variable 'now' rather than asking for the current
> time again.

Yep.

>> +	/*
>> +	 * The result could be negative for different reasons:
>> +	 *  - the prediction is incorrect
>> +	 *  - the prediction was too near now and expired while we were
>> +	 *    in this function
>> +	 *
>> +	 * In both cases, we return KTIME_MAX as a failure to do a
>> +	 * prediction
>> +	 */
>> +	if (ktime_compare(diff, ktime_set(0, 0)) <= 0)
>> +		return (ktime_t){ .tv64 = KTIME_MAX };
>
> See my comment about ktime_t internals at the beginning of this
> function.

Ok.

Thanks for the review.

   -- Daniel

-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


  reply	other threads:[~2016-01-07 15:43 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-06 15:22 [RFC PATCH 0/2] IRQ based next prediction Daniel Lezcano
2016-01-06 15:22 ` [RFC PATCH 1/2] irq: Add a framework to measure interrupt timings Daniel Lezcano
2016-01-08 15:31   ` Thomas Gleixner
2016-01-12 11:42     ` Daniel Lezcano
2016-01-06 15:22 ` [RFC PATCH 2/2] sched: idle: IRQ based next prediction for idle period Daniel Lezcano
2016-01-06 17:40   ` Nicolas Pitre
2016-01-07 15:42     ` Daniel Lezcano [this message]
2016-01-12 19:27       ` Nicolas Pitre
2016-01-10 22:37     ` Daniel Lezcano
2016-01-10 22:46       ` Nicolas Pitre
2016-01-10 22:58         ` Daniel Lezcano
2016-01-10 23:13           ` Nicolas Pitre
2016-01-08 15:43   ` Thomas Gleixner
2016-01-12 12:41     ` Daniel Lezcano
2016-01-12 13:42       ` Thomas Gleixner
2016-01-12 14:16         ` Daniel Lezcano
2016-01-12 14:26           ` Thomas Gleixner
2016-01-12 14:52             ` Daniel Lezcano
2016-01-12 15:12               ` Thomas Gleixner
2016-01-12 16:04                 ` Daniel Lezcano
2016-01-13  9:17                   ` Thomas Gleixner
2016-01-18 13:21     ` Daniel Lezcano
2016-01-20 15:41       ` Thomas Gleixner
2016-01-20 16:00         ` [RFC V2 0/2] IRQ based next prediction Daniel Lezcano
2016-01-20 16:00           ` [RFC V2 1/2] irq: Add a framework to measure interrupt timings Daniel Lezcano
2016-01-20 17:55             ` Thomas Gleixner
2016-01-21  9:25               ` Daniel Lezcano
2016-01-21 10:27                 ` Thomas Gleixner
2016-01-20 19:07             ` Peter Zijlstra
2016-01-20 19:57               ` Thomas Gleixner
2016-01-20 20:04                 ` Nicolas Pitre
2016-01-20 20:20                 ` Peter Zijlstra
2016-01-20 20:22                   ` Thomas Gleixner
2016-01-21  9:50                 ` Daniel Lezcano
2016-01-21 10:08                   ` Peter Zijlstra
2016-01-21 12:38                     ` Daniel Lezcano
2016-01-21 20:27                     ` Thomas Gleixner
2016-01-21 13:52                   ` Thomas Gleixner
2016-01-21 14:19                     ` Daniel Lezcano
2016-01-21 18:56                       ` Thomas Gleixner
2016-01-22 10:15                         ` Peter Zijlstra
2016-01-21  9:26               ` Daniel Lezcano
2016-01-20 19:28             ` Peter Zijlstra
2016-01-21  9:53               ` Daniel Lezcano
2016-01-20 16:00           ` [RFC V2 2/2] sched: idle: IRQ based next prediction for idle period Daniel Lezcano
2016-01-20 17:46             ` Nicolas Pitre
2016-01-20 18:44               ` Peter Zijlstra
2016-01-21 10:03               ` Daniel Lezcano
2016-01-20 19:02             ` Peter Zijlstra
2016-01-20 19:17               ` Nicolas Pitre
2016-01-20 19:29                 ` Peter Zijlstra
2016-01-20 19:34             ` Peter Zijlstra
2016-01-20 19:40             ` Peter Zijlstra
2016-01-20 19:57               ` Nicolas Pitre
2016-01-20 20:22                 ` Peter Zijlstra
2016-01-20 19:49             ` Thomas Gleixner
2016-01-21 13:54               ` Daniel Lezcano
2016-01-21 14:12                 ` Thomas Gleixner
2016-01-20 16:00           ` [RFC V2 0/2] IRQ based next prediction Daniel Lezcano
2016-01-20 16:00           ` [RFC V2 1/2] irq: Add a framework to measure interrupt timings Daniel Lezcano
2016-01-20 16:00           ` [RFC V2 2/2] sched: idle: IRQ based next prediction for idle period Daniel Lezcano
2016-01-20 20:14             ` Nicolas Pitre
2016-01-21 13:04               ` Daniel Lezcano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=568E8782.9020401@linaro.org \
    --to=daniel.lezcano@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=nicolas.pitre@linaro.org \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.