linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] powerpc/time: Sanity check of decrementer expiration is necessary
@ 2012-06-01 10:01 Paul Mackerras
From: Paul Mackerras @ 2012-06-01 10:01 UTC
  To: linuxppc-dev; +Cc: Anton Blanchard

This reverts 68568add2c ("powerpc/time: Remove unnecessary sanity check
of decrementer expiration").  We do need to check whether we have reached
the expiration time of the next event, because we sometimes get an early
decrementer interrupt, most notably when we set the decrementer to 1 in
arch_irq_work_raise().  The effect of not having the sanity check is that
if timer_interrupt() gets called early, we leave the decrementer set to
its maximum value, which means we then don't get any more decrementer
interrupts for about 4 seconds (or longer, depending on timebase
frequency).  I saw these pauses as a consequence of getting a stray
hypervisor decrementer interrupt left over from exiting a KVM guest.
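
As a rough check on the "about 4 seconds" figure, here is a small
user-space sketch of the arithmetic.  It assumes the maximum decrementer
value is 0x7fffffff (the largest positive 32-bit value) and uses a
512 MHz timebase purely as an example; the helper name
max_dec_stall_seconds() is made up for illustration and is not a kernel
function.

```c
#include <stdint.h>

/* Assumed for this sketch: largest positive value a 32-bit decrementer holds. */
#define DECREMENTER_MAX 0x7fffffffULL

/*
 * Hypothetical helper (not in the kernel): seconds until a decrementer
 * loaded with DECREMENTER_MAX counts down to zero, given the timebase
 * frequency in Hz.
 */
static double max_dec_stall_seconds(uint64_t tb_hz)
{
	return (double)DECREMENTER_MAX / (double)tb_hz;
}
```

With tb_hz = 512000000 this gives roughly 4.19 seconds; a slower
timebase gives a proportionally longer stall.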

This isn't quite a straight revert because of changes to the surrounding
code, but it restores the same algorithm as was previously used.

Cc: stable@kernel.org
Cc: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
If there are no objections, I'll send this to Linus shortly.  This
regression is present in 3.3 and 3.4 as well as current upstream.
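
For anyone who wants to poke at the logic outside the kernel, the
following is a minimal user-space model of the check this patch
restores.  It is only a sketch: struct sim_timer and
sim_timer_interrupt() are hypothetical stand-ins, the event handler is
reduced to a counter, and DECREMENTER_MAX is assumed to be 0x7fffffff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: largest positive 32-bit decrementer value. */
#define DECREMENTER_MAX 0x7fffffffULL

/* Hypothetical stand-in for the per-CPU timer state. */
struct sim_timer {
	uint64_t next_tb;	/* timebase value at which the next event is due */
	int64_t dec;		/* last value written to the simulated decrementer, -1 if none */
	unsigned int events_run;	/* number of times the event handler "ran" */
};

/*
 * Models the if/else added by the patch.  Returns true if the event
 * handler ran, false if the interrupt was early and the decrementer
 * was re-armed (or left alone) instead.
 */
static bool sim_timer_interrupt(struct sim_timer *t, uint64_t now)
{
	uint64_t remaining;

	if (now >= t->next_tb) {
		/* Event is due: mark it consumed and run the handler. */
		t->next_tb = ~(uint64_t)0;
		t->events_run++;
		return true;
	}

	/* Early interrupt: re-arm for the time still remaining. */
	remaining = t->next_tb - now;
	if (remaining <= DECREMENTER_MAX)
		t->dec = (int64_t)remaining;
	return false;
}
```

Calling it with now = 400 and next_tb = 1000 re-arms the simulated
decrementer with 600 and leaves next_tb untouched; calling it again
with now = 1000 runs the handler.  Without the check, the first call
would have consumed the event and left nothing armed.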

 arch/powerpc/kernel/time.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 99a995c..be171ee 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -475,6 +475,7 @@ void timer_interrupt(struct pt_regs * regs)
 	struct pt_regs *old_regs;
 	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
 	struct clock_event_device *evt = &__get_cpu_var(decrementers);
+	u64 now;
 
 	/* Ensure a positive value is written to the decrementer, or else
 	 * some CPUs will continue to take decrementer exceptions.
@@ -509,9 +510,16 @@ void timer_interrupt(struct pt_regs * regs)
 		irq_work_run();
 	}
 
-	*next_tb = ~(u64)0;
-	if (evt->event_handler)
-		evt->event_handler(evt);
+	now = get_tb_or_rtc();
+	if (now >= *next_tb) {
+		*next_tb = ~(u64)0;
+		if (evt->event_handler)
+			evt->event_handler(evt);
+	} else {
+		now = *next_tb - now;
+		if (now <= DECREMENTER_MAX)
+			set_dec((int)now);
+	}
 
 #ifdef CONFIG_PPC64
 	/* collect purr register values often, for accurate calculations */
-- 
1.7.10


* Re: [PATCH] powerpc/time: Sanity check of decrementer expiration is necessary
@ 2012-06-04  2:31 Anton Blanchard
From: Anton Blanchard @ 2012-06-04  2:31 UTC
  To: Paul Mackerras; +Cc: linuxppc-dev

Hi Paul,

> This reverts 68568add2c ("powerpc/time: Remove unnecessary sanity
> check of decrementer expiration").  We do need to check whether we
> have reached the expiration time of the next event, because we
> sometimes get an early decrementer interrupt, most notably when we
> set the decrementer to 1 in arch_irq_work_raise().  The effect of not
> having the sanity check is that if timer_interrupt() gets called
> early, we leave the decrementer set to its maximum value, which means
> we then don't get any more decrementer interrupts for about 4 seconds
> (or longer, depending on timebase frequency).  I saw these pauses as
> a consequence of getting a stray hypervisor decrementer interrupt
> left over from exiting a KVM guest.

Urgh, sorry for that mess.

Acked-by: Anton Blanchard <anton@samba.org>

Anton

