interrupt latency while resuming.

linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* interrupt latency while resuming.
@ 2011-01-08 16:58 Abhijeet Dharmapurikar
  2011-01-08 17:13 ` Russell King - ARM Linux
  0 siblings, 1 reply; 4+ messages in thread
From: Abhijeet Dharmapurikar @ 2011-01-08 16:58 UTC (permalink / raw)
  To: tglx
  Cc: Rafael J. Wysocki, Andrew Morton, Russell King,
	Greg Kroah-Hartman, Alan Stern, linux-kernel, linux-arm-msm

I am trying to address an issue of not handling a wakeup interrupt quick 
enough while resuming. It is an edge triggered interrupt with the 
IRQ_WAKEUP flag set. The interrupt controller implements lazy disabling 
of interrupts, IOW does not have a disable callback in the irq_chip.

So while going in to supend that interrupt is marked  IRQ_DISABLED in 
dpm_suspend_noirq().

On resume handle_edge_trigger is run right after 
arch_suspend_enable_irqs(). It finds the interrupt marked  IRQ_DISABLED 
and it sets the IRQ_PENDING flag and does not call the handler.

As the resume path unrolls, non boot cpus  are enabled, 
dpm_resume_noirq() is run. At that time it finds the IRQ_PENDING flag is 
set on this interrupt and the interrupt handler is run.

The problem is, this is very late for the interrupt to be run. Possibly 
because enable_nonboot_cpu takes a while or the resume_noirq callbacks 
take a long time etc.

I tried using IRQF_NO_SUSPEND for that interrupt and it seems the 
interrupt is handled as soon as arch_suspend_enable_irqs() is run.
However I suspect that with this the system will fail to abort suspend - 
The interrupt could trigger between dpm_suspend_noirq() and 
arch_suspend_disable_irqs() and since it wont be marked IRQF_PENDING the 
system goes to suspend never to be woken up again. (the check to abort 
suspend because of a pending interrupt is done in check_wakeup_irqs() in 
sysdev_suspend). I dont think IRQF_NO_SUSPEND was designed for wakeup 
interrupts. Please correct me if I am missing something here.

A solution that comes to mind is enabling such interrupts right before 
arch_suspend_enable_irqs() is run. In some more detail, mark these 
interrupts as IRQF_LOW_SUSPEND_LATENCY in their irq_desc->status and 
enable such interrupts before doing arch_suspend_enable_irqs(). That way 
when arch_suspend_enable_irqs() happens, the handler is run immediately.
We skip enabling such interrupts in resume_device_irqs() to avoid 
enabling them twice.

Will appreciate any other suggestions towards fixing the delay.

Abhijeet

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: interrupt latency while resuming.
  2011-01-08 16:58 interrupt latency while resuming Abhijeet Dharmapurikar
@ 2011-01-08 17:13 ` Russell King - ARM Linux
  2011-01-08 19:09   ` Abhijeet Dharmapurikar
  2011-01-11  5:11   ` Stephen Boyd
  0 siblings, 2 replies; 4+ messages in thread
From: Russell King - ARM Linux @ 2011-01-08 17:13 UTC (permalink / raw)
  To: Abhijeet Dharmapurikar
  Cc: tglx, Rafael J. Wysocki, Andrew Morton, Greg Kroah-Hartman,
	Alan Stern, linux-kernel, linux-arm-msm

On Sat, Jan 08, 2011 at 08:58:57AM -0800, Abhijeet Dharmapurikar wrote:
>
> I am trying to address an issue of not handling a wakeup interrupt quick  
> enough while resuming. It is an edge triggered interrupt with the  
> IRQ_WAKEUP flag set. The interrupt controller implements lazy disabling  
> of interrupts, IOW does not have a disable callback in the irq_chip.
>
> So while going in to supend that interrupt is marked  IRQ_DISABLED in  
> dpm_suspend_noirq().
>
> On resume handle_edge_trigger is run right after  
> arch_suspend_enable_irqs(). It finds the interrupt marked  IRQ_DISABLED  
> and it sets the IRQ_PENDING flag and does not call the handler.
>
> As the resume path unrolls, non boot cpus  are enabled,  
> dpm_resume_noirq() is run. At that time it finds the IRQ_PENDING flag is  
> set on this interrupt and the interrupt handler is run.

The reason it's after non-boot CPUs are enabled is that the affinity
settings may require it to be run on one of these CPUs, so we need to
ensure that those CPUs are up.

We know it takes something like 200ms to run the lpg calibration, and
this is responsible for the majority of the time when bringing a CPU
online.

One solution is to sort out whether we need to re-run the lpj calibration
each time we resume a CPU.  At the moment, we're just using the boot CPU
loops_per_jiffy for every udelay(), assuming that all CPUs are running
at the same clock rate - so we _could_ skip the calibration as its
never used.

Or we could fix things so that it is used (which also sorts udelay for
asymetrically clocked secondary CPUs) but then we always have to re-run
the calibration, or we have to stipulate that secondary CPUs come up at
the same clock rate as they went down.  I suspect there's systems out
there where the latter is not true at present, so it's not something we
can impose - which means we're stuck having to do 200ms of recalibration
per CPU.

Another solution is to check whether we can run the handler before other
CPUs are online - it may mean we're breaking IRQ affinity, but any
non-boot affinity has already been broken by taking the other CPUs
offline.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: interrupt latency while resuming.
  2011-01-08 17:13 ` Russell King - ARM Linux
@ 2011-01-08 19:09   ` Abhijeet Dharmapurikar
  2011-01-11  5:11   ` Stephen Boyd
  1 sibling, 0 replies; 4+ messages in thread
From: Abhijeet Dharmapurikar @ 2011-01-08 19:09 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: tglx, Rafael J. Wysocki, Andrew Morton, Greg Kroah-Hartman,
	Alan Stern, linux-kernel, linux-arm-msm

On 01/08/2011 09:13 AM, Russell King - ARM Linux wrote:
> On Sat, Jan 08, 2011 at 08:58:57AM -0800, Abhijeet Dharmapurikar wrote:
>>
>> I am trying to address an issue of not handling a wakeup interrupt quick
>> enough while resuming. It is an edge triggered interrupt with the
>> IRQ_WAKEUP flag set. The interrupt controller implements lazy disabling
>> of interrupts, IOW does not have a disable callback in the irq_chip.
>>
>> So while going in to supend that interrupt is marked  IRQ_DISABLED in
>> dpm_suspend_noirq().
>>
>> On resume handle_edge_trigger is run right after
>> arch_suspend_enable_irqs(). It finds the interrupt marked  IRQ_DISABLED
>> and it sets the IRQ_PENDING flag and does not call the handler.
>>
>> As the resume path unrolls, non boot cpus  are enabled,
>> dpm_resume_noirq() is run. At that time it finds the IRQ_PENDING flag is
>> set on this interrupt and the interrupt handler is run.

> Another solution is to check whether we can run the handler before other
> CPUs are online - it may mean we're breaking IRQ affinity, but any
> non-boot affinity has already been broken by taking the other CPUs
> offline.

Agree. I think we are breaking affinity anyways by not setting the 
interrupt targets to pre disable_nonboot_cpus().

We can say that for interrupts marked IRQF_LOW_SUSPEND_LATENCY we may 
not be  honoring the affinity. They *will* run on the boot cpu, even if 
their affinity is set to others upon resume. Driver authors should 
excercise caution wrt to affinity for such interrupts.
For all the other interrupts we can restore the targets to pre 
disable_nonboot_cpu state. But this is a different problem.

Do you agree?

I will test the proposal of IRQF_LOW_SUSPEND_LATENCY and push out a 
patch soon.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: interrupt latency while resuming.
  2011-01-08 17:13 ` Russell King - ARM Linux
  2011-01-08 19:09   ` Abhijeet Dharmapurikar
@ 2011-01-11  5:11   ` Stephen Boyd
  1 sibling, 0 replies; 4+ messages in thread
From: Stephen Boyd @ 2011-01-11  5:11 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Abhijeet Dharmapurikar, tglx, Rafael J. Wysocki, Andrew Morton,
	Greg Kroah-Hartman, Alan Stern, linux-kernel, linux-arm-msm

On 01/08/2011 09:13 AM, Russell King - ARM Linux wrote:
>
> We know it takes something like 200ms to run the lpg calibration, and
> this is responsible for the majority of the time when bringing a CPU
> online.
>
> One solution is to sort out whether we need to re-run the lpj calibration
> each time we resume a CPU.  At the moment, we're just using the boot CPU
> loops_per_jiffy for every udelay(), assuming that all CPUs are running
> at the same clock rate - so we _could_ skip the calibration as its
> never used.
>
> Or we could fix things so that it is used (which also sorts udelay for
> asymetrically clocked secondary CPUs) but then we always have to re-run
> the calibration, or we have to stipulate that secondary CPUs come up at
> the same clock rate as they went down.  I suspect there's systems out
> there where the latter is not true at present, so it's not something we
> can impose - which means we're stuck having to do 200ms of recalibration
> per CPU.

We're actually using a timer based udelay() (which uses
calibrate_delay_direct() instead of the more expensive delay loop) and
thus lpj is calculated in about 50ms. You raise an interesting point
though since we don't need to recalibrate as our timesource is constant.
It would be great to skip the recalibration in such a situation and
speed up onlining CPUs.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-01-11  5:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-01-08 16:58 interrupt latency while resuming Abhijeet Dharmapurikar
2011-01-08 17:13 ` Russell King - ARM Linux
2011-01-08 19:09   ` Abhijeet Dharmapurikar
2011-01-11  5:11   ` Stephen Boyd

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).