All of lore.kernel.org
 help / color / mirror / Atom feed
* [Adeos-main] PATCH: adeos handle_event losing events
@ 2005-08-31 11:54 Heikki Lindholm
  2005-08-31 12:35 ` Philippe Gerum
  0 siblings, 1 reply; 4+ messages in thread
From: Heikki Lindholm @ 2005-08-31 11:54 UTC (permalink / raw)
  To: adeos-main; +Cc: rtai-dev, Philippe Gerum

[-- Attachment #1: Type: text/plain, Size: 989 bytes --]

Hello,

While processing __adeos_handle_event enables interrupts for the ectual 
event handler periods. Interrupts might happen then and only logged for 
domains lower than the current domain in the __adeos_handle_event loop. 
The logged irqs would normally get replayed at the next real interrupt 
if the domain that caused the event was lower than the ones the 
interrupts were logged in or if it wasn't lower, at the next 
suspend_domain. On powersave-enabled ppc machines this might cause a 
chain where the cpu goes napping and wakes at the next timer tick which, 
having possibly been missed at *handle_event, happens after maximum 
decrementer period (~minutes).

The following patch seems to help in the described case, but I've also 
observed another case where interrupt seems to get dropped altogether. 
Performance didn't seem to suffer much from the patch. If anything, the 
latency/cruncher results got better. This is for the non-threaded 
domains case.

-- Heikki Lindholm

[-- Attachment #2: adeos-handle_event-p1.patch --]
[-- Type: text/plain, Size: 1032 bytes --]

--- kernel/adeos.c.orig	2005-08-31 12:01:30.002707784 +0300
+++ kernel/adeos.c	2005-08-31 13:10:25.148070136 +0300
@@ -196,6 +196,7 @@
     adeos_declare_cpuid;
     adevinfo_t evinfo;
     int propagate = 1;
+    int notfirst = 0;
 
     adeos_lock_cpu(flags);
 
@@ -204,7 +205,20 @@
     list_for_each_safe(pos,npos,&__adeos_pipeline) {
 
     	next_domain = list_entry(pos,adomain_t,p_link);
-
+	/* did the previous unlock_cpu gap cause interrupts here? */
+	if (notfirst && next_domain != adp_root && !test_bit(IPIPE_STALL_FLAG,
+			&next_domain->cpudata[cpuid].status) &&
+			next_domain->cpudata[cpuid].irq_pending_hi != 0)
+		{
+		adp_cpu_current[cpuid] = next_domain;
+		__adeos_sync_stage(IPIPE_IRQMASK_ANY);
+		
+		if (adp_cpu_current[cpuid] != next_domain)
+			/* Something has changed the current domain under our
+			 * feet recycling the register set; take note. */
+			this_domain = adp_cpu_current[cpuid];
+		}
+	
 	if (next_domain->events[event].handler != NULL)
 	    {
 	    adp_cpu_current[cpuid] = next_domain;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Adeos-main] PATCH: adeos handle_event losing events
  2005-08-31 11:54 [Adeos-main] PATCH: adeos handle_event losing events Heikki Lindholm
@ 2005-08-31 12:35 ` Philippe Gerum
  2005-08-31 13:09   ` Philippe Gerum
  2005-08-31 13:46   ` [Adeos-main] PATCH: adeos handle_event losing events *corrected* Heikki Lindholm
  0 siblings, 2 replies; 4+ messages in thread
From: Philippe Gerum @ 2005-08-31 12:35 UTC (permalink / raw)
  To: Heikki Lindholm; +Cc: adeos-main, rtai-dev, Philippe Gerum

Quoting Heikki Lindholm <holindho@domain.hid>:

> Hello,
>
> While processing __adeos_handle_event enables interrupts for the 
> ectual event handler periods. Interrupts might happen then and only 
> logged for domains lower than the current domain in the 
> __adeos_handle_event loop. The logged irqs would normally get 
> replayed at the next real interrupt if the domain that caused the 
> event was lower than the ones the interrupts were logged in or if it 
> wasn't lower, at the next suspend_domain. On powersave-enabled ppc 
> machines this might cause a chain where the cpu goes napping and 
> wakes at the next timer tick which, having possibly been missed at 
> *handle_event, happens after maximum decrementer period (~minutes).
>
> The following patch seems to help in the described case, but I've 
> also observed another case where interrupt seems to get dropped 
> altogether. Performance didn't seem to suffer much from the patch. If 
> anything, the latency/cruncher results got better. This is for the 
> non-threaded domains case.

Good spot, thanks. I'd also suggest a different implementation to solve this,
still in adeos_handle_event() though:

- you only need to sync the next domain when it does not handle the current
event; if it does, then it would suspend calling adeos_suspend_domain() when
the event is processed, hence switching to the next domain down the pipeline
that has IRQs to process anyway. Additionally, I don't see how notfirst goes
non-zero in this patch, and why we'd need to make some exception case for the
root domain (i.e. you may have domains below it down the pipeline).

- in the !threaded case, calling __adeos_switch_to() instead of changing the
domain descriptor by hand then syncing the stage would be better, since it
would perform all the additional housekeeping chores, like calling the switch
hook and resetting the current descriptor pointer.

>
> -- Heikki Lindholm
>





^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Adeos-main] PATCH: adeos handle_event losing events
  2005-08-31 12:35 ` Philippe Gerum
@ 2005-08-31 13:09   ` Philippe Gerum
  2005-08-31 13:46   ` [Adeos-main] PATCH: adeos handle_event losing events *corrected* Heikki Lindholm
  1 sibling, 0 replies; 4+ messages in thread
From: Philippe Gerum @ 2005-08-31 13:09 UTC (permalink / raw)
  To: Heikki Lindholm, adeos-main, rtai-dev

Quoting Philippe Gerum <rpm@xenomai.org>:

> Quoting Heikki Lindholm <holindho@domain.hid>:
>
>> Hello,
>>
>> While processing __adeos_handle_event enables interrupts for the 
>> ectual event handler periods. Interrupts might happen then and only 
>> logged for domains lower than the current domain in the 
>> __adeos_handle_event loop. The logged irqs would normally get 
>> replayed at the next real interrupt if the domain that caused the 
>> event was lower than the ones the interrupts were logged in or if it 
>> wasn't lower, at the next suspend_domain. On powersave-enabled ppc 
>> machines this might cause a chain where the cpu goes napping and 
>> wakes at the next timer tick which, having possibly been missed at 
>> *handle_event, happens after maximum decrementer period (~minutes).
>>
>> The following patch seems to help in the described case, but I've 
>> also observed another case where interrupt seems to get dropped 
>> altogether. Performance didn't seem to suffer much from the patch. 
>> If anything, the latency/cruncher results got better. This is for 
>> the non-threaded domains case.
>
> Good spot, thanks. I'd also suggest a different implementation to solve this,
> still in adeos_handle_event() though:
>
> - you only need to sync the next domain when it does not handle the current
> event; if it does, then it would suspend calling adeos_suspend_domain() when
> the event is processed, hence switching to the next domain down the pipeline
> that has IRQs to process anyway. Additionally, I don't see how notfirst goes
> non-zero in this patch, and why we'd need to make some exception case for the
> root domain (i.e. you may have domains below it down the pipeline).
>

Actually, what's above is true for the threaded case, but we do need to sync
both cases for the !threaded one.

> - in the !threaded case, calling __adeos_switch_to() instead of changing the
> domain descriptor by hand then syncing the stage would be better, since it
> would perform all the additional housekeeping chores, like calling the switch
> hook and resetting the current descriptor pointer.
>
>>
>> -- Heikki Lindholm
>>
>
>
>
>





^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Adeos-main] PATCH: adeos handle_event losing events *corrected*
  2005-08-31 12:35 ` Philippe Gerum
  2005-08-31 13:09   ` Philippe Gerum
@ 2005-08-31 13:46   ` Heikki Lindholm
  1 sibling, 0 replies; 4+ messages in thread
From: Heikki Lindholm @ 2005-08-31 13:46 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main, rtai-dev

[-- Attachment #1: Type: text/plain, Size: 2507 bytes --]

Philippe Gerum kirjoitti:
> Quoting Heikki Lindholm <holindho@domain.hid>:
> 
>> Hello,
>>
>> While processing __adeos_handle_event enables interrupts for the 
>> ectual event handler periods. Interrupts might happen then and only 
>> logged for domains lower than the current domain in the 
>> __adeos_handle_event loop. The logged irqs would normally get replayed 
>> at the next real interrupt if the domain that caused the event was 
>> lower than the ones the interrupts were logged in or if it wasn't 
>> lower, at the next suspend_domain. On powersave-enabled ppc machines 
>> this might cause a chain where the cpu goes napping and wakes at the 
>> next timer tick which, having possibly been missed at *handle_event, 
>> happens after maximum decrementer period (~minutes).
>>
>> The following patch seems to help in the described case, but I've also 
>> observed another case where interrupt seems to get dropped altogether. 
>> Performance didn't seem to suffer much from the patch. If anything, 
>> the latency/cruncher results got better. This is for the non-threaded 
>> domains case.
> 
> 
> Good spot, thanks. I'd also suggest a different implementation to solve 
> this,
> still in adeos_handle_event() though:
> 
> - you only need to sync the next domain when it does not handle the current
> event; if it does, then it would suspend calling adeos_suspend_domain() 
> when
> the event is processed, hence switching to the next domain down the 
> pipeline
> that has IRQs to process anyway. Additionally, I don't see how notfirst 
> goes
> non-zero in this patch, and why we'd need to make some exception case 
> for the
> root domain (i.e. you may have domains below it down the pipeline).

GRRRRrr! I failed to incorporate that one "notfirst" line from my 
debug-tree to the CVS tree. No wonder it didn't work there anymore - I 
was banging my head to the wall in vain! Thinking that over again, I 
can't think why would the root domain *need* to be excluded, it just 
seemed to save some work in the "normal" fusion case where most events 
get propagated to root, which caused them anyway.

> - in the !threaded case, calling __adeos_switch_to() instead of changing 
> the
> domain descriptor by hand then syncing the stage would be better, since it
> would perform all the additional housekeeping chores, like calling the 
> switch
> hook and resetting the current descriptor pointer.

Yes. A little more work, but looks neater.

Anyway, here's a corrected version.

-- Heikki Lindholm

[-- Attachment #2: adeos-handle_event-p2.patch --]
[-- Type: text/plain, Size: 1005 bytes --]

--- kernel/adeos.c.orig	2005-08-31 12:01:30.002707784 +0300
+++ kernel/adeos.c	2005-08-31 16:33:06.496264192 +0300
@@ -196,6 +196,7 @@
     adeos_declare_cpuid;
     adevinfo_t evinfo;
     int propagate = 1;
+    int notfirst = 0;
 
     adeos_lock_cpu(flags);
 
@@ -204,11 +205,21 @@
     list_for_each_safe(pos,npos,&__adeos_pipeline) {
 
     	next_domain = list_entry(pos,adomain_t,p_link);
-
+	/* did the previous unlock_cpu gap cause interrupts here? */
+	if (notfirst && !test_bit(IPIPE_STALL_FLAG,
+			&next_domain->cpudata[cpuid].status) &&
+			next_domain->cpudata[cpuid].irq_pending_hi != 0)
+		{
+		__adeos_switch_to(this_domain,next_domain,cpuid);
+		/* in case the domain was changed */
+		this_domain = adp_cpu_current[cpuid]; 
+		}
+	
 	if (next_domain->events[event].handler != NULL)
 	    {
 	    adp_cpu_current[cpuid] = next_domain;
 	    evinfo.domid = start_domain->domid;
+	    notfirst = 1;
 	    adeos_unlock_cpu(flags);
 	    evinfo.event = event;
 	    evinfo.evdata = evdata;

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-08-31 13:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-31 11:54 [Adeos-main] PATCH: adeos handle_event losing events Heikki Lindholm
2005-08-31 12:35 ` Philippe Gerum
2005-08-31 13:09   ` Philippe Gerum
2005-08-31 13:46   ` [Adeos-main] PATCH: adeos handle_event losing events *corrected* Heikki Lindholm

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.