All of lore.kernel.org
 help / color / mirror / Atom feed
* [Adeos-main] [pull request] x86: critical fix for native_safe_halt
@ 2009-12-12 21:37 Jan Kiszka
  2009-12-13 17:02 ` Philippe Gerum
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2009-12-12 21:37 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

[-- Attachment #1: Type: text/plain, Size: 3550 bytes --]

Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
cpu_idle on x86: We failed to check the pipeline log before issuing the
real hlt. This caused IRQ latencies or even drops for Linux,
specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
ipipe_critical_enter to deadlock frequently enough.

The first patch of this series fixes this (see below), the second one
simply removes the two useless ipipe_suspend_domain calls.


The following changes since commit 7cd56451b429702996adef794f01f1067e5fff51:
  Philippe Gerum (1):
        ipipe-2.6.31.1-x86-2.4-08

are available in the git repository at:

  git://git.kiszka.org/ipipe-2.6 queues/2.6.31-x86

Jan Kiszka (2):
      x86: Properly virtualize native_safe_halt
      x86: Drop redundant ipipe_suspend_domain from cpu_idle

 arch/x86/include/asm/ipipe_base.h |    2 ++
 arch/x86/include/asm/irqflags.h   |    8 +++++---
 arch/x86/kernel/ipipe.c           |   23 +++++++++++++++++++++++
 arch/x86/kernel/process_32.c      |    2 --
 arch/x86/kernel/process_64.c      |    3 ---
 5 files changed, 30 insertions(+), 8 deletions(-)

------

x86: Properly virtualize native_safe_halt

We have to check the root domain's log before halting. If it is not
empty, we need to replay and return immediately. Otherwise we risk to
defer or even lose root domain interrupts.

Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
---
 arch/x86/include/asm/ipipe_base.h |    2 ++
 arch/x86/include/asm/irqflags.h   |    8 +++++---
 arch/x86/kernel/ipipe.c           |   23 +++++++++++++++++++++++
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/ipipe_base.h b/arch/x86/include/asm/ipipe_base.h
index c3bd7d1..3c81096 100644
--- a/arch/x86/include/asm/ipipe_base.h
+++ b/arch/x86/include/asm/ipipe_base.h
@@ -203,6 +203,8 @@ static inline unsigned long __ipipe_test_root(void)
 
 #endif /* !CONFIG_SMP */
 
+void __ipipe_halt_root(void);
+
 void __ipipe_serial_debug(const char *fmt, ...);
 
 #endif	/* !__ASSEMBLY__ */
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 71ede94..1baceba 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -72,10 +72,12 @@ static inline void native_irq_enable(void)
 
 static inline void native_safe_halt(void)
 {
-#ifdef CONFIG_IPIPE_TRACE_IRQSOFF
-	ipipe_trace_end(0x8000000E);
-#endif
+#ifdef CONFIG_IPIPE
+	barrier();
+	__ipipe_halt_root();
+#else
 	asm volatile("sti; hlt": : :"memory");
+#endif
 }
 
 static inline void native_halt(void)
diff --git a/arch/x86/kernel/ipipe.c b/arch/x86/kernel/ipipe.c
index d4789c8..487b2b0 100644
--- a/arch/x86/kernel/ipipe.c
+++ b/arch/x86/kernel/ipipe.c
@@ -601,6 +601,29 @@ void __ipipe_preempt_schedule_irq(void)
 
 #endif /* !CONFIG_X86_32 */
 
+void __ipipe_halt_root(void)
+{
+	struct ipipe_percpu_domain_data *p;
+
+	/* Emulate sti+hlt sequence over the root domain. */
+
+	local_irq_disable_hw();
+
+	p = ipipe_root_cpudom_ptr();
+
+	clear_bit(IPIPE_STALL_FLAG, &p->status); 
+
+	if (unlikely(p->irqpend_himask != 0)) {
+		__ipipe_sync_pipeline(IPIPE_IRQMASK_ANY);
+		local_irq_enable_hw();
+	} else {
+#ifdef CONFIG_IPIPE_TRACE_IRQSOFF
+		ipipe_trace_end(0x8000000E);
+#endif /* CONFIG_IPIPE_TRACE_IRQSOFF */
+		asm volatile("sti; hlt": : :"memory");
+	}
+}
+
 static void do_machine_check_vector(struct pt_regs *regs, long error_code)
 {
 #ifdef CONFIG_X86_MCE
-- 
1.6.0.2


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-12 21:37 [Adeos-main] [pull request] x86: critical fix for native_safe_halt Jan Kiszka
@ 2009-12-13 17:02 ` Philippe Gerum
  2009-12-13 17:48   ` Jan Kiszka
  0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2009-12-13 17:02 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> cpu_idle on x86: We failed to check the pipeline log before issuing the
> real hlt. This caused IRQ latencies or even drops for Linux,
> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> ipipe_critical_enter to deadlock frequently enough.
> 
> The first patch of this series fixes this (see below), the second one
> simply removes the two useless ipipe_suspend_domain calls.
> 

What your patch does as well, is killing the ability to run low priority
domains below the root level.

> 
> The following changes since commit 7cd56451b429702996adef794f01f1067e5fff51:
>   Philippe Gerum (1):
>         ipipe-2.6.31.1-x86-2.4-08
> 
> are available in the git repository at:
> 
>   git://git.kiszka.org/ipipe-2.6 queues/2.6.31-x86
> 
> Jan Kiszka (2):
>       x86: Properly virtualize native_safe_halt
>       x86: Drop redundant ipipe_suspend_domain from cpu_idle
> 
>  arch/x86/include/asm/ipipe_base.h |    2 ++
>  arch/x86/include/asm/irqflags.h   |    8 +++++---
>  arch/x86/kernel/ipipe.c           |   23 +++++++++++++++++++++++
>  arch/x86/kernel/process_32.c      |    2 --
>  arch/x86/kernel/process_64.c      |    3 ---
>  5 files changed, 30 insertions(+), 8 deletions(-)
> 
> ------
> 
> x86: Properly virtualize native_safe_halt
> 
> We have to check the root domain's log before halting. If it is not
> empty, we need to replay and return immediately. Otherwise we risk to
> defer or even lose root domain interrupts.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> ---
>  arch/x86/include/asm/ipipe_base.h |    2 ++
>  arch/x86/include/asm/irqflags.h   |    8 +++++---
>  arch/x86/kernel/ipipe.c           |   23 +++++++++++++++++++++++
>  3 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/ipipe_base.h b/arch/x86/include/asm/ipipe_base.h
> index c3bd7d1..3c81096 100644
> --- a/arch/x86/include/asm/ipipe_base.h
> +++ b/arch/x86/include/asm/ipipe_base.h
> @@ -203,6 +203,8 @@ static inline unsigned long __ipipe_test_root(void)
>  
>  #endif /* !CONFIG_SMP */
>  
> +void __ipipe_halt_root(void);
> +
>  void __ipipe_serial_debug(const char *fmt, ...);
>  
>  #endif	/* !__ASSEMBLY__ */
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index 71ede94..1baceba 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -72,10 +72,12 @@ static inline void native_irq_enable(void)
>  
>  static inline void native_safe_halt(void)
>  {
> -#ifdef CONFIG_IPIPE_TRACE_IRQSOFF
> -	ipipe_trace_end(0x8000000E);
> -#endif
> +#ifdef CONFIG_IPIPE
> +	barrier();
> +	__ipipe_halt_root();
> +#else
>  	asm volatile("sti; hlt": : :"memory");
> +#endif
>  }
>  
>  static inline void native_halt(void)
> diff --git a/arch/x86/kernel/ipipe.c b/arch/x86/kernel/ipipe.c
> index d4789c8..487b2b0 100644
> --- a/arch/x86/kernel/ipipe.c
> +++ b/arch/x86/kernel/ipipe.c
> @@ -601,6 +601,29 @@ void __ipipe_preempt_schedule_irq(void)
>  
>  #endif /* !CONFIG_X86_32 */
>  
> +void __ipipe_halt_root(void)
> +{
> +	struct ipipe_percpu_domain_data *p;
> +
> +	/* Emulate sti+hlt sequence over the root domain. */
> +
> +	local_irq_disable_hw();
> +
> +	p = ipipe_root_cpudom_ptr();
> +
> +	clear_bit(IPIPE_STALL_FLAG, &p->status); 
> +
> +	if (unlikely(p->irqpend_himask != 0)) {
> +		__ipipe_sync_pipeline(IPIPE_IRQMASK_ANY);
> +		local_irq_enable_hw();
> +	} else {
> +#ifdef CONFIG_IPIPE_TRACE_IRQSOFF
> +		ipipe_trace_end(0x8000000E);
> +#endif /* CONFIG_IPIPE_TRACE_IRQSOFF */
> +		asm volatile("sti; hlt": : :"memory");
> +	}
> +}
> +
>  static void do_machine_check_vector(struct pt_regs *regs, long error_code)
>  {
>  #ifdef CONFIG_X86_MCE


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-13 17:02 ` Philippe Gerum
@ 2009-12-13 17:48   ` Jan Kiszka
  2009-12-13 18:05     ` Philippe Gerum
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2009-12-13 17:48 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

[-- Attachment #1: Type: text/plain, Size: 1025 bytes --]

Philippe Gerum wrote:
> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
>> cpu_idle on x86: We failed to check the pipeline log before issuing the
>> real hlt. This caused IRQ latencies or even drops for Linux,
>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
>> ipipe_critical_enter to deadlock frequently enough.
>>
>> The first patch of this series fixes this (see below), the second one
>> simply removes the two useless ipipe_suspend_domain calls.
>>
> 
> What your patch does as well, is killing the ability to run low priority
> domains below the root level.

Yes, I'm killing the dream.

I heavily doubt that the functions I removed in the second patch ever
contributed something good to this. It's always the job of the lowest
domain to issue hardware halt, not of some arbitrary mid-prio domain.
Moreover, what would be the practical use for such model in the context
of Linux?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-13 17:48   ` Jan Kiszka
@ 2009-12-13 18:05     ` Philippe Gerum
  2009-12-13 18:19       ` Jan Kiszka
  0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2009-12-13 18:05 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> >> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> >> cpu_idle on x86: We failed to check the pipeline log before issuing the
> >> real hlt. This caused IRQ latencies or even drops for Linux,
> >> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> >> ipipe_critical_enter to deadlock frequently enough.
> >>
> >> The first patch of this series fixes this (see below), the second one
> >> simply removes the two useless ipipe_suspend_domain calls.
> >>
> > 
> > What your patch does as well, is killing the ability to run low priority
> > domains below the root level.
> 
> Yes, I'm killing the dream.
> 
> I heavily doubt that the functions I removed in the second patch ever
> contributed something good to this. It's always the job of the lowest
> domain to issue hardware halt, not of some arbitrary mid-prio domain.
> Moreover, what would be the practical use for such model in the context
> of Linux?

That is _not_ the point. The point is, when submitting a patch, please
make sure to raise all the concerns it might introduce wrt to changing
the base features. I'm not opposed to make the feature set evolve, but I
don't want this to happen "by mistake".

> 
> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-13 18:05     ` Philippe Gerum
@ 2009-12-13 18:19       ` Jan Kiszka
  2009-12-16 11:26         ` Philippe Gerum
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2009-12-13 18:19 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

[-- Attachment #1: Type: text/plain, Size: 1786 bytes --]

Philippe Gerum wrote:
> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
>>>> real hlt. This caused IRQ latencies or even drops for Linux,
>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
>>>> ipipe_critical_enter to deadlock frequently enough.
>>>>
>>>> The first patch of this series fixes this (see below), the second one
>>>> simply removes the two useless ipipe_suspend_domain calls.
>>>>
>>> What your patch does as well, is killing the ability to run low priority
>>> domains below the root level.
>> Yes, I'm killing the dream.
>>
>> I heavily doubt that the functions I removed in the second patch ever
>> contributed something good to this. It's always the job of the lowest
>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
>> Moreover, what would be the practical use for such model in the context
>> of Linux?
> 
> That is _not_ the point. The point is, when submitting a patch, please
> make sure to raise all the concerns it might introduce wrt to changing
> the base features. I'm not opposed to make the feature set evolve, but I
> don't want this to happen "by mistake".

Just pushed

"x86: Drop redundant ipipe_suspend_domain from cpu_idle

Allowing domains below root always required more than these calls (Linux
would have to give up idle management). And syncing the root domain now
takes place in __ipipe_halt_root. So remove these suspension calls."

as commit message for the second patch. Is that what you are looking for?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-13 18:19       ` Jan Kiszka
@ 2009-12-16 11:26         ` Philippe Gerum
  2009-12-16 11:53           ` Jan Kiszka
  0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2009-12-16 11:26 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> >>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> >>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
> >>>> real hlt. This caused IRQ latencies or even drops for Linux,
> >>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> >>>> ipipe_critical_enter to deadlock frequently enough.
> >>>>
> >>>> The first patch of this series fixes this (see below), the second one
> >>>> simply removes the two useless ipipe_suspend_domain calls.
> >>>>
> >>> What your patch does as well, is killing the ability to run low priority
> >>> domains below the root level.
> >> Yes, I'm killing the dream.
> >>
> >> I heavily doubt that the functions I removed in the second patch ever
> >> contributed something good to this. It's always the job of the lowest
> >> domain to issue hardware halt, not of some arbitrary mid-prio domain.

Actually, no it's not. You may use a low-priority domain to run idle
level jobs outside of the linux infrastructure for that purpose (e.g.
RCU). A high priority domain may want to post events for a low priority
domain to act upon when a mid priority domain is about to enter the CPU
idle state.

> >> Moreover, what would be the practical use for such model in the context
> >> of Linux?
> > 
> > That is _not_ the point. The point is, when submitting a patch, please
> > make sure to raise all the concerns it might introduce wrt to changing
> > the base features. I'm not opposed to make the feature set evolve, but I
> > don't want this to happen "by mistake".
> 
> Just pushed
> 
> "x86: Drop redundant ipipe_suspend_domain from cpu_idle
> 
> Allowing domains below root always required more than these calls (Linux
> would have to give up idle management). And syncing the root domain now
> takes place in __ipipe_halt_root. So remove these suspension calls."
> 
> as commit message for the second patch. Is that what you are looking for?
> 

Not exactly, because your comment states that what was removed was
intrinsically useless, it was not, and has been used, even if only in a
couple of occasions, mainly to enable a debugging hack. This is why I
choked on removing a feature silently. But at the same time, we are in a
simplification trend of the I-pipe toward X3, so I agree on the final
goal you are pursuing with that patch.

In short, let's move on, I'll merge that as well, now that everything is
on the table, and there is no objection anyway.

> Jan
> 
> _______________________________________________
> Adeos-main mailing list
> Adeos-main@domain.hid
> https://mail.gna.org/listinfo/adeos-main


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 11:26         ` Philippe Gerum
@ 2009-12-16 11:53           ` Jan Kiszka
  2009-12-16 14:19             ` Philippe Gerum
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2009-12-16 11:53 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]

Philippe Gerum wrote:
> On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
>>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
>>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
>>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
>>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
>>>>>> ipipe_critical_enter to deadlock frequently enough.
>>>>>>
>>>>>> The first patch of this series fixes this (see below), the second one
>>>>>> simply removes the two useless ipipe_suspend_domain calls.
>>>>>>
>>>>> What your patch does as well, is killing the ability to run low priority
>>>>> domains below the root level.
>>>> Yes, I'm killing the dream.
>>>>
>>>> I heavily doubt that the functions I removed in the second patch ever
>>>> contributed something good to this. It's always the job of the lowest
>>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
> 
> Actually, no it's not. You may use a low-priority domain to run idle
> level jobs outside of the linux infrastructure for that purpose (e.g.
> RCU). A high priority domain may want to post events for a low priority
> domain to act upon when a mid priority domain is about to enter the CPU
> idle state.

Even if all the related bugs were fixed: When you pass down control to
the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
finally returns from its event handler - likely not what "low-prio"
suggests.

> 
>>>> Moreover, what would be the practical use for such model in the context
>>>> of Linux?
>>> That is _not_ the point. The point is, when submitting a patch, please
>>> make sure to raise all the concerns it might introduce wrt to changing
>>> the base features. I'm not opposed to make the feature set evolve, but I
>>> don't want this to happen "by mistake".
>> Just pushed
>>
>> "x86: Drop redundant ipipe_suspend_domain from cpu_idle
>>
>> Allowing domains below root always required more than these calls (Linux
>> would have to give up idle management). And syncing the root domain now
>> takes place in __ipipe_halt_root. So remove these suspension calls."
>>
>> as commit message for the second patch. Is that what you are looking for?
>>
> 
> Not exactly, because your comment states that what was removed was
> intrinsically useless, it was not, and has been used, even if only in a
> couple of occasions, mainly to enable a debugging hack. This is why I
> choked on removing a feature silently. But at the same time, we are in a
> simplification trend of the I-pipe toward X3, so I agree on the final
> goal you are pursuing with that patch.
> 
> In short, let's move on, I'll merge that as well, now that everything is
> on the table, and there is no objection anyway.

I couldn't agree more. :)

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 11:53           ` Jan Kiszka
@ 2009-12-16 14:19             ` Philippe Gerum
  2009-12-16 14:30               ` Philippe Gerum
  2009-12-16 14:32               ` Jan Kiszka
  0 siblings, 2 replies; 13+ messages in thread
From: Philippe Gerum @ 2009-12-16 14:19 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Wed, 2009-12-16 at 12:53 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> >>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> >>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
> >>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
> >>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> >>>>>> ipipe_critical_enter to deadlock frequently enough.
> >>>>>>
> >>>>>> The first patch of this series fixes this (see below), the second one
> >>>>>> simply removes the two useless ipipe_suspend_domain calls.
> >>>>>>
> >>>>> What your patch does as well, is killing the ability to run low priority
> >>>>> domains below the root level.
> >>>> Yes, I'm killing the dream.
> >>>>
> >>>> I heavily doubt that the functions I removed in the second patch ever
> >>>> contributed something good to this. It's always the job of the lowest
> >>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
> > 
> > Actually, no it's not. You may use a low-priority domain to run idle
> > level jobs outside of the linux infrastructure for that purpose (e.g.
> > RCU). A high priority domain may want to post events for a low priority
> > domain to act upon when a mid priority domain is about to enter the CPU
> > idle state.
> 
> Even if all the related bugs were fixed: When you pass down control to
> the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
> Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
> finally returns from its event handler - likely not what "low-prio"
> suggests.

low prio suggests nothing else than "runs whenever nothing else has to
be done higher in the pipeline". I just don't get why you seem to bind
the Linux rescheduling logic with what a low prio domain would do; by
definition, adeos-wise, both are totally non-related.

> 
> > 
> >>>> Moreover, what would be the practical use for such model in the context
> >>>> of Linux?
> >>> That is _not_ the point. The point is, when submitting a patch, please
> >>> make sure to raise all the concerns it might introduce wrt to changing
> >>> the base features. I'm not opposed to make the feature set evolve, but I
> >>> don't want this to happen "by mistake".
> >> Just pushed
> >>
> >> "x86: Drop redundant ipipe_suspend_domain from cpu_idle
> >>
> >> Allowing domains below root always required more than these calls (Linux
> >> would have to give up idle management). And syncing the root domain now
> >> takes place in __ipipe_halt_root. So remove these suspension calls."
> >>
> >> as commit message for the second patch. Is that what you are looking for?
> >>
> > 
> > Not exactly, because your comment states that what was removed was
> > intrinsically useless, it was not, and has been used, even if only in a
> > couple of occasions, mainly to enable a debugging hack. This is why I
> > choked on removing a feature silently. But at the same time, we are in a
> > simplification trend of the I-pipe toward X3, so I agree on the final
> > goal you are pursuing with that patch.
> > 
> > In short, let's move on, I'll merge that as well, now that everything is
> > on the table, and there is no objection anyway.
> 
> I couldn't agree more. :)
> 
> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 14:19             ` Philippe Gerum
@ 2009-12-16 14:30               ` Philippe Gerum
  2009-12-16 14:39                 ` Jan Kiszka
  2009-12-16 14:32               ` Jan Kiszka
  1 sibling, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2009-12-16 14:30 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Wed, 2009-12-16 at 15:19 +0100, Philippe Gerum wrote:
> On Wed, 2009-12-16 at 12:53 +0100, Jan Kiszka wrote:
> > Philippe Gerum wrote:
> > > On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
> > >> Philippe Gerum wrote:
> > >>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
> > >>>> Philippe Gerum wrote:
> > >>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> > >>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> > >>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
> > >>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
> > >>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> > >>>>>> ipipe_critical_enter to deadlock frequently enough.
> > >>>>>>
> > >>>>>> The first patch of this series fixes this (see below), the second one
> > >>>>>> simply removes the two useless ipipe_suspend_domain calls.
> > >>>>>>
> > >>>>> What your patch does as well, is killing the ability to run low priority
> > >>>>> domains below the root level.
> > >>>> Yes, I'm killing the dream.
> > >>>>
> > >>>> I heavily doubt that the functions I removed in the second patch ever
> > >>>> contributed something good to this. It's always the job of the lowest
> > >>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
> > > 
> > > Actually, no it's not. You may use a low-priority domain to run idle
> > > level jobs outside of the linux infrastructure for that purpose (e.g.
> > > RCU). A high priority domain may want to post events for a low priority
> > > domain to act upon when a mid priority domain is about to enter the CPU
> > > idle state.
> > 
> > Even if all the related bugs were fixed: When you pass down control to
> > the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
> > Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
> > finally returns from its event handler - likely not what "low-prio"
> > suggests.
> 
> low prio suggests nothing else than "runs whenever nothing else has to
> be done higher in the pipeline".

Nothing, meaning nothing interrupt-wise. For the rest, it's a deal
between all OSes there.

>  I just don't get why you seem to bind
> the Linux rescheduling logic with what a low prio domain would do; by
> definition, adeos-wise, both are totally non-related.
> 
> > 
> > > 
> > >>>> Moreover, what would be the practical use for such model in the context
> > >>>> of Linux?
> > >>> That is _not_ the point. The point is, when submitting a patch, please
> > >>> make sure to raise all the concerns it might introduce wrt to changing
> > >>> the base features. I'm not opposed to make the feature set evolve, but I
> > >>> don't want this to happen "by mistake".
> > >> Just pushed
> > >>
> > >> "x86: Drop redundant ipipe_suspend_domain from cpu_idle
> > >>
> > >> Allowing domains below root always required more than these calls (Linux
> > >> would have to give up idle management). And syncing the root domain now
> > >> takes place in __ipipe_halt_root. So remove these suspension calls."
> > >>
> > >> as commit message for the second patch. Is that what you are looking for?
> > >>
> > > 
> > > Not exactly, because your comment states that what was removed was
> > > intrinsically useless, it was not, and has been used, even if only in a
> > > couple of occasions, mainly to enable a debugging hack. This is why I
> > > choked on removing a feature silently. But at the same time, we are in a
> > > simplification trend of the I-pipe toward X3, so I agree on the final
> > > goal you are pursuing with that patch.
> > > 
> > > In short, let's move on, I'll merge that as well, now that everything is
> > > on the table, and there is no objection anyway.
> > 
> > I couldn't agree more. :)
> > 
> > Jan
> > 
> 
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 14:19             ` Philippe Gerum
  2009-12-16 14:30               ` Philippe Gerum
@ 2009-12-16 14:32               ` Jan Kiszka
  2009-12-16 14:41                 ` Philippe Gerum
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2009-12-16 14:32 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

[-- Attachment #1: Type: text/plain, Size: 2483 bytes --]

Philippe Gerum wrote:
> On Wed, 2009-12-16 at 12:53 +0100, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
>>>>>> Philippe Gerum wrote:
>>>>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
>>>>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
>>>>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
>>>>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
>>>>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
>>>>>>>> ipipe_critical_enter to deadlock frequently enough.
>>>>>>>>
>>>>>>>> The first patch of this series fixes this (see below), the second one
>>>>>>>> simply removes the two useless ipipe_suspend_domain calls.
>>>>>>>>
>>>>>>> What your patch does as well, is killing the ability to run low priority
>>>>>>> domains below the root level.
>>>>>> Yes, I'm killing the dream.
>>>>>>
>>>>>> I heavily doubt that the functions I removed in the second patch ever
>>>>>> contributed something good to this. It's always the job of the lowest
>>>>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
>>> Actually, no it's not. You may use a low-priority domain to run idle
>>> level jobs outside of the linux infrastructure for that purpose (e.g.
>>> RCU). A high priority domain may want to post events for a low priority
>>> domain to act upon when a mid priority domain is about to enter the CPU
>>> idle state.
>> Even if all the related bugs were fixed: When you pass down control to
>> the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
>> Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
>> finally returns from its event handler - likely not what "low-prio"
>> suggests.
> 
> low prio suggests nothing else than "runs whenever nothing else has to
> be done higher in the pipeline". I just don't get why you seem to bind
> the Linux rescheduling logic with what a low prio domain would do; by
> definition, adeos-wise, both are totally non-related.
> 

I meant that a low prio domain is able to (indirectly) defer activities
of a mig prio domain due to the way the latter passed on the control in
its idle loop. Think about my scenario again: where are the preemption
points of on idle, !CONFIG_PREEMPT Linux kernel?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 14:30               ` Philippe Gerum
@ 2009-12-16 14:39                 ` Jan Kiszka
  2009-12-16 14:51                   ` Philippe Gerum
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2009-12-16 14:39 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main

[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]

Philippe Gerum wrote:
> On Wed, 2009-12-16 at 15:19 +0100, Philippe Gerum wrote:
>> On Wed, 2009-12-16 at 12:53 +0100, Jan Kiszka wrote:
>>> Philippe Gerum wrote:
>>>> On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
>>>>> Philippe Gerum wrote:
>>>>>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
>>>>>>> Philippe Gerum wrote:
>>>>>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
>>>>>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
>>>>>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
>>>>>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
>>>>>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
>>>>>>>>> ipipe_critical_enter to deadlock frequently enough.
>>>>>>>>>
>>>>>>>>> The first patch of this series fixes this (see below), the second one
>>>>>>>>> simply removes the two useless ipipe_suspend_domain calls.
>>>>>>>>>
>>>>>>>> What your patch does as well, is killing the ability to run low priority
>>>>>>>> domains below the root level.
>>>>>>> Yes, I'm killing the dream.
>>>>>>>
>>>>>>> I heavily doubt that the functions I removed in the second patch ever
>>>>>>> contributed something good to this. It's always the job of the lowest
>>>>>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
>>>> Actually, no it's not. You may use a low-priority domain to run idle
>>>> level jobs outside of the linux infrastructure for that purpose (e.g.
>>>> RCU). A high priority domain may want to post events for a low priority
>>>> domain to act upon when a mid priority domain is about to enter the CPU
>>>> idle state.
>>> Even if all the related bugs were fixed: When you pass down control to
>>> the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
>>> Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
>>> finally returns from its event handler - likely not what "low-prio"
>>> suggests.
>> low prio suggests nothing else than "runs whenever nothing else has to
>> be done higher in the pipeline".
> 
> Nothing, meaning nothing interrupt-wise. For the rest, it's a deal
> between all OSes there.
> 

Right, and that deal was, well, fairly incomplete in the previous code.
That's my whole point.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 14:32               ` Jan Kiszka
@ 2009-12-16 14:41                 ` Philippe Gerum
  0 siblings, 0 replies; 13+ messages in thread
From: Philippe Gerum @ 2009-12-16 14:41 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Wed, 2009-12-16 at 15:32 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Wed, 2009-12-16 at 12:53 +0100, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
> >>>>>> Philippe Gerum wrote:
> >>>>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> >>>>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> >>>>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
> >>>>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
> >>>>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> >>>>>>>> ipipe_critical_enter to deadlock frequently enough.
> >>>>>>>>
> >>>>>>>> The first patch of this series fixes this (see below), the second one
> >>>>>>>> simply removes the two useless ipipe_suspend_domain calls.
> >>>>>>>>
> >>>>>>> What your patch does as well, is killing the ability to run low priority
> >>>>>>> domains below the root level.
> >>>>>> Yes, I'm killing the dream.
> >>>>>>
> >>>>>> I heavily doubt that the functions I removed in the second patch ever
> >>>>>> contributed something good to this. It's always the job of the lowest
> >>>>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
> >>> Actually, no it's not. You may use a low-priority domain to run idle
> >>> level jobs outside of the linux infrastructure for that purpose (e.g.
> >>> RCU). A high priority domain may want to post events for a low priority
> >>> domain to act upon when a mid priority domain is about to enter the CPU
> >>> idle state.
> >> Even if all the related bugs were fixed: When you pass down control to
> >> the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
> >> Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
> >> finally returns from its event handler - likely not what "low-prio"
> >> suggests.
> > 
> > low prio suggests nothing else than "runs whenever nothing else has to
> > be done higher in the pipeline". I just don't get why you seem to bind
> > the Linux rescheduling logic with what a low prio domain would do; by
> > definition, adeos-wise, both are totally non-related.
> > 
> 
> I meant that a low prio domain is able to (indirectly) defer activities
> of a mig prio domain due to the way the latter passed on the control in
> its idle loop.

That is a deal between domains, we could do the same by preventing
Xenomai to reschedule before exiting the interrupt handler, the pipeline
imposes absolutely nothing there. What it says, and only that is:
interrupts must be prioritized. What interrupt handlers do, e.g.
rescheduling or not, is not the pipeline's business. This is another
level of abstraction/logic.

If for some reason, I want to allow such deferral because it helps me
debugging a misbehavior in some upper domain, I may do this. This is
really as simple as that. You have an extended interpretation of what
the pipeline should enforce, really. Albeit it is practically true most
of the time, it may be relaxed in some (rare) occasions, without killing
the entire design, or considering this as a bug. Your perception of
something being a bug could very well be seen as a feature by others.
That's life.

>  Think about my scenario again: where are the preemption
> points of on idle, !CONFIG_PREEMPT Linux kernel?
> 
> Jan
> 
> _______________________________________________
> Adeos-main mailing list
> Adeos-main@domain.hid
> https://mail.gna.org/listinfo/adeos-main


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Adeos-main] [pull request] x86: critical fix for native_safe_halt
  2009-12-16 14:39                 ` Jan Kiszka
@ 2009-12-16 14:51                   ` Philippe Gerum
  0 siblings, 0 replies; 13+ messages in thread
From: Philippe Gerum @ 2009-12-16 14:51 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main

On Wed, 2009-12-16 at 15:39 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Wed, 2009-12-16 at 15:19 +0100, Philippe Gerum wrote:
> >> On Wed, 2009-12-16 at 12:53 +0100, Jan Kiszka wrote:
> >>> Philippe Gerum wrote:
> >>>> On Sun, 2009-12-13 at 19:19 +0100, Jan Kiszka wrote:
> >>>>> Philippe Gerum wrote:
> >>>>>> On Sun, 2009-12-13 at 18:48 +0100, Jan Kiszka wrote:
> >>>>>>> Philippe Gerum wrote:
> >>>>>>>> On Sat, 2009-12-12 at 22:37 +0100, Jan Kiszka wrote:
> >>>>>>>>> Recent moving of ipipe_suspend_domain finally exposed a deeper flaw in
> >>>>>>>>> cpu_idle on x86: We failed to check the pipeline log before issuing the
> >>>>>>>>> real hlt. This caused IRQ latencies or even drops for Linux,
> >>>>>>>>> specifically on SMP. Credits go to plain QEMU whose slow SMP mode caused
> >>>>>>>>> ipipe_critical_enter to deadlock frequently enough.
> >>>>>>>>>
> >>>>>>>>> The first patch of this series fixes this (see below), the second one
> >>>>>>>>> simply removes the two useless ipipe_suspend_domain calls.
> >>>>>>>>>
> >>>>>>>> What your patch does as well, is killing the ability to run low priority
> >>>>>>>> domains below the root level.
> >>>>>>> Yes, I'm killing the dream.
> >>>>>>>
> >>>>>>> I heavily doubt that the functions I removed in the second patch ever
> >>>>>>> contributed something good to this. It's always the job of the lowest
> >>>>>>> domain to issue hardware halt, not of some arbitrary mid-prio domain.
> >>>> Actually, no it's not. You may use a low-priority domain to run idle
> >>>> level jobs outside of the linux infrastructure for that purpose (e.g.
> >>>> RCU). A high priority domain may want to post events for a low priority
> >>>> domain to act upon when a mid priority domain is about to enter the CPU
> >>>> idle state.
> >>> Even if all the related bugs were fixed: When you pass down control to
> >>> the lower domain on cpu_idle via ipipe_suspend_domain, you won't get a
> >>> Linux reschedule (without CONFIG_PREEMPT) until the low-prio domain
> >>> finally returns from its event handler - likely not what "low-prio"
> >>> suggests.
> >> low prio suggests nothing else than "runs whenever nothing else has to
> >> be done higher in the pipeline".
> > 
> > Nothing, meaning nothing interrupt-wise. For the rest, it's a deal
> > between all OSes there.
> > 
> 
> Right, and that deal was, well, fairly incomplete in the previous code.
> That's my whole point.

Nope. The missing code would have been provided by the low priority
domain, without requiring anything to be changed on the Linux side, if
that use was only about running (pre-)idle context code.

Which does not removes anything to the usefulness of actually fixing CPU
halt code this said, but that is unrelated. 

> 
> Jan
> 
> _______________________________________________
> Adeos-main mailing list
> Adeos-main@domain.hid
> https://mail.gna.org/listinfo/adeos-main


-- 
Philippe.




^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-12-16 14:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-12 21:37 [Adeos-main] [pull request] x86: critical fix for native_safe_halt Jan Kiszka
2009-12-13 17:02 ` Philippe Gerum
2009-12-13 17:48   ` Jan Kiszka
2009-12-13 18:05     ` Philippe Gerum
2009-12-13 18:19       ` Jan Kiszka
2009-12-16 11:26         ` Philippe Gerum
2009-12-16 11:53           ` Jan Kiszka
2009-12-16 14:19             ` Philippe Gerum
2009-12-16 14:30               ` Philippe Gerum
2009-12-16 14:39                 ` Jan Kiszka
2009-12-16 14:51                   ` Philippe Gerum
2009-12-16 14:32               ` Jan Kiszka
2009-12-16 14:41                 ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.