From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4DA701A9.8030700@domain.hid>
Date: Thu, 14 Apr 2011 16:16:09 +0200
From: Jesper Christensen
MIME-Version: 1.0
References: <4DA6F0DD.1080403@domain.hid>
 <1302787886.2083.27.camel@domain.hid>
 <4DA6FABA.7020407@domain.hid>
 <1302790179.2083.57.camel@domain.hid>
In-Reply-To: <1302790179.2083.57.camel@domain.hid>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] kernel threads crash - possible race condition?
List-Id: Xenomai life and development
To: Philippe Gerum
Cc: "xenomai@xenomai.org"

On 2011-04-14 16:09, Philippe Gerum wrote:
> On Thu, 2011-04-14 at 15:46 +0200, Jesper Christensen wrote:
>
>> Actually I have been running with CONFIG_XENO_HW_UNLOCKED_SWITCH the
>> whole time
>>
> You mean enabled?
>
Disabled, sorry.
>
>> and I also raised the stack size from 4k to 8k. I do however
>> think there could be some fishiness in entry_32.S. In
>> "transfer_to_handler" SPRN_SPRG3 is used to check for stack overflow (at
>> least in my kernel 2.6.29.6), but I must admit I haven't seen any of
>> that in the kernel log.
>>
> Mmm, you are right. In any case, what we want with the unmasked switch
> feature is to allow interrupts while we flush the TLB and set the new mm
> context, which may be lengthy on some low-end platforms. Allowing the
> switch code to be preempted during the register swap is of no use wrt
> latency.
>
> Do you have a patch at hand which you could post that flips MSR_EE in
> rthal_thread_switch already?
>
This protects the whole call to rthal_thread_switch, but it should flip
the bit inside, as you suggest.

diff --git a/include/asm-powerpc/bits/pod.h b/include/asm-powerpc/bits/pod.h
old mode 100644
new mode 100755
index 6269907..e279647
--- a/include/asm-powerpc/bits/pod.h
+++ b/include/asm-powerpc/bits/pod.h
@@ -106,6 +106,7 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb,
 	struct mm_struct *prev_mm = out_tcb->active_mm, *next_mm;
 	struct task_struct *prev = out_tcb->active_task;
 	struct task_struct *next = in_tcb->user_task;
+	unsigned long flags;
 
 	if (likely(next != NULL)) {
 		in_tcb->active_task = next;
@@ -156,12 +157,14 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb,
 #endif /* PPC32 */
 #endif /* !__IPIPE_FEATURE_HARDENED_SWITCHMM */
 
+	rthal_local_irq_save_hw(flags);
 #ifdef CONFIG_PPC64
 	rthal_thread_switch(out_tcb->tsp, in_tcb->tsp, next == NULL);
 #else
 	rthal_thread_switch(out_tcb->tsp, in_tcb->tsp);
 #endif
 	barrier();
+	rthal_local_irq_restore_hw(flags);
 }

>> /Jesper
>>
>> On 2011-04-14 15:31, Philippe Gerum wrote:
>>
>>> On Thu, 2011-04-14 at 15:04 +0200, Jesper Christensen wrote:
>>>
>>>> I wrote about some problems concerning stack corruption when running
>>>> Xenomai on ppc. I have found out that if I disable hardware interrupts
>>>> while running "rthal_thread_switch" the problem seems to disappear
>>>> somewhat. I saw a crash yesterday after running for 3 hours, and I'm
>>>> currently running a test (has been running for 3 hours). Usually it
>>>> would fail after 30-40 minutes. My question is: could there be a problem
>>>> if we receive an interrupt between updating the stack pointer and the
>>>> sprg3 register with the new thread pointer?
>>>>
>>> Normally, there should not be any issue (famous last words), since we
>>> would run Xenomai-only code over the preempted context, and we don't
>>> depend on SPRG3 to fetch the current phys address. In fact, at this
>>> stage we simply don't care about the Linux context, only referring to
>>> the current Xenomai thread, which is obtained differently.
>>>
>>> Try switching off CONFIG_XENO_HW_UNLOCKED_SWITCH, in the "machine"
>>> config area, if this ends up being rock-solid, then this would be a hint
>>> that something may be fishy in this area. Raising your k-thread stack
>>> sizes in a separate test may be interesting to check too, if not already
>>> done.
>>>
>>>> /Jesper
>>>>
>>>> _______________________________________________
>>>> Xenomai-core mailing list
>>>> Xenomai-core@domain.hid
>>>> https://mail.gna.org/listinfo/xenomai-core
>>>>
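
The race asked about in this thread (an interrupt landing between the
stack pointer update and the SPRG3 update) and the effect of the
save/restore pair in the patch can be modelled in user space. What
follows is only a minimal sketch under stated assumptions, not Xenomai
or kernel code: it assumes a POSIX system, uses SIGUSR1 as a stand-in
for a hardware interrupt, and uses sigprocmask() as a stand-in for
rthal_local_irq_save_hw()/rthal_local_irq_restore_hw(). The names
struct cpu_ctx, install_irq(), switch_to() and torn_seen are
hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of per-CPU switch state: both fields must agree. */
struct cpu_ctx {
    volatile sig_atomic_t stack_owner; /* models the stack pointer */
    volatile sig_atomic_t current_id;  /* models the SPRG3 thread pointer */
};

static struct cpu_ctx ctx = { 0, 0 };
static volatile sig_atomic_t torn_seen = 0; /* half-updated states observed */

/* Models an interrupt handler that relies on both fields being in sync. */
static void irq_handler(int sig)
{
    (void)sig;
    if (ctx.stack_owner != ctx.current_id)
        torn_seen++;
}

/* Install SIGUSR1 as the stand-in "hardware interrupt". */
static void install_irq(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = irq_handler;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/*
 * Switch to thread `next`. An "interrupt" is raised between the two
 * updates, which is the worst-case timing. With masked != 0 the signal
 * is blocked across the update and delivered only on restore.
 */
static void switch_to(int next, int masked)
{
    sigset_t block, saved;

    if (masked) {
        sigemptyset(&block);
        sigaddset(&block, SIGUSR1);
        /* stand-in for rthal_local_irq_save_hw(flags) */
        sigprocmask(SIG_BLOCK, &block, &saved);
    }

    ctx.stack_owner = next;  /* first half of the switch */
    raise(SIGUSR1);          /* interrupt arrives mid-switch */
    ctx.current_id = next;   /* second half of the switch */

    if (masked) {
        /* stand-in for rthal_local_irq_restore_hw(flags) */
        sigprocmask(SIG_SETMASK, &saved, NULL);
    }
}
```

After install_irq(), a call to switch_to(1, 0) lets the handler run on
the half-updated state, while switch_to(2, 1) defers the signal until
both fields agree. Whether the real crash has this cause is not
established by the model; it only shows why masking around the
register swap closes that particular window.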