From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4443C07E.7060804@domain.hid>
Date: Mon, 17 Apr 2006 18:21:18 +0200
From: Philippe Gerum
MIME-Version: 1.0
Subject: Re: [Xenomai-core] [BUG?] stalled xeno domain
References: <4437D2EE.4040902@domain.hid> <4437E704.4010607@domain.hid>
 <443A437B.6010403@domain.hid> <443A5212.7020001@domain.hid>
 <443A56E2.1010204@domain.hid> <443A5D7B.2040707@domain.hid>
 <443A9410.50402@domain.hid> <44427A4A.9050208@domain.hid>
 <4443BB41.1030002@domain.hid>
In-Reply-To: <4443BB41.1030002@domain.hid>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: Jan Kiszka
Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>
>>Jan Kiszka wrote:
>>
>>>Philippe Gerum wrote:
>>>
>>>
>>>>Jan Kiszka wrote:
>>>>
>>>>
>>>>>>>Philippe, do you see any remaining issues, e.g. that the leak survived
>>>>>>>the task termination? Does this have any meaning for correct driver and
>>>>>>>skin code?
>>>>>>>
>>>>>>
>>>>>>The only way I could see this leakage survive a switch transition would
>>>>>>require it to happen over the root context, not over a primary context.
>>>>>>Was it the case?
>>>>>>
>>>>>
>>>>>
>>>>>The task had to leave from primary mode. If I forced it to secondary
>>>>>before terminating, the problem did not show up.
>>>>>
>>>>
>>>>But does the code causing the leakage could have been run by different
>>>>contexts in sequence, including the root one?
>>>>
>>>
>>>
>>>I don't think so. Bugs in our software aside, there should be no switch
>>>to secondary mode until termination. Moreover, we installed a SIGXCPU
>>>handler, and that one didn't trigger as well.
>>>
>>>
>>>I just constructed a simple test by placing rthal_local_irq_disable() in
>>>rt_timer_spin and setting up this user space app:
>>>
>>>#include
>>>#include
>>>#include
>>>#include
>>>
>>>RT_TASK task;
>>>
>>>void func(void *arg)
>>>{
>>>    rt_timer_spin(0);
>>>}
>>>
>>>
>>>void terminate(int sig)
>>>{
>>>    printf("joining...\n");
>>>    rt_task_join(&task);
>>>    rt_task_delete(&task);
>>>    printf("done\n");
>>>}
>>>
>>>
>>>int main()
>>>{
>>>    signal(SIGINT, terminate);
>>>    rt_task_spawn(&task, "lockup", 0, 10, T_FPU | T_JOINABLE | T_WARNSW,
>>>                  func, NULL);
>>>    pause();
>>>    return 0;
>>>}
>>>
>>>
>>>Should this lock up (as it currently does) or rather continue to run
>>>normally after the RT-task terminated? BTW, I'm still not sure if we are
>>>hunting shadows (is IRQs off a legal state for user space in some skin?)
>>>or a real problem - i.e. is it worth the time.
>>>
>>
>>I've just tested this frag against the current SVN head, patching
>>rt_timer_spin() as required, and cannot reproduce the lockup. As
>
>
> Are you sure that you actually used the modified native skin for the test?
>

Yep, checked twice.

>
>>expected, the incoming root thread reinstates the correct stall bit
>>(i.e. clears it) after the RT thread terminates. Any chance some
>>potentially troublesome stuff exists in your setup?
>>
>
>
> I just re-verified this behaviour on a slightly different setup (still
> 2.6.15-ipipe-1.2-02, xeno trunk), and I'm going to try this on a third
> box with 2.6.16+tracing soon. So far I still have a stuck timer IRQ
> after the test.
>
> Jan
>

--

Philippe.