From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <443A9A2D.8000405@domain.hid>
Date: Mon, 10 Apr 2006 19:47:25 +0200
From: Philippe Gerum
MIME-Version: 1.0
Subject: Re: [Xenomai-core] [BUG?] stalled xeno domain
References: <4437D2EE.4040902@domain.hid> <4437E704.4010607@domain.hid> <443A437B.6010403@domain.hid> <443A5212.7020001@domain.hid> <443A56E2.1010204@domain.hid> <443A5D7B.2040707@domain.hid> <443A9410.50402@domain.hid>
In-Reply-To: <443A9410.50402@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: Jan Kiszka
Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>
>>Jan Kiszka wrote:
>>
>>>>>Philippe, do you see any remaining issues, e.g. that the leak survived
>>>>>the task termination? Does this have any meaning for correct driver and
>>>>>skin code?
>>>>>
>>>>
>>>>The only way I could see this leakage survive a switch transition would
>>>>require it to happen over the root context, not over a primary context.
>>>>Was it the case?
>>>>
>>>
>>>
>>>The task had to leave from primary mode. If I forced it to secondary
>>>before terminating, the problem did not show up.
>>>
>>
>>But could the code causing the leakage have been run by different
>>contexts in sequence, including the root one?
>>
>
>
> I don't think so. Bugs in our software aside, there should be no switch
> to secondary mode until termination. Moreover, we installed a SIGXCPU
> handler, and that one didn't trigger either.
>
>
> I just constructed a simple test by placing rthal_local_irq_disable() in
> rt_timer_spin and setting up this user space app:
>
> #include
> #include
> #include
> #include
>
> RT_TASK task;
>
> void func(void *arg)
> {
>     rt_timer_spin(0);
> }
>
>
> void terminate(int sig)
> {
>     printf("joining...\n");
>     rt_task_join(&task);
>     rt_task_delete(&task);
>     printf("done\n");
> }
>
>
> int main()
> {
>     signal(SIGINT, terminate);
>     rt_task_spawn(&task, "lockup", 0, 10, T_FPU | T_JOINABLE | T_WARNSW,
>                   func, NULL);
>     pause();
>     return 0;
> }
>
>
> Should this lock up (as it currently does) or rather continue to run
> normally after the RT-task terminated? BTW, I'm still not sure if we are
> hunting shadows (is IRQs off a legal state for user space in some skin?)
> or a real problem - i.e. is it worth the time.
>

IRQs off in user space, aside from the particular semantics introduced by
the interrupt shielding, is not a correct state, but it is for
kernel-based RT threads, so I would expect the real-time core to be robust
wrt this kind of situation. I'm going to put this issue on my work queue
anyway; I don't like unexplained software thingies getting too close to
the Twilight Zone...

-- 

Philippe.