From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4C06265C.3030108@domain.hid>
Date: Wed, 02 Jun 2010 11:37:32 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <20100601135005.GA5483@domain.hid>
 <1275402757.27918.151.camel@domain.hid>
 <20100601155403.GA8240@domain.hid>
 <4C053C51.4090903@domain.hid>
 <4C061823.70005@domain.hid>
 <1275470136.18250.16.camel@domain.hid>
 <4C062246.40107@domain.hid>
 <1275470925.18250.18.camel@domain.hid>
In-Reply-To: <1275470925.18250.18.camel@domain.hid>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-help] Handling Linux Signals in primary domain context
List-Id: Help regarding installation and common use of Xenomai
To: Philippe Gerum
Cc: Jan Kiszka, "xenomai@xenomai.org"

Philippe Gerum wrote:
> On Wed, 2010-06-02 at 11:20 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Wed, 2010-06-02 at 10:36 +0200, Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Tschaeche IT-Services wrote:
>>>>>> On Tue, Jun 01, 2010 at 04:32:37PM +0200, Philippe Gerum wrote:
>>>>>>> Not in the absence of syscall. We thought about this once already, when
>>>>>>> considering how a watchdog preempting a runaway task in primary mode
>>>>>>> could force a secondary mode switch: there is no sane and easy solution
>>>>>>> to this unfortunately.
>>>>>> This is exactly Sigmatek's problem: Our customers develop code
>>>>>> within our debugging/development environment. We want to catch
>>>>>> this situation (the developer implements a while(1)) with a
>>>>>> watchdog throwing SIGTRAP so that our debugger gets active
>>>>>> and can locate the problem according to the stack frame...
>>>>> CONFIG_XENO_OPT_WATCHDOG is probably what you are looking for. It tries
>>>>> to catch "well-behaving" broken threads via SIGDEBUG and kills the
>>>>> hopelessly broken rest - system alive again.
>>>>>
>>>>> You can then debug the former and need to do code review on the latter.
>>>>> Or you could also try to add some loop-breaking Xenomai syscalls (or
>>>>> even more clever checks) to library services the code under suspect
>>>>> usually invokes.
>>>> I am afraid "well-behaving" means emitting syscalls. We have a radical
>>>> way to cause a SIGSEGV to be sent to a thread having run amok: set its
>>>> PC to an invalid address (after having printed the real PC). gdb will
>>>> not be able to print where the program stopped, but should be able to
>>>> print the backtrace.
>>>>
>>> Actually, we could extend this logic and forge a stack frame to return
>>> to the preempted application code via some userland trampoline code,
>>> doing the switch:
>>>
>>> [watchdog trigger]
>>>     forge_return_frame(on = regs->sp, to = regs->pc);
>>>     regs->pc = __oops_I_did_it_again;
>>>
>>> __oops_I_did_it_again:
>>>     __xn_migrate(LINUX_DOMAIN);
>>>     ret (via forged frame)
>> Yep, that's what came to my mind as well. But the __oops_I_did_it_again
>> part has to reside in user space, no?
>
> Clearly, yes. Either we map this explicitly, or we just make sure to
> compile it in each app, and pass its address at skin binding time. Our
> text is mlocked anyway.
>
>>> The thing is, that this brings in some arch-dep code to forge a stack
>>> frame (like the kernel uses for signals), that should rather live in the
>>> pipeline core.
>> Actually, we are then close to enabling signal delivery outside syscalls...
>>
>
> Yes, looks like.

When thinking about this real signals thing, I was thinking about putting
the forging code into Xenomai: the code is the same for all kernel
versions, so there is no reason to put it into the I-pipe, and we may have
to emit a special syscall to restore the context once the signal has been
handled. What we need the I-pipe for, however, is to trigger some event on
the way back to user-space.

-- 
Gilles.
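
For contrast with the primary-mode case discussed above, here is a minimal
sketch of the behaviour the thread is trying to reproduce: under plain Linux,
a watchdog signal is delivered asynchronously and can break a loop that never
issues a syscall. This is plain POSIX, not Xenomai API; the names
run_with_watchdog, on_watchdog, watchdog_escape and watchdog_fired are
hypothetical, and using siglongjmp as the "escape" is only an illustrative
stand-in for the forged return frame, not the mechanism proposed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names, for illustration only (not part of Xenomai). */
static sigjmp_buf watchdog_escape;            /* where the handler resumes us */
static volatile sig_atomic_t watchdog_fired;  /* set once the handler has run */

/* Watchdog handler: invoked asynchronously by the kernel on SIGALRM,
 * even though the interrupted loop makes no syscall. */
static void on_watchdog(int signo)
{
    (void)signo;
    watchdog_fired = 1;
    siglongjmp(watchdog_escape, 1);   /* abandon the runaway loop */
}

/* Runs a deliberately broken, syscall-free loop under a watchdog.
 * Returns 1 when the watchdog broke the loop, -1 on setup failure. */
static int run_with_watchdog(unsigned int seconds)
{
    struct sigaction sa;
    volatile unsigned long spins = 0;

    assert(seconds > 0);              /* alarm(0) would cancel the watchdog */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_watchdog;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* savemask = 1: the signal mask is restored on siglongjmp, so
     * SIGALRM is unblocked again after we leave the handler. */
    if (sigsetjmp(watchdog_escape, 1) != 0)
        return 1;                     /* reached via the handler */

    alarm(seconds);                   /* arm the watchdog */
    for (;;)                          /* the "while(1)" bug: no syscalls */
        spins++;
}
```

Calling run_with_watchdog(1) returns 1 after roughly one second. In Xenomai
primary mode the equivalent Linux signal would stay pending until the thread
emits a syscall, which is exactly the gap the forged-frame idea is meant to
close.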