From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <49B54B88.8030008@domain.hid>
Date: Mon, 09 Mar 2009 18:02:00 +0100
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <49B3A126.6000602@domain.hid> <49B53AC3.10707@domain.hid> <49B54780.6040504@domain.hid>
In-Reply-To: <49B54780.6040504@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: Jan Kiszka
Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> the watchdog strikes. The second one brought me to another issue: Raise
>>> SIGKILL for the current thread and make sure that it can be processed by
>>> Linux (e.g. via xnpod_suspend_thread(). Unfortunately, there is
>>> no way to force a shadow thread into secondary mode to handle pending
>>> Linux signals unless that thread issues a syscall once in a while. And
>>> that raises the question if we shouldn't improve this as well while we
>>> are on it.
>>>
>>> Granted, non-broken Xenomai user space threads always issue frequent
>>> syscalls, otherwise the system would starve (and the watchdog would come
>>> around). On the other hand, delaying signals till syscall prologues is
>>> different from plain Linux behaviour...
>>>
>>> Comments, ideas?
>>>
>> We probably need a two-stage approach: first record the thread was bumped out
>> and suspend it from the watchdog handler to give Linux a chance to run again,
>> then finish the work, killing it for good, next time the root thread is
>> scheduled in on the same CPU.
>
> That confuses me again: The watchdog issue is solved now, no? We are
> only left with the scenario of breaking out of a user space loop of some
> Xenomai thread via a Linux signal (which implies SMP - otherwise there
> is no chance to raise the signal...).
>
> Meanwhile I played with some light-weight approach to relax a thread
> that received a signal (according to do_sigwake_event). Worked, but only
> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
> it does not handle the case that a non-root handler may alter the
> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
> involved domains. Will try to fix this and post my signaling proposal so
> that this work is not lost.

If we go that way, I would vote for a SIGSEGV instead of the SIGKILL.
Unlike SIGKILL, SIGSEGV can be caught, so this would allow installing a
handler to dump the backtrace, or even having gdb stop at the point of
the infinite loop. A SIGSEGV handler is also not expected to recover
(well, except when implementing COW in user space, but that does not
fit well with real-time threads).

-- 
Gilles.
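
To illustrate the kind of handler meant above, here is a minimal sketch
in plain POSIX/glibc C; nothing in it is Xenomai-specific. The names
install_segv_backtrace_handler(), segv_backtrace_handler() and
MAX_FRAMES are made up for the example. It assumes glibc's backtrace()
and backtrace_symbols_fd() from <execinfo.h>. The handler does not try
to recover: it dumps the call chain to stderr and lets the default
SIGSEGV action terminate the process.

```c
#define _POSIX_C_SOURCE 200809L

#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc extensions */
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64   /* arbitrary depth limit chosen for this sketch */

/* Hypothetical handler: dump the call chain of the signalled thread. */
static void segv_backtrace_handler(int sig, siginfo_t *info, void *uctx)
{
    static const char msg[] = "SIGSEGV received, backtrace:\n";
    void *frames[MAX_FRAMES];
    int depth;

    (void)info;
    (void)uctx;

    if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
        /* Nothing sensible to do if stderr is gone. */
    }

    /* backtrace_symbols_fd() writes straight to the fd, no malloc(). */
    depth = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);

    /*
     * SA_RESETHAND restored the default action on handler entry.
     * Re-raise so the process still dies from SIGSEGV (immediately,
     * or once this handler returns), even if the signal was sent
     * rather than caused by a real fault.
     */
    raise(sig);
}

/* Hypothetical helper: returns 0 on success, -1 on error (see errno). */
int install_segv_backtrace_handler(void)
{
    struct sigaction sa;
    void *warmup[1];

    /*
     * The first backtrace() call may load libgcc dynamically, which
     * can allocate memory. Do it here, outside of signal context.
     */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_backtrace_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);

    return sigaction(SIGSEGV, &sa, NULL);
}
```

An application would call install_segv_backtrace_handler() once, early
in main(). Linking with -rdynamic makes the symbol names in the dump
more useful. Under gdb no handler is needed at all, since the debugger
stops the program when the SIGSEGV arrives.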