From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4F2176BF.5070900@domain.hid>
Date: Thu, 26 Jan 2012 16:52:31 +0100
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <4F202C11.70908@domain.hid> <4F202F4E.6000708@domain.hid>
 <4F203237.2010102@domain.hid> <4F203353.8030302@domain.hid>
 <4F2035B6.6090105@domain.hid> <4F203771.6070708@domain.hid>
 <4F203F7F.70509@domain.hid> <4F204466.1040603@domain.hid>
 <4F212CC4.6060001@domain.hid> <4F216952.7010009@domain.hid>
 <4F216C09.5040809@domain.hid>
In-Reply-To: <4F216C09.5040809@domain.hid>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] [PATCH] Add sigdebug unit test
List-Id: Xenomai life and development
To: Jan Kiszka
Cc: xenomai-core

On 01/26/2012 04:06 PM, Jan Kiszka wrote:
> On 2012-01-26 15:55, Gilles Chanteperdrix wrote:
>> On 01/26/2012 11:36 AM, Jan Kiszka wrote:
>>> On 2012-01-25 19:05, Jan Kiszka wrote:
>>>> On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
>>>>> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>>>>>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>>>>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>>>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>>>>>> We had two regressions in this code recently. So test all 6 possible
>>>>>>>>>>> SIGDEBUG reasons, or 5 if the watchdog is not available.
>>>>>>>>>>
>>>>>>>>>> Ok for this test, with a few remarks:
>>>>>>>>>> - this is a regression test, so should go to
>>>>>>>>>>   src/testsuite/regression(/native), and should be added to the
>>>>>>>>>>   xeno-regression-test
>>>>>>>>>
>>>>>>>>> What are unit tests for (as they are defined here)? Looks a bit inconsistent.
>>>>>>>
>>>>>>> I put under "regression" all the tests I have which corresponded to
>>>>>>> things that failed one time or another in Xenomai's past.
>>>>>>> Maybe we could move unit tests under regression.
>>>>>>>
>>>>>>>>>
>>>>>>>>>> - we already have a regression test for the watchdog called mayday.c,
>>>>>>>>>>   which tests the second watchdog action, please merge mayday.c with
>>>>>>>>>>   sigdebug.c (mayday.c also allows checking the disassembly of the code in
>>>>>>>>>>   the mayday page, a nice feature)
>>>>>>>>>
>>>>>>>>> It seems to have failed in that important last discipline. Need to check
>>>>>>>>> why.
>>>>>>>>
>>>>>>>> Because it didn't check the page content for correctness. But that's now
>>>>>>>> done via the new watchdog test. I can keep the debug output, but the
>>>>>>>> watchdog test of mayday looks obsolete to me. Am I missing something?
>>>>>>>
>>>>>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>>>>>> application is still spinning, it sends a SIGSEGV. As far as I
>>>>>>> understood, your test tests the first case, and mayday tests the second
>>>>>>> case, so, I agree that mayday should be removed, but whatever it tests
>>>>>>> should be integrated in the sigdebug test.
>>>>>>>
>>>>>>
>>>>>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :) So the
>>>>>> test case actually specified a bug as correct behavior.
>>>>>>
>>>>>> The fallback case is in fact killing the RT task as before. But I'm
>>>>>> unsure right now: will this leave the system always in a clean state
>>>>>> behind?
>>>>>
>>>>> The test case being a test case and doing nothing particular, I do not
>>>>> see what could go wrong. And if something goes wrong, then it needs fixing.
>>>>
>>>> Well, if you kill a RT task while it's running in the kernel, you risk
>>>> inconsistent system states (held mutexes etc.). In this case the task is
>>>> supposed to spin in user space. If that is always safe, let's implement
>>>> the test.
>>>
>>> Had a closer look: These days the two-stage killing is only useful to
>>> catch endless loops in the kernel.
>>> User space tasks can't get around
>>> being migrated on watchdog events, even when SIGDEBUG is ignored.
>>>
>>> To trigger the enforced task termination without leaving any broken
>>> states behind, there is one option: rt_task_spin. Surprisingly for me,
>>> it actually spins in the kernel, thus triggers the second level if
>>> waiting long enough. I wonder, though, if that behavior shouldn't be
>>> improved, i.e. the spinning loop be closed in user space - which would
>>> take away that option again.
>>>
>>> Thoughts?
>>
>> You can also call, in an infinite loop, a Xenomai syscall which causes a
>> switch to primary mode, but fails.
>
> Nope, we would be migrated to secondary on xnthread_amok_p when
> returning to user mode. We need a true kernel loop.

But the loop will continue, and the next call to the syscall will cause
the thread to re-switch to primary mode.

--
Gilles.