From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4F214D7A.5020901@domain.hid>
Date: Thu, 26 Jan 2012 13:56:26 +0100
From: Jan Kiszka
MIME-Version: 1.0
References: <4F202C11.70908@domain.hid> <4F202F4E.6000708@domain.hid>
 <4F203237.2010102@domain.hid> <4F203353.8030302@domain.hid>
 <4F2035B6.6090105@domain.hid> <4F203771.6070708@domain.hid>
 <4F203F7F.70509@domain.hid> <4F204466.1040603@domain.hid>
 <4F212CC4.6060001@domain.hid> <4F2136E0.4010200@domain.hid>
In-Reply-To: <4F2136E0.4010200@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] [PATCH] Add sigdebug unit test
List-Id: Xenomai life and development
To: Philippe Gerum
Cc: xenomai@xenomai.org

On 2012-01-26 12:20, Philippe Gerum wrote:
> On 01/26/2012 11:36 AM, Jan Kiszka wrote:
>> On 2012-01-25 19:05, Jan Kiszka wrote:
>>> On 2012-01-25 18:44, Gilles Chanteperdrix wrote:
>>>> On 01/25/2012 06:10 PM, Jan Kiszka wrote:
>>>>> On 2012-01-25 18:02, Gilles Chanteperdrix wrote:
>>>>>> On 01/25/2012 05:52 PM, Jan Kiszka wrote:
>>>>>>> On 2012-01-25 17:47, Jan Kiszka wrote:
>>>>>>>> On 2012-01-25 17:35, Gilles Chanteperdrix wrote:
>>>>>>>>> On 01/25/2012 05:21 PM, Jan Kiszka wrote:
>>>>>>>>>> We had two regressions in this code recently. So test all 6
>>>>>>>>>> possible SIGDEBUG reasons, or 5 if the watchdog is not
>>>>>>>>>> available.
>>>>>>>>>
>>>>>>>>> Ok for this test, with a few remarks:
>>>>>>>>> - this is a regression test, so should go to
>>>>>>>>> src/testsuite/regression(/native), and should be added to the
>>>>>>>>> xeno-regression-test
>>>>>>>>
>>>>>>>> What are unit tests for (as they are defined here)? Looks a bit
>>>>>>>> inconsistent.
>>>>>>
>>>>>> I put under "regression" all the tests I have which corresponded to
>>>>>> things that failed one time or another in Xenomai's past. Maybe we
>>>>>> could move unit tests under regression.
>>>>>>
>>>>>>>>
>>>>>>>>> - we already have a regression test for the watchdog called
>>>>>>>>> mayday.c, which tests the second watchdog action, please merge
>>>>>>>>> mayday.c with sigdebug.c (mayday.c also allows checking the
>>>>>>>>> disassembly of the code in the mayday page, a nice feature)
>>>>>>>>
>>>>>>>> It seems to have failed in that important last discipline. Need
>>>>>>>> to check why.
>>>>>>>
>>>>>>> Because it didn't check the page content for correctness. But
>>>>>>> that's now done via the new watchdog test. I can keep the debug
>>>>>>> output, but the watchdog test of mayday looks obsolete to me. Am
>>>>>>> I missing something?
>>>>>>
>>>>>> The watchdog does two things: it first sends a SIGDEBUG, then if the
>>>>>> application is still spinning, it sends a SIGSEGV. As far as I
>>>>>> understood, your test tests the first case, and mayday tests the
>>>>>> second case, so, I agree that mayday should be removed, but
>>>>>> whatever it tests should be integrated in the sigdebug test.
>>>>>>
>>>>>
>>>>> Err... SIGSEGV is not a feature, it was the bug I fixed today. :)
>>>>> So the test case actually specified a bug as correct behavior.
>>>>>
>>>>> The fallback case is in fact killing the RT task as before. But I'm
>>>>> unsure right now: will this leave the system always in a clean state
>>>>> behind?
>>>>
>>>> The test case being a test case and doing nothing particular, I do not
>>>> see what could go wrong. And if something goes wrong, then it needs
>>>> fixing.
>>>
>>> Well, if you kill a RT task while it's running in the kernel, you risk
>>> inconsistent system states (held mutexes etc.). In this case the task is
>>> supposed to spin in user space. If that is always safe, let's implement
>>> the test.
>>
>> Had a closer look: These days the two-stage killing is only useful to
>> catch endless loops in the kernel. User space tasks can't get around
>> being migrated on watchdog events, even when SIGDEBUG is ignored.
>>
>> To trigger the enforced task termination without leaving any broken
>> states behind, there is one option: rt_task_spin. Surprisingly for me,
>> it actually spins in the kernel, thus triggers the second level if
>> waiting long enough. I wonder, though, if that behavior shouldn't be
>> improved, i.e. the spinning loop be closed in user space - which would
>> take away that option again.
>>
>> Thoughts?
>>
>
> Tick-based timing is going to be the problem for determining the
> spinning delay, unless we expose it in the vdso on a per-skin basis,
> which won't be pretty.

I see. But we should possibly add some signal-pending || amok test to
that kernel loop. That would also kill my test design, but it otherwise
makes some sense, I guess.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
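
For illustration only, here is a minimal user-space sketch of the "bail out of the spin loop when something is pending" idea mentioned above. It is not Xenomai code and does not touch rt_task_spin. The names stop_requested, request_stop and spin_interruptible are hypothetical, and a plain flag set from a POSIX signal handler merely stands in for the kernel-side "signal-pending || amok" condition.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for "signal pending || amok": set asynchronously
 * by a signal handler, polled by the spin loop. */
static volatile sig_atomic_t stop_requested;

/* Signal handler: only records that the spinner should give up. */
static void request_stop(int sig)
{
    (void)sig;
    stop_requested = 1;
}

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Busy-wait for up to spin_ns nanoseconds.
 * Returns true if the full delay elapsed, false if the loop was
 * abandoned early because stop_requested was set. */
static bool spin_interruptible(uint64_t spin_ns)
{
    uint64_t deadline = now_ns() + spin_ns;

    while (now_ns() < deadline) {
        if (stop_requested)
            return false;
    }
    return true;
}
```

With the flag clear, spin_interruptible() burns CPU for the whole delay. Once a handler installed via signal(SIGUSR1, request_stop) has run, the next call returns false immediately instead of spinning on. As noted above, a check of this kind in the kernel loop would also remove the only clean way to provoke the second watchdog stage from a test.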