Message-ID: <4443BB41.1030002@domain.hid>
Date: Mon, 17 Apr 2006 17:58:57 +0200
From: Jan Kiszka
Subject: Re: [Xenomai-core] [BUG?] stalled xeno domain
References: <4437D2EE.4040902@domain.hid> <4437E704.4010607@domain.hid>
 <443A437B.6010403@domain.hid> <443A5212.7020001@domain.hid>
 <443A56E2.1010204@domain.hid> <443A5D7B.2040707@domain.hid>
 <443A9410.50402@domain.hid> <44427A4A.9050208@domain.hid>
In-Reply-To: <44427A4A.9050208@domain.hid>
To: Philippe Gerum
Cc: xenomai-core

Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>
>>> Jan Kiszka wrote:
>>>
>>>>>> Philippe, do you see any remaining issues, e.g. that the leak
>>>>>> survived the task termination? Does this have any meaning for
>>>>>> correct driver and skin code?
>>>>>>
>>>>>
>>>>> The only way I could see this leakage survive a switch transition
>>>>> would require it to happen over the root context, not over a
>>>>> primary context. Was it the case?
>>>>>
>>>>
>>>> The task had to leave from primary mode. If I forced it to secondary
>>>> before terminating, the problem did not show up.
>>>>
>>>
>>> But could the code causing the leakage have been run by different
>>> contexts in sequence, including the root one?
>>>
>>
>> I don't think so. Bugs in our software aside, there should be no switch
>> to secondary mode until termination.
>> Moreover, we installed a SIGXCPU
>> handler, and that one didn't trigger either.
>>
>> I just constructed a simple test by placing rthal_local_irq_disable() in
>> rt_timer_spin and setting up this user space app:
>>
>> #include
>> #include
>> #include
>> #include
>>
>> RT_TASK task;
>>
>> void func(void *arg)
>> {
>>     rt_timer_spin(0);
>> }
>>
>> void terminate(int sig)
>> {
>>     printf("joining...\n");
>>     rt_task_join(&task);
>>     rt_task_delete(&task);
>>     printf("done\n");
>> }
>>
>> int main()
>> {
>>     signal(SIGINT, terminate);
>>     rt_task_spawn(&task, "lockup", 0, 10, T_FPU | T_JOINABLE | T_WARNSW,
>>                   func, NULL);
>>     pause();
>>     return 0;
>> }
>>
>> Should this lock up (as it currently does) or rather continue to run
>> normally after the RT-task terminated? BTW, I'm still not sure if we are
>> hunting shadows (is IRQs off a legal state for user space in some skin?)
>> or a real problem - i.e. is it worth the time.
>>
>
> I've just tested this frag against the current SVN head, patching
> rt_timer_spin() as required, and cannot reproduce the lockup. As

Are you sure that you actually used the modified native skin for the test?

> expected, the incoming root thread reinstates the correct stall bit
> (i.e. clears it) after the RT thread terminates. Any chance some
> potentially troublesome stuff exists in your setup?
>

I just re-verified this behaviour on a slightly different setup (still
2.6.15-ipipe-1.2-02, xeno trunk), and I'm going to try this on a third
box with 2.6.16+tracing soon. So far I still have a stuck timer IRQ
after the test.

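One rough way to confirm a stuck timer IRQ from user space is to sample
the timer line of /proc/interrupts twice and compare the counts. The
following is only a minimal sketch, assuming the usual layout of that
file (an IRQ label, one count column per CPU, then the controller and
device names). The helper names sum_irq_counts, is_timer_line and
timer_irq_count are made up for illustration and are not part of Xenomai
or the kernel:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: sum the per-CPU counters on one /proc/interrupts
 * line, for example
 *   "  0:     123456     654321   IO-APIC-edge  timer"
 * Returns -1 if the line carries no "label:" prefix.
 */
long long sum_irq_counts(const char *line)
{
    const char *p;
    long long total = 0;

    assert(line != NULL);
    p = strchr(line, ':');
    if (p == NULL)
        return -1;
    p++;

    for (;;) {
        char *end;
        long long n;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;                  /* controller or device name reached */
        n = strtoll(p, &end, 10);
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;                  /* token such as "2-edge", not a count */
        total += n;
        p = end;
    }
    return total;
}

/* Return 1 if the last whitespace-separated token of the line is "timer". */
int is_timer_line(const char *line)
{
    size_t len;
    size_t start;

    assert(line != NULL);
    len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;                      /* drop trailing newline and blanks */
    start = len;
    while (start > 0 && !isspace((unsigned char)line[start - 1]))
        start--;
    return len - start == 5 && strncmp(line + start, "timer", 5) == 0;
}

/*
 * Read /proc/interrupts-style text from f and return the summed count of
 * the first line whose device name is "timer", or -1 if none is found.
 */
long long timer_irq_count(FILE *f)
{
    char buf[1024];

    while (fgets(buf, sizeof(buf), f) != NULL) {
        if (is_timer_line(buf))
            return sum_irq_counts(buf);
    }
    return -1;
}
```

Calling timer_irq_count() on a freshly opened /proc/interrupts before
and after a short sleep, once the test app has been terminated, should
show whether the count still advances. Identical values would be
consistent with the timer IRQ remaining stalled.
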
Jan