From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <43CEA91F.6080308@domain.hid>
Date: Wed, 18 Jan 2006 21:46:23 +0100
From: Jan Kiszka
Subject: Re: [Xenomai-core] Scheduling while atomic
References: <43CE9FCB.2070305@domain.hid> <17358.42102.613327.327841@domain.hid> <43CEA56D.8000407@domain.hid>
In-Reply-To: <43CEA56D.8000407@domain.hid>
List-Id: "Xenomai life and development (bug reports, patches, discussions)"
To: Gilles Chanteperdrix
Cc: xenomai@xenomai.org

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>> > Jeroen Van den Keybus wrote:
>> > > Gilles,
>> > >
>> > > I cannot reproduce those messages after turning nucleus debugging on.
>> > > Instead, I now either get relatively more failing mutexes or even hard
>> > > lockups with the test program I sent to you. If the computer didn't crash,
>> > > dmesg contains 3 Xenomai messages relating to a task being moved to
>> > > secondary domain after exception #14. As for when the computer crashes: I
>> > > have written the last kernel panic message on paper. Please tell if you
>> > > also want the addresses or (part of) the call stack.
>> > >
>> > > I'm still wondering if there's a programming error in the mutex test
>> > > program. After I sent my previous message, and before I turned nucleus
>> > > debugging on, I managed (by reducing the sleeptimes to max. 5.0e4) to
>> > > fatally crash the computer, while spewing out countless 'scheduling while
>> > > atomic' messages. Is the mutex error reproducible?
>> >
>> > I was not able to crash my box or generate those scheduler warnings, but
>> > the attached patch fixes the false positive warnings of unlocked
>> > mutexes. We had a "leak" in the unlock path when someone was already
>> > waiting. Anyway, *this* issue should not have caused any other problems
>> > than the wrong report of rt_mutex_inquire().
>>
>> Actually the patch seems insufficient, the whole block:
>>     {
>>         xnsynch_set_owner(&mutex->synch_base, &task->thread_base);
>>         mutex->owner = task;
>>         mutex->lockcnt = 1;
>>         goto unlock_and_exit;
>>     }
>>
>> should be done after xnsynch_sleep_on in rt_mutex_lock.
>>
>
> Damn, of course - except for "mutex->owner = task". Then this missing
> xnsynch_set_owner() may have caused serious issues? Will test...

Correction: xnsynch_wakeup_one_sleeper() updates synch->owner, so all is
fine with my patch in this regard. I guess it is really some migration
issue again.

Jan
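
For readers following the thread, the owner hand-over discussed above can be illustrated with a toy model. This is only a hypothetical, single-threaded sketch and not the Xenomai source or the patch from this thread: every toy_* name is invented for illustration, nothing actually sleeps, and the real nucleus does this under its own locking. It assumes that the low-level wakeup call records the new owner (as stated above for xnsynch_wakeup_one_sleeper()), and that the unlock path then updates the mutex-level owner and lock count so that an inquire-style call never reports a handed-over mutex as unlocked.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 4

/* Hypothetical stand-ins for illustration; NOT the Xenomai types. */
struct toy_task { const char *name; };

struct toy_synch {
    struct toy_task *owner;                     /* low-level owner */
    struct toy_task *waiters[TOY_MAX_WAITERS];  /* FIFO of queued tasks */
    size_t nwaiters;
};

struct toy_mutex {
    struct toy_synch synch_base;
    struct toy_task *owner;  /* what an inquire-style call would report */
    int lockcnt;             /* recursion count, 0 when unlocked */
};

/* Pop the first queued task and record it as low-level owner (NULL if none). */
static struct toy_task *toy_wakeup_one_sleeper(struct toy_synch *s)
{
    struct toy_task *next = NULL;

    if (s->nwaiters > 0) {
        next = s->waiters[0];
        for (size_t i = 1; i < s->nwaiters; i++)
            s->waiters[i - 1] = s->waiters[i];
        s->nwaiters--;
    }
    s->owner = next;
    return next;
}

/* Returns 0 if acquired, 1 if the caller was queued, -1 if the queue is full. */
static int toy_mutex_lock(struct toy_mutex *m, struct toy_task *t)
{
    if (m->owner == NULL) {             /* free: take it */
        m->synch_base.owner = t;
        m->owner = t;
        m->lockcnt = 1;
        return 0;
    }
    if (m->owner == t) {                /* recursive acquisition */
        m->lockcnt++;
        return 0;
    }
    if (m->synch_base.nwaiters == TOY_MAX_WAITERS)
        return -1;
    m->synch_base.waiters[m->synch_base.nwaiters++] = t;
    return 1;                           /* a real task would sleep here */
}

/* Release once; on the final release, hand the mutex to the first waiter. */
static void toy_mutex_unlock(struct toy_mutex *m, struct toy_task *t)
{
    assert(m->owner == t && m->lockcnt > 0);

    if (--m->lockcnt > 0)
        return;                         /* still held recursively */

    /* Keep the mutex-level fields in step with the low-level owner. */
    m->owner = toy_wakeup_one_sleeper(&m->synch_base);
    m->lockcnt = (m->owner != NULL) ? 1 : 0;
}
```

With task A holding the mutex and task B queued, toy_mutex_unlock(&m, &a) leaves both m.owner and m.synch_base.owner pointing at B with lockcnt == 1, which is the consistent state the thread is concerned with.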