From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <48F8753C.5000901@domain.hid> Date: Fri, 17 Oct 2008 13:21:32 +0200 From: Jan Kiszka MIME-Version: 1.0 References: <20081016154620.320953568@domain.hid> <20081016154621.644116499@domain.hid> <48F7A064.20404@domain.hid> <48F7C5D1.3010500@domain.hid> <48F855DB.3070300@domain.hid> <48F86F6A.1010801@domain.hid> <48F86FED.2010409@domain.hid> <48F8738E.8070709@domain.hid> In-Reply-To: <48F8738E.8070709@domain.hid> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigB78021062AFFC45F9F173F39" Sender: jan.kiszka@domain.hid Subject: Re: [Xenomai-core] [PATCH 08/12] Use fast xnsynch with native skin List-Id: "Xenomai life and development \(bug reports, patches, discussions\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Gilles Chanteperdrix Cc: xenomai@xenomai.org This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigB78021062AFFC45F9F173F39 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Gilles Chanteperdrix wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: >>>> Jan Kiszka wrote: >>>>> Gilles Chanteperdrix wrote: >>>>>> Jan Kiszka wrote: >>>>>> >>>>>> Same remark for the #ifdefs. >>>>> Yes, but most cases (maybe except for owner checking) are unavoidab= le >>>>> due to heavy differences. I hope that we may have only FASTXNSYCH a= rchs >>>>> one day. >>>>> >>>>>> I also do not understand your modification >>>>>> of rt_cond_wait_inner, this code is very tense; posix and native h= ad the >>>>>> same bug in this area at different times, so this change should pr= obably >>>>>> be made separately. >>>>> Which modification precisely (damn, I need to find out what makes q= uilt >>>>> cause this attachment confusion)? Note that lockcnt tracking change= d >>>>> with this patch: the lock owner himself is in charge of maintaining= it, >>>>> not some thread handing the lock over. >>>>> >>>>> That said, I would happily analyse the case that broke before. I wi= ll >>>>> also check if I can break out the lockcnt maintenance change, but I= >>>>> think to recall that it was coupled to the fast path changes. >>>> The point is that in case of rt_cond_wait, the mutex must unlock >>>> completely regardless of the current lock count. So, the count is se= t to >>>> 1 before calling the function which unlocks the last level of the >>>> recursion count. As far as I can see, you removed the part which set= s >>>> the counter to 1. >>>> >>> Don't find that spot. Are we talking about this change or something e= lse? >>> >>>> --- a/ksrc/skins/native/cond.c >>>> +++ b/ksrc/skins/native/cond.c >>>> @@ -407,24 +407,26 @@ int rt_cond_wait_inner(RT_COND *cond, RT >>> ... >>>> mutex->lockcnt =3D 0; >>>> =20 >>>> - if (xnsynch_release(&mutex->synch_base)) { >>>> - mutex->lockcnt =3D 1; >>>> - /* Scheduling deferred */ >>>> - } >>>> + xnsynch_release(&mutex->synch_base); >>>> + /* Scheduling deferred */ >>>> =20 >>>> xnsynch_sleep_on(&cond->synch_base, timeout, timeout_mode); >>>> =20 >>> In case it's the above: That setting of lockcnt is actually on behalf= of >>> the new owner (xnsynch_release !=3D NULL =3D> we woke up a new owner)= =2E And >>> that has been sanitized in the patch in so far that only the owner >>> manipulates lockcnt, always. If you see a remaining issue in this >>> approach or find a hole in the implementation, please let me know. >> Who sets the lockcnt to 0 if there is no next owner ? >=20 > Ok. It appears in the code you quoted. However, your modification > changes the behaviour of the mutex code: >=20 > with the old version, the mutex will appear locked if an owner has been= > woken up > with the new version, the mutex will appear unlocked between the time > when a new owner has been woken up and the time this new owner gets out= > of xnsynch_acquire, allowing the lock to be stolen by another thread. But that is what cond_wait is about: releasing the associated lock to whoever gets it - either directly or via lock-stealing. I see no problem here. Moreover, lockcnt is never used (and must not!) as an indication if a lock is free. It is a pure helper for the lock owner to track recursion. And it is also no longer consistent between kernel and user side, another reason to allow only the lock owner to play with it. Jan --------------enigB78021062AFFC45F9F173F39 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkj4dTwACgkQniDOoMHTA+mQOwCeLKjRliJBA0LKPeSm9MN5qTiW t9cAn25kWAhdo3TVtnAv7QqMPLHE0Y3d =me/P -----END PGP SIGNATURE----- --------------enigB78021062AFFC45F9F173F39--