From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4E1B56E0.20109@domain.hid>
Date: Mon, 11 Jul 2011 22:02:40 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <4E1B469A.8000703@domain.hid> <4E1B4AC0.80506@domain.hid>
 <4E1B4C19.2070205@domain.hid> <4E1B542B.2010906@domain.hid>
 <4E1B5638.1050005@domain.hid>
In-Reply-To: <4E1B5638.1050005@domain.hid>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race
 between gatekeeper and thread deletion
List-Id: Xenomai life and development
To: Jan Kiszka
Cc: Xenomai core

On 07/11/2011 09:59 PM, Jan Kiszka wrote:
> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct task_struct *p)
>>>>>>  	magic = xnthread_get_magic(thread);
>>>>>>
>>>>>>  	xnlock_get_irqsave(&nklock, s);
>>>>>> +
>>>>>> +	gksched = thread->gksched;
>>>>>> +	if (gksched) {
>>>>>> +		xnlock_put_irqrestore(&nklock, s);
>>>>>
>>>>> Are we sure irqs are on here? Are you sure that what is needed is not an
>>>>> xnlock_clear_irqon?
>>>>
>>>> We are in the context of do_exit. Not only IRQs are on, also preemption.
>>>> And surely no nklock is held.
>>>>
>>>>> Furthermore, I do not understand how we
>>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed to
>>>>> wait for this assignment?
>>>>
>>>> The gatekeeper holds the gksync token while it's active. We request it,
>>>> thus we wait for the gatekeeper to become idle again. While it is idle,
>>>> we reset the queued reference - but I just realized that this may trample
>>>> on other tasks' values. I need to add a check that the value to be
>>>> null'ified is actually still ours.
>>>
>>> Thinking again, that's actually not a problem: gktarget is only needed
>>> while gksync is zero - but then we won't get hold of it anyway and,
>>> thus, can't cause any damage.
>>
>> Well, you make it look like it does not work. From what I understand,
>> what you want is to set gktarget to null if a task being hardened is
>> destroyed. But by waiting for the semaphore, you actually wait for the
>> harden to be complete, so setting to NULL is useless. Or am I missing
>> something else?
>
> Setting to NULL is probably unneeded but still better than relying on the
> gatekeeper never waking up spuriously and then dereferencing a stale
> pointer.
>
> The key element of this fix is waiting on gksync, thus on the completion
> of the non-RT part of the hardening. Actually, this part usually fails
> as the target task received a termination signal at this point.

Yes, but since you wait on the completion of the hardening, the test
if (target &&...) in the gatekeeper code will always be true, because at
this point the cleanup code will still be waiting for the semaphore.

-- 
Gilles.
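
For readers following the thread, the ordering argued about above can be
sketched in user space. This is only an illustration, not the nucleus
code: POSIX semaphores and pthreads stand in for the kernel primitives,
and struct task, struct gatekeeper, the request semaphore, and the
functions request_harden(), exit_cleanup() and run_demo() are all
hypothetical names. Only gksync and gktarget are borrowed from the patch
discussion. The sketch also leaves out signals and spurious wake-ups.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the Xenomai structures. */
struct task {
    int hardened;               /* set by the gatekeeper when it handles us */
};

struct gatekeeper {
    sem_t gksync;               /* 1 = gatekeeper idle, 0 = request in flight */
    sem_t request;              /* assumed wake-up channel for the gatekeeper */
    struct task *gktarget;      /* task queued for hardening, or NULL */
};

/* Gatekeeper: handle one request, then hand the gksync token back. */
static void *gatekeeper_thread(void *arg)
{
    struct gatekeeper *gk = arg;

    sem_wait(&gk->request);
    struct task *target = gk->gktarget;
    if (target)                 /* the "if (target && ...)" style test */
        target->hardened = 1;
    sem_post(&gk->gksync);      /* gatekeeper is idle again */
    return NULL;
}

/* Hardening side: take the token, publish the target, wake the gatekeeper. */
static void request_harden(struct gatekeeper *gk, struct task *t)
{
    sem_wait(&gk->gksync);
    gk->gktarget = t;
    sem_post(&gk->request);
}

/* Exit side: wait until the gatekeeper is idle, then drop our reference. */
static void exit_cleanup(struct gatekeeper *gk, struct task *t)
{
    sem_wait(&gk->gksync);      /* blocks while a request is in flight */
    if (gk->gktarget == t)      /* only clear the value if it is still ours */
        gk->gktarget = NULL;
    sem_post(&gk->gksync);
}

/* Returns 0 if the gatekeeper saw a non-NULL target and cleanup cleared it. */
static int run_demo(void)
{
    struct gatekeeper gk = { .gktarget = NULL };
    struct task t = { .hardened = 0 };
    pthread_t tid;

    sem_init(&gk.gksync, 0, 1);
    sem_init(&gk.request, 0, 0);
    if (pthread_create(&tid, NULL, gatekeeper_thread, &gk) != 0)
        return -1;

    request_harden(&gk, &t);
    exit_cleanup(&gk, &t);      /* cannot run before the gatekeeper posts */
    pthread_join(tid, NULL);

    int ok = (t.hardened == 1) && (gk.gktarget == NULL);
    sem_destroy(&gk.gksync);
    sem_destroy(&gk.request);
    return ok ? 0 : 1;
}
```

In this simplified model the gatekeeper always reads a non-NULL target,
because exit_cleanup() cannot take gksync until the gatekeeper has posted
it. The reset to NULL then only affects a later wake-up, which is the
case the NULL assignment is meant to guard against.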