From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: Xenomai core <Xenomai-core@domain.hid>
Subject: Re: [Xenomai-core] [PULL] native: Fix msendq fastlock leakage
Date: Thu, 23 Jun 2011 21:08:14 +0200
Message-ID: <4E038F1E.6010001@domain.hid>
In-Reply-To: <4E030956.5030702@domain.hid>

On 06/23/2011 11:37 AM, Jan Kiszka wrote:
> On 2011-06-20 19:07, Jan Kiszka wrote:
>> On 2011-06-19 15:00, Gilles Chanteperdrix wrote:
>>> On 06/19/2011 01:17 PM, Gilles Chanteperdrix wrote:
>>>> On 06/19/2011 12:14 PM, Gilles Chanteperdrix wrote:
>>>>> I am working on this ppd cleanup issue again, I am asking for help to
>>>>> find a fix in -head for all cases where the sys_ppd is needed during
>>>>> some cleanup.
>>>>>
>>>>> The problem is that when the ppd cleanup is invoked:
>>>>> - we have no guarantee that current is a thread from the Xenomai
>>>>> application;
>>>>> - if it is, current->mm is NULL.
>>>>>
>>>>> So, associating the sys_ppd to either current or current->mm does not
>>>>> work. What we could do is pass the sys_ppd to all the other ppds cleanup
>>>>> handlers, this would fix cases such as freeing mutexes fastlock, but
>>>>> that does not help when the sys_ppd is needed during a thread deletion hook.
>>>>>
>>>>> I would like to find a solution where simply calling xnsys_ppd_get()
>>>>> will work, where we do not have an xnsys_ppd_get for each context, such
>>>>> as for instance xnsys_ppd_get_by_mm/xnsys_ppd_get_by_task_struct,
>>>>> because it would be too error-prone.
>>>>>
>>>>> Any idea anyone?
>>>>
>>>> The best I could come up with: use a ptd to store the mm currently 
>>>> being cleaned up, so that xnshadow_ppd_get continues to work, even
>>>> in the middle of a cleanup.
>>>
>>> In order to also get xnshadow_ppd_get to work in task deletion hooks 
>>> (which is needed to avoid the issue at the origin of this thread), we 
>>> also need to set this ptd upon shadow mapping, so it is still there 
>>> when reaching the task deletion hook (where current->mm may be NULL). 
>>> Hence the patch:
>>>
>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>> index b243600..6bc4210 100644
>>> --- a/ksrc/nucleus/shadow.c
>>> +++ b/ksrc/nucleus/shadow.c
>>> @@ -65,6 +65,11 @@ int nkthrptd;
>>>  EXPORT_SYMBOL_GPL(nkthrptd);
>>>  int nkerrptd;
>>>  EXPORT_SYMBOL_GPL(nkerrptd);
>>> +int nkmmptd;
>>> +EXPORT_SYMBOL_GPL(nkmmptd);
>>> +
>>> +#define xnshadow_mmptd(t) ((t)->ptd[nkmmptd])
>>> +#define xnshadow_mm(t) ((struct mm_struct *)xnshadow_mmptd(t))
>>
>> xnshadow_mm() can now return a no longer existing mm. So no user of
>> xnshadow_mm should ever dereference that pointer. Thus we had better
>> change all those users to treat the return value as a void pointer, e.g.
>>
>>>  
>>>  struct xnskin_slot {
>>>  	struct xnskin_props *props;
>>> @@ -1304,6 +1309,8 @@ int xnshadow_map(xnthread_t *thread, xncompletion_t __user *u_completion,
>>>  	 * friends.
>>>  	 */
>>>  	xnshadow_thrptd(current) = thread;
>>> +	xnshadow_mmptd(current) = current->mm;
>>> +
>>>  	rthal_enable_notifier(current);
>>>  
>>>  	if (xnthread_base_priority(thread) == 0 &&
>>> @@ -2759,7 +2766,15 @@ static void detach_ppd(xnshadow_ppd_t * ppd)
>>>  
>>>  static inline void do_cleanup_event(struct mm_struct *mm)
>>>  {
>>> +	struct task_struct *p = current;
>>> +	struct mm_struct *old;
>>> +
>>> +	old = xnshadow_mm(p);
>>> +	xnshadow_mmptd(p) = mm;
>>> +
>>>  	ppd_remove_mm(mm, &detach_ppd);
>>> +
>>> +	xnshadow_mmptd(p) = old;
>>
>> I don't have the full picture yet, but that feels racy: If the context
>> over which we clean up that foreign mm is also using xnshadow_mmptd,
>> other threads in that process may dislike this temporary change.
>>
>>>  }
>>>  
>>>  RTHAL_DECLARE_CLEANUP_EVENT(cleanup_event);
>>> @@ -2925,7 +2940,7 @@ EXPORT_SYMBOL_GPL(xnshadow_unregister_interface);
>>>  xnshadow_ppd_t *xnshadow_ppd_get(unsigned muxid)
>>>  {
>>>  	if (xnpod_userspace_p())
>>> -		return ppd_lookup(muxid, current->mm);
>>> +		return ppd_lookup(muxid, xnshadow_mm(current) ?: current->mm);
>>>  
>>>  	return NULL;
>>>  }
>>> @@ -2960,8 +2975,9 @@ int xnshadow_mount(void)
>>>  	sema_init(&completion_mutex, 1);
>>>  	nkthrptd = rthal_alloc_ptdkey();
>>>  	nkerrptd = rthal_alloc_ptdkey();
>>> +	nkmmptd = rthal_alloc_ptdkey();
>>>  
>>> -	if (nkthrptd < 0 || nkerrptd < 0) {
>>> +	if (nkthrptd < 0 || nkerrptd < 0 || nkmmptd < 0) {
>>>  		printk(KERN_ERR "Xenomai: cannot allocate PTD slots\n");
>>>  		return -ENOMEM;
>>>  	}
>>> diff --git a/ksrc/skins/posix/mutex.c b/ksrc/skins/posix/mutex.c
>>> index 6ce75e5..cc86852 100644
>>> --- a/ksrc/skins/posix/mutex.c
>>> +++ b/ksrc/skins/posix/mutex.c
>>> @@ -219,10 +219,6 @@ void pse51_mutex_destroy_internal(pse51_mutex_t *mutex,
>>>  	xnlock_put_irqrestore(&nklock, s);
>>>  
>>>  #ifdef CONFIG_XENO_FASTSYNCH
>>> -	/* We call xnheap_free even if the mutex is not pshared; when
>>> -	   this function is called from pse51_mutexq_cleanup, the
>>> -	   sem_heap is destroyed, or not the one to which the fastlock
>>> -	   belongs, xnheap will simply return an error. */
>>
>> I think this comment is not completely obsolete. It still applies
>> w.r.t. shared/non-shared.
>>
>>>  	xnheap_free(&xnsys_ppd_get(mutex->attr.pshared)->sem_heap,
>>>  		    mutex->synchbase.fastlock);
>>>  #endif /* CONFIG_XENO_FASTSYNCH */
>>>
>>>
>>
>> If we can resolve that potential race, this looks like a nice solution.
> 
> We still have to address that ordering issue I almost forgot:
> do_cleanup_event runs before do_task_exit_event when terminating the
> last task. The former destroys the sem heap, the latter fires the delete
> hook which then tries to free msendq.fastlock to an invalid heap.
> 
> Should be fixable by setting sem_heap NULL in the ppd on destroy and
> skipping the fastlock release in __task_delete_hook if the heap pointer
> is found like that.

That will not work: the ppd is already destroyed by the time we reach
the task exit event. And this ordering is not guaranteed either.

One way to guarantee the order would be to increment the mm reference
count when mapping a shadow, and to drop that reference when unmapping.
This way, the cleanup event would be guaranteed to happen after the last
shadow exits.
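
As a rough userspace model of this first option (this is not Xenomai or
kernel code; struct mm_model, mm_model_get(), mm_model_put(),
shadow_map_model() and shadow_unmap_model() are made-up names, and which
kernel counter would actually be pinned is left open), each mapped
shadow holds one reference on the mm, so the cleanup can only run once
the last shadow has gone through its delete hook:

```c
#include <assert.h>

/* Hypothetical stand-in for an mm carrying a reference count. */
struct mm_model {
	int refs;        /* references currently held on the mm */
	int cleanup_ran; /* set once the cleanup event has fired */
};

static void mm_model_get(struct mm_model *mm)
{
	mm->refs++;
}

/* Drop one reference; the cleanup event fires on the last put. */
static void mm_model_put(struct mm_model *mm)
{
	assert(mm->refs > 0);
	if (--mm->refs == 0)
		mm->cleanup_ran = 1;
}

/* Mapping a shadow pins the mm. */
static void shadow_map_model(struct mm_model *mm)
{
	mm_model_get(mm);
}

/*
 * Unmapping runs the delete hook first, then unpins the mm.
 * Returns 1 if the hook ran before the cleanup, 0 otherwise.
 */
static int shadow_unmap_model(struct mm_model *mm)
{
	int hook_saw_live_mm = !mm->cleanup_ran;

	mm_model_put(mm);
	return hook_saw_live_mm;
}
```

In this model, the process dropping its own reference before the shadow
is unmapped no longer triggers the cleanup; it only fires from the put
in shadow_unmap_model(), after the hook.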

Or we can keep a reference count on the sys ppd, and do the detach when
this count reaches zero. We would then no longer need the cleanup event.
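
To make the second option concrete, here is a minimal userspace sketch,
assuming one reference for the process plus one per mapped shadow. All
names (struct sys_ppd_model, ppd_model_get(), ppd_model_put(),
fastlock_model_free(), task_delete_hook_model()) are hypothetical, and
the heap is reduced to a flag and a counter:

```c
#include <assert.h>

/* Hypothetical model of a sys ppd owning the sem heap. */
struct sys_ppd_model {
	int refs;       /* one for the process, one per mapped shadow */
	int heap_alive; /* stands in for a usable sem_heap */
	int fastlocks;  /* fastlocks currently allocated from the heap */
};

static void ppd_model_init(struct sys_ppd_model *ppd)
{
	ppd->refs = 1; /* reference held by the process itself */
	ppd->heap_alive = 1;
	ppd->fastlocks = 0;
}

static void ppd_model_get(struct sys_ppd_model *ppd)
{
	ppd->refs++;
}

/* Detach (destroy the heap) only when the last reference is dropped. */
static void ppd_model_put(struct sys_ppd_model *ppd)
{
	assert(ppd->refs > 0);
	if (--ppd->refs == 0)
		ppd->heap_alive = 0;
}

/* Mapping a shadow takes a reference and allocates its fastlock. */
static void shadow_map_ppd_model(struct sys_ppd_model *ppd)
{
	ppd_model_get(ppd);
	ppd->fastlocks++;
}

/* Returns 0 on success, -1 if the heap is gone or holds no fastlock. */
static int fastlock_model_free(struct sys_ppd_model *ppd)
{
	if (!ppd->heap_alive || ppd->fastlocks == 0)
		return -1;
	ppd->fastlocks--;
	return 0;
}

/* Delete hook: free the fastlock, then drop the shadow's reference. */
static int task_delete_hook_model(struct sys_ppd_model *ppd)
{
	int err = fastlock_model_free(ppd);

	ppd_model_put(ppd);
	return err;
}
```

With this, whichever of the process teardown and the delete hook runs
last is the one that ends up detaching, so the hook always frees into a
heap that still exists.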

-- 
                                                                Gilles.

