All of lore.kernel.org
 help / color / mirror / Atom feed
From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: xenomai-core <xenomai@xenomai.org>
Subject: Re: [Xenomai-core] Pending patches
Date: Tue, 20 Jan 2009 10:22:54 +0100	[thread overview]
Message-ID: <497597EE.1050303@domain.hid> (raw)
In-Reply-To: <4975942D.60906@domain.hid>

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
>>>> Philippe Gerum wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Philippe Gerum wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Philippe Gerum wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>> commit 728fc8970e2032b3280971788f1223f3ad82d80d
>>>>>>>>>> Author: Jan Kiszka <jan.kiszka@domain.hid>
>>>>>>>>>> Date:   Thu Jan 15 11:10:24 2009 +0100
>>>>>>>>>>
>>>>>>>>>>     xnpipe: Fix racy callback handlers
>>>>>>>>>>     
>>>>>>>>>>     Invocation of input, output and alloc handler must take place under
>>>>>>>>>>     nklock to properly synchronize with xnpipe_disconnect. Change all
>>>>>>>>>>     callers to comply with this policy.
>>>>>>>>>>
>>>>>>>>> That one is under investigation. I agree on the bug report (thanks btw), but I
>>>>>>>>> disagree on the fix. Basically, we can't run all hooks under nklock. For
>>>>>>>>> instance, the alloc_handler may issue kmalloc() calls when issued from the Linux
>>>>>>>>> write endpoint.
>>>>>>>> You mean it /could/? Because no in-tree user (ie. native) calls
>>>>>>>> rt-unsafe services from its alloc_handler.
>>>>>>>>
>>>>>>> When you export a public interface, it is better not to make it incompatible
>>>>>>> unless there is no other way to fix a situation. Doing so is last resort for me.
>>>>>> OTH, there is nothing documented yet about those callback handlers or
>>>>>> xnpipe_connect. So we could only break _assumptions_ about this
>>>>>> interface.
>>>>> Actually, we would break existing implementations, but I understand your point.
>>>>>
>>>>>  But, of course, I would be happy if we could continue to keep
>>>>>> the critical section length short. I just don't see how to achieve this
>>>>>> without significant restrictions on the callback handlers and their use
>>>>>> cases.
>>>>>>
>>>>> That is because the semantics of those callouts is not that smart. If we have to
>>>>> break the API to solve the locking issue, I want to get the semantics fixed in
>>>>> the same move (which may help a lot in solving the locking issue as well), so we
>>>>> don't end up with two subsequent breakage.
>>>>>
>>>> As suspected, the callback management in the message pipe code had very serious
>>>> issues, including the xnpipe_disconnect / xnpipe_release race. At least:
>>>>
>>>> - the latency would simply skyrocket when the input queue (i.e. userland to
>>>> kernel) is flushed from xnpipe_cleanup_user_conn, because the latter holds the
>>>> nklock with interrupts off when doing so. I've tested this case on v2.4.x, the
>>>> results on a latency test when the pipe test code is running in parallel are not
>>>> pretty (i.e. > 220 us latency spot).
>>>>
>>>> - message buffers waiting in the output queue while the latter is flushed would
>>>> NOT be returned to the memory pool when an input handler is present, which is
>>>> the case with the native API that provides one. All the confusion went from the
>>>> output notification and free buffer logic that were combined into the output
>>>> handler. This went unnoticed probably because messages sent via the write()
>>>> endpoint are immediately consumed by the receiving side in a RT task that
>>>> preempts, but still, the bug is there.
>>>>
>>>> The disconnect vs release race can be fixed in a (nearly) lockless way using a
>>>> lingering close sequence, that prevents xnpipe_disconnect() to wipe out the
>>>> extra state (e.g. RT_PIPE struct) until xnpipe_release() does not need it
>>>> anymore. In short, if the userland side is connected, xnpipe_disconnect() will
>>>> defer the actual release of that extra state to xnpipe_release(); otherwise, it
>>>> will discard it immediately. A new callback, namely .release has been added for
>>>> that purpose.
>>> Just had a short glance so far. Looks good - except that there is still
>>> a use (cookie, now xstate, from xnpipe_state) after release
>>> (xnpipe_minor_free). Can't we push the minor_free after calling the
>>> release handler?
>>>
>> We should indeed postpone this just in case the upper layer indexes the extra
>> state on the minor value. We can also simplify a few things doing so.
>>
>> --- ksrc/nucleus/pipe.c	(revision 4565)
>> +++ ksrc/nucleus/pipe.c	(working copy)
>> @@ -77,11 +77,9 @@
>>
>>  static inline void xnpipe_minor_free(int minor)
>>  {
>> -	if (minor < 0 || minor >= XNPIPE_NDEVS)
>> -		return;
>> -
>> -	__clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> -		  1UL << (minor % BITS_PER_LONG));
> 
> xnarch_write_memory_barrier()? Just for the paranoiacs.
>

Yes, we should be that paranoid.

>> +	/* May be called with nklock free. */
>> +	clrbits(xnpipe_bitmap[minor / BITS_PER_LONG],
>> +		1UL << (minor % BITS_PER_LONG));
>>  }
>>
>>  static inline void xnpipe_enqueue_wait(struct xnpipe_state *state, int mask)
>> @@ -413,9 +411,9 @@
>>  		__setbits(state->status, XNPIPE_KERN_LCLOSE);
>>  		xnlock_put_irqrestore(&nklock, s);
>>  	} else {
>> -		xnpipe_minor_free(minor);
>>  		xnlock_put_irqrestore(&nklock, s);
>>  		state->ops.release(state->xstate);
>> +		xnpipe_minor_free(minor);
>>  	}
>>
>>  	if (need_sched)
>> @@ -621,9 +619,9 @@
>>  		__clrbits((__state)->status, XNPIPE_USER_CONN);		\
>>  		if (testbits((__state)->status, XNPIPE_KERN_LCLOSE)) {	\
>>  			clrbits((__state)->status, XNPIPE_KERN_LCLOSE);	\
>> -			xnpipe_minor_free(xnminor_from_state(__state));	\
>>  			xnlock_put_irqrestore(&nklock, (__s));		\
>>  			(__state)->ops.release((__state)->xstate);	\
>> +			xnpipe_minor_free(xnminor_from_state(__state));	\
>>  			xnlock_get_irqsave(&nklock, (__s));		\
>>  		}							\
>>  	} while(0)
>>
> 
> Jan
> 


-- 
Philippe.


  parent reply	other threads:[~2009-01-20  9:22 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-15 10:19 [Xenomai-core] Pending patches Jan Kiszka
2009-01-15 10:30 ` Gilles Chanteperdrix
2009-01-15 11:00   ` Gilles Chanteperdrix
2009-01-15 11:31     ` Jan Kiszka
2009-01-15 13:02       ` Gilles Chanteperdrix
2009-01-15 14:12         ` Jan Kiszka
2009-01-15 11:28   ` Jan Kiszka
2009-01-15 13:09     ` Gilles Chanteperdrix
2009-01-15 14:17       ` Jan Kiszka
2009-01-15 21:46         ` Jan Kiszka
2009-01-15 10:49 ` Philippe Gerum
2009-01-15 11:24   ` Jan Kiszka
2009-01-15 12:09     ` Philippe Gerum
2009-01-15 13:55       ` Jan Kiszka
2009-01-15 14:16         ` Philippe Gerum
2009-01-19 23:18           ` Philippe Gerum
2009-01-19 23:35             ` Jan Kiszka
2009-01-20  8:49               ` Philippe Gerum
2009-01-20  9:06                 ` Jan Kiszka
2009-01-20  9:18                   ` Jan Kiszka
2009-01-20  9:22                   ` Philippe Gerum [this message]
2009-01-27 13:10                 ` Jan Kiszka
2009-01-27 13:26                   ` Jan Kiszka
2009-01-27 13:28                     ` Gilles Chanteperdrix
2009-01-27 13:29                   ` Philippe Gerum
2009-01-27 13:40                     ` Gilles Chanteperdrix
2009-01-27 13:40                   ` Philippe Gerum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=497597EE.1050303@domain.hid \
    --to=rpm@xenomai.org \
    --cc=jan.kiszka@domain.hid \
    --cc=xenomai@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.