All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai] occasional EBADF in select() in notifier.c
@ 2014-04-21 21:24 Matthias Schneider
  2014-04-21 21:32 ` Gilles Chanteperdrix
  2014-04-22  7:32 ` Philippe Gerum
  0 siblings, 2 replies; 8+ messages in thread
From: Matthias Schneider @ 2014-04-21 21:24 UTC (permalink / raw)
  To: xenomai@xenomai.org

Still working on thread suspension in forge/mercury, I occasionally get a EBADF
of the select() call in notifier.c. I suspect that this is due to accessing a 
copy of the file descriptor list notifier_rset while one of the file descriptors 
is being closed. This seems to be due to concurrent access on the notifier_rset 
from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in 
notifier_lock(), but not when copying and invoking select in notifier_sighandler().
The EBADF leads to a "spurious notification" reporting and process termination -
 obviously, the thread suspension was not triggered.

I can think of several ways of addressing this issue but I am not sure about 
side effects:
a) hold the "notifier_lock" mutex between copying the descriptor list and calling select
b) repeating the select() call in the case of EBADF

Any ideas?

Anyway, why is the select call necessary, isnt the file descriptor signaled via 
siginfo->si_fd, too?

Regards,
Matthias


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-21 21:24 [Xenomai] occasional EBADF in select() in notifier.c Matthias Schneider
@ 2014-04-21 21:32 ` Gilles Chanteperdrix
  2014-04-21 21:53   ` Gilles Chanteperdrix
  2014-04-22  7:32 ` Philippe Gerum
  1 sibling, 1 reply; 8+ messages in thread
From: Gilles Chanteperdrix @ 2014-04-21 21:32 UTC (permalink / raw)
  To: Matthias Schneider; +Cc: xenomai@xenomai.org

On 04/21/2014 11:24 PM, Matthias Schneider wrote:
> Still working on thread suspension in forge/mercury, I occasionally get a EBADF
> of the select() call in notifier.c. I suspect that this is due to accessing a 
> copy of the file descriptor list notifier_rset while one of the file descriptors 
> is being closed. This seems to be due to concurrent access on the notifier_rset 
> from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in 
> notifier_lock(), but not when copying and invoking select in notifier_sighandler().
> The EBADF leads to a "spurious notification" reporting and process termination -
>  obviously, the thread suspension was not triggered.
> 
> I can think of several ways of addressing this issue but I am not sure about 
> side effects:
> a) hold the "notifier_lock" mutex between copying the descriptor list and calling select
> b) repeating the select() call in the case of EBADF
> 
> Any ideas?
> 
> Anyway, why is the select call necessary, isnt the file descriptor signaled via 
> siginfo->si_fd, too?

I do not know this part of the code, but normally, when select is called
for a closed descriptor, select will return with the descriptor
readable, EOF is returned when reading the descriptor, which can then be
closed. We had a bug report some time ago to implement this behaviour in
xenomai posix skin (so for cobalt).

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-21 21:32 ` Gilles Chanteperdrix
@ 2014-04-21 21:53   ` Gilles Chanteperdrix
  0 siblings, 0 replies; 8+ messages in thread
From: Gilles Chanteperdrix @ 2014-04-21 21:53 UTC (permalink / raw)
  To: Matthias Schneider; +Cc: xenomai@xenomai.org

On 04/21/2014 11:32 PM, Gilles Chanteperdrix wrote:
> On 04/21/2014 11:24 PM, Matthias Schneider wrote:
>> Still working on thread suspension in forge/mercury, I occasionally get a EBADF
>> of the select() call in notifier.c. I suspect that this is due to accessing a 
>> copy of the file descriptor list notifier_rset while one of the file descriptors 
>> is being closed. This seems to be due to concurrent access on the notifier_rset 
>> from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in 
>> notifier_lock(), but not when copying and invoking select in notifier_sighandler().
>> The EBADF leads to a "spurious notification" reporting and process termination -
>>  obviously, the thread suspension was not triggered.
>>
>> I can think of several ways of addressing this issue but I am not sure about 
>> side effects:
>> a) hold the "notifier_lock" mutex between copying the descriptor list and calling select
>> b) repeating the select() call in the case of EBADF
>>
>> Any ideas?
>>
>> Anyway, why is the select call necessary, isnt the file descriptor signaled via 
>> siginfo->si_fd, too?
> 
> I do not know this part of the code, but normally, when select is called
> for a closed descriptor, select will return with the descriptor
> readable, EOF is returned when reading the descriptor, which can then be
> closed. We had a bug report some time ago to implement this behaviour in
> xenomai posix skin (so for cobalt).

Obviously, if it is already closed, read will not return EOF, and there
is no need to close it again, I got all mixed up, sorry for the noise.

> 


-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-21 21:24 [Xenomai] occasional EBADF in select() in notifier.c Matthias Schneider
  2014-04-21 21:32 ` Gilles Chanteperdrix
@ 2014-04-22  7:32 ` Philippe Gerum
  2014-04-22 10:49   ` Matthias Schneider
  1 sibling, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2014-04-22  7:32 UTC (permalink / raw)
  To: Matthias Schneider, xenomai@xenomai.org

On 04/21/2014 11:24 PM, Matthias Schneider wrote:
> Still working on thread suspension in forge/mercury, I occasionally get a EBADF
> of the select() call in notifier.c. I suspect that this is due to accessing a
> copy of the file descriptor list notifier_rset while one of the file descriptors
> is being closed. This seems to be due to concurrent access on the notifier_rset
> from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in
> notifier_lock(), but not when copying and invoking select in notifier_sighandler().
> The EBADF leads to a "spurious notification" reporting and process termination -
>   obviously, the thread suspension was not triggered.
>
> I can think of several ways of addressing this issue but I am not sure about
> side effects:
> a) hold the "notifier_lock" mutex between copying the descriptor list and calling select

Not an option, we would need a threaded handler for grabbing the 
mutex-based lock, which would defeat the purpose of using a directed 
signal for forcing the recipient thread to stop execution until released.

> b) repeating the select() call in the case of EBADF
>

EBADF should be ignored. This just means that we won't find the notifier 
block in the scanned list anyway, which is a possible and correct outcome.

> Any ideas?
>
> Anyway, why is the select call necessary, isnt the file descriptor signaled via
> siginfo->si_fd, too?
>

Yes it is. This select() loop is a left-over.

> Regards,
> Matthias
>
> _______________________________________________
> Xenomai mailing list
> Xenomai@xenomai.org
> http://www.xenomai.org/mailman/listinfo/xenomai
>


-- 
Philippe.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-22  7:32 ` Philippe Gerum
@ 2014-04-22 10:49   ` Matthias Schneider
  2014-04-22 13:02     ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Matthias Schneider @ 2014-04-22 10:49 UTC (permalink / raw)
  To: xenomai@xenomai.org

----- Original Message -----
> From: Philippe Gerum <rpm@xenomai.org>
> To: Matthias Schneider <ma30002000@yahoo.de>; "xenomai@xenomai.org" <xenomai@xenomai.org>
> Cc: 
> Sent: Tuesday, April 22, 2014 9:32 AM
> Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c
> 
> On 04/21/2014 11:24 PM, Matthias Schneider wrote:
>>  Still working on thread suspension in forge/mercury, I occasionally get a 
> EBADF
>>  of the select() call in notifier.c. I suspect that this is due to accessing 
> a
>>  copy of the file descriptor list notifier_rset while one of the file 
> descriptors
>>  is being closed. This seems to be due to concurrent access on the 
> notifier_rset
>>  from notifier_sighandler() and notifier_destroy(). 
> "notifier_lock" is held in
>>  notifier_lock(), but not when copying and invoking select in 
> notifier_sighandler().
>>  The EBADF leads to a "spurious notification" reporting and 
> process termination -
>>    obviously, the thread suspension was not triggered.
>> 
>>  I can think of several ways of addressing this issue but I am not sure 
> about
>>  side effects:
>>  a) hold the "notifier_lock" mutex between copying the descriptor 
> list and calling select
> 
> Not an option, we would need a threaded handler for grabbing the 
> mutex-based lock, which would defeat the purpose of using a directed 
> signal for forcing the recipient thread to stop execution until released.
> 

Ok, I understand.

>>  b) repeating the select() call in the case of EBADF
>> 
> 
> EBADF should be ignored. This just means that we won't find the notifier 
> block in the scanned list anyway, which is a possible and correct outcome.


I do not agree. EBADF only signals that any of the fds is invalid, but not necessarily the one the current thread is interested in. In the scenario being produced in my test, descriptor "A" was being signaled and "B" was closed, being the cause for EBADF. If I had ignored the error, I would have missed a notification for "A". Repeating the select call with a fresh copy of notifier_rset seemed to correctly retrieved the right entry.

> 
>>  Any ideas?
>> 
>>  Anyway, why is the select call necessary, isnt the file descriptor signaled 
> via
>>  siginfo->si_fd, too?
>> 
> 
> Yes it is. This select() loop is a left-over.

So would this be a third variant, getting rid of select() and using si_fd?


> 
>>  Regards,
>>  Matthias
> 
> 
> -- 
> Philippe.
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-22 10:49   ` Matthias Schneider
@ 2014-04-22 13:02     ` Philippe Gerum
  2014-04-22 18:00       ` Matthias Schneider
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2014-04-22 13:02 UTC (permalink / raw)
  To: Matthias Schneider, xenomai@xenomai.org

On 04/22/2014 12:49 PM, Matthias Schneider wrote:
> ----- Original Message -----
>> From: Philippe Gerum <rpm@xenomai.org>
>> To: Matthias Schneider <ma30002000@yahoo.de>; "xenomai@xenomai.org" <xenomai@xenomai.org>
>> Cc:
>> Sent: Tuesday, April 22, 2014 9:32 AM
>> Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c
>>
>> On 04/21/2014 11:24 PM, Matthias Schneider wrote:
>>>    Still working on thread suspension in forge/mercury, I occasionally get a
>> EBADF
>>>    of the select() call in notifier.c. I suspect that this is due to accessing
>> a
>>>    copy of the file descriptor list notifier_rset while one of the file
>> descriptors
>>>    is being closed. This seems to be due to concurrent access on the
>> notifier_rset
>>>    from notifier_sighandler() and notifier_destroy().
>> "notifier_lock" is held in
>>>    notifier_lock(), but not when copying and invoking select in
>> notifier_sighandler().
>>>    The EBADF leads to a "spurious notification" reporting and
>> process termination -
>>>     obviously, the thread suspension was not triggered.
>>>
>>>    I can think of several ways of addressing this issue but I am not sure
>> about
>>>    side effects:
>>>    a) hold the "notifier_lock" mutex between copying the descriptor
>> list and calling select
>>
>> Not an option, we would need a threaded handler for grabbing the
>> mutex-based lock, which would defeat the purpose of using a directed
>> signal for forcing the recipient thread to stop execution until released.
>>
>
> Ok, I understand.
>
>>>    b) repeating the select() call in the case of EBADF
>>>
>>
>> EBADF should be ignored. This just means that we won't find the notifier
>> block in the scanned list anyway, which is a possible and correct outcome.
>
>
> I do not agree. EBADF only signals that any of the fds is invalid, but not necessarily the one the current thread is interested in. In the scenario being produced in my test, descriptor "A" was being signaled and "B" was closed, being the cause for EBADF. If I had ignored the error, I would have missed a notification for "A". Repeating the select call with a fresh copy of notifier_rset seemed to correctly retrieved the right entry.

This is what I implied: ignore the invalid fd causing EBADF, drop it 
from the poll mask and redo. We obviously don't want to miss other valid 
notifications.

>
>>
>>>    Any ideas?
>>>
>>>    Anyway, why is the select call necessary, isnt the file descriptor signaled
>> via
>>>    siginfo->si_fd, too?
>>>
>>
>> Yes it is. This select() loop is a left-over.
>
> So would this be a third variant, getting rid of select() and using si_fd?
>

Yes, we don't need to maintain the notifier_rset and the select loop. 
Just check for psfd[0] ==  siginfo->si_fd in loop for a match.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-22 13:02     ` Philippe Gerum
@ 2014-04-22 18:00       ` Matthias Schneider
  2014-04-23 13:45         ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Matthias Schneider @ 2014-04-22 18:00 UTC (permalink / raw)
  To: Philippe Gerum, xenomai@xenomai.org



----- Original Message -----
> From: Philippe Gerum <rpm@xenomai.org>
> To: Matthias Schneider <ma30002000@yahoo.de>; "xenomai@xenomai.org" <xenomai@xenomai.org>
> Cc: 
> Sent: Tuesday, April 22, 2014 3:02 PM
> Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c
> 
> On 04/22/2014 12:49 PM, Matthias Schneider wrote:
>>  ----- Original Message -----
>>>  From: Philippe Gerum <rpm@xenomai.org>
>>>  To: Matthias Schneider <ma30002000@yahoo.de>; 
> "xenomai@xenomai.org" <xenomai@xenomai.org>
>>>  Cc:
>>>  Sent: Tuesday, April 22, 2014 9:32 AM
>>>  Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c
>>> 
>>>  On 04/21/2014 11:24 PM, Matthias Schneider wrote:
>>>>     Still working on thread suspension in forge/mercury, I 
> occasionally get a
>>>  EBADF
>>>>     of the select() call in notifier.c. I suspect that this is due 
> to accessing
>>>  a
>>>>     copy of the file descriptor list notifier_rset while one of the 
> file
>>>  descriptors
>>>>     is being closed. This seems to be due to concurrent access on 
> the
>>>  notifier_rset
>>>>     from notifier_sighandler() and notifier_destroy().
>>>  "notifier_lock" is held in
>>>>     notifier_lock(), but not when copying and invoking select in
>>>  notifier_sighandler().
>>>>     The EBADF leads to a "spurious notification" reporting 
> and
>>>  process termination -
>>>>      obviously, the thread suspension was not triggered.
>>>> 
>>>>     I can think of several ways of addressing this issue but I am 
> not sure
>>>  about
>>>>     side effects:
>>>>     a) hold the "notifier_lock" mutex between copying the 
> descriptor
>>>  list and calling select
>>> 
>>>  Not an option, we would need a threaded handler for grabbing the
>>>  mutex-based lock, which would defeat the purpose of using a directed
>>>  signal for forcing the recipient thread to stop execution until 
> released.
>>> 
>> 
>>  Ok, I understand.
>> 
>>>>     b) repeating the select() call in the case of EBADF
>>>> 
>>> 
>>>  EBADF should be ignored. This just means that we won't find the 
> notifier
>>>  block in the scanned list anyway, which is a possible and correct 
> outcome.
>> 
>> 
>>  I do not agree. EBADF only signals that any of the fds is invalid, but not 
> necessarily the one the current thread is interested in. In the scenario being 
> produced in my test, descriptor "A" was being signaled and 
> "B" was closed, being the cause for EBADF. If I had ignored the error, 
> I would have missed a notification for "A". Repeating the select call 
> with a fresh copy of notifier_rset seemed to correctly retrieved the right 
> entry.
> 
> This is what I implied: ignore the invalid fd causing EBADF, drop it 
> from the poll mask and redo. We obviously don't want to miss other valid 
> notifications.
> 

Ok, sorry I got you wrong.

>> 
>>> 
>>>>     Any ideas?
>>>> 
>>>>     Anyway, why is the select call necessary, isnt the file 
> descriptor signaled
>>>  via
>>>>     siginfo->si_fd, too?
>>>> 
>>> 
>>>  Yes it is. This select() loop is a left-over.
>> 
>>  So would this be a third variant, getting rid of select() and using si_fd?
>> 
> 
> Yes, we don't need to maintain the notifier_rset and the select loop. 
> Just check for psfd[0] ==  siginfo->si_fd in loop for a match.
> 
> 
> -- 
> Philippe.
> 


Considering the three options I think I prefer the last one and remove the
select() altogether. I have attached a small patch that does exactly that.

Matthias
-------------- next part --------------
A non-text attachment was scrubbed...
Name: notifier_without_select.patch
Type: text/x-patch
Size: 1812 bytes
Desc: not available
URL: <http://www.xenomai.org/pipermail/xenomai/attachments/20140422/7412edcd/attachment.bin>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] occasional EBADF in select() in notifier.c
  2014-04-22 18:00       ` Matthias Schneider
@ 2014-04-23 13:45         ` Philippe Gerum
  0 siblings, 0 replies; 8+ messages in thread
From: Philippe Gerum @ 2014-04-23 13:45 UTC (permalink / raw)
  To: Matthias Schneider, xenomai@xenomai.org

On 04/22/2014 08:00 PM, Matthias Schneider wrote:

>
> Considering the three options I think I prefer the last one and remove the
> select() altogether. I have attached a small patch that does exactly that.
>

Merged, thanks.


-- 
Philippe.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-04-23 13:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-21 21:24 [Xenomai] occasional EBADF in select() in notifier.c Matthias Schneider
2014-04-21 21:32 ` Gilles Chanteperdrix
2014-04-21 21:53   ` Gilles Chanteperdrix
2014-04-22  7:32 ` Philippe Gerum
2014-04-22 10:49   ` Matthias Schneider
2014-04-22 13:02     ` Philippe Gerum
2014-04-22 18:00       ` Matthias Schneider
2014-04-23 13:45         ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.