* [Xenomai] occasional EBADF in select() in notifier.c @ 2014-04-21 21:24 Matthias Schneider 2014-04-21 21:32 ` Gilles Chanteperdrix 2014-04-22 7:32 ` Philippe Gerum 0 siblings, 2 replies; 8+ messages in thread From: Matthias Schneider @ 2014-04-21 21:24 UTC (permalink / raw) To: xenomai@xenomai.org Still working on thread suspension in forge/mercury, I occasionally get a EBADF of the select() call in notifier.c. I suspect that this is due to accessing a copy of the file descriptor list notifier_rset while one of the file descriptors is being closed. This seems to be due to concurrent access on the notifier_rset from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in notifier_lock(), but not when copying and invoking select in notifier_sighandler(). The EBADF leads to a "spurious notification" reporting and process termination - obviously, the thread suspension was not triggered. I can think of several ways of addressing this issue but I am not sure about side effects: a) hold the "notifier_lock" mutex between copying the descriptor list and calling select b) repeating the select() call in the case of EBADF Any ideas? Anyway, why is the select call necessary, isnt the file descriptor signaled via siginfo->si_fd, too? Regards, Matthias ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-21 21:24 [Xenomai] occasional EBADF in select() in notifier.c Matthias Schneider @ 2014-04-21 21:32 ` Gilles Chanteperdrix 2014-04-21 21:53 ` Gilles Chanteperdrix 2014-04-22 7:32 ` Philippe Gerum 1 sibling, 1 reply; 8+ messages in thread From: Gilles Chanteperdrix @ 2014-04-21 21:32 UTC (permalink / raw) To: Matthias Schneider; +Cc: xenomai@xenomai.org On 04/21/2014 11:24 PM, Matthias Schneider wrote: > Still working on thread suspension in forge/mercury, I occasionally get a EBADF > of the select() call in notifier.c. I suspect that this is due to accessing a > copy of the file descriptor list notifier_rset while one of the file descriptors > is being closed. This seems to be due to concurrent access on the notifier_rset > from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in > notifier_lock(), but not when copying and invoking select in notifier_sighandler(). > The EBADF leads to a "spurious notification" reporting and process termination - > obviously, the thread suspension was not triggered. > > I can think of several ways of addressing this issue but I am not sure about > side effects: > a) hold the "notifier_lock" mutex between copying the descriptor list and calling select > b) repeating the select() call in the case of EBADF > > Any ideas? > > Anyway, why is the select call necessary, isnt the file descriptor signaled via > siginfo->si_fd, too? I do not know this part of the code, but normally, when select is called for a closed descriptor, select will return with the descriptor readable, EOF is returned when reading the descriptor, which can then be closed. We had a bug report some time ago to implement this behaviour in xenomai posix skin (so for cobalt). -- Gilles. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-21 21:32 ` Gilles Chanteperdrix @ 2014-04-21 21:53 ` Gilles Chanteperdrix 0 siblings, 0 replies; 8+ messages in thread From: Gilles Chanteperdrix @ 2014-04-21 21:53 UTC (permalink / raw) To: Matthias Schneider; +Cc: xenomai@xenomai.org On 04/21/2014 11:32 PM, Gilles Chanteperdrix wrote: > On 04/21/2014 11:24 PM, Matthias Schneider wrote: >> Still working on thread suspension in forge/mercury, I occasionally get a EBADF >> of the select() call in notifier.c. I suspect that this is due to accessing a >> copy of the file descriptor list notifier_rset while one of the file descriptors >> is being closed. This seems to be due to concurrent access on the notifier_rset >> from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in >> notifier_lock(), but not when copying and invoking select in notifier_sighandler(). >> The EBADF leads to a "spurious notification" reporting and process termination - >> obviously, the thread suspension was not triggered. >> >> I can think of several ways of addressing this issue but I am not sure about >> side effects: >> a) hold the "notifier_lock" mutex between copying the descriptor list and calling select >> b) repeating the select() call in the case of EBADF >> >> Any ideas? >> >> Anyway, why is the select call necessary, isnt the file descriptor signaled via >> siginfo->si_fd, too? > > I do not know this part of the code, but normally, when select is called > for a closed descriptor, select will return with the descriptor > readable, EOF is returned when reading the descriptor, which can then be > closed. We had a bug report some time ago to implement this behaviour in > xenomai posix skin (so for cobalt). Obviously, if it is already closed, read will not return EOF, and there is no need to close it again, I got all mixed up, sorry for the noise. > -- Gilles. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-21 21:24 [Xenomai] occasional EBADF in select() in notifier.c Matthias Schneider 2014-04-21 21:32 ` Gilles Chanteperdrix @ 2014-04-22 7:32 ` Philippe Gerum 2014-04-22 10:49 ` Matthias Schneider 1 sibling, 1 reply; 8+ messages in thread From: Philippe Gerum @ 2014-04-22 7:32 UTC (permalink / raw) To: Matthias Schneider, xenomai@xenomai.org On 04/21/2014 11:24 PM, Matthias Schneider wrote: > Still working on thread suspension in forge/mercury, I occasionally get a EBADF > of the select() call in notifier.c. I suspect that this is due to accessing a > copy of the file descriptor list notifier_rset while one of the file descriptors > is being closed. This seems to be due to concurrent access on the notifier_rset > from notifier_sighandler() and notifier_destroy(). "notifier_lock" is held in > notifier_lock(), but not when copying and invoking select in notifier_sighandler(). > The EBADF leads to a "spurious notification" reporting and process termination - > obviously, the thread suspension was not triggered. > > I can think of several ways of addressing this issue but I am not sure about > side effects: > a) hold the "notifier_lock" mutex between copying the descriptor list and calling select Not an option, we would need a threaded handler for grabbing the mutex-based lock, which would defeat the purpose of using a directed signal for forcing the recipient thread to stop execution until released. > b) repeating the select() call in the case of EBADF > EBADF should be ignored. This just means that we won't find the notifier block in the scanned list anyway, which is a possible and correct outcome. > Any ideas? > > Anyway, why is the select call necessary, isnt the file descriptor signaled via > siginfo->si_fd, too? > Yes it is. This select() loop is a left-over. > Regards, > Matthias > > _______________________________________________ > Xenomai mailing list > Xenomai@xenomai.org > http://www.xenomai.org/mailman/listinfo/xenomai > -- Philippe. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-22 7:32 ` Philippe Gerum @ 2014-04-22 10:49 ` Matthias Schneider 2014-04-22 13:02 ` Philippe Gerum 0 siblings, 1 reply; 8+ messages in thread From: Matthias Schneider @ 2014-04-22 10:49 UTC (permalink / raw) To: xenomai@xenomai.org ----- Original Message ----- > From: Philippe Gerum <rpm@xenomai.org> > To: Matthias Schneider <ma30002000@yahoo.de>; "xenomai@xenomai.org" <xenomai@xenomai.org> > Cc: > Sent: Tuesday, April 22, 2014 9:32 AM > Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c > > On 04/21/2014 11:24 PM, Matthias Schneider wrote: >> Still working on thread suspension in forge/mercury, I occasionally get a > EBADF >> of the select() call in notifier.c. I suspect that this is due to accessing > a >> copy of the file descriptor list notifier_rset while one of the file > descriptors >> is being closed. This seems to be due to concurrent access on the > notifier_rset >> from notifier_sighandler() and notifier_destroy(). > "notifier_lock" is held in >> notifier_lock(), but not when copying and invoking select in > notifier_sighandler(). >> The EBADF leads to a "spurious notification" reporting and > process termination - >> obviously, the thread suspension was not triggered. >> >> I can think of several ways of addressing this issue but I am not sure > about >> side effects: >> a) hold the "notifier_lock" mutex between copying the descriptor > list and calling select > > Not an option, we would need a threaded handler for grabbing the > mutex-based lock, which would defeat the purpose of using a directed > signal for forcing the recipient thread to stop execution until released. > Ok, I understand. >> b) repeating the select() call in the case of EBADF >> > > EBADF should be ignored. This just means that we won't find the notifier > block in the scanned list anyway, which is a possible and correct outcome. I do not agree. EBADF only signals that any of the fds is invalid, but not necessarily the one the current thread is interested in. In the scenario being produced in my test, descriptor "A" was being signaled and "B" was closed, being the cause for EBADF. If I had ignored the error, I would have missed a notification for "A". Repeating the select call with a fresh copy of notifier_rset seemed to correctly retrieved the right entry. > >> Any ideas? >> >> Anyway, why is the select call necessary, isnt the file descriptor signaled > via >> siginfo->si_fd, too? >> > > Yes it is. This select() loop is a left-over. So would this be a third variant, getting rid of select() and using si_fd? > >> Regards, >> Matthias > > > -- > Philippe. > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-22 10:49 ` Matthias Schneider @ 2014-04-22 13:02 ` Philippe Gerum 2014-04-22 18:00 ` Matthias Schneider 0 siblings, 1 reply; 8+ messages in thread From: Philippe Gerum @ 2014-04-22 13:02 UTC (permalink / raw) To: Matthias Schneider, xenomai@xenomai.org On 04/22/2014 12:49 PM, Matthias Schneider wrote: > ----- Original Message ----- >> From: Philippe Gerum <rpm@xenomai.org> >> To: Matthias Schneider <ma30002000@yahoo.de>; "xenomai@xenomai.org" <xenomai@xenomai.org> >> Cc: >> Sent: Tuesday, April 22, 2014 9:32 AM >> Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c >> >> On 04/21/2014 11:24 PM, Matthias Schneider wrote: >>> Still working on thread suspension in forge/mercury, I occasionally get a >> EBADF >>> of the select() call in notifier.c. I suspect that this is due to accessing >> a >>> copy of the file descriptor list notifier_rset while one of the file >> descriptors >>> is being closed. This seems to be due to concurrent access on the >> notifier_rset >>> from notifier_sighandler() and notifier_destroy(). >> "notifier_lock" is held in >>> notifier_lock(), but not when copying and invoking select in >> notifier_sighandler(). >>> The EBADF leads to a "spurious notification" reporting and >> process termination - >>> obviously, the thread suspension was not triggered. >>> >>> I can think of several ways of addressing this issue but I am not sure >> about >>> side effects: >>> a) hold the "notifier_lock" mutex between copying the descriptor >> list and calling select >> >> Not an option, we would need a threaded handler for grabbing the >> mutex-based lock, which would defeat the purpose of using a directed >> signal for forcing the recipient thread to stop execution until released. >> > > Ok, I understand. > >>> b) repeating the select() call in the case of EBADF >>> >> >> EBADF should be ignored. This just means that we won't find the notifier >> block in the scanned list anyway, which is a possible and correct outcome. > > > I do not agree. EBADF only signals that any of the fds is invalid, but not necessarily the one the current thread is interested in. In the scenario being produced in my test, descriptor "A" was being signaled and "B" was closed, being the cause for EBADF. If I had ignored the error, I would have missed a notification for "A". Repeating the select call with a fresh copy of notifier_rset seemed to correctly retrieved the right entry. This is what I implied: ignore the invalid fd causing EBADF, drop it from the poll mask and redo. We obviously don't want to miss other valid notifications. > >> >>> Any ideas? >>> >>> Anyway, why is the select call necessary, isnt the file descriptor signaled >> via >>> siginfo->si_fd, too? >>> >> >> Yes it is. This select() loop is a left-over. > > So would this be a third variant, getting rid of select() and using si_fd? > Yes, we don't need to maintain the notifier_rset and the select loop. Just check for psfd[0] == siginfo->si_fd in loop for a match. -- Philippe. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-22 13:02 ` Philippe Gerum @ 2014-04-22 18:00 ` Matthias Schneider 2014-04-23 13:45 ` Philippe Gerum 0 siblings, 1 reply; 8+ messages in thread From: Matthias Schneider @ 2014-04-22 18:00 UTC (permalink / raw) To: Philippe Gerum, xenomai@xenomai.org ----- Original Message ----- > From: Philippe Gerum <rpm@xenomai.org> > To: Matthias Schneider <ma30002000@yahoo.de>; "xenomai@xenomai.org" <xenomai@xenomai.org> > Cc: > Sent: Tuesday, April 22, 2014 3:02 PM > Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c > > On 04/22/2014 12:49 PM, Matthias Schneider wrote: >> ----- Original Message ----- >>> From: Philippe Gerum <rpm@xenomai.org> >>> To: Matthias Schneider <ma30002000@yahoo.de>; > "xenomai@xenomai.org" <xenomai@xenomai.org> >>> Cc: >>> Sent: Tuesday, April 22, 2014 9:32 AM >>> Subject: Re: [Xenomai] occasional EBADF in select() in notifier.c >>> >>> On 04/21/2014 11:24 PM, Matthias Schneider wrote: >>>> Still working on thread suspension in forge/mercury, I > occasionally get a >>> EBADF >>>> of the select() call in notifier.c. I suspect that this is due > to accessing >>> a >>>> copy of the file descriptor list notifier_rset while one of the > file >>> descriptors >>>> is being closed. This seems to be due to concurrent access on > the >>> notifier_rset >>>> from notifier_sighandler() and notifier_destroy(). >>> "notifier_lock" is held in >>>> notifier_lock(), but not when copying and invoking select in >>> notifier_sighandler(). >>>> The EBADF leads to a "spurious notification" reporting > and >>> process termination - >>>> obviously, the thread suspension was not triggered. >>>> >>>> I can think of several ways of addressing this issue but I am > not sure >>> about >>>> side effects: >>>> a) hold the "notifier_lock" mutex between copying the > descriptor >>> list and calling select >>> >>> Not an option, we would need a threaded handler for grabbing the >>> mutex-based lock, which would defeat the purpose of using a directed >>> signal for forcing the recipient thread to stop execution until > released. >>> >> >> Ok, I understand. >> >>>> b) repeating the select() call in the case of EBADF >>>> >>> >>> EBADF should be ignored. This just means that we won't find the > notifier >>> block in the scanned list anyway, which is a possible and correct > outcome. >> >> >> I do not agree. EBADF only signals that any of the fds is invalid, but not > necessarily the one the current thread is interested in. In the scenario being > produced in my test, descriptor "A" was being signaled and > "B" was closed, being the cause for EBADF. If I had ignored the error, > I would have missed a notification for "A". Repeating the select call > with a fresh copy of notifier_rset seemed to correctly retrieved the right > entry. > > This is what I implied: ignore the invalid fd causing EBADF, drop it > from the poll mask and redo. We obviously don't want to miss other valid > notifications. > Ok, sorry I got you wrong. >> >>> >>>> Any ideas? >>>> >>>> Anyway, why is the select call necessary, isnt the file > descriptor signaled >>> via >>>> siginfo->si_fd, too? >>>> >>> >>> Yes it is. This select() loop is a left-over. >> >> So would this be a third variant, getting rid of select() and using si_fd? >> > > Yes, we don't need to maintain the notifier_rset and the select loop. > Just check for psfd[0] == siginfo->si_fd in loop for a match. > > > -- > Philippe. > Considering the three options I think I prefer the last one and remove the select() altogether. I have attached a small patch that does exactly that. Matthias -------------- next part -------------- A non-text attachment was scrubbed... Name: notifier_without_select.patch Type: text/x-patch Size: 1812 bytes Desc: not available URL: <http://www.xenomai.org/pipermail/xenomai/attachments/20140422/7412edcd/attachment.bin> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] occasional EBADF in select() in notifier.c 2014-04-22 18:00 ` Matthias Schneider @ 2014-04-23 13:45 ` Philippe Gerum 0 siblings, 0 replies; 8+ messages in thread From: Philippe Gerum @ 2014-04-23 13:45 UTC (permalink / raw) To: Matthias Schneider, xenomai@xenomai.org On 04/22/2014 08:00 PM, Matthias Schneider wrote: > > Considering the three options I think I prefer the last one and remove the > select() altogether. I have attached a small patch that does exactly that. > Merged, thanks. -- Philippe. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-04-23 13:45 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-04-21 21:24 [Xenomai] occasional EBADF in select() in notifier.c Matthias Schneider 2014-04-21 21:32 ` Gilles Chanteperdrix 2014-04-21 21:53 ` Gilles Chanteperdrix 2014-04-22 7:32 ` Philippe Gerum 2014-04-22 10:49 ` Matthias Schneider 2014-04-22 13:02 ` Philippe Gerum 2014-04-22 18:00 ` Matthias Schneider 2014-04-23 13:45 ` Philippe Gerum
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.