From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <4B8BDCED.8030501@domain.hid> Date: Mon, 01 Mar 2010 16:27:41 +0100 From: Gilles Chanteperdrix MIME-Version: 1.0 References: <4B881E9D.1050101@domain.hid> <4B88203F.2020907@domain.hid> <4B8903E1.3020805@domain.hid> <4B8B7540.609@domain.hid> <4B8B84BC.90300@domain.hid> <4B8B971D.5080807@domain.hid> <4B8B98F5.10602@domain.hid> <4B8BA361.1040807@domain.hid> <4B8BC277.4010108@domain.hid> <4B8BC62F.6090403@domain.hid> <4B8BCC09.8030100@domain.hid> <4B8BCDB5.2010103@domain.hid> <4B8BCF48.70603@domain.hid> <4B8BD4D4.2060109@domain.hid> In-Reply-To: <4B8BD4D4.2060109@domain.hid> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : RTDM+POSIX: Avoid leaking binding objects on errors List-Id: Xenomai life and development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka Cc: Xenomai core Jan Kiszka wrote: > Gilles Chanteperdrix wrote: >> Jan Kiszka wrote: >>> Gilles Chanteperdrix wrote: >>>> Jan Kiszka wrote: >>>>> Gilles Chanteperdrix wrote: >>>>>> Jan Kiszka wrote: >>>>>>> Gilles Chanteperdrix wrote: >>>>>>>> Jan Kiszka wrote: >>>>>>>>> Gilles Chanteperdrix wrote: >>>>>>>>>> Jan Kiszka wrote: >>>>>>>>>>> Jan Kiszka wrote: >>>>>>>>>>>> Jan Kiszka wrote: >>>>>>>>>>>>> Gilles Chanteperdrix wrote: >>>>>>>>>>>>>> GIT version control wrote: >>>>>>>>>>>>>>> diff --git a/ksrc/skins/posix/mq.c b/ksrc/skins/posix/mq.c >>>>>>>>>>>>>>> index 11f47c0..a896031 100644 >>>>>>>>>>>>>>> --- a/ksrc/skins/posix/mq.c >>>>>>>>>>>>>>> +++ b/ksrc/skins/posix/mq.c >>>>>>>>>>>>>>> @@ -1283,6 +1283,7 @@ int pse51_mq_select_bind(mqd_t fd, struct xnselector *selector, >>>>>>>>>>>>>>> return 0; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> unlock_and_error: >>>>>>>>>>>>>>> + xnfree(binding): >>>>>>>>>>>>>>> xnlock_put_irqrestore(&nklock, s); >>>>>>>>>>>>>>> return err; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>> Ok. Will pull. But I really need to fix that. >>>>>>>>>>>>>> >>>>>>>>>>>>> Ack - now that I see it myself. >>>>>>>>>>>>> >>>>>>>>>>>> I fixed this in my branch and added another patch to transform EIDRM >>>>>>>>>>>> into EBADF when selecting a (half-)deleted RTDM device. Please merge. >>>>>>>>>>>> >>>>>>>>>>> Wait! When the sync object behind some file descriptor is deleted but >>>>>>>>>>> the descriptor itself is still existing, we rather have to return that >>>>>>>>>>> fd signaled from select() instead of letting the call fail. I beed to >>>>>>>>>>> look into this again. >>>>>>>>>> It looks to me like a transitory state, we can wait for the sync object >>>>>>>>>> to be deleted to have the fd destructor signaled. It should not be long. >>>>>>>>> That's not an issue of waiting for this. See e.g. TCP: peer closes >>>>>>>>> connection -> internal sync objects will be destroyed (to make >>>>>>>>> read/write fail). But the fd will remain valid until the local side >>>>>>>>> closes it as well. >>>>>>>> It looks to me like this is going to complicate things a lot, and will >>>>>>>> be a source of regressions. Why can we not have sync objects be >>>>>>>> destroyed when the fd is really destroyed and use a status bit of some >>>>>>>> kind to signal read/write that the fd was closed by peer? >>>>>>> It is way easier and more consistent to unblock reader and writers via >>>>>>> destroying the sync object than to signal it and add tests for specific >>>>>>> states to detect that. Keep in mind that this pattern is in use even >>>>>>> without select support. Diverging from it just to add select awareness >>>>>>> to some driver would be a step back. >>>>>> Ok. As you wish. But in that case, please provide a unit test case which >>>>>> we can run on all architectures to validate your modifications. We will >>>>>> not put this test case in xenomai repository but will create a >>>>>> repository for test cases later on. >>>>> As it requires quite some infrastructure to get there (or do I miss some >>>>> preexisting select unit test?), I can just point to RTnet + TCP for now: >>>>> run rtnet/examples/xenomai/posix/rttcp-server + client over rtlo and >>>>> terminate the client prematurely. This does not happen if you run the >>>>> very same test without RTnet loaded (ie. against Linux' TCP). >>>>> >>>>> Just pushed the corresponding fix. >>>> I really do not understand what you are trying to do. What is the >>>> problem exactly, and how do you fix it? You are reserving 384 more bytes >>>> on the stack. What for? >>>> >>> "We already return an fd as pending if the corresponding xnsynch object >>> is delete while waiting on it. Extend select to do the same if the >>> object is already marked deleted on binding. >>> >>> This fixes the return code select on half-closed RTnet TCP sockets from >>> -EIDRM to > 0 (for pending fds)." >>> >>> HTH. >>> >>> Regarding the additional stack requirements - yes, not nice. Maybe we >>> can avoid this by clearing the in_fds in select_bind_all while >>> processing it unless an fd is marked deleted. Would also avoid passing a >>> separate fds to it. >> To be more precise, it looks to me like each call to bind_one is >> supposed to set the bit of the corresponding fd, so, if bind_one fails >> because the fd is in the transitory state (the one which I do not like), >> bind_all should simply set this bit to 1. And there does not seem to be >> any need for additional parameters, additional space on stack, or >> anything else. > > Right, the trick is likely to properly maintain the output fds of > bind_all in that not only pending fds are set, but others are cleared - > avoids the third bitmap. Still playing with such an approach. xnselect_bind, called by bind_one should take care of setting/clearing individual bits, at least, it is the way it currently works. -- Gilles.