* [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3
From: Marc Strämke @ 2015-06-03 12:56 UTC
To: xenomai
(19+ messages in thread)

Hello everyone,

I am experiencing an issue with using packet sockets in Xenomai 3. I am at
the moment trying to replicate it on a minimal test case but struggling.
What I've found so far is that when a packet socket is read non-blocking with
MSG_DONTWAIT the socket sometimes cannot be closed afterwards.
I inserted an rt_printk into rt_packet_socket(fd,proto) and
rt_packet_close(fd) in af_packet.c; sometimes rt_packet_close never gets
called, so any further communication on this interface is effectively
impossible.

Any hints on where I should start looking?

I assume the socket should be closed by Xenomai in any case, even if the
application abnormally terminates without calling close? If that is not the
case, how do I make sure to clean up after the application in case it
terminates abnormally?

Best Regards,
Marc

--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH

* Re: [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3
From: Gilles Chanteperdrix @ 2015-06-03 13:02 UTC
To: Marc Strämke; +Cc: xenomai

On Wed, Jun 03, 2015 at 02:56:07PM +0200, Marc Strämke wrote:
> Hello everyone,
>
> I am experiencing an issue with using packet sockets in Xenomai 3. I am at
> the moment trying to replicate it on a minimal test case but struggling.
> What I've found so far is that when a packet socket is read non-blocking
> with MSG_DONTWAIT the socket sometimes cannot be closed afterwards.
> I inserted an rt_printk into rt_packet_socket(fd,proto) and
> rt_packet_close(fd) in af_packet.c; sometimes rt_packet_close never gets
> called, so any further communication on this interface is effectively
> impossible.
>
> Any hints on where I should start looking?

Look at the reference counts.

> I assume the socket should be closed by Xenomai in any case, even if the
> application abnormally terminates without calling close? If that is not the
> case, how do I make sure to clean up after the application in case it
> terminates abnormally?

Yes, Xenomai cleans up file descriptors automatically upon process
termination.

--
Gilles.

* Re: [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3
From: Marc Strämke @ 2015-06-03 17:09 UTC
To: xenomai

Thank you for your swift response, Gilles.

On Wednesday, 3 June 2015, 15:02:17, Gilles Chanteperdrix wrote:
> On Wed, Jun 03, 2015 at 02:56:07PM +0200, Marc Strämke wrote:
> > Hello everyone,
> >
> > I am experiencing an issue with using packet sockets in Xenomai 3. I am at
> > the moment trying to replicate it on a minimal test case but struggling.
> > What I've found so far is that when a packet socket is read non-blocking
> > with MSG_DONTWAIT the socket sometimes cannot be closed afterwards.
> > I inserted an rt_printk into rt_packet_socket(fd,proto) and
> > rt_packet_close(fd) in af_packet.c; sometimes rt_packet_close never gets
> > called, so any further communication on this interface is effectively
> > impossible.
> >
> > Any hints on where I should start looking?
>
> Look at the reference counts.

That's what I was assuming too. It is indeed the reference count that stays
at 1 after closing the socket:

/sys/class/rtdm/raw_packet/refcount -> 1

But I am at a total loss as to where this reference count is actually
incremented and decremented.

One thing I have now noticed is that I can trigger the problem by accessing
the sysfs refcount entry of the raw_packet rtdm driver while I have a
raw_packet socket open. So the same test program (open socket, send a packet,
wait 2 seconds, receive a packet) works fine without the sysfs access, and
triggers the dangling reference problem if the sysfs entry is accessed during
the 2-second sleep.

How is the reference count maintained for the rtdm devices? Is that the
refcount in the rtdm_fd structure (fd->refs)? There is also a driver
refcount; I am actually somewhat confused about how this is maintained...

Regards,
Marc

--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
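
The call sequence of the test program described above (open, send, receive with MSG_DONTWAIT, close) can be sketched in plain C. This is only a stand-in under stated assumptions: it uses an unprivileged AF_UNIX datagram socketpair on a POSIX system instead of an RTnet AF_PACKET socket, it omits the 2-second sleep, and the function name `nonblocking_roundtrip` is hypothetical. It shows the non-blocking receive semantics only and does not reproduce the RTnet problem.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Stand-in for the reported test sequence, using an AF_UNIX socketpair
 * (assumption: NOT an RTnet packet socket). Returns 0 if every step
 * behaved as expected, -1 otherwise.
 */
static int nonblocking_roundtrip(void)
{
	int sv[2];
	char buf[16];
	const char msg[] = "ping";
	ssize_t n;
	int ok = 0;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	/* Send one datagram. */
	if (send(sv[0], msg, sizeof(msg), 0) != (ssize_t)sizeof(msg))
		goto out;

	/* The original test program slept for 2 seconds at this point. */

	/* Non-blocking read: the queued datagram is returned at once. */
	n = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);
	if (n != (ssize_t)sizeof(msg) || memcmp(buf, msg, sizeof(msg)) != 0)
		goto out;

	/* Queue is empty now: a non-blocking read fails immediately. */
	n = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);
	if (n != -1 || (errno != EAGAIN && errno != EWOULDBLOCK))
		goto out;

	ok = 1;
out:
	/* Closing must succeed on both ends. */
	if (close(sv[0]) < 0)
		ok = 0;
	if (close(sv[1]) < 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

On a Xenomai system the same sequence would go through the RTDM socket calls, which is where the refcount reported under /sys/class/rtdm/raw_packet/refcount comes into play.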

* [Xenomai] rtnet locking/socket SKBs
From: Marc Strämke @ 2015-06-04 10:41 UTC
To: xenomai

On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> How is the reference count maintained for the rtdm devices? Is that the
> refcount in the rtdm_fd structure (fd->refs)? There is also a driver
> refcount; I am actually somewhat confused about how this is maintained...

So I mostly got to the bottom of the issue, but I need some input from
someone more knowledgeable about rtnet's design:
The reference count of an open AF_PACKET socket is not dropping to zero
because there are still skbs in the socket's skb pool and
rtskb_socket_pool_trylock increments the fd's reference count.
rt_socket_cleanup would release the skb pool but never gets called.

What I do not really understand at this moment is why the fd reference count
gets incremented at all when the socket gets an skb in its pool. Is there any
reason not to close a socket while it still has associated skbs in the pool?
They will never get cleared if the application crashes, if I am not mistaken.

Best Regards,
Marc

--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
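
The cycle described in this message can be illustrated with a toy user-space model. It is a sketch under assumptions, not RTDM or RTnet code: all `toy_*` names are hypothetical, and the only behaviour taken from the thread is that the close handler runs when the fd reference count reaches zero, while each pool lock holds one extra fd reference that only the close handler would release.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reference-counted descriptor (hypothetical names). */
struct toy_fd {
	int refs;          /* stands in for fd->refs */
	bool close_called; /* true once the driver close handler has run */
};

static void toy_fd_open(struct toy_fd *fd)
{
	fd->refs = 1; /* reference held by the application */
	fd->close_called = false;
}

static void toy_fd_put(struct toy_fd *fd)
{
	assert(fd->refs > 0);
	/* The close handler only runs when the last reference goes away. */
	if (--fd->refs == 0)
		fd->close_called = true;
}

/* Models the pool trylock taking a reference on the descriptor. */
static void toy_pool_trylock(struct toy_fd *fd)
{
	fd->refs++;
}

/* Models the application calling close(): drops its own reference. */
static void toy_user_close(struct toy_fd *fd)
{
	toy_fd_put(fd);
}

/* Returns true if the close handler ran after the application closed. */
static bool toy_scenario(int pool_locks_held)
{
	struct toy_fd fd;
	int i;

	toy_fd_open(&fd);
	for (i = 0; i < pool_locks_held; i++)
		toy_pool_trylock(&fd);
	toy_user_close(&fd);
	return fd.close_called;
}
```

With no pool lock held the handler runs; with one lock held it never does, because the cleanup that would drop that lock sits inside the handler itself.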

* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-06-04 10:56 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 12:41:47PM +0200, Marc Strämke wrote:
> On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> > How is the reference count maintained for the rtdm devices? Is that the
> > refcount in the rtdm_fd structure (fd->refs)? There is also a driver
> > refcount; I am actually somewhat confused about how this is maintained...
>
> So I mostly got to the bottom of the issue, but I need some input from
> someone more knowledgeable about rtnet's design:
> The reference count of an open AF_PACKET socket is not dropping to zero
> because there are still skbs in the socket's skb pool and
> rtskb_socket_pool_trylock increments the fd's reference count.
> rt_socket_cleanup would release the skb pool but never gets called.
>
> What I do not really understand at this moment is why the fd reference
> count gets incremented at all when the socket gets an skb in its pool. Is
> there any reason not to close a socket while it still has associated skbs
> in the pool?

The previous rtnet design was converted to reference counts globally
when integrating into Xenomai 3; some details may still need fixing.
The old design was different but led to the close syscall looping
forever and blocking your application if anything went wrong, so
this is an attempt to avoid this issue.

> They will never get cleared if the application crashes, if I am not
> mistaken.

Once again: when an application crashes, the file descriptors are
automatically closed by Xenomai; you do not need to care about this
case.

--
Gilles.

* Re: [Xenomai] rtnet locking/socket SKBs
From: Marc Strämke @ 2015-06-04 11:09 UTC
To: xenomai

Gilles,

On Thursday, 4 June 2015, 12:56:16, Gilles Chanteperdrix wrote:
> The previous rtnet design was converted to reference counts globally
> when integrating into Xenomai 3; some details may still need fixing.
> The old design was different but led to the close syscall looping
> forever and blocking your application if anything went wrong, so
> this is an attempt to avoid this issue.

Yes, I do understand this. I am just trying to figure out what the right fix
is. If I just disable incrementing the fd reference count (see attached
patch), AF_PACKET works as it should. From my current understanding of the
code, the reference counting for the skb pools attached to a socket is
redundant, but I've only closely inspected AF_PACKET, not the other sockets
in the IP stack.

> > They will never get cleared if the application crashes, if I am not
> > mistaken.
>
> Once again: when an application crashes, the file descriptors are
> automatically closed by Xenomai; you do not need to care about this
> case.

What I am seeing is that when the reference count is above 1 when the
application closes or crashes, the close operation of the rtdm socket never
gets called, not even when doing an explicit close system call on a socket
which still has a reference count above 1.

When I disable the skb pool reference counting for AF_PACKET, it basically
works as it should. When the application closes the socket or crashes, the
close op gets called and the skbs are released.

Marc

--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH

[Attachment: 0001-Proof-of-concept-disable-locking-in-af_packet-skb-po.patch,
text/x-patch, 1318 bytes,
<http://xenomai.org/pipermail/xenomai/attachments/20150604/b098fe8a/attachment.bin>]

* [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool
From: Marc Strämke @ 2015-06-04 11:12 UTC
To: xenomai

---
 kernel/drivers/net/stack/packet/af_packet.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/drivers/net/stack/packet/af_packet.c b/kernel/drivers/net/stack/packet/af_packet.c
index 4c7ff57..9f5e417 100644
--- a/kernel/drivers/net/stack/packet/af_packet.c
+++ b/kernel/drivers/net/stack/packet/af_packet.c
@@ -190,6 +190,19 @@ static int rt_packet_getsockname(struct rtsocket *sock, struct sockaddr *addr,
 }
 
 
+static int rtskb_nop_trylock(void *cookie)
+{
+	return 1;
+}
+
+static void rtskb_nop_unlock(void *cookie)
+{
+}
+
+static const struct rtskb_pool_lock_ops rtskb_nop_lock_ops = {
+	.trylock = rtskb_nop_trylock,
+	.unlock = rtskb_nop_unlock,
+};
 
 /***
  *  rt_packet_socket - initialize a packet socket
@@ -202,6 +215,8 @@ static int rt_packet_socket(struct rtdm_fd *fd, int protocol)
 	if ((ret = rt_socket_init(fd, protocol)) != 0)
 		return ret;
 
+	sock->skb_pool.lock_ops = &rtskb_module_lock_ops;
+
 	sock->prot.packet.packet_type.type = protocol;
 	sock->prot.packet.ifindex = 0;
 
--
2.2.0

* Re: [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool
From: Gilles Chanteperdrix @ 2015-06-04 14:15 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 01:12:27PM +0200, Marc Strämke wrote:
> ---
>  kernel/drivers/net/stack/packet/af_packet.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)

NACK. For reasons explained in other mails.

--
Gilles.

* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-06-04 11:20 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> Gilles,
>
> On Thursday, 4 June 2015, 12:56:16, Gilles Chanteperdrix wrote:
> > The previous rtnet design was converted to reference counts globally
> > when integrating into Xenomai 3; some details may still need fixing.
> > The old design was different but led to the close syscall looping
> > forever and blocking your application if anything went wrong, so
> > this is an attempt to avoid this issue.
> Yes, I do understand this. I am just trying to figure out what the right
> fix is. If I just disable incrementing the fd reference count (see attached
> patch), AF_PACKET works as it should. From my current understanding of the
> code, the reference counting for the skb pools attached to a socket is
> redundant, but I've only closely inspected AF_PACKET, not the other sockets
> in the IP stack.

I will have a look at that when I resume working on rtnet. This
should come in a few weeks now. Your fix is definitely not the right
fix though, as the socket pool should be created with the right lock
operations pointer right away; the pointer should not be changed
after the fact. The risk here is leaking some skbs. Maybe your
application works, but maybe you leak some skbs every time you close
the file descriptor. The skbs are only allocated and freed during
socket or interface creation; the rest of the time, they move from
pool to pool.

> > > They will never get cleared if the application crashes, if I am not
> > > mistaken.
> >
> > Once again: when an application crashes, the file descriptors are
> > automatically closed by Xenomai; you do not need to care about this
> > case.
> What I am seeing is that when the reference count is above 1 when the
> application closes or crashes, the close operation of the rtdm socket never
> gets called, not even when doing an explicit close system call on a socket
> which still has a reference count above 1.
>
> When I disable the skb pool reference counting for AF_PACKET, it basically
> works as it should. When the application closes the socket or crashes, the
> close op gets called and the skbs are released.

So, the problem fixes itself when fixing the first problem. So,
again, this is not the problem you should care about.

--
Gilles.

* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-06-04 11:31 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> Gilles,
>
> On Thursday, 4 June 2015, 12:56:16, Gilles Chanteperdrix wrote:
> > The previous rtnet design was converted to reference counts globally
> > when integrating into Xenomai 3; some details may still need fixing.
> > The old design was different but led to the close syscall looping
> > forever and blocking your application if anything went wrong, so
> > this is an attempt to avoid this issue.
> Yes, I do understand this. I am just trying to figure out what the right
> fix is. If I just disable incrementing the fd reference count (see attached
> patch), AF_PACKET works as it should. From my current understanding of the
> code, the reference counting for the skb pools attached to a socket is
> redundant, but I've only closely inspected AF_PACKET, not the other sockets
> in the IP stack.

The locking of the pools is not redundant. The reason for this lock
is that when an skb is in between two pools, we do not want the pool
it comes from to be destroyed, as the skb would end up leaking. So,
since skbs move from pool to pool, when the skb comes to a new pool,
the old pool should be unlocked. This should work as rtnet spends
its time exchanging skbs from pool to pool and does not allocate or
free them. I tried to implement something along these lines.
Apparently, I messed up, but the fix is not to disable the locking.
Disabling locking will lead to leaking skbs.

--
Gilles.
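
The intent described above (a pool stays locked while one of its skbs is in transit, and is unlocked once that skb lands in another pool) can be sketched as a toy user-space model. The trylock/unlock pair with a `void *cookie` argument mirrors the shape of the lock ops in the posted patch; everything else, including all `toy_*` names and the counter used as cookie, is a hypothetical illustration and not the RTnet implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Lock operations attached to a pool (shape borrowed from the patch). */
struct toy_lock_ops {
	int  (*trylock)(void *cookie);
	void (*unlock)(void *cookie);
};

struct toy_pool {
	int free_skbs;                       /* buffers currently in the pool */
	const struct toy_lock_ops *lock_ops;
	void *lock_cookie;
};

struct toy_skb {
	struct toy_pool *origin; /* pool still locked on behalf of this skb */
};

/* Example lock ops: the cookie is a plain counter of held locks. */
static int toy_count_trylock(void *cookie)
{
	++*(int *)cookie;
	return 1;
}

static void toy_count_unlock(void *cookie)
{
	--*(int *)cookie;
}

static const struct toy_lock_ops toy_count_ops = {
	.trylock = toy_count_trylock,
	.unlock = toy_count_unlock,
};

/* Take a buffer out of a pool: the pool gets locked while it is away. */
static int toy_pool_get(struct toy_pool *pool, struct toy_skb *skb)
{
	if (pool->free_skbs == 0)
		return -1;
	if (!pool->lock_ops->trylock(pool->lock_cookie))
		return -1;
	pool->free_skbs--;
	skb->origin = pool;
	return 0;
}

/* Hand the buffer to a (possibly different) pool: unlock the origin. */
static void toy_pool_put(struct toy_pool *pool, struct toy_skb *skb)
{
	struct toy_pool *origin = skb->origin;

	pool->free_skbs++;
	skb->origin = NULL;
	if (origin != NULL)
		origin->lock_ops->unlock(origin->lock_cookie);
}
```

In this model the lock count of a pool is non-zero exactly while one of its buffers is outside any pool, which is the window in which destroying the origin pool would leak the buffer.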

* Re: [Xenomai] rtnet locking/socket SKBs
From: Marc Strämke @ 2015-06-04 12:35 UTC
To: xenomai

On Thursday, 4 June 2015, 13:31:00, Gilles Chanteperdrix wrote:
> On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> > Yes, I do understand this. I am just trying to figure out what the right
> > fix is. If I just disable incrementing the fd reference count (see
> > attached patch), AF_PACKET works as it should. From my current
> > understanding of the code, the reference counting for the skb pools
> > attached to a socket is redundant, but I've only closely inspected
> > AF_PACKET, not the other sockets in the IP stack.
> The locking of the pools is not redundant. The reason for this lock
> is that when an skb is in between two pools, we do not want the pool
> it comes from to be destroyed, as the skb would end up leaking. So,

Yes, that is obvious. But what these lock ops do is not locking for the
transient moment when the packet is moving between pools, but for the time
the packet is in a pool. These lock ops ensure that no whole skb pool is
leaked.

If I look at rtskb_module_lock_ops this makes sense to me: one cannot unload
the module until all pools are empty. It left me wondering, though, whether
try_module_get is safe to call from a realtime context.

> since skbs move from pool to pool, when the skb comes to a new pool,
> the old pool should be unlocked. This should work as rtnet spends
> its time exchanging skbs from pool to pool and does not allocate or
> free them. I tried to implement something along these lines.
> Apparently, I messed up, but the fix is not to disable the locking.
> Disabling locking will lead to leaking skbs.

I do understand this also. It is redundant in the case of AF_PACKET, not in
the general case. In the case of rtskb_socket_pool_ops the above-mentioned
mechanism does not work out. The pool is emptied on close if there are still
unread packets in the pool. So the reference counting on rtdm_fd already
makes sure that the socket close operation is called. Both mechanisms
together lead to a deadlock.
If every socket type implements a correct close operation which frees the
skb pool used by the socket on close, there is no risk of leaking skbs. I
also tested that I am not leaking skbs when stressing the AF_PACKET socket.

But you are certainly right, the patch I submitted is kind of stupid. It was
just what I tested for AF_PACKET. If my assumptions are correct, the lock ops
for all sockets should be disabled, because the reference counting of the fd
takes care of freeing the pool. This would then be done at creation time in
rt_bare_socket_init, not later on.

Marc

--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH

* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-06-04 12:37 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 02:35:18PM +0200, Marc Strämke wrote:
> On Thursday, 4 June 2015, 13:31:00, Gilles Chanteperdrix wrote:
> > On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> > > Yes, I do understand this. I am just trying to figure out what the
> > > right fix is. If I just disable incrementing the fd reference count
> > > (see attached patch), AF_PACKET works as it should. From my current
> > > understanding of the code, the reference counting for the skb pools
> > > attached to a socket is redundant, but I've only closely inspected
> > > AF_PACKET, not the other sockets in the IP stack.
> > The locking of the pools is not redundant. The reason for this lock
> > is that when an skb is in between two pools, we do not want the pool
> > it comes from to be destroyed, as the skb would end up leaking. So,
> Yes, that is obvious. But what these lock ops do is not locking for the
> transient moment when the packet is moving between pools, but for the time
> the packet is in a pool. These lock ops ensure that no whole skb pool is
> leaked.
>
> If I look at rtskb_module_lock_ops this makes sense to me: one cannot
> unload the module until all pools are empty. It left me wondering, though,
> whether try_module_get is safe to call from a realtime context.
>
> > since skbs move from pool to pool, when the skb comes to a new pool,
> > the old pool should be unlocked. This should work as rtnet spends
> > its time exchanging skbs from pool to pool and does not allocate or
> > free them. I tried to implement something along these lines.
> > Apparently, I messed up, but the fix is not to disable the locking.
> > Disabling locking will lead to leaking skbs.
> I do understand this also. It is redundant in the case of AF_PACKET, not in
> the general case. In the case of rtskb_socket_pool_ops the above-mentioned
> mechanism does not work out. The pool is emptied on close if there are
> still unread packets in the pool. So the reference counting on rtdm_fd
> already makes sure that the socket close operation is called. Both
> mechanisms together lead to a deadlock.
> If every socket type implements a correct close operation which frees the
> skb pool used by the socket on close, there is no risk of leaking skbs. I
> also tested that I am not leaking skbs when stressing the AF_PACKET socket.
>
> But you are certainly right, the patch I submitted is kind of stupid. It
> was just what I tested for AF_PACKET. If my assumptions are correct, the
> lock ops for all sockets should be disabled, because the reference counting
> of the fd takes care of freeing the pool. This would then be done at
> creation time in rt_bare_socket_init, not later on.

Your assumptions are not correct. The lock is needed; it just has to
be made to work as intended, which it currently does not. And
removing it is not the solution.

--
Gilles.

* Re: [Xenomai] rtnet locking/socket SKBs
From: Marc Strämke @ 2015-06-04 12:40 UTC
To: xenomai

> Your assumptions are not correct. The lock is needed; it just has to
> be made to work as intended, which it currently does not. And
> removing it is not the solution.

I was just trying to help. It would be nice to know, though, in what way my
assumptions are wrong.

Marc

--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH

* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-06-04 14:07 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 02:35:18PM +0200, Marc Strämke wrote:
> On Thursday, 4 June 2015, 13:31:00, Gilles Chanteperdrix wrote:
> > On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> > > Yes, I do understand this. I am just trying to figure out what the
> > > right fix is. If I just disable incrementing the fd reference count
> > > (see attached patch), AF_PACKET works as it should. From my current
> > > understanding of the code, the reference counting for the skb pools
> > > attached to a socket is redundant, but I've only closely inspected
> > > AF_PACKET, not the other sockets in the IP stack.
> > The locking of the pools is not redundant. The reason for this lock
> > is that when an skb is in between two pools, we do not want the pool
> > it comes from to be destroyed, as the skb would end up leaking. So,
> Yes, that is obvious. But what these lock ops do is not locking for the
> transient moment when the packet is moving between pools, but for the time
> the packet is in a pool. These lock ops ensure that no whole skb pool is
> leaked.
>
> If I look at rtskb_module_lock_ops this makes sense to me: one cannot
> unload the module until all pools are empty. It left me wondering, though,
> whether try_module_get is safe to call from a realtime context.

try_module_get is safe to call from a realtime context, starting
with the I-pipe patch for Linux 3.14.

--
Gilles.

* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-09-27 21:32 UTC
To: Marc Strämke; +Cc: xenomai

On Thu, Jun 04, 2015 at 12:41:47PM +0200, Marc Strämke wrote:
> On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> > How is the reference count maintained for the rtdm devices? Is that the
> > refcount in the rtdm_fd structure (fd->refs)? There is also a driver
> > refcount; I am actually somewhat confused about how this is maintained...
>
> So I mostly got to the bottom of the issue, but I need some input from
> someone more knowledgeable about rtnet's design:
> The reference count of an open AF_PACKET socket is not dropping to zero
> because there are still skbs in the socket's skb pool and
> rtskb_socket_pool_trylock increments the fd's reference count.
> rt_socket_cleanup would release the skb pool but never gets called.
>
> What I do not really understand at this moment is why the fd reference
> count gets incremented at all when the socket gets an skb in its pool. Is
> there any reason not to close a socket while it still has associated skbs
> in the pool? They will never get cleared if the application crashes, if I
> am not mistaken.

Ok, so now I had a look at the issue. The counter is probably not
dropping to zero because you have unqueued messages in the socket
"incoming" queue; the pool is locked when a packet is out of the
pool, not when a packet is in the pool.

Anyway, you are right, this is redundant, and not only for
af_packet but also for udp and tcp: when the packet is outside any
pool or queue, the file descriptor is locked, so the module cannot
be removed and a leak cannot occur. So there is no reason to keep
track of the fact that it is outside any queue, and we can probably
remove the locking in the socket pools.

What may be missing is that creating sockets should lock the
corresponding kernel module.

--
Gilles.
https://click-hack.org
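
The closing remark, that creating a socket should pin the kernel module providing it, can be illustrated with a small user-space model. This is a sketch under assumptions, not kernel code: the `toy_*` names are hypothetical, and the counter only imitates the idea behind try_module_get and module_put (a module with users cannot be unloaded, and an unloaded module cannot gain new users).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel module with a use count. */
struct toy_module {
	int users;     /* sockets currently pinning the module */
	bool unloaded; /* true once unloading has succeeded */
};

/* Imitates try_module_get(): fails once the module is gone. */
static bool toy_module_tryget(struct toy_module *m)
{
	if (m->unloaded)
		return false;
	m->users++;
	return true;
}

/* Imitates module_put(): drops one use. */
static void toy_module_put(struct toy_module *m)
{
	assert(m->users > 0);
	m->users--;
}

/* Socket creation pins the module; returns 0 on success, -1 on failure. */
static int toy_socket_create(struct toy_module *m)
{
	return toy_module_tryget(m) ? 0 : -1;
}

/* Socket close releases the pin taken at creation time. */
static void toy_socket_close(struct toy_module *m)
{
	toy_module_put(m);
}

/* Unloading is refused while any socket still pins the module. */
static bool toy_module_unload(struct toy_module *m)
{
	if (m->users > 0)
		return false;
	m->unloaded = true;
	return true;
}
```

Under this model the per-pool lock is no longer needed to keep the module alive: an open socket is enough, and the pin goes away when the close handler runs.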
* Re: [Xenomai] rtnet locking/socket SKBs
  2015-09-27 21:32 ` Gilles Chanteperdrix
@ 2015-09-29 21:59 ` Gilles Chanteperdrix
  2015-09-30 14:33 ` Marc Strämke
  0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-09-29 21:59 UTC (permalink / raw)
  To: Marc Strämke; +Cc: xenomai

On Sun, Sep 27, 2015 at 11:32:36PM +0200, Gilles Chanteperdrix wrote:
> On Thu, Jun 04, 2015 at 12:41:47PM +0200, Marc Strämke wrote:
> > On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> > > How is the reference count maintained for the rtdm devices, is that the
> > > refcount in the rtdm_fd structure (fd->refs). There is also a driver
> > > refcount, I am actually somewhat confused how this is maintained...
> >
> > So I got mostly down to the issue but I need some input from someone more
> > knowledged in rtnets design:
> > The reference count of an open AF_PACKET socket is not dropping to zero
> > because there are still skb in the sockets skb pool and
> > rtskb_socket_pool_trylock increments the fds reference
> > count. rt_socket_cleanup would release the skb pool but never gets called.
> >
> > What I do not really understand at this moment is why the fd reference count
> > gets incremented at all when the socket gets a skb in its pool?
> > Is there any reason to not close a socket while it still has associated skbs
> > in the pool? They will never get cleared if the application crashes If I am
> > not mistaken.
>
> Ok, so now I had a look at the issue. The counter is not dropping to
> zero probably because you have unqueued messages in the socket
> "incoming" queue, the pool is locked when a packet is out of the
> pool, not when a packet is in the pool.
>
> Anyway, you are right, this is redundant, but not only for
> af_packet, also for udp and tcp: when the packet is outside any pool
> or queue, the file descriptor is locked, so, the module can not be
> removed and leak can not occur, so there is no reason to keep track
> of the fact that it is outside any queue, and we can probably remove
> the locking in the socket pools.
>
> What may be missing is that creating sockets should lock the
> corresponding kernel module.

The issue should now be fixed in the xenomai-gch git, branch for-forge.

-- 
Gilles.
https://click-hack.org
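The reference-count behaviour discussed in this message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not RTnet code: every `toy_*` name is hypothetical, `refs` merely stands in for the `fd->refs` counter mentioned earlier in the thread, and `lock_on_alloc` is an invented switch that models whether taking a buffer out of the socket pool also takes a reference on the file descriptor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RTDM file descriptor. */
struct toy_fd {
    int refs;     /* plays the role of fd->refs */
    bool closed;  /* set once the close handler would have run */
};

/* Hypothetical stand-in for a per-socket buffer pool. */
struct toy_pool {
    struct toy_fd *owner;
    int free_skbs;       /* buffers currently inside the pool */
    bool lock_on_alloc;  /* true: a buffer leaving the pool pins the fd */
};

static void toy_fd_get(struct toy_fd *fd)
{
    fd->refs++;
}

static void toy_fd_put(struct toy_fd *fd)
{
    assert(fd->refs > 0);
    if (--fd->refs == 0)
        fd->closed = true;  /* close only runs when the last reference goes */
}

/* Take one buffer out of the pool (e.g. for a received packet). */
static bool toy_pool_alloc(struct toy_pool *pool)
{
    if (pool->free_skbs == 0)
        return false;
    pool->free_skbs--;
    if (pool->lock_on_alloc)
        toy_fd_get(pool->owner);
    return true;
}

/* Return one buffer to the pool (e.g. after the packet was read). */
static void toy_pool_free(struct toy_pool *pool)
{
    pool->free_skbs++;
    if (pool->lock_on_alloc)
        toy_fd_put(pool->owner);
}

/*
 * Receive `unread` packets, optionally read them all back, then drop the
 * application's own reference. Returns true if the close handler ran.
 */
bool toy_closes(bool lock_on_alloc, int unread, bool drain_first)
{
    struct toy_fd fd = { .refs = 1, .closed = false };
    struct toy_pool pool = {
        .owner = &fd, .free_skbs = 16, .lock_on_alloc = lock_on_alloc,
    };
    int taken = 0;

    for (int i = 0; i < unread; i++)
        if (toy_pool_alloc(&pool))
            taken++;

    if (drain_first)
        while (taken-- > 0)
            toy_pool_free(&pool);

    toy_fd_put(&fd);  /* application calls close() or exits */
    return fd.closed;
}
```

In this model, `toy_closes(true, 2, false)` never reaches the close handler because two buffers are still outside the pool, which mirrors the symptom reported with unread packets in the incoming queue. With `lock_on_alloc` set to false, the same sequence closes normally.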
* Re: [Xenomai] rtnet locking/socket SKBs
  2015-09-29 21:59 ` Gilles Chanteperdrix
@ 2015-09-30 14:33 ` Marc Strämke
  2015-09-30 14:41 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Strämke @ 2015-09-30 14:33 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Hello Gilles,

On Tuesday, 29 September 2015, 23:59:50, Gilles Chanteperdrix wrote:
> > What may be missing is that creating sockets should lock the
> > corresponding kernel module.
>
> The issue should now be fixed in the xenomai-gch git, branch for-forge.

Thank you for your work.

I will try the branch in the next few days. I have been running for a few months
now with disabled locking (as in my old patch) in AF_PACKET without any issues
so it should be fine.

Best Regards,
Marc

-- 
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
Arnold-Sommerfeld-Ring 3
52499 Baesweiler
Germany
Tel: +49 2401 - 80970
Fax: +49 2401 - 809715
Geschäftsführer: Dr.-Ing. S. Strämke, Dipl.-Ing. Marc Strämke
USt.-IdNr. 291 812 490 / Steuer-Nr. 202/5739/1186
Amtsgericht Aachen HRB 18539
www.eltropuls.de
* Re: [Xenomai] rtnet locking/socket SKBs
  2015-09-30 14:33 ` Marc Strämke
@ 2015-09-30 14:41 ` Gilles Chanteperdrix
  2015-09-30 14:44 ` Marc Strämke
  0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-09-30 14:41 UTC (permalink / raw)
  To: Marc Strämke; +Cc: xenomai

On Wed, Sep 30, 2015 at 04:33:29PM +0200, Marc Strämke wrote:
> Hello Gilles,
> On Tuesday, 29 September 2015, 23:59:50, Gilles Chanteperdrix wrote:
> > > What may be missing is that creating sockets should lock the
> > > corresponding kernel module.
> >
> > The issue should now be fixed in the xenomai-gch git, branch for-forge.
>
> Thank you for your work.
>
> I will try the branch in the next few days. I have been running for a few months
> now with disabled locking (as in my old patch) in AF_PACKET without any issues
> so it should be fine.

Well, the issue is not specific to AF_PACKET. And if you do not get
socket creations to lock the rtpacket module, you can trigger leaks
and faults by removing the rtpacket module while an application has
an AF_PACKET socket open.

-- 
Gilles.
https://click-hack.org
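The module-removal hazard described in this message can be sketched the same way. This is a userspace toy under stated assumptions, not the actual fix in the for-forge branch: all `toy_*` names and the `pin_module` switch are hypothetical. In kernel code, pinning a module is commonly done with `try_module_get()` and `module_put()`; whether the fix uses exactly those calls is not stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a protocol module such as rtpacket. */
struct toy_module {
    int users;    /* sockets currently pinning the module */
    bool loaded;
};

/* Create a socket; with pin_module set, creation takes a module reference. */
static bool toy_socket_create(struct toy_module *mod, bool pin_module)
{
    if (!mod->loaded)
        return false;
    if (pin_module)
        mod->users++;
    return true;
}

/* Close a socket; releases the module reference taken at creation. */
static void toy_socket_close(struct toy_module *mod, bool pin_module)
{
    if (pin_module) {
        assert(mod->users > 0);
        mod->users--;
    }
}

/* Removing the module is refused while any socket still pins it. */
static bool toy_module_unload(struct toy_module *mod)
{
    if (mod->users > 0)
        return false;
    mod->loaded = false;
    return true;
}

/* Returns true if the module could be removed while a socket was open. */
bool toy_unload_with_open_socket(bool pin_module)
{
    struct toy_module mod = { .users = 0, .loaded = true };

    toy_socket_create(&mod, pin_module);
    return toy_module_unload(&mod);
}

/* Returns true if the module could be removed after the socket was closed. */
bool toy_unload_after_close(bool pin_module)
{
    struct toy_module mod = { .users = 0, .loaded = true };

    toy_socket_create(&mod, pin_module);
    toy_socket_close(&mod, pin_module);
    return toy_module_unload(&mod);
}
```

Without pinning, `toy_unload_with_open_socket(false)` succeeds, which corresponds to the dangerous case of removing the module under an open socket. With pinning, removal is refused until the socket has been closed.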
* Re: [Xenomai] rtnet locking/socket SKBs
  2015-09-30 14:41 ` Gilles Chanteperdrix
@ 2015-09-30 14:44 ` Marc Strämke
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Strämke @ 2015-09-30 14:44 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

On Wednesday, 30 September 2015, 16:41:52, Gilles Chanteperdrix wrote:
> On Wed, Sep 30, 2015 at 04:33:29PM +0200, Marc Strämke wrote:
> > Hello Gilles,
> >
> > On Tuesday, 29 September 2015, 23:59:50, Gilles Chanteperdrix wrote:
> > > > What may be missing is that creating sockets should lock the
> > > > corresponding kernel module.
> > >
> > > The issue should now be fixed in the xenomai-gch git, branch for-forge.
> >
> > Thank you for your work.
> >
> > I will try the branch in the next few days. I have been running for a few
> > months now with disabled locking (as in my old patch) in AF_PACKET
> > without any issues so it should be fine.
>
> Well, the issue is not specific to AF_PACKET. And if you do not get
> socket creations to lock the rtpacket module, you can trigger leaks
> and faults by removing the rtpacket module while an application has
> an AF_PACKET socket open.

Yes, I was aware of that. This simply does not happen in my application so it
did not bother me...

Marc

-- 
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
Arnold-Sommerfeld-Ring 3
52499 Baesweiler
Germany
Tel: +49 2401 - 80970
Fax: +49 2401 - 809715
Geschäftsführer: Dr.-Ing. S. Strämke, Dipl.-Ing. Marc Strämke
USt.-IdNr. 291 812 490 / Steuer-Nr. 202/5739/1186
Amtsgericht Aachen HRB 18539
www.eltropuls.de
Thread overview: 19+ messages
2015-06-03 12:56 [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3 Marc Strämke
2015-06-03 13:02 ` Gilles Chanteperdrix
2015-06-03 17:09 ` Marc Strämke
2015-06-04 10:41 ` [Xenomai] rtnet locking/socket SKBs Marc Strämke
2015-06-04 10:56 ` Gilles Chanteperdrix
2015-06-04 11:09 ` Marc Strämke
2015-06-04 11:12 ` [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool Marc Strämke
2015-06-04 14:15 ` Gilles Chanteperdrix
2015-06-04 11:20 ` [Xenomai] rtnet locking/socket SKBs Gilles Chanteperdrix
2015-06-04 11:31 ` Gilles Chanteperdrix
2015-06-04 12:35 ` Marc Strämke
2015-06-04 12:37 ` Gilles Chanteperdrix
2015-06-04 12:40 ` Marc Strämke
2015-06-04 14:07 ` Gilles Chanteperdrix
2015-09-27 21:32 ` Gilles Chanteperdrix
2015-09-29 21:59 ` Gilles Chanteperdrix
2015-09-30 14:33 ` Marc Strämke
2015-09-30 14:41 ` Gilles Chanteperdrix
2015-09-30 14:44 ` Marc Strämke