* [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3
@ 2015-06-03 12:56 Marc Strämke
2015-06-03 13:02 ` Gilles Chanteperdrix
0 siblings, 1 reply; 19+ messages in thread
From: Marc Strämke @ 2015-06-03 12:56 UTC (permalink / raw)
To: xenomai
Hello everyone,
I am experiencing an issue with packet sockets in Xenomai 3. I am currently
trying to reproduce it in a minimal test case, but so far without success.
What I have found so far is that when a packet socket is read non-blocking with
MSG_DONTWAIT, the socket sometimes cannot be closed afterwards.
I inserted an rt_printk into rt_packet_socket(fd,proto) and rt_packet_close(fd)
in af_packet.c; sometimes rt_packet_close never gets called, so any further
communication on this interface is effectively impossible.
Any hints on where I should start looking?
I assume the socket should be closed by Xenomai in any case, even if the
application abnormally terminates without calling close? If that is not the
case how do I make sure to clean up after the application in case the
application terminates abnormally?
Best Regards,
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3
2015-06-03 12:56 [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3 Marc Strämke
@ 2015-06-03 13:02 ` Gilles Chanteperdrix
2015-06-03 17:09 ` Marc Strämke
0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-03 13:02 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Wed, Jun 03, 2015 at 02:56:07PM +0200, Marc Strämke wrote:
> Hello everyone,
>
> I am experiencing an issue with using packet sockets in Xenomai 3. I am at the
> moment trying to replicate it on a minimal testcase but struggling.
> What I've found so far is that when a packet socket is read non-blocking with
> MSG_DONTWAIT the socket sometimes cannot be closed afterwards.
> I inserted a rt_printk into rt_packet_socket(fd,proto) and rt_packet_close(fd)
> in af_packet.c, sometimes rt_packet_close never gets called and so any further
> communication on this interface is effectively impossible.
>
> Any hints on where I should start looking?
Look at the reference counts.
> I assume the socket should be closed by Xenomai in any case, even if the
> application abnormally terminates without calling close? If that is not the
> case how do I make sure to clean up after the application in case the
> application terminates abnormally?
Yes, xenomai cleans up file descriptors automatically upon process
termination.
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3
2015-06-03 13:02 ` Gilles Chanteperdrix
@ 2015-06-03 17:09 ` Marc Strämke
2015-06-04 10:41 ` [Xenomai] rtnet locking/socket SKBs Marc Strämke
0 siblings, 1 reply; 19+ messages in thread
From: Marc Strämke @ 2015-06-03 17:09 UTC (permalink / raw)
To: xenomai
Thank you for your swift response Gilles,
On Wednesday, 3 June 2015, 15:02:17, Gilles Chanteperdrix wrote:
> On Wed, Jun 03, 2015 at 02:56:07PM +0200, Marc Strämke wrote:
> > Hello everyone,
> >
> > I am experiencing an issue with using packet sockets in Xenomai 3. I am at
> > the moment trying to replicate it on a minimal testcase but struggling.
> > What I've found so far is that when a packet socket is read non-blocking
> > with MSG_DONTWAIT the socket sometimes cannot be closed afterwards.
> > I inserted a rt_printk into rt_packet_socket(fd,proto) and
> > rt_packet_close(fd) in af_packet.c, sometimes rt_packet_close never gets
> > called and so any further communication on this interface is effectively
> > impossible.
> >
> > Any hints on where I should start looking?
>
> Look at the reference counts.
>
That's what I was assuming too. It is indeed the reference count that stays
at 1 after closing the socket: /sys/class/rtdm/raw_packet/refcount -> 1
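For reference, the sysfs check described above can be done from a small C helper instead of the shell. This is only a sketch: read_refcount is a hypothetical name, and the path is the one quoted in this message, which will exist only on a Xenomai 3 system with the RTnet raw packet driver loaded.

```c
#include <assert.h>
#include <stdio.h>

/* Path as reported in the thread (assumed to hold a single integer). */
#define RAW_PACKET_REFCOUNT "/sys/class/rtdm/raw_packet/refcount"

/*
 * Hypothetical helper: return the integer stored in a sysfs-style
 * attribute file, or -1 if the file cannot be opened or parsed.
 */
long read_refcount(const char *path)
{
	FILE *f;
	long value = -1;

	assert(path != NULL);

	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fscanf(f, "%ld", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

Calling read_refcount(RAW_PACKET_REFCOUNT) before opening and after closing the socket would show whether a reference is left dangling. Note that, according to this report, the read itself can trigger the problem while a socket is open.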
But I am at a total loss as to where this reference count is actually
incremented and decremented. One thing I have now noticed is that I can trigger
the problem by accessing the sysfs refcount entry of the raw_packet rtdm
driver while I have a raw_packet socket open. So the same test program
(open socket, send a packet, wait 2 seconds, receive a packet) works fine
without the sysfs access, and triggers the dangling reference problem if
the sysfs entry is accessed during the 2-second sleep.
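The test program itself was not posted, so the following is only a rough sketch of the sequence described above, written against the plain Linux socket API. The function name packet_socket_roundtrip is hypothetical, the send step is left out because it needs an interface index and MAC addresses that the thread does not give, and it is an assumption that the program would be built against Xenomai's POSIX wrappers so that these calls reach the RTnet driver.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <linux/if_ether.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a raw packet socket, wait, try one non-blocking receive, close.
 * Returns 0 on success or a negative errno value from the first
 * failing call. Opening AF_PACKET sockets normally needs CAP_NET_RAW.
 */
int packet_socket_roundtrip(unsigned int wait_seconds)
{
	unsigned char frame[ETH_FRAME_LEN];
	ssize_t n;
	int fd, err;

	fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (fd < 0)
		return -errno;

	/* Sending a frame is omitted here; see the note above. */
	sleep(wait_seconds);

	/* Non-blocking read: "no data yet" is not treated as an error. */
	n = recv(fd, frame, sizeof(frame), MSG_DONTWAIT);
	if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
		err = errno;
		close(fd);
		return -err;
	}
	assert(n <= (ssize_t)sizeof(frame));

	if (close(fd) < 0)
		return -errno;
	return 0;
}
```

With wait_seconds set to 2, the sysfs refcount entry can be read from another shell during the sleep to reproduce the trigger condition described in this message.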
How is the reference count maintained for the rtdm devices? Is that the
refcount in the rtdm_fd structure (fd->refs)? There is also a driver refcount,
and I am somewhat confused about how this is maintained...
Regards,
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
^ permalink raw reply [flat|nested] 19+ messages in thread
* [Xenomai] rtnet locking/socket SKBs
2015-06-03 17:09 ` Marc Strämke
@ 2015-06-04 10:41 ` Marc Strämke
2015-06-04 10:56 ` Gilles Chanteperdrix
2015-09-27 21:32 ` Gilles Chanteperdrix
0 siblings, 2 replies; 19+ messages in thread
From: Marc Strämke @ 2015-06-04 10:41 UTC (permalink / raw)
To: xenomai
On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> How is the reference count maintained for the rtdm devices, is that the
> refcount in the rtdm_fd structure (fd->refs). There is also a driver
> refcount, I am actually somewhat confused how this is maintained...
So I have mostly tracked down the issue, but I need some input from someone more
knowledgeable about rtnet's design:
The reference count of an open AF_PACKET socket is not dropping to zero
because there are still skbs in the socket's skb pool and
rtskb_socket_pool_trylock increments the fd's reference
count. rt_socket_cleanup would release the skb pool but never gets called.
What I do not really understand at the moment is why the fd reference count
gets incremented at all when the socket gets an skb in its pool.
Is there any reason not to close a socket while it still has associated skbs
in the pool? They will never get cleared if the application crashes, if I am
not mistaken.
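The circular dependency described above can be illustrated with a toy userspace model. This is not the RTDM or RTnet code: every model_* name is hypothetical, and the model only encodes the behaviour as reported here, namely that the pool lock holds an fd reference and that the close handler (the only place that would release the pool) runs when the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an RTDM file descriptor; not the real struct rtdm_fd. */
struct model_fd {
	int refs;          /* plays the role of fd->refs */
	bool pool_locked;  /* true while the skb pool holds a reference */
	bool close_called; /* true once the close handler has run */
};

/* The close handler is the only place that would release the pool. */
void model_close_handler(struct model_fd *fd)
{
	fd->pool_locked = false;
	fd->close_called = true;
}

/* Drop one reference; the close handler runs only on the last one. */
void model_put(struct model_fd *fd)
{
	assert(fd->refs > 0);
	if (--fd->refs == 0)
		model_close_handler(fd);
}

/* Opening the socket creates the fd with a single reference. */
void model_open(struct model_fd *fd)
{
	fd->refs = 1;
	fd->pool_locked = false;
	fd->close_called = false;
}

/* The pool lock pins the fd by taking an extra reference. */
void model_pool_trylock(struct model_fd *fd)
{
	if (!fd->pool_locked) {
		fd->refs++;
		fd->pool_locked = true;
	}
}

/* close() from the application drops only the application's reference. */
void model_user_close(struct model_fd *fd)
{
	model_put(fd);
}
```

In this model, calling model_pool_trylock before model_user_close leaves refs at 1 and close_called false, which matches the refcount of 1 seen in sysfs: the handler that would drop the pool's reference is itself waiting for that reference to go away.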
Best Regards,
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 10:41 ` [Xenomai] rtnet locking/socket SKBs Marc Strämke
@ 2015-06-04 10:56 ` Gilles Chanteperdrix
2015-06-04 11:09 ` Marc Strämke
2015-09-27 21:32 ` Gilles Chanteperdrix
1 sibling, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-04 10:56 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 12:41:47PM +0200, Marc Strämke wrote:
> On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> > How is the reference count maintained for the rtdm devices, is that the
> > refcount in the rtdm_fd structure (fd->refs). There is also a driver
> > refcount, I am actually somewhat confused how this is maintained...
>
> So I got mostly down to the issue but I need some input from someone more
> knowledged in rtnets design:
> The reference count of an open AF_PACKET socket is not dropping to zero
> because there are still skb in the sockets skb pool and
> rtskb_socket_pool_trylock increments the fds reference
> count. rt_socket_cleanup would release the skb pool but never gets called.
>
> What I do not really understand at this moment is why the fd
> reference count gets incremented at all when the socket gets a skb
> in its pool? Is there any reason to not close a socket while it
> still has associated skbs in the pool?
The previous rtnet design was converted to reference counts globally
when integrating into Xenomai 3; some details may still need fixing.
The old design was different but led to the close syscall looping
forever and blocking your application if anything went wrong, so
this is an attempt to avoid this issue.
> They will never get cleared if the application crashes If I am
> not mistaken.
Once again: when an application crashes, the file descriptors are
automatically closed by Xenomai, you do not need to care about this
case.
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 10:56 ` Gilles Chanteperdrix
@ 2015-06-04 11:09 ` Marc Strämke
2015-06-04 11:12 ` [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool Marc Strämke
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Marc Strämke @ 2015-06-04 11:09 UTC (permalink / raw)
To: xenomai
Gilles,
On Thursday, 4 June 2015, 12:56:16, Gilles Chanteperdrix wrote:
> The previous rtnet design was converted to reference counts globally
> when integrating into xenomai 3, some details may still need fixing.
> The old design was different but lead to the close syscall looping
> for ever and blocking your application if anything went wrong, so,
> this is an attempt to avoid this issue.
Yes, I do understand this. I am just trying to figure out what the right fix is.
If I just disable incrementing the fd reference count (see attached patch),
AF_PACKET works as it should. From my current understanding of the code, the
reference counting for the skb pools attached to a socket is redundant, but
I have only closely inspected AF_PACKET, not the other sockets in the IP stack.
>
> > They will never get cleared if the application crashes If I am
> > not mistaken.
>
> Once again: when an application crashes, the file descriptors are
> automatically closed by Xenomai, you do not need to care about this
> case.
What I am seeing is that when the reference count is above 1 when the
application closes or crashes, the close operation of the rtdm socket never gets
called, nor is it called on an explicit close system call on a socket that
still has a reference count above 1.
When I disable the skb pool reference counting for AF_PACKET, it basically
works as it should. When the application closes the socket or crashes, the
close op gets called and the skbs are released.
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Proof-of-concept-disable-locking-in-af_packet-skb-po.patch
Type: text/x-patch
Size: 1318 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20150604/b098fe8a/attachment.bin>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool
2015-06-04 11:09 ` Marc Strämke
@ 2015-06-04 11:12 ` Marc Strämke
2015-06-04 14:15 ` Gilles Chanteperdrix
2015-06-04 11:20 ` [Xenomai] rtnet locking/socket SKBs Gilles Chanteperdrix
2015-06-04 11:31 ` Gilles Chanteperdrix
2 siblings, 1 reply; 19+ messages in thread
From: Marc Strämke @ 2015-06-04 11:12 UTC (permalink / raw)
To: xenomai
---
kernel/drivers/net/stack/packet/af_packet.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/kernel/drivers/net/stack/packet/af_packet.c b/kernel/drivers/net/stack/packet/af_packet.c
index 4c7ff57..9f5e417 100644
--- a/kernel/drivers/net/stack/packet/af_packet.c
+++ b/kernel/drivers/net/stack/packet/af_packet.c
@@ -190,6 +190,19 @@ static int rt_packet_getsockname(struct rtsocket *sock, struct sockaddr *addr,
}
+static int rtskb_nop_trylock(void *cookie)
+{
+ return 1;
+}
+
+static void rtskb_nop_unlock(void *cookie)
+{
+}
+
+static const struct rtskb_pool_lock_ops rtskb_nop_lock_ops = {
+ .trylock = rtskb_nop_trylock,
+ .unlock = rtskb_nop_unlock,
+};
/***
* rt_packet_socket - initialize a packet socket
@@ -202,6 +215,8 @@ static int rt_packet_socket(struct rtdm_fd *fd, int protocol)
if ((ret = rt_socket_init(fd, protocol)) != 0)
return ret;
+ sock->skb_pool.lock_ops = &rtskb_module_lock_ops;
+
sock->prot.packet.packet_type.type = protocol;
sock->prot.packet.ifindex = 0;
--
2.2.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 11:09 ` Marc Strämke
2015-06-04 11:12 ` [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool Marc Strämke
@ 2015-06-04 11:20 ` Gilles Chanteperdrix
2015-06-04 11:31 ` Gilles Chanteperdrix
2 siblings, 0 replies; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-04 11:20 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> Gilles,
>
> On Thursday, 4 June 2015, 12:56:16, Gilles Chanteperdrix wrote:
> > The previous rtnet design was converted to reference counts globally
> > when integrating into xenomai 3, some details may still need fixing.
> > The old design was different but lead to the close syscall looping
> > for ever and blocking your application if anything went wrong, so,
> > this is an attempt to avoid this issue.
> Yes I do understand this. I am just trying to figure out what the right fix is.
> If I just disable incrementing the fd reference count (see attached patch)
> AF_PACKET works as it should. From my current understanding of the code the
> reference counting for the SKB Pools attached to a socket is redundant, but
> I've only closely inspected AF_PACKET not the other sockets in the IP stack.
I will have a look at that when I resume working on rtnet. This
should come in a few weeks now. Your fix is definitely not the right
fix though, as the socket pool should be created with the right lock
operations pointer right away; the pointer should not be changed
after the fact.
The risk here is to leak some skbs. Maybe your application works, but
maybe you leak some skbs every time you close the file descriptor.
The skbs are only allocated and freed during socket or interface
creation; the rest of the time, they move from pool to pool.
>
> >
> > > They will never get cleared if the application crashes If I am
> > > not mistaken.
> >
> > Once again: when an application crashes, the file descriptors are
> > automatically closed by Xenomai, you do not need to care about this
> > case.
> What I am seeing is that when the reference count is above 1 when the
> application closes/crashes, the close operation of the rtdm socket never get's
> called, neither when doing an explicit close system call on the socket which
> still had a reference count above 1.
>
> When I disable the skb pool reference counting for AF_PACKET it basically
> works as it should. When the application closes the socket or crashes the
> close op get's called and the skb's release.
So, the problem fixes itself when fixing the first problem. So,
again, this is not the problem you should care about.
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 11:09 ` Marc Strämke
2015-06-04 11:12 ` [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool Marc Strämke
2015-06-04 11:20 ` [Xenomai] rtnet locking/socket SKBs Gilles Chanteperdrix
@ 2015-06-04 11:31 ` Gilles Chanteperdrix
2015-06-04 12:35 ` Marc Strämke
2 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-04 11:31 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> Gilles,
>
> On Thursday, 4 June 2015, 12:56:16, Gilles Chanteperdrix wrote:
> > The previous rtnet design was converted to reference counts globally
> > when integrating into xenomai 3, some details may still need fixing.
> > The old design was different but lead to the close syscall looping
> > for ever and blocking your application if anything went wrong, so,
> > this is an attempt to avoid this issue.
> Yes I do understand this. I am just trying to figure out what the right fix is.
> If I just disable incrementing the fd reference count (see attached patch)
> AF_PACKET works as it should. From my current understanding of the code the
> reference counting for the SKB Pools attached to a socket is redundant, but
> I've only closely inspected AF_PACKET not the other sockets in the IP stack.
The locking of the pools is not redundant. The reason for this lock
is that when an skb is in between two pools, we do not want the pool
it comes from to be destroyed, as the skb would end up leaking. So,
since skbs move from pool to pool, when the skb comes to a new pool,
the old pool should be unlocked. This should work as rtnet spends
its time exchanging skbs from pool to pool and does not allocate or
free them. I tried to implement something along these lines.
Apparently, I messed up, but the fix is not to disable the locking.
Disabling locking will lead to leaking skbs.
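The intended hand-off can be sketched as a small userspace model. It is an illustration of the rule stated above, not the RTnet implementation: model_pool, model_skb and the pool_* functions are all hypothetical names. A pool counts one lock for every skb that has left it, and that lock is dropped when the skb arrives in another pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy pool: counts queued buffers and buffers currently in flight. */
struct model_pool {
	int queued; /* skbs sitting in this pool */
	int locks;  /* skbs that left this pool and have not landed yet */
};

/* Toy buffer: remembers which pool it was taken from. */
struct model_skb {
	struct model_pool *origin;
};

/* Take one skb out of a pool; the pool stays locked while it is out. */
bool pool_take(struct model_pool *pool, struct model_skb *skb)
{
	if (pool->queued == 0)
		return false;
	pool->queued--;
	pool->locks++;
	skb->origin = pool;
	return true;
}

/* Put the skb into a (possibly different) pool and unlock the old one. */
void pool_give(struct model_pool *dst, struct model_skb *skb)
{
	assert(skb->origin != NULL && skb->origin->locks > 0);
	dst->queued++;
	skb->origin->locks--;
	skb->origin = NULL;
}

/* A pool may only be destroyed when none of its skbs are in flight. */
bool pool_can_destroy(const struct model_pool *pool)
{
	return pool->locks == 0;
}
```

Under this rule a pool is pinned only for the time an skb is in transit, so the lock count returns to zero on its own once every skb has landed somewhere.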
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 11:31 ` Gilles Chanteperdrix
@ 2015-06-04 12:35 ` Marc Strämke
2015-06-04 12:37 ` Gilles Chanteperdrix
2015-06-04 14:07 ` Gilles Chanteperdrix
0 siblings, 2 replies; 19+ messages in thread
From: Marc Strämke @ 2015-06-04 12:35 UTC (permalink / raw)
To: xenomai
On Thursday, 4 June 2015, 13:31:00, Gilles Chanteperdrix wrote:
> On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> >
> > Yes I do understand this. I am just trying to figure out what the right
> > fix is. If I just disable incrementing the fd reference count (see
> > attached patch) AF_PACKET works as it should. From my current
> > understanding of the code the reference counting for the SKB Pools
> > attached to a socket is redundant, but I've only closely inspected
> > AF_PACKET not the other sockets in the IP stack.
> The locking of the pools is not redundant. The reason for this lock
> is that when an skb is in between two pools, we do not want the pool
> it comes from to be destroyed, as the skb would end-up leaking. So,
Yes, that is obvious. But what these lock ops do is lock not for the
transient moment when the packet is moving between pools, but for the time the
packet is in a pool. These lock ops ensure that no whole skb pool is leaked.
If I look at rtskb_module_lock_ops, this makes sense to me: one cannot unload
the module until all pools are empty. It left me wondering, though, whether
try_module_get is safe to call from a realtime context.
> since skbs move from pool to pool, when the skb comes to a new pool,
> the old pool should be unlocked. This should work as rtnet spends
> its time exchanging skbs from pool to pool and does not allocate or
> free them. I tried and implement something along these lines.
> Apparently, I messed up, but the fix is not to disable the locking.
> Disabling locking will lead to leaking skbs.
I do understand this also. It is redundant in the case of AF_PACKET, not in
the general case. In the case of rtskb_socket_pool_ops the above mentioned
mechanism does not work out. The pool is emptied on close if there are still
unread packets in the pool. So the reference counting on rtdm_fd already makes
sure that the socket close operation is called. Both mechanisms together lead
to a deadlock.
If every socket type implements a correct close operation which frees the
skbpool used by the socket on close, there is no risk of leaking skbs. I did
also test that I am not leaking skbs when stressing the AF_PACKET socket.
But you are certainly right, the patch I submitted is kinda stupid. It was
just what I tested for AF_PACKET. If my assumptions are correct the lock ops
for all sockets should be disabled because the reference counting of the fd
takes care of freeing the pool. This would then be done at creation time in
rt_bare_socket_init not later on.
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 12:35 ` Marc Strämke
@ 2015-06-04 12:37 ` Gilles Chanteperdrix
2015-06-04 12:40 ` Marc Strämke
2015-06-04 14:07 ` Gilles Chanteperdrix
1 sibling, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-04 12:37 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 02:35:18PM +0200, Marc Strämke wrote:
> On Thursday, 4 June 2015, 13:31:00, Gilles Chanteperdrix wrote:
> > On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> > >
> > > Yes I do understand this. I am just trying to figure out what the right
> > > fix is. If I just disable incrementing the fd reference count (see
> > > attached patch) AF_PACKET works as it should. From my current
> > > understanding of the code the reference counting for the SKB Pools
> > > attached to a socket is redundant, but I've only closely inspected
> > > AF_PACKET not the other sockets in the IP stack.
> > The locking of the pools is not redundant. The reason for this lock
> > is that when an skb is in between two pools, we do not want the pool
> > it comes from to be destroyed, as the skb would end-up leaking. So,
> Yes, that is obvious. But what these lock ops do is not locking for the
> transient moment when the packet is moving between pools but for the time the
> packet is in a pool. This locking ops insure that no whole skb pool is leaked.
>
> If I look at rtskb_module_lock_ops this makes sense to me, one cannot unload
> the module till all pools are empty. It left me wondering though if
> try_module_get is safe to be called from a realtime context..
>
>
> > since skbs move from pool to pool, when the skb comes to a new pool,
> > the old pool should be unlocked. This should work as rtnet spends
> > its time exchanging skbs from pool to pool and does not allocate or
> > free them. I tried and implement something along these lines.
> > Apparently, I messed up, but the fix is not to disable the locking.
> > Disabling locking will lead to leaking skbs.
> I do understand this also. It is redundant in the case of AF_PACKET, not in
> the general case. In the case of rtskb_socket_pool_ops the above mentioned
> mechanism does not work out. The pool is emptied on close if there are still
> unread packets in the pool. So the reference counting on rtdm_fd already makes
> sure that the socket close operation is called. Both mechanisms together lead
> to a deadlock.
> If every socket type implements a correct close operation which frees the
> skbpool used by the socket on close, there is no risk of leaking skbs. I did
> also test that I am not leaking skbs when stressing the AF_PACKET socket.
>
> But you are certainly right, the patch I submitted is kinda stupid. It was
> just what I tested for AF_PACKET. If my assumptions are correct the lock ops
> for all sockets should be disabled because the reference counting of the fd
> takes care of freeing the pool. This would then be done at creation time in
> rt_bare_socket_init not later on.
Your assumptions are not correct. The lock is needed, it just has to
be made to work as intended, which it currently does not. And
removing it is not the solution.
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 12:37 ` Gilles Chanteperdrix
@ 2015-06-04 12:40 ` Marc Strämke
0 siblings, 0 replies; 19+ messages in thread
From: Marc Strämke @ 2015-06-04 12:40 UTC (permalink / raw)
To: xenomai
> Your assumptions are not correct. The lock is needed, it just has to
> be made to work as intended, which it currently does not. And
> removing it is not the solution.
I was just trying to help. It would be nice to know in what way my assumptions
are wrong, though.
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 12:35 ` Marc Strämke
2015-06-04 12:37 ` Gilles Chanteperdrix
@ 2015-06-04 14:07 ` Gilles Chanteperdrix
1 sibling, 0 replies; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-04 14:07 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 02:35:18PM +0200, Marc Strämke wrote:
> On Thursday, 4 June 2015, 13:31:00, Gilles Chanteperdrix wrote:
> > On Thu, Jun 04, 2015 at 01:09:08PM +0200, Marc Strämke wrote:
> > >
> > > Yes I do understand this. I am just trying to figure out what the right
> > > fix is. If I just disable incrementing the fd reference count (see
> > > attached patch) AF_PACKET works as it should. From my current
> > > understanding of the code the reference counting for the SKB Pools
> > > attached to a socket is redundant, but I've only closely inspected
> > > AF_PACKET not the other sockets in the IP stack.
> > The locking of the pools is not redundant. The reason for this lock
> > is that when an skb is in between two pools, we do not want the pool
> > it comes from to be destroyed, as the skb would end-up leaking. So,
> Yes, that is obvious. But what these lock ops do is not locking for the
> transient moment when the packet is moving between pools but for the time the
> packet is in a pool. This locking ops insure that no whole skb pool is leaked.
>
> If I look at rtskb_module_lock_ops this makes sense to me, one cannot unload
> the module till all pools are empty. It left me wondering though if
> try_module_get is safe to be called from a realtime context..
try_module_get is safe to be called from realtime context, starting
with the I-pipe patch for Linux 3.14.
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool
2015-06-04 11:12 ` [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool Marc Strämke
@ 2015-06-04 14:15 ` Gilles Chanteperdrix
0 siblings, 0 replies; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-06-04 14:15 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 01:12:27PM +0200, Marc Strämke wrote:
> ---
> kernel/drivers/net/stack/packet/af_packet.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
NACK. For reasons explained in other mails.
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-06-04 10:41 ` [Xenomai] rtnet locking/socket SKBs Marc Strämke
2015-06-04 10:56 ` Gilles Chanteperdrix
@ 2015-09-27 21:32 ` Gilles Chanteperdrix
2015-09-29 21:59 ` Gilles Chanteperdrix
1 sibling, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-09-27 21:32 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Thu, Jun 04, 2015 at 12:41:47PM +0200, Marc Strämke wrote:
> On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> > How is the reference count maintained for the rtdm devices, is that the
> > refcount in the rtdm_fd structure (fd->refs). There is also a driver
> > refcount, I am actually somewhat confused how this is maintained...
>
> So I got mostly down to the issue but I need some input from someone more
> knowledged in rtnets design:
> The reference count of an open AF_PACKET socket is not dropping to zero
> because there are still skb in the sockets skb pool and
> rtskb_socket_pool_trylock increments the fds reference
> count. rt_socket_cleanup would release the skb pool but never gets called.
>
> What I do not really understand at this moment is why the fd reference count
> gets incremented at all when the socket gets a skb in its pool?
> Is there any reason to not close a socket while it still has associated skbs
> in the pool? They will never get cleared if the application crashes If I am
> not mistaken.
OK, so I have now had a look at the issue. The counter is probably not
dropping to zero because you have unqueued messages in the socket
"incoming" queue: the pool is locked when a packet is out of the
pool, not when a packet is in the pool.
Anyway, you are right, this is redundant, but not only for
af_packet, also for udp and tcp: when the packet is outside any pool
or queue, the file descriptor is locked, so the module cannot be
removed and a leak cannot occur. There is therefore no reason to keep
track of the fact that it is outside any queue, and we can probably
remove the locking in the socket pools.
What may be missing is that creating sockets should lock the
corresponding kernel module.
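As an illustration of that last point, here is a toy userspace model of a socket pinning its module. It is not kernel code and not the fix that was eventually committed: the model_* names are hypothetical and merely mimic the shape of the kernel's try_module_get()/module_put() pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy module: a use count plus a flag set once unloading has begun. */
struct model_module {
	int refs;
	bool unloading;
};

/* Take a module reference; refused once the module is going away. */
bool model_try_module_get(struct model_module *mod)
{
	if (mod->unloading)
		return false;
	mod->refs++;
	return true;
}

/* Drop a module reference taken earlier. */
void model_module_put(struct model_module *mod)
{
	assert(mod->refs > 0);
	mod->refs--;
}

/* Creating a socket pins the module for the socket's lifetime. */
bool model_socket_create(struct model_module *mod)
{
	return model_try_module_get(mod);
}

/* Closing the socket releases the module again. */
void model_socket_close(struct model_module *mod)
{
	model_module_put(mod);
}

/* Unloading succeeds only while no socket holds a reference. */
bool model_module_unload(struct model_module *mod)
{
	if (mod->refs > 0)
		return false;
	mod->unloading = true;
	return true;
}
```

With the reference taken at creation time, the module cannot disappear while a socket is open, which is the guarantee the per-pool locking was indirectly providing.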
--
Gilles.
https://click-hack.org
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai] rtnet locking/socket SKBs
2015-09-27 21:32 ` Gilles Chanteperdrix
@ 2015-09-29 21:59 ` Gilles Chanteperdrix
2015-09-30 14:33 ` Marc Strämke
0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2015-09-29 21:59 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Sun, Sep 27, 2015 at 11:32:36PM +0200, Gilles Chanteperdrix wrote:
> On Thu, Jun 04, 2015 at 12:41:47PM +0200, Marc Strämke wrote:
> > On Wednesday, 3 June 2015, 19:09:36, Marc Strämke wrote:
> > > How is the reference count maintained for the rtdm devices, is that the
> > > refcount in the rtdm_fd structure (fd->refs). There is also a driver
> > > refcount, I am actually somewhat confused how this is maintained...
> >
> > So I got mostly down to the issue but I need some input from someone more
> > knowledged in rtnets design:
> > The reference count of an open AF_PACKET socket is not dropping to zero
> > because there are still skb in the sockets skb pool and
> > rtskb_socket_pool_trylock increments the fds reference
> > count. rt_socket_cleanup would release the skb pool but never gets called.
> >
> > What I do not really understand at this moment is why the fd reference count
> > gets incremented at all when the socket gets a skb in its pool?
> > Is there any reason to not close a socket while it still has associated skbs
> > in the pool? They will never get cleared if the application crashes If I am
> > not mistaken.
>
> OK, so now I have had a look at the issue. The counter is probably not
> dropping to zero because you have messages that were never dequeued from
> the socket "incoming" queue: the pool is locked while a packet is out of
> the pool, not while a packet is in the pool.
>
> Anyway, you are right, this is redundant, and not only for af_packet
> but also for udp and tcp. While a packet is outside any pool or queue,
> the file descriptor is locked, so the module cannot be removed and no
> leak can occur. There is therefore no reason to keep track of the fact
> that it is outside any queue, and we can probably remove the locking in
> the socket pools.
>
> What may be missing is that creating sockets should lock the
> corresponding kernel module.
The issue should now be fixed in the xenomai-gch git, branch for-forge.
--
Gilles.
https://click-hack.org
* Re: [Xenomai] rtnet locking/socket SKBs
From: Marc Strämke @ 2015-09-30 14:33 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
Hello Gilles,
On Tuesday, 29 September 2015, 23:59:50, Gilles Chanteperdrix wrote:
> > What may be missing is that creating sockets should lock the
> > corresponding kernel module.
>
> The issue should now be fixed in the xenomai-gch git, branch for-forge.
Thank you for your work.
I will try the branch in the next few days. I have been running for a few months
now with locking disabled in AF_PACKET (as in my old patch) without any issues,
so it should be fine.
Best Regards,
Marc
--
Dipl.-Ing. Marc Strämke
Geschäftsführer / CEO
ELTROPULS Anlagenbau GmbH
Arnold-Sommerfeld-Ring 3
52499 Baesweiler
Germany
Tel: +49 2401 - 80970
Fax: +49 2401 - 809715
Managing Directors: Dr.-Ing. S. Strämke, Dipl.-Ing. Marc Strämke
VAT ID 291 812 490 / Tax No. 202/5739/1186
Aachen District Court, HRB 18539
www.eltropuls.de
* Re: [Xenomai] rtnet locking/socket SKBs
From: Gilles Chanteperdrix @ 2015-09-30 14:41 UTC (permalink / raw)
To: Marc Strämke; +Cc: xenomai
On Wed, Sep 30, 2015 at 04:33:29PM +0200, Marc Strämke wrote:
> Hello Gilles,
> On Tuesday, 29 September 2015, 23:59:50, Gilles Chanteperdrix wrote:
> > > What may be missing is that creating sockets should lock the
> > > corresponding kernel module.
> >
> > The issue should now be fixed in the xenomai-gch git, branch for-forge.
>
> Thank you for your work.
>
> I will try the branch in the next few days. I have been running for a few months
> now with locking disabled in AF_PACKET (as in my old patch) without any issues,
> so it should be fine.
Well, the issue is not specific to AF_PACKET. And if you do not get
socket creations to lock the rtpacket module, you can trigger leaks
and faults by removing the rtpacket module while an application has
an AF_PACKET socket open.
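
The module-locking point can be sketched in the same way. This is a
userspace toy model with hypothetical names (toy_module, toy_socket_create,
toy_socket_close, toy_module_unload), not the rtpacket code. In the Linux
kernel such a reference is normally taken with try_module_get() and dropped
with module_put(); the thread does not say how the for-forge fix does it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical. */
struct toy_module {
    int users;    /* open sockets currently pinning the module */
    bool loaded;  /* false once the module has been removed */
};

/* Creating a socket takes a reference on the module that implements it. */
static bool toy_socket_create(struct toy_module *m)
{
    if (!m->loaded)
        return false;
    m->users++;
    return true;
}

/* Closing the socket gives the reference back. */
static void toy_socket_close(struct toy_module *m)
{
    assert(m->users > 0);
    m->users--;
}

/* Removal is refused while any socket still holds a reference. */
static bool toy_module_unload(struct toy_module *m)
{
    if (m->users > 0)
        return false;
    m->loaded = false;
    return true;
}
```

In this model the unload attempt fails while a socket is open and succeeds
after it is closed. Without the reference taken in toy_socket_create(),
the unload would go through with a socket still open, which is the
situation that can lead to the leaks and faults mentioned above.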
--
Gilles.
https://click-hack.org
* Re: [Xenomai] rtnet locking/socket SKBs
From: Marc Strämke @ 2015-09-30 14:44 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On Wednesday, 30 September 2015, 16:41:52, Gilles Chanteperdrix wrote:
> On Wed, Sep 30, 2015 at 04:33:29PM +0200, Marc Strämke wrote:
> > Hello Gilles,
> >
> > On Tuesday, 29 September 2015, 23:59:50, Gilles Chanteperdrix wrote:
> > > > What may be missing is that creating sockets should lock the
> > > > corresponding kernel module.
> > >
> > > The issue should now be fixed in the xenomai-gch git, branch for-forge.
> >
> > Thank you for your work.
> >
> > I will try the branch in the next few days. I have been running for a few
> > months now with locking disabled in AF_PACKET (as in my old patch) without
> > any issues, so it should be fine.
>
> Well, the issue is not specific to AF_PACKET. And if you do not get
> socket creations to lock the rtpacket module, you can trigger leaks
> and faults by removing the rtpacket module while an application has
> an AF_PACKET socket open.
Yes, I was aware of that. This simply does not happen in my application so it
did not bother me...
Marc
Thread overview: 19+ messages
2015-06-03 12:56 [Xenomai] Problem with RTNET Packet Socket (or RTDM in general) in Xenomai 3 Marc Strämke
2015-06-03 13:02 ` Gilles Chanteperdrix
2015-06-03 17:09 ` Marc Strämke
2015-06-04 10:41 ` [Xenomai] rtnet locking/socket SKBs Marc Strämke
2015-06-04 10:56 ` Gilles Chanteperdrix
2015-06-04 11:09 ` Marc Strämke
2015-06-04 11:12 ` [Xenomai] [PATCH] Proof of concept: disable locking in af_packet skb pool Marc Strämke
2015-06-04 14:15 ` Gilles Chanteperdrix
2015-06-04 11:20 ` [Xenomai] rtnet locking/socket SKBs Gilles Chanteperdrix
2015-06-04 11:31 ` Gilles Chanteperdrix
2015-06-04 12:35 ` Marc Strämke
2015-06-04 12:37 ` Gilles Chanteperdrix
2015-06-04 12:40 ` Marc Strämke
2015-06-04 14:07 ` Gilles Chanteperdrix
2015-09-27 21:32 ` Gilles Chanteperdrix
2015-09-29 21:59 ` Gilles Chanteperdrix
2015-09-30 14:33 ` Marc Strämke
2015-09-30 14:41 ` Gilles Chanteperdrix
2015-09-30 14:44 ` Marc Strämke