* accept after select for RTNet/TCP
From: Per Oberg @ 2023-09-27 15:13 UTC
  To: xenomai

Hi 

I'm currently porting parts of an application that uses some TCP during startup to Xenomai. I have a working example of a TCP server, but I cannot make the "select" part work.

If I do 

fd = __COBALT(accept(list_fd, (struct sockaddr *) &client_addr, &client_len));

without the select part, it works. If I do a select before it, it returns "Invalid argument".

The "select" part:
------------------------------------
fd_set in_fds;
FD_ZERO(&in_fds);
FD_SET(list_fd, &in_fds);
int selRet = __COBALT(select(list_fd + 1, &in_fds, 0, 0, 0));
if (FD_ISSET(list_fd, &in_fds))
{
  printf("list_fd (%i)  is in FDSET\n", list_fd);
}
------------------------------------

The program works when compiled for regular Linux.
Is select for RTNet TCP a no-no?
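
For comparison, here is a minimal plain-POSIX sketch of the select-then-accept pattern described above, without any __COBALT() wrappers. The helper names (make_listener, select_then_accept, select_accept_selftest) are illustrative assumptions and are not taken from the original program. It only shows the pattern as it is expected to behave on regular Linux and says nothing about how RTNet handles it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP listener on 127.0.0.1 with a kernel-chosen port.
 * Returns the listening descriptor or -1; *port is in host byte order. */
static int make_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Block in select() until list_fd is readable, then accept().
 * Returns the connected descriptor or -1 on error. */
static int select_then_accept(int list_fd)
{
    struct sockaddr_in client_addr;
    socklen_t client_len = sizeof(client_addr);
    fd_set in_fds;

    /* FD_SET() is only defined for descriptors below FD_SETSIZE. */
    assert(list_fd >= 0 && list_fd < FD_SETSIZE);
    FD_ZERO(&in_fds);
    FD_SET(list_fd, &in_fds);
    if (select(list_fd + 1, &in_fds, NULL, NULL, NULL) < 0)
        return -1;
    if (!FD_ISSET(list_fd, &in_fds))
        return -1;
    return accept(list_fd, (struct sockaddr *)&client_addr, &client_len);
}

/* Connect to our own listener, then run select + accept on it.
 * Returns 0 on success, -1 on any failure. */
static int select_accept_selftest(void)
{
    struct sockaddr_in srv;
    unsigned short port;
    int list_fd, cli_fd, conn_fd = -1;

    list_fd = make_listener(&port);
    if (list_fd < 0)
        return -1;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(port);
    cli_fd = socket(AF_INET, SOCK_STREAM, 0);
    /* On loopback, connect() completes against the listen backlog,
     * so the connection is pending before select() is entered. */
    if (cli_fd >= 0 &&
        connect(cli_fd, (struct sockaddr *)&srv, sizeof(srv)) == 0)
        conn_fd = select_then_accept(list_fd);
    if (conn_fd >= 0)
        close(conn_fd);
    if (cli_fd >= 0)
        close(cli_fd);
    close(list_fd);
    return conn_fd >= 0 ? 0 : -1;
}
```

Calling select_accept_selftest() from a small main() and checking for a return value of 0 gives a baseline to compare against when the same sequence is run over RTNet.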


Best Regards
Per Öberg 


* Re: accept after select for RTNet/TCP
From: Florian Bezdeka @ 2023-09-27 20:21 UTC
  To: Per Oberg, xenomai

Hi Per,

On Wed, 2023-09-27 at 10:13 -0500, Per Oberg wrote:
> Hi 
> 
> I'm currently porting parts of an application that uses some TCP during startup to Xenomai. I have a working example of a TCP server, but I cannot make the "select" part work.
> 
> If I do 
> 
> fd = __COBALT(accept(list_fd, (struct sockaddr *) &client_addr, &client_len));

Using __COBALT() should not be necessary; the Xenomai function
wrapping takes care of redirecting the call to the Cobalt world
(assuming you are using the LDFLAGS and CFLAGS delivered by
xeno-config).

That allows you to compile the same code - but with different flags -
for both worlds without modifications.

> 
> without the select part, it works. If I do a select before it, it returns "Invalid argument".

I remember seeing something similar in the past. If I recall
correctly, there was an ordering problem. It somehow crashed when the
first connection request came in before the accept() call happened, or
something along those lines. The socket state was somehow "corrupted".

I should have it on one of my TODO lists... Unable to find it right
now.

> The "select" part:
> ------------------------------------
> fd_set in_fds;
> FD_ZERO(&in_fds);
> FD_SET(list_fd, &in_fds);
> int selRet = __COBALT(select(list_fd + 1, &in_fds, 0, 0, 0));
> if (FD_ISSET(list_fd, &in_fds))
> {
>   printf("list_fd (%i)  is in FDSET\n", list_fd);
> }
> ------------------------------------
> 
> The program works when compiled for regular Linux.
> Is select for RTNet TCP a no-no?
> 

select() should work as expected.

Can you share a complete reproducer? That would simplify things and
increase the chance that someone is able to look into it.

Best regards,
Florian

> 
> Best Regards
> Per Öberg 
> 
> 


* Re: accept after select for RTNet/TCP
From: Per Oberg @ 2023-09-28  5:54 UTC
  To: xenomai; +Cc: Florian Bezdeka

Hi Florian, and thanks for answering.

The explicit use of the __COBALT / __STD wrappers is for some of my code that sometimes goes over a real-time socket and sometimes over a regular socket (depending on the destination). I think I found an issue when leaving them out in some special cases (but the missing pselect might have been the real issue).

That said, I also use them to make sure that no syscalls are missing when debugging. E.g. I think I recall that pselect was missing in the implementation, and that became apparent when putting __COBALT in front of it.
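
A minimal sketch of how such per-call selection could be written so that the same file still builds on regular Linux. RT_CALL, STD_CALL, open_tcp_socket and close_socket are hypothetical names introduced here. The sketch assumes that the Cobalt headers pulled in by the xeno-config POSIX flags define __COBALT() and __STD() as macros; when they are absent, both variants fall back to the plain libc call.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Assumption: in a Xenomai build, <sys/socket.h> resolves to the Cobalt
 * header, which makes __COBALT() and __STD() available. In a plain
 * Linux build neither macro exists and the call is used unchanged.
 */
#if defined(__COBALT) && defined(__STD)
#define RT_CALL(call)  __COBALT(call)
#define STD_CALL(call) __STD(call)
#else
#define RT_CALL(call)  call
#define STD_CALL(call) call
#endif

/* Open a TCP socket on the real-time or the regular stack,
 * chosen at run time (e.g. depending on the destination). */
static int open_tcp_socket(int use_rt)
{
    if (use_rt)
        return RT_CALL(socket(AF_INET, SOCK_STREAM, 0));
    return STD_CALL(socket(AF_INET, SOCK_STREAM, 0));
}

/* Close a descriptor on the same stack it was opened on. */
static int close_socket(int fd, int use_rt)
{
    assert(fd >= 0); /* callers must pass a descriptor they opened */
    if (use_rt)
        return RT_CALL(close(fd));
    return STD_CALL(close(fd));
}
```

The caller has to remember which stack each descriptor belongs to, which is why the use_rt flag is passed to both helpers.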

My issue is 100% reproducible (afaik), so no timing issues show up for me in that regard.

I can certainly share the code. My example is an adapted version of a TCP client/server demo. I will put it on GitHub, if that would work?

A few other notes that came up during the porting:

1. It seems like accept does not like being called with null pointers for the client address.

2. Accept returns the same socket number as the listen socket, and thus closes the socket when the connection closes. I guessed that this was a simplification deemed OK for real time. This effectively makes the TCP server single-client because SOCK_REUSE is not supported, right?

Best Regards
Per Öberg

* Re: accept after select for RTNet/TCP
From: Per Oberg @ 2023-09-28 11:17 UTC
  To: xenomai; +Cc: Florian Bezdeka


The example code is now available at

https://github.com/droberg/XenomaiTCP

Best Regards
Per Öberg

* Re: accept after select for RTNet/TCP
From: Florian Bezdeka @ 2023-09-29 10:41 UTC
  To: Per Oberg, xenomai; +Cc: jan.kiszka

On Thu, 2023-09-28 at 06:17 -0500, Per Oberg wrote:
> > A few other notes that came up during the porting:
> 
> > > 1. It seems like accept does not like being called with null pointers to client
> > > address

That might be a bug, or an RT simplification. I would have to look into
it in more detail. Jan, any idea?

> 
> > 2. Accept returns the same socket number as the listen socket, and thus closes
> > the socket when the connection closes. I guessed that this was a simplification
> > deemed ok for real time. This effectively makes the TCP server single client
> > because SOCK_REUSE is not supported, right?

I'm not familiar with that code yet. But yes, might be a simplification
for RT. Jan, any thoughts?

> 
> > Best Regards
> > Per Öberg
> 
> The example code is now available on 
> 
> https://github.com/droberg/XenomaiTCP

Thanks! Meanwhile, I found
https://gitlab.com/Xenomai/xenomai-hacker-space/-/issues/34.
We would be happy to take patches that fill this gap. Are you
interested in working on that?

> 
> Best Regards
> Per Öberg


* Re: accept after select for RTNet/TCP
From: Jan Kiszka @ 2023-09-29 12:11 UTC
  To: Florian Bezdeka, Per Oberg, xenomai

On 29.09.23 12:41, Florian Bezdeka wrote:
> On Thu, 2023-09-28 at 06:17 -0500, Per Oberg wrote:
>>> A few other notes that came up during the porting:
>>
>>> 1. It seems like accept does not like being called with null pointers to client
>>> address
> 
> That might be a bug, or an RT simplification. I would have to look into
> it in more detail. Jan, any idea?

Yes, it indeed looks like we demand a non-null pointer here. This can
be relaxed; it just takes someone to send a patch.
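
Until that is relaxed, an application can sidestep the restriction by always handing accept() real storage and ignoring the result. A minimal sketch, where accept_discard_peer is a hypothetical helper name and not part of any Xenomai API:

```c
#include <assert.h>
#include <sys/socket.h>

/*
 * POSIX allows accept(fd, NULL, NULL) when the peer address is not
 * needed. This variant always passes valid pointers, so it should also
 * work with an implementation that demands a non-null address.
 */
static int accept_discard_peer(int list_fd)
{
    struct sockaddr_storage peer; /* filled in by accept(), then ignored */
    socklen_t peer_len = sizeof(peer);

    assert(list_fd >= 0);
    return accept(list_fd, (struct sockaddr *)&peer, &peer_len);
}
```

struct sockaddr_storage is large enough for any address family, so the helper does not need to know which kind of socket it is given.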

> 
>>
>>> 2. Accept returns the same socket number as the listen socket, and thus closes
>>> the socket when the connection closes. I guessed that this was a simplification
>>> deemed ok for real time. This effectively makes the TCP server single client
>>> because SOCK_REUSE is not supported, right?
> 
> I'm not familiar with that code yet. But yes, might be a simplification
> for RT. Jan, any thoughts?

Yeah, we built some simplifications into the RT-TCP implementation,
see also [1]. It was originally designed to handle point-to-point TCP
traffic in RT to a single non-Linux peer. And IIRC, there was not even
a need for server support on the Linux side for that.
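
Code that has to run on both stacks can guard against the shared-descriptor behaviour reported earlier in the thread. The following is only a sketch based on that report: close_connection is a hypothetical helper, and whether RT-TCP really returns the listener's own descriptor from accept() has not been verified here.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Close the connection descriptor and report what happened to the
 * listener:
 *   1  the listening descriptor is still open (regular Linux case),
 *   0  accept() had returned the listener's own descriptor, so closing
 *      the connection closed the listener as well,
 *  -1  close() failed.
 */
static int close_connection(int list_fd, int conn_fd)
{
    assert(list_fd >= 0 && conn_fd >= 0);
    if (close(conn_fd) < 0)
        return -1;
    return conn_fd != list_fd;
}
```

A server loop could use a return value of 0 as the signal to create, bind and listen on a fresh socket before waiting for the next client.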

Jan

[1]
https://source.denx.de/Xenomai/xenomai/-/blob/master/kernel/drivers/net/doc/README.tcp?ref_type=heads

-- 
Siemens AG, Technology
Linux Expert Center


* Re: accept after select for RTNet/TCP
From: Per Oberg @ 2023-09-30 12:42 UTC
  To: Florian Bezdeka, xenomai


----- On 29 Sep 2023, at 12:41, Florian Bezdeka florian.bezdeka@siemens.com wrote:

> Thanks! I found
> https://gitlab.com/Xenomai/xenomai-hacker-space/-/issues/34 meanwhile.
> So we would be happy to take patches filling this gap. Are you
> interested in working on that?

Please elaborate. My kernel hacking experience is not great, but I'd like to get more experience. Perhaps this is a good way of getting my hands dirty.

Are there any ideas right now for what needs to be done, especially in terms of testing and conformance? I would need to get started with smaller pieces to be able to get up and running properly.

> > Best Regards
> > Per Öberg


Best Regards 
Per Öberg

* Re: accept after select for RTNet/TCP
From: Florian Bezdeka @ 2023-10-05  8:07 UTC
  To: Per Oberg, xenomai

On Sat, 2023-09-30 at 07:42 -0500, Per Oberg wrote:
> > Thanks! I found
> > https://gitlab.com/Xenomai/xenomai-hacker-space/-/issues/34 meanwhile.
> > So we would be happy to take patches filling this gap. Are you
> > interested in working on that?
> 
> Please elaborate. My kernel hacking experience is not great but I'd like to get 
> more experience. Perhaps this is a good way of getting my hands dirty. 
> 
> Are there any ideas right now for what needs to be done, especially in terms 
> of testing and conformance? I would need to get started with smaller pieces 
> to be able to get up and running properly.

A possible first step would be to extend the smokey test suite. Try
adding a simple client/server application that sends some data around.

You have already found some corner cases; testing them as well and
fixing the issues would be part of the journey. You would basically
start in userland (from the user's perspective) and might end up
fixing the Cobalt/RTDM infrastructure.
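
The core of such a test could look roughly like the following plain-POSIX round trip. This is a sketch only: tcp_loopback_roundtrip is a hypothetical name, the smokey plugin registration is deliberately left out because it is not shown in this thread, and running it over RTNet would need an RT-capable interface instead of the regular loopback used here.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send msg from a client socket to a server socket in the same process.
 * Returns 0 if the bytes arrive intact, -1 otherwise. */
static int tcp_loopback_roundtrip(const char *msg)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[64];
    size_t msg_len = strlen(msg);
    ssize_t got = -1;
    int list_fd, cli_fd = -1, conn_fd = -1;

    assert(msg_len > 0 && msg_len <= sizeof(buf));

    list_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (list_fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* kernel-chosen port, read back below */
    if (bind(list_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(list_fd, 1) < 0 ||
        getsockname(list_fd, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* addr now holds 127.0.0.1 plus the chosen port. */
    cli_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (cli_fd < 0 ||
        connect(cli_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    len = sizeof(addr);
    conn_fd = accept(list_fd, (struct sockaddr *)&addr, &len);
    if (conn_fd < 0)
        goto out;

    if (send(cli_fd, msg, msg_len, 0) != (ssize_t)msg_len)
        goto out;
    got = recv(conn_fd, buf, msg_len, MSG_WAITALL);
out:
    if (conn_fd >= 0)
        close(conn_fd);
    if (cli_fd >= 0)
        close(cli_fd);
    close(list_fd);
    if (got != (ssize_t)msg_len)
        return -1;
    return memcmp(buf, msg, msg_len) == 0 ? 0 : -1;
}
```

The corner cases from this thread (select before accept, accept with null address pointers, a second client after the first one disconnects) could then be added as further variants of the same function.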

> 
> > > Best Regards
> > > Per Öberg
> 
> 
> Best Regards 
> Per Öberg

