* How to find a bug with lost network messages
From: Sandro Stiller @ 2016-02-02 9:09 UTC (permalink / raw)
To: kernelnewbies
Hello,
I'm struggling with a network driver (sllin[1]) which is not in the
official kernel.
It has a lot in common with the slcan driver but is used for LIN networks.
The problem is that sometimes messages passed to the network layer via
netif_rx() don't reach all listening programs.
This is how the driver works:
1. The application sends CAN messages to the network interface
2. The driver forwards each message to the UART (tty)
3. The UART receives the same message (single-wire connection, RX and TX
connected) and sends it back to the network layer
4. The sending application receives the previously sent message and can
check for transmission errors and appended LIN slave replies.
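
The check in step 4 could be sketched roughly as below. This is only an illustration, not code from sllin: the names check_echo and enum echo_result are hypothetical, and it assumes the echoed frame repeats the sent ID and payload, with any LIN slave reply bytes following the sent bytes in the same frame. The real frame layout used by the driver may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result codes for this sketch (not part of sllin). */
enum echo_result {
    ECHO_MISMATCH = -1, /* ID or echoed payload differs: transmission error */
    ECHO_ONLY     = 0,  /* echo matches, nothing appended */
    ECHO_REPLY    = 1   /* echo matches and extra bytes follow */
};

/*
 * Compare a sent frame with the frame read back from the socket.
 * Assumption: the echo repeats the sent ID and payload, and a slave
 * reply (if any) follows the sent bytes in the same frame.
 * *reply_len receives the number of appended bytes (0 if none).
 */
enum echo_result check_echo(uint32_t sent_id, const uint8_t *sent, size_t sent_len,
                            uint32_t echo_id, const uint8_t *echo, size_t echo_len,
                            size_t *reply_len)
{
    *reply_len = 0;

    /* A different ID or a shorter frame cannot be a valid echo. */
    if (echo_id != sent_id || echo_len < sent_len)
        return ECHO_MISMATCH;

    /* The first sent_len bytes must match what was transmitted. */
    if (sent_len > 0 && memcmp(sent, echo, sent_len) != 0)
        return ECHO_MISMATCH;

    if (echo_len == sent_len)
        return ECHO_ONLY;

    *reply_len = echo_len - sent_len;
    return ECHO_REPLY;
}
```

The application would call this with the frame it wrote and the frame it read back, and treat ECHO_MISMATCH as a transmission error.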
Sometimes step 4 stops working after 10 to 40 seconds of
transmission.
The application's blocking read() on the socket never returns the
message, yet other processes (candump running on the same interface)
do receive it. netif_rx() always returns 0.
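
To pin down when an echo goes missing, one option is to give the socket a receive timeout so that read() returns instead of blocking forever. The following is a minimal sketch under that assumption; the function names open_can_with_timeout and read_frame_or_timeout are made up for illustration, and the code only uses the standard SocketCAN raw socket interface plus SO_RCVTIMEO. It does not explain the loss, it only makes it visible to the application.

```c
#include <errno.h>
#include <linux/can.h>
#include <linux/can/raw.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Open a raw CAN socket bound to ifname, with a receive timeout.
 * Returns the socket descriptor, or -1 on any failure.
 */
int open_can_with_timeout(const char *ifname, int timeout_ms)
{
    struct sockaddr_can addr;
    struct timeval tv;
    unsigned int ifindex = if_nametoindex(ifname);
    int fd;

    if (ifindex == 0)
        return -1; /* no such interface */

    fd = socket(PF_CAN, SOCK_RAW, CAN_RAW);
    if (fd < 0)
        return -1;

    /* With SO_RCVTIMEO set, read() fails with EAGAIN/EWOULDBLOCK on timeout. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        close(fd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.can_family = AF_CAN;
    addr.can_ifindex = (int)ifindex;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Read one frame. Returns 1 if a full frame arrived, 0 if the timeout
 * expired, -1 on any other error or a short read.
 */
int read_frame_or_timeout(int fd, struct can_frame *frame)
{
    ssize_t n = read(fd, frame, sizeof(*frame));

    if (n == (ssize_t)sizeof(*frame))
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}
```

A return value of 0 from read_frame_or_timeout() right after a send would mark the moment the echo was lost, which can then be compared against the candump output.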
If more programs are listening (running multiple instances of candump),
the problem appears less often or never.
On my PC there is no problem; it occurs on ARM only.
I'm using kernel 4.1.
Can you give me a hint where to search for the cause of this behaviour?
Thank you very much.
Sandro
[1]: https://github.com/sstiller/sllin/tree/master/sllin
* How to find a bug with lost network messages
From: Arthur Pichlkostner @ 2016-02-02 19:05 UTC (permalink / raw)
To: kernelnewbies
All I know is that netif_rx() should be replaced with netif_rx_ni() on newer kernels.
Without that change I had NOHZ errors in the log; the same change was made in slcan.
Maybe this is the origin of your problem.
You can try our fork at https://github.com/tjohann/sllin, which includes many improvements and fixes compared to the original driver from 2013.