Linux CAN drivers development
 help / color / mirror / Atom feed
From: "Hölzl, Alexander" <alexander.hoelzl@gmx.net>
To: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: robin@protonic.nl, linux-kernel@vger.kernel.org,
	kernel@pengutronix.de, linux-can@vger.kernel.org
Subject: Re: [PATCH] can: j1939: fix wrong rx timeout for CTS hold messages
Date: Thu, 23 Apr 2026 15:34:18 +0200	[thread overview]
Message-ID: <6afff3fa-8e81-44c0-aac6-bd71688ebfa9@gmx.net> (raw)
In-Reply-To: <aeoZn2BIOzZyCWo_@pengutronix.de>



Am 23.04.2026 um 15:07 schrieb Oleksij Rempel:
> On Thu, Apr 23, 2026 at 11:35:27AM +0200, Hölzl, Alexander wrote:
>>
>> Hello Oleksij,
>> thank you for your quick review!
>>
>> Am 23.04.2026 um 05:50 schrieb Oleksij Rempel:
>>> Hi Alexander,
>>>
>>> On Tue, Apr 21, 2026 at 05:31:54PM +0200, Alexander Hölzl wrote:
>>>> In J1939 segmented transport, a CTS message with data byte 2 set to zero is interpreted as a hold message.
>>>> This instructs the transmitter of the segmented message to hold the connection open but to delay sending.
>>>> According to the J1939-21 standard, section 5.10.2.4 the timeout T4 after which an held open session is invalidated is
>>>> 1050 ms, not 550 as implemented currently.
>>>> The 550 ms are problematic if a device uses hold messages and assumes it can wait for more than 550 ms before it has
>>>> to resend the hold message.
>>>>
>>>> This patch changes the T4 timeout used in the implementation from 550 ms to 1050.
>>>>
>>>> Signed-off-by: Alexander Hölzl <alexander.hoelzl@gmx.net>
>>>
>>> LGTM. Thank you!
>>>
>>> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
>>>
>>>
>>> Sashico detected one more potential issue, not related to this patch:
>>> https://sashiko.dev/#/patchset/20260421153152.87772-3-alexander.hoelzl%40gmx.net
>>>
>>> If you have time, can you please verify it?
>> I just tried it and to be honest it seems that holds are fundamentally
>> broken currently. I don't think there is any way to restart normal
>> communication as soon as a hold has been received.
>>
>> When I send a hold with byte 3 set to FF and try to resume from sequence
>> number 1 I get an abort with reason 08 which is "Duplicate sequence number"
>> according to the spec:
>> (000.000000)  can0  18EC31F9   [8]  10 0A 00 02 02 00 AB 00
>> (000.001166)  can0  18ECF931   [8]  11 00 FF FF FF 00 AB 00
>> (000.101138)  can0  18ECF931   [8]  11 02 01 FF FF 00 AB 00
>> (000.000685)  can0  18EC31F9   [8]  FF 08 FF FF FF 00 AB 00
>>
>> The same happens when setting byte 3 to 01:
>> (000.000000)  can0  18EC31F9   [8]  10 0A 00 02 02 00 AB 00
>> (000.001077)  can0  18ECF931   [8]  11 00 01 FF FF 00 AB 00
>> (000.100910)  can0  18ECF931   [8]  11 02 01 FF FF 00 AB 00
>> (000.000657)  can0  18EC31F9   [8]  FF 08 FF FF FF 00 AB 00
>>
>> Setting it to 0 is disallowed as well and the transmission is cancelled
>> immediatley with error 05 which is "Maximum retransmit request limit
>> reached.":
>> (000.000000)  can0  18EC31F9   [8]  10 0A 00 02 02 00 AB 00
>> (000.000941)  can0  18ECF931   [8]  11 00 00 FF FF 00 AB 00
>> (000.000645)  can0  18EC31F9   [8]  FF 05 FF FF FF 00 AB 00
>>
>> There is a check at the beggining of j1939_xtp_rx_cts_one for duplicate
>> sequence numbers which targets byte 0, so the command type byte, and checks
>> that it is not equal to the last command.
>>
>> if (session->last_cmd == dat[0]) {
>> 		err = J1939_XTP_ABORT_DUP_SEQ;
>> 		goto out_session_cancel;
>> 	}
>>
>> This means it is impossible to handle two directly succeeding CTS which
>> would be necessary to escape the hold....
>>
>> The easiest way to fix this would probably be to move the check for a hold
>> message all the way to the top of j1939_xtp_rx_cts_one and if a hold message
>> has been received just set the rx-timeout timer and then bail?
> 
>  From a quick lock, it sounds plausible. Will you send a patch?
> 
> Hm... we needs tests, preferably in kernel source to avoid regressions.
> 
> would it be possible to implement is on top of kunit tests?
> https://lore.kernel.org/all/20260420152228.581421-1-o.rempel@pengutronix.de/
> 
> It looks like there is more user space friendly testing used:
> https://lore.kernel.org/all/20260419144600.GA4091724@chcpu16/
> 
> 
Sure I'll write a patch and tests!


  reply	other threads:[~2026-04-23 13:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-21 15:31 [PATCH] can: j1939: fix wrong rx timeout for CTS hold messages Alexander Hölzl
2026-04-23  3:50 ` Oleksij Rempel
2026-04-23  9:35   ` Hölzl, Alexander
2026-04-23 13:07     ` Oleksij Rempel
2026-04-23 13:34       ` Hölzl, Alexander [this message]
2026-05-06 12:54 ` Marc Kleine-Budde

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6afff3fa-8e81-44c0-aac6-bd71688ebfa9@gmx.net \
    --to=alexander.hoelzl@gmx.net \
    --cc=kernel@pengutronix.de \
    --cc=linux-can@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=o.rempel@pengutronix.de \
    --cc=robin@protonic.nl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox