From: Oleksij Rempel <o.rempel@pengutronix.de>
To: "Hölzl, Alexander" <alexander.hoelzl@gmx.net>
Cc: robin@protonic.nl, linux-kernel@vger.kernel.org,
kernel@pengutronix.de, linux-can@vger.kernel.org
Subject: Re: [PATCH] can: j1939: fix wrong rx timeout for CTS hold messages
Date: Thu, 23 Apr 2026 15:07:43 +0200 [thread overview]
Message-ID: <aeoZn2BIOzZyCWo_@pengutronix.de> (raw)
In-Reply-To: <3e17efb4-ae71-4b5c-af23-7b5de9c5e03c@gmx.net>
On Thu, Apr 23, 2026 at 11:35:27AM +0200, Hölzl, Alexander wrote:
>
> Hello Oleksij,
> thank you for your quick review!
>
> Am 23.04.2026 um 05:50 schrieb Oleksij Rempel:
> > Hi Alexander,
> >
> > On Tue, Apr 21, 2026 at 05:31:54PM +0200, Alexander Hölzl wrote:
> > > In J1939 segmented transport, a CTS message with data byte 2 set to zero is interpreted as a hold message.
> > > This instructs the transmitter of the segmented message to hold the connection open but to delay sending.
> > > According to the J1939-21 standard, section 5.10.2.4 the timeout T4 after which an held open session is invalidated is
> > > 1050 ms, not 550 as implemented currently.
> > > The 550 ms are problematic if a device uses hold messages and assumes it can wait for more than 550 ms before it has
> > > to resend the hold message.
> > >
> > > This patch changes the T4 timeout used in the implementation from 550 ms to 1050.
> > >
> > > Signed-off-by: Alexander Hölzl <alexander.hoelzl@gmx.net>
> >
> > LGTM. Thank you!
> >
> > Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
> >
> >
> > Sashico detected one more potential issue, not related to this patch:
> > https://sashiko.dev/#/patchset/20260421153152.87772-3-alexander.hoelzl%40gmx.net
> >
> > If you have time, can you please verify it?
> I just tried it and to be honest it seems that holds are fundamentally
> broken currently. I don't think there is any way to restart normal
> communication as soon as a hold has been received.
>
> When I send a hold with byte 3 set to FF and try to resume from sequence
> number 1 I get an abort with reason 08 which is "Duplicate sequence number"
> according to the spec:
> (000.000000) can0 18EC31F9 [8] 10 0A 00 02 02 00 AB 00
> (000.001166) can0 18ECF931 [8] 11 00 FF FF FF 00 AB 00
> (000.101138) can0 18ECF931 [8] 11 02 01 FF FF 00 AB 00
> (000.000685) can0 18EC31F9 [8] FF 08 FF FF FF 00 AB 00
>
> The same happens when setting byte 3 to 01:
> (000.000000) can0 18EC31F9 [8] 10 0A 00 02 02 00 AB 00
> (000.001077) can0 18ECF931 [8] 11 00 01 FF FF 00 AB 00
> (000.100910) can0 18ECF931 [8] 11 02 01 FF FF 00 AB 00
> (000.000657) can0 18EC31F9 [8] FF 08 FF FF FF 00 AB 00
>
> Setting it to 0 is disallowed as well and the transmission is cancelled
> immediatley with error 05 which is "Maximum retransmit request limit
> reached.":
> (000.000000) can0 18EC31F9 [8] 10 0A 00 02 02 00 AB 00
> (000.000941) can0 18ECF931 [8] 11 00 00 FF FF 00 AB 00
> (000.000645) can0 18EC31F9 [8] FF 05 FF FF FF 00 AB 00
>
> There is a check at the beggining of j1939_xtp_rx_cts_one for duplicate
> sequence numbers which targets byte 0, so the command type byte, and checks
> that it is not equal to the last command.
>
> if (session->last_cmd == dat[0]) {
> err = J1939_XTP_ABORT_DUP_SEQ;
> goto out_session_cancel;
> }
>
> This means it is impossible to handle two directly succeeding CTS which
> would be necessary to escape the hold....
>
> The easiest way to fix this would probably be to move the check for a hold
> message all the way to the top of j1939_xtp_rx_cts_one and if a hold message
> has been received just set the rx-timeout timer and then bail?
From a quick lock, it sounds plausible. Will you send a patch?
Hm... we needs tests, preferably in kernel source to avoid regressions.
would it be possible to implement is on top of kunit tests?
https://lore.kernel.org/all/20260420152228.581421-1-o.rempel@pengutronix.de/
It looks like there is more user space friendly testing used:
https://lore.kernel.org/all/20260419144600.GA4091724@chcpu16/
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
next prev parent reply other threads:[~2026-04-23 13:07 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-21 15:31 [PATCH] can: j1939: fix wrong rx timeout for CTS hold messages Alexander Hölzl
2026-04-23 3:50 ` Oleksij Rempel
2026-04-23 9:35 ` Hölzl, Alexander
2026-04-23 13:07 ` Oleksij Rempel [this message]
2026-04-23 13:34 ` Hölzl, Alexander
2026-05-06 12:54 ` Marc Kleine-Budde
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aeoZn2BIOzZyCWo_@pengutronix.de \
--to=o.rempel@pengutronix.de \
--cc=alexander.hoelzl@gmx.net \
--cc=kernel@pengutronix.de \
--cc=linux-can@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=robin@protonic.nl \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox