Re: [PATCH v3 1/2] can: Add Loongson CAN-FD controller driver

Linux CAN drivers development
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Binbin Zhou" <zhoubinbin@loongson.cn>
Cc: "Oleksij Rempel" <o.rempel@pengutronix.de>,
	linux-can@vger.kernel.org,
	"Marc Kleine-Budde" <mkl@pengutronix.de>,
	"Vincent Mailhol" <mailhol@kernel.org>,
	"Oliver Hartkopp" <socketcan@hartkopp.net>
Subject: Re: [PATCH v3 1/2] can: Add Loongson CAN-FD controller driver
Date: Tue, 30 Jun 2026 03:16:21 +0000	[thread overview]
Message-ID: <20260630031622.14F981F000E9@smtp.kernel.org> (raw)
In-Reply-To: <b80d271f63f6318a4060220e8107785bd147f1bd.1782787997.git.zhoubinbin@loongson.cn>

Thank you for your contribution! Sashiko AI review found 18 potential issue(s) to consider:
- [High] Missing `skb` check in `CAN_STATE_ERROR_ACTIVE` case leads to a NULL pointer dereference.
- [High] SKB allocation failure or invalid frame in RX path permanently stalls NAPI and corrupts RX FIFO.
- [High] NAPI poll list corruption due to calling `napi_complete_done` while returning `quota`.
- [High] Missing `napi_disable()` on error paths in `loongson_canfd_open()` leads to a kernel panic on subsequent open attempts.
- [High] Race condition in `loongson_canfd_start_xmit` causes unexpected packet drops due to un-reaped TX buffers.
- [High] Missing bounds check on the hardware-provided word count `REG_FRAME_META1_RWCNT` causes heap buffer overflow.
- [High] Unchecked return value of `device_property_read_u32` leaves `clk_rate` uninitialized.
- [High] Premature hardware restart on Bus-Off when auto-restart is configured.
- [High] Bitwise NOT on a 16-bit variable unintentionally unmasks all interrupts due to integer promotion.
- [High] Overwriting of `isr` variable causes lost error interrupts and permanent interrupt masking.
- [High] Using `regmap_update_bits` on command-style SET/CLR registers causes conflicting hardware commands.
- [High] Calling `kfree_skb()` with hardware interrupts disabled.
- [High] Potential infinite loop in TX interrupt handler due to repeated status reads of posted writes.
- [Medium] The driver advertises 'hardware timestamping' in commit message but does not implement it.
- [Medium] `ndev->stats.rx_errors` is modified concurrently from both hardirq and NAPI softirq contexts without locking.
- [Medium] Incorrect check for RTR frames causes invalid RX byte statistics.
- [Medium] Missing `cf->len8_dlc` assignment for Classic CAN frames.
- [Low] Syntax error due to variable declarations immediately following a switch case label.
--

commit e0d474f047900e47a73eaa207fc07dd8c35084bf
Author: Binbin Zhou <zhoubinbin@loongson.cn>
Subject: can: Add Loongson CAN-FD controller driver

This commit adds support for the CAN-FD controller integrated into 
Loongson-2K series SoCs. It implements NAPI for the RX path, TX buffer 
management, bus error reporting, and various timing configurations.

[Severity: Medium]
The commit message states the driver supports hardware timestamping, but is
this functionality actually implemented? Looking at the driver code, it sets
ethtool_op_get_ts_info but does not seem to extract the timestamp field from
the hardware or write to skb_hwtstamps in the receive path.

> diff --git a/drivers/net/can/loongson_canfd.c b/drivers/net/can/loongson_canfd.c
> new file mode 100644
> index 0000000000000..1c7381129c2b6
> --- /dev/null
> +++ b/drivers/net/can/loongson_canfd.c
[ ... ]
> @@ -891,37 +891,24 @@ static int loongson_canfd_chip_start(struct net_device *ndev)
[ ... ]
> +	/* Configure interrupts */
> +	int_ena = REG_INT_STAT_RBNEI | REG_INT_STAT_TXBHCI |
> +		  REG_INT_STAT_EWLI | REG_INT_STAT_FCSI;
> +
> +	/* Bus error reporting */
> +	if (priv->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING)
> +		int_ena |= REG_INT_STAT_ALI | REG_INT_STAT_BEI;
> +
> +	/* It's after reset, so there is no need to clear anything */
> +	regmap_write(priv->regmap, LOONGSON_CANFD_INT_MASK, ~int_ena);

[Severity: High]
Will this bitwise NOT operation unmask all interrupts on the controller?
Because int_ena is a 16-bit unsigned integer, the bitwise NOT causes integer
promotion to a 32-bit signed int, filling the upper 16 bits with 1s. Since
the upper bits of this register clear the mask, does this unintentionally
enable every possible interrupt?

> +	regmap_write(priv->regmap, LOONGSON_CANFD_INT_ENA, int_ena);
[ ... ]
> @@ -1048,46 +1048,22 @@ static netdev_tx_t loongson_canfd_start_xmit(struct sk_buff *skb, struct net_device *ndev)
[ ... ]
> +	for_each_clear_bit(i, &tx_brp, LOONGSON_CANFD_TXBUF_NUM) {
> +		u32 transmission_complete_mask = 0x3 << (i * 2);
> +
> +		if (!(tx_bs & transmission_complete_mask)) {
> +			buf_id = i;
> +			break;
> +		}
> +	}
> +
> +	if (!loongson_canfd_insert_frame(priv, cf, buf_id, can_is_canfd_skb(skb))) {
> +		netdev_err(ndev, "TXNF set but cannot insert frame into TXTB! HW Bug?");
> +		kfree_skb(skb);

[Severity: High]
Could a race condition here cause silent packet drops? The hardware clears
tx_brp immediately on completion, but tx_bs is cleared later in the ISR.
If all inactive buffers are awaiting a reap, the loop finishes without
finding a buffer, leaving buf_id at 0. This would cause the insert check to
fail, dropping the packet while incorrectly returning NETDEV_TX_OK.

[Severity: High]
Is it safe to call kfree_skb here? The tx_lock is held with
spinlock_irqsave which disables local interrupts. Calling kfree_skb with
hard interrupts disabled can lead to kernel warnings and potential deadlocks.
Should this use dev_kfree_skb_any instead?

> +		ndev->stats.tx_dropped++;
> +		return NETDEV_TX_OK;
> +	}
[ ... ]
> @@ -1105,37 +1105,25 @@ static void loongson_canfd_read_rx_frame(struct loongson_canfd_priv *priv, struct canfd_frame *cf,
[ ... ]
> +	/* Data Length Code */
> +	len = FIELD_GET(REG_FRAME_META1_DLC, meta1);
> +	if (len > 8) {
> +		if (meta1 & REG_FRAME_META1_FDF)
> +			len = wc << 2;
> +		else
> +			len = 8;
> +	}
> +
> +	cf->len = len;

[Severity: Medium]
Does this properly support the CAN_CTRLMODE_CC_LEN8_DLC feature? For Classic
CAN frames with a DLC greater than 8, the length is appropriately capped to
8, but the original DLC is not saved to cf->len8_dlc. This might break
userspace protocols that rely on the original DLC data.

> +	if (unlikely(len > wc * LOONGSON_CANFD_DW_BYTE))
> +		len = wc * LOONGSON_CANFD_DW_BYTE;
> +
> +	/* Data */
> +	for (i = 0; i < len; i += LOONGSON_CANFD_DW_BYTE)
> +		regmap_read(priv->regmap, LOONGSON_CANFD_RX_DATA, (u32 *)(cf->data + i));

[Severity: High]
Can a malformed hardware word count cause a heap buffer overflow? The len
variable is checked against the hardware-provided word count, but not against
CANFD_MAX_DLEN. If the hardware yields an invalid read word count, the length
could exceed the 64-byte payload limit, causing out of bounds memory writes
to the skb data.

[ ... ]
> @@ -1164,28 +1164,22 @@ static int loongson_canfd_rx(struct net_device *ndev)
[ ... ]
> +	/* Flexible Data-rate Format */
> +	if (meta1 & REG_FRAME_META1_FDF)
> +		skb = alloc_canfd_skb(ndev, &cf);
> +	else
> +		skb = alloc_can_skb(ndev, (struct can_frame **)&cf);
> +	if (unlikely(!skb))
> +		return 0;

[Severity: High]
If the skb allocation fails, does returning 0 early permanently stall the
NAPI loop? Since the rest of the data payload is not read from the hardware
FIFO, the RX stream might desynchronize. Also, returning 0 here causes the
caller to exit its loop without completing NAPI, potentially leaving it
permanently in a scheduled state.

> +
> +	loongson_canfd_read_rx_frame(priv, cf, meta0, meta1);
> +
> +	if (cf->can_id != CAN_RTR_FLAG)
> +		stats->rx_bytes += cf->len;

[Severity: Medium]
Is this RTR check accurate? The can_id field contains both the CAN ID and
the flags, so a direct inequality comparison to CAN_RTR_FLAG might evaluate
true for almost all RTR frames, erroneously inflating the RX byte statistics
for frames with no actual payload.

[ ... ]
> @@ -1203,32 +1203,26 @@ static enum can_state loongson_canfd_read_fault_state(struct net_device *ndev)
[ ... ]
> +	switch (FIELD_GET(REG_FSTAT_MASK, fstat)) {
> +	case REG_FSTAT_ERA:
> +		u32 ewl, erl, rec_tec, max_tec;

[Severity: Low]
Will this compile on strict C11 compilers? The C11 standard does not allow
variable declarations immediately following a switch case label.

[ ... ]
> @@ -1278,82 +1278,28 @@ static void loongson_canfd_err_interrupt(struct net_device *ndev, u32 isr)
[ ... ]
> +	if ((isr & REG_INT_STAT_FCSI) || (isr & REG_INT_STAT_EWLI)) {
> +		netdev_info(ndev, "state changes from %s to %s\n",
> +			    can_get_state_str(priv->can.state), can_get_state_str(state));
> +
> +		if (priv->can.state == state)
> +			netdev_warn(ndev, "cur and pre state is the same!(miss intr?)\n");
> +
> +		isr = REG_INT_STAT_FCSI | REG_INT_STAT_EWLI;

[Severity: High]
Does overwriting the isr variable here permanently disable other interrupts?
If arbitration lost or bus error flags are pending, this assignment destroys
them. Later in the function, isr is shifted to unmask interrupts, meaning the
erased flags will never be unmasked again.

> +		priv->can.state = state;
> +		switch (state) {
> +		case CAN_STATE_BUS_OFF:
> +			priv->can.can_stats.bus_off++;
> +			if (priv->can.restart_ms)
> +				regmap_write(priv->regmap, LOONGSON_CANFD_CMD,
> +					     REG_CMD_ERCRST);

[Severity: High]
Does this prematurely restart the hardware upon a bus-off event? Instantly
writing the error reset command clears the hardware error state and rejoins
the bus immediately, ignoring the restart delay contract enforced by the core
can_bus_off timer.

[ ... ]
> +		case CAN_STATE_ERROR_ACTIVE:
> +			cf->can_id |= CAN_ERR_CNT;
> +			cf->data[1] = CAN_ERR_CRTL_ACTIVE;
> +			cf->data[6] = bec.txerr;

[Severity: High]
Could this lead to a null pointer dereference? The alloc_can_err_skb call
earlier in the function can fail and leave cf uninitialized. While the other
switch cases check if skb is valid before modifying cf data, this case does
not, which would crash if memory pressure prevents allocation.

[ ... ]
> +	/* Check for Bus Error interrupt */
> +	if (isr & REG_INT_STAT_BEI) {
> +		isr = REG_INT_STAT_BEI;
> +		netdev_err(ndev, "Bus Error interrupt\n");
> +		priv->can.can_stats.bus_error++;
> +		stats->rx_errors++;

[Severity: Medium]
Is it safe to increment stats->rx_errors without locking here? This counter
is modified from hardirq context here and concurrently from softirq context
in the NAPI poll routine, which could cause lost increments due to data races
on 32-bit systems.

[ ... ]
> @@ -1391,37 +1391,25 @@ static int loongson_canfd_rx_napi(struct napi_struct *napi, int quota)
[ ... ]
> +	/* Check for RX FIFO Overflow */
> +	regmap_read(priv->regmap, LOONGSON_CANFD_STAT, &sts);
> +	if (sts & REG_STAT_DOR) {
> +		struct net_device_stats *stats = &ndev->stats;
> +		struct can_frame *cf;
> +		struct sk_buff *skb;
> +
> +		netdev_info(ndev, "Loongson canfd RX overflow\n");
> +		stats->rx_over_errors++;
> +		stats->rx_errors++;

[Severity: Medium]
As noted above, stats->rx_errors is updated concurrently in both softirq and
hardirq contexts without locking.

[ ... ]
> +	if (rxbuf_is_empty && ret != 0) {
> +		if (napi_complete_done(napi, work_done)) {

[Severity: High]
Should this check if work_done is less than quota? If exactly quota frames
are processed and the hardware buffer becomes empty, this logic calls
napi_complete_done and then returns quota to the networking core. Returning
the full quota instructs the core to reschedule the instance, but the state
has already been manipulated, which can corrupt the NAPI poll list.

[ ... ]
> @@ -1425,33 +1425,18 @@ static void loongson_canfd_tx_interrupt(struct net_device *ndev)
[ ... ]
> +	while ((buf_id = loongson_canfd_get_txbuf_id(priv, &sts)) < LOONGSON_CANFD_TXBUF_NUM) {
> +		switch (sts) {
> +		case TX_BS_VALID:
> +			stats->tx_bytes += can_get_echo_skb(ndev, buf_id, NULL);
> +			stats->tx_packets++;
> +			break;

[Severity: High]
Could this result in an inescapable infinite loop? When the driver writes a
clear command to the posted MMIO register at the end of the loop, it
immediately reads the status register back in the while condition. Because
the hardware takes cycles to clear the bit, the old status is read, causing
the driver to repeatedly process the same buffer and spin in the ISR.

> +		case TX_BS_FAIL:
> +			u32 cnt;

[Severity: Low]
Similar to the fault state read function, this violates C11 syntax by
declaring a variable immediately following a switch case label.

[ ... ]
> @@ -1584,20 +1584,14 @@ static void loongson_canfd_chip_stop(struct net_device *ndev)
>  {
>  	struct loongson_canfd_priv *priv = netdev_priv(ndev);
>  
>  	/* Disable interrupts and disable CAN */
>  	regmap_update_bits(priv->regmap, LOONGSON_CANFD_INT_ENA, REG_INT_ENA_CLR, REG_INT_ENA_CLR);
>  	regmap_update_bits(priv->regmap, LOONGSON_CANFD_INT_MASK,
>  			   REG_INT_MASK_SET, REG_INT_MASK_SET);

[Severity: High]
Are these regmap_update_bits calls safe for W1S/W1C command registers? This
function reads the current 1s from the lower half of INT_ENA and writes them
back along with the clear mask in the upper half, which commands the hardware
to simultaneously set and clear the exact same interrupts.

[ ... ]
> @@ -1604,45 +1604,22 @@ static int loongson_canfd_open(struct net_device *ndev)
[ ... ]
> +err_chip_start:
> +	free_irq(ndev->irq, ndev);
> +err_irq:
> +	close_candev(ndev);
> +	return ret;
> +}

[Severity: High]
Are we missing a napi_disable call on these error paths? If the driver fails
to open, the candev is closed but NAPI is left enabled. A subsequent attempt
to bring the interface up will call napi_enable on an already enabled
instance, triggering a BUG_ON in the core.

[ ... ]
> @@ -1690,46 +1690,24 @@ static int loongson_canfd_probe(struct platform_device *pdev)
[ ... ]
> +	regmap = devm_regmap_init_mmio(dev, base, &loongson_cangfd_regmap);
> +	if (IS_ERR(regmap))
> +		return PTR_ERR(regmap);
> +
> +	device_property_read_u32(dev, "clock-frequency", &clk_rate);
> +
> +	irq = platform_get_irq(pdev, 0);

[Severity: High]
Should the return value of device_property_read_u32 be checked? If the
property is missing, clk_rate retains uninitialized stack garbage which is
then assigned to priv->can.clock.freq, leading to unpredictable hardware
timing.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1782787997.git.zhoubinbin@loongson.cn?part=1

next prev parent reply	other threads:[~2026-06-30  3:16 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-30  3:01 [PATCH v3 0/2] Add Loongson CAN-FD controller driver Binbin Zhou
2026-06-30  3:01 ` [PATCH v3 1/2] can: " Binbin Zhou
2026-06-30  3:16   ` sashiko-bot [this message]
2026-06-30  6:51     ` Vincent Mailhol
2026-06-30  3:01 ` [PATCH v3 2/2] can: loongson_canfd: Add RXDMA support Binbin Zhou
2026-06-30  3:11   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260630031622.14F981F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=linux-can@vger.kernel.org \
    --cc=mailhol@kernel.org \
    --cc=mkl@pengutronix.de \
    --cc=o.rempel@pengutronix.de \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=socketcan@hartkopp.net \
    --cc=zhoubinbin@loongson.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox