From: Jakub Kicinski <kuba@kernel.org>
To: illusion.wang@nebula-matrix.com
Cc: Jakub Kicinski <kuba@kernel.org>,
dimon.zhao@nebula-matrix.com, alvin.wang@nebula-matrix.com,
sam.chen@nebula-matrix.com, netdev@vger.kernel.org,
andrew+netdev@lunn.ch, corbet@lwn.net, horms@kernel.org,
linux-doc@vger.kernel.org, pabeni@redhat.com,
vadim.fedorenko@linux.dev, lukas.bulwahn@redhat.com,
edumazet@google.com, enelsonmoore@gmail.com,
skhan@linuxfoundation.org, hkallweit1@gmail.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v14 net-next 05/11] net/nebula-matrix: add channel layer
Date: Fri, 15 May 2026 18:17:13 -0700
Message-ID: <20260516011714.1863858-1-kuba@kernel.org>
In-Reply-To: <20260513011649.4404-6-illusion.wang@nebula-matrix.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/nebula-matrix: add channel layer
The channel management layer provides a structured way to handle
communication between different components and drivers. It covers message
registration, message sending and acknowledgment, and queue management over
mailbox channels.
> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
> index c1b724a8b92dd..8b178b8f6de3f 100644
> --- a/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
[ ... ]
> +static int nbl_chan_teardown_queue(struct nbl_channel_mgt *chan_mgt,
> + u8 chan_type)
> +{
> + struct nbl_chan_info *chan_info = chan_mgt->chan_info[chan_type];
> +
> + cancel_work_sync(chan_info->clean_task);
Could this pointer be uninitialized? If nbl_chan_setup_queue() fails
early (for example, during buffer allocation), it calls this teardown
function, but the clean_task appears to only be registered externally later.
Would this pass a NULL pointer to cancel_work_sync() and cause a panic?
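If so, a NULL check before the cancel (or deferring the cancel to whoever
registered the work) would avoid it. A minimal sketch, assuming clean_task
is a pointer that stays NULL until the handler is registered:

	if (chan_info->clean_task)
		cancel_work_sync(chan_info->clean_task);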
[ ... ]
> +static int nbl_chan_update_txqueue(struct nbl_channel_mgt *chan_mgt,
> + struct nbl_chan_info *chan_info,
> + struct nbl_chan_tx_param *param)
> +{
[ ... ]
> + if (param->arg_len > NBL_CHAN_TX_DESC_EMBEDDED_DATA_LEN) {
> + memcpy(tx_buf->va, param->arg, param->arg_len);
> + tx_desc->buf_addr = cpu_to_le64(tx_buf->pa);
> + tx_desc->buf_len = cpu_to_le16(param->arg_len);
> + tx_desc->data_len = 0;
> + } else {
> + memcpy(tx_desc->data, param->arg, param->arg_len);
> + tx_desc->buf_len = 0;
> + tx_desc->data_len = cpu_to_le16(param->arg_len);
> + }
> + tx_desc->flags = cpu_to_le16(BIT(NBL_CHAN_TX_DESC_AVAIL));
Should there be a dma_wmb() before setting the AVAIL flag? Without a memory
barrier here, the CPU might reorder the flag write before the payload data
is fully committed, causing the hardware DMA engine to fetch an incomplete
descriptor.
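If the descriptor and payload live in DMA-coherent memory, the usual pattern
is a dma_wmb() immediately before the flag store. A sketch of what I'd
expect here:

	/* make the payload and descriptor fields visible to the device
	 * before publishing the AVAIL flag
	 */
	dma_wmb();
	tx_desc->flags = cpu_to_le16(BIT(NBL_CHAN_TX_DESC_AVAIL));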
[ ... ]
> +static int nbl_chan_kick_tx_ring(struct nbl_channel_mgt *chan_mgt,
> + struct nbl_chan_info *chan_info)
> +{
[ ... ]
> + while (retry_count < max_retries) {
> + if (le16_to_cpu(READ_ONCE(tx_desc->flags)) &
> + BIT(NBL_CHAN_TX_DESC_USED)) {
> + dma_rmb();
> + break;
> + }
> +
> + if (!(retry_count % NBL_CHAN_TX_REKICK_WAIT_TIMES))
> + NBL_UPDATE_QUEUE_TAIL_PTR(chan_info, hw_ops, chan_mgt,
> + txq->tail_ptr, NBL_MB_TX_QID);
> + retry_count++;
> +
> + usleep_range(NBL_CHAN_TX_WAIT_US, NBL_CHAN_TX_WAIT_US_MAX);
> + }
> + if (retry_count >= max_retries) {
> + dev_err(dev, "chan send message type: %d timeout\n",
> + le16_to_cpu(READ_ONCE(tx_desc->msg_type)));
> + return -ETIMEDOUT;
> + }
> + txq->next_to_clean = txq->next_to_use;
If a timeout happens, this returns -ETIMEDOUT without advancing
txq->next_to_clean. Won't this permanently wedge the TX queue, because the
next send operation will start polling the exact same stuck descriptor and
timeout again?
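If so, some form of recovery seems needed before returning. A rough sketch
only; whether skipping the descriptor is enough, or a full queue reset is
required, depends on the hardware contract:

	if (retry_count >= max_retries) {
		dev_err(dev, "chan send message type: %d timeout\n",
			le16_to_cpu(READ_ONCE(tx_desc->msg_type)));
		/* drop the stuck descriptor so the next send does not
		 * poll the same slot again
		 */
		txq->next_to_clean = txq->next_to_use;
		return -ETIMEDOUT;
	}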
[ ... ]
> +static void nbl_chan_recv_ack_msg(void *priv, u16 srcid, u16 msgid, void *data,
> + u32 data_len)
> +{
[ ... ]
> + if (ack_datalen > MAX_ACK_DATA_SIZE) {
> + dev_err(dev, "Invalid loc=%d\n", ack_msgid.info.loc);
> + return;
> + }
> + ack_msgid.id = *(u16 *)(payload + NBL_CHAN_MSG_ID_POS);
Doesn't the error log above print a value that has not been populated yet?
The ack_msgid union is only filled in on the line after the log, so the
message will always print the union's initial contents rather than the real
loc field.
Also, is it safe to read the msgid by casting the payload pointer like this?
Since the sender stores a 16-bit msgid into a 32-bit array natively, casting
to a u16 pointer will read the upper 16 bits of the 32-bit word on big-endian
architectures. That would result in a message ID of 0 and break the ACK
mechanism.
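Assuming the wire format can be made explicitly little-endian on both sides
(the sender would use the matching put_unaligned_le16()), something along
these lines would address both points, with the assignment moved ahead of
the log:

	ack_msgid.id = get_unaligned_le16(payload + NBL_CHAN_MSG_ID_POS);
	if (ack_datalen > MAX_ACK_DATA_SIZE) {
		dev_err(dev, "Invalid loc=%d\n", ack_msgid.info.loc);
		return;
	}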
[ ... ]
> + wait_head->ack_err = *(payload + NBL_CHAN_ACK_RET_POS);
> +
> + copy_len = min_t(u32, wait_head->ack_data_len, ack_datalen);
> + if (wait_head->ack_err >= 0 && copy_len > 0) {
> + if (NBL_CHAN_ACK_HEAD_LEN * sizeof(u32) + copy_len > data_len) {
> + dev_err(dev, "ACK payload overflow\n");
> + return;
> + }
> + memcpy((char *)wait_head->ack_data,
> + payload + NBL_CHAN_ACK_HEAD_LEN, copy_len);
> + wait_head->ack_data_len = (u16)copy_len;
> + } else {
> + wait_head->ack_data_len = 0;
> + }
Can this memcpy() race with the timeout path in nbl_chan_send_msg()? The
receiver checks wait_head->status earlier without holding locks. If
nbl_chan_send_msg() times out immediately after that check, it sets
wait_head->ack_data to NULL and returns, freeing its local buffer. The
receiver might then resume and execute this memcpy into a NULL pointer or a
freed stack frame.
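If that race is real, the copy probably needs to happen under the same lock
the sender takes before it invalidates ack_data. A sketch only; ack_lock is
hypothetical, since the surrounding structures aren't visible in this hunk:

	spin_lock(&chan_info->ack_lock);
	if (atomic_read(&wait_head->status) != NBL_MBX_STATUS_IDLE &&
	    wait_head->ack_data) {
		memcpy(wait_head->ack_data,
		       payload + NBL_CHAN_ACK_HEAD_LEN, copy_len);
		wait_head->ack_data_len = (u16)copy_len;
	}
	spin_unlock(&chan_info->ack_lock);

with the timeout path in nbl_chan_send_msg() clearing ack_data under the
same lock before it returns.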
[ ... ]
> +static void nbl_chan_recv_msg(struct nbl_channel_mgt *chan_mgt, void *data)
> +{
[ ... ]
> + if (tx_desc->data_len) {
> + payload_len = le16_to_cpu(tx_desc->data_len);
> + if (payload_len > NBL_CHAN_TX_DESC_DATA_LEN) {
> + dev_err(dev, "data_len=%u exceeds embedded buffer size=%u\n",
> + payload_len,
> + NBL_CHAN_TX_DESC_DATA_LEN);
> + return;
> + }
> + payload = tx_desc->data;
> + } else {
> + payload_len = le16_to_cpu(tx_desc->buf_len);
> + if (payload_len > NBL_CHAN_BUF_LEN) {
> + dev_err(dev, "buf_len=%u exceeds external buffer size=%u\n",
> + payload_len,
> + NBL_CHAN_BUF_LEN);
> + return;
> + }
> + payload = tx_desc + 1;
> + }
Could this result in an out-of-bounds access? The external buffer payload
pointer starts at tx_desc + 1 (an offset of sizeof(struct nbl_chan_tx_desc),
which is 64 bytes). If payload_len is between 4033 and 4096,
payload + payload_len will exceed the total NBL_CHAN_BUF_LEN of 4096 bytes,
overflowing the buffer boundary.
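If the descriptor and its payload really share the same NBL_CHAN_BUF_LEN
buffer, the bound needs to account for the descriptor size. A sketch of the
check I'd expect:

	if (payload_len > NBL_CHAN_BUF_LEN - sizeof(*tx_desc)) {
		dev_err(dev, "buf_len=%u exceeds available buffer space\n",
			payload_len);
		return;
	}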
[ ... ]
> +static int nbl_chan_send_msg(struct nbl_channel_mgt *chan_mgt,
> + struct nbl_chan_send_info *chan_send)
> +{
[ ... ]
> + /*polling wait mailbox ack*/
> + mutex_lock(&chan_info->txq_lock);
> + while (i--) {
> + nbl_chan_clean_queue(chan_mgt, chan_info);
> +
> + if (wait_head->acked) {
> + chan_send->ack_len = wait_head->ack_data_len;
> + atomic_set(&wait_head->status, NBL_MBX_STATUS_IDLE);
> + ret = READ_ONCE(wait_head->ack_err);
> + mutex_unlock(&chan_info->txq_lock);
> + return ret;
> + }
Are we missing a read memory barrier here? The receiver uses a wmb() before
setting wait_head->acked = 1, but the polling loop lacks a corresponding rmb()
after observing the flag. The CPU could reorder the reads, causing this code
to read stale values for ack_data_len or ack_err.
Also, is it possible for this polling loop to self-deadlock? The loop holds
txq_lock while calling nbl_chan_clean_queue(), which processes incoming
messages. If an incoming message requires an ACK, its handler will call
nbl_chan_send_ack(), which eventually calls nbl_chan_send_msg() again and
will attempt to reacquire the non-recursive txq_lock.
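For the first point, something like this (a sketch; the exact barrier
depends on which write barrier the receiver actually uses) would make the
pairing explicit:

	if (READ_ONCE(wait_head->acked)) {
		/* pairs with the receiver's write barrier before it
		 * sets acked = 1
		 */
		smp_rmb();
		chan_send->ack_len = wait_head->ack_data_len;
		atomic_set(&wait_head->status, NBL_MBX_STATUS_IDLE);
		ret = READ_ONCE(wait_head->ack_err);
		mutex_unlock(&chan_info->txq_lock);
		return ret;
	}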
[ ... ]
> +static int nbl_chan_send_ack(struct nbl_channel_mgt *chan_mgt,
> + struct nbl_chan_ack_info *chan_ack)
> +{
> + u32 len = NBL_CHAN_ACK_HEAD_LEN * sizeof(u32) + chan_ack->data_len;
Does this calculation have an integer overflow issue? A sufficiently large
chan_ack->data_len could cause len to wrap around, leading to a tiny heap
allocation. The subsequent memcpy() would then write past the end of the buffer.
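A bounds check before computing the allocation size would cover it;
check_add_overflow() from <linux/overflow.h> also catches the wraparound
directly. A sketch, assuming NBL_CHAN_BUF_LEN is the natural upper bound for
a single message and data_len is a u32:

	u32 len;

	if (chan_ack->data_len > NBL_CHAN_BUF_LEN ||
	    check_add_overflow((u32)(NBL_CHAN_ACK_HEAD_LEN * sizeof(u32)),
			       chan_ack->data_len, &len))
		return -EINVAL;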
[ ... ]
> +int nbl_chan_init_common(struct nbl_adapter *adap)
> +{
[ ... ]
> + chan_ops_tbl = nbl_chan_setup_ops(dev, chan_mgt);
> + if (IS_ERR(chan_ops_tbl)) {
> + ret = PTR_ERR(chan_ops_tbl);
> + goto exit;
> + }
> + adap->intf.channel_ops_tbl = chan_ops_tbl;
> +
> + return 0;
> +
> +exit:
> + if (chan_mgt) {
> + nbl_chan_remove_msg_handler(chan_mgt);
> + adap->core.chan_mgt = NULL;
> + }
Will this cause a panic if nbl_chan_setup_chan_mgt() returns an ERR_PTR?
The condition if (chan_mgt) will evaluate to true for an error pointer, and
nbl_chan_remove_msg_handler() will dereference it. A check like
!IS_ERR_OR_NULL(chan_mgt) might be safer here.
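Roughly:

	if (!IS_ERR_OR_NULL(chan_mgt)) {
		nbl_chan_remove_msg_handler(chan_mgt);
		adap->core.chan_mgt = NULL;
	}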
> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c
> new file mode 100644
> index 0000000000000..79c1d5f5aacf9
> --- /dev/null
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c
[ ... ]
> +void nbl_common_destroy_wq(void)
> +{
> + destroy_workqueue(wq_mgt->ctrl_dev_wq);
> + kfree(wq_mgt);
> + wq_mgt = NULL;
> +}
Is it possible that wq_mgt is NULL here? If nbl_common_create_wq() fails to
allocate the workqueue, it frees wq_mgt and sets it to NULL. A subsequent
cleanup call to nbl_common_destroy_wq() would unconditionally dereference it.
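A guard at the top would make the teardown idempotent:

	if (!wq_mgt)
		return;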
[ ... ]
> +int nbl_common_alloc_hash_node(struct nbl_hash_tbl_mgt *tbl_mgt, void *key,
> + void *data, void **out_data)
> +{
[ ... ]
> + hash_val = nbl_common_calc_hash_key(key, key_size,
> + tbl_mgt->tbl_key.bucket_size);
> +
> + hlist_add_head(&hash_node->node, tbl_mgt->hash + hash_val);
> + tbl_mgt->node_num++;
> + if (out_data)
> + *out_data = hash_node->data;
> +
> + return 0;
[ ... ]
> +void *nbl_common_get_hash_node(struct nbl_hash_tbl_mgt *tbl_mgt, void *key)
> +{
[ ... ]
> + hlist_for_each_entry(hash_node, head, node)
> + if (!memcmp(hash_node->key, key, key_size)) {
> + data = hash_node->data;
> + break;
> + }
> +
> + return data;
> +}
Is there a synchronization issue with this hash table implementation? The
code modifies and traverses the hlist without any locking or RCU protection.
Concurrent message registration and reception could corrupt the linked list.
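If these paths can indeed run concurrently, the table likely needs either a
lock of its own or RCU on the lookup side. A sketch with a hypothetical
spinlock embedded in nbl_hash_tbl_mgt (RCU may be the better fit if lookups
dominate):

	spin_lock(&tbl_mgt->lock);
	hlist_add_head(&hash_node->node, tbl_mgt->hash + hash_val);
	tbl_mgt->node_num++;
	spin_unlock(&tbl_mgt->lock);

with the same lock taken around the hlist_for_each_entry() walk in
nbl_common_get_hash_node().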