RE: [External Mail] Re: [PATCH v2 3/7] net: wwan: t9xx: Add control DMA interface

Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed

From: "Wu. JackBB (GSM)" <JackBB_Wu@compal.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Loic Poulain <loic.poulain@oss.qualcomm.com>,
	Sergey Ryazanov <ryazanov.s.a@gmail.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	"Andrew Lunn" <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	Wen-Zhi Huang <wen-zhi.huang@mediatek.com>,
	Shi-Wei Yeh <shi-wei.yeh@mediatek.com>,
	"Minano Tseng" <Minano.tseng@mediatek.com>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	AngeloGioacchino Del Regno
	<angelogioacchino.delregno@collabora.com>,
	"Simon Horman" <horms@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-mediatek@lists.infradead.org"
	<linux-mediatek@lists.infradead.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"wojackbb@gmail.com" <wojackbb@gmail.com>
Subject: RE: [External Mail] Re: [PATCH v2 3/7] net: wwan: t9xx: Add control DMA interface
Date: Mon, 15 Jun 2026 08:31:31 +0000	[thread overview]
Message-ID: <35ba38814c9948a99152f9f93472f254@compal.com> (raw)
In-Reply-To: <20260613173018.7a0581f1@kernel.org>

Hi Jakub,

Thank you for your comment.

On Wed, 10 Jun 2026 18:41:06 +0800 Jack Wu via B4 Relay wrote:
> Transient build warnings:
>
> +../drivers/net/wwan/t9xx/pcie/mtk_pci_drv_m9xx.c:52:30: warning: symbol 'mtk_dev_cfg_0900' was not declared. Should it be static?
> +../drivers/net/wwan/t9xx/pcie/mtk_ctrl_cfg_m9xx.c:19:22: warning: symbol 'mtk_ctrl_info_m9xx' was not declared. Should it be static?
> +../drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.c:33:22: warning: symbol 'mtk_cldma_regs_m9xx' was not declared. Should it be static?
> +../drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.c:166:22: warning: symbol 'cldma_drv_ops_m9xx' was not declared. Should it be static?

Will fix in v3. Moved extern declarations into shared headers so each
defining .c file includes its own declaration.

> please also see all the AI code comments at:
> https://sashiko.dev/#/patchset/20260610-t9xx_driver_v1-v2-3-c65addf23b3f@compal.com

Thank you. We have reviewed the AI comments. Below are our
responses for the items under this patch (patch 3):

Q1: Is the reference count for the skb incremented correctly here? If the port
is non-blocking, the caller might immediately drop the reference. Without an
extra reference taken before queueing, the skb could be freed while it still
resides in the queue, resulting in a use-after-free when the worker thread
later accesses the list.

  No use-after-free here. The skb is allocated by the port layer
  (mtk_port_send_data / mtk_port_common_write) and ownership is
  transferred to submit_skb — the caller does not free the skb after
  a successful submission. On error returns (-EINVAL, -EIO, -EAGAIN),
  the skb is NOT queued and the caller handles cleanup. The skb_queue
  operations (skb_queue_head/skb_queue_tail) do not require an extra
  reference because the queue becomes the sole owner. This is the
  standard kernel pattern — skb ownership transfers to the queue, not
  shared between caller and queue.

Q2: Could there be a race condition between setting the virtual address and the
hardware ownership flag? If the worker checks the condition between these two
writes, it might erroneously detect completion and free the buffer before the
transfer even starts.

  No race. The tx_done_work checks:
    if (!req->data_vm_addr || (gpd_flags & CLDMA_GPD_FLAG_HWO))
        break;
  It only proceeds when data_vm_addr is set AND HWO is cleared.
  In submit_tx, the write order is:
    1. req->data_vm_addr = skb->data;     (set non-NULL)
    2. wmb();
    3. gpd_flags |= CLDMA_GPD_FLAG_HWO;   (set HWO)
  After step 1, data_vm_addr is non-NULL but HWO is set (step 3), so
  tx_done_work sees HWO=1 and breaks. It will only process this req
  after the hardware clears HWO upon DMA completion. The wmb() ensures
  the metadata writes are visible before HWO is set, and the rmb() in
  tx_done_work ensures HWO is read before the metadata.

Q3: Does this permanently halt the receive queue if a memory allocation fails
during refill? If the skb allocation fails earlier, the worker exits here
without advancing the free index, which seems to leave the queue stuck
indefinitely on all subsequent executions.

  The concern is partially valid. If mtk_cldma_reload_rx_skb() fails
  (skb alloc failure), the code jumps to "out" which unmasks the
  interrupt. On the next hardware interrupt, rx_done_work runs again
  and retries from the same free_idx. If req->skb is NULL (because
  reload failed and rx_done set req->skb = NULL after dispatch), the
  code hits the "!req->skb" error check and goes to "out" — but "out"
  now correctly unmasks the interrupt (fixed in v2), so the next
  interrupt will retry. This creates a retry loop that depends on
  memory becoming available. Under sustained memory pressure, the RX
  queue stalls but recovers once memory is freed. This is acceptable
  behavior for a control plane interface with small, infrequent
  messages.

Q4: Does this loop safely handle fatal errors from the submission function? It
appears that on errors other than EAGAIN, the loop breaks without unlinking
the skb. The worker loop would then peek the exact same skb on the next
iteration, causing an infinite busy loop that locks up the processor.

  The code shown (with kref_put) is from patch 4, not patch 3.
  We will fix the issue in patch 4 v3.

Q5: Is a read memory barrier needed after checking the hardware ownership flag?
Without a barrier, the processor might speculatively read the payload length
and data in mtk_cldma_rx_skb_adjust before confirming ownership, leading to
potential data corruption.

  The GPD descriptors are allocated from a coherent DMA pool
  (dma_pool_zalloc), so CPU cache coherency is guaranteed by the DMA
  mapping. A dma_rmb() is not needed here as coherent memory ensures
  the CPU always observes the latest values written by the device
  without reordering.

Q6: Should the return value of the queue stop function be checked before tearing
down DMA memory? If the hardware fails to stop within the timeout, proceeding
to unmap the memory might result in the active hardware writing to freed
kernel memory.

  The concern is valid in theory but not actionable in practice.
  cldma_stop_queue() polls the hardware for up to 10 iterations with
  usleep. If it times out, the hardware is in a broken state (link
  error or device hang). At this point the teardown is part of
  device removal or error recovery — there is no safe alternative to
  proceeding with cleanup. Skipping the teardown would leak DMA
  memory permanently. The synchronize_irq() + flush_work() after
  stop_queue ensures no software handlers are in-flight. If the
  hardware is truly stuck, the PCIe link is likely already down
  (LINK_ERROR_VAL check), making further MMIO access no-ops.

Q7: Does this leak the radix tree nodes and their contents if the array
allocation fails? Performing memory allocations in a teardown path might
permanently leak memory during system memory pressure.

  We will fix in v3 by replacing kcalloc +
  radix_tree_gang_lookup with radix_tree_for_each_slot() iteration,
  eliminating the allocation in the teardown path.

Regarding AI comments for other patches in the series — should we
respond to those under each respective patch thread?

Thanks.

================================================================================================================================================================
This message may contain information which is private, privileged or confidential of Compal Electronics, Inc. If you are not the intended recipient of this message, please notify the sender and destroy/delete the message. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information, by persons or entities other than the intended recipient is prohibited.
================================================================================================================================================================

next prev parent reply	other threads:[~2026-06-15  8:32 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-10 10:41 [PATCH v2 0/7] net: wwan: t9xx: Add MediaTek T9XX WWAN driver Jack Wu via B4 Relay
2026-06-10 10:41 ` [PATCH v2 1/7] net: wwan: t9xx: Add PCIe core Jack Wu via B4 Relay
2026-06-10 10:41 ` [PATCH v2 2/7] net: wwan: t9xx: Add control plane transaction layer Jack Wu via B4 Relay
2026-06-10 10:41 ` [PATCH v2 3/7] net: wwan: t9xx: Add control DMA interface Jack Wu via B4 Relay
2026-06-14  0:30   ` Jakub Kicinski
2026-06-15  8:31     ` Wu. JackBB (GSM) [this message]
2026-06-10 10:41 ` [PATCH v2 4/7] net: wwan: t9xx: Add control port Jack Wu via B4 Relay
2026-06-10 10:41 ` [PATCH v2 5/7] net: wwan: t9xx: Add FSM thread Jack Wu via B4 Relay
2026-06-10 10:41 ` [PATCH v2 6/7] net: wwan: t9xx: Add AT & MBIM WWAN ports Jack Wu via B4 Relay
2026-06-10 10:41 ` [PATCH v2 7/7] net: wwan: t9xx: Add maintainers entry Jack Wu via B4 Relay

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35ba38814c9948a99152f9f93472f254@compal.com \
    --to=jackbb_wu@compal.com \
    --cc=Minano.tseng@mediatek.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=angelogioacchino.delregno@collabora.com \
    --cc=corbet@lwn.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=johannes@sipsolutions.net \
    --cc=kuba@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=loic.poulain@oss.qualcomm.com \
    --cc=matthias.bgg@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=ryazanov.s.a@gmail.com \
    --cc=shi-wei.yeh@mediatek.com \
    --cc=skhan@linuxfoundation.org \
    --cc=wen-zhi.huang@mediatek.com \
    --cc=wojackbb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox