From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
linux-arm-kernel@lists.infradead.org,
linux-mediatek@lists.infradead.org, netdev@vger.kernel.org,
Madhur Agrawal <Madhur.Agrawal@airoha.com>
Subject: Re: [PATCH net] net: airoha: Add missing cleanup bits in airoha_qdma_cleanup_rx_queue()
Date: Tue, 31 Mar 2026 13:01:10 +0200 [thread overview]
Message-ID: <acupditAioiehGSj@lore-desk> (raw)
In-Reply-To: <20260330172857.0c94685d@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 4147 bytes --]
> On Fri, 27 Mar 2026 10:48:21 +0100 Lorenzo Bianconi wrote:
> > In order to properly cleanup hw rx QDMA queues and bring the device to
> > the initial state, reset rx DMA queue head/tail index. Moreover, reset
> > queued DMA descriptor fields.
> >
> > Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
> > Tested-by: Madhur Agrawal <Madhur.Agrawal@airoha.com>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>
> Take a look at sashiko, please:
> https://sashiko.dev/#/patchset/20260327-airoha_qdma_cleanup_rx_queue-fix-v1-1-369d6ab1511a@kernel.org
>
> Looks somewhat orthogonal to the current patch but probably worth
> fixing.
Hi Jakub,
thx for pointing me to the sashiko's issues.
1- Could this code execute while the interface is still administratively up and
the hardware DMA engines are actively receiving packets?
Looking at the teardown paths, airoha_hw_cleanup() is called before
unregister_netdev(). Unmapping buffers and zeroing descriptor addresses
while the hardware might still be actively writing to them could cause
physical memory corruption and IOMMU faults.
Should unregister_netdev() (which quiesces the DMA via ndo_stop) be called
before airoha_hw_cleanup() to avoid this?
I think the issue described above is already fixed in the following commit
available in net-next:
commit b1c803d5c8167026791abfaed96fd3e6a1fcd750
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Sat Mar 21 15:41:44 2026 +0100
net: airoha: Rework the code flow in airoha_remove() and in airoha_probe() error path
As suggested by Simon in [0], rework the code flow in airoha_remove()
and in the airoha_probe() error path in order to rely on a more common
approach un-registering configured net-devices first and destroying the
hw resources at the end of the code.
Introduce airoha_qdma_cleanup routine to release QDMA resources.
[0] https://lore.kernel.org/netdev/20251214-airoha-fix-dev-registration-v1-1-860e027ad4c6@kernel.org/
Suggested-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260321-airoha-remove-rework-v2-1-16c7bade5fe5@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
How can we go ahead on this?
2- This is a pre-existing issue, but while reviewing this cleanup path, I
noticed a potential NULL pointer dereference if initialization fails earlier.
If devm_kzalloc() or dmam_alloc_coherent() fails in airoha_qdma_init_rx_queue(),
it returns an error before netif_napi_add() is called, leaving the embedded
q->napi struct zero-filled.
However, q->ndesc is set earlier in that function. Since q->ndesc is now
non-zero, the error cleanup path will try to disable and delete this
uninitialized NAPI structure, leading to a crash in napi_disable() when it
calls hrtimer_cancel() on the uninitialized timer.
Could we defer setting q->ndesc until after the allocations succeed?
I think it is fine to set 'q->ndesc' at the end of airoha_qdma_init_rx_queue()
routine but, considering net codebase, it seems the issue can't occur since if
airoha_qdma_init_rx_queue() fails as described above, airoha_probe() will jump
to error_hw_cleanup and netif_napi_del() in airoha_hw_cleanup() will return if
NAPI_STATE_LISTED is not set in __netif_napi_del_locked().
Am I missing something?
3- Is there a missing reset for the CPU producer index (REG_RX_CPU_IDX) here?
The hardware DMA relies on the gap between the CPU and DMA indices to
identify valid descriptors. By rewinding the DMA consumer index (REG_RX_DMA_IDX)
to q->tail while leaving the CPU index at its old, advanced value, could
this create a phantom gap of descriptors?
If the DMA engine fetches these zeroed descriptors (with addr = 0), it might
attempt to write incoming packets to physical address 0x0. Both indices might
need to be synchronized.
I will post a fix for it.
Regards,
Lorenzo
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
next prev parent reply other threads:[~2026-03-31 11:01 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-27 9:48 [PATCH net] net: airoha: Add missing cleanup bits in airoha_qdma_cleanup_rx_queue() Lorenzo Bianconi
2026-03-31 0:28 ` Jakub Kicinski
2026-03-31 11:01 ` Lorenzo Bianconi [this message]
2026-03-31 0:40 ` patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=acupditAioiehGSj@lore-desk \
--to=lorenzo@kernel.org \
--cc=Madhur.Agrawal@airoha.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.