From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>,
lorenzo@kernel.org, andrew+netdev@lunn.ch, hkallweit1@gmail.com,
dust.li@linux.alibaba.com, edumazet@google.com,
davem@davemloft.net, pabeni@redhat.com, lulie@linux.alibaba.com,
andrew@lunn.ch, netdev@vger.kernel.org, guwen@linux.alibaba.com,
dong100@mucse.com, lukas.bulwahn@redhat.com,
vadim.fedorenko@linux.dev, enelsonmoore@gmail.com
Subject: Re: [net-next,v24,4/6] eea: create/destroy rx,tx queues for netdevice open and stop
Date: Mon, 2 Feb 2026 11:23:32 +0800 [thread overview]
Message-ID: <1770002612.3297296-2-xuanzhuo@linux.alibaba.com> (raw)
In-Reply-To: <20260130233243.923376-1-kuba@kernel.org>
On Fri, 30 Jan 2026 15:32:43 -0800, Jakub Kicinski <kuba@kernel.org> wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> eea: create/destroy rx,tx queues for netdevice open and stop
>
> This commit introduces the implementation for netdevice open and stop in
> the Alibaba Elastic Ethernet Adapter (EEA) driver, including RX/TX queue
> allocation, NAPI setup, and resource management.
[...]
>
>
> > + }
> > +
> > + err = eea_active_ring_and_irq(enet);
> > + if (err) {
> > + netdev_err(enet->netdev,
> > + "eea reset: active new ring and irq failed. err %d\n",
> > + err);
> > + return err;
>
> Does this code leak the newly allocated queues when eea_active_ring_and_irq
> fails?
>
> In this scenario:
> 1. New queues allocated in ctx at line 277
> 2. Old queues freed by eea_netdev_stop at line 285
> 3. enet_bind_new_q_and_cfg succeeds at line 294, transferring ctx->rx and
> ctx->tx to enet->rx and enet->tx
> 4. eea_active_ring_and_irq fails at line 303
> 5. Function returns at line 308 without freeing enet->rx and enet->tx
>
> The newly allocated and transferred resources remain in enet->rx and
> enet->tx but are never freed. The comment at lines 287-292 claims resources
> are deferred for cleanup, but there is no mechanism to track these orphaned
> resources for later cleanup.
The mechanism of the netdev close does this.
>
> [ ... ]
>
> > +static void eea_meta_free_xmit(struct eea_net_tx *tx, struct eea_tx_meta *meta,
> > + int budget)
> > +{
> > + struct eea_net_stats *stats = &tx->stats;
> > +
> > + ++stats->packets;
> > + stats->bytes += meta->skb->len;
> > + napi_consume_skb(meta->skb, budget);
>
> The second parameter to napi_consume_skb() should be a boolean indicating
> NAPI context (0 or non-zero), not the actual budget value.
>
> Jakub Kicinski raised this concern in v21 review, referencing commit
> cd18e8ac030e646 which added kdoc clarifying this. The current code passes
> 'budget' (which could be any positive integer like 64) instead of just 1 to
> indicate NAPI context. While the function still works correctly (any
> non-zero value indicates NAPI context), using the budget value is
> semantically incorrect.
>
> Reference: https://lore.kernel.org/netdev/20260121170658.371e0e63@kernel.org/
I think the parameter budget should be an int value from NAPI context.
>
> > + meta->skb = NULL;
> > +
> > + eea_tx_meta_put_and_unmap(tx, meta);
> > +}
>
> [ ... ]
>
> [ ... ]
>
> > +static struct eea_tx_meta *eea_tx_desc_fill(struct eea_net_tx *tx,
> > + dma_addr_t addr, u32 size,
> > + bool is_last, struct sk_buff *skb,
> > + u32 flags)
> > +{
> > + struct eea_tx_desc *desc;
> > + struct eea_tx_meta *meta;
> > +
> > + meta = eea_tx_meta_get(tx);
> > +
> > + desc = ering_sq_alloc_desc(tx->ering, meta->id, is_last, flags);
>
> Can this code dereference a NULL pointer?
The validation performed after packet posting is sufficient, as large fragmented
packets do not affect this check. The below follows the same principle.
Thanks.
>
> eea_tx_meta_get() can return NULL when the free list is exhausted (tx->free
> is NULL, as seen in lines 40-41 of eea_tx_meta_get). When NULL is returned,
> line 182 unconditionally dereferences meta->id without checking if meta is
> NULL, causing a kernel crash.
>
> The TX queue flow control check (netif_txq_maybe_stop) happens AFTER packet
> posting in eea_tx_xmit (line 280), not before. When the ring is under heavy
> load with fragmented packets, the meta free list can be exhausted before the
> queue is stopped.
>
> The meta array has cfg.tx_ring_depth entries (allocated at line 360 in
> eea_alloc_tx). Each packet can consume 1 + nr_frags meta entries (up to
> MAX_SKB_FRAGS + 1). With small ring depths and many large fragmented packets
> in flight, the free list can be completely exhausted.
>
> Call path:
> Network stack -> eea_tx_xmit -> eea_tx_post_skb (line 274) ->
> eea_tx_desc_fill (line 231) -> eea_tx_meta_get (line 180) returns NULL ->
> line 182 crashes.
>
> > +
> > + meta->data = (void *)skb;
> > + meta->dma = addr;
> > + meta->desc = desc;
> > +
> > + return meta;
> > +}
>
> [ ... ]
>
> > +static int eea_tx_post_skb(struct sk_buff *skb, struct eea_net_tx *tx,
> > + int flags)
> > +{
> > + dma_addr_t addr;
> > + u32 hlen;
> > + struct skb_shared_info *shinfo = skb_shinfo(skb);
> > + struct eea_tx_meta *meta;
> > + int err, last = 0;
> > + struct device *dmadev;
> > +
> > + dmadev = ering_get_dma_device(tx->ering);
> > +
> > + hlen = skb_headlen(skb);
> > +
> > + addr = dma_map_single(dmadev, skb->data, hlen, DMA_TO_DEVICE);
> > + if (dma_mapping_error(dmadev, addr))
> > + return -ENOMEM;
> > +
> > + meta = eea_tx_desc_fill(tx, addr, hlen, !shinfo->nr_frags, skb, flags);
> > +
> > + if (eea_fill_desc_from_skb(skb, tx->ering, meta->desc))
>
> Can this code dereference a NULL pointer?
>
> eea_tx_desc_fill() at line 231 can return NULL when the meta free list is
> exhausted (root cause explained earlier). Line 233 unconditionally
> dereferences meta->desc without checking if meta is NULL.
>
> This is a consequence of the same underlying issue: the TX queue flow
> control happens after packet posting rather than before.
>
> Call path:
> eea_tx_xmit (line 274) -> eea_tx_post_skb -> eea_tx_desc_fill (line 231)
> returns NULL -> line 233 crashes accessing meta->desc.
>
> > + goto err_cancel;
>
> [ ... ]
next prev parent reply other threads:[~2026-02-02 3:31 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-30 9:34 [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 1/6] eea: introduce PCI framework Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,1/6] " Jakub Kicinski
2026-01-30 9:34 ` [PATCH net-next v24 2/6] eea: introduce ring and descriptor structures Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,2/6] " Jakub Kicinski
2026-01-30 9:34 ` [PATCH net-next v24 3/6] eea: probe the netdevice and create adminq Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 4/6] eea: create/destroy rx,tx queues for netdevice open and stop Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,4/6] " Jakub Kicinski
2026-02-02 3:23 ` Xuan Zhuo [this message]
2026-01-30 9:34 ` [PATCH net-next v24 5/6] eea: introduce ethtool support Xuan Zhuo
2026-01-30 9:34 ` [PATCH net-next v24 6/6] eea: introduce callback for ndo_get_stats64 Xuan Zhuo
2026-01-30 23:32 ` [net-next,v24,6/6] " Jakub Kicinski
2026-02-02 2:09 ` Xuan Zhuo
2026-01-31 0:20 ` [PATCH net-next v24 0/6] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Ethan Nelson-Moore
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1770002612.3297296-2-xuanzhuo@linux.alibaba.com \
--to=xuanzhuo@linux.alibaba.com \
--cc=andrew+netdev@lunn.ch \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=dong100@mucse.com \
--cc=dust.li@linux.alibaba.com \
--cc=edumazet@google.com \
--cc=enelsonmoore@gmail.com \
--cc=guwen@linux.alibaba.com \
--cc=hkallweit1@gmail.com \
--cc=kuba@kernel.org \
--cc=lorenzo@kernel.org \
--cc=lukas.bulwahn@redhat.com \
--cc=lulie@linux.alibaba.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=vadim.fedorenko@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox