Re: [PATCH net-next v36 1/8] eea: introduce PCI framework

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Wen Gu <guwen@linux.alibaba.com>,
	Philo Lu <lulie@linux.alibaba.com>,
	Vadim Fedorenko <vadim.fedorenko@linux.dev>,
	Dong Yibo <dong100@mucse.com>,
	Heiner Kallweit <hkallweit1@gmail.com>,
	Dust Li <dust.li@linux.alibaba.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next v36 1/8] eea: introduce PCI framework
Date: Sat, 4 Apr 2026 16:26:45 +0800	[thread overview]
Message-ID: <1775291205.5205972-1-xuanzhuo@linux.alibaba.com> (raw)
In-Reply-To: <e769aaaa-17bb-4356-9848-dc5001885e24@redhat.com>

On Thu, 2 Apr 2026 10:02:53 +0200, Paolo Abeni <pabeni@redhat.com> wrote:
> On 3/29/26 3:22 PM, Xuan Zhuo wrote:
> > +struct eea_pci_cfg {
> > +	__le32 reserve0;
> > +	__le32 reserve1;
> > +	__le32 drv_f_idx;
> > +	__le32 drv_f;
> > +
> > +#define EEA_S_OK           BIT(2)
> > +#define EEA_S_FEATURE_DONE BIT(3)
> > +#define EEA_S_FAILED       BIT(7)
> > +	u8   device_status;
> > +	u8   reserved[7];
> > +
> > +	__le32 rx_num_max;
> > +	__le32 tx_num_max;
> > +	__le32 db_blk_size;
> > +
> > +	/* admin queue cfg */
> > +	__le16 aq_size;
> > +	__le16 aq_msix_vector;
> > +	__le32 aq_db_off;
> > +
> > +	__le32 aq_sq_addr;
> > +	__le32 aq_sq_addr_hi;
> > +	__le32 aq_cq_addr;
> > +	__le32 aq_cq_addr_hi;
> > +
> > +	__le64 hw_ts;
>
> Sashiko has still a lot to say about this series. Here:
>
> ---
> Is there an implicit compiler padding issue here?
> The field aq_cq_addr_hi is a 32-bit integer ending at offset 59. The
> immediately following field, hw_ts, is an 8-byte integer. To align hw_ts
> to an 8-byte boundary, the compiler will implicitly insert 4 bytes of
> padding, placing hw_ts at offset 64 instead of 60.
> Depending on whether the hardware designer intended the register to be
> at offset 60 or 64, the driver might read from the wrong address, or
> rely on implicit padding that could change across architectures.
> ---
>
> I assume the (virtual) H/W does the align thing correctly, so no real
> issue here. Still replying to this patch explaining why the raised
> concern is not valid would help a lot progresses on this series.


Yes, we had indeed overlooked this part before. I think it's still better to
keep it 64-bit aligned. AI did help us, and additionally, it caught another
issue: if an RX packet is dropped, we need to re-dma-sync the buffer
for the device when reusing it. In this AI review, I believe only these two
findings are valid. Even though the second one rarely causes problems in
practice, I'll submit another patch set to address both. AI does catch real
issues, but the Sashiko review results seem to generate too many false
positives. Reviewing them takes up too much of our time, and the return on
effort is too low. I've gone through every single item flagged by Sashiko, and
only these two are actual issues; the rest don't exist. Unfortunately, I can't
reply to them inline like I would in an email thread.

>
> [ ... ]
> > +int eea_device_reset(struct eea_device *edev)
> > +{
> > +	struct eea_pci_device *ep_dev = edev->ep_dev;
> > +	u8 val;
> > +
> > +	eea_pci_io_set_status(edev, 0);
> > +
> > +	return read_poll_timeout(cfg_read8, val, !val, 20, EEA_RESET_TIMEOUT_US,
> > +				 false, ep_dev->reg, device_status);
> > +}
>
> Sashiko says:
> ---
> Can this cause a 60-second thread stall during a surprise removal?
> When a PCIe device is surprise-removed, MMIO reads typically return all
> ones (0xFF). If the device is removed, !0xFF evaluates to false, causing
> the loop to never exit early and hanging the executing thread for the
> entire 60-second timeout.
> ---
>
> A similar concern was raised on the previous iteration. I see you
> decreased the timeout from 1000s to 60s, but 60s is still a considerably
> longsystem hangup. Anything above a few seconds should be considered
> carefully.


I will add check for 0xFF.


Thanks


>
> > +
> > +int eea_pci_set_aq_up(struct eea_device *edev)
> > +{
> > +	struct eea_pci_device *ep_dev = edev->ep_dev;
> > +	u8 status = eea_pci_io_get_status(edev);
> > +	int err;
> > +	u8 val;
> > +
> > +	WARN_ON(status & EEA_S_OK);
>
> Sashiko says:
>
> ---
> Does a surprise removal trigger a spurious kernel warning here?
> A surprise removal forces status to 0xFF, making 0xFF & BIT(2) true and
> triggering an unintended kernel stack dump for an asynchronous hardware
> event.
> ---
>
> This looks valid to me.
>
> /P
>

next prev parent reply	other threads:[~2026-04-04  8:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-29 13:22 [PATCH net-next v36 0/8] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor Xuan Zhuo
2026-03-29 13:22 ` [PATCH net-next v36 1/8] eea: introduce PCI framework Xuan Zhuo
2026-04-02  8:02   ` Paolo Abeni
2026-04-04  8:26     ` Xuan Zhuo [this message]
2026-03-29 13:22 ` [PATCH net-next v36 2/8] eea: introduce ring and descriptor structures Xuan Zhuo
2026-03-29 13:22 ` [PATCH net-next v36 3/8] eea: probe the netdevice and create adminq Xuan Zhuo
2026-04-02  8:19   ` Paolo Abeni
2026-03-29 13:22 ` [PATCH net-next v36 4/8] eea: create/destroy rx,tx queues for netdevice open and stop Xuan Zhuo
2026-03-29 13:22 ` [PATCH net-next v36 5/8] eea: implement packet receive logic Xuan Zhuo
2026-03-29 13:22 ` [PATCH net-next v36 6/8] eea: implement packet transmit logic Xuan Zhuo
2026-03-29 13:22 ` [PATCH net-next v36 7/8] eea: introduce ethtool support Xuan Zhuo
2026-03-29 13:22 ` [PATCH net-next v36 8/8] eea: introduce callback for ndo_get_stats64 Xuan Zhuo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1775291205.5205972-1-xuanzhuo@linux.alibaba.com \
    --to=xuanzhuo@linux.alibaba.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=dong100@mucse.com \
    --cc=dust.li@linux.alibaba.com \
    --cc=edumazet@google.com \
    --cc=guwen@linux.alibaba.com \
    --cc=hkallweit1@gmail.com \
    --cc=kuba@kernel.org \
    --cc=lulie@linux.alibaba.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=vadim.fedorenko@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox