From: David Matlack <dmatlack@google.com>
To: Josh Hilke <jrhilke@google.com>
Cc: Alex Williamson <alex@shazbot.org>,
Vipin Sharma <vipinsh@google.com>,
Jason Gunthorpe <jgg@nvidia.com>, Shuah Khan <shuah@kernel.org>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH] vfio: selftests: Add driver for IGB QEMU device
Date: Wed, 13 May 2026 23:29:35 +0000 [thread overview]
Message-ID: <agUJX_SVgng3wG-P@google.com> (raw)
In-Reply-To: <CAAdrzjtiiOtw7-YL2COgZC-QSFp6LLwq+jP9V3jPKLUe0w7Y0g@mail.gmail.com>
On 2026-05-12 07:12 PM, Josh Hilke wrote:
> On Mon, May 11, 2026 at 4:45 PM David Matlack <dmatlack@google.com> wrote:
> > On 2026-05-11 09:18 PM, Josh Hilke wrote:
> > > +#define IGB_MAX_CHUNK_SIZE 1024
> > > +#define MSIX_VECTOR 0
> > > +#define RING_SIZE 4096 /* Number of descriptors in ring */
> >
> > What's the maximum ring size that IGB can support? The max chunk size is
> > quite small so using a larger ring will be helpful toward getting the
> > device to do a lot of memcpys for Live Update testing.
>
> I can dig into QEMU code and investigate the max ring size the QEMU
> IGB implementation can handle. 4096 is just the max ring size for real
> the IGB in drivers/net/ethernet/intel/igb/igb.h. What's our target for
> the maximum memcpy size?
I assume you mean memcpy count, not size. Ideally we want to be able to
get the device to do memcpys for multiple minutes to stress test Live
Update. So "as much as possible". There are 2 limits in practice:
1. Device-specific limits.
2. VFIO selftests memory overhead (e.g. for the tx/rx rings).
It looks like the driver programs the ring size into the device:
igb_write32(igb, IGB_TDLEN0, RING_SIZE * sizeof(struct igb_tx_desc));
So I assume we can pick a larger RING_SIZE if we want? What's the device
limit (if any)? It might be just the width of the IGB_TDLEN0 register.
> > > + rx->read.pkt_addr = curr_dst;
> > > + rx->read.hdr_addr = 0;
> > > + rx->wb.status_error = 0;
> >
> > Don't rx->read.hdr_addr and rx->wb.status_error overlap the same u64?
> > It's not a bug since both write 0 but it is weird.
>
> That's correct, they overlap. I agree it's a bit weird -- it's a
> hardware optimization to save memory. This layout is also how the real
> descriptor (`e1000_adv_rx_desc`) is structured in
> drivers/net/ethernet/intel/igb/e1000_82575.h. The QEMU IGB
> implementation also assumes this layout for the bits.
Oh they layout is totally fine, and comes from the device specification
so not something we can change.
I was more commenting on the code here that does overlapping writes as
being weird.
> > > +
> > > + igb_irq_clear(igb);
> > > +
> > > + igb_irq_enable(igb);
> > > +
> > > + if (rx->wb.status_error & 1)
> > > + return 0;
> > > +
> > > + return -ETIMEDOUT;
> >
> > Is there any other way to the device signals errors other than failing
> > to set bit 0 of status_error? I'm especially curious how it handles
> > memcpy_from_unmapped_iova in vfio_pci_driver_test.c.
>
> Yea, there are some relevant error bits in status_err, but
> unfortunately QEMU IGB doesn't emulate them:
> - E1000_RXDEXT_STATERR_LB: confirms that the packet actually traversed
> the loopback path
> - E1000_RXDEXT_STATERR_CE: detects bit flips during transmission
> - E1000_RXDEXT_STATERR_RXE: catch-all error bit for the receive engine
> (buffer overflows, etc.)
>
> The other error bits (E1000_RXDEXT_STATERR_IPE,
> E1000_RXDEXT_STATERR_TCPE) validate network packet integrity, which
> isn't relevant because the driver doesn't package the memcpy data
> within IP or TCP/UDP packets.
>
> Regarding the memcpy_from_unmapped_iova test, the test only cares that
> the device doesn't trigger an MSI. Normally, the QEMU IGB sends an MSI
> once it sets the descriptor done bit on completion of a memcpy
> operation. To prevent this, the driver disables interrupts before
> starting memcpy, and then re-enables them after the memcpy finishes.
memcpy_from_unmapped_iova triggers the device to do a DMA read from an
unmapped address during rx. I was asking how the device handles that and
communicates the error to the driver.
> > > +#define IGB_TXD_CMD_EOP 0x01 /* End of Packet */
> > > +#define IGB_TXD_CMD_IFCS 0x02 /* Insert FCS */
> > > +#define IGB_TXD_CMD_RS 0x08 /* Report Status */
> > > +#define IGB_TXD_CMD_SHIFT 24 /* Shift for command bits in cmd_type_len */
> > > +#define IGB_TXD_CMD_LEGACY_FORMAT (1 << 20) /* Forces legacy descriptor format in QEMU */
> >
> > IGB_TXD_CMD_LEGACY_FORMAT appears to be unused. I didn't check the rest.
> > Can you prune out the unused macros?
>
> Yep, will do!
>
> >
> > Related: Are these macros and/or the descriptor struct available in any
> > kernel headers that we could symlink into the VFIO selftests like we do
> > for DSA?
>
> They're in private headers under drivers/net/ethernet/intel/igb. Are
> you ok with linking in private headers? The DSA headers are public.
Yeah we already symlink private headers into the VFIO selftests drivers.
But the constraint is those headers have to be bare-bones enough to
compile in selftests environment, which is different from kernel
environment.
prev parent reply other threads:[~2026-05-13 23:29 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 21:18 [PATCH] vfio: selftests: Add driver for IGB QEMU device Josh Hilke
2026-05-11 23:45 ` David Matlack
2026-05-13 2:12 ` Josh Hilke
2026-05-13 18:49 ` Josh Hilke
2026-05-13 23:33 ` David Matlack
2026-05-13 23:29 ` David Matlack [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=agUJX_SVgng3wG-P@google.com \
--to=dmatlack@google.com \
--cc=alex@shazbot.org \
--cc=anthony.l.nguyen@intel.com \
--cc=jgg@nvidia.com \
--cc=jrhilke@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=shuah@kernel.org \
--cc=vipinsh@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox