Re: [PATCH] hw/pci: migration: Skip config space check for vendor specific capability during restore/load

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Vinayak Kale <vkale@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	qemu-devel@nongnu.org, targupta@nvidia.com, cjia@nvidia.com,
	acurrid@nvidia.com, zhiw@nvidia.com, marcel.apfelbaum@gmail.com,
	avihaih@nvidia.com
Subject: Re: [PATCH] hw/pci: migration: Skip config space check for vendor specific capability during restore/load
Date: Thu, 1 Feb 2024 13:10:48 -0500	[thread overview]
Message-ID: <20240201130440-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <b83a9452-cbcc-4f4d-b0f7-b4957029e755@nvidia.com>

On Thu, Feb 01, 2024 at 11:08:58PM +0530, Vinayak Kale wrote:
> 
> On 31/01/24 11:08 pm, Alex Williamson wrote:
> > 
> > On Wed, 31 Jan 2024 15:22:59 +0530
> > Vinayak Kale <vkale@nvidia.com> wrote:
> > 
> > > On 31/01/24 12:28 am, Alex Williamson wrote:
> > > > 
> > > > On Tue, 30 Jan 2024 23:32:26 +0530
> > > > Vinayak Kale <vkale@nvidia.com> wrote:
> > > > 
> > > > > Missed adding Michael, Marcel, Alex and Avihai earlier, apologies.
> > > > > 
> > > > > Regards,
> > > > > Vinayak
> > > > > 
> > > > > On 30/01/24 3:26 pm, Vinayak Kale wrote:
> > > > > > In case of migration, during restore operation, qemu checks the config space of the pci device with the config space
> > > > > > in the migration stream captured during save operation. In case of config space data mismatch, restore operation is failed.
> > > > > > 
> > > > > > config space check is done in function get_pci_config_device(). By default VSC (vendor-specific-capability) in config space is checked.
> > > > > > 
> > > > > > Ideally qemu should not check VSC during restore/load. This patch skips the check by not setting pdev->cmask[] for VSC offsets in pci_add_capability().
> > > > > > If cmask[] is not set for an offset, then qemu skips config space check for that offset.
> > > > > > 
> > > > > > Signed-off-by: Vinayak Kale <vkale@nvidia.com>
> > > > > > ---
> > > > > >     hw/pci/pci.c | 7 +++++--
> > > > > >     1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > > index 76080af580..32429109df 100644
> > > > > > --- a/hw/pci/pci.c
> > > > > > +++ b/hw/pci/pci.c
> > > > > > @@ -2485,8 +2485,11 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
> > > > > >         memset(pdev->used + offset, 0xFF, QEMU_ALIGN_UP(size, 4));
> > > > > >         /* Make capability read-only by default */
> > > > > >         memset(pdev->wmask + offset, 0, size);
> > > > > > -    /* Check capability by default */
> > > > > > -    memset(pdev->cmask + offset, 0xFF, size);
> > > > > > +
> > > > > > +    if (cap_id != PCI_CAP_ID_VNDR) {
> > > > > > +        /* Check non-vendor specific capability by default */
> > > > > > +        memset(pdev->cmask + offset, 0xFF, size);
> > > > > > +    }
> > > > > >         return offset;
> > > > > >     }
> > > > > > 
> > > > > 
> > > > 
> > > > If there is a possibility that the data within the vendor specific cap
> > > > can be consumed by the driver or diagnostic tools, then it's part of
> > > > the device ABI and should be consistent across migration.  A mismatch
> > > > can certainly cause a migration failure, but why shouldn't it?
> > > 
> > > Sure, the device ABI should be consistent across migration. In case of
> > > VSC, it should represent same format on source and destination. But
> > > shouldn't VSC content check or its interpretation be left to vendor
> > > driver instead of qemu?
> > 
> > By "vendor driver" here, are you suggesting that QEMU device models (ex.
> > hw/net/{e1000*,igb*,rtl8139*}) should perform that validation?  If so,
> > where's the patch that introduces any sort of validation hooks for
> > vendors to provide?  Where is this validation going to happen in the
> > case of a migratable vfio-pci variant devices?  Nothing about this
> > patch suggests that it's deferring responsibility to some other code
> > entity, it only indicates "checking this breaks, let's not do it".
> > 
> > It's possible that the device you care about only reports volatile
> > diagnostic information through the vendor specific capability, but
> > another device might use it to report information relative to the
> > internal hardware configuration.  Without knowing what the vendor
> > specific capability contains, QEMU needs to take the most conservative
> > approach by default.  Thanks,
> 
> PCI/PCIe spec doesn’t define ABI for VSC/VSEC contents. Any other code
> entity except vendor driver should ignore VSC contents. QEMU’s expectation
> of VSC contents to be equal on source and destination seems incorrect given
> that QEMU has no control over ABI for VSC contents.
> 
> > 
> > Alex
> > 

I don't get why this matters though. This is no different from any other
device specific register. If a register is visible to guest it generally
should not change across migration. If you are migrating a VFIO device and
you are making a vsc visible to guest then your migration routine must
make sure to migrate the contents of vsc.

Maybe there's a good reason to have a register which actually does
change. Then, please document the actual reason. When you say:

	 Ideally qemu should not check VSC during restore/load.

then that is clearly wrong in most cases.


-- 
MST

next prev parent reply	other threads:[~2024-02-01 18:11 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-30  9:56 [PATCH] hw/pci: migration: Skip config space check for vendor specific capability during restore/load Vinayak Kale
2024-01-30 18:02 ` Vinayak Kale
2024-01-30 18:58   ` Alex Williamson
2024-01-31  9:52     ` Vinayak Kale
2024-01-31 17:38       ` Alex Williamson
2024-02-01 17:38         ` Vinayak Kale
2024-02-01 18:10           ` Michael S. Tsirkin [this message]
2024-02-02  0:03             ` Alex Williamson
2024-02-09  9:17               ` Vinayak Kale

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240201130440-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=acurrid@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=avihaih@nvidia.com \
    --cc=cjia@nvidia.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=targupta@nvidia.com \
    --cc=vkale@nvidia.com \
    --cc=zhiw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).