From: Juan Quintela <quintela@redhat.com>
To: Leonardo Bras <leobras@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	 Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	 Peter Xu <peterx@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v2 1/1] pcie: Add hotplug detect state register to cmask
Date: Thu, 06 Jul 2023 09:37:40 +0200
Message-ID: <87o7kpbid7.fsf@secure.mitica>
In-Reply-To: <20230706045546.593605-3-leobras@redhat.com> (Leonardo Bras's message of "Thu, 6 Jul 2023 01:55:47 -0300")

Leonardo Bras <leobras@redhat.com> wrote:
> When trying to migrate a machine type pc-q35-6.0 or lower, with this
> cmdline options,
>
> -device driver=pcie-root-port,port=18,chassis=19,id=pcie-root-port18,bus=pcie.0,addr=0x12 \
> -device driver=nec-usb-xhci,p2=4,p3=4,id=nex-usb-xhci0,bus=pcie-root-port18,addr=0x12.0x1
>
> the following bug happens after all ram pages were sent:
>
> qemu-kvm: get_pci_config_device: Bad config data: i=0x6e read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
> qemu-kvm: Failed to load PCIDevice:config
> qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj
> qemu-kvm: error while loading state for instance 0x0 of device '0000:00:12.0/pcie-root-port'
> qemu-kvm: load of migration failed: Invalid argument
>
> This happens on pc-q35-6.0 or lower because of:
> { "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }
>
> In this scenario, hotplug_handler_plug() calls pcie_cap_slot_plug_cb(),
> which sets dev->config byte 0x6e with bit PCI_EXP_SLTSTA_PDS to signal PCI
> hotplug for the guest. After a while the guest will deal with this hotplug
> and qemu will clear the above bit.
>
> Then, during migration, get_pci_config_device() will compare the
> configs of both the freshly created device and the one that is being
> received via migration, which will differ due to the PCI_EXP_SLTSTA_PDS bit
> and cause the bug to reproduce.
>
> To avoid this fake incompatibility, there are three fields in PCIDevice that
> can help:
>
> - wmask: Used to implement R/W bytes, and
> - w1cmask: Used to implement RW1C(Write 1 to Clear) bytes
> - cmask: Used to enable config checks on load.
>
> According to PCI Express® Base Specification Revision 5.0 Version 1.0,
> table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is
> listed as RO (read-only), so it only makes sense to make use of the cmask
> field.
>
> So, clear PCI_EXP_SLTSTA_PDS bit on cmask, so the fake incompatibility on
> get_pci_config_device() does not abort the migration.
>
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215819
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
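
For anyone following along, the failure described above can be sketched
in a few lines. This is only an illustration under stated assumptions:
the per-byte rule is modeled on the error message and on the description
of get_pci_config_device() in the commit message, the helper names are
hypothetical (they are not QEMU functions), and the 0x40 bit for
PCI_EXP_SLTSTA_PDS is taken from the "device: 40" value in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): returns true when a config
 * byte received via migration would be rejected. Only bits that are set
 * in cmask and clear in both wmask and w1cmask are compared.
 */
static bool sketch_config_byte_mismatch(uint8_t read, uint8_t device,
                                        uint8_t cmask, uint8_t wmask,
                                        uint8_t w1cmask)
{
    return ((read ^ device) & cmask &
            (uint8_t)~wmask & (uint8_t)~w1cmask) != 0;
}

/* Replays the values printed in the error message for i=0x6e. */
static void sketch_check_log_example(void)
{
    /* read: 0 device: 40 cmask: ff wmask: 0 w1cmask: 19 -> rejected */
    assert(sketch_config_byte_mismatch(0x00, 0x40, 0xff, 0x00, 0x19));

    /* Same bytes with bit 0x40 cleared in cmask (0xbf) -> accepted */
    assert(!sketch_config_byte_mismatch(0x00, 0x40, 0xbf, 0x00, 0x19));
}
```

With the values from the log, the only differing bit is 0x40, and it is
covered by cmask but by neither wmask nor w1cmask, so the load fails.
Clearing that one bit in cmask is enough to let the byte through.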




> ---
>  hw/pci/pcie.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index b8c24cf45f..cae56bf1c8 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -659,6 +659,10 @@ void pcie_cap_slot_init(PCIDevice *dev, PCIESlot *s)
>      pci_word_test_and_set_mask(dev->w1cmask + pos + PCI_EXP_SLTSTA,
>                                 PCI_EXP_HP_EV_SUPPORTED);
>  
> +    /* Avoid migration abortion when this device hot-removed by guest */

I would have included here the text in the commit:

 According to PCI Express® Base Specification Revision 5.0 Version 1.0,
 table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is
 listed as RO (read-only), so it only makes sense to make use of the cmask
 field.

and

This happens on pc-q35-6.0 or lower because of:
{ "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }

so if we ever remove the machine type pc-q35-6.0, we can drop it.

Yes, I know that we don't drop machine types, but we should at some point.


> +    pci_word_test_and_clear_mask(dev->cmask + pos + PCI_EXP_SLTSTA,
> +                                 PCI_EXP_SLTSTA_PDS);
> +
>      dev->exp.hpev_notified = false;
>  
>      qbus_set_hotplug_handler(BUS(pci_bridge_get_sec_bus(PCI_BRIDGE(dev))),

I agree that this is (at least) a step in the right direction.
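
As a rough sketch of what the added call does to the cmask bytes (an
assumption-laden illustration, not the QEMU helper itself): the Slot
Status register is a 16-bit little-endian word, so clearing the 0x0040
bit only touches its low byte. The sketch_* names below are
hypothetical, and the 0x0040 value is inferred from the log above.

```c
#include <stdint.h>

/* Assumed value of the Presence Detect State bit, inferred from the log. */
#define SKETCH_SLTSTA_PDS 0x0040u

/* Read a 16-bit little-endian word from a config-space style byte array. */
static uint16_t sketch_get_word_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Write a 16-bit little-endian word to a config-space style byte array. */
static void sketch_set_word_le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/*
 * Clear the bits in 'mask' and return which of them were previously set.
 * Hypothetical stand-in for a test-and-clear helper on a mask array.
 */
static uint16_t sketch_word_test_and_clear_mask(uint8_t *p, uint16_t mask)
{
    uint16_t val = sketch_get_word_le(p);

    sketch_set_word_le(p, (uint16_t)(val & ~mask));
    return (uint16_t)(val & mask);
}
```

Starting from a cmask word of 0xffff, clearing SKETCH_SLTSTA_PDS leaves
the low byte at 0xbf and the high byte untouched, which matches the byte
at i=0x6e in the error message no longer being compared on that bit.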

I would have expected this to need some check related to the value
of:

{ "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }

But I will not claim _any_ understanding of the PCI specification.

So:

Reviewed-by: Juan Quintela <quintela@redhat.com>

to the extent that it fixes the migration bug.



