From: Avi Kivity <avi@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org, jan.kiszka@siemens.com, mst@redhat.com,
jbaron@redhat.com
Subject: Re: [PATCH v2] kvm: Disable MSI/MSI-X in assigned device reset path
Date: Sun, 08 Apr 2012 16:14:29 +0300 [thread overview]
Message-ID: <4F818F35.7010905@redhat.com> (raw)
In-Reply-To: <20120405033450.25661.68810.stgit@bling.home>
On 04/05/2012 06:42 AM, Alex Williamson wrote:
> We've hit a kernel host panic, when issuing a 'system_reset' with an
> 82576 nic assigned and a Windows guest. Host system is a PowerEdge R815.
>
> [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 32993
> [Hardware Error]: APEI generic hardware error status
> [Hardware Error]: severity: 1, fatal
> [Hardware Error]: section: 0, severity: 1, fatal
> [Hardware Error]: flags: 0x01
> [Hardware Error]: primary
> [Hardware Error]: section_type: PCIe error
> [Hardware Error]: port_type: 0, PCIe end point
> [Hardware Error]: version: 1.0
> [Hardware Error]: command: 0x0000, status: 0x0010
> [Hardware Error]: device_id: 0000:08:00.0
> [Hardware Error]: slot: 1
> [Hardware Error]: secondary_bus: 0x00
> [Hardware Error]: vendor_id: 0x8086, device_id: 0x10c9
> [Hardware Error]: class_code: 000002
> [Hardware Error]: aer_status: 0x00100000, aer_mask: 0x00018000
> [Hardware Error]: Unsupported Request
> [Hardware Error]: aer_layer=Transaction Layer, aer_agent=Requester ID
> [Hardware Error]: aer_uncor_severity: 0x00067011
> [Hardware Error]: aer_tlp_header: 40001001 0020000f edbf800c 01000000
> [Hardware Error]: section: 1, severity: 1, fatal
> [Hardware Error]: flags: 0x01
> [Hardware Error]: primary
> [Hardware Error]: section_type: PCIe error
> [Hardware Error]: port_type: 0, PCIe end point
> [Hardware Error]: version: 1.0
> [Hardware Error]: command: 0x0000, status: 0x0010
> [Hardware Error]: device_id: 0000:08:00.0
> [Hardware Error]: slot: 1
> [Hardware Error]: secondary_bus: 0x00
> [Hardware Error]: vendor_id: 0x8086, device_id: 0x10c9
> [Hardware Error]: class_code: 000002
> [Hardware Error]: aer_status: 0x00100000, aer_mask: 0x00018000
> [Hardware Error]: Unsupported Request
> [Hardware Error]: aer_layer=Transaction Layer, aer_agent=Requester ID
> [Hardware Error]: aer_uncor_severity: 0x00067011
> [Hardware Error]: aer_tlp_header: 40001001 0020000f edbf800c 01000000
> Kernel panic - not syncing: Fatal hardware error!
> Pid: 0, comm: swapper Not tainted 2.6.32-242.el6.x86_64 #1
> Call Trace:
> <NMI> [<ffffffff814f2fe5>] ? panic+0xa0/0x168
> [<ffffffff812f919c>] ? ghes_notify_nmi+0x17c/0x180
> [<ffffffff814f91d5>] ? notifier_call_chain+0x55/0x80
> [<ffffffff814f923a>] ? atomic_notifier_call_chain+0x1a/0x20
> [<ffffffff8109667e>] ? notify_die+0x2e/0x30
> [<ffffffff814f6e81>] ? do_nmi+0x1a1/0x2b0
> [<ffffffff814f6760>] ? nmi+0x20/0x30
> [<ffffffff8103762b>] ? native_safe_halt+0xb/0x10
> <<EOE>> [<ffffffff8101495d>] ? default_idle+0x4d/0xb0
> [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
> [<ffffffff814da63a>] ? rest_init+0x7a/0x80
> [<ffffffff81c1ff7b>] ? start_kernel+0x424/0x430
> [<ffffffff81c1f33a>] ? x86_64_start_reservations+0x125/0x129
> [<ffffffff81c1f438>] ? x86_64_start_kernel+0xfa/0x109
>
> The root cause of the problem is that the 'reset_assigned_device()' code
> first writes a 0 to the command register. Then, when qemu subsequently does
> a kvm_deassign_irq() (called by assign_irq(), in the system_reset path),
> the kernel ends up calling '__msix_mask_irq()', which performs a write to
> the memory mapped msi vector space. Since, we've explicitly told the device
> to disallow mmio access (via the 0 write to the command register), we end
> up with the above 'Unsupported Request'.
>
> The fix here is to first disable MSI-X, before doing the reset. We also
> disable MSI, leaving the device in INTx mode. In this way, the device is
> a known state after reset, and we avoid touching msi memory mapped space
> on any subsequent 'kvm_deassign_irq()'.
>
> Thanks to Michael S. Tsirkin for help in understanding what was going on
> here and Jason Baron, the original debugger of this problem.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>
> Jason is out of the office for a couple weeks, so I'll try to resolve
> this while he's away. Somehow the emulated config updates were lost
> in Jason's original posting, so I've fixed that and taken Jan's suggestion
> to simply call into the update functions instead of open coding the
> interrupt disable. I think there still may be some disagreements about
> how to handle guest generated errors in the host, but that's a large
> project whereas this is something we should be doing at reset anyway,
> and even if only a workaround, resolves the problem above.
>
> hw/device-assignment.c | 23 +++++++++++++++++++++++
> 1 files changed, 23 insertions(+), 0 deletions(-)
>
> diff --git a/hw/device-assignment.c b/hw/device-assignment.c
> index 89823f1..2e6b93e 100644
> --- a/hw/device-assignment.c
> +++ b/hw/device-assignment.c
> @@ -1613,6 +1613,29 @@ static void reset_assigned_device(DeviceState *dev)
> const char reset[] = "1";
> int fd, ret;
>
> + /*
> + * If a guest is reset without being shutdown, MSI/MSI-X can still
> + * be running. We want to return the device to a known state on
> + * reset, so disable those here. We especially do not want MSI-X
> + * enabled since it lives in MMIO space, which is about to get
> + * disabled.
> + */
> + if (adev->irq_requested_type & KVM_DEV_IRQ_GUEST_MSIX) {
> + uint16_t ctrl = pci_get_word(pci_dev->config +
> + pci_dev->msix_cap + PCI_MSIX_FLAGS);
> +
> + pci_set_word(pci_dev->config + pci_dev->msix_cap + PCI_MSIX_FLAGS,
> + ctrl & ~PCI_MSIX_FLAGS_ENABLE);
> + assigned_dev_update_msix(pci_dev);
> + } else if (adev->irq_requested_type & KVM_DEV_IRQ_GUEST_MSI) {
> + uint8_t ctrl = pci_get_byte(pci_dev->config +
> + pci_dev->msi_cap + PCI_MSI_FLAGS);
> +
> + pci_set_byte(pci_dev->config + pci_dev->msi_cap + PCI_MSI_FLAGS,
> + ctrl & ~PCI_MSI_FLAGS_ENABLE);
> + assigned_dev_update_msi(pci_dev);
> + }
> +
Don't we FLR the device, which ought to disable MSI on the real device?
So it seems to me the correct approach is to synchronize the emulated
config space from the real config space after FLR.
--
error compiling committee.c: too many arguments to function
next prev parent reply other threads:[~2012-04-08 13:14 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-05 3:42 [PATCH v2] kvm: Disable MSI/MSI-X in assigned device reset path Alex Williamson
2012-04-05 7:28 ` Jan Kiszka
2012-04-05 9:34 ` Michael S. Tsirkin
2012-04-05 14:42 ` Alex Williamson
2012-04-05 15:04 ` Michael S. Tsirkin
2012-04-08 13:14 ` Avi Kivity [this message]
2012-04-08 13:17 ` Michael S. Tsirkin
2012-04-08 13:18 ` Avi Kivity
2012-04-08 13:21 ` Michael S. Tsirkin
2012-04-08 13:24 ` Avi Kivity
2012-04-08 13:30 ` Michael S. Tsirkin
2012-04-08 13:41 ` Avi Kivity
2012-04-08 13:53 ` Michael S. Tsirkin
2012-04-08 14:01 ` Avi Kivity
2012-04-08 14:42 ` Michael S. Tsirkin
2012-04-08 15:26 ` Avi Kivity
2012-04-08 15:46 ` Michael S. Tsirkin
2012-04-08 15:50 ` Avi Kivity
2012-04-08 16:04 ` Michael S. Tsirkin
2012-04-08 16:08 ` Avi Kivity
2012-04-08 17:37 ` Jan Kiszka
2012-04-08 18:18 ` Michael S. Tsirkin
2012-04-08 18:39 ` Jan Kiszka
2012-04-08 20:35 ` Michael S. Tsirkin
2012-04-09 8:35 ` Avi Kivity
2012-04-10 16:55 ` Alex Williamson
2012-04-16 14:03 ` Alex Williamson
2012-04-16 14:31 ` Avi Kivity
2012-04-16 15:06 ` Michael S. Tsirkin
2012-04-16 15:10 ` Jan Kiszka
2012-04-16 16:08 ` Michael S. Tsirkin
2012-04-16 16:13 ` Jan Kiszka
2012-04-16 16:36 ` Michael S. Tsirkin
2012-04-16 16:38 ` Jan Kiszka
2012-04-16 17:12 ` Michael S. Tsirkin
2012-04-16 18:47 ` Jan Kiszka
2012-04-16 16:12 ` Jason Baron
2012-04-16 16:34 ` Michael S. Tsirkin
2012-04-16 19:07 ` Alex Williamson
2012-04-16 19:47 ` Michael S. Tsirkin
2012-04-17 0:55 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F818F35.7010905@redhat.com \
--to=avi@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=jan.kiszka@siemens.com \
--cc=jbaron@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mst@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox