From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2642C433EF for ; Wed, 15 Jun 2022 15:06:55 +0000 (UTC) Received: from localhost ([::1]:37488 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o1Ube-0001Pc-RJ for qemu-devel@archiver.kernel.org; Wed, 15 Jun 2022 11:06:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47528) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o1UOT-00076U-Jg for qemu-devel@nongnu.org; Wed, 15 Jun 2022 10:53:17 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:26652) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o1UOR-00014o-Jb for qemu-devel@nongnu.org; Wed, 15 Jun 2022 10:53:17 -0400 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 25FE26rt000892; Wed, 15 Jun 2022 14:53:04 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2021-07-09; bh=tzSCMG7bNQDetdCrE5pKsy8MNosMSyxzTyUFS4jJchA=; b=e0lSDuIu/5b9ryyOr2oWTqL1xp2+DhhFPY+bxWiRa/1m3DeU0mSaSJKXMfV0uHu5ZxER RKBz2rZH6yGSyyv/hAZVYpvWTsAblr3kqyXW4zhbYr/C56eOkBRDNxtki0emb0TgYZit JxFqXlh+lbQ1lPMoCcr27eqbuD/WlP/I4zc+l2UTCU38YS0PkInfn5kTl9YDFrQTYBbZ lRQeCZRVgVrys1E8DcvVfe16eF8/wToimT1C0Vf5rn8TMCeQc4TTNOpmbgNGtxf1lpPL 6VS1503oUl2BHPWMrWXeEY1f9sBvltyoZZxO6cwYC/WFbG97g8aj3Q65pSrDEUXJaFJx ZQ== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3gmjx9gng6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 15 Jun 2022 14:53:04 +0000 Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.16.1.2/8.16.1.2) with SMTP id 25FEQ6AL023069; Wed, 15 Jun 2022 14:53:03 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com with ESMTP id 3gpr25vq23-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 15 Jun 2022 14:53:03 +0000 Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 25FEqSNf018501; Wed, 15 Jun 2022 14:53:03 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com with ESMTP id 3gpr25vpfm-30; Wed, 15 Jun 2022 14:53:03 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Paolo Bonzini , Stefan Hajnoczi , =?UTF-8?q?Marc-Andr=C3=A9=20Lureau?= , =?UTF-8?q?Alex=20Benn=C3=A9e?= , "Dr. David Alan Gilbert" , "Michael S. Tsirkin" , Marcel Apfelbaum , Alex Williamson , "Daniel P. Berrange" , Juan Quintela , Markus Armbruster , Eric Blake , Jason Zeng , Zheng Chuan , Steve Sistare , Mark Kanda , Guoyi Tu , Peter Maydell , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Igor Mammedov , David Hildenbrand , John Snow Subject: [PATCH V8 29/39] vfio-pci: cpr part 3 (intx) Date: Wed, 15 Jun 2022 07:52:16 -0700 Message-Id: <1655304746-102776-30-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1655304746-102776-1-git-send-email-steven.sistare@oracle.com> References: <1655304746-102776-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-ORIG-GUID: vbCpwc6EDNRzwRbaDAwa9hBd_iOwBkbj X-Proofpoint-GUID: vbCpwc6EDNRzwRbaDAwa9hBd_iOwBkbj Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Preserve vfio INTX state across cpr restart. Preserve VFIOINTx fields as follows: pin : Recover this from the vfio config in kernel space interrupt : Preserve its eventfd descriptor across exec. unmask : Ditto route.irq : This could perhaps be recovered in vfio_pci_post_load by calling pci_device_route_intx_to_irq(pin), whose implementation reads config space for a bridge device such as ich9. However, there is no guarantee that the bridge vmstate is read before vfio vmstate. Rather than fiddling with MigrationPriority for vmstate handlers, explicitly save route.irq in vfio vmstate. pending : save in vfio vmstate. mmap_timeout, mmap_timer : Re-initialize bool kvm_accel : Re-initialize In vfio_realize, defer calling vfio_intx_enable until the vmstate is available, in vfio_pci_post_load. Modify vfio_intx_enable and vfio_intx_kvm_enable to skip vfio initialization, but still perform kvm initialization. Signed-off-by: Steve Sistare --- hw/vfio/pci.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 83 insertions(+), 9 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 2fd7121..b8aee91 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -173,14 +173,45 @@ static void vfio_intx_eoi(VFIODevice *vbasedev) vfio_unmask_single_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX); } +#ifdef CONFIG_KVM +static bool vfio_no_kvm_intx(VFIOPCIDevice *vdev) +{ + return vdev->no_kvm_intx || !kvm_irqfds_enabled() || + vdev->intx.route.mode != PCI_INTX_ENABLED || + !kvm_resamplefds_enabled(); +} +#endif + +static void vfio_intx_reenable_kvm(VFIOPCIDevice *vdev, Error **errp) +{ +#ifdef CONFIG_KVM + if (vfio_no_kvm_intx(vdev)) { + return; + } + + if (vfio_notifier_init(vdev, &vdev->intx.unmask, "intx-unmask", 0)) { + error_setg(errp, "vfio_notifier_init intx-unmask failed"); + return; + } + + if (kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, + &vdev->intx.interrupt, + &vdev->intx.unmask, + vdev->intx.route.irq)) { + error_setg_errno(errp, errno, "failed to setup resample irqfd"); + return; + } + + vdev->intx.kvm_accel = true; +#endif +} + static void vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp) { #ifdef CONFIG_KVM int irq_fd = event_notifier_get_fd(&vdev->intx.interrupt); - if (vdev->no_kvm_intx || !kvm_irqfds_enabled() || - vdev->intx.route.mode != PCI_INTX_ENABLED || - !kvm_resamplefds_enabled()) { + if (vfio_no_kvm_intx(vdev)) { return; } @@ -328,7 +359,13 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) return 0; } - vfio_disable_interrupts(vdev); + /* + * Do not alter interrupt state during vfio_realize and cpr-load. The + * reused flag is cleared thereafter. + */ + if (!vdev->vbasedev.reused) { + vfio_disable_interrupts(vdev); + } vdev->intx.pin = pin - 1; /* Pin A (1) -> irq[0] */ pci_config_set_interrupt_pin(vdev->pdev.config, pin); @@ -353,6 +390,11 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) fd = event_notifier_get_fd(&vdev->intx.interrupt); qemu_set_fd_handler(fd, vfio_intx_interrupt, NULL, vdev); + if (vdev->vbasedev.reused) { + vfio_intx_reenable_kvm(vdev, &err); + goto finish; + } + if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, errp)) { qemu_set_fd_handler(fd, NULL, NULL, vdev); @@ -365,6 +407,7 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) warn_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); } +finish: vdev->interrupt = VFIO_INT_INTx; trace_vfio_intx_enable(vdev->vbasedev.name); @@ -3195,9 +3238,13 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) vfio_intx_routing_notifier); vdev->irqchip_change_notifier.notify = vfio_irqchip_change; kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier); - ret = vfio_intx_enable(vdev, errp); - if (ret) { - goto out_deregister; + + /* Wait until cpr-load reads intx routing data to enable */ + if (!vdev->vbasedev.reused) { + ret = vfio_intx_enable(vdev, errp); + if (ret) { + goto out_deregister; + } } } @@ -3474,6 +3521,7 @@ static int vfio_pci_post_load(void *opaque, int version_id) VFIOPCIDevice *vdev = opaque; PCIDevice *pdev = &vdev->pdev; int nr_vectors; + int ret = 0; if (msix_enabled(pdev)) { msix_set_vector_notifiers(pdev, vfio_msix_vector_use, @@ -3486,10 +3534,35 @@ static int vfio_pci_post_load(void *opaque, int version_id) vfio_claim_vectors(vdev, nr_vectors, false); } else if (vfio_pci_read_config(pdev, PCI_INTERRUPT_PIN, 1)) { - assert(0); /* completed in a subsequent patch */ + Error *err = 0; + ret = vfio_intx_enable(vdev, &err); + if (ret) { + error_report_err(err); + } } - return 0; + return ret; +} + +static const VMStateDescription vfio_intx_vmstate = { + .name = "vfio-intx", + .unmigratable = 1, + .version_id = 0, + .minimum_version_id = 0, + .fields = (VMStateField[]) { + VMSTATE_BOOL(pending, VFIOINTx), + VMSTATE_UINT32(route.mode, VFIOINTx), + VMSTATE_INT32(route.irq, VFIOINTx), + VMSTATE_END_OF_LIST() + } +}; + +#define VMSTATE_VFIO_INTX(_field, _state) { \ + .name = (stringify(_field)), \ + .size = sizeof(VFIOINTx), \ + .vmsd = &vfio_intx_vmstate, \ + .flags = VMS_STRUCT, \ + .offset = vmstate_offset_value(_state, _field, VFIOINTx), \ } static bool vfio_pci_needed(void *opaque) @@ -3509,6 +3582,7 @@ static const VMStateDescription vfio_pci_vmstate = { .fields = (VMStateField[]) { VMSTATE_PCI_DEVICE(pdev, VFIOPCIDevice), VMSTATE_MSIX_TEST(pdev, VFIOPCIDevice, vfio_msix_present), + VMSTATE_VFIO_INTX(intx, VFIOPCIDevice), VMSTATE_END_OF_LIST() } }; -- 1.8.3.1