From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D660C433F5 for ; Tue, 17 May 2022 18:27:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348374AbiEQS1S (ORCPT ); Tue, 17 May 2022 14:27:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231199AbiEQS1S (ORCPT ); Tue, 17 May 2022 14:27:18 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id BC2A63878F for ; Tue, 17 May 2022 11:27:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1652812035; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=808kPoCVZ9QBXQQ50NY9PebUEFZ1B+K2Bsp/en8vgGY=; b=hIpYi/sGN2DecSuoV+aN+/5uy97WasBMnQL/n4kbh3XyBo3J2E4ufENMgixlxA7jfbwPyq NSG5GKRT4CCuW2QxV837+nuhLlAjJdFkVPD56gWrlx0tybTB4tsQsuIPvEc3mpfEifh+1m 79SrGJrgyTzkl3S1Q8OSGSJEmuOjDy0= Received: from mail-il1-f197.google.com (mail-il1-f197.google.com [209.85.166.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-445-Dg_ySqA8OQmIcXuk0pISGQ-1; Tue, 17 May 2022 14:27:14 -0400 X-MC-Unique: Dg_ySqA8OQmIcXuk0pISGQ-1 Received: by mail-il1-f197.google.com with SMTP id s6-20020a056e021a0600b002d0fe2b6d2fso5466723ild.12 for ; Tue, 17 May 2022 11:27:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:organization:mime-version:content-transfer-encoding; bh=808kPoCVZ9QBXQQ50NY9PebUEFZ1B+K2Bsp/en8vgGY=; b=2sDZ/XKCWvEQBmfE3wkB9bPTCK6O6eEIoeVRAH7a2FNKdY3MAYgb4oV1E+doiH/IPx AHeIs/utkMEIRBML2AFe0RH2R8mz+TXvqNyWpSaadIDRXdRH7drxK+lUauy6CIVdSqhD JbXH23ZQwqwDI6yvSCbFcXBe9dx42TxmkGkTX54gg2DWWIEVCJzeFd0qqSKXh8Q69ib5 eipHd4Vv5qWgzS3Ua4aJvDwUpNG3Uf4JGc6pf6x7kJ2xP2x8uPjIFUMIvLg3lnwIGmes 8VTzdoqrHUDgUd79QZtmvLMFTNj/6XvvZb199IXsPqpvU4RoaLFJaHkZmTUJC25iayLQ +hog== X-Gm-Message-State: AOAM532Wz644eVAfGp/h7WFAKxx0z04LlKwtQgBbT7wXfxDXpDtzD++/ FELNOaC1QhQr8Ffe0g3M1duSjKQudNF2iBMV3wKceNQDkUxwCBImJ6+4YKnBmIGOgXR6GDlgO7f +MeUoEUPQsArHgwiA16k= X-Received: by 2002:a05:6638:140d:b0:32b:c643:e334 with SMTP id k13-20020a056638140d00b0032bc643e334mr13045255jad.125.1652812033712; Tue, 17 May 2022 11:27:13 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwXPnig6pt33JX7CJmiZk9MZj+x43/FCA6lZpbkCX+oaIPq/UYqAVOL24lvn/cJscVyWc3kyA== X-Received: by 2002:a05:6638:140d:b0:32b:c643:e334 with SMTP id k13-20020a056638140d00b0032bc643e334mr13045242jad.125.1652812033478; Tue, 17 May 2022 11:27:13 -0700 (PDT) Received: from redhat.com ([38.15.36.239]) by smtp.gmail.com with ESMTPSA id r4-20020a02c844000000b0032e2dce10aesm1885083jao.160.2022.05.17.11.27.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 May 2022 11:27:12 -0700 (PDT) Date: Tue, 17 May 2022 12:27:10 -0600 From: Alex Williamson To: Abhishek Sahu Cc: Cornelia Huck , Yishai Hadas , Jason Gunthorpe , Shameer Kolothum , Kevin Tian , "Rafael J . Wysocki" , Max Gurtovoy , Bjorn Helgaas , , , , Subject: Re: [PATCH v4 2/4] vfio/pci: Change the PF power state to D0 before enabling VFs Message-ID: <20220517122710.093c9c19.alex.williamson@redhat.com> In-Reply-To: <20220517100219.15146-3-abhsahu@nvidia.com> References: <20220517100219.15146-1-abhsahu@nvidia.com> <20220517100219.15146-3-abhsahu@nvidia.com> Organization: Red Hat MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org On Tue, 17 May 2022 15:32:17 +0530 Abhishek Sahu wrote: > According to [PCIe v5 9.6.2] for PF Device Power Management States > > "The PF's power management state (D-state) has global impact on its > associated VFs. If a VF does not implement the Power Management > Capability, then it behaves as if it is in an equivalent > power state of its associated PF. > > If a VF implements the Power Management Capability, the Device behavior > is undefined if the PF is placed in a lower power state than the VF. > Software should avoid this situation by placing all VFs in lower power > state before lowering their associated PF's power state." > > From the vfio driver side, user can enable SR-IOV when the PF is in D3hot > state. If VF does not implement the Power Management Capability, then > the VF will be actually in D3hot state and then the VF BAR access will > fail. If VF implements the Power Management Capability, then VF will > assume that its current power state is D0 when the PF is D3hot and > in this case, the behavior is undefined. > > To support PF power management, we need to create power management > dependency between PF and its VF's. The runtime power management support > may help with this where power management dependencies are supported > through device links. But till we have such support in place, we can > disallow the PF to go into low power state, if PF has VF enabled. > There can be a case, where user first enables the VF's and then > disables the VF's. If there is no user of PF, then the PF can put into > D3hot state again. But with this patch, the PF will still be in D0 > state after disabling VF's since detecting this case inside > vfio_pci_core_sriov_configure() requires access to > struct vfio_device::open_count along with its locks. But the subsequent > patches related to runtime PM will handle this case since runtime PM > maintains its own usage count. > > Also, vfio_pci_core_sriov_configure() can be called at any time > (with and without vfio pci device user), so the power state change > needs to be protected with the required locks. > > Signed-off-by: Abhishek Sahu > --- > drivers/vfio/pci/vfio_pci_core.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > index b9f222ca48cf..4fe9a4efc751 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -217,6 +217,10 @@ int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t stat > bool needs_restore = false, needs_save = false; > int ret; > > + /* Prevent changing power state for PFs with VFs enabled */ > + if (pci_num_vf(pdev) && state > PCI_D0) > + return -EBUSY; > + > if (vdev->needs_pm_restore) { > if (pdev->current_state < PCI_D3hot && state >= PCI_D3hot) { > pci_save_state(pdev); > @@ -1960,6 +1964,13 @@ int vfio_pci_core_sriov_configure(struct vfio_pci_core_device *vdev, > } > list_add_tail(&vdev->sriov_pfs_item, &vfio_pci_sriov_pfs); > mutex_unlock(&vfio_pci_sriov_pfs_mutex); > + > + /* > + * The PF power state should always be higher than the VF power > + * state. If PF is in the low power state, then change the > + * power state to D0 first before enabling SR-IOV. > + */ > + vfio_pci_lock_and_set_power_state(vdev, PCI_D0); But we need to hold memory_lock across the next function or else userspace could race a write to the PM register to set D3 before pci_num_vf() can protect us. Thanks, Alex > ret = pci_enable_sriov(pdev, nr_virtfn); > if (ret) > goto out_del;