From: Bjorn Helgaas <helgaas@kernel.org>
To: Ding Hui <dinghui@sangfor.com.cn>
Cc: bhelgaas@google.com, sathyanarayanan.kuppuswamy@linux.intel.com,
vidyas@nvidia.com, david.e.box@linux.intel.com,
kai.heng.feng@canonical.com, michael.a.bottini@linux.intel.com,
rajatja@google.com, qinzongquan@sangfor.com.cn,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PCI/ASPM: fix UAF by removing cached downstream
Date: Mon, 1 May 2023 11:52:54 -0500 [thread overview]
Message-ID: <20230501165254.GA589004@bhelgaas> (raw)
In-Reply-To: <03b45702-d799-f299-1c24-4e5e2e2897d2@sangfor.com.cn>
On Mon, May 01, 2023 at 11:50:50AM +0800, Ding Hui wrote:
> On 2023/5/1 10:10, Bjorn Helgaas wrote:
> > On Sat, Apr 29, 2023 at 09:26:04PM +0800, Ding Hui wrote:
> > > If the function 0 of a multifunction device is removed, an freed
> > > downstream pointer will be left in struct pcie_link_state, and then
> > > when pcie_config_aspm_link() be invoked from any path, we will get a
> > > KASAN use-after-free report.
> >
> > Thanks for finding this problem, debugging it, and the patch!
> >
> > In this case we're doing a "software remove" and the other functions
> > are still present, right? It's kind of annoying that there's only one
> > link, but all the functions of a multifunction device have a Link
> > Control register, and the spec "recommends" that software program the
> > same ASPM control value for all the functions.
>
> Yes, that is the case.
>
> > The hardware of course doesn't know anything about this software
> > remove; all the functions are still physically present and powered up.
> >
> > That makes me think that if software ignores the "removed" function
> > and continues to operate ASPM on the N-1 remaining functions, we're
> > outside the spec recommendations because the ASPM configuration is no
> > longer the same across all the functions.
> >
> > So my inclination would be disable ASPM completely when any function
> > of a multi-function device is removed. What are your thoughts on
> > this?
>
> Agree with you.
>
> Previously, I thought another fix that was if the function 0 is removed,
> we can free the link state to disable ASPM for this link.
>
> Now following you suggestion, it can be expanded to any child function.
>
> How about fixing like this?
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 66d7514ca111..657e0647d19f 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1011,12 +1011,11 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
> down_read(&pci_bus_sem);
> mutex_lock(&aspm_lock);
> /*
> - * All PCIe functions are in one slot, remove one function will remove
> - * the whole slot, so just wait until we are the last function left.
> + * All PCIe functions are in one slot.
> + * The spec "recommends" that software program set the same ASPM control
> + * value for all the functions.
> + * Disable ASPM when any child function is removed.
Since we're updating the comment anyway, let's clean up the "slot"
language here. The PCIe spec doesn't use "slot" in the context of the
bus/device/function PCIe topology; it only uses it when referring to a
physical connector where a card might be installed. What we want here
is "Device," and then we have to consider whether ARI makes any
difference here.
The spec says (referring to ASPM Control):
For Multi-Function Devices (including ARI Devices), it is
recommended that software program the same value for this field in
all Functions. For non-ARI Multi-Function Devices, only capabilities
enabled in all Functions are enabled for the component as a whole.
For ARI Devices, ASPM Control is determined solely by the setting in
Function 0, regardless of Function 0’s D-state. The settings in the
other Functions always return whatever value software programmed for
each, but otherwise are ignored by the component.
A spec reference, e.g., "PCIe r6.0, sec 7.5.3.7", would be good here.
Anyway, I think the idea of "software removing" a single function is
kind of a niche situation that we don't need to worry about
optimizing, and I think turning off ASPM completely will avoid a lot
of weird corner cases.
> */
> - if (!list_empty(&parent->subordinate->devices))
> - goto out;
> -
> link = parent->link_state;
> root = link->root;
> parent_link = link->parent;
>
>
> --
> Thanks,
> -dinghui
>
next prev parent reply other threads:[~2023-05-01 16:53 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-29 13:26 [PATCH] PCI/ASPM: fix UAF by removing cached downstream Ding Hui
2023-04-29 14:58 ` Ding Hui
2023-05-01 2:10 ` Bjorn Helgaas
2023-05-01 3:50 ` Ding Hui
2023-05-01 16:52 ` Bjorn Helgaas [this message]
2023-05-03 13:31 ` Ding Hui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230501165254.GA589004@bhelgaas \
--to=helgaas@kernel.org \
--cc=bhelgaas@google.com \
--cc=david.e.box@linux.intel.com \
--cc=dinghui@sangfor.com.cn \
--cc=kai.heng.feng@canonical.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=michael.a.bottini@linux.intel.com \
--cc=qinzongquan@sangfor.com.cn \
--cc=rajatja@google.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=vidyas@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox