From: "Wassenberg, Dennis" <Dennis.Wassenberg@secunet.com>
To: "ilpo.jarvinen@linux.intel.com" <ilpo.jarvinen@linux.intel.com>
Cc: "kbusch@kernel.org" <kbusch@kernel.org>,
"mika.westerberg@linux.intel.com"
<mika.westerberg@linux.intel.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"mpearson-lenovo@squebb.ca" <mpearson-lenovo@squebb.ca>,
"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
"minipli@grsecurity.net" <minipli@grsecurity.net>,
"lukas@wunner.de" <lukas@wunner.de>
Subject: Re: UAF during boot on MTL based devices with attached dock
Date: Mon, 7 Oct 2024 16:34:27 +0000
Message-ID: <67e3fc901fcb76a5386cb6378be4381c94039670.camel@secunet.com>
In-Reply-To: <289bcd4d-099a-810b-0854-b11223f50a9c@linux.intel.com>
Hi,
On Thu, 2024-09-26 at 16:58 +0300, Ilpo Järvinen wrote:
> On Wed, 25 Sep 2024, Wassenberg, Dennis wrote:
> > On Tue, 2024-09-24 at 13:51 +0300, Ilpo Järvinen wrote:
> > > On Mon, 23 Sep 2024, Wassenberg, Dennis wrote:
> > >
> > > > Hi together,
> > > >
> > > > we did some further analysis on this:
> > > >
> > > > Because we are working on kernel 6.8.12, I will use some logs from this
> > > > kernel version, just for demonstration. The initial report was based on
> > > > 6.11.
> > > >
> > > > After we tried a KASAN build (dmesg-ramoops-kasan), it looks like it is
> > > > exactly the same pciehp flow which leads to the UAF. Both go through
> > > > pciehp_ist -> pciehp_disable_slot -> pciehp_unconfigure_device ->
> > > > pci_remove_bus_device -> ...
> > > > This means there are two consecutive interrupts, running on CPU 12, and
> > > > both will execute the same flow. At the latest, pci_lock_rescan_remove
> > > > should be taken in pciehp_unconfigure_device to prevent accessing the
> > > > pci/bus structures in parallel.
> > > >
> > > > I had a look at whether shared data structures are accessed in this code
> > > > path. To me, the access to "*parent = ctrl->pcie->port->subordinate;" in
> > > > pciehp_unconfigure_device looks fishy. The parent pointer is obtained
> > > > before taking the lock (pci_lock_rescan_remove). Now, if two
> > > > concurrent/consecutive flows come into this function, both will get the
> > > > pointer to the parent bridge/subordinate. One thread will take the lock
> > > > and the other one waits until the lock is released. The thread which
> > > > takes the lock first will completely remove the bridge and the
> > > > subordinate: pciehp_unconfigure_device -> pci_stop_and_remove_bus_device
> > > > -> pci_remove_bus_device -> pci_destroy_dev. This will destroy the
> > > > pci_dev, and the subordinate is a part of this structure as well. Now
> > > > everything below this pci_bus is gone (children included). In
> > > > pci_remove_bus_device there is a loop which iterates over all child
> > > > devices and calls pci_remove_bus_device again. This means even the child
> > > > bridges of the current bridge will be deleted.
> > > > In the end: everything below the bridge we started from is gone.
> > >
> > > Doesn't that end up removing portdrv/hotplug too so pciehp_remove() does
> > > release ctrl? I'm not sure if ctrl can be safely accessed even if the
> > > lock is taken first?
> >
> > Yes, it looks like it ends up removing portdrv/hotplug too. I am not sure if
> > this can be safely accessed. For testing I added "set_service_data(dev, NULL);"
> > at the end of pciehp_remove. This should make sure that it is not possible to
> > access the freed ctrl. If there is a flow which accesses it, it should result
> > in a null-ptr dereference instead of a UAF. I did some runs with this change
> > but I always ran into the UAF.
>
> Okay, perhaps it doesn't occur for some reason. I suppose the reason is
> that the concurrent pciehp_ist() waits for the lock in
> pciehp_unconfigure_device() and since it has not yet returned,
> free_irq() is what keeps the hotplug & ctrl from getting removed.
> So it seems to me your change is fine.
>
> > For me it looks more related to the slot object. If I compare two runs (one
> > with dyndbg enabled for pci and one without), the dyndbg-enabled run accesses
> > the failing address in the __dynamic_dev_dbg portion of pci_destroy_slot. The
> > non-dyndbg run fails while accessing "kobject_put(&slot->kobj);" in
> > pci_destroy_slot.
>
> The first error is
>
> <3>[ 10.244423] BUG: KASAN: slab-use-after-free in pci_slot_release+0x36e/0x3e0
>
> so how did you infer it occurs in pci_destroy_slot()?
Ok, yes. You are right. I was blinded by dev_dbg in pci_destroy_slot ;)
>
> > Unfortunately I currently have no clue how this slot object could ever
> > have been destroyed prematurely.
>
> There are dev_dbg()s on the paths that lead to destruction of the slot
> object. I don't see any of those lines in your logs so I don't believe
> that has occurred here.
>
> > I attach the logs of both runs. I know, one is based on another kernel
> > version, but there it is easier to reproduce with KASAN enabled.
>
> What in these logs indicates to you that it is the slot access which fails?
> To me it looks like in both cases the access to ->bus is the culprit (it also
> explains why dyndbg on/off matters, because pci_destroy_slot() will not access
> ->bus otherwise, so it can get all the way into pci_slot_release() before
> blowing up).
>
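For reference, the ordering concern raised in the quoted analysis (reading the subordinate pointer before taking the rescan/remove lock) can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code and not a proposed fix: toy_bus, toy_port, toy_rescan_remove_lock, toy_attach_bus and toy_unconfigure are hypothetical names, a pthread mutex stands in for pci_lock_rescan_remove(), and the port object is assumed to outlive both calls (which, per the discussion above, is not guaranteed for ctrl in the real driver).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct toy_bus { int nr_devices; };
struct toy_port { struct toy_bus *subordinate; };

/* Plays the role of pci_lock_rescan_remove()/pci_unlock_rescan_remove(). */
static pthread_mutex_t toy_rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Attach a freshly allocated child bus to the port. */
static void toy_attach_bus(struct toy_port *port, int nr_devices)
{
    struct toy_bus *bus = malloc(sizeof(*bus));

    assert(bus != NULL);
    bus->nr_devices = nr_devices;
    port->subordinate = bus;
}

/*
 * Tear down everything below the port. Returns the number of devices that
 * were removed, or 0 if the bus was already gone.
 *
 * The ordering questioned in the thread would read port->subordinate
 * *before* taking the lock; a second caller would then wake up holding a
 * pointer that the first caller has already freed. Here the pointer is
 * only read while the lock is held.
 */
static int toy_unconfigure(struct toy_port *port)
{
    int removed = 0;

    pthread_mutex_lock(&toy_rescan_remove_lock);
    struct toy_bus *parent = port->subordinate; /* read only under the lock */
    if (parent) {
        removed = parent->nr_devices;
        free(parent);
        port->subordinate = NULL; /* later callers see "already gone" */
    }
    pthread_mutex_unlock(&toy_rescan_remove_lock);

    return removed;
}
```

Calling toy_unconfigure() twice in a row on the same port models the two consecutive pciehp_ist runs: the first call frees the bus, the second finds a NULL pointer and does nothing. Whether the real UAF is caused by this ordering is not established here; as noted above, the KASAN report points at the ->bus access in pci_slot_release().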