From: Greg KH <gregkh@linuxfoundation.org>
To: WangYuli <wangyuli@uniontech.com>
Cc: leoliu-oc@zhaoxin.com, dlemoal@kernel.org, arnd@kernel.org,
schnelle@linux.ibm.com, WeitaoWang-oc@zhaoxin.com,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
mathias.nyman@linux.intel.com, ulf.hansson@linaro.org,
vkoul@kernel.org, hslester96@gmail.com,
Carsten_Schmid@mentor.com, efremov@linux.com,
tonywwang@zhaoxin.com, weitaowang@zhaoxin.com,
CobeChen@zhaoxin.com, TimGuo@zhaoxin.com, wwt8723@163.com,
stern@rowland.harvard.edu, alex.williamson@redhat.com,
guanwentao@uniontech.com, xuerpeng@uniontech.com
Subject: Re: [PATCH v2] USB: Fix kernel NULL pointer when unbind UHCI form vfio-pci
Date: Mon, 26 Aug 2024 11:30:14 +0200 [thread overview]
Message-ID: <2024082631-resort-stays-b065@gregkh> (raw)
In-Reply-To: <42A38D045199FD79+20240826085455.1525536-1-wangyuli@uniontech.com>
On Mon, Aug 26, 2024 at 04:54:55PM +0800, WangYuli wrote:
> From: leoliu-oc <leoliu-oc@zhaoxin.com>
>
> This bug is found in Zhaoxin platform, but it's a commom code bug.
To be fair, this is not a normal "common" code path at all :)
>
> Fail sequence:
> step1: Unbind UHCI controller from native driver;
First off, you all know this is really an "unsupported" thing to do. I
love it how the vfio people abuse this interface for their main code
path, but remember that is NEVER what it was designed for at all. The
fact that it could possibly work at all is a miracle and everyone gets
lucky if nothing dies when they attempt to manually do the gyrations you
are doing here.
> step2: Bind UHCI controller to vfio-pci, which will put UHCI controller in
> one vfio group's device list and set UHCI's dev->driver_data to
> struct vfio-pci(for UHCI)
Who sets the driver_data here?
> step3: Unbind EHCI controller from native driver, will try to tell UHCI
> native driver that "I'm removed by set
> companion_hcd->self.hs_companion to NULL. However, companion_hcd
> get from UHCI's dev->driver_data that has modified by vfio-pci
> already. So, the vfio-pci structure will be damaged!
Damaged how? Attempting to assign random PCI drivers to the vfio-pci
driver is again, really really not supported (despite what the vfio
authors think), so again, it's amazing this works, as you are finding
out.
> step4: Bind EHCI controller to vfio-pci driver, which will put EHCI
> controller in the same vfio group as UHCI controller;
> ... ...
> step5: Unbind UHCI controller from vfio-pci, which will delete UHCI from
> vfio group device list that has been damaged in step 3. So, delete
> operation can random result into a NULL pointer dereference with
> the below stack dump.
> step6: Bind UHCI controller to native driver;
> step7: Unbind EHCI controller from vfio-pci, which will try to remove EHCI
> controller from the vfio group;
> step8: Bind EHCI controller to native driver;
That's crazy, why would you be doing all of that in the first place? Is
it common to add host controller drivers to virtual machines like this?
Why not just use usbip instead?
> [ 929.114641] uhci_hcd 0000:00:10.0: remove, state 1
> [ 929.114652] usb usb1: USB disconnect, device number 1
> [ 929.114655] usb 1-1: USB disconnect, device number 2
> [ 929.270313] usb 1-2: USB disconnect, device number 3
> [ 929.318404] uhci_hcd 0000:00:10.0: USB bus 1 deregistered
> [ 929.343029] uhci_hcd 0000:00:10.1: remove, state 4
> [ 929.343045] usb usb3: USB disconnect, device number 1
> [ 929.343685] uhci_hcd 0000:00:10.1: USB bus 3 deregistered
> [ 929.369087] ehci-pci 0000:00:10.7: remove, state 4
> [ 929.369102] usb usb4: USB disconnect, device number 1
> [ 929.370325] ehci-pci 0000:00:10.7: USB bus 4 deregistered
> [ 932.398494] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [ 932.398496] PGD 42a67d067 P4D 42a67d067 PUD 42a65f067 PMD 0
> [ 932.398502] Oops: 0002 [#2] SMP NOPTI
> [ 932.398505] CPU: 2 PID: 7824 Comm: vfio_unbind.sh Tainted: P D 4.19.65-2020051917-rainos #1
Note, this is a very old kernel, and one that has closed source in it
making it such that none of us can debug it at all.
And are you sure this happens on 6.10?
> diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
> index a08f3f228e6d..5a63d7a772ae 100644
> --- a/drivers/usb/core/hcd-pci.c
> +++ b/drivers/usb/core/hcd-pci.c
> @@ -48,6 +48,7 @@ static void for_each_companion(struct pci_dev *pdev, struct usb_hcd *hcd,
> struct pci_dev *companion;
> struct usb_hcd *companion_hcd;
> unsigned int slot = PCI_SLOT(pdev->devfn);
> + struct pci_driver *drv;
>
> /*
> * Iterate through other PCI functions in the same slot.
> @@ -60,6 +61,13 @@ static void for_each_companion(struct pci_dev *pdev, struct usb_hcd *hcd,
> PCI_SLOT(companion->devfn) != slot)
> continue;
>
> + drv = companion->driver;
> + if (drv &&
> + strncmp(drv->name, "uhci_hcd", sizeof("uhci_hcd") - 1) &&
> + strncmp(drv->name, "ohci-pci", sizeof("ohci-pci") - 1) &&
> + strncmp(drv->name, "ehci-pci", sizeof("ehci-pci") - 1))
Attempting to rely on kernel module names within kernel code just is not
going to work, sorry.
What exactly are you trying to do here? Please at least comment it.
> + continue;
Do you just want to fail the binding? If so, why? And what is
precenting a driver to be bound after you do the check?
> +
> /*
> * Companion device should be either UHCI,OHCI or EHCI host
> * controller, otherwise skip.
Why doesn't this check suffice?
thanks,
greg k-h
next prev parent reply other threads:[~2024-08-26 9:30 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-26 8:54 [PATCH v2] USB: Fix kernel NULL pointer when unbind UHCI form vfio-pci WangYuli
2024-08-26 9:30 ` Greg KH [this message]
2024-08-26 15:42 ` Alan Stern
2024-08-26 9:30 ` Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2024082631-resort-stays-b065@gregkh \
--to=gregkh@linuxfoundation.org \
--cc=Carsten_Schmid@mentor.com \
--cc=CobeChen@zhaoxin.com \
--cc=TimGuo@zhaoxin.com \
--cc=WeitaoWang-oc@zhaoxin.com \
--cc=alex.williamson@redhat.com \
--cc=arnd@kernel.org \
--cc=dlemoal@kernel.org \
--cc=efremov@linux.com \
--cc=guanwentao@uniontech.com \
--cc=hslester96@gmail.com \
--cc=leoliu-oc@zhaoxin.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-usb@vger.kernel.org \
--cc=mathias.nyman@linux.intel.com \
--cc=schnelle@linux.ibm.com \
--cc=stern@rowland.harvard.edu \
--cc=tonywwang@zhaoxin.com \
--cc=ulf.hansson@linaro.org \
--cc=vkoul@kernel.org \
--cc=wangyuli@uniontech.com \
--cc=weitaowang@zhaoxin.com \
--cc=wwt8723@163.com \
--cc=xuerpeng@uniontech.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox