From: Bjorn Helgaas <helgaas@kernel.org>
To: "Devegowda, Chandrashekar" <chandrashekar.devegowda@intel.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>,
	"linux-bluetooth@vger.kernel.org"
	<linux-bluetooth@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"Srivatsa, Ravishankar" <ravishankar.srivatsa@intel.com>,
	"Tumkur Narayan, Chethan" <chethan.tumkur.narayan@intel.com>,
	"K, Kiran" <kiran.k@intel.com>,
	"Ben Ami, Golan" <golan.ben.ami@intel.com>,
	"Berg, Johannes" <johannes.berg@intel.com>
Subject: Re: [PATCH v1] Bluetooth: btintel_pcie: Support function level reset
Date: Thu, 23 Oct 2025 15:36:03 -0500
Message-ID: <20251023203603.GA1312405@bhelgaas>
In-Reply-To: <IA3PR11MB9016B8132E2E11806B58A0F7FCF0A@IA3PR11MB9016.namprd11.prod.outlook.com>

On Thu, Oct 23, 2025 at 09:42:16AM +0000, Devegowda, Chandrashekar wrote:
> > On Tue, Mar 18, 2025 at 10:55:06AM -0400, Luiz Augusto von Dentz wrote:
> > > On Fri, Mar 14, 2025 at 3:40 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Fri, Mar 14, 2025 at 12:16:13PM +0200, Chandrashekar Devegowda wrote:
> > > > > Support function level reset (flr) on hardware exception to
> > > > > recover controller. Driver also implements the back-off time
> > > > > of 5 seconds and the maximum number of retries are limited
> > > > > to 5 before giving up.
> > > >
> > > > Sort of weird that the commit log mentions FLR, but it's not
> > > > mentioned in the patch itself except for
> > > > BTINTEL_PCIE_FLR_RESET_MAX_RETRY.  Apparently the assumption
> > > > is that DSM_SET_RESET_METHOD_PCIE performs an FLR.
> > > >
> > > > Since this is an ACPI _DSM, presumably this mechanism only
> > > > works for devices built into the platform, not for any
> > > > potential plug-in devices that would not be described via
> > > > ACPI.  I guess this driver probably already only works for
> > > > built-in devices because it also uses DSM_SET_WDISABLE2_DELAY
> > > > and DSM_SET_RESET_METHOD.
> > > >
> > > > There is a generic PCI core way to do FLR (pcie_reset_flr()),
> > > > so I assume the _DSM exists because the device needs some
> > > > additional device-specific work around the FLR.
> > > >
> > > > > +static void btintel_pcie_removal_work(struct work_struct *wk) {
> > > > > +     struct btintel_pcie_removal *removal =
> > > > > +             container_of(wk, struct btintel_pcie_removal, work);
> > > > > +     struct pci_dev *pdev = removal->pdev;
> > > > > +     struct pci_bus *bus;
> > > > > +     struct btintel_pcie_data *data;
> > > > > +
> > > > > +     data = pci_get_drvdata(pdev);
> > > > > +
> > > > > +     pci_lock_rescan_remove();
> > > > > +
> > > > > +     bus = pdev->bus;
> > > > > +     if (!bus)
> > > > > +             goto out;
> > > > > +
> > > > > +     btintel_acpi_reset_method(data->hdev);
> > > > > +     pci_stop_and_remove_bus_device(pdev);
> > > > > +     pci_dev_put(pdev);
> > > > > +
> > > > > +     if (bus->parent)
> > > > > +             bus = bus->parent;
> > > > > +     pci_rescan_bus(bus);
> > > >
> > > > This remove and rescan by a driver that's bound to the device
> > > > subverts the driver model.  pci_stop_and_remove_bus_device()
> > > > detaches the driver from the device.  After the driver is
> > > > detached, we should not be running any driver code.
> > >
> > > Yeah, this self removal was sort of bugging me as well, although
> > > I'm not familiar enough with the pci subsystem, having the
> > > driver remove and continue running code like pci_rescan_bus
> > > seems wrong as we may end up with multiple instances of the same
> > > driver.
> > >
> > > > There are a couple other drivers that remove their own device
> > > > (ath9k, iwlwifi, asus_wmi, eeepc-laptop), but I think those
> > > > are broken and it's a mistake to add this pattern to more
> > > > drivers.
> > > >
> > > > What's the reason for doing the remove and rescan?  The PCI
> > > > core doesn't reset the device when you do this, so it's not a
> > > > "bigger hammer reset".
> > >
> > > I guess it was more of the expectation of Chandru to have a sort
> > > of hard reset, driver remove+probe, instead of a soft reset
> > > where the driver will just need to reinit itself after
> > > performing pcie_reset_flr.
> > 
> > If the object is just to reinitialize the driver, I think this
> > hack of removing and rescanning is a bad way to do it.  If you
> > reset the device, you now know the state of the device and you can
> > make the driver state match it.  If necessary you can always reuse
> > part or all of the .remove() and .probe() methods yourself,
> > without this dance of calling pci_stop_and_remove_bus_device() and
> > pci_rescan_bus().
> 
> I’m sharing insights from our recent work on the PLDR for the BT
> driver. The above method supports FLR effectively, but for PLDR it
> is required to unload Wifi driver before doing PLDR via ACPI method.

IIUC "PLDR" is an ACPI method that does a reset, and you want to reset
a BT device.

> Currently, calling pci_rescan_bus() successfully rebinds both the
> WiFi and BT drivers. This approach follows the method used for the
> WiFi driver, as seen here:
> 
> https://elixir.bootlin.com/linux/v6.18-rc1/source/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/trans.c#L2182

It looks like this is a multi-function device, iwlwifi is bound to
function 0, and btintel is bound to function 1.

Then it looks like iwl_trans_pcie_removal_wk() starts with the wifi
device, finds the corresponding BT device, removes the BT device, runs
PLDR on the wifi device, removes the wifi device, and rescans to find
both devices again:

  iwl_trans_pcie_removal_wk
    wifi = removal->pdev			# WiFi device
    bt = pci_get_slot(...)			# BT device
    pci_stop_and_remove_bus_device(bt)
    iwl_trans_pcie_set_product_reset(wifi)	# do PLDR on WiFi
    pci_stop_and_remove_bus_device(wifi)
    pci_rescan_bus

It seems problematic to me for the WiFi driver to remove the BT
driver.  What if BT was active at the time?  Why is it ok to yank the
rug out from under it?

Why does the BT driver have to be unloaded before resetting the WiFi
device?  Why does the WiFi driver have to be unloaded before resetting
the BT device?

Theoretically, the functions of a PCI multi-function device are
independent and really don't have any influence on each other.

If you had a single driver that claimed both devices, that driver
could coordinate this since it would know about both and could
synchronize their activity when needed.

Maybe you could have a wrapper driver that claims both and delegates
each function to either iwlwifi or btintel as needed?  Since the
wrapper, iwlwifi, and btintel would be linked into a single module,
you could arrange callbacks between them to synchronize and handle
these resets.
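
A rough sketch of that wrapper idea, written as a userspace C model
rather than kernel code.  Every name here (combo_dev, combo_func_ops,
combo_reset, the demo_* helpers) is hypothetical and only illustrates
the callback arrangement; none of it is an existing iwlwifi, btintel,
or PCI core interface, and hw_reset merely stands in for whatever
reset the platform provides:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only.  Every name below is hypothetical; none of it
 * is an existing kernel, iwlwifi, or btintel interface.
 */

/* Callbacks each function driver would hand to the wrapper. */
struct combo_func_ops {
	int (*quiesce)(void *priv);	/* stop activity before the reset */
	int (*reinit)(void *priv);	/* rebuild state after the reset */
};

struct combo_func {
	const struct combo_func_ops *ops;
	void *priv;
};

struct combo_dev {
	struct combo_func wifi;		/* function 0 in this thread */
	struct combo_func bt;		/* function 1 in this thread */
	int (*hw_reset)(struct combo_dev *cd);	/* stands in for the reset */
	unsigned int resets;
};

/*
 * Quiesce both functions, reset once, then reinitialize both.
 * Error unwinding is omitted to keep the sketch short.
 */
int combo_reset(struct combo_dev *cd)
{
	struct combo_func *funcs[] = { &cd->wifi, &cd->bt };
	size_t i, n = sizeof(funcs) / sizeof(funcs[0]);
	int ret;

	assert(cd && cd->hw_reset);

	for (i = 0; i < n; i++) {
		ret = funcs[i]->ops->quiesce(funcs[i]->priv);
		if (ret)
			return ret;
	}

	ret = cd->hw_reset(cd);
	if (ret)
		return ret;
	cd->resets++;

	for (i = 0; i < n; i++) {
		ret = funcs[i]->ops->reinit(funcs[i]->priv);
		if (ret)
			return ret;
	}
	return 0;
}

/* Toy per-function state used to exercise combo_reset(). */
struct demo_state {
	bool running;
	int inits;
};

static int demo_quiesce(void *priv)
{
	struct demo_state *s = priv;

	s->running = false;
	return 0;
}

static int demo_reinit(void *priv)
{
	struct demo_state *s = priv;

	s->running = true;
	s->inits++;
	return 0;
}

/* Refuse to reset while either function is still active. */
static int demo_hw_reset(struct combo_dev *cd)
{
	struct demo_state *w = cd->wifi.priv, *b = cd->bt.priv;

	return (w->running || b->running) ? -1 : 0;
}

static const struct combo_func_ops demo_ops = {
	.quiesce = demo_quiesce,
	.reinit = demo_reinit,
};
```

The point of the model is that neither function is removed: each one
is told to stop, the reset happens once, and each one rebuilds its own
state, so nothing is yanked out from under an active function.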

If you reset the device, you know the state of the device afterward,
and the driver should be able to initialize its own data structures
accordingly.  This should not require any PCI device removal or
rescan.
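
As a minimal illustration of "reset, then make the driver state match
the device", here is a userspace C model.  The structures and
functions (model_hw, model_drv, model_hw_reset, model_drv_recover) are
made up for this sketch and are not btintel_pcie code; model_hw_reset
only stands in for a real reset such as the FLR discussed above:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all names here are hypothetical. */

/* Device-side state that a reset returns to known defaults. */
struct model_hw {
	unsigned int rd_idx;
	unsigned int wr_idx;
	bool enabled;
};

/* Driver-side bookkeeping that must mirror the device. */
struct model_drv {
	struct model_hw *hw;
	unsigned int tx_head;
	unsigned int tx_tail;
	bool up;
	unsigned int recoveries;
};

/* Stands in for a device reset: indices cleared, device disabled. */
static void model_hw_reset(struct model_hw *hw)
{
	hw->rd_idx = 0;
	hw->wr_idx = 0;
	hw->enabled = false;
}

/*
 * Recover in place: stop using the device, reset it, resynchronize
 * the driver's view with the now-known device state, and re-enable.
 * No device removal or bus rescan is involved.
 */
int model_drv_recover(struct model_drv *drv)
{
	assert(drv && drv->hw);

	drv->up = false;		/* stop submitting new work */
	model_hw_reset(drv->hw);

	/* The post-reset device state is known, so mirror it. */
	drv->tx_head = drv->hw->rd_idx;
	drv->tx_tail = drv->hw->wr_idx;

	drv->hw->enabled = true;
	drv->up = true;
	drv->recoveries++;
	return 0;
}
```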

Bjorn
