From: Bjorn Helgaas <helgaas@kernel.org>
To: Manivannan Sadhasivam <mani@kernel.org>
Cc: "Baochen Qiang" <baochen.qiang@oss.qualcomm.com>,
manivannan.sadhasivam@oss.qualcomm.com,
"Jeff Johnson" <jjohnson@kernel.org>,
"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
"Rob Herring" <robh@kernel.org>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Nirmal Patel" <nirmal.patel@linux.intel.com>,
"Jonathan Derrick" <jonathan.derrick@linux.dev>,
linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org,
ath12k@lists.infradead.org, ath11k@lists.infradead.org,
ath10k@lists.infradead.org, ilpo.jarvinen@linux.intel.com,
linux-arm-msm@vger.kernel.org, linux-pci@vger.kernel.org,
"Krishna Chaitanya Chundru" <krishna.chundru@oss.qualcomm.com>,
"Qiang Yu" <qiang.yu@oss.qualcomm.com>
Subject: Re: [PATCH 4/6] wifi: ath12k: Use pci_{enable/disable}_link_state() APIs to enable/disable ASPM states
Date: Fri, 18 Jul 2025 11:26:00 -0500 [thread overview]
Message-ID: <20250718162600.GA2700538@bhelgaas> (raw)
In-Reply-To: <wkapzhyr6hzp5az4jae3y5c77c3fg4uwrmyyipbq4uosamcivq@z7nv6w7nbyrp>
On Fri, Jul 18, 2025 at 05:19:28PM +0530, Manivannan Sadhasivam wrote:
> On Fri, Jul 18, 2025 at 07:05:03PM GMT, Baochen Qiang wrote:
> > On 7/18/2025 6:20 PM, Manivannan Sadhasivam wrote:
> > > On Fri, Jul 18, 2025 at 01:27:27PM GMT, Manivannan Sadhasivam wrote:
> > >> On Fri, Jul 18, 2025 at 10:05:02AM GMT, Baochen Qiang wrote:
> > >>> On 7/17/2025 7:29 PM, Manivannan Sadhasivam wrote:
> > >>>> On Thu, Jul 17, 2025 at 06:46:12PM GMT, Baochen Qiang wrote:
> > >>>>> On 7/17/2025 6:31 PM, Manivannan Sadhasivam wrote:
> > >>>>>> On Thu, Jul 17, 2025 at 05:24:13PM GMT, Baochen Qiang wrote:
> > >>>>>>
> > >>>>>> [...]
> > >>>>>>
> > >>>>>>>> @@ -16,6 +16,8 @@
> > >>>>>>>> #include "mhi.h"
> > >>>>>>>> #include "debug.h"
> > >>>>>>>>
> > >>>>>>>> +#include "../ath.h"
> > >>>>>>>> +
> > >>>>>>>> #define ATH12K_PCI_BAR_NUM 0
> > >>>>>>>> #define ATH12K_PCI_DMA_MASK 36
> > >>>>>>>>
> > >>>>>>>> @@ -928,8 +930,7 @@ static void ath12k_pci_aspm_disable(struct ath12k_pci *ab_pci)
> > >>>>>>>> u16_get_bits(ab_pci->link_ctl, PCI_EXP_LNKCTL_ASPM_L1));
> > >>>>>>>>
> > >>>>>>>> /* disable L0s and L1 */
> > >>>>>>>> - pcie_capability_clear_word(ab_pci->pdev, PCI_EXP_LNKCTL,
> > >>>>>>>> - PCI_EXP_LNKCTL_ASPMC);
> > >>>>>>>> + pci_disable_link_state(ab_pci->pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
> > >>>>>>>
> > >>>>>>> Not always, but sometimes seems the 'disable' does not work:
> > >>>>>>>
> > >>>>>>> [ 279.920507] ath12k_pci_power_up 1475: link_ctl 0x43 //before disable
> > >>>>>>> [ 279.920539] ath12k_pci_power_up 1482: link_ctl 0x43 //after disable
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> set_bit(ATH12K_PCI_ASPM_RESTORE, &ab_pci->flags);
> > >>>>>>>> }
> > >>>>>>>> @@ -958,10 +959,7 @@ static void ath12k_pci_aspm_restore(struct ath12k_pci *ab_pci)
> > >>>>>>>> {
> > >>>>>>>> if (ab_pci->ab->hw_params->supports_aspm &&
> > >>>>>>>> test_and_clear_bit(ATH12K_PCI_ASPM_RESTORE, &ab_pci->flags))
> > >>>>>>>> - pcie_capability_clear_and_set_word(ab_pci->pdev, PCI_EXP_LNKCTL,
> > >>>>>>>> - PCI_EXP_LNKCTL_ASPMC,
> > >>>>>>>> - ab_pci->link_ctl &
> > >>>>>>>> - PCI_EXP_LNKCTL_ASPMC);
> > >>>>>>>> + pci_enable_link_state(ab_pci->pdev, ath_pci_aspm_state(ab_pci->link_ctl));
> > >>>>>>>
> > >>>>>>> always, the 'enable' is not working:
> > >>>>>>>
> > >>>>>>> [ 280.561762] ath12k_pci_start 1180: link_ctl 0x43 //before restore
> > >>>>>>> [ 280.561809] ath12k_pci_start 1185: link_ctl 0x42 //after restore
> > >>>>>>>
> > >>>>>>
> > >>>>>> Interesting! I applied your diff and I never see this issue so far (across 10+
> > >>>>>> reboots):
> > >>>>>
> > >>>>> I was not testing reboot. Here is what I am doing:
> > >>>>>
> > >>>>> step1: rmmod ath12k
> > >>>>> step2: force LinkCtrl using setpci (make sure it is 0x43, which seems more likely to see
> > >>>>> the issue)
> > >>>>>
> > >>>>> sudo setpci -s 02:00.0 0x80.B=0x43
> > >>>>>
> > >>>>> step3: insmod ath12k and check linkctrl
> > >>>>>
> > >>>>
> > >>>> So I did the same and got:
> > >>>>
> > >>>> [ 3283.363569] ath12k_pci_power_up 1475: link_ctl 0x43
> > >>>> [ 3283.363769] ath12k_pci_power_up 1480: link_ctl 0x40
> > >>>> [ 3284.007661] ath12k_pci_start 1180: link_ctl 0x40
> > >>>> [ 3284.007826] ath12k_pci_start 1185: link_ctl 0x42
> > >>>>
> > >>>> My host machine is Qcom based Thinkpad T14s and it doesn't
> > >>>> support L0s. So that's why the lnkctl value once enabled
> > >>>> becomes 0x42. This is exactly the reason why the drivers
> > >>>> should not muck around LNKCTL register manually.
> > >>>
> > >>> Thanks, then the 0x43 -> 0x40 -> 0x40 -> 0x42 sequence should
> > >>> not be a concern. But still the random 0x43 -> 0x43 -> 0x43 ->
> > >>> 0x42 sequence seems problematic.
> > >>>
> > >>> How many iterations have you done with above steps? From my
> > >>> side it seems random so better to do some stress test.
> > >>>
> > >>
> > >> So I ran the modprobe for about 50 times on the Intel NUC that
> > >> has QCA6390, but didn't spot the disparity. This is the script
> > >> I used:
> > >>
> > >> for i in {1..50} ;do echo "Loop $i"; sudo setpci -s 01:00.0 0x80.B=0x43;\
> > >> sudo modprobe -r ath11k_pci; sleep 1; sudo modprobe ath11k_pci; sleep 1;done
> > >>
> > >> And I always got:
> > >>
> > >> [ 5862.388083] ath11k_pci_aspm_disable: 609 lnkctrl: 0x43
> > >> [ 5862.388124] ath11k_pci_aspm_disable: 614 lnkctrl: 0x40
> > >> [ 5862.876291] ath11k_pci_start: 880 lnkctrl: 0x40
> > >> [ 5862.876346] ath11k_pci_start: 886 lnkctrl: 0x42
> > >>
> > >> Also no AER messages. TBH, I'm not sure how you were able to
> > >> see the random issues with these APIs. That looks like a race,
> > >> which is scary.
> > >>
> > >> I do not want to ignore your scenario, but would like to
> > >> reproduce and get to the bottom of it.
> > >
> > > I synced with Baochen internally and able to repro the issue.
> > > Ths issue is due to hand modifying the LNKCTL register from
> > > userspace. The PCI core maintains the ASPM state internally and
> > > uses it to change the state when the
> > > pci_{enable/disable}_link_state*() APIs are called.
> > >
> > > So if the userspace or a client driver modifies the LNKCTL
> > > register manually, it makes the PCI cached ASPM states invalid.
> > > So while this series fixes the driver from doing that, nothing
> > > prevents userspace from doing so using 'setpci' and other tools.
> > > Userspace should only use sysfs attributes to change the state
> > > and avoid modifying the PCI registers when the PCI core is
> > > controlling the device. So this is the reason behind the
> > > errantic behavior of the API and it is not due to the issue with
> > > the API or the PCI core.
> >
> > IMO we can not rely on userspace doing what or not doing what, or
> > on how it is doing, right? So can we fix PCI core to avoid this?
>
> I'm not sure it is possible to *fix* the PCI core here. Since the
> PCI core gives userspace access to the entire config space of the
> device, the userspace reads/writes to any of the registers it want.
> So unless the config space access if forbidden if a driver is bound
> to the device, it is inevitable. And then there is also /dev/mem...
>
> Interestingly, there is an API available for this purpose:
> pci_request_config_region_exclusive(), but it is used only by the
> AMD arch driver to prevent userspace from writing to the entire
> config space of the device.
>
> Maybe it makes sense to use something like this to prevent the
> userspace access to the entire config space if the driver is bind to
> the device.
I'm not really a fan of pci_request_config_region_exclusive() because
it's such a singleton thing. I don't like to be one of only a few
users of an interface.
Linux has a long tradition of allowing root users to shoot themselves
in the foot, and setpci is very useful as a debugging tool. Maybe
tainting the kernel for config writes from userspace, and possibly
even a WARN_ONCE() at the time, would be a compromise.
Bjorn
next prev parent reply other threads:[~2025-07-18 16:26 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-16 12:56 [PATCH 0/6] PCI/ASPM: Fix pci_enable_link_state*() APIs behavior Manivannan Sadhasivam via B4 Relay
2025-07-16 12:56 ` [PATCH 1/6] PCI/ASPM: Fix the behavior of pci_enable_link_state*() APIs Manivannan Sadhasivam via B4 Relay
2025-07-16 12:56 ` [PATCH 2/6] PCI/ASPM: Transition the device to D0 (if required) inside pci_enable_link_state_locked() API Manivannan Sadhasivam via B4 Relay
2025-07-16 20:56 ` Bjorn Helgaas
2025-07-17 7:36 ` Manivannan Sadhasivam
2025-07-16 12:56 ` [PATCH 3/6] PCI/ASPM: Improve the kernel-doc for pci_disable_link_state*() APIs Manivannan Sadhasivam via B4 Relay
2025-07-16 12:56 ` [PATCH 4/6] wifi: ath12k: Use pci_{enable/disable}_link_state() APIs to enable/disable ASPM states Manivannan Sadhasivam via B4 Relay
2025-07-17 6:59 ` kernel test robot
2025-07-18 8:22 ` Manivannan Sadhasivam
2025-07-17 9:24 ` Baochen Qiang
2025-07-17 10:31 ` Manivannan Sadhasivam
2025-07-17 10:46 ` Baochen Qiang
2025-07-17 10:55 ` Konrad Dybcio
2025-07-17 10:59 ` Baochen Qiang
2025-07-17 11:29 ` Manivannan Sadhasivam
2025-07-18 2:05 ` Baochen Qiang
2025-07-18 7:57 ` Manivannan Sadhasivam
2025-07-18 8:03 ` Krishna Chaitanya Chundru
2025-07-18 8:12 ` Manivannan Sadhasivam
2025-07-18 8:17 ` Krishna Chaitanya Chundru
2025-07-18 10:20 ` Manivannan Sadhasivam
2025-07-18 11:05 ` Baochen Qiang
2025-07-18 11:49 ` Manivannan Sadhasivam
2025-07-18 16:26 ` Bjorn Helgaas [this message]
2025-07-18 17:19 ` Manivannan Sadhasivam
2025-07-21 8:04 ` Ilpo Järvinen
2025-07-21 8:29 ` Manivannan Sadhasivam
2025-07-21 10:09 ` Ilpo Järvinen
2025-07-21 11:08 ` Manivannan Sadhasivam
2025-07-21 11:28 ` Ilpo Järvinen
2025-08-07 10:03 ` Manivannan Sadhasivam
2025-07-16 12:56 ` [PATCH 5/6] wifi: ath11k: " Manivannan Sadhasivam via B4 Relay
2025-07-16 12:56 ` [PATCH 6/6] wifi: ath10k: " Manivannan Sadhasivam via B4 Relay
2025-07-16 17:11 ` [PATCH 0/6] PCI/ASPM: Fix pci_enable_link_state*() APIs behavior Jeff Johnson
2025-07-18 7:58 ` Manivannan Sadhasivam
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250718162600.GA2700538@bhelgaas \
--to=helgaas@kernel.org \
--cc=ath10k@lists.infradead.org \
--cc=ath11k@lists.infradead.org \
--cc=ath12k@lists.infradead.org \
--cc=baochen.qiang@oss.qualcomm.com \
--cc=bhelgaas@google.com \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=jjohnson@kernel.org \
--cc=jonathan.derrick@linux.dev \
--cc=krishna.chundru@oss.qualcomm.com \
--cc=kwilczynski@kernel.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-wireless@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=mani@kernel.org \
--cc=manivannan.sadhasivam@oss.qualcomm.com \
--cc=nirmal.patel@linux.intel.com \
--cc=qiang.yu@oss.qualcomm.com \
--cc=robh@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).