* [PATCH] PCI: Enable Bus Master in pci_power_up()
@ 2026-01-13 20:56 Mario Limonciello (AMD)
2026-01-14 0:01 ` Matthew Ruffell
2026-01-14 9:52 ` Lukas Wunner
0 siblings, 2 replies; 5+ messages in thread
From: Mario Limonciello (AMD) @ 2026-01-13 20:56 UTC (permalink / raw)
To: mario.limonciello, bhelgaas, rafael
Cc: Mario Limonciello (AMD), kengyu, Matthew Ruffell, linux-pci
commit 4d4c10f763d78 ("PCI: Explicitly put devices into D0 when
initializing") addressed the issue of devices not being explicitly
initialized to D0 during system startup, resolving mismatches between
firmware and OS states.
However, this change affected devices lacking runtime PM, as noted in
commit 907a7a2e5bf40 ("PCI/PM: Set up runtime PM even for devices without
PCI PM").
Matthew, however, reports that there are additional problems specifically on
AWS NVMe hardware, which can't handle a kexec since these changes were
introduced.
Ever since commit 4fc9bbf98fd66 ("PCI: Disable Bus Master only on kexec
reboot"), bus mastering is turned off during a kexec reboot, which is a
different flow than the one for shutdown/reboot. The problem appears to be
that because the device is actually in D0 during the startup routine,
clearing bus mastering as part of pci_device_shutdown() leads to a mismatch
during the next kernel boot.
I'd hypothesize that the firmware on this platform normally sets bus
mastering as part of startup and that the difference in kexec behavior leads
to an incongruity.
Set bus mastering when the device powers up to fix the mismatch.
Cc: kengyu@lexical.tw
Reported-by: Matthew Ruffell <matthew.ruffell@canonical.com>
Closes: https://lore.kernel.org/linux-pci/CAKAwkKvmdKxRRA4cR=jJEdyadon6uKXe+aFXaGSe=PNSgwDf9g@mail.gmail.com/
Fixes: 4d4c10f763d78 ("PCI: Explicitly put devices into D0 when initializing")
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---
NOTE:
This could also be addressed by disabling the clearing of bus mastering across
a kexec reboot if that is preferred.
---
drivers/pci/pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 13dbb405dc31f..c0c0b5c9bf838 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1323,6 +1323,7 @@ int pci_power_up(struct pci_dev *dev)
 		return -EIO;
 	}
 
+	pci_set_master(dev);
 	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 	if (PCI_POSSIBLE_ERROR(pmcsr)) {
 		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
--
2.43.0
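For readers unfamiliar with what the one-line change touches: bus mastering is controlled by the Bus Master Enable bit in the 16-bit PCI Command register at config offset 0x04. The following is a minimal userspace sketch of that read-modify-write, not the kernel's implementation. `struct fake_cfg`, `cfg_read16()`, `cfg_write16()` and `fake_set_master()` are hypothetical names invented for illustration; only the two register constants come from the kernel's pci_regs.h.

```c
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND        0x04  /* offset of the 16-bit Command register */
#define PCI_COMMAND_MASTER 0x4   /* Bus Master Enable bit */

/* Hypothetical stand-in for a device's config space (not a kernel type). */
struct fake_cfg {
	uint8_t bytes[64];
};

/* Config space is little-endian; assemble the 16-bit value by hand. */
static uint16_t cfg_read16(const struct fake_cfg *c, unsigned int off)
{
	return (uint16_t)(c->bytes[off] | (c->bytes[off + 1] << 8));
}

static void cfg_write16(struct fake_cfg *c, unsigned int off, uint16_t val)
{
	c->bytes[off] = (uint8_t)(val & 0xff);
	c->bytes[off + 1] = (uint8_t)(val >> 8);
}

/* Set or clear Bus Master Enable, writing back only if the bit changes. */
static void fake_set_master(struct fake_cfg *c, int enable)
{
	uint16_t old_cmd = cfg_read16(c, PCI_COMMAND);
	uint16_t cmd;

	if (enable)
		cmd = (uint16_t)(old_cmd | PCI_COMMAND_MASTER);
	else
		cmd = (uint16_t)(old_cmd & ~PCI_COMMAND_MASTER);

	if (cmd != old_cmd)
		cfg_write16(c, PCI_COMMAND, cmd);
}
```

Calling `fake_set_master(&cfg, 1)` on a device whose Command register holds 0x0003 leaves it at 0x0007. Once the bit is set the device may initiate DMA, which is why the placement of this call matters.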
* Re: [PATCH] PCI: Enable Bus Master in pci_power_up()
2026-01-13 20:56 [PATCH] PCI: Enable Bus Master in pci_power_up() Mario Limonciello (AMD)
@ 2026-01-14 0:01 ` Matthew Ruffell
2026-01-14 9:52 ` Lukas Wunner
1 sibling, 0 replies; 5+ messages in thread
From: Matthew Ruffell @ 2026-01-14 0:01 UTC (permalink / raw)
To: Mario Limonciello (AMD)
Cc: mario.limonciello, bhelgaas, rafael, kengyu, linux-pci
On Wed, 14 Jan 2026 at 09:56, Mario Limonciello (AMD)
<superm1@kernel.org> wrote:
>
> commit 4d4c10f763d78 ("PCI: Explicitly put devices into D0 when
> initializing") addressed the issue of devices not being explicitly
> initialized to D0 during system startup, resolving mismatches between
> firmware and OS states.
>
> However, this change affected devices lacking runtime PM, as noted in
> commit 907a7a2e5bf40 ("PCI/PM: Set up runtime PM even for devices without
> PCI PM").
>
> Matthew, however, reports that there are additional problems specifically on
> AWS NVMe hardware, which can't handle a kexec since these changes were
> introduced.
>
> Ever since commit 4fc9bbf98fd66 ("PCI: Disable Bus Master only on kexec
> reboot"), bus mastering is turned off during a kexec reboot, which is a
> different flow than the one for shutdown/reboot. The problem appears to be
> that because the device is actually in D0 during the startup routine,
> clearing bus mastering as part of pci_device_shutdown() leads to a mismatch
> during the next kernel boot.
>
> I'd hypothesize that the firmware on this platform normally sets bus
> mastering as part of startup and that the difference in kexec behavior leads
> to an incongruity.
>
> Set bus mastering when the device powers up to fix the mismatch.
>
> Cc: kengyu@lexical.tw
> Reported-by: Matthew Ruffell <matthew.ruffell@canonical.com>
> Closes: https://lore.kernel.org/linux-pci/CAKAwkKvmdKxRRA4cR=jJEdyadon6uKXe+aFXaGSe=PNSgwDf9g@mail.gmail.com/
> Fixes: 4d4c10f763d78 ("PCI: Explicitly put devices into D0 when initializing")
> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
> ---
> NOTE:
> This could also be addressed by disabling the clearing of bus mastering across
> a kexec reboot if that is preferred.
> ---
> drivers/pci/pci.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 13dbb405dc31f..c0c0b5c9bf838 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1323,6 +1323,7 @@ int pci_power_up(struct pci_dev *dev)
> return -EIO;
> }
>
> + pci_set_master(dev);
> pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> if (PCI_POSSIBLE_ERROR(pmcsr)) {
> pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
> --
> 2.43.0
>
Tested on an AWS c5.metal system, with git tag v6.19-rc5 and this patch applied.
Kexec works great, and the system comes up properly.
Thank you, Mario.
Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com>
* Re: [PATCH] PCI: Enable Bus Master in pci_power_up()
2026-01-13 20:56 [PATCH] PCI: Enable Bus Master in pci_power_up() Mario Limonciello (AMD)
2026-01-14 0:01 ` Matthew Ruffell
@ 2026-01-14 9:52 ` Lukas Wunner
2026-01-14 14:59 ` Manivannan Sadhasivam
2026-01-14 15:08 ` Mario Limonciello
1 sibling, 2 replies; 5+ messages in thread
From: Lukas Wunner @ 2026-01-14 9:52 UTC (permalink / raw)
To: Mario Limonciello (AMD)
Cc: mario.limonciello, bhelgaas, rafael, kengyu, Matthew Ruffell,
linux-pci
On Tue, Jan 13, 2026 at 02:56:14PM -0600, Mario Limonciello (AMD) wrote:
> +++ b/drivers/pci/pci.c
> @@ -1323,6 +1323,7 @@ int pci_power_up(struct pci_dev *dev)
> return -EIO;
> }
>
> + pci_set_master(dev);
> pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> if (PCI_POSSIBLE_ERROR(pmcsr)) {
> pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
So any device will be allowed to write to memory from the get-go?
That sounds like a very bad idea. For security reasons alone,
we only want to enable bus mastering when needed. It's up to
the driver to enable it, not up to the PCI core. We've had cases
in the past where devices corrupted memory because BIOS left
bus mastering enabled, see abb2bafd295f. Enabling bus mastering
for everything anytime will exacerbate such problems or uncover
new ones.
Thanks,
Lukas
* Re: [PATCH] PCI: Enable Bus Master in pci_power_up()
2026-01-14 9:52 ` Lukas Wunner
@ 2026-01-14 14:59 ` Manivannan Sadhasivam
2026-01-14 15:08 ` Mario Limonciello
1 sibling, 0 replies; 5+ messages in thread
From: Manivannan Sadhasivam @ 2026-01-14 14:59 UTC (permalink / raw)
To: Lukas Wunner
Cc: Mario Limonciello (AMD), mario.limonciello, bhelgaas, rafael,
kengyu, Matthew Ruffell, linux-pci
On Wed, Jan 14, 2026 at 10:52:26AM +0100, Lukas Wunner wrote:
> On Tue, Jan 13, 2026 at 02:56:14PM -0600, Mario Limonciello (AMD) wrote:
> > +++ b/drivers/pci/pci.c
> > @@ -1323,6 +1323,7 @@ int pci_power_up(struct pci_dev *dev)
> > return -EIO;
> > }
> >
> > + pci_set_master(dev);
> > pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> > if (PCI_POSSIBLE_ERROR(pmcsr)) {
> > pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
>
> So any device will be allowed to write to memory from the get-go?
> That sounds like a very bad idea. For security reasons alone,
> we only want to enable bus mastering when needed. It's up to
> the driver to enable it, not up to the PCI core. We've had cases
> in the past where devices corrupted memory because BIOS left
> bus mastering enabled, see abb2bafd295f. Enabling bus mastering
> for everything anytime will exacerbate such problems or uncover
> new ones.
>
Indeed, and it would open a big security hole: the PCI core would then
allow the device to perform DMA before address translation and so on has
been set up.
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH] PCI: Enable Bus Master in pci_power_up()
2026-01-14 9:52 ` Lukas Wunner
2026-01-14 14:59 ` Manivannan Sadhasivam
@ 2026-01-14 15:08 ` Mario Limonciello
1 sibling, 0 replies; 5+ messages in thread
From: Mario Limonciello @ 2026-01-14 15:08 UTC (permalink / raw)
To: Lukas Wunner
Cc: mario.limonciello, bhelgaas, rafael, kengyu, Matthew Ruffell,
linux-pci
On 1/14/26 3:52 AM, Lukas Wunner wrote:
> On Tue, Jan 13, 2026 at 02:56:14PM -0600, Mario Limonciello (AMD) wrote:
>> +++ b/drivers/pci/pci.c
>> @@ -1323,6 +1323,7 @@ int pci_power_up(struct pci_dev *dev)
>> return -EIO;
>> }
>>
>> + pci_set_master(dev);
>> pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>> if (PCI_POSSIBLE_ERROR(pmcsr)) {
>> pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
>
> So any device will be allowed to write to memory from the get-go?
Well, I did a quick check on a modern production Strix laptop on my desk,
running 6.19-rc5 with this change:
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 13dbb405dc31..74d7745c185c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1305,6 +1305,7 @@ int pci_power_up(struct pci_dev *dev)
 	bool need_restore;
 	pci_power_t state;
 	u16 pmcsr;
+	u16 old_cmd;
 
 	platform_pci_set_power_state(dev, PCI_D0);
 
@@ -1352,6 +1353,10 @@ int pci_power_up(struct pci_dev *dev)
 		udelay(PCI_PM_D2_DELAY);
 
 end:
+	pci_read_config_word(dev, PCI_COMMAND, &old_cmd);
+	pci_info(dev, "Bus mastering bit is %sabled in D0\n",
+		 (old_cmd & PCI_COMMAND_MASTER) ? "en" : "dis");
+
 	dev->current_state = PCI_D0;
 	if (need_restore)
 		return 1;
Here's what I observe.
$ sudo dmesg | grep mastering
[ 2.560916] pci 0000:00:01.1: Bus mastering bit is enabled in D0
[ 2.580832] pci 0000:00:01.2: Bus mastering bit is enabled in D0
[ 2.594468] pci 0000:00:02.1: Bus mastering bit is enabled in D0
[ 2.595439] pci 0000:00:02.2: Bus mastering bit is enabled in D0
[ 2.596292] pci 0000:00:02.3: Bus mastering bit is enabled in D0
[ 2.597235] pci 0000:00:02.4: Bus mastering bit is enabled in D0
[ 2.600120] pci 0000:00:08.1: Bus mastering bit is enabled in D0
[ 2.600899] pci 0000:00:08.2: Bus mastering bit is enabled in D0
[ 2.601535] pci 0000:00:08.3: Bus mastering bit is enabled in D0
[ 2.610333] pci 0000:c1:00.0: Bus mastering bit is enabled in D0
[ 2.611717] pci 0000:c2:00.0: Bus mastering bit is disabled in D0
[ 2.612795] pci 0000:c3:00.0: Bus mastering bit is disabled in D0
[ 2.613959] pci 0000:c4:00.0: Bus mastering bit is enabled in D0
[ 2.621622] pci 0000:c5:00.0: Bus mastering bit is enabled in D0
[ 2.624022] pci 0000:c5:00.1: Bus mastering bit is disabled in D0
[ 2.624697] pci 0000:c5:00.2: Bus mastering bit is disabled in D0
[ 2.629270] pci 0000:c5:00.4: Bus mastering bit is enabled in D0
[ 2.634864] pci 0000:c5:00.5: Bus mastering bit is disabled in D0
[ 2.635657] pci 0000:c5:00.7: Bus mastering bit is disabled in D0
[ 2.636505] pci 0000:c6:00.0: Bus mastering bit is disabled in D0
[ 2.636859] pci 0000:c6:00.1: Bus mastering bit is disabled in D0
[ 2.641870] pci 0000:c7:00.0: Bus mastering bit is enabled in D0
[ 2.649681] pci 0000:c7:00.3: Bus mastering bit is enabled in D0
[ 2.657457] pci 0000:c7:00.4: Bus mastering bit is enabled in D0
[ 2.665211] pci 0000:c7:00.5: Bus mastering bit is enabled in D0
[ 2.669811] pci 0000:c7:00.6: Bus mastering bit is enabled in D0
[ 5.114596] pcieport 0000:00:01.1: Bus mastering bit is enabled in D0
[ 5.170824] pcieport 0000:00:01.2: Bus mastering bit is enabled in D0
[ 8.025850] snd_hda_intel 0000:c5:00.1: Bus mastering bit is disabled in D0
[ 22.314165] pcieport 0000:00:02.4: Bus mastering bit is enabled in D0
[ 22.330193] r8169 0000:c4:00.0: Bus mastering bit is enabled in D0
[ 22.336309] xhci_hcd 0000:c7:00.0: Bus mastering bit is disabled in D0
[ 34.587505] snd_hda_intel 0000:c5:00.1: Bus mastering bit is enabled in D0
[ 35.771050] xhci_hcd 0000:c5:00.4: Bus mastering bit is disabled in D0
So doesn't BIOS appear to have set bus mastering on a majority of
devices already?
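The same bit can be cross-checked from userspace without patching the kernel, since sysfs exposes each device's config space as a binary `config` file. The sketch below is only an illustration of that check, under the assumption that the first 64 bytes of the file are readable; `cfg_bus_master()` and `sysfs_bus_master()` are hypothetical helper names, not existing tools. Note that it reports the state at the time it runs, after drivers may have bound, rather than the state seen inside pci_power_up().

```c
#include <stdint.h>
#include <stdio.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND        0x04  /* offset of the 16-bit Command register */
#define PCI_COMMAND_MASTER 0x4   /* Bus Master Enable bit */

/* Decode Bus Master Enable from raw config bytes; -1 if the buffer is too short. */
static int cfg_bus_master(const uint8_t *cfg, size_t len)
{
	uint16_t cmd;

	if (len < PCI_COMMAND + 2)
		return -1;

	/* Config space is little-endian. */
	cmd = (uint16_t)(cfg[PCI_COMMAND] | (cfg[PCI_COMMAND + 1] << 8));
	return (cmd & PCI_COMMAND_MASTER) ? 1 : 0;
}

/*
 * Read /sys/bus/pci/devices/<bdf>/config for a device such as
 * "0000:c5:00.1". Returns 1 (enabled), 0 (disabled) or -1 on error.
 */
static int sysfs_bus_master(const char *bdf)
{
	char path[128];
	uint8_t cfg[64];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/config", bdf);
	f = fopen(path, "rb");
	if (!f)
		return -1;

	n = fread(cfg, 1, sizeof(cfg), f);
	fclose(f);
	return cfg_bus_master(cfg, n);
}
```

A caller would pass one of the addresses from the log above, for example `sysfs_bus_master("0000:c5:00.1")`, and compare the result before and after the driver binds.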
> That sounds like a very bad idea. For security reasons alone,
> we only want to enable bus mastering when needed.
Should we actually be doing the reverse of my proposed patch and explicitly
disabling bus mastering in the PCI core at startup, then requiring drivers
to set policy?
> It's up to
> the driver to enable it, not up to the PCI core. We've had cases
> in the past where devices corrupted memory because BIOS left
> bus mastering enabled, see abb2bafd295f. Enabling bus mastering
> for everything anytime will exacerbate such problems or uncover
> new ones.
>
How do you feel about the other proposal I mentioned, not clearing it on
kexec? Matthew confirmed this will help this issue too.
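To make the alternative concrete, here is a small standalone model of the shutdown decision being discussed. It is a sketch under stated assumptions, not kernel code: `struct fake_dev`, `enum fake_pm_state`, `fake_shutdown()` and the `clear_on_kexec` switch are all hypothetical, and the power-state check reflects a reading of mainline pci_device_shutdown() rather than anything stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value as defined in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND_MASTER 0x4   /* Bus Master Enable bit */

/* Hypothetical power states, ordered from fully on to fully off. */
enum fake_pm_state { FAKE_D0, FAKE_D1, FAKE_D2, FAKE_D3HOT, FAKE_D3COLD };

/* Hypothetical device model: just the Command register and a power state. */
struct fake_dev {
	uint16_t command;
	enum fake_pm_state state;
};

/*
 * Model of the shutdown path. With clear_on_kexec set, bus mastering is
 * cleared only for a kexec reboot and only on devices that are still
 * reachable (assumed: D3hot or shallower). With it unset, which models
 * the "don't clear on kexec" proposal, the bit is left as-is.
 */
static void fake_shutdown(struct fake_dev *dev, bool kexec_in_progress,
			  bool clear_on_kexec)
{
	if (clear_on_kexec && kexec_in_progress && dev->state <= FAKE_D3HOT)
		dev->command &= (uint16_t)~PCI_COMMAND_MASTER;
}
```

In this model a D0 device with Command 0x0007 enters the next kernel with 0x0003 when `clear_on_kexec` is true and with 0x0007 when it is false; a regular reboot never touches the bit either way.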