* [PATCH 0/6] PCI: align small (<4k) BARs
@ 2024-07-09 13:35 Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 1/6] PCI: don't clear already cleared bit Stewart Hildebrand
` (6 more replies)
0 siblings, 7 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:35 UTC (permalink / raw)
To: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
Cc: Stewart Hildebrand, x86, linux-pci, linux-kernel
This series sets the default minimum resource alignment to 4k for memory
BARs. In preparation, it makes an optimization and addresses some corner
cases observed when reallocating BARs. I consider the prepapatory
patches to be prerequisites to changing the default BAR size.
Issues observed when small (<4k) BARs are not 4k aligned are:
1. Devices to be passed through (to e.g. a Xen HVM guest) with small
(<4k) BARs require each memory BAR to be page aligned. Currently, the
only way to guarantee this alignment from a user perspective is to fake
the size of the BARs using the pci=resource_alignment= option. This is a
bad user experience, and faking the BAR size is not always desirable.
See the comment in drivers/pci/pci.c:pci_request_resource_alignment()
for further discussion.
Anecdotally, we're using pcitest for PCI passthrough validation with
Xen, and pcitest fails with a fake BAR size.
2. Devices with multiple small (<4k) BARs could have the MSI-X tables
located in one of its small (<4k) BARs. This may lead to the MSI-X
tables being mapped in the same 4k region as other data. The PCIe 6.1
specification (section 7.7.2 MSI-X Capability and Table Structure) says
we probably shouldn't do that.
To improve the user experience, and increase conformance to PCIe spec,
set the default minimum resource alignment of memory BARs to 4k. Choose
4k (rather than PAGE_SIZE) for the alignment value in the common code,
since that is the value called out in the PCIe 6.1 spec, section 7.7.2.
The new default alignment may be overridden by arches by implementing
pcibios_default_alignment(), or by the user with the
pci=resource_alignment= option.
I considered introducing checks for the specific scenarios described,
but chose not to pursue this. A check such as "if (xen_domain())" may be
pretty simple, but that doesn't account for other hypervisors. If other
hypervisors are to be considered, or if we try to dynamically reallocate
BARs for devices being marked for passthrough, such a check may quickly
grow unwieldy. Further, checking for the MSI-X tables residing in a
small (<4k) BAR is unlikely to be a one-liner. Making 4k alignment the
default seems more robust.
I considered alternatively adding new functionality to the
pci=resource_alignment= option, but that approach was already attempted
and decided against [1].
[1] https://lore.kernel.org/linux-pci/1473757234-5284-4-git-send-email-xyjxie@linux.vnet.ibm.com/
Comment from pci_request_resource_alignment() pasted here for reference:
/*
* Increase the alignment of the resource. There are two ways we
* can do this:
*
* 1) Increase the size of the resource. BARs are aligned on their
* size, so when we reallocate space for this resource, we'll
* allocate it with the larger alignment. This also prevents
* assignment of any other BARs inside the alignment region, so
* if we're requesting page alignment, this means no other BARs
* will share the page.
*
* The disadvantage is that this makes the resource larger than
* the hardware BAR, which may break drivers that compute things
* based on the resource size, e.g., to find registers at a
* fixed offset before the end of the BAR.
*
* 2) Retain the resource size, but use IORESOURCE_STARTALIGN and
* set r->start to the desired alignment. By itself this
* doesn't prevent other BARs being put inside the alignment
* region, but if we realign *every* resource of every device in
* the system, none of them will share an alignment region.
*
* When the user has requested alignment for only some devices via
* the "pci=resource_alignment" argument, "resize" is true and we
* use the first method. Otherwise we assume we're aligning all
* devices and we use the second.
*/
Stewart Hildebrand (6):
PCI: don't clear already cleared bit
PCI: restore resource alignment
PCI: restore memory decoding after reallocation
x86: PCI: preserve IORESOURCE_STARTALIGN alignment
PCI: don't reassign resources that are already aligned
PCI: align small (<4k) BARs
arch/x86/pci/i386.c | 7 +++++--
drivers/pci/pci.c | 17 +++++++++++++---
drivers/pci/setup-bus.c | 44 +++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 2 ++
4 files changed, 65 insertions(+), 5 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/6] PCI: don't clear already cleared bit
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
@ 2024-07-09 13:35 ` Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 2/6] PCI: restore resource alignment Stewart Hildebrand
` (5 subsequent siblings)
6 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:35 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Stewart Hildebrand, linux-pci, linux-kernel
pci_reassigndev_resource_alignment() performs a PCI config read, then
then unconditionally clears a bit and issues a PCI config write.
However, the bit in question (PCI_COMMAND_MEMORY) is a RW bit, so it is
unnecessary to issue the PCI config write if the bit is already cleared.
Make the write conditional.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
Should this be folded into ("pci: restore memory decoding after
reallocation") in this series?
---
drivers/pci/pci.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index df550953fa26..f017e7a8f028 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6632,8 +6632,10 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
}
pci_read_config_word(dev, PCI_COMMAND, &command);
- command &= ~PCI_COMMAND_MEMORY;
- pci_write_config_word(dev, PCI_COMMAND, command);
+ if (command & PCI_COMMAND_MEMORY) {
+ command &= ~PCI_COMMAND_MEMORY;
+ pci_write_config_word(dev, PCI_COMMAND, command);
+ }
for (i = 0; i <= PCI_ROM_RESOURCE; i++)
pci_request_resource_alignment(dev, i, align, resize);
--
2.45.2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/6] PCI: restore resource alignment
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 1/6] PCI: don't clear already cleared bit Stewart Hildebrand
@ 2024-07-09 13:35 ` Stewart Hildebrand
2024-07-09 13:36 ` [PATCH 3/6] PCI: restore memory decoding after reallocation Stewart Hildebrand
` (4 subsequent siblings)
6 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:35 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Stewart Hildebrand, linux-pci, linux-kernel
Devices with alignment specified will lose their alignment in cases when
the bridge resources have been released, e.g. due to insufficient bridge
window size. Restore the alignment.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
drivers/pci/setup-bus.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 23082bc0ca37..ab7510ce6917 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1594,6 +1594,23 @@ static void __pci_bridge_assign_resources(const struct pci_dev *bridge,
}
}
+static void restore_child_resource_alignment(struct pci_bus *bus)
+{
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ struct pci_bus *b;
+
+ pci_reassigndev_resource_alignment(dev);
+
+ b = dev->subordinate;
+ if (!b)
+ continue;
+
+ restore_child_resource_alignment(b);
+ }
+}
+
#define PCI_RES_TYPE_MASK \
(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
IORESOURCE_MEM_64)
@@ -1648,6 +1665,8 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
r->start = 0;
r->flags = 0;
+ restore_child_resource_alignment(bus);
+
/* Avoiding touch the one without PREF */
if (type & IORESOURCE_PREFETCH)
type = IORESOURCE_PREFETCH;
--
2.45.2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/6] PCI: restore memory decoding after reallocation
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 1/6] PCI: don't clear already cleared bit Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 2/6] PCI: restore resource alignment Stewart Hildebrand
@ 2024-07-09 13:36 ` Stewart Hildebrand
2024-07-09 16:16 ` Bjorn Helgaas
2024-07-09 13:36 ` [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment Stewart Hildebrand
` (3 subsequent siblings)
6 siblings, 1 reply; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:36 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Stewart Hildebrand, linux-pci, linux-kernel
Currently, the PCI subsystem unconditionally clears the memory decoding
bit of devices with resource alignment specified. Unfortunately, some
drivers assume memory decoding was enabled by firmware. Restore the
memory decoding bit after the resource has been reallocated.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
Relevant prior discussion at [1]
[1] https://lore.kernel.org/linux-pci/20160906165652.GE1214@localhost/
---
drivers/pci/pci.c | 1 +
drivers/pci/setup-bus.c | 25 +++++++++++++++++++++++++
include/linux/pci.h | 2 ++
3 files changed, 28 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f017e7a8f028..7953e75b9129 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6633,6 +6633,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
pci_read_config_word(dev, PCI_COMMAND, &command);
if (command & PCI_COMMAND_MEMORY) {
+ dev->dev_flags |= PCI_DEV_FLAGS_MEMORY_ENABLE;
command &= ~PCI_COMMAND_MEMORY;
pci_write_config_word(dev, PCI_COMMAND, command);
}
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index ab7510ce6917..6847b251e19a 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2131,6 +2131,29 @@ pci_root_bus_distribute_available_resources(struct pci_bus *bus,
}
}
+static void restore_memory_decoding(struct pci_bus *bus)
+{
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ struct pci_bus *b;
+
+ if (dev->dev_flags & PCI_DEV_FLAGS_MEMORY_ENABLE) {
+ u16 command;
+
+ pci_read_config_word(dev, PCI_COMMAND, &command);
+ command |= PCI_COMMAND_MEMORY;
+ pci_write_config_word(dev, PCI_COMMAND, command);
+ }
+
+ b = dev->subordinate;
+ if (!b)
+ continue;
+
+ restore_memory_decoding(b);
+ }
+}
+
/*
* First try will not touch PCI bridge res.
* Second and later try will clear small leaf bridge res.
@@ -2229,6 +2252,8 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
goto again;
dump:
+ restore_memory_decoding(bus);
+
/* Dump the resource on buses */
pci_bus_dump_resources(bus);
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e83ac93a4dcb..cb5df4c6a999 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -245,6 +245,8 @@ enum pci_dev_flags {
PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
/* Device does honor MSI masking despite saying otherwise */
PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12),
+ /* Firmware enabled memory decoding, to be restored if BAR is updated */
+ PCI_DEV_FLAGS_MEMORY_ENABLE = (__force pci_dev_flags_t) (1 << 13),
};
enum pci_irq_reroute_variant {
--
2.45.2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
` (2 preceding siblings ...)
2024-07-09 13:36 ` [PATCH 3/6] PCI: restore memory decoding after reallocation Stewart Hildebrand
@ 2024-07-09 13:36 ` Stewart Hildebrand
2024-07-09 16:19 ` Bjorn Helgaas
2024-07-10 14:05 ` Ilpo Järvinen
2024-07-09 13:36 ` [PATCH 5/6] PCI: don't reassign resources that are already aligned Stewart Hildebrand
` (2 subsequent siblings)
6 siblings, 2 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:36 UTC (permalink / raw)
To: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
Cc: Stewart Hildebrand, x86, linux-pci, linux-kernel
Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
x86 due to the alignment being overwritten in
pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
make it work on x86.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
RFC: We don't have enough info in this function to re-calculate the
alignment value in case of IORESOURCE_STARTALIGN. Luckily our
alignment value seems to be intact, so just don't touch it...
Alternatively, we could call pci_reassigndev_resource_alignment()
after the loop. Would that be preferable?
---
arch/x86/pci/i386.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index f2f4a5d50b27..ff6e61389ec7 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
/* We'll assign a new address later */
pcibios_save_fw_addr(dev,
idx, r->start);
- r->end -= r->start;
- r->start = 0;
+ if (!(r->flags &
+ IORESOURCE_STARTALIGN)) {
+ r->end -= r->start;
+ r->start = 0;
+ }
}
}
}
--
2.45.2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 5/6] PCI: don't reassign resources that are already aligned
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
` (3 preceding siblings ...)
2024-07-09 13:36 ` [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment Stewart Hildebrand
@ 2024-07-09 13:36 ` Stewart Hildebrand
2024-07-09 13:36 ` [PATCH 6/6] PCI: align small (<4k) BARs Stewart Hildebrand
2024-07-10 23:26 ` [PATCH 0/6] " Stewart Hildebrand
6 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:36 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Stewart Hildebrand, linux-pci, linux-kernel
As a follow-up to commit 0dde1c08d1b9 ("PCI: Don't reassign resources
that are already aligned"), don't reassign already aligned resources
that have requested alignment without resizing.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
drivers/pci/pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7953e75b9129..9f7894538334 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6553,6 +6553,9 @@ static void pci_request_resource_alignment(struct pci_dev *dev, int bar,
if (size >= align)
return;
+ if (!resize && (r->start > align) && !(r->start & (align - 1)))
+ return;
+
/*
* Increase the alignment of the resource. There are two ways we
* can do this:
--
2.45.2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 6/6] PCI: align small (<4k) BARs
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
` (4 preceding siblings ...)
2024-07-09 13:36 ` [PATCH 5/6] PCI: don't reassign resources that are already aligned Stewart Hildebrand
@ 2024-07-09 13:36 ` Stewart Hildebrand
2024-07-09 16:21 ` Bjorn Helgaas
2024-07-10 13:56 ` Ilpo Järvinen
2024-07-10 23:26 ` [PATCH 0/6] " Stewart Hildebrand
6 siblings, 2 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-09 13:36 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Stewart Hildebrand, linux-pci, linux-kernel
Issues observed when small (<4k) BARs are not 4k aligned are:
1. Devices to be passed through (to e.g. a Xen HVM guest) with small
(<4k) BARs require each memory BAR to be page aligned. Currently, the
only way to guarantee this alignment from a user perspective is to fake
the size of the BARs using the pci=resource_alignment= option. This is a
bad user experience, and faking the BAR size is not always desirable.
See the comment in drivers/pci/pci.c:pci_request_resource_alignment()
for further discussion.
2. Devices with multiple small (<4k) BARs could have the MSI-X tables
located in one of its small (<4k) BARs. This may lead to the MSI-X
tables being mapped in the same 4k region as other data. The PCIe 6.1
specification (section 7.7.2 MSI-X Capability and Table Structure) says
we probably shouldn't do that.
To improve the user experience, and increase conformance to PCIe spec,
set the default minimum resource alignment of memory BARs to 4k. Choose
4k (rather than PAGE_SIZE) for the alignment value in the common code,
since that is the value called out in the PCIe 6.1 spec, section 7.7.2.
The new default alignment may be overridden by arches by implementing
pcibios_default_alignment(), or by the user with the
pci=resource_alignment= option.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
Preparatory patches in this series are prerequisites to this patch.
---
drivers/pci/pci.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9f7894538334..e7b648304383 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6453,7 +6453,12 @@ struct pci_dev __weak *pci_real_dma_dev(struct pci_dev *dev)
resource_size_t __weak pcibios_default_alignment(void)
{
- return 0;
+ /*
+ * Avoid MSI-X tables being mapped in the same 4k region as other data
+ * according to PCIe 6.1 specification section 7.7.2 MSI-X Capability
+ * and Table Structure.
+ */
+ return 4 * 1024;
}
/*
--
2.45.2
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 3/6] PCI: restore memory decoding after reallocation
2024-07-09 13:36 ` [PATCH 3/6] PCI: restore memory decoding after reallocation Stewart Hildebrand
@ 2024-07-09 16:16 ` Bjorn Helgaas
2024-07-10 20:31 ` Stewart Hildebrand
0 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2024-07-09 16:16 UTC (permalink / raw)
To: Stewart Hildebrand; +Cc: Bjorn Helgaas, linux-pci, linux-kernel
On Tue, Jul 09, 2024 at 09:36:00AM -0400, Stewart Hildebrand wrote:
> Currently, the PCI subsystem unconditionally clears the memory decoding
> bit of devices with resource alignment specified. Unfortunately, some
> drivers assume memory decoding was enabled by firmware. Restore the
> memory decoding bit after the resource has been reallocated.
Which drivers have you tripped over? Those drivers apparently don't
call pci_enable_device() and assume the firmware has left memory
decoding enabled. I don't think there's any guarantee that firmware
must do that, so the drivers are probably broken on some platforms,
and we could improve things overall by adding the pci_enable_device()
to them.
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> Relevant prior discussion at [1]
>
> [1] https://lore.kernel.org/linux-pci/20160906165652.GE1214@localhost/
> ---
> drivers/pci/pci.c | 1 +
> drivers/pci/setup-bus.c | 25 +++++++++++++++++++++++++
> include/linux/pci.h | 2 ++
> 3 files changed, 28 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index f017e7a8f028..7953e75b9129 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6633,6 +6633,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>
> pci_read_config_word(dev, PCI_COMMAND, &command);
> if (command & PCI_COMMAND_MEMORY) {
> + dev->dev_flags |= PCI_DEV_FLAGS_MEMORY_ENABLE;
> command &= ~PCI_COMMAND_MEMORY;
> pci_write_config_word(dev, PCI_COMMAND, command);
> }
It would be nice if this could be contained to
pci_reassigndev_resource_alignment() so the clear and restore could be
in the same function. But I suppose the concern is that re-enabling
decoding too early could be an issue in a hierarchy where bridge
windows are also reassigned?
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index ab7510ce6917..6847b251e19a 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -2131,6 +2131,29 @@ pci_root_bus_distribute_available_resources(struct pci_bus *bus,
> }
> }
>
> +static void restore_memory_decoding(struct pci_bus *bus)
> +{
> + struct pci_dev *dev;
> +
> + list_for_each_entry(dev, &bus->devices, bus_list) {
> + struct pci_bus *b;
> +
> + if (dev->dev_flags & PCI_DEV_FLAGS_MEMORY_ENABLE) {
> + u16 command;
> +
> + pci_read_config_word(dev, PCI_COMMAND, &command);
> + command |= PCI_COMMAND_MEMORY;
> + pci_write_config_word(dev, PCI_COMMAND, command);
> + }
> +
> + b = dev->subordinate;
> + if (!b)
> + continue;
> +
> + restore_memory_decoding(b);
> + }
> +}
> +
> /*
> * First try will not touch PCI bridge res.
> * Second and later try will clear small leaf bridge res.
> @@ -2229,6 +2252,8 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
> goto again;
>
> dump:
> + restore_memory_decoding(bus);
> +
> /* Dump the resource on buses */
> pci_bus_dump_resources(bus);
> }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index e83ac93a4dcb..cb5df4c6a999 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -245,6 +245,8 @@ enum pci_dev_flags {
> PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> /* Device does honor MSI masking despite saying otherwise */
> PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12),
> + /* Firmware enabled memory decoding, to be restored if BAR is updated */
> + PCI_DEV_FLAGS_MEMORY_ENABLE = (__force pci_dev_flags_t) (1 << 13),
> };
>
> enum pci_irq_reroute_variant {
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-09 13:36 ` [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment Stewart Hildebrand
@ 2024-07-09 16:19 ` Bjorn Helgaas
2024-07-10 16:16 ` Stewart Hildebrand
2024-07-10 14:05 ` Ilpo Järvinen
1 sibling, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2024-07-09 16:19 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> x86 due to the alignment being overwritten in
> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> make it work on x86.
Is this a regression? I didn't look up when IORESOURCE_STARTALIGN was
added, but likely it was for some situation on x86, so presumably it
worked at one time. If something broke it in the meantime, it would
be nice to identify the commit that broke it.
Nit: follow the subject line conventions for this and the other
patches. Learn them with "git log --oneline". For this patch,
"x86/PCI: <Capitalized text>" is appropriate.
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> RFC: We don't have enough info in this function to re-calculate the
> alignment value in case of IORESOURCE_STARTALIGN. Luckily our
> alignment value seems to be intact, so just don't touch it...
> Alternatively, we could call pci_reassigndev_resource_alignment()
> after the loop. Would that be preferable?
> ---
> arch/x86/pci/i386.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> index f2f4a5d50b27..ff6e61389ec7 100644
> --- a/arch/x86/pci/i386.c
> +++ b/arch/x86/pci/i386.c
> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
> /* We'll assign a new address later */
> pcibios_save_fw_addr(dev,
> idx, r->start);
> - r->end -= r->start;
> - r->start = 0;
> + if (!(r->flags &
> + IORESOURCE_STARTALIGN)) {
> + r->end -= r->start;
> + r->start = 0;
> + }
> }
> }
> }
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 6/6] PCI: align small (<4k) BARs
2024-07-09 13:36 ` [PATCH 6/6] PCI: align small (<4k) BARs Stewart Hildebrand
@ 2024-07-09 16:21 ` Bjorn Helgaas
2024-07-10 16:35 ` Stewart Hildebrand
2024-07-10 13:56 ` Ilpo Järvinen
1 sibling, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2024-07-09 16:21 UTC (permalink / raw)
To: Stewart Hildebrand; +Cc: Bjorn Helgaas, linux-pci, linux-kernel
On Tue, Jul 09, 2024 at 09:36:03AM -0400, Stewart Hildebrand wrote:
> Issues observed when small (<4k) BARs are not 4k aligned are:
>
> 1. Devices to be passed through (to e.g. a Xen HVM guest) with small
> (<4k) BARs require each memory BAR to be page aligned. Currently, the
> only way to guarantee this alignment from a user perspective is to fake
> the size of the BARs using the pci=resource_alignment= option. This is a
> bad user experience, and faking the BAR size is not always desirable.
> See the comment in drivers/pci/pci.c:pci_request_resource_alignment()
> for further discussion.
Include the relevant part of this discussion directly here so this log
is self-contained. Someday that function will change, which will make
this commit log less useful.
> 2. Devices with multiple small (<4k) BARs could have the MSI-X tables
> located in one of its small (<4k) BARs. This may lead to the MSI-X
> tables being mapped in the same 4k region as other data. The PCIe 6.1
> specification (section 7.7.2 MSI-X Capability and Table Structure) says
> we probably shouldn't do that.
>
> To improve the user experience, and increase conformance to PCIe spec,
> set the default minimum resource alignment of memory BARs to 4k. Choose
> 4k (rather than PAGE_SIZE) for the alignment value in the common code,
> since that is the value called out in the PCIe 6.1 spec, section 7.7.2.
> The new default alignment may be overridden by arches by implementing
> pcibios_default_alignment(), or by the user with the
> pci=resource_alignment= option.
>
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> Preparatory patches in this series are prerequisites to this patch.
> ---
> drivers/pci/pci.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9f7894538334..e7b648304383 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6453,7 +6453,12 @@ struct pci_dev __weak *pci_real_dma_dev(struct pci_dev *dev)
>
> resource_size_t __weak pcibios_default_alignment(void)
> {
> - return 0;
> + /*
> + * Avoid MSI-X tables being mapped in the same 4k region as other data
> + * according to PCIe 6.1 specification section 7.7.2 MSI-X Capability
> + * and Table Structure.
> + */
> + return 4 * 1024;
> }
>
> /*
> --
> 2.45.2
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 6/6] PCI: align small (<4k) BARs
2024-07-09 13:36 ` [PATCH 6/6] PCI: align small (<4k) BARs Stewart Hildebrand
2024-07-09 16:21 ` Bjorn Helgaas
@ 2024-07-10 13:56 ` Ilpo Järvinen
2024-07-10 16:28 ` Stewart Hildebrand
1 sibling, 1 reply; 24+ messages in thread
From: Ilpo Järvinen @ 2024-07-10 13:56 UTC (permalink / raw)
To: Stewart Hildebrand; +Cc: Bjorn Helgaas, linux-pci, linux-kernel
On Tue, 9 Jul 2024, Stewart Hildebrand wrote:
> Issues observed when small (<4k) BARs are not 4k aligned are:
>
> 1. Devices to be passed through (to e.g. a Xen HVM guest) with small
> (<4k) BARs require each memory BAR to be page aligned. Currently, the
> only way to guarantee this alignment from a user perspective is to fake
> the size of the BARs using the pci=resource_alignment= option. This is a
> bad user experience, and faking the BAR size is not always desirable.
> See the comment in drivers/pci/pci.c:pci_request_resource_alignment()
> for further discussion.
>
> 2. Devices with multiple small (<4k) BARs could have the MSI-X tables
> located in one of its small (<4k) BARs. This may lead to the MSI-X
> tables being mapped in the same 4k region as other data. The PCIe 6.1
> specification (section 7.7.2 MSI-X Capability and Table Structure) says
> we probably shouldn't do that.
>
> To improve the user experience, and increase conformance to PCIe spec,
> set the default minimum resource alignment of memory BARs to 4k. Choose
> 4k (rather than PAGE_SIZE) for the alignment value in the common code,
> since that is the value called out in the PCIe 6.1 spec, section 7.7.2.
> The new default alignment may be overridden by arches by implementing
> pcibios_default_alignment(), or by the user with the
> pci=resource_alignment= option.
>
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> Preparatory patches in this series are prerequisites to this patch.
> ---
> drivers/pci/pci.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9f7894538334..e7b648304383 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6453,7 +6453,12 @@ struct pci_dev __weak *pci_real_dma_dev(struct pci_dev *dev)
>
> resource_size_t __weak pcibios_default_alignment(void)
> {
> - return 0;
> + /*
> + * Avoid MSI-X tables being mapped in the same 4k region as other data
> + * according to PCIe 6.1 specification section 7.7.2 MSI-X Capability
> + * and Table Structure.
> + */
> + return 4 * 1024;
SZ_4K
+ add #include for it if its not yet included by the .c file.
--
i.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-09 13:36 ` [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment Stewart Hildebrand
2024-07-09 16:19 ` Bjorn Helgaas
@ 2024-07-10 14:05 ` Ilpo Järvinen
2024-07-15 17:30 ` Stewart Hildebrand
1 sibling, 1 reply; 24+ messages in thread
From: Ilpo Järvinen @ 2024-07-10 14:05 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, LKML
On Tue, 9 Jul 2024, Stewart Hildebrand wrote:
> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> x86 due to the alignment being overwritten in
> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> make it work on x86.
>
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> RFC: We don't have enough info in this function to re-calculate the
> alignment value in case of IORESOURCE_STARTALIGN. Luckily our
> alignment value seems to be intact, so just don't touch it...
> Alternatively, we could call pci_reassigndev_resource_alignment()
> after the loop. Would that be preferable?
> ---
> arch/x86/pci/i386.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> index f2f4a5d50b27..ff6e61389ec7 100644
> --- a/arch/x86/pci/i386.c
> +++ b/arch/x86/pci/i386.c
> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
> /* We'll assign a new address later */
> pcibios_save_fw_addr(dev,
> idx, r->start);
> - r->end -= r->start;
> - r->start = 0;
> + if (!(r->flags &
> + IORESOURCE_STARTALIGN)) {
> + r->end -= r->start;
> + r->start = 0;
> + }
> }
> }
> }
>
As a general comment to that loop in pcibios_allocate_dev_resources()
function, it would be nice to reverse some of the logic in the if
conditions and use continue to limit the runaway indentation level.
--
i.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-09 16:19 ` Bjorn Helgaas
@ 2024-07-10 16:16 ` Stewart Hildebrand
2024-07-10 21:24 ` Bjorn Helgaas
0 siblings, 1 reply; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-10 16:16 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On 7/9/24 12:19, Bjorn Helgaas wrote:
> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>> x86 due to the alignment being overwritten in
>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>> make it work on x86.
>
> Is this a regression? I didn't look up when IORESOURCE_STARTALIGN was
> added, but likely it was for some situation on x86, so presumably it
> worked at one time. If something broke it in the meantime, it would
> be nice to identify the commit that broke it.
No, I don't have reason to believe it's a regression.
IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
up resource alignment management").
There are some interesting commits mentioning 884525655d07:
5f17cfce5776 ("PCI: fix pbus_size_mem() resource alignment for CardBus
controllers")
934b7024f0ed ("Fix cardbus resource allocation")
It would seem that 884525655d07 missed updating the bits in
arch/x86/pci/i386.c. I don't think a Fixes: tag is strictly necessary
because I think the issue would only start to trigger once the default
alignment is updated in the last patch of this series.
As far as I can tell, it only breaks in a certain corner case that's not
really possible to trigger yet. The corner case seems to be the
following:
* Only on x86
* The BAR has been set in firmware
* Alignment has been requested via IORESOURCE_STARTALIGN
* The IORESOURCE_UNSET flag is set
* Only PCI_STD_RESOURCES and PCI_IOV_RESOURCES resources (not bridge
windows)
I think the reason this hasn't been seen until now is that it's not
possible to request IORESOURCE_STARTALIGN alignment via the
pci=resource_alignment= option. That option instead sets
IORESOURCE_SIZEALIGN, and in that case it works fine.
After the last patch in this series that changes the default alignment,
we will be starting to use IORESOURCE_STARTALIGN on all
not-sufficiently-aligned resources, and the corner case would be more
likely (rather, possible at all) to trigger.
My commit message is overstating the severity of the issue. So, how
about I make the commit message less scary:
There is a corner case in pcibios_allocate_dev_resources() that doesn't
account for IORESOURCE_STARTALIGN, in which case the alignment would be
overwritten. As far as I can tell, the corner case is not yet possible
to trigger, but it will be possible once the resource alignment changes.
Account for IORESOURCE_STARTALIGN in preparation for changing the
default resource alignment.
> Nit: follow the subject line conventions for this and the other
> patches. Learn them with "git log --oneline". For this patch,
> "x86/PCI: <Capitalized text>" is appropriate.
Thanks for pointing this out, I'll fix
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> RFC: We don't have enough info in this function to re-calculate the
>> alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>> alignment value seems to be intact, so just don't touch it...
>> Alternatively, we could call pci_reassigndev_resource_alignment()
>> after the loop. Would that be preferable?
Any comments on this? After some more thought, I think calling
pci_reassigndev_resource_alignment() after the loop is probably more
correct, so if there aren't any comments, I'll plan on changing it.
>> ---
>> arch/x86/pci/i386.c | 7 +++++--
>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>> index f2f4a5d50b27..ff6e61389ec7 100644
>> --- a/arch/x86/pci/i386.c
>> +++ b/arch/x86/pci/i386.c
>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>> /* We'll assign a new address later */
>> pcibios_save_fw_addr(dev,
>> idx, r->start);
>> - r->end -= r->start;
>> - r->start = 0;
>> + if (!(r->flags &
>> + IORESOURCE_STARTALIGN)) {
>> + r->end -= r->start;
>> + r->start = 0;
>> + }
>> }
>> }
>> }
>> --
>> 2.45.2
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 6/6] PCI: align small (<4k) BARs
2024-07-10 13:56 ` Ilpo Järvinen
@ 2024-07-10 16:28 ` Stewart Hildebrand
0 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-10 16:28 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Bjorn Helgaas, linux-pci, linux-kernel
On 7/10/24 09:56, Ilpo Järvinen wrote:
> On Tue, 9 Jul 2024, Stewart Hildebrand wrote:
>
>> Issues observed when small (<4k) BARs are not 4k aligned are:
>>
>> 1. Devices to be passed through (to e.g. a Xen HVM guest) with small
>> (<4k) BARs require each memory BAR to be page aligned. Currently, the
>> only way to guarantee this alignment from a user perspective is to fake
>> the size of the BARs using the pci=resource_alignment= option. This is a
>> bad user experience, and faking the BAR size is not always desirable.
>> See the comment in drivers/pci/pci.c:pci_request_resource_alignment()
>> for further discussion.
>>
>> 2. Devices with multiple small (<4k) BARs could have the MSI-X tables
>> located in one of its small (<4k) BARs. This may lead to the MSI-X
>> tables being mapped in the same 4k region as other data. The PCIe 6.1
>> specification (section 7.7.2 MSI-X Capability and Table Structure) says
>> we probably shouldn't do that.
>>
>> To improve the user experience, and increase conformance to PCIe spec,
>> set the default minimum resource alignment of memory BARs to 4k. Choose
>> 4k (rather than PAGE_SIZE) for the alignment value in the common code,
>> since that is the value called out in the PCIe 6.1 spec, section 7.7.2.
>> The new default alignment may be overridden by arches by implementing
>> pcibios_default_alignment(), or by the user with the
>> pci=resource_alignment= option.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> Preparatory patches in this series are prerequisites to this patch.
>> ---
>> drivers/pci/pci.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 9f7894538334..e7b648304383 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6453,7 +6453,12 @@ struct pci_dev __weak *pci_real_dma_dev(struct pci_dev *dev)
>>
>> resource_size_t __weak pcibios_default_alignment(void)
>> {
>> - return 0;
>> + /*
>> + * Avoid MSI-X tables being mapped in the same 4k region as other data
>> + * according to PCIe 6.1 specification section 7.7.2 MSI-X Capability
>> + * and Table Structure.
>> + */
>> + return 4 * 1024;
>
> SZ_4K
Ah, thank you! I'll fix.
>
> + add #include for it if its not yet included by the .c file.
>
I'll add #include <linux/sizes.h>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 6/6] PCI: align small (<4k) BARs
2024-07-09 16:21 ` Bjorn Helgaas
@ 2024-07-10 16:35 ` Stewart Hildebrand
0 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-10 16:35 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Bjorn Helgaas, linux-pci, linux-kernel
On 7/9/24 12:21, Bjorn Helgaas wrote:
> On Tue, Jul 09, 2024 at 09:36:03AM -0400, Stewart Hildebrand wrote:
>> Issues observed when small (<4k) BARs are not 4k aligned are:
>>
>> 1. Devices to be passed through (to e.g. a Xen HVM guest) with small
>> (<4k) BARs require each memory BAR to be page aligned. Currently, the
>> only way to guarantee this alignment from a user perspective is to fake
>> the size of the BARs using the pci=resource_alignment= option. This is a
>> bad user experience, and faking the BAR size is not always desirable.
>> See the comment in drivers/pci/pci.c:pci_request_resource_alignment()
>> for further discussion.
>
> Include the relevant part of this discussion directly here so this log
> is self-contained. Someday that function will change, which will make
> this commit log less useful.
Will do
>> 2. Devices with multiple small (<4k) BARs could have the MSI-X tables
>> located in one of its small (<4k) BARs. This may lead to the MSI-X
>> tables being mapped in the same 4k region as other data. The PCIe 6.1
>> specification (section 7.7.2 MSI-X Capability and Table Structure) says
>> we probably shouldn't do that.
>>
>> To improve the user experience, and increase conformance to PCIe spec,
>> set the default minimum resource alignment of memory BARs to 4k. Choose
>> 4k (rather than PAGE_SIZE) for the alignment value in the common code,
>> since that is the value called out in the PCIe 6.1 spec, section 7.7.2.
>> The new default alignment may be overridden by arches by implementing
>> pcibios_default_alignment(), or by the user with the
>> pci=resource_alignment= option.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> Preparatory patches in this series are prerequisites to this patch.
>> ---
>> drivers/pci/pci.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 9f7894538334..e7b648304383 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6453,7 +6453,12 @@ struct pci_dev __weak *pci_real_dma_dev(struct pci_dev *dev)
>>
>> resource_size_t __weak pcibios_default_alignment(void)
>> {
>> - return 0;
>> + /*
>> + * Avoid MSI-X tables being mapped in the same 4k region as other data
>> + * according to PCIe 6.1 specification section 7.7.2 MSI-X Capability
>> + * and Table Structure.
>> + */
>> + return 4 * 1024;
>> }
>>
>> /*
>> --
>> 2.45.2
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/6] PCI: restore memory decoding after reallocation
2024-07-09 16:16 ` Bjorn Helgaas
@ 2024-07-10 20:31 ` Stewart Hildebrand
0 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-10 20:31 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Bjorn Helgaas, linux-pci, linux-kernel
On 7/9/24 12:16, Bjorn Helgaas wrote:
> On Tue, Jul 09, 2024 at 09:36:00AM -0400, Stewart Hildebrand wrote:
>> Currently, the PCI subsystem unconditionally clears the memory decoding
>> bit of devices with resource alignment specified. Unfortunately, some
>> drivers assume memory decoding was enabled by firmware. Restore the
>> memory decoding bit after the resource has been reallocated.
>
> Which drivers have you tripped over? Those drivers apparently don't
> call pci_enable_device() and assume the firmware has left memory
> decoding enabled. I don't think there's any guarantee that firmware
> must do that, so the drivers are probably broken on some platforms,
> and we could improve things overall by adding the pci_enable_device()
> to them.
Agreed. Well, it would be vgacon, but lspci -v doesn't actually show any
driver attached to the device in my test case. Presumably because vgacon
just uses the 0xb8000 buffer directly without any sort of PCI
involvement. Memory decoding must be enabled on this particular VGA
device for vgacon to properly display a console on the display. If
memory decoding becomes disabled on the VGA device (or fails to be
reenabled), the console ceases to display properly.
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> Relevant prior discussion at [1]
>>
>> [1] https://lore.kernel.org/linux-pci/20160906165652.GE1214@localhost/
>> ---
>> drivers/pci/pci.c | 1 +
>> drivers/pci/setup-bus.c | 25 +++++++++++++++++++++++++
>> include/linux/pci.h | 2 ++
>> 3 files changed, 28 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index f017e7a8f028..7953e75b9129 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6633,6 +6633,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>>
>> pci_read_config_word(dev, PCI_COMMAND, &command);
>> if (command & PCI_COMMAND_MEMORY) {
>> + dev->dev_flags |= PCI_DEV_FLAGS_MEMORY_ENABLE;
>> command &= ~PCI_COMMAND_MEMORY;
>> pci_write_config_word(dev, PCI_COMMAND, command);
>> }
>
> It would be nice if this could be contained to
> pci_reassigndev_resource_alignment() so the clear and restore could be
> in the same function. But I suppose the concern is that re-enabling
> decoding too early could be an issue in a hierarchy where bridge
> windows are also reassigned?
Well, yes, and, even if the bridge windows are sufficient and we're just
allocating a new a BAR, that happens later on. If we were to return from
pci_reassigndev_resource_alignment() with memory decoding enabled, we'd
have the situation described in [1] with our knowledge of what the BAR
contains thrown away while memory decoding is enabled. We'd also
potentially be writing the new BAR while memory decoding is enabled.
>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>> index ab7510ce6917..6847b251e19a 100644
>> --- a/drivers/pci/setup-bus.c
>> +++ b/drivers/pci/setup-bus.c
>> @@ -2131,6 +2131,29 @@ pci_root_bus_distribute_available_resources(struct pci_bus *bus,
>> }
>> }
>>
>> +static void restore_memory_decoding(struct pci_bus *bus)
>> +{
>> + struct pci_dev *dev;
>> +
>> + list_for_each_entry(dev, &bus->devices, bus_list) {
>> + struct pci_bus *b;
>> +
>> + if (dev->dev_flags & PCI_DEV_FLAGS_MEMORY_ENABLE) {
>> + u16 command;
>> +
>> + pci_read_config_word(dev, PCI_COMMAND, &command);
>> + command |= PCI_COMMAND_MEMORY;
>> + pci_write_config_word(dev, PCI_COMMAND, command);
>> + }
>> +
>> + b = dev->subordinate;
>> + if (!b)
>> + continue;
>> +
>> + restore_memory_decoding(b);
>> + }
>> +}
>> +
>> /*
>> * First try will not touch PCI bridge res.
>> * Second and later try will clear small leaf bridge res.
>> @@ -2229,6 +2252,8 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
>> goto again;
>>
>> dump:
>> + restore_memory_decoding(bus);
>> +
>> /* Dump the resource on buses */
>> pci_bus_dump_resources(bus);
>> }
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index e83ac93a4dcb..cb5df4c6a999 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -245,6 +245,8 @@ enum pci_dev_flags {
>> PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
>> /* Device does honor MSI masking despite saying otherwise */
>> PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12),
>> + /* Firmware enabled memory decoding, to be restored if BAR is updated */
>> + PCI_DEV_FLAGS_MEMORY_ENABLE = (__force pci_dev_flags_t) (1 << 13),
>> };
>>
>> enum pci_irq_reroute_variant {
>> --
>> 2.45.2
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-10 16:16 ` Stewart Hildebrand
@ 2024-07-10 21:24 ` Bjorn Helgaas
2024-07-10 22:49 ` Stewart Hildebrand
2024-07-15 17:26 ` Stewart Hildebrand
0 siblings, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2024-07-10 21:24 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
> On 7/9/24 12:19, Bjorn Helgaas wrote:
> > On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> >> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> >> x86 due to the alignment being overwritten in
> >> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> >> make it work on x86.
> >
> > Is this a regression? I didn't look up when IORESOURCE_STARTALIGN was
> > added, but likely it was for some situation on x86, so presumably it
> > worked at one time. If something broke it in the meantime, it would
> > be nice to identify the commit that broke it.
>
> No, I don't have reason to believe it's a regression.
>
> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
> up resource alignment management").
Ah, OK. IORESOURCE_STARTALIGN is used for bridge windows, which don't
need to be aligned on their size as BARs do. It would be terrible if
that usage was broken, which is why I was alarmed by the idea of it
not working on x86.
But this patch is only relevant for BARs. I was a little confused
about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
force alignment on *more* than the BAR's size, e.g., to prevent
multiple BARs from being put in the same page.
Bottom line, this would need to be a little more specific so it
doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.
IIUC, the main purpose of the series is to align all BARs to at least
4K. I don't think the series relies on IORESOURCE_STARTALIGN to do
that. But there's an issue with "pci=resource_alignment=..." that you
noticed sort of incidentally, and this patch fixes that? If so, it's
important to mention that parameter.
> >> RFC: We don't have enough info in this function to re-calculate the
> >> alignment value in case of IORESOURCE_STARTALIGN. Luckily our
> >> alignment value seems to be intact, so just don't touch it...
> >> Alternatively, we could call pci_reassigndev_resource_alignment()
> >> after the loop. Would that be preferable?
>
> Any comments on this? After some more thought, I think calling
> pci_reassigndev_resource_alignment() after the loop is probably more
> correct, so if there aren't any comments, I'll plan on changing it.
Sounds like this might be a separate patch unless it logically has to
be part of this one to avoid an issue.
> >> ---
> >> arch/x86/pci/i386.c | 7 +++++--
> >> 1 file changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
> >> index f2f4a5d50b27..ff6e61389ec7 100644
> >> --- a/arch/x86/pci/i386.c
> >> +++ b/arch/x86/pci/i386.c
> >> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
> >> /* We'll assign a new address later */
> >> pcibios_save_fw_addr(dev,
> >> idx, r->start);
> >> - r->end -= r->start;
> >> - r->start = 0;
> >> + if (!(r->flags &
> >> + IORESOURCE_STARTALIGN)) {
> >> + r->end -= r->start;
> >> + r->start = 0;
> >> + }
I wondered why this only touched x86 and whether other arches need a
similar change. This is used in two paths:
1) pcibios_resource_survey_bus(), which is only implemented by x86
2) pcibios_resource_survey(), which is implemented by x86 and
powerpc. The powerpc pcibios_allocate_resources() is similar to the
x86 pcibios_allocate_dev_resources(), but powerpc doesn't have the
r->end and r->start updates you're making conditional.
So it looks like x86 is indeed the only place that needs this change.
None of this stuff is arch-specific, so it's a shame that we don't
have generic code for it all. Sigh, oh well.
> >> }
> >> }
> >> }
> >> --
> >> 2.45.2
> >>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-10 21:24 ` Bjorn Helgaas
@ 2024-07-10 22:49 ` Stewart Hildebrand
2024-07-11 18:40 ` Bjorn Helgaas
2024-07-15 17:26 ` Stewart Hildebrand
1 sibling, 1 reply; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-10 22:49 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On 7/10/24 17:24, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
>> On 7/9/24 12:19, Bjorn Helgaas wrote:
>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>>>> x86 due to the alignment being overwritten in
>>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>>>> make it work on x86.
>>>
>>> Is this a regression? I didn't look up when IORESOURCE_STARTALIGN was
>>> added, but likely it was for some situation on x86, so presumably it
>>> worked at one time. If something broke it in the meantime, it would
>>> be nice to identify the commit that broke it.
>>
>> No, I don't have reason to believe it's a regression.
>>
>> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
>> up resource alignment management").
>
> Ah, OK. IORESOURCE_STARTALIGN is used for bridge windows, which don't
> need to be aligned on their size as BARs do. It would be terrible if
> that usage was broken, which is why I was alarmed by the idea of it
> not working on x86>
> But this patch is only relevant for BARs. I was a little confused
> about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
> force alignment on *more* than the BAR's size, e.g., to prevent
> multiple BARs from being put in the same page.
>
> Bottom line, this would need to be a little more specific so it
> doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.
I'll make the commit message clearer.
> IIUC, the main purpose of the series is to align all BARs to at least
> 4K. I don't think the series relies on IORESOURCE_STARTALIGN to do
> that.
Yes, it does rely on IORESOURCE_STARTALIGN for BARs.
> But there's an issue with "pci=resource_alignment=..." that you
> noticed sort of incidentally, and this patch fixes that?
No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
breaks pcitest. And we'd like pcitest to work properly for PCI
passthrough validation with Xen, hence the need for
IORESOURCE_STARTALIGN.
> If so, it's
> important to mention that parameter.
>
>>>> RFC: We don't have enough info in this function to re-calculate the
>>>> alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>>>> alignment value seems to be intact, so just don't touch it...
>>>> Alternatively, we could call pci_reassigndev_resource_alignment()
>>>> after the loop. Would that be preferable?
>>
>> Any comments on this? After some more thought, I think calling
>> pci_reassigndev_resource_alignment() after the loop is probably more
>> correct, so if there aren't any comments, I'll plan on changing it.
>
> Sounds like this might be a separate patch unless it logically has to
> be part of this one to avoid an issue
>
>>>> ---
>>>> arch/x86/pci/i386.c | 7 +++++--
>>>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>>>> index f2f4a5d50b27..ff6e61389ec7 100644
>>>> --- a/arch/x86/pci/i386.c
>>>> +++ b/arch/x86/pci/i386.c
>>>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>>>> /* We'll assign a new address later */
>>>> pcibios_save_fw_addr(dev,
>>>> idx, r->start);
>>>> - r->end -= r->start;
>>>> - r->start = 0;
>>>> + if (!(r->flags &
>>>> + IORESOURCE_STARTALIGN)) {
>>>> + r->end -= r->start;
>>>> + r->start = 0;
>>>> + }
>
> I wondered why this only touched x86 and whether other arches need a
> similar change. This is used in two paths:
>
> 1) pcibios_resource_survey_bus(), which is only implemented by x86
>
> 2) pcibios_resource_survey(), which is implemented by x86 and
> powerpc. The powerpc pcibios_allocate_resources() is similar to the
> x86 pcibios_allocate_dev_resources(), but powerpc doesn't have the
> r->end and r->start updates you're making conditional.
>
> So it looks like x86 is indeed the only place that needs this change.
> None of this stuff is arch-specific, so it's a shame that we don't
> have generic code for it all. Sigh, oh well.
>
>>>> }
>>>> }
>>>> }
>>>> --
>>>> 2.45.2
>>>>
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/6] PCI: align small (<4k) BARs
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
` (5 preceding siblings ...)
2024-07-09 13:36 ` [PATCH 6/6] PCI: align small (<4k) BARs Stewart Hildebrand
@ 2024-07-10 23:26 ` Stewart Hildebrand
6 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-10 23:26 UTC (permalink / raw)
To: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
Cc: x86, linux-pci, linux-kernel
On 7/9/24 09:35, Stewart Hildebrand wrote:
> This series sets the default minimum resource alignment to 4k for memory
> BARs. In preparation, it makes an optimization and addresses some corner
> cases observed when reallocating BARs. I consider the prepapatory
> patches to be prerequisites to changing the default BAR size.
Sorry, I meant for this to say default BAR "alignment", not "size"
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-10 22:49 ` Stewart Hildebrand
@ 2024-07-11 18:40 ` Bjorn Helgaas
2024-07-11 18:58 ` Stewart Hildebrand
0 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2024-07-11 18:40 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On Wed, Jul 10, 2024 at 06:49:42PM -0400, Stewart Hildebrand wrote:
> On 7/10/24 17:24, Bjorn Helgaas wrote:
> > On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
> >> On 7/9/24 12:19, Bjorn Helgaas wrote:
> >>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> >>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> >>>> x86 due to the alignment being overwritten in
> >>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> >>>> make it work on x86.
> >>>
> >>> Is this a regression? I didn't look up when IORESOURCE_STARTALIGN was
> >>> added, but likely it was for some situation on x86, so presumably it
> >>> worked at one time. If something broke it in the meantime, it would
> >>> be nice to identify the commit that broke it.
> >>
> >> No, I don't have reason to believe it's a regression.
> >>
> >> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
> >> up resource alignment management").
> >
> > Ah, OK. IORESOURCE_STARTALIGN is used for bridge windows, which don't
> > need to be aligned on their size as BARs do. It would be terrible if
> > that usage was broken, which is why I was alarmed by the idea of it
> > not working on x86>
> > But this patch is only relevant for BARs. I was a little confused
> > about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
> > force alignment on *more* than the BAR's size, e.g., to prevent
> > multiple BARs from being put in the same page.
> >
> > Bottom line, this would need to be a little more specific so it
> > doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.
>
> I'll make the commit message clearer.
>
> > IIUC, the main purpose of the series is to align all BARs to at least
> > 4K. I don't think the series relies on IORESOURCE_STARTALIGN to do
> > that.
>
> Yes, it does rely on IORESOURCE_STARTALIGN for BARs.
Oh, I missed that, sorry. The only places I see that set
IORESOURCE_STARTALIGN are pci_request_resource_alignment(), which is
where I got the "pci=resource_alignment=..." connection, and
pbus_size_io(), pbus_size_mem(), and pci_bus_size_cardbus(), which are
for bridge windows, AFAICS.
Doesn't the >= 4K alignment in this series hinge on the
pcibios_default_alignment() change? It looks like that would force at
least 4K alignment independent of IORESOURCE_STARTALIGN.
> > But there's an issue with "pci=resource_alignment=..." that you
> > noticed sort of incidentally, and this patch fixes that?
>
> No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
> breaks pcitest. And we'd like pcitest to work properly for PCI
> passthrough validation with Xen, hence the need for
> IORESOURCE_STARTALIGN.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-11 18:40 ` Bjorn Helgaas
@ 2024-07-11 18:58 ` Stewart Hildebrand
2024-07-11 20:35 ` Bjorn Helgaas
0 siblings, 1 reply; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-11 18:58 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On 7/11/24 14:40, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2024 at 06:49:42PM -0400, Stewart Hildebrand wrote:
>> On 7/10/24 17:24, Bjorn Helgaas wrote:
>>> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
>>>> On 7/9/24 12:19, Bjorn Helgaas wrote:
>>>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>>>>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>>>>>> x86 due to the alignment being overwritten in
>>>>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>>>>>> make it work on x86.
>>>>>
>>>>> Is this a regression? I didn't look up when IORESOURCE_STARTALIGN was
>>>>> added, but likely it was for some situation on x86, so presumably it
>>>>> worked at one time. If something broke it in the meantime, it would
>>>>> be nice to identify the commit that broke it.
>>>>
>>>> No, I don't have reason to believe it's a regression.
>>>>
>>>> IORESOURCE_STARTALIGN was introduced in commit 884525655d07 ("PCI: clean
>>>> up resource alignment management").
>>>
>>> Ah, OK. IORESOURCE_STARTALIGN is used for bridge windows, which don't
>>> need to be aligned on their size as BARs do. It would be terrible if
>>> that usage was broken, which is why I was alarmed by the idea of it
>>> not working on x86>
>>> But this patch is only relevant for BARs. I was a little confused
>>> about IORESOURCE_STARTALIGN for a BAR, but I guess the point is to
>>> force alignment on *more* than the BAR's size, e.g., to prevent
>>> multiple BARs from being put in the same page.
>>>
>>> Bottom line, this would need to be a little more specific so it
>>> doesn't suggest that IORESOURCE_STARTALIGN for windows is broken.
>>
>> I'll make the commit message clearer.
>>
>>> IIUC, the main purpose of the series is to align all BARs to at least
>>> 4K. I don't think the series relies on IORESOURCE_STARTALIGN to do
>>> that.
>>
>> Yes, it does rely on IORESOURCE_STARTALIGN for BARs.
>
> Oh, I missed that, sorry. The only places I see that set
> IORESOURCE_STARTALIGN are pci_request_resource_alignment(), which is
> where I got the "pci=resource_alignment=..." connection, and
> pbus_size_io(), pbus_size_mem(), and pci_bus_size_cardbus(), which are
> for bridge windows, AFAICS.
>
> Doesn't the >= 4K alignment in this series hinge on the
> pcibios_default_alignment() change?
Yep
> It looks like that would force at
> least 4K alignment independent of IORESOURCE_STARTALIGN.
Changing pcibios_default_alignment() (without pci=resource_alignment=
specified) results in IORESOURCE_STARTALIGN.
>>> But there's an issue with "pci=resource_alignment=..." that you
>>> noticed sort of incidentally, and this patch fixes that?
>>
>> No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
>> breaks pcitest. And we'd like pcitest to work properly for PCI
>> passthrough validation with Xen, hence the need for
>> IORESOURCE_STARTALIGN.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-11 18:58 ` Stewart Hildebrand
@ 2024-07-11 20:35 ` Bjorn Helgaas
0 siblings, 0 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2024-07-11 20:35 UTC (permalink / raw)
To: Stewart Hildebrand
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel,
Yongji Xie
[+cc Yongji Xie]
On Thu, Jul 11, 2024 at 02:58:24PM -0400, Stewart Hildebrand wrote:
> On 7/11/24 14:40, Bjorn Helgaas wrote:
> > On Wed, Jul 10, 2024 at 06:49:42PM -0400, Stewart Hildebrand wrote:
> >> On 7/10/24 17:24, Bjorn Helgaas wrote:
> >>> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
> >>>> On 7/9/24 12:19, Bjorn Helgaas wrote:
> >>>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
> >>>>>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
> >>>>>> x86 due to the alignment being overwritten in
> >>>>>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
> >>>>>> make it work on x86.
> ...
> >>> IIUC, the main purpose of the series is to align all BARs to at least
> >>> 4K. I don't think the series relies on IORESOURCE_STARTALIGN to do
> >>> that.
> >>
> >> Yes, it does rely on IORESOURCE_STARTALIGN for BARs.
> >
> > Oh, I missed that, sorry. The only places I see that set
> > IORESOURCE_STARTALIGN are pci_request_resource_alignment(), which is
> > where I got the "pci=resource_alignment=..." connection, and
> > pbus_size_io(), pbus_size_mem(), and pci_bus_size_cardbus(), which are
> > for bridge windows, AFAICS.
> >
> > Doesn't the >= 4K alignment in this series hinge on the
> > pcibios_default_alignment() change?
>
> Yep
>
> > It looks like that would force at
> > least 4K alignment independent of IORESOURCE_STARTALIGN.
>
> Changing pcibios_default_alignment() (without pci=resource_alignment=
> specified) results in IORESOURCE_STARTALIGN.
Mmmm. I guess it's this path:
pci_device_add
pci_reassigndev_resource_alignment
align = pci_specified_resource_alignment(&resize)
pcibios_default_alignment
for (i = 0; i <= PCI_ROM_RESOURCE; i++)
pci_request_resource_alignment(i, align, resize)
if (!resize)
r->flags |= IORESOURCE_STARTALIGN
where "resize" is false because the device wasn't mentioned in a
"pci=resource_alignment=..." parameter, so
pci_request_resource_alignment() sets IORESOURCE_STARTALIGN.
When 0a701aa63784 ("PCI: Add pcibios_default_alignment() for
arch-specific alignment control") added pcibios_default_alignment(),
we got a way to do arch-specific alignment, but if the alignment is
non-zero, the implementation *also* applies that alignment to ALL
devices in the system.
Prior to 0a701aa63784, I think pci_specified_resource_alignment()
only caused increased alignment for devices mentioned with a
"pci=resource_alignment=..." parameter.
I suppose the change to do it for all devices was intentional because
382746376993 ("powerpc/powernv: Override pcibios_default_alignment()
to force PCI devices to be page aligned") says it's for all PCI
devices on PowerNV.
Since 0a701aa63784 and 382746376993 were for VFIO, which is generic, I
kind of wish that we'd done it in a more generic way instead of making
a pcibios interface that is only implemented for PowerNV.
This series does make it generic by doing it in the weak
pcibios_default_alignment() that's used by default, so that's good.
It's ancient history now, but I'm also a little unsure about the way
pci_reassigndev_resource_alignment() is kind of tacked on at the end
in pci_device_add() and not integrated with the usual BAR sizing and
allocation machinery.
> >>> But there's an issue with "pci=resource_alignment=..." that you
> >>> noticed sort of incidentally, and this patch fixes that?
> >>
> >> No, pci=resource_alignment= results in IORESOURCE_SIZEALIGN, which
> >> breaks pcitest. And we'd like pcitest to work properly for PCI
> >> passthrough validation with Xen, hence the need for
> >> IORESOURCE_STARTALIGN.
Thanks for working on this.
Bjorn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-10 21:24 ` Bjorn Helgaas
2024-07-10 22:49 ` Stewart Hildebrand
@ 2024-07-15 17:26 ` Stewart Hildebrand
1 sibling, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-15 17:26 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, linux-kernel
On 7/10/24 17:24, Bjorn Helgaas wrote:
> On Wed, Jul 10, 2024 at 12:16:24PM -0400, Stewart Hildebrand wrote:
>> On 7/9/24 12:19, Bjorn Helgaas wrote:
>>> On Tue, Jul 09, 2024 at 09:36:01AM -0400, Stewart Hildebrand wrote:
>>>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>>>> index f2f4a5d50b27..ff6e61389ec7 100644
>>>> --- a/arch/x86/pci/i386.c
>>>> +++ b/arch/x86/pci/i386.c
>>>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>>>> /* We'll assign a new address later */
>>>> pcibios_save_fw_addr(dev,
>>>> idx, r->start);
>>>> - r->end -= r->start;
>>>> - r->start = 0;
>>>> + if (!(r->flags &
>>>> + IORESOURCE_STARTALIGN)) {
>>>> + r->end -= r->start;
>>>> + r->start = 0;
>>>> + }
>
> I wondered why this only touched x86 and whether other arches need a
> similar change. This is used in two paths:
>
> 1) pcibios_resource_survey_bus(), which is only implemented by x86
>
> 2) pcibios_resource_survey(), which is implemented by x86 and
> powerpc. The powerpc pcibios_allocate_resources() is similar to the
> x86 pcibios_allocate_dev_resources(), but powerpc doesn't have the
> r->end and r->start updates you're making conditional.
Actually it does. It's in a separate function, alloc_resource(). I'll
make the change over there too.
> So it looks like x86 is indeed the only place that needs this change.
> None of this stuff is arch-specific, so it's a shame that we don't
> have generic code for it all. Sigh, oh well.
>
>>>> }
>>>> }
>>>> }
>>>> --
>>>> 2.45.2
>>>>
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment
2024-07-10 14:05 ` Ilpo Järvinen
@ 2024-07-15 17:30 ` Stewart Hildebrand
0 siblings, 0 replies; 24+ messages in thread
From: Stewart Hildebrand @ 2024-07-15 17:30 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Bjorn Helgaas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, x86, linux-pci, LKML
On 7/10/24 10:05, Ilpo Järvinen wrote:
> On Tue, 9 Jul 2024, Stewart Hildebrand wrote:
>
>> Currently, it's not possible to use the IORESOURCE_STARTALIGN flag on
>> x86 due to the alignment being overwritten in
>> pcibios_allocate_dev_resources(). Make one small change in arch/x86 to
>> make it work on x86.
>>
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> ---
>> RFC: We don't have enough info in this function to re-calculate the
>> alignment value in case of IORESOURCE_STARTALIGN. Luckily our
>> alignment value seems to be intact, so just don't touch it...
>> Alternatively, we could call pci_reassigndev_resource_alignment()
>> after the loop. Would that be preferable?
>> ---
>> arch/x86/pci/i386.c | 7 +++++--
>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
>> index f2f4a5d50b27..ff6e61389ec7 100644
>> --- a/arch/x86/pci/i386.c
>> +++ b/arch/x86/pci/i386.c
>> @@ -283,8 +283,11 @@ static void pcibios_allocate_dev_resources(struct pci_dev *dev, int pass)
>> /* We'll assign a new address later */
>> pcibios_save_fw_addr(dev,
>> idx, r->start);
>> - r->end -= r->start;
>> - r->start = 0;
>> + if (!(r->flags &
>> + IORESOURCE_STARTALIGN)) {
>> + r->end -= r->start;
>> + r->start = 0;
>> + }
>> }
>> }
>> }
>>
>
> As a general comment to that loop in pcibios_allocate_dev_resources()
> function, it would be nice to reverse some of the logic in the if
> conditions and use continue to limit the runaway indentation level.
The similar function pcibios_allocate_resources() in
arch/powerpc/kernel/pci-common.c has moved some of the logic out into a
separate function. I'll do the same here.
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2024-07-15 17:30 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-09 13:35 [PATCH 0/6] PCI: align small (<4k) BARs Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 1/6] PCI: don't clear already cleared bit Stewart Hildebrand
2024-07-09 13:35 ` [PATCH 2/6] PCI: restore resource alignment Stewart Hildebrand
2024-07-09 13:36 ` [PATCH 3/6] PCI: restore memory decoding after reallocation Stewart Hildebrand
2024-07-09 16:16 ` Bjorn Helgaas
2024-07-10 20:31 ` Stewart Hildebrand
2024-07-09 13:36 ` [RFC PATCH 4/6] x86: PCI: preserve IORESOURCE_STARTALIGN alignment Stewart Hildebrand
2024-07-09 16:19 ` Bjorn Helgaas
2024-07-10 16:16 ` Stewart Hildebrand
2024-07-10 21:24 ` Bjorn Helgaas
2024-07-10 22:49 ` Stewart Hildebrand
2024-07-11 18:40 ` Bjorn Helgaas
2024-07-11 18:58 ` Stewart Hildebrand
2024-07-11 20:35 ` Bjorn Helgaas
2024-07-15 17:26 ` Stewart Hildebrand
2024-07-10 14:05 ` Ilpo Järvinen
2024-07-15 17:30 ` Stewart Hildebrand
2024-07-09 13:36 ` [PATCH 5/6] PCI: don't reassign resources that are already aligned Stewart Hildebrand
2024-07-09 13:36 ` [PATCH 6/6] PCI: align small (<4k) BARs Stewart Hildebrand
2024-07-09 16:21 ` Bjorn Helgaas
2024-07-10 16:35 ` Stewart Hildebrand
2024-07-10 13:56 ` Ilpo Järvinen
2024-07-10 16:28 ` Stewart Hildebrand
2024-07-10 23:26 ` [PATCH 0/6] " Stewart Hildebrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).