* [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas
@ 2010-12-08 21:36 Bjorn Helgaas
2010-12-08 21:36 ` [PATCH 2/5] x86: avoid BIOS area when allocating address space Bjorn Helgaas
` (5 more replies)
0 siblings, 6 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2010-12-08 21:36 UTC (permalink / raw)
To: Jesse Barnes, Len Brown
Cc: linux-pci, linux-kernel, linux-acpi, H. Peter Anvin,
Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay
This adds arch_remove_reservations(), which an arch can implement if it
needs to protect part of the address space from allocation.
Sometimes that can be done by just requesting a resource. This hook is to
cover cases where protected area doesn't fit well in the hierarchical
resource tree. For example, x86 BIOS E820 reservations are not related
to devices, so they may overlap part of, all of, or more than a device
resource.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---
include/linux/ioport.h | 1 +
kernel/resource.c | 8 ++++++++
2 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index d377ea8..b92b8b4 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -124,6 +124,7 @@ extern void reserve_region_with_split(struct resource *root,
extern struct resource *insert_resource_conflict(struct resource *parent, struct resource *new);
extern int insert_resource(struct resource *parent, struct resource *new);
extern void insert_resource_expand_to_fit(struct resource *root, struct resource *new);
+extern void arch_remove_reservations(struct resource *avail);
extern int allocate_resource(struct resource *root, struct resource *new,
resource_size_t size, resource_size_t min,
resource_size_t max, resource_size_t align,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9fad33e..246957e 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -374,6 +374,10 @@ int __weak page_is_ram(unsigned long pfn)
return walk_system_ram_range(pfn, 1, NULL, __is_ram) == 1;
}
+void __weak arch_remove_reservations(struct resource *avail)
+{
+}
+
static resource_size_t simple_align_resource(void *data,
const struct resource *avail,
resource_size_t size,
@@ -426,6 +430,7 @@ static int find_resource_from_top(struct resource *root, struct resource *new,
struct resource *this;
struct resource tmp, avail, alloc;
+ tmp.flags = new->flags;
tmp.start = root->end;
tmp.end = root->end;
@@ -438,6 +443,7 @@ static int find_resource_from_top(struct resource *root, struct resource *new,
tmp.start = root->start;
resource_clip(&tmp, min, max);
+ arch_remove_reservations(&tmp);
/* Check for overflow after ALIGN() */
avail = *new;
@@ -478,6 +484,7 @@ static int find_resource(struct resource *root, struct resource *new,
struct resource *this = root->child;
struct resource tmp = *new, avail, alloc;
+ tmp.flags = new->flags;
tmp.start = root->start;
/*
* Skip past an allocated resource that starts at 0, since the
@@ -495,6 +502,7 @@ static int find_resource(struct resource *root, struct resource *new,
tmp.end = root->end;
resource_clip(&tmp, min, max);
+ arch_remove_reservations(&tmp);
/* Check for overflow after ALIGN() */
avail = *new;
^ permalink raw reply related [flat|nested] 28+ messages in thread* [PATCH 2/5] x86: avoid BIOS area when allocating address space 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas @ 2010-12-08 21:36 ` Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 3/5] x86: avoid PNP resources " Bjorn Helgaas ` (4 subsequent siblings) 5 siblings, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-08 21:36 UTC (permalink / raw) To: Jesse Barnes, Len Brown Cc: linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay This implements arch_remove_reservations() so allocate_resource() can avoid any arch-specific reserved areas. This currently just avoids the BIOS area (the first 1MB), but could be used for E820 reserved areas if that turns out to be necessary. We previously avoided this area in pcibios_align_resource(). This patch moves the test from that PCI-specific path to a generic path, so *all* resource allocations will avoid this area. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> --- arch/x86/kernel/Makefile | 1 + arch/x86/kernel/resource.c | 14 ++++++++++++++ arch/x86/pci/i386.c | 3 --- 3 files changed, 15 insertions(+), 3 deletions(-) create mode 100644 arch/x86/kernel/resource.c diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 9e13763..1e99475 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -45,6 +45,7 @@ obj-y += pci-dma.o quirks.o i8237.o topology.o kdebugfs.o obj-y += alternative.o i8253.o pci-nommu.o hw_breakpoint.o obj-y += tsc.o io_delay.o rtc.o obj-y += pci-iommu_table.o +obj-y += resource.o obj-$(CONFIG_X86_TRAMPOLINE) += trampoline.o obj-y += process.o diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c new file mode 100644 index 0000000..5dd6473 --- /dev/null +++ b/arch/x86/kernel/resource.c @@ -0,0 +1,14 @@ +#include <linux/ioport.h> +#include <asm/e820.h> + +void arch_remove_reservations(struct resource *avail) +{ + /* + * Trim out the area reserved for BIOS (low 1MB). We could also remove + * E820 "reserved" areas here. + */ + if (avail->flags & IORESOURCE_MEM) { + if (avail->start < BIOS_END) + avail->start = BIOS_END; + } +} diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c index c4bb261..b261011 100644 --- a/arch/x86/pci/i386.c +++ b/arch/x86/pci/i386.c @@ -77,9 +77,6 @@ pcibios_align_resource(void *data, const struct resource *res, */ if (!skip_isa_ioresource_align(dev)) start &= ~0x300; - } else if (res->flags & IORESOURCE_MEM) { - if (start < BIOS_END) - start = res->end; /* fail; no space */ } return start; } ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 3/5] x86: avoid PNP resources when allocating address space 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 2/5] x86: avoid BIOS area when allocating address space Bjorn Helgaas @ 2010-12-08 21:36 ` Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 4/5] PNP: add framework for platform PNP quirks Bjorn Helgaas ` (3 subsequent siblings) 5 siblings, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-08 21:36 UTC (permalink / raw) To: Jesse Barnes, Len Brown Cc: Jiri Slaby, Dan Williams, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay, Matthew Garrett Remove PNP resources from "available space" used by allocate_resource(). This would be better done by putting the PNP resources directly in the resource maps (iomem_resource, etc), but there are some issues that keep us from doing that yet. This patch keeps us from handing out PNP device address space to other callers of allocate_resource(). Reference: https://bugzilla.kernel.org/show_bug.cgi?id=23332 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=23542 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=23802 Reported-by: Matthew Garrett <mjg@redhat.com> Reported-by: Dan Williams <dcbw@redhat.com> Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> --- arch/x86/kernel/resource.c | 78 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 78 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c index 5dd6473..1daee92 100644 --- a/arch/x86/kernel/resource.c +++ b/arch/x86/kernel/resource.c @@ -1,6 +1,80 @@ #include <linux/ioport.h> +#include <linux/pnp.h> #include <asm/e820.h> +#ifdef CONFIG_PNP +static bool resource_conflict(struct resource *res, resource_size_t start, + resource_size_t end) +{ + /* + * Return true if and only if "res" conflicts with the [start-end] + * range. + */ + return res->start <= end && res->end >= start; +} + +static void resource_split(struct resource *res, resource_size_t start, + resource_size_t end, struct resource *low, + struct resource *high) +{ + /* + * If "res" conflicts with [start-end], split "res" into the + * part below "start" (low) and the part above "end" (high), + * either (or both) of which may be empty. + * + * If there's no conflict, return the entire "res" as "low". + */ + *low = *res; + low->start = res->start; + low->end = res->start - 1; /* default to empty (size 0) */ + + *high = *res; + high->end = res->end; + high->start = res->end + 1; /* default to empty (size 0) */ + + if (!resource_conflict(res, start, end)) { + low->end = res->end; + return; + } + + if (res->start < start) + low->end = start - 1; + + if (res->end > end) + high->start = end + 1; +} + +static void pnp_remove_reservations(struct resource *avail) +{ + unsigned long type = resource_type(avail); + struct pnp_dev *dev; + int i; + struct resource *res, low, high; + + /* + * Clip the available region to avoid PNP devices. The PNP + * resources really should be in the resource map to begin with, + * but there are still some issues preventing that. + */ + pnp_for_each_dev(dev) { + i = 0; + res = pnp_get_resource(dev, type, i++); + while (res) { + if (!(res->flags & IORESOURCE_WINDOW)) { + resource_split(avail, res->start, res->end, + &low, &high); + if (resource_size(&low) > resource_size(&high)) + *avail = low; + else + *avail = high; + } + + res = pnp_get_resource(dev, type, i++); + } + } +} +#endif + void arch_remove_reservations(struct resource *avail) { /* @@ -11,4 +85,8 @@ void arch_remove_reservations(struct resource *avail) if (avail->start < BIOS_END) avail->start = BIOS_END; } + +#ifdef CONFIG_PNP + pnp_remove_reservations(avail); +#endif } ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 4/5] PNP: add framework for platform PNP quirks 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 2/5] x86: avoid BIOS area when allocating address space Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 3/5] x86: avoid PNP resources " Bjorn Helgaas @ 2010-12-08 21:36 ` Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources Bjorn Helgaas ` (2 subsequent siblings) 5 siblings, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-08 21:36 UTC (permalink / raw) To: Jesse Barnes, Len Brown Cc: linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay This allows platform quirks to fabricate PNP devices to describe things that should have been described via ACPI. For example, if the BIOS writer forgets to describe a device, we may assign its address space to another device, causing a conflict. We can avoid the conflict by making a fake PNP device to stand in for the one the BIOS missed. In that case, there's no ACPI or PNPBIOS device, so we need a new pnp_protocol that doesn't go back to firmware for get/set/wakeup/ suspend/etc. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> --- drivers/pnp/base.h | 2 ++ drivers/pnp/core.c | 9 ++++++++- drivers/pnp/quirks.c | 15 +++++++++++++++ 3 files changed, 25 insertions(+), 1 deletions(-) diff --git a/drivers/pnp/base.h b/drivers/pnp/base.h index 19bc736..dca301e 100644 --- a/drivers/pnp/base.h +++ b/drivers/pnp/base.h @@ -7,6 +7,8 @@ extern spinlock_t pnp_lock; extern struct device_attribute pnp_interface_attrs[]; void *pnp_alloc(long size); +void platform_pnp_fixups(void); + int pnp_register_protocol(struct pnp_protocol *protocol); void pnp_unregister_protocol(struct pnp_protocol *protocol); diff --git a/drivers/pnp/core.c b/drivers/pnp/core.c index 0f34d96..5076493 100644 --- a/drivers/pnp/core.c +++ b/drivers/pnp/core.c @@ -212,7 +212,14 @@ void __pnp_remove_device(struct pnp_dev *dev) static int __init pnp_init(void) { - return bus_register(&pnp_bus_type); + int ret; + + ret = bus_register(&pnp_bus_type); + if (ret) + return ret; + + platform_pnp_fixups(); + return 0; } subsys_initcall(pnp_init); diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c index dfbd5a6..f18bb69 100644 --- a/drivers/pnp/quirks.c +++ b/drivers/pnp/quirks.c @@ -15,6 +15,7 @@ #include <linux/types.h> #include <linux/kernel.h> +#include <linux/dmi.h> #include <linux/string.h> #include <linux/slab.h> #include <linux/pnp.h> @@ -337,3 +338,17 @@ void pnp_fixup_device(struct pnp_dev *dev) f->quirk_function(dev); } } + +static struct pnp_protocol pnp_fixup_protocol = { + .name = "Plug and Play fixup", +}; + +static const struct dmi_system_id pnp_fixup_table[] __initconst = { + {} +}; + +void __init platform_pnp_fixups(void) +{ + pnp_register_protocol(&pnp_fixup_protocol); + dmi_check_system(pnp_fixup_table); +} ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas ` (2 preceding siblings ...) 2010-12-08 21:36 ` [PATCH 4/5] PNP: add framework for platform PNP quirks Bjorn Helgaas @ 2010-12-08 21:36 ` Bjorn Helgaas 2010-12-12 3:30 ` Linus Torvalds 2010-12-08 21:37 ` [PATCH 0/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas 2010-12-10 20:30 ` [PATCH 1/5] " Jesse Barnes 5 siblings, 1 reply; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-08 21:36 UTC (permalink / raw) To: Jesse Barnes, Len Brown Cc: linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay The HP nx6325 BIOS doesn't report any devices in the [0xf8000000-0xfbffffff] region via ACPI devices or the E820 memory map, but when we assign it to the 00:14.4 bridge as a prefetchable memory window, the machine hangs. I determined experimentally that there are only three 1MB regions in that area that cause trouble, so this fixup builds a fake PNP device that consumes those regions. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=23332 Reported-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> --- drivers/pnp/quirks.c | 30 ++++++++++++++++++++++++++++++ 1 files changed, 30 insertions(+), 0 deletions(-) diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c index f18bb69..e7de402 100644 --- a/drivers/pnp/quirks.c +++ b/drivers/pnp/quirks.c @@ -343,7 +343,37 @@ static struct pnp_protocol pnp_fixup_protocol = { .name = "Plug and Play fixup", }; +static int __init hp_nx6325_fixup(const struct dmi_system_id *d) +{ + struct pnp_dev *dev; + + /* + * The BIOS apparently forgot to describe some regions in the + * address map. See https://bugzilla.kernel.org/show_bug.cgi?id=23332 + */ + + dev = pnp_alloc_dev(&pnp_fixup_protocol, 0, "LNXHAZRD"); + if (!dev) + return 0; + + dev->active = 1; + pnp_add_mem_resource(dev, 0xf8300000, 0xf83fffff, 0); + pnp_add_mem_resource(dev, 0xf8500000, 0xf85fffff, 0); + pnp_add_mem_resource(dev, 0xf9100000, 0xf91fffff, 0); + pnp_add_device(dev); + dev_info(&dev->dev, "added to work around BIOS defect\n"); + return 0; +} + static const struct dmi_system_id pnp_fixup_table[] __initconst = { + { + .callback = hp_nx6325_fixup, + .ident = "HP nx6325 laptop", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), + DMI_MATCH(DMI_PRODUCT_NAME, "HP Compaq nx6325"), + }, + }, {} }; ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-08 21:36 ` [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources Bjorn Helgaas @ 2010-12-12 3:30 ` Linus Torvalds 2010-12-12 5:23 ` Dave Airlie 2010-12-12 6:17 ` Bjorn Helgaas 0 siblings, 2 replies; 28+ messages in thread From: Linus Torvalds @ 2010-12-12 3:30 UTC (permalink / raw) To: Bjorn Helgaas, Jesse Barnes Cc: Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Wed, Dec 8, 2010 at 1:36 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > The HP nx6325 BIOS doesn't report any devices in the [0xf8000000-0xfbffffff] > region via ACPI devices or the E820 memory map, but when we assign it to the > 00:14.4 bridge as a prefetchable memory window, the machine hangs. Quite frankly, I think this patch sucks. It sucks because these kinds of hw-specific patches are fundamentally a sign of something else being wrong. Why didn't windows hit this? Why do we need this total hack? And is there any reason at all to believe that that one particular laptop is really special? I doubt it. And what happens for the next random machine that comes along an hits this? Maybe we should just say that if we know the bridge is negative decode, and it hasn't been set up by the BIOS, we just don't allocate it at all. And try to look like Windows. Or figure out what else Windows is doing differently. The whole "allocate bottom up" old PCI allocation has _years_ of testing and quirk that have been gathered over a long time. We can't just say "we'll do the same thing for the top-down allocator". The WHOLE AND ONLY POINT of the top-down allocator was to act lik Windows and not need crap like this. If that doesn't work, then I seriously don't think we should change bottom-up to top-down at all, and for 2.6.37 we should just revert the "set to top-down by default". Seriously. That "whole and only point" thing is important. If we need hacks like this, then we shouldn't do it. We're much better off with the model that has year of testing an not the upheaval. Top-down allocation is in _no_ way inherently better, the only excuse for it was supposed to be "we don't need these kinds of hooks". Linus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-12 3:30 ` Linus Torvalds @ 2010-12-12 5:23 ` Dave Airlie 2010-12-12 6:17 ` Bjorn Helgaas 1 sibling, 0 replies; 28+ messages in thread From: Dave Airlie @ 2010-12-12 5:23 UTC (permalink / raw) To: Linus Torvalds Cc: Bjorn Helgaas, Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Sun, Dec 12, 2010 at 1:30 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Dec 8, 2010 at 1:36 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: >> >> The HP nx6325 BIOS doesn't report any devices in the [0xf8000000-0xfbffffff] >> region via ACPI devices or the E820 memory map, but when we assign it to the >> 00:14.4 bridge as a prefetchable memory window, the machine hangs. > > Quite frankly, I think this patch sucks. > > It sucks because these kinds of hw-specific patches are fundamentally > a sign of something else being wrong. Why didn't windows hit this? Why > do we need this total hack? > > And is there any reason at all to believe that that one particular > laptop is really special? I doubt it. And what happens for the next > random machine that comes along an hits this? > > Maybe we should just say that if we know the bridge is negative > decode, and it hasn't been set up by the BIOS, we just don't allocate > it at all. And try to look like Windows. > > Or figure out what else Windows is doing differently. > > The whole "allocate bottom up" old PCI allocation has _years_ of > testing and quirk that have been gathered over a long time. We can't > just say "we'll do the same thing for the top-down allocator". > > The WHOLE AND ONLY POINT of the top-down allocator was to act lik > Windows and not need crap like this. If that doesn't work, then I > seriously don't think we should change bottom-up to top-down at all, > and for 2.6.37 we should just revert the "set to top-down by default". > > Seriously. That "whole and only point" thing is important. If we need > hacks like this, then we shouldn't do it. We're much better off with > the model that has year of testing an not the upheaval. Top-down > allocation is in _no_ way inherently better, the only excuse for it > was supposed to be "we don't need these kinds of hooks". I agree, I've got an NX6125 predecessor of this and I'll take any money that is has an equally insane BIOS, considering its the worst laptop in my pile in nearly every way imaginable. and I suspect the HP laptop braindamage won't end with those two. Dave. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-12 3:30 ` Linus Torvalds 2010-12-12 5:23 ` Dave Airlie @ 2010-12-12 6:17 ` Bjorn Helgaas 2010-12-14 20:34 ` Linus Torvalds 1 sibling, 1 reply; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-12 6:17 UTC (permalink / raw) To: Linus Torvalds Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Sat, Dec 11, 2010 at 07:30:54PM -0800, Linus Torvalds wrote: > On Wed, Dec 8, 2010 at 1:36 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > The HP nx6325 BIOS doesn't report any devices in the [0xf8000000-0xfbffffff] > > region via ACPI devices or the E820 memory map, but when we assign it to the > > 00:14.4 bridge as a prefetchable memory window, the machine hangs. > > Quite frankly, I think this patch sucks. > > It sucks because these kinds of hw-specific patches are fundamentally > a sign of something else being wrong. Why didn't windows hit this? Why > do we need this total hack? I agree, it *does* suck, and I *am* quite worried about how many issues like this we might trip over. Windows didn't hit this because of other differences: - Windows relies on subtractive decode; Linux programs a window - Windows gives the downstream CardBus bridge a 4K and a 64M mem window; Linux gives it two 64M windows - Windows doesn't align the CardBus windows on their size; Linux does Under Windows, the CardBus windows don't conflict with the unreported devices. The larger windows allocated by Linux do. So relying on subtractive decode would work until we plug in a CardBus device that expects to *use* that area, and then the device won't work. > And is there any reason at all to believe that that one particular > laptop is really special? I doubt it. And what happens for the next > random machine that comes along an hits this? > > Maybe we should just say that if we know the bridge is negative > decode, and it hasn't been set up by the BIOS, we just don't allocate > it at all. And try to look like Windows. I do like this idea more than I did at first, even though it's not a complete fix, because it's much better to have a non-working CardBus device than a hanging machine. But I guess we'd still want to fix that device, and then we're back at a hw-specific quirk like this one. Maybe we should do both (leave the bridge alone and keep the quirk). > Or figure out what else Windows is doing differently. > > The whole "allocate bottom up" old PCI allocation has _years_ of > testing and quirk that have been gathered over a long time. We can't > just say "we'll do the same thing for the top-down allocator". True (although most of the quirks I can think of are in the form of hard-coded legacy device reservations, and we're keeping those). If we didn't care about host bridge _CRS, there'd be no reason to change the old PCI allocation scheme. But I think we *do* care and will care more in the future. We frequently assign resources to devices the BIOS didn't configure, and that only works if we're lucky or there's only one host bridge. > The WHOLE AND ONLY POINT of the top-down allocator was to act lik > Windows and not need crap like this. If that doesn't work, then I > seriously don't think we should change bottom-up to top-down at all, > and for 2.6.37 we should just revert the "set to top-down by default". > > Seriously. That "whole and only point" thing is important. If we need > hacks like this, then we shouldn't do it. We're much better off with > the model that has year of testing an not the upheaval. Top-down > allocation is in _no_ way inherently better, the only excuse for it > was supposed to be "we don't need these kinds of hooks". Not really -- the main point here is to make multi-host bridge machines work reliably, and I really don't see a way to do that without using _CRS. If we're going to use _CRS, I think in the long run we'll be better off if we do it similarly to Windows, despite these early problems. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-12 6:17 ` Bjorn Helgaas @ 2010-12-14 20:34 ` Linus Torvalds 2010-12-14 20:44 ` Linus Torvalds 2010-12-15 6:26 ` Bjorn Helgaas 0 siblings, 2 replies; 28+ messages in thread From: Linus Torvalds @ 2010-12-14 20:34 UTC (permalink / raw) To: Bjorn Helgaas Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Sat, Dec 11, 2010 at 10:17 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > Not really -- the main point here is to make multi-host bridge > machines work reliably, and I really don't see a way to do that > without using _CRS. > > If we're going to use _CRS, I think in the long run we'll be better > off if we do it similarly to Windows, despite these early problems. It's not about any "despite these early problems". It's about "clearly we're not doing things at all like Windows, and it's just broken". The thing is, we will never be able to match Windows exactly. It may well have random hardcoded quirks we simply don't know about. I'm perfectly happy with you aiming to use _CRS. I am _not_ happy with you then using that as an excuse to then do things that don't work. We will NOT start doing random BIOS-specific quirks just because top-down allocations hit other bugs than bottom-up ones do. Just no. We'll continue doing that we have tried to do, which is to perhaps have quirks that are specific to *hardware* (like the ones in drivers/pci/quirks.c) and just filling in stuff that some BIOSes are known to get wrong. That's a maintainable approach. But it's maintainable ONLY if we then don't do other random changes that invalidates all the years of testing we've had. Linus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-14 20:34 ` Linus Torvalds @ 2010-12-14 20:44 ` Linus Torvalds 2010-12-14 23:57 ` Bjorn Helgaas 2010-12-15 6:02 ` Bjorn Helgaas 2010-12-15 6:26 ` Bjorn Helgaas 1 sibling, 2 replies; 28+ messages in thread From: Linus Torvalds @ 2010-12-14 20:44 UTC (permalink / raw) To: Bjorn Helgaas Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Tue, Dec 14, 2010 at 12:34 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > That's a maintainable approach. But it's maintainable ONLY if we then > don't do other random changes that invalidates all the years of > testing we've had. Btw, looking at all the x86-specific commits that have gone in, I'm *extremely* unhappy that they apparently stopped honoring that "resource_alloc_from_bottom" flag that I explicitly asked for. So it looks like it's not enough to just set that flag. We have to actually revert all the commits in this area as broken. Which is sad, but since they clearly *are* broken and don't honor the flag that was there explicitly to avoid this problem and make it easy to test reverting it, I'm really pissed off. The WHOLE POINT of that flag was to give people an option to say "use the old resource allocation order because the new one doesn't work for me". So at this point the only question is whether I should just revert the whole effing lot, or whether there are patches to fix the code to honor the "allocate from bottom" bit and then just set it by default again. Bjorn? Preferences? Linus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-14 20:44 ` Linus Torvalds @ 2010-12-14 23:57 ` Bjorn Helgaas 2010-12-15 6:02 ` Bjorn Helgaas 1 sibling, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-14 23:57 UTC (permalink / raw) To: Linus Torvalds Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Tuesday, December 14, 2010 01:44:51 pm Linus Torvalds wrote: > On Tue, Dec 14, 2010 at 12:34 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > That's a maintainable approach. But it's maintainable ONLY if we then > > don't do other random changes that invalidates all the years of > > testing we've had. > > Btw, looking at all the x86-specific commits that have gone in, I'm > *extremely* unhappy that they apparently stopped honoring that > "resource_alloc_from_bottom" flag that I explicitly asked for. In 20-20 hindsight, I should have made that switch affect more things. I tried to do what you asked; I obviously just didn't do enough, and I am sorry. > So it looks like it's not enough to just set that flag. We have to > actually revert all the commits in this area as broken. > > Which is sad, but since they clearly *are* broken and don't honor the > flag that was there explicitly to avoid this problem and make it easy > to test reverting it, I'm really pissed off. The WHOLE POINT of that > flag was to give people an option to say "use the old resource > allocation order because the new one doesn't work for me". > > So at this point the only question is whether I should just revert the > whole effing lot, or whether there are patches to fix the code to > honor the "allocate from bottom" bit and then just set it by default > again. > > Bjorn? Preferences? Let me identify the set of reversion candidates and the consequences, and then we can figure out whether it's better to retreat or push forward. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-14 20:44 ` Linus Torvalds 2010-12-14 23:57 ` Bjorn Helgaas @ 2010-12-15 6:02 ` Bjorn Helgaas 1 sibling, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-15 6:02 UTC (permalink / raw) To: Linus Torvalds Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Tue, Dec 14, 2010 at 12:44:51PM -0800, Linus Torvalds wrote: > On Tue, Dec 14, 2010 at 12:34 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > That's a maintainable approach. But it's maintainable ONLY if we then > > don't do other random changes that invalidates all the years of > > testing we've had. > > Btw, looking at all the x86-specific commits that have gone in, I'm > *extremely* unhappy that they apparently stopped honoring that > "resource_alloc_from_bottom" flag that I explicitly asked for. > > So it looks like it's not enough to just set that flag. We have to > actually revert all the commits in this area as broken. > > Which is sad, but since they clearly *are* broken and don't honor the > flag that was there explicitly to avoid this problem and make it easy > to test reverting it, I'm really pissed off. The WHOLE POINT of that > flag was to give people an option to say "use the old resource > allocation order because the new one doesn't work for me". > > So at this point the only question is whether I should just revert the > whole effing lot, or whether there are patches to fix the code to > honor the "allocate from bottom" bit and then just set it by default > again. I'm not sure there's value in having both "pci=nocrs" and "resource_alloc_from_bottom" flags. What if I made it so we allocate top-down if and only if we're using _CRS? Then old boxes where we don't look at _CRS would use the same bottom-up behavior they always have (this is another major screwup in the current tree -- we currently do top-down on these boxes for no good reason). And on new boxes, we default to using _CRS and allocating top-down, but if that doesn't work, we can use "pci=nocrs" and go back to the old "ignore _CRS and allocate bottom-up" behavior. Here's a sample patch which I will test and document if you think it's a reasonable approach: commit e39250083dbdd0b42e886aa858d0ffbc86e618c4 Author: Bjorn Helgaas <bjorn.helgaas@hp.com> Date: Tue Dec 14 22:10:16 2010 -0700 diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 21c6746..85268f8 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -769,7 +769,6 @@ void __init setup_arch(char **cmdline_p) x86_init.oem.arch_setup(); - resource_alloc_from_bottom = 0; iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1; setup_memory_map(); parse_setup_data(); diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c index 0972315..ca770fc 100644 --- a/arch/x86/pci/acpi.c +++ b/arch/x86/pci/acpi.c @@ -68,6 +68,9 @@ void __init pci_acpi_crs_quirks(void) "if necessary, use \"pci=%s\" and report a bug\n", pci_use_crs ? "Using" : "Ignoring", pci_use_crs ? "nocrs" : "use_crs"); + + if (pci_use_crs) + resource_alloc_from_top = 1; } static acpi_status diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c index c4bb261..57a1374 100644 --- a/arch/x86/pci/i386.c +++ b/arch/x86/pci/i386.c @@ -65,9 +65,16 @@ pcibios_align_resource(void *data, const struct resource *res, resource_size_t size, resource_size_t align) { struct pci_dev *dev = data; - resource_size_t start = round_down(res->end - size + 1, align); + resource_size_t start; + + if (resource_alloc_from_top) + start = round_down(res->end - size + 1, align); + else + start = res->start; if (res->flags & IORESOURCE_IO) { + if (skip_isa_ioresource_align(dev)) + return start; /* * If we're avoiding ISA aliases, the largest contiguous I/O @@ -75,11 +82,18 @@ pcibios_align_resource(void *data, const struct resource *res, * all 256-byte and smaller alignments, so the result will * still be correctly aligned. */ - if (!skip_isa_ioresource_align(dev)) + if (resource_alloc_from_top) start &= ~0x300; + else if (start & 0x300) + start = (start + 0x3ff) & ~0x3ff; + } else if (res->flags & IORESOURCE_MEM) { - if (start < BIOS_END) - start = res->end; /* fail; no space */ + if (start < BIOS_END) { + if (resource_alloc_from_top) + start = res->end; /* fail; no space */ + else + start = BIOS_END; + } } return start; } diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 003170e..bd59bc5 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -160,7 +160,7 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res, resource_size_t), void *alignf_data) { - int ret = -ENOMEM; + int i, ret = -ENOMEM; struct resource *r; resource_size_t max = -1; unsigned int type = res->flags & IORESOURCE_TYPE_BITS; @@ -171,26 +171,51 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res, if (!(res->flags & IORESOURCE_MEM_64)) max = PCIBIOS_MAX_MEM_32; - /* Look for space at highest addresses first */ - r = pci_bus_find_resource_prev(bus, type, NULL); - for ( ; r; r = pci_bus_find_resource_prev(bus, type, r)) { - /* type_mask must match */ - if ((res->flags ^ r->flags) & type_mask) - continue; - - /* We cannot allocate a non-prefetching resource - from a pre-fetching area */ - if ((r->flags & IORESOURCE_PREFETCH) && - !(res->flags & IORESOURCE_PREFETCH)) - continue; - - /* Ok, try it out.. */ - ret = allocate_resource(r, res, size, - r->start ? : min, - max, align, - alignf, alignf_data); - if (ret == 0) - break; + if (resource_alloc_from_top) { + /* Look for space at highest addresses first */ + r = pci_bus_find_resource_prev(bus, type, NULL); + for ( ; r; r = pci_bus_find_resource_prev(bus, type, r)) { + /* type_mask must match */ + if ((res->flags ^ r->flags) & type_mask) + continue; + + /* We cannot allocate a non-prefetching resource + from a pre-fetching area */ + if ((r->flags & IORESOURCE_PREFETCH) && + !(res->flags & IORESOURCE_PREFETCH)) + continue; + + /* Ok, try it out.. */ + ret = allocate_resource(r, res, size, + r->start ? : min, + max, align, + alignf, alignf_data); + if (ret == 0) + break; + } + } else { + pci_bus_for_each_resource(bus, r, i) { + if (!r) + continue; + + /* type_mask must match */ + if ((res->flags ^ r->flags) & type_mask) + continue; + + /* We cannot allocate a non-prefetching resource + from a pre-fetching area */ + if ((r->flags & IORESOURCE_PREFETCH) && + !(res->flags & IORESOURCE_PREFETCH)) + continue; + + /* Ok, try it out.. */ + ret = allocate_resource(r, res, size, + r->start ? : min, + max, align, + alignf, alignf_data); + if (ret == 0) + break; + } } return ret; } diff --git a/include/linux/ioport.h b/include/linux/ioport.h index d377ea8..d427cd5 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -112,7 +112,7 @@ struct resource_list { /* PC/ISA/whatever - the normal PC address spaces: IO and memory */ extern struct resource ioport_resource; extern struct resource iomem_resource; -extern int resource_alloc_from_bottom; +extern int resource_alloc_from_top; extern struct resource *request_resource_conflict(struct resource *root, struct resource *new); extern int request_resource(struct resource *root, struct resource *new); diff --git a/kernel/resource.c b/kernel/resource.c index 9fad33e..275b414 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -41,21 +41,13 @@ EXPORT_SYMBOL(iomem_resource); static DEFINE_RWLOCK(resource_lock); /* - * By default, we allocate free space bottom-up. The architecture can request - * top-down by clearing this flag. The user can override the architecture's - * choice with the "resource_alloc_from_bottom" kernel boot option, but that - * should only be a debugging tool. + * By default, we allocate free space bottom-up, as we have done in the past. + * Architectures can request top-down by setting "resource_alloc_from_top". + * On x86, we do this when we use PCI host bridge ACPI _CRS information. + * If the user turns off _CRS with "pci=nocrs", we use the default bottom-up + * allocation strategy. */ -int resource_alloc_from_bottom = 1; - -static __init int setup_alloc_from_bottom(char *s) -{ - printk(KERN_INFO - "resource: allocating from bottom-up; please report a bug\n"); - resource_alloc_from_bottom = 1; - return 0; -} -early_param("resource_alloc_from_bottom", setup_alloc_from_bottom); +int resource_alloc_from_top; static void *r_next(struct seq_file *m, void *v, loff_t *pos) { @@ -545,10 +537,10 @@ int allocate_resource(struct resource *root, struct resource *new, alignf = simple_align_resource; write_lock(&resource_lock); - if (resource_alloc_from_bottom) - err = find_resource(root, new, size, min, max, align, alignf, alignf_data); - else + if (resource_alloc_from_top) err = find_resource_from_top(root, new, size, min, max, align, alignf, alignf_data); + else + err = find_resource(root, new, size, min, max, align, alignf, alignf_data); if (err >= 0 && __request_resource(root, new)) err = -EBUSY; write_unlock(&resource_lock); ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-14 20:34 ` Linus Torvalds 2010-12-14 20:44 ` Linus Torvalds @ 2010-12-15 6:26 ` Bjorn Helgaas 2010-12-15 7:03 ` Linus Torvalds 1 sibling, 1 reply; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-15 6:26 UTC (permalink / raw) To: Linus Torvalds Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Tue, Dec 14, 2010 at 12:34:20PM -0800, Linus Torvalds wrote: > On Sat, Dec 11, 2010 at 10:17 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > Not really -- the main point here is to make multi-host bridge > > machines work reliably, and I really don't see a way to do that > > without using _CRS. > > > > If we're going to use _CRS, I think in the long run we'll be better > > off if we do it similarly to Windows, despite these early problems. > > It's not about any "despite these early problems". > > It's about "clearly we're not doing things at all like Windows, and > it's just broken". > > The thing is, we will never be able to match Windows exactly. It may > well have random hardcoded quirks we simply don't know about. Granted. > I'm perfectly happy with you aiming to use _CRS. I am _not_ happy with > you then using that as an excuse to then do things that don't work. I don't want to do things that make you unhappy :) > We will NOT start doing random BIOS-specific quirks just because > top-down allocations hit other bugs than bottom-up ones do. Just no. > We'll continue doing that we have tried to do, which is to perhaps > have quirks that are specific to *hardware* (like the ones in > drivers/pci/quirks.c) and just filling in stuff that some BIOSes are > known to get wrong. I've only proposed one BIOS-specific quirk, which is the one for the nx6325 unreported regions, and I identified things we do differently than Windows that explain why we see the problem and Windows doesn't. If we stop opening windows on subtractive-decode bridges, we don't need that quirk to avoid the hang. We will still need it if we want to use more than 40-odd MB of space on a PC Card. I'm pretty confident that if we could find PC Cards that require enough space, they wouldn't work under Windows either. I don't know whether the other patches in this series make you unhappy. I'm really not happy with how I implemented the avoidance of ACPI devices when doing PCI allocation, but I do think we need to avoid them *somehow*, and I was looking for a minimal quick fix at this point in the cycle. Avoiding ACPI devices fixes the Matthew's 2530p problem. We can also avoid that particular problem with the simple PCIBIOS_MAX_MEM_32 change you proposed. However, avoiding ACPI devices fixes other problems at the same time, such as this one: https://bugzilla.kernel.org/show_bug.cgi?id=23802 where we put the intel-gtt "flush page" on top of an ACPI TPM device. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-15 6:26 ` Bjorn Helgaas @ 2010-12-15 7:03 ` Linus Torvalds 2010-12-15 18:18 ` Bjorn Helgaas 0 siblings, 1 reply; 28+ messages in thread From: Linus Torvalds @ 2010-12-15 7:03 UTC (permalink / raw) To: Bjorn Helgaas Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Tue, Dec 14, 2010 at 10:26 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > I don't know whether the other patches in this series make you > unhappy. I'm really not happy with how I implemented the avoidance > of ACPI devices when doing PCI allocation, but I do think we need > to avoid them *somehow*, and I was looking for a minimal quick > fix at this point in the cycle. So the "avoid ACPI devices" part makes sense, and doesn't involved quirks, so I don't hate it at all the same way I hated the HP quirk. However, I hate how it makes the allocation logic opaque. You can no longer tell from the regular non-debug dmesg and the /proc/iomem _why_ something got allocated the way it did, because there are hidden rules. That makes things awkward, methinks. Also, quite frankly, I wonder what happens after release when somebody shows another machine that simply stopped working because the allocation strategy didn't work for it. The hw coverage that -rc6 gets is tiny compared to a real release. IOW, what's the long-term strategy for this? The only sane long-term strategy I can see is the one we have _always_ done, which is to try to populate the memory resource tree with what simply matches reality. The whole "ok, we know the hardware better than the BIOS does" is a _stable_ strategy. In contrast, the things you propose are NOT stable strategies, they all depend on basically "we match windows exactly and/or trust ACPI". Both of which are *known* to be failing models. That's why I'm somewhat upset. Your whole strategy seems to depend on a known broken model. We _know_ ACPI tables are crap much of the time. So we know that "avoiding ACPI resources" is inevitably insufficient. And that's why I hate the "switch everything around" model. Yes, we have a known way to fix things up - namely to actually detect the hardware itself properly when firmware inevitably screws up - but the very act of switching things around will pretty much guarantee that all our years of effort is of dubious value, and we'll end up finding other laptops that used to work and no longer does. Only switching around when _CRS is used is possible, and shouldn't cause any regressions if we continue to default to not using _CRS. But you want to switch that default around at some point, don't you? At which point we'll be up sh*t creek again. See what I'm saying? Which all makes me suspect that we'd be much better off just doing the bottom-up allocation even for _CRS. And maybe CRS works fine then when we combine our hardware knowledge with the ACPI region avoidance. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-15 7:03 ` Linus Torvalds @ 2010-12-15 18:18 ` Bjorn Helgaas 2010-12-15 18:27 ` H. Peter Anvin 2010-12-15 19:21 ` Linus Torvalds 0 siblings, 2 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-15 18:18 UTC (permalink / raw) To: Linus Torvalds Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Wednesday, December 15, 2010 12:03:15 am Linus Torvalds wrote: > On Tue, Dec 14, 2010 at 10:26 PM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > I don't know whether the other patches in this series make you > > unhappy. I'm really not happy with how I implemented the avoidance > > of ACPI devices when doing PCI allocation, but I do think we need > > to avoid them *somehow*, and I was looking for a minimal quick > > fix at this point in the cycle. > > So the "avoid ACPI devices" part makes sense, and doesn't involved > quirks, so I don't hate it at all the same way I hated the HP quirk. > > However, I hate how it makes the allocation logic opaque. You can no > longer tell from the regular non-debug dmesg and the /proc/iomem _why_ > something got allocated the way it did, because there are hidden > rules. That makes things awkward, methinks. > > Also, quite frankly, I wonder what happens after release when somebody > shows another machine that simply stopped working because the > allocation strategy didn't work for it. The hw coverage that -rc6 gets > is tiny compared to a real release. > > IOW, what's the long-term strategy for this? The only sane long-term > strategy I can see is the one we have _always_ done, which is to try > to populate the memory resource tree with what simply matches reality. > The whole "ok, we know the hardware better than the BIOS does" is a > _stable_ strategy. In contrast, the things you propose are NOT stable > strategies, they all depend on basically "we match windows exactly > and/or trust ACPI". Both of which are *known* to be failing models. > > That's why I'm somewhat upset. Your whole strategy seems to depend on > a known broken model. We _know_ ACPI tables are crap much of the time. > So we know that "avoiding ACPI resources" is inevitably insufficient. > > And that's why I hate the "switch everything around" model. Yes, we > have a known way to fix things up - namely to actually detect the > hardware itself properly when firmware inevitably screws up - but the > very act of switching things around will pretty much guarantee that > all our years of effort is of dubious value, and we'll end up finding > other laptops that used to work and no longer does. > > Only switching around when _CRS is used is possible, and shouldn't > cause any regressions if we continue to default to not using _CRS. But > you want to switch that default around at some point, don't you? At > which point we'll be up sh*t creek again. See what I'm saying? > > Which all makes me suspect that we'd be much better off just doing the > bottom-up allocation even for _CRS. And maybe CRS works fine then when > we combine our hardware knowledge with the ACPI region avoidance. ACPI devices tend to be at high addresses, so allocating top-down is definitely more dangerous unless we explicitly avoid them. I should have realized that and done something like patches 1-3 of this series before the top-down patches. Doing it bottom-up would very likely work better than the "top-down without avoiding ACPI regions" model we currently have, at least in the short term. We *would* have to do something to avoid E820 reservations to fix this: https://bugzilla.kernel.org/show_bug.cgi?id=16228, but that's doable. So here's my proposal for .37: - Keep the current state of _CRS enabled by default (for 2008 and newer machines). - Allocate bottom-up always - Avoid E820 reservations That should fix all the regressions I'm aware of. I'll work on the patches this afternoon. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-15 18:18 ` Bjorn Helgaas @ 2010-12-15 18:27 ` H. Peter Anvin 2010-12-15 19:21 ` Linus Torvalds 1 sibling, 0 replies; 28+ messages in thread From: H. Peter Anvin @ 2010-12-15 18:27 UTC (permalink / raw) To: Bjorn Helgaas Cc: Linus Torvalds, Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, Thomas Gleixner, Ingo Molnar, Adam Belay On 12/15/2010 10:18 AM, Bjorn Helgaas wrote: > > ACPI devices tend to be at high addresses, so allocating top-down > is definitely more dangerous unless we explicitly avoid them. I > should have realized that and done something like patches 1-3 of > this series before the top-down patches. > > Doing it bottom-up would very likely work better than the "top-down > without avoiding ACPI regions" model we currently have, at least in > the short term. We *would* have to do something to avoid E820 > reservations to fix this: > https://bugzilla.kernel.org/show_bug.cgi?id=16228, > but that's doable. > > So here's my proposal for .37: > - Keep the current state of _CRS enabled by default (for 2008 > and newer machines). > - Allocate bottom-up always > - Avoid E820 reservations > > That should fix all the regressions I'm aware of. I'll work on > the patches this afternoon. > At the same time, I would like to see a few things done as a matter of course: a) reserve the top 2 MiB of the 32-bit address space. There *will* be ROM at the top of the 32-bit address space; it's a fact of the architecture, and on at least older systems it was common to have a shadow 1 MiB below. b) we may want to consider doing special things in the 0xFExxxxxx memory range, which is used by the CPU-APIC-MSI system in recent processors. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources 2010-12-15 18:18 ` Bjorn Helgaas 2010-12-15 18:27 ` H. Peter Anvin @ 2010-12-15 19:21 ` Linus Torvalds 1 sibling, 0 replies; 28+ messages in thread From: Linus Torvalds @ 2010-12-15 19:21 UTC (permalink / raw) To: Bjorn Helgaas Cc: Jesse Barnes, Len Brown, linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay On Wed, Dec 15, 2010 at 10:18 AM, Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > So here's my proposal for .37: > - Keep the current state of _CRS enabled by default (for 2008 > and newer machines). > - Allocate bottom-up always > - Avoid E820 reservations Sounds sane to me. And yes, the fact that the BIOS tends to allocate things top-down probably does mean that the whole bottom-up approach tends to be the safer model. Linus ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 0/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas ` (3 preceding siblings ...) 2010-12-08 21:36 ` [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources Bjorn Helgaas @ 2010-12-08 21:37 ` Bjorn Helgaas 2010-12-10 20:30 ` [PATCH 1/5] " Jesse Barnes 5 siblings, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-08 21:37 UTC (permalink / raw) To: Jesse Barnes Cc: Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay Crap, here's the cover letter I meant to include. [RFC] PCI, PNP, resources: effectively reserve ACPI device resources 2.6.37-rc includes some patches of mine to change the way we allocate address space for devices, and these caused some regressions. Most device resource allocation is done by the BIOS, but Linux does allocate space if the BIOS doesn't do it (e.g., for CardBus or other hotplug devices) or if the BIOS did it wrong (e.g., it put a PCI device outside the host bridge windows). My original 2.6.37-rc patches make Linux allocate space from the top down instead of from the bottom up. We did this because that's what Windows does, and doing it the same way helps us avoid some BIOS defects, such as https://bugzilla.kernel.org/show_bug.cgi?id=16228 . The regressions caused by these changes include: https://bugzilla.kernel.org/show_bug.cgi?id=23332 Here, we allocated a 64MB bridge window for a subtractive-decode bridge leading to a CardBus controller, and the window conflicts with devices not reported by the BIOS. Windows doesn't see this problem because it relies on the fact that the bridge is in subtractive-decode mode and doesn't program the window at all. https://bugzilla.kernel.org/show_bug.cgi?id=23542 https://bugzilla.kernel.org/show_bug.cgi?id=23802 In both cases, Linux allocated space that conflicts with an ACPI device because it ignores resource usage reported by ACPI. Windows avoids this problem because it pays attention to the ACPI _CRS methods that report device resource usage. I think the best way to address these problems would be to put ACPI device resources in the resource maps, just like we do for PCI devices. We tried this a couple years ago (http://lkml.org/lkml/2007/11/1/214) and tripped over some subtle issues. We should try again, but not at this stage of the -rc. Instead, these patches take the approach of adding a filter in the allocate_resource() path so each arch can prevent allocation of any address ranges it cares about. For x86, that means looking through all the PNP devices and protecting the address space they use. In addition, for 23332, I added a quirk that builds a fake PNP device to protect the unreported devices. I'm a little nervous about this because I've been told that the WHQL tests look for unreported devices, and the HP nx6325 should have passwd. But it's the best I can think of for now. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas ` (4 preceding siblings ...) 2010-12-08 21:37 ` [PATCH 0/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas @ 2010-12-10 20:30 ` Jesse Barnes 2010-12-10 20:36 ` Jesse Barnes 5 siblings, 1 reply; 28+ messages in thread From: Jesse Barnes @ 2010-12-10 20:30 UTC (permalink / raw) To: Bjorn Helgaas Cc: Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay On Wed, 08 Dec 2010 14:36:06 -0700 Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > This adds arch_remove_reservations(), which an arch can implement if it > needs to protect part of the address space from allocation. > > Sometimes that can be done by just requesting a resource. This hook is to > cover cases where protected area doesn't fit well in the hierarchical > resource tree. For example, x86 BIOS E820 reservations are not related > to devices, so they may overlap part of, all of, or more than a device > resource. > > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> > --- Hm, this is bigger than the simple change of just avoiding the high 2M; Linus have you checked it out yet? It's nicer than simply adjusting PCIBIOS_MAX_MEM since it will affect all resource callers rather than just PCI, but it's definitely bigger. If you want just the simple change for 2.6.37 I can push that, but we'll need to get a tested-by from Matthew: diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h index ca0437c..aef9f77 100644 --- a/arch/x86/include/asm/pci.h +++ b/arch/x86/include/asm/pci.h @@ -141,7 +141,7 @@ void dma32_reserve_bootmem(void); /* generic pci stuff */ #include <asm-generic/pci.h> -#define PCIBIOS_MAX_MEM_32 0xffffffff +#define PCIBIOS_MAX_MEM_32 0xfff00000 #ifdef CONFIG_NUMA /* Returns the node based on pci bus */ and I'll queue up this set for 2.6.38. Thanks, -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-10 20:30 ` [PATCH 1/5] " Jesse Barnes @ 2010-12-10 20:36 ` Jesse Barnes 2010-12-10 21:07 ` Bjorn Helgaas 0 siblings, 1 reply; 28+ messages in thread From: Jesse Barnes @ 2010-12-10 20:36 UTC (permalink / raw) Cc: Bjorn Helgaas, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay, Matthew Garrett [Actually cc'ing Matthew this time] On Fri, 10 Dec 2010 12:30:08 -0800 Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Wed, 08 Dec 2010 14:36:06 -0700 > Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > > This adds arch_remove_reservations(), which an arch can implement if it > > needs to protect part of the address space from allocation. > > > > Sometimes that can be done by just requesting a resource. This hook is to > > cover cases where protected area doesn't fit well in the hierarchical > > resource tree. For example, x86 BIOS E820 reservations are not related > > to devices, so they may overlap part of, all of, or more than a device > > resource. > > > > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> > > --- > > Hm, this is bigger than the simple change of just avoiding the high 2M; > Linus have you checked it out yet? It's nicer than simply adjusting > PCIBIOS_MAX_MEM since it will affect all resource callers rather than > just PCI, but it's definitely bigger. > > If you want just the simple change for 2.6.37 I can push that, but > we'll need to get a tested-by from Matthew: > > diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h > index ca0437c..aef9f77 100644 > --- a/arch/x86/include/asm/pci.h > +++ b/arch/x86/include/asm/pci.h > @@ -141,7 +141,7 @@ void dma32_reserve_bootmem(void); > > /* generic pci stuff */ > #include <asm-generic/pci.h> > -#define PCIBIOS_MAX_MEM_32 0xffffffff > +#define PCIBIOS_MAX_MEM_32 0xfff00000 > > #ifdef CONFIG_NUMA > /* Returns the node based on pci bus */ > > and I'll queue up this set for 2.6.38. > > Thanks, -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-10 20:36 ` Jesse Barnes @ 2010-12-10 21:07 ` Bjorn Helgaas 2010-12-11 1:37 ` Jesse Barnes 0 siblings, 1 reply; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-10 21:07 UTC (permalink / raw) To: Jesse Barnes Cc: Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay, Matthew Garrett, Dan Williams, rjw On Friday, December 10, 2010 01:36:09 pm Jesse Barnes wrote: > [Actually cc'ing Matthew this time] > > On Fri, 10 Dec 2010 12:30:08 -0800 > Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > > On Wed, 08 Dec 2010 14:36:06 -0700 > > Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > > > > > This adds arch_remove_reservations(), which an arch can implement if it > > > needs to protect part of the address space from allocation. > > > > > > Sometimes that can be done by just requesting a resource. This hook is to > > > cover cases where protected area doesn't fit well in the hierarchical > > > resource tree. For example, x86 BIOS E820 reservations are not related > > > to devices, so they may overlap part of, all of, or more than a device > > > resource. > > > > > > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> > > > --- > > > > Hm, this is bigger than the simple change of just avoiding the high 2M; > > Linus have you checked it out yet? It's nicer than simply adjusting > > PCIBIOS_MAX_MEM since it will affect all resource callers rather than > > just PCI, but it's definitely bigger. Dan Williams also reproduced the problem on a 2530p and tested this series (though I think he actually tested a backport in Fedora): https://bugzilla.kernel.org/show_bug.cgi?id=23542#c7 https://bugzilla.kernel.org/show_bug.cgi?id=23542#c23 The simple PCIBIOS_MAX_MEM_32 change below will fix the 2530p problem, but it won't help fix the nx6325 problem Rafael reported; details at: https://bugzilla.kernel.org/show_bug.cgi?id=23332#c20 Rafael did confirm on #acpi Wednesday that the full series fixed his nx6325. Bjorn > > If you want just the simple change for 2.6.37 I can push that, but > > we'll need to get a tested-by from Matthew: > > > > diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h > > index ca0437c..aef9f77 100644 > > --- a/arch/x86/include/asm/pci.h > > +++ b/arch/x86/include/asm/pci.h > > @@ -141,7 +141,7 @@ void dma32_reserve_bootmem(void); > > > > /* generic pci stuff */ > > #include <asm-generic/pci.h> > > -#define PCIBIOS_MAX_MEM_32 0xffffffff > > +#define PCIBIOS_MAX_MEM_32 0xfff00000 > > > > #ifdef CONFIG_NUMA > > /* Returns the node based on pci bus */ > > > > and I'll queue up this set for 2.6.38. > > > > Thanks, > > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-10 21:07 ` Bjorn Helgaas @ 2010-12-11 1:37 ` Jesse Barnes 2010-12-12 3:34 ` Linus Torvalds 0 siblings, 1 reply; 28+ messages in thread From: Jesse Barnes @ 2010-12-11 1:37 UTC (permalink / raw) To: Bjorn Helgaas Cc: Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Linus Torvalds, Ingo Molnar, Adam Belay, Matthew Garrett, Dan Williams, rjw On Fri, 10 Dec 2010 14:07:24 -0700 Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > On Friday, December 10, 2010 01:36:09 pm Jesse Barnes wrote: > > [Actually cc'ing Matthew this time] > > > > On Fri, 10 Dec 2010 12:30:08 -0800 > > Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > > > > On Wed, 08 Dec 2010 14:36:06 -0700 > > > Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > > > > > > > > This adds arch_remove_reservations(), which an arch can implement if it > > > > needs to protect part of the address space from allocation. > > > > > > > > Sometimes that can be done by just requesting a resource. This hook is to > > > > cover cases where protected area doesn't fit well in the hierarchical > > > > resource tree. For example, x86 BIOS E820 reservations are not related > > > > to devices, so they may overlap part of, all of, or more than a device > > > > resource. > > > > > > > > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> > > > > --- > > > > > > Hm, this is bigger than the simple change of just avoiding the high 2M; > > > Linus have you checked it out yet? It's nicer than simply adjusting > > > PCIBIOS_MAX_MEM since it will affect all resource callers rather than > > > just PCI, but it's definitely bigger. > > Dan Williams also reproduced the problem on a 2530p and tested this > series (though I think he actually tested a backport in Fedora): > https://bugzilla.kernel.org/show_bug.cgi?id=23542#c7 > https://bugzilla.kernel.org/show_bug.cgi?id=23542#c23 > > The simple PCIBIOS_MAX_MEM_32 change below will fix the 2530p problem, > but it won't help fix the nx6325 problem Rafael reported; details at: > https://bugzilla.kernel.org/show_bug.cgi?id=23332#c20 > > Rafael did confirm on #acpi Wednesday that the full series fixed > his nx6325. Thanks, I'll add Dan and Rafael's tested-bys to the patches (they're already in my for-linus tree). Unless Linus has a problem with them I'll send them over to him this weekend or Monday. -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-11 1:37 ` Jesse Barnes @ 2010-12-12 3:34 ` Linus Torvalds 2010-12-12 4:16 ` Jesse Barnes 0 siblings, 1 reply; 28+ messages in thread From: Linus Torvalds @ 2010-12-12 3:34 UTC (permalink / raw) To: Jesse Barnes Cc: Bjorn Helgaas, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay, Matthew Garrett, Dan Williams, rjw On Fri, Dec 10, 2010 at 5:37 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > Thanks, I'll add Dan and Rafael's tested-bys to the patches (they're > already in my for-linus tree). Unless Linus has a problem with them > I'll send them over to him this weekend or Monday. See my other email I just sent out. I really am not going to take some totally new experimental and hacky major PCI resource management thing this late in the -rc game. No way, no how. If the top-down allocator is causing regressions that cannot be fixed by _simple_ patches, we're simply going to have to undo it. What's the advantage of top-down? None. Not if we then need all this crap, which we could as easily do on top of the bottom-up one WITHOUT any regressions. Why isn't anybody else questioning the whole basic premise here? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-12 3:34 ` Linus Torvalds @ 2010-12-12 4:16 ` Jesse Barnes 2010-12-12 13:20 ` Rafael J. Wysocki 0 siblings, 1 reply; 28+ messages in thread From: Jesse Barnes @ 2010-12-12 4:16 UTC (permalink / raw) To: Linus Torvalds Cc: Bjorn Helgaas, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay, Matthew Garrett, Dan Williams, rjw On Sat, 11 Dec 2010 19:34:05 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Dec 10, 2010 at 5:37 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > > > Thanks, I'll add Dan and Rafael's tested-bys to the patches (they're > > already in my for-linus tree). Unless Linus has a problem with them > > I'll send them over to him this weekend or Monday. > > See my other email I just sent out. > > I really am not going to take some totally new experimental and hacky > major PCI resource management thing this late in the -rc game. No way, > no how. > > If the top-down allocator is causing regressions that cannot be fixed > by _simple_ patches, we're simply going to have to undo it. What's the > advantage of top-down? None. Not if we then need all this crap, which > we could as easily do on top of the bottom-up one WITHOUT any > regressions. > > Why isn't anybody else questioning the whole basic premise here? Questioning the whole premise is fine, but so far we've gone in (or at least think we're going in) a consistent direction: behave like Windows on platforms designed for Windows to avoid bugs that Windows doesn't hit and enable all the same devices Windows allows. But yes, I really don't like the nx6325 patch either; there's obviously something we're still missing that's preventing us from doing the right thing on that platform. Quirking it isn't a good long term answer. -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-12 4:16 ` Jesse Barnes @ 2010-12-12 13:20 ` Rafael J. Wysocki 2010-12-13 5:43 ` Bjorn Helgaas 0 siblings, 1 reply; 28+ messages in thread From: Rafael J. Wysocki @ 2010-12-12 13:20 UTC (permalink / raw) To: Jesse Barnes Cc: Linus Torvalds, Bjorn Helgaas, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay, Matthew Garrett, Dan Williams On Sunday, December 12, 2010, Jesse Barnes wrote: > On Sat, 11 Dec 2010 19:34:05 -0800 > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Fri, Dec 10, 2010 at 5:37 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > > > > > Thanks, I'll add Dan and Rafael's tested-bys to the patches (they're > > > already in my for-linus tree). Unless Linus has a problem with them > > > I'll send them over to him this weekend or Monday. > > > > See my other email I just sent out. > > > > I really am not going to take some totally new experimental and hacky > > major PCI resource management thing this late in the -rc game. No way, > > no how. > > > > If the top-down allocator is causing regressions that cannot be fixed > > by _simple_ patches, we're simply going to have to undo it. What's the > > advantage of top-down? None. Not if we then need all this crap, which > > we could as easily do on top of the bottom-up one WITHOUT any > > regressions. > > > > Why isn't anybody else questioning the whole basic premise here? > > Questioning the whole premise is fine, but so far we've gone in (or at > least think we're going in) a consistent direction: behave like Windows > on platforms designed for Windows to avoid bugs that Windows doesn't > hit and enable all the same devices Windows allows. > > But yes, I really don't like the nx6325 patch either; there's obviously > something we're still missing that's preventing us from doing the right > thing on that platform. Quirking it isn't a good long term answer. OK, so I guess the best thing we can do for 2.6.37 is to revert 1af3c2e (x86: allocate space within a region top-down), right? Rafael ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-12 13:20 ` Rafael J. Wysocki @ 2010-12-13 5:43 ` Bjorn Helgaas 2010-12-13 13:47 ` Ingo Molnar 0 siblings, 1 reply; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-13 5:43 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jesse Barnes, Linus Torvalds, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Adam Belay, Matthew Garrett, Dan Williams On Sun, Dec 12, 2010 at 02:20:56PM +0100, Rafael J. Wysocki wrote: > On Sunday, December 12, 2010, Jesse Barnes wrote: > > On Sat, 11 Dec 2010 19:34:05 -0800 > > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > > > On Fri, Dec 10, 2010 at 5:37 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > > > > > > > Thanks, I'll add Dan and Rafael's tested-bys to the patches (they're > > > > already in my for-linus tree). Unless Linus has a problem with them > > > > I'll send them over to him this weekend or Monday. > > > > > > See my other email I just sent out. > > > > > > I really am not going to take some totally new experimental and hacky > > > major PCI resource management thing this late in the -rc game. No way, > > > no how. > > > > > > If the top-down allocator is causing regressions that cannot be fixed > > > by _simple_ patches, we're simply going to have to undo it. What's the > > > advantage of top-down? None. Not if we then need all this crap, which > > > we could as easily do on top of the bottom-up one WITHOUT any > > > regressions. > > > > > > Why isn't anybody else questioning the whole basic premise here? > > > > Questioning the whole premise is fine, but so far we've gone in (or at > > least think we're going in) a consistent direction: behave like Windows > > on platforms designed for Windows to avoid bugs that Windows doesn't > > hit and enable all the same devices Windows allows. > > > > But yes, I really don't like the nx6325 patch either; there's obviously > > something we're still missing that's preventing us from doing the right > > thing on that platform. Quirking it isn't a good long term answer. > > OK, so I guess the best thing we can do for 2.6.37 is to revert > 1af3c2e (x86: allocate space within a region top-down), right? That's a possibility, but I'm not sure it's the best. It would avoid the nx6325 landmine, but we could easily step on a different one on other machines. If Windows uses the end of available space first, that's the best-tested territory, and I think it's better to try to stay in it rather revert 1af3c2e and start mapping different territory that really isn't tested by Windows at all. On the nx6325, we stray out of that tested territory because Linux opens windows on subtractive-decode bridges and allocates more space to CardBus bridges. I think it'd be fairly simple to leave subtractive-decode bridges alone, and that might make the nx6325 work well enough for .37, even without the quirk. This feels like a change that might be the right thing to do in general. Using subtractive rather than positive decode would give up a little performance, but I doubt it's important There's another regression, https://bugzilla.kernel.org/show_bug.cgi?id=23132, that seems similar, though we haven't debugged it yet. It also has to do with assigning resources that aren't strictly needed to bridges, so there might be more changes we should make to that strategy. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-13 5:43 ` Bjorn Helgaas @ 2010-12-13 13:47 ` Ingo Molnar 2010-12-15 0:09 ` Bjorn Helgaas 0 siblings, 1 reply; 28+ messages in thread From: Ingo Molnar @ 2010-12-13 13:47 UTC (permalink / raw) To: Bjorn Helgaas Cc: Rafael J. Wysocki, Jesse Barnes, Linus Torvalds, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Adam Belay, Matthew Garrett, Dan Williams * Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > OK, so I guess the best thing we can do for 2.6.37 is to revert 1af3c2e (x86: > > allocate space within a region top-down), right? > > That's a possibility, but I'm not sure it's the best. It would avoid the nx6325 > landmine, but we could easily step on a different one on other machines. > > If Windows uses the end of available space first, that's the best-tested > territory, and I think it's better to try to stay in it rather revert 1af3c2e and > start mapping different territory that really isn't tested by Windows at all. > > On the nx6325, we stray out of that tested territory because Linux opens windows > on subtractive-decode bridges and allocates more space to CardBus bridges. > > I think it'd be fairly simple to leave subtractive-decode bridges alone, and that > might make the nx6325 work well enough for .37, even without the quirk. This > feels like a change that might be the right thing to do in general. Using > subtractive rather than positive decode would give up a little performance, but I > doubt it's important Is there a patch for that that people could try? The regression needs to be addressed one way or another: either by improving the "compatible with Windows" approach to actually work (without quirks), or by reverting to the tested-for-years Linux solution. ( What we always try to avoid is trading in a set of regressions for another set of regressions. That way lies madness, it can not result in reliable, provable progress. ) Thanks, Ingo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas 2010-12-13 13:47 ` Ingo Molnar @ 2010-12-15 0:09 ` Bjorn Helgaas 0 siblings, 0 replies; 28+ messages in thread From: Bjorn Helgaas @ 2010-12-15 0:09 UTC (permalink / raw) To: Ingo Molnar Cc: Rafael J. Wysocki, Jesse Barnes, Linus Torvalds, Len Brown, linux-pci, linux-kernel, linux-acpi, H. Peter Anvin, Thomas Gleixner, Adam Belay, Matthew Garrett, Dan Williams On Monday, December 13, 2010 06:47:39 am Ingo Molnar wrote: > > * Bjorn Helgaas <bjorn.helgaas@hp.com> wrote: > > > > OK, so I guess the best thing we can do for 2.6.37 is to revert 1af3c2e (x86: > > > allocate space within a region top-down), right? > > > > That's a possibility, but I'm not sure it's the best. It would avoid the nx6325 > > landmine, but we could easily step on a different one on other machines. > > > > If Windows uses the end of available space first, that's the best-tested > > territory, and I think it's better to try to stay in it rather revert 1af3c2e and > > start mapping different territory that really isn't tested by Windows at all. > > > > On the nx6325, we stray out of that tested territory because Linux opens windows > > on subtractive-decode bridges and allocates more space to CardBus bridges. > > > > I think it'd be fairly simple to leave subtractive-decode bridges alone, and that > > might make the nx6325 work well enough for .37, even without the quirk. This > > feels like a change that might be the right thing to do in general. Using > > subtractive rather than positive decode would give up a little performance, but I > > doubt it's important > > Is there a patch for that that people could try? Yes. https://bugzilla.kernel.org/show_bug.cgi?id=23332#c22 It works for me, but it's such a new idea to me that I'm not quite used to it and don't know how confident to be yet. Bjorn ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2010-12-15 19:21 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-12-08 21:36 [PATCH 1/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 2/5] x86: avoid BIOS area when allocating address space Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 3/5] x86: avoid PNP resources " Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 4/5] PNP: add framework for platform PNP quirks Bjorn Helgaas 2010-12-08 21:36 ` [PATCH 5/5] PNP: HP nx6325 fixup: reserve unreported resources Bjorn Helgaas 2010-12-12 3:30 ` Linus Torvalds 2010-12-12 5:23 ` Dave Airlie 2010-12-12 6:17 ` Bjorn Helgaas 2010-12-14 20:34 ` Linus Torvalds 2010-12-14 20:44 ` Linus Torvalds 2010-12-14 23:57 ` Bjorn Helgaas 2010-12-15 6:02 ` Bjorn Helgaas 2010-12-15 6:26 ` Bjorn Helgaas 2010-12-15 7:03 ` Linus Torvalds 2010-12-15 18:18 ` Bjorn Helgaas 2010-12-15 18:27 ` H. Peter Anvin 2010-12-15 19:21 ` Linus Torvalds 2010-12-08 21:37 ` [PATCH 0/5] resources: add arch hook for preventing allocation in reserved areas Bjorn Helgaas 2010-12-10 20:30 ` [PATCH 1/5] " Jesse Barnes 2010-12-10 20:36 ` Jesse Barnes 2010-12-10 21:07 ` Bjorn Helgaas 2010-12-11 1:37 ` Jesse Barnes 2010-12-12 3:34 ` Linus Torvalds 2010-12-12 4:16 ` Jesse Barnes 2010-12-12 13:20 ` Rafael J. Wysocki 2010-12-13 5:43 ` Bjorn Helgaas 2010-12-13 13:47 ` Ingo Molnar 2010-12-15 0:09 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox