* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 18:56 ` Bjorn Helgaas
@ 2010-05-13 19:08 ` H. Peter Anvin
2010-05-13 19:12 ` Mike Travis
2010-05-13 19:38 ` Mike Habeck
2 siblings, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-13 19:08 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, x86, Jesse Barnes,
Jacob Pan, Tejun Heo, Mike Habeck, LKML, Yinghai,
linux-pci@vger.kernel.org
On 05/13/2010 11:56 AM, Bjorn Helgaas wrote:
>
> I'm a little bit nervous about Linux's current strategy of assigning
> resources to things before we even know whether we're going to use
> them. We don't support dynamic PCI resource reassignment, so maybe
> we don't have any choice in this case, but generally I prefer the
> lazy approach.
>
Lazy has its own pitfalls. In particular, the issue here is about
allocation of bridge devices, which may cause all kinds of issues with
bridges further up. Consider:
A
/ \
B C
| |
D E
... where A, B, and C are bridges, and D and E are devices. Now the
driver initializes D, and requests address space. As a result, the OS
configures bridge B to have a 4K window, and accordingly the same for
bridge A.
Now you want to initialize device E. This means the window for bridge A
has to be widened, because bridge C is going to need its own 4K window.
If the linearly consecutive address space is not available, it now
means reconfiguring bridge B and device D.
4K granularity really hurts. It made sense when bridges were relatively
rare, but in PCI-express world that is no longer the case...
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 18:56 ` Bjorn Helgaas
2010-05-13 19:08 ` H. Peter Anvin
@ 2010-05-13 19:12 ` Mike Travis
2010-05-13 19:13 ` Mike Travis
2010-05-13 19:54 ` Bjorn Helgaas
2010-05-13 19:38 ` Mike Habeck
2 siblings, 2 replies; 49+ messages in thread
From: Mike Travis @ 2010-05-13 19:12 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86, Jesse Barnes,
Jacob Pan, Tejun Heo, Mike Habeck, LKML, Yinghai
Bjorn Helgaas wrote:
> On Wednesday, May 12, 2010 12:14:32 pm Mike Travis wrote:
>> Subject: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
>> From: Mike Habeck <habeck@sgi.com>
>>
>> The Linux kernel assigns BARs that a BIOS did not assign, most likely
>> to handle broken BIOSes that didn't enumerate the devices correctly.
>> On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
>> drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
>> We purposely don't assign these I/O BARs because I/O Space is a very
>> limited resource. There is only 64k of I/O Space, and in a PCIe
>> topology that space gets divided up into 4k chucks (this is due to
>> the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
>> Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
>>
>> SGI needs to scale to >16 devices with I/O BARs. So by not assigning
>> I/O BARs on devices we know don't use them, we can do that (iff the
>> kernel doesn't go and assign these BARs that the BIOS purposely didn't
>> assign).
>
> I don't quite understand this part. If you boot with "pci=nobar",
> the BIOS doesn't assign BARs, Linux doesn't either, the drivers
> don't need them -- everything works, and that makes sense so far.
>
> Now, if you boot normally (without "pci=nobar"), what changes?
> The BIOS situation is the same, but Linux tries to assign the
> unassigned BARs. It may assign a few before running out of space,
> but the drivers still don't need those BARs. What breaks?
The problem arises because we run out of address spaces to assign.
Say you have 24 cards, and the 1st 16 do not use I/O BARs. If
you assign the available 16 address spaces to cards that may not
need them, then the final 8 cards will not be available.
This avoids this problem by not wasting I/O address spaces when
they are not going to be used.
>
>> This patch will not assign a resource to a device BAR if that BAR was
>> not assigned by the BIOS, and the kernel cmdline option 'pci=nobar'
>> was specified. This patch is closely modeled after the 'pci=norom'
>> option that currently exists in the tree.
>
> Can't we figure out whether we need this ourselves? Using a command-
> line option just guarantees that we'll forever be writing customer
> advisories about this issue.
I think since this is so specific (like the potential of having
more than 16 cards would be something the customer would know),
I think it's better to error on the safe side. If a BIOS does
not recognize an add in card (for whatever reason), and does
not assign the I/O BAR, then it would be up to the kernel to
do that. Wouldn't you get more customer complaints about non-working
I/O, than someone with > 16 PCI cards not being able to use them
all?
>
> This issue is not specific to x86, so I don't really like having
> the implementation be x86-specific.
We were going for as light a touch as possible, as there is not
time to verify other arches. I'd be glad to submit a follow on
patch dealing with the generic case and depend on others for
testing, if that's of interest.
Note we also modeled the option to be identical in operation to
the pci=norom option, which is a similar x86 specific function.
>
> Do we know anything about how other OSes handle this case of I/O
> space exhaustion?
16+ PCI devices is a fairly large amount. Are there any other PC's
that handle this much I/O?
>
> I'm a little bit nervous about Linux's current strategy of assigning
> resources to things before we even know whether we're going to use
> them. We don't support dynamic PCI resource reassignment, so maybe
> we don't have any choice in this case, but generally I prefer the
> lazy approach.
That's a great idea if it can work. Unfortunately, we are all tied
to the way BIOS sets up the system, and for UV systems I don't think
dynamic provisioning would work. There's too much infrastructure
that all has to cooperate by the time the system is fully functional.
>
> Bjorn
Thanks for the feedback.
Mike
>
>> Signed-off-by: Mike Habeck <habeck@sgi.com>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> Documentation/kernel-parameters.txt | 2 ++
>> arch/x86/include/asm/pci_x86.h | 1 +
>> arch/x86/pci/common.c | 20 ++++++++++++++++++++
>> 3 files changed, 23 insertions(+)
>>
>> --- linux.orig/Documentation/kernel-parameters.txt
>> +++ linux/Documentation/kernel-parameters.txt
>> @@ -1935,6 +1935,8 @@ and is between 256 and 4096 characters.
>> norom [X86] Do not assign address space to
>> expansion ROMs that do not already have
>> BIOS assigned address ranges.
>> + nobar [X86] Do not assign address space to the
>> + BARs that weren't assigned by the BIOS.
>> irqmask=0xMMMM [X86] Set a bit mask of IRQs allowed to be
>> assigned automatically to PCI devices. You can
>> make the kernel exclude IRQs of your ISA cards
>> --- linux.orig/arch/x86/include/asm/pci_x86.h
>> +++ linux/arch/x86/include/asm/pci_x86.h
>> @@ -30,6 +30,7 @@
>> #define PCI_HAS_IO_ECS 0x40000
>> #define PCI_NOASSIGN_ROMS 0x80000
>> #define PCI_ROOT_NO_CRS 0x100000
>> +#define PCI_NOASSIGN_BARS 0x200000
>>
>> extern unsigned int pci_probe;
>> extern unsigned long pirq_table_addr;
>> --- linux.orig/arch/x86/pci/common.c
>> +++ linux/arch/x86/pci/common.c
>> @@ -125,6 +125,23 @@ void __init dmi_check_skip_isa_align(voi
>> static void __devinit pcibios_fixup_device_resources(struct pci_dev *dev)
>> {
>> struct resource *rom_r = &dev->resource[PCI_ROM_RESOURCE];
>> + struct resource *bar_r;
>> + int bar;
>> +
>> + if (pci_probe & PCI_NOASSIGN_BARS) {
>> + /*
>> + * If the BIOS did not assign the BAR, zero out the
>> + * resource so the kernel doesn't attmept to assign
>> + * it later on in pci_assign_unassigned_resources
>> + */
>> + for (bar = 0; bar <= PCI_STD_RESOURCE_END; bar++) {
>> + bar_r = &dev->resource[bar];
>> + if (bar_r->start == 0 && bar_r->end != 0) {
>> + bar_r->flags = 0;
>> + bar_r->end = 0;
>> + }
>> + }
>> + }
>>
>> if (pci_probe & PCI_NOASSIGN_ROMS) {
>> if (rom_r->parent)
>> @@ -509,6 +526,9 @@ char * __devinit pcibios_setup(char *st
>> } else if (!strcmp(str, "norom")) {
>> pci_probe |= PCI_NOASSIGN_ROMS;
>> return NULL;
>> + } else if (!strcmp(str, "nobar")) {
>> + pci_probe |= PCI_NOASSIGN_BARS;
>> + return NULL;
>> } else if (!strcmp(str, "assign-busses")) {
>> pci_probe |= PCI_ASSIGN_ALL_BUSSES;
>> return NULL;
>>
^ permalink raw reply [flat|nested] 49+ messages in thread* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 19:12 ` Mike Travis
@ 2010-05-13 19:13 ` Mike Travis
2010-05-13 19:54 ` Bjorn Helgaas
1 sibling, 0 replies; 49+ messages in thread
From: Mike Travis @ 2010-05-13 19:13 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86, Jesse Barnes,
Jacob Pan, Tejun Heo, Mike Habeck, LKML, Yinghai, linux-pci
> Oops, I added Yinghai to the CC: list, but I forgot to add
> linux-pci@vger.kernel.org. Please add that on any future replies.
[added.]
Mike Travis wrote:
>
>
> Bjorn Helgaas wrote:
>> On Wednesday, May 12, 2010 12:14:32 pm Mike Travis wrote:
>>> Subject: [Patch 1/1] x86 pci: Add option to not assign BAR's if not
>>> already assigned
>>> From: Mike Habeck <habeck@sgi.com>
>>>
>>> The Linux kernel assigns BARs that a BIOS did not assign, most likely
>>> to handle broken BIOSes that didn't enumerate the devices correctly.
>>> On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
>>> drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
>>> We purposely don't assign these I/O BARs because I/O Space is a very
>>> limited resource. There is only 64k of I/O Space, and in a PCIe
>>> topology that space gets divided up into 4k chucks (this is due to
>>> the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
>>> Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
>>>
>>> SGI needs to scale to >16 devices with I/O BARs. So by not assigning
>>> I/O BARs on devices we know don't use them, we can do that (iff the
>>> kernel doesn't go and assign these BARs that the BIOS purposely didn't
>>> assign).
>>
>> I don't quite understand this part. If you boot with "pci=nobar",
>> the BIOS doesn't assign BARs, Linux doesn't either, the drivers
>> don't need them -- everything works, and that makes sense so far.
>>
>> Now, if you boot normally (without "pci=nobar"), what changes?
>> The BIOS situation is the same, but Linux tries to assign the
>> unassigned BARs. It may assign a few before running out of space,
>> but the drivers still don't need those BARs. What breaks?
>
> The problem arises because we run out of address spaces to assign.
>
> Say you have 24 cards, and the 1st 16 do not use I/O BARs. If
> you assign the available 16 address spaces to cards that may not
> need them, then the final 8 cards will not be available.
>
> This avoids this problem by not wasting I/O address spaces when
> they are not going to be used.
>
>>
>>> This patch will not assign a resource to a device BAR if that BAR was
>>> not assigned by the BIOS, and the kernel cmdline option 'pci=nobar'
>>> was specified. This patch is closely modeled after the 'pci=norom'
>>> option that currently exists in the tree.
>>
>> Can't we figure out whether we need this ourselves? Using a command-
>> line option just guarantees that we'll forever be writing customer
>> advisories about this issue.
>
> I think since this is so specific (like the potential of having
> more than 16 cards would be something the customer would know),
> I think it's better to error on the safe side. If a BIOS does
> not recognize an add in card (for whatever reason), and does
> not assign the I/O BAR, then it would be up to the kernel to
> do that. Wouldn't you get more customer complaints about non-working
> I/O, than someone with > 16 PCI cards not being able to use them
> all?
>
>>
>> This issue is not specific to x86, so I don't really like having
>> the implementation be x86-specific.
>
> We were going for as light a touch as possible, as there is not
> time to verify other arches. I'd be glad to submit a follow on
> patch dealing with the generic case and depend on others for
> testing, if that's of interest.
>
> Note we also modeled the option to be identical in operation to
> the pci=norom option, which is a similar x86 specific function.
>
>>
>> Do we know anything about how other OSes handle this case of I/O
>> space exhaustion?
>
> 16+ PCI devices is a fairly large amount. Are there any other PC's
> that handle this much I/O?
>>
>> I'm a little bit nervous about Linux's current strategy of assigning
>> resources to things before we even know whether we're going to use
>> them. We don't support dynamic PCI resource reassignment, so maybe
>> we don't have any choice in this case, but generally I prefer the
>> lazy approach.
>
> That's a great idea if it can work. Unfortunately, we are all tied
> to the way BIOS sets up the system, and for UV systems I don't think
> dynamic provisioning would work. There's too much infrastructure
> that all has to cooperate by the time the system is fully functional.
>
>>
>> Bjorn
>
> Thanks for the feedback.
>
> Mike
>
>>
>>> Signed-off-by: Mike Habeck <habeck@sgi.com>
>>> Signed-off-by: Mike Travis <travis@sgi.com>
>>> ---
>>> Documentation/kernel-parameters.txt | 2 ++
>>> arch/x86/include/asm/pci_x86.h | 1 +
>>> arch/x86/pci/common.c | 20 ++++++++++++++++++++
>>> 3 files changed, 23 insertions(+)
>>>
>>> --- linux.orig/Documentation/kernel-parameters.txt
>>> +++ linux/Documentation/kernel-parameters.txt
>>> @@ -1935,6 +1935,8 @@ and is between 256 and 4096 characters.
>>> norom [X86] Do not assign address space to
>>> expansion ROMs that do not already have
>>> BIOS assigned address ranges.
>>> + nobar [X86] Do not assign address space to the
>>> + BARs that weren't assigned by the BIOS.
>>> irqmask=0xMMMM [X86] Set a bit mask of IRQs allowed to be
>>> assigned automatically to PCI devices. You can
>>> make the kernel exclude IRQs of your ISA cards
>>> --- linux.orig/arch/x86/include/asm/pci_x86.h
>>> +++ linux/arch/x86/include/asm/pci_x86.h
>>> @@ -30,6 +30,7 @@
>>> #define PCI_HAS_IO_ECS 0x40000
>>> #define PCI_NOASSIGN_ROMS 0x80000
>>> #define PCI_ROOT_NO_CRS 0x100000
>>> +#define PCI_NOASSIGN_BARS 0x200000
>>>
>>> extern unsigned int pci_probe;
>>> extern unsigned long pirq_table_addr;
>>> --- linux.orig/arch/x86/pci/common.c
>>> +++ linux/arch/x86/pci/common.c
>>> @@ -125,6 +125,23 @@ void __init dmi_check_skip_isa_align(voi
>>> static void __devinit pcibios_fixup_device_resources(struct pci_dev
>>> *dev)
>>> {
>>> struct resource *rom_r = &dev->resource[PCI_ROM_RESOURCE];
>>> + struct resource *bar_r;
>>> + int bar;
>>> +
>>> + if (pci_probe & PCI_NOASSIGN_BARS) {
>>> + /*
>>> + * If the BIOS did not assign the BAR, zero out the
>>> + * resource so the kernel doesn't attmept to assign
>>> + * it later on in pci_assign_unassigned_resources
>>> + */
>>> + for (bar = 0; bar <= PCI_STD_RESOURCE_END; bar++) {
>>> + bar_r = &dev->resource[bar];
>>> + if (bar_r->start == 0 && bar_r->end != 0) {
>>> + bar_r->flags = 0;
>>> + bar_r->end = 0;
>>> + }
>>> + }
>>> + }
>>>
>>> if (pci_probe & PCI_NOASSIGN_ROMS) {
>>> if (rom_r->parent)
>>> @@ -509,6 +526,9 @@ char * __devinit pcibios_setup(char *st
>>> } else if (!strcmp(str, "norom")) {
>>> pci_probe |= PCI_NOASSIGN_ROMS;
>>> return NULL;
>>> + } else if (!strcmp(str, "nobar")) {
>>> + pci_probe |= PCI_NOASSIGN_BARS;
>>> + return NULL;
>>> } else if (!strcmp(str, "assign-busses")) {
>>> pci_probe |= PCI_ASSIGN_ALL_BUSSES;
>>> return NULL;
>>>
^ permalink raw reply [flat|nested] 49+ messages in thread* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 19:12 ` Mike Travis
2010-05-13 19:13 ` Mike Travis
@ 2010-05-13 19:54 ` Bjorn Helgaas
2010-05-13 20:27 ` Mike Habeck
1 sibling, 1 reply; 49+ messages in thread
From: Bjorn Helgaas @ 2010-05-13 19:54 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86, Jesse Barnes,
Jacob Pan, Tejun Heo, Mike Habeck, LKML, Yinghai, linux-pci,
Myron Stowe
On Thursday, May 13, 2010 01:12:21 pm Mike Travis wrote:
>
> Bjorn Helgaas wrote:
> > On Wednesday, May 12, 2010 12:14:32 pm Mike Travis wrote:
> >> Subject: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
> >> From: Mike Habeck <habeck@sgi.com>
> >>
> >> The Linux kernel assigns BARs that a BIOS did not assign, most likely
> >> to handle broken BIOSes that didn't enumerate the devices correctly.
> >> On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
> >> drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
> >> We purposely don't assign these I/O BARs because I/O Space is a very
> >> limited resource. There is only 64k of I/O Space, and in a PCIe
> >> topology that space gets divided up into 4k chucks (this is due to
> >> the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
> >> Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
> >>
> >> SGI needs to scale to >16 devices with I/O BARs. So by not assigning
> >> I/O BARs on devices we know don't use them, we can do that (iff the
> >> kernel doesn't go and assign these BARs that the BIOS purposely didn't
> >> assign).
> >
> > I don't quite understand this part. If you boot with "pci=nobar",
> > the BIOS doesn't assign BARs, Linux doesn't either, the drivers
> > don't need them -- everything works, and that makes sense so far.
> >
> > Now, if you boot normally (without "pci=nobar"), what changes?
> > The BIOS situation is the same, but Linux tries to assign the
> > unassigned BARs. It may assign a few before running out of space,
> > but the drivers still don't need those BARs. What breaks?
>
> The problem arises because we run out of address spaces to assign.
>
> Say you have 24 cards, and the 1st 16 do not use I/O BARs. If
> you assign the available 16 address spaces to cards that may not
> need them, then the final 8 cards will not be available.
It sounds like your BIOS treats some devices specially, so I assumed
it would leave the first sixteen devices unassigned, but would assign
the last eight, including the bridge windows leading to them. In that
case, I would expect Linux to preserve the resources of the last
eight devices, since they're already assigned, and assign anything
left over to the first sixteen.
Are you saying that Linux clobbers the resources of the last eight
devices in the process of assigning the first sixteen? If so, I'd
say that's a Linux bug.
Or are the last eight hot-added cards that the BIOS never had a
chance to assign? That's definitely a problem.
> > Can't we figure out whether we need this ourselves? Using a command-
> > line option just guarantees that we'll forever be writing customer
> > advisories about this issue.
>
> I think since this is so specific (like the potential of having
> more than 16 cards would be something the customer would know),
> I think it's better to error on the safe side. If a BIOS does
> not recognize an add in card (for whatever reason), and does
> not assign the I/O BAR, then it would be up to the kernel to
> do that. Wouldn't you get more customer complaints about non-working
> I/O, than someone with > 16 PCI cards not being able to use them
> all?
It feels specific now, but in five years, I bet it won't be so
unusual. I really don't want to force customers to figure out
when they need this.
> > This issue is not specific to x86, so I don't really like having
> > the implementation be x86-specific.
>
> We were going for as light a touch as possible, as there is not
> time to verify other arches. I'd be glad to submit a follow on
> patch dealing with the generic case and depend on others for
> testing, if that's of interest.
It's of interest to me. I spend a lot of time pulling generic
out of architecture-specific places. If there's stuff that we
know is generic from the beginning, we shouldn't make work for
ourselves by making it x86-specific.
> > I'm a little bit nervous about Linux's current strategy of assigning
> > resources to things before we even know whether we're going to use
> > them. We don't support dynamic PCI resource reassignment, so maybe
> > we don't have any choice in this case, but generally I prefer the
> > lazy approach.
>
> That's a great idea if it can work. Unfortunately, we are all tied
> to the way BIOS sets up the system, and for UV systems I don't think
> dynamic provisioning would work. There's too much infrastructure
> that all has to cooperate by the time the system is fully functional.
Like I said, we maybe don't have a choice in this case, but I'd like
to have a clearer understanding of the problem and how other OSes
deal with it before we start applying band-aids that will hurt when
we pull them off later.
Bjorn
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 19:54 ` Bjorn Helgaas
@ 2010-05-13 20:27 ` Mike Habeck
0 siblings, 0 replies; 49+ messages in thread
From: Mike Habeck @ 2010-05-13 20:27 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86,
Jesse Barnes, Jacob Pan, Tejun Heo, LKML, Yinghai, linux-pci,
Myron Stowe, habeck
Bjorn Helgaas wrote:
> On Thursday, May 13, 2010 01:12:21 pm Mike Travis wrote:
>> Bjorn Helgaas wrote:
>>> On Wednesday, May 12, 2010 12:14:32 pm Mike Travis wrote:
>>>> Subject: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
>>>> From: Mike Habeck <habeck@sgi.com>
>>>>
>>>> The Linux kernel assigns BARs that a BIOS did not assign, most likely
>>>> to handle broken BIOSes that didn't enumerate the devices correctly.
>>>> On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
>>>> drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
>>>> We purposely don't assign these I/O BARs because I/O Space is a very
>>>> limited resource. There is only 64k of I/O Space, and in a PCIe
>>>> topology that space gets divided up into 4k chucks (this is due to
>>>> the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
>>>> Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
>>>>
>>>> SGI needs to scale to >16 devices with I/O BARs. So by not assigning
>>>> I/O BARs on devices we know don't use them, we can do that (iff the
>>>> kernel doesn't go and assign these BARs that the BIOS purposely didn't
>>>> assign).
>>> I don't quite understand this part. If you boot with "pci=nobar",
>>> the BIOS doesn't assign BARs, Linux doesn't either, the drivers
>>> don't need them -- everything works, and that makes sense so far.
>>>
>>> Now, if you boot normally (without "pci=nobar"), what changes?
>>> The BIOS situation is the same, but Linux tries to assign the
>>> unassigned BARs. It may assign a few before running out of space,
>>> but the drivers still don't need those BARs. What breaks?
>> The problem arises because we run out of address spaces to assign.
>>
>> Say you have 24 cards, and the 1st 16 do not use I/O BARs. If
>> you assign the available 16 address spaces to cards that may not
>> need them, then the final 8 cards will not be available.
>
> It sounds like your BIOS treats some devices specially, so I assumed
> it would leave the first sixteen devices unassigned, but would assign
> the last eight, including the bridge windows leading to them. In that
> case, I would expect Linux to preserve the resources of the last
> eight devices, since they're already assigned, and assign anything
> left over to the first sixteen.
Correct, the BIOS special cases devices that have an I/O BAR but
aren't used because the same functionality that the I/O BAR provides
is available via the MEM BAR. So if the device driver only uses the
MEM BAR then the BIOS will only assign resources to it, not to the
unused I/O BAR
And yes, linux would preserve the resources of the last eight devices
in your example above
>
> Are you saying that Linux clobbers the resources of the last eight
> devices in the process of assigning the first sixteen? If so, I'd
> say that's a Linux bug.
No, Linux does not clobbers the resources
>
> Or are the last eight hot-added cards that the BIOS never had a
> chance to assign? That's definitely a problem.
Yes, that is one of the problems we envision seeing.
>
>>> Can't we figure out whether we need this ourselves? Using a command-
>>> line option just guarantees that we'll forever be writing customer
>>> advisories about this issue.
>> I think since this is so specific (like the potential of having
>> more than 16 cards would be something the customer would know),
>> I think it's better to error on the safe side. If a BIOS does
>> not recognize an add in card (for whatever reason), and does
>> not assign the I/O BAR, then it would be up to the kernel to
>> do that. Wouldn't you get more customer complaints about non-working
>> I/O, than someone with > 16 PCI cards not being able to use them
>> all?
>
> It feels specific now, but in five years, I bet it won't be so
> unusual. I really don't want to force customers to figure out
> when they need this.
>
>>> This issue is not specific to x86, so I don't really like having
>>> the implementation be x86-specific.
>> We were going for as light a touch as possible, as there is not
>> time to verify other arches. I'd be glad to submit a follow on
>> patch dealing with the generic case and depend on others for
>> testing, if that's of interest.
>
> It's of interest to me. I spend a lot of time pulling generic
> out of architecture-specific places. If there's stuff that we
> know is generic from the beginning, we shouldn't make work for
> ourselves by making it x86-specific.
>
>>> I'm a little bit nervous about Linux's current strategy of assigning
>>> resources to things before we even know whether we're going to use
>>> them. We don't support dynamic PCI resource reassignment, so maybe
>>> we don't have any choice in this case, but generally I prefer the
>>> lazy approach.
>> That's a great idea if it can work. Unfortunately, we are all tied
>> to the way BIOS sets up the system, and for UV systems I don't think
>> dynamic provisioning would work. There's too much infrastructure
>> that all has to cooperate by the time the system is fully functional.
>
> Like I said, we maybe don't have a choice in this case, but I'd like
> to have a clearer understanding of the problem and how other OSes
> deal with it before we start applying band-aids that will hurt when
> we pull them off later.
>
> Bjorn
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 18:56 ` Bjorn Helgaas
2010-05-13 19:08 ` H. Peter Anvin
2010-05-13 19:12 ` Mike Travis
@ 2010-05-13 19:38 ` Mike Habeck
2010-05-13 20:02 ` Bjorn Helgaas
2010-05-13 20:36 ` Yinghai Lu
2 siblings, 2 replies; 49+ messages in thread
From: Mike Habeck @ 2010-05-13 19:38 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86,
Jesse Barnes, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org
Bjorn Helgaas wrote:
> On Wednesday, May 12, 2010 12:14:32 pm Mike Travis wrote:
>> Subject: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
>> From: Mike Habeck <habeck@sgi.com>
>>
>> The Linux kernel assigns BARs that a BIOS did not assign, most likely
>> to handle broken BIOSes that didn't enumerate the devices correctly.
>> On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
>> drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
>> We purposely don't assign these I/O BARs because I/O Space is a very
>> limited resource. There is only 64k of I/O Space, and in a PCIe
>> topology that space gets divided up into 4k chucks (this is due to
>> the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
>> Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
>>
>> SGI needs to scale to >16 devices with I/O BARs. So by not assigning
>> I/O BARs on devices we know don't use them, we can do that (iff the
>> kernel doesn't go and assign these BARs that the BIOS purposely didn't
>> assign).
>
> I don't quite understand this part. If you boot with "pci=nobar",
> the BIOS doesn't assign BARs, Linux doesn't either, the drivers
> don't need them -- everything works, and that makes sense so far.
>
> Now, if you boot normally (without "pci=nobar"), what changes?
> The BIOS situation is the same, but Linux tries to assign the
> unassigned BARs. It may assign a few before running out of space,
> but the drivers still don't need those BARs. What breaks?
Nothing really breaks, it's more of a problem that the kernel uses
up the rest of the I/O Space, and starts spitting out warning
messages as it tries to assign the rest of the I/O BARs that the
BIOS didn't assign, something like:
pci 0010:03:00.0: BAR 5: can't allocate I/O resource [0x0-0x7f]
pci 0012:05:00.0: BAR 5: can't allocate I/O resource [0x0-0x7f]
...
And in using up all the I/O space, I think that could prevent a
hotplug attach of a pci device requiring I/O space (although I
believe most BIOSes pad the bridge decoders to support that).
I'm not to familiar with how pci hotplug works on x86 so I may
be wrong in what I just stated.
>
>> This patch will not assign a resource to a device BAR if that BAR was
>> not assigned by the BIOS, and the kernel cmdline option 'pci=nobar'
>> was specified. This patch is closely modeled after the 'pci=norom'
>> option that currently exists in the tree.
>
> Can't we figure out whether we need this ourselves? Using a command-
> line option just guarantees that we'll forever be writing customer
> advisories about this issue.
>
> This issue is not specific to x86, so I don't really like having
> the implementation be x86-specific.
I agree this isn't a x86 specific issue but given the 'norom'
cmdline option is basically doing the same thing (but for pci
Expansion ROM BARs) this code was modeled after it.
>
> Do we know anything about how other OSes handle this case of I/O
> space exhaustion?
>
> I'm a little bit nervous about Linux's current strategy of assigning
> resources to things before we even know whether we're going to use
> them. We don't support dynamic PCI resource reassignment, so maybe
> we don't have any choice in this case, but generally I prefer the
> lazy approach.
>
> Bjorn
>
>> Signed-off-by: Mike Habeck <habeck@sgi.com>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> Documentation/kernel-parameters.txt | 2 ++
>> arch/x86/include/asm/pci_x86.h | 1 +
>> arch/x86/pci/common.c | 20 ++++++++++++++++++++
>> 3 files changed, 23 insertions(+)
>>
>> --- linux.orig/Documentation/kernel-parameters.txt
>> +++ linux/Documentation/kernel-parameters.txt
>> @@ -1935,6 +1935,8 @@ and is between 256 and 4096 characters.
>> norom [X86] Do not assign address space to
>> expansion ROMs that do not already have
>> BIOS assigned address ranges.
>> + nobar [X86] Do not assign address space to the
>> + BARs that weren't assigned by the BIOS.
>> irqmask=0xMMMM [X86] Set a bit mask of IRQs allowed to be
>> assigned automatically to PCI devices. You can
>> make the kernel exclude IRQs of your ISA cards
>> --- linux.orig/arch/x86/include/asm/pci_x86.h
>> +++ linux/arch/x86/include/asm/pci_x86.h
>> @@ -30,6 +30,7 @@
>> #define PCI_HAS_IO_ECS 0x40000
>> #define PCI_NOASSIGN_ROMS 0x80000
>> #define PCI_ROOT_NO_CRS 0x100000
>> +#define PCI_NOASSIGN_BARS 0x200000
>>
>> extern unsigned int pci_probe;
>> extern unsigned long pirq_table_addr;
>> --- linux.orig/arch/x86/pci/common.c
>> +++ linux/arch/x86/pci/common.c
>> @@ -125,6 +125,23 @@ void __init dmi_check_skip_isa_align(voi
>> static void __devinit pcibios_fixup_device_resources(struct pci_dev *dev)
>> {
>> struct resource *rom_r = &dev->resource[PCI_ROM_RESOURCE];
>> + struct resource *bar_r;
>> + int bar;
>> +
>> + if (pci_probe & PCI_NOASSIGN_BARS) {
>> + /*
>> + * If the BIOS did not assign the BAR, zero out the
>> + * resource so the kernel doesn't attmept to assign
>> + * it later on in pci_assign_unassigned_resources
>> + */
>> + for (bar = 0; bar <= PCI_STD_RESOURCE_END; bar++) {
>> + bar_r = &dev->resource[bar];
>> + if (bar_r->start == 0 && bar_r->end != 0) {
>> + bar_r->flags = 0;
>> + bar_r->end = 0;
>> + }
>> + }
>> + }
>>
>> if (pci_probe & PCI_NOASSIGN_ROMS) {
>> if (rom_r->parent)
>> @@ -509,6 +526,9 @@ char * __devinit pcibios_setup(char *st
>> } else if (!strcmp(str, "norom")) {
>> pci_probe |= PCI_NOASSIGN_ROMS;
>> return NULL;
>> + } else if (!strcmp(str, "nobar")) {
>> + pci_probe |= PCI_NOASSIGN_BARS;
>> + return NULL;
>> } else if (!strcmp(str, "assign-busses")) {
>> pci_probe |= PCI_ASSIGN_ALL_BUSSES;
>> return NULL;
>>
^ permalink raw reply [flat|nested] 49+ messages in thread* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 19:38 ` Mike Habeck
@ 2010-05-13 20:02 ` Bjorn Helgaas
2010-05-13 20:09 ` H. Peter Anvin
2010-05-14 22:25 ` Jesse Barnes
2010-05-13 20:36 ` Yinghai Lu
1 sibling, 2 replies; 49+ messages in thread
From: Bjorn Helgaas @ 2010-05-13 20:02 UTC (permalink / raw)
To: Mike Habeck
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, H. Peter Anvin, x86,
Jesse Barnes, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Thursday, May 13, 2010 01:38:24 pm Mike Habeck wrote:
> Bjorn Helgaas wrote:
> > On Wednesday, May 12, 2010 12:14:32 pm Mike Travis wrote:
> >> Subject: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
> >> From: Mike Habeck <habeck@sgi.com>
> >>
> >> The Linux kernel assigns BARs that a BIOS did not assign, most likely
> >> to handle broken BIOSes that didn't enumerate the devices correctly.
> >> On UV the BIOS purposely doesn't assign I/O BARs for certain devices/
> >> drivers we know don't use them (examples, LSI SAS, Qlogic FC, ...).
> >> We purposely don't assign these I/O BARs because I/O Space is a very
> >> limited resource. There is only 64k of I/O Space, and in a PCIe
> >> topology that space gets divided up into 4k chucks (this is due to
> >> the fact that a pci-to-pci bridge's I/O decoder is aligned at 4k)...
> >> Thus a system can have at most 16 cards with I/O BARs: (64k / 4k = 16)
> >>
> >> SGI needs to scale to >16 devices with I/O BARs. So by not assigning
> >> I/O BARs on devices we know don't use them, we can do that (iff the
> >> kernel doesn't go and assign these BARs that the BIOS purposely didn't
> >> assign).
> >
> > I don't quite understand this part. If you boot with "pci=nobar",
> > the BIOS doesn't assign BARs, Linux doesn't either, the drivers
> > don't need them -- everything works, and that makes sense so far.
> >
> > Now, if you boot normally (without "pci=nobar"), what changes?
> > The BIOS situation is the same, but Linux tries to assign the
> > unassigned BARs. It may assign a few before running out of space,
> > but the drivers still don't need those BARs. What breaks?
>
> Nothing really breaks, it's more of a problem that the kernel uses
> up the rest of the I/O Space, and starts spitting out warning
> messages as it tries to assign the rest of the I/O BARs that the
> BIOS didn't assign, something like:
>
> pci 0010:03:00.0: BAR 5: can't allocate I/O resource [0x0-0x7f]
> pci 0012:05:00.0: BAR 5: can't allocate I/O resource [0x0-0x7f]
> ...
OK, that's what I would expect. Personally, I think I'd *like*
to have those messages. If 0010:03:00.0 is a device whose driver
depends on I/O space, the message will be a good clue as to why
the driver isn't working.
> And in using up all the I/O space, I think that could prevent a
> hotplug attach of a pci device requiring I/O space (although I
> believe most BIOSes pad the bridge decoders to support that).
> I'm not to familiar with how pci hotplug works on x86 so I may
> be wrong in what I just stated.
Yep, that's definitely a problem, and I don't have a good solution.
HP (and probably SGI) had a nice hardware solution for ia64 --
address translation across the host bridge, so each bridge could
have its own 64K I/O space. But I don't see that coming in the
x86 PC arena.
> > This issue is not specific to x86, so I don't really like having
> > the implementation be x86-specific.
>
> I agree this isn't a x86 specific issue but given the 'norom'
> cmdline option is basically doing the same thing (but for pci
> Expansion ROM BARs) this code was modeled after it.
IMHO, we should fix both.
Bjorn
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 20:02 ` Bjorn Helgaas
@ 2010-05-13 20:09 ` H. Peter Anvin
2010-05-14 22:25 ` Jesse Barnes
1 sibling, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-13 20:09 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mike Habeck, Mike Travis, Ingo Molnar, Thomas Gleixner, x86,
Jesse Barnes, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/13/2010 01:02 PM, Bjorn Helgaas wrote:
>
> Yep, that's definitely a problem, and I don't have a good solution.
>
> HP (and probably SGI) had a nice hardware solution for ia64 --
> address translation across the host bridge, so each bridge could
> have its own 64K I/O space. But I don't see that coming in the
> x86 PC arena.
>
I would agree, that is unlikely (not that I have any information on this
subject, mind you.) Much more likely would be a bridge extension
capability for finer-grained I/O space routing.
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 20:02 ` Bjorn Helgaas
2010-05-13 20:09 ` H. Peter Anvin
@ 2010-05-14 22:25 ` Jesse Barnes
2010-05-14 22:34 ` Mike Travis
1 sibling, 1 reply; 49+ messages in thread
From: Jesse Barnes @ 2010-05-14 22:25 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mike Habeck, Mike Travis, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Thu, 13 May 2010 14:02:30 -0600
Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
> > > This issue is not specific to x86, so I don't really like having
> > > the implementation be x86-specific.
> >
> > I agree this isn't a x86 specific issue but given the 'norom'
> > cmdline option is basically doing the same thing (but for pci
> > Expansion ROM BARs) this code was modeled after it.
>
> IMHO, we should fix both.
Yeah, that would be good. Mike, have you looked at this at all?
Also, to clarify, this isn't affecting users today, right? Or do you
need all this I/O space for multiple IOHs and the drivers that bind to
them in current UV systems?
Fundamentally, until we have real dynamic PCI resource management (i.e.
driver hooks for handling relocation, lazy allocation of resources at
driver bind time, etc.) we're going to continue to need hacks like
this. However, we could make them slightly more automated by making
"nobar" and "norom" the default on systems that typically need them,
maybe with a DMI table.
Thanks,
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:25 ` Jesse Barnes
@ 2010-05-14 22:34 ` Mike Travis
2010-05-14 22:35 ` H. Peter Anvin
2010-05-14 22:47 ` Jesse Barnes
0 siblings, 2 replies; 49+ messages in thread
From: Mike Travis @ 2010-05-14 22:34 UTC (permalink / raw)
To: Jesse Barnes
Cc: Bjorn Helgaas, Mike Habeck, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
Jesse Barnes wrote:
> On Thu, 13 May 2010 14:02:30 -0600
> Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
>>>> This issue is not specific to x86, so I don't really like having
>>>> the implementation be x86-specific.
>>> I agree this isn't a x86 specific issue but given the 'norom'
>>> cmdline option is basically doing the same thing (but for pci
>>> Expansion ROM BARs) this code was modeled after it.
>> IMHO, we should fix both.
>
> Yeah, that would be good. Mike, have you looked at this at all?
>
> Also, to clarify, this isn't affecting users today, right? Or do you
> need all this I/O space for multiple IOHs and the drivers that bind to
> them in current UV systems?
We have customers that want to install more than 16 PCI-e cards right
now. Our window of opportunity closes very soon (days), so either this
patch makes it in as is (or something close), or we wait for another
release cycle. UV shipments start this month.
[I wouldn't mind working on an improvement for later.]
>
> Fundamentally, until we have real dynamic PCI resource management (i.e.
> driver hooks for handling relocation, lazy allocation of resources at
> driver bind time, etc.) we're going to continue to need hacks like
> this. However, we could make them slightly more automated by making
> "nobar" and "norom" the default on systems that typically need them,
> maybe with a DMI table.
It seems that BIOS changes are much more difficult. The real solution
to this problem is for Card Vendors to not request I/O Bars if they
won't be using them. But that's the hardest option of all to accomplish.
Thanks,
Mike
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:34 ` Mike Travis
@ 2010-05-14 22:35 ` H. Peter Anvin
2010-05-14 22:40 ` Mike Travis
2010-05-14 22:47 ` Jesse Barnes
1 sibling, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-14 22:35 UTC (permalink / raw)
To: Mike Travis
Cc: Jesse Barnes, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/14/2010 03:34 PM, Mike Travis wrote:
>
> It seems that BIOS changes are much more difficult. The real solution
> to this problem is for Card Vendors to not request I/O Bars if they
> won't be using them. But that's the hardest option of all to accomplish.
>
That is a non-option. Any device which may have to be a boot or console
device in a legacy system pretty much needs them.
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:35 ` H. Peter Anvin
@ 2010-05-14 22:40 ` Mike Travis
2010-05-15 2:25 ` Mike Travis
0 siblings, 1 reply; 49+ messages in thread
From: Mike Travis @ 2010-05-14 22:40 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Jesse Barnes, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
H. Peter Anvin wrote:
> On 05/14/2010 03:34 PM, Mike Travis wrote:
>> It seems that BIOS changes are much more difficult. The real solution
>> to this problem is for Card Vendors to not request I/O Bars if they
>> won't be using them. But that's the hardest option of all to accomplish.
>>
>
> That is a non-option. Any device which may have to be a boot or console
> device in a legacy system pretty much needs them.
>
> -hpa
Yes, that's true. We're somewhat fortunate in that our Legacy I/O
devices are confined to those on the first blade, and all of these
other devices are on other blades (PCI segments 1+).
But you're right, unless the card vendor supplied some kind of strapping
option, or something similar, it won't know it's in a UV system or not.
Thanks,
Mike
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:40 ` Mike Travis
@ 2010-05-15 2:25 ` Mike Travis
0 siblings, 0 replies; 49+ messages in thread
From: Mike Travis @ 2010-05-15 2:25 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Jesse Barnes, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
... We're somewhat fortunate in that our Legacy I/O
> devices are confined to those on the first blade, and all of these
> other devices are on other blades (PCI segments 1+).
By "somewhat" I meant we'd be way more fortunate without the
legacy I/O devices at all! But hpa is right, 4 lines of
assembler is way better than (900?) lines of C.
My feeling is that if a CPU needs legacy i/o it should be
built into the chip. HAH! ;-)
The BIOS should just list the machine resources, in a
"non-BIOS" way, (oh wait, that's ACPI! :*)
Cheers!
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:34 ` Mike Travis
2010-05-14 22:35 ` H. Peter Anvin
@ 2010-05-14 22:47 ` Jesse Barnes
2010-05-14 22:59 ` Mike Travis
2010-05-14 23:20 ` H. Peter Anvin
1 sibling, 2 replies; 49+ messages in thread
From: Jesse Barnes @ 2010-05-14 22:47 UTC (permalink / raw)
To: Mike Travis
Cc: Bjorn Helgaas, Mike Habeck, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 15:34:01 -0700
Mike Travis <travis@sgi.com> wrote:
>
>
> Jesse Barnes wrote:
> > On Thu, 13 May 2010 14:02:30 -0600
> > Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
> >>>> This issue is not specific to x86, so I don't really like having
> >>>> the implementation be x86-specific.
> >>> I agree this isn't a x86 specific issue but given the 'norom'
> >>> cmdline option is basically doing the same thing (but for pci
> >>> Expansion ROM BARs) this code was modeled after it.
>
> >> IMHO, we should fix both.
> >
> > Yeah, that would be good. Mike, have you looked at this at all?
> >
> > Also, to clarify, this isn't affecting users today, right? Or do you
> > need all this I/O space for multiple IOHs and the drivers that bind to
> > them in current UV systems?
>
> We have customers that want to install more than 16 PCI-e cards right
> now. Our window of opportunity closes very soon (days), so either this
> patch makes it in as is (or something close), or we wait for another
> release cycle. UV shipments start this month.
>
> [I wouldn't mind working on an improvement for later.]
Wow and they're using cards that want to use I/O space? Funky. It's
too late to get this into 2.6.34, but that can't be what you were
expecting... I don't see a problem with getting something like this in
for 2.6.35.
> > Fundamentally, until we have real dynamic PCI resource management (i.e.
> > driver hooks for handling relocation, lazy allocation of resources at
> > driver bind time, etc.) we're going to continue to need hacks like
> > this. However, we could make them slightly more automated by making
> > "nobar" and "norom" the default on systems that typically need them,
> > maybe with a DMI table.
>
> It seems that BIOS changes are much more difficult. The real solution
> to this problem is for Card Vendors to not request I/O Bars if they
> won't be using them. But that's the hardest option of all to accomplish.
Right.
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:47 ` Jesse Barnes
@ 2010-05-14 22:59 ` Mike Travis
2010-05-14 23:06 ` Jesse Barnes
2010-05-14 23:20 ` H. Peter Anvin
1 sibling, 1 reply; 49+ messages in thread
From: Mike Travis @ 2010-05-14 22:59 UTC (permalink / raw)
To: Jesse Barnes
Cc: Bjorn Helgaas, Mike Habeck, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
Jesse Barnes wrote:
> On Fri, 14 May 2010 15:34:01 -0700
> Mike Travis <travis@sgi.com> wrote:
>
>>
>> Jesse Barnes wrote:
>>> On Thu, 13 May 2010 14:02:30 -0600
>>> Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
>>>>>> This issue is not specific to x86, so I don't really like having
>>>>>> the implementation be x86-specific.
>>>>> I agree this isn't a x86 specific issue but given the 'norom'
>>>>> cmdline option is basically doing the same thing (but for pci
>>>>> Expansion ROM BARs) this code was modeled after it.
>>>> IMHO, we should fix both.
>>> Yeah, that would be good. Mike, have you looked at this at all?
>>>
>>> Also, to clarify, this isn't affecting users today, right? Or do you
>>> need all this I/O space for multiple IOHs and the drivers that bind to
>>> them in current UV systems?
>> We have customers that want to install more than 16 PCI-e cards right
>> now. Our window of opportunity closes very soon (days), so either this
>> patch makes it in as is (or something close), or we wait for another
>> release cycle. UV shipments start this month.
>>
>> [I wouldn't mind working on an improvement for later.]
>
> Wow and they're using cards that want to use I/O space? Funky. It's
> too late to get this into 2.6.34, but that can't be what you were
> expecting... I don't see a problem with getting something like this in
> for 2.6.35.
2.6.35 would be fine. It's the acceptance that's the key.
And yes, we're using standard cards like everyone else... ;-)
[The message is "UV" is just a really, really big PC. ;-)]
I would appreciate however, some more detail on what's the goal of the
updates to "fix both". Thanks!
>
>>> Fundamentally, until we have real dynamic PCI resource management (i.e.
>>> driver hooks for handling relocation, lazy allocation of resources at
>>> driver bind time, etc.) we're going to continue to need hacks like
>>> this. However, we could make them slightly more automated by making
>>> "nobar" and "norom" the default on systems that typically need them,
>>> maybe with a DMI table.
>> It seems that BIOS changes are much more difficult. The real solution
>> to this problem is for Card Vendors to not request I/O Bars if they
>> won't be using them. But that's the hardest option of all to accomplish.
>
> Right.
>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:59 ` Mike Travis
@ 2010-05-14 23:06 ` Jesse Barnes
2010-05-14 23:23 ` Mike Travis
0 siblings, 1 reply; 49+ messages in thread
From: Jesse Barnes @ 2010-05-14 23:06 UTC (permalink / raw)
To: Mike Travis
Cc: Bjorn Helgaas, Mike Habeck, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 15:59:11 -0700
Mike Travis <travis@sgi.com> wrote:
>
>
> Jesse Barnes wrote:
> > On Fri, 14 May 2010 15:34:01 -0700
> > Mike Travis <travis@sgi.com> wrote:
> >
> >>
> >> Jesse Barnes wrote:
> >>> On Thu, 13 May 2010 14:02:30 -0600
> >>> Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
> >>>>>> This issue is not specific to x86, so I don't really like having
> >>>>>> the implementation be x86-specific.
> >>>>> I agree this isn't a x86 specific issue but given the 'norom'
> >>>>> cmdline option is basically doing the same thing (but for pci
> >>>>> Expansion ROM BARs) this code was modeled after it.
> >>>> IMHO, we should fix both.
> >>> Yeah, that would be good. Mike, have you looked at this at all?
> >>>
> >>> Also, to clarify, this isn't affecting users today, right? Or do you
> >>> need all this I/O space for multiple IOHs and the drivers that bind to
> >>> them in current UV systems?
> >> We have customers that want to install more than 16 PCI-e cards right
> >> now. Our window of opportunity closes very soon (days), so either this
> >> patch makes it in as is (or something close), or we wait for another
> >> release cycle. UV shipments start this month.
> >>
> >> [I wouldn't mind working on an improvement for later.]
> >
> > Wow and they're using cards that want to use I/O space? Funky. It's
> > too late to get this into 2.6.34, but that can't be what you were
> > expecting... I don't see a problem with getting something like this in
> > for 2.6.35.
>
> 2.6.35 would be fine. It's the acceptance that's the key.
>
> And yes, we're using standard cards like everyone else... ;-)
>
> [The message is "UV" is just a really, really big PC. ;-)]
>
> I would appreciate however, some more detail on what's the goal of the
> updates to "fix both". Thanks!
As Bjorn noted, both the norom and nobar options are listed as x86
specific in the documentation and use x86 specific flags and code to
prevent their respective allocations.
It would be good if we could move the flags and code to the common PCI
layer in drivers/pci so that other arches could take advantage of it,
and we could have less resource management fragmentation.
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:06 ` Jesse Barnes
@ 2010-05-14 23:23 ` Mike Travis
2010-05-14 23:33 ` Jesse Barnes
0 siblings, 1 reply; 49+ messages in thread
From: Mike Travis @ 2010-05-14 23:23 UTC (permalink / raw)
To: Jesse Barnes
Cc: Bjorn Helgaas, Mike Habeck, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
Jesse Barnes wrote:
> On Fri, 14 May 2010 15:59:11 -0700
> Mike Travis <travis@sgi.com> wrote:
>
>>
>> Jesse Barnes wrote:
>>> On Fri, 14 May 2010 15:34:01 -0700
>>> Mike Travis <travis@sgi.com> wrote:
>>>
>>>> Jesse Barnes wrote:
>>>>> On Thu, 13 May 2010 14:02:30 -0600
>>>>> Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
>>>>>>>> This issue is not specific to x86, so I don't really like having
>>>>>>>> the implementation be x86-specific.
>>>>>>> I agree this isn't a x86 specific issue but given the 'norom'
>>>>>>> cmdline option is basically doing the same thing (but for pci
>>>>>>> Expansion ROM BARs) this code was modeled after it.
>>>>>> IMHO, we should fix both.
>>>>> Yeah, that would be good. Mike, have you looked at this at all?
>>>>>
>>>>> Also, to clarify, this isn't affecting users today, right? Or do you
>>>>> need all this I/O space for multiple IOHs and the drivers that bind to
>>>>> them in current UV systems?
>>>> We have customers that want to install more than 16 PCI-e cards right
>>>> now. Our window of opportunity closes very soon (days), so either this
>>>> patch makes it in as is (or something close), or we wait for another
>>>> release cycle. UV shipments start this month.
>>>>
>>>> [I wouldn't mind working on an improvement for later.]
>>> Wow and they're using cards that want to use I/O space? Funky. It's
>>> too late to get this into 2.6.34, but that can't be what you were
>>> expecting... I don't see a problem with getting something like this in
>>> for 2.6.35.
>> 2.6.35 would be fine. It's the acceptance that's the key.
>>
>> And yes, we're using standard cards like everyone else... ;-)
>>
>> [The message is "UV" is just a really, really big PC. ;-)]
>>
>> I would appreciate however, some more detail on what's the goal of the
>> updates to "fix both". Thanks!
>
> As Bjorn noted, both the norom and nobar options are listed as x86
> specific in the documentation and use x86 specific flags and code to
> prevent their respective allocations.
>
> It would be good if we could move the flags and code to the common PCI
> layer in drivers/pci so that other arches could take advantage of it,
> and we could have less resource management fragmentation.
>
Thanks! I was hoping you weren't going to say we need to remove the
need for these options.
I think the dynamic provisioning thing would be a great feature for the
future, but it seems to me that it would entail quite a bit of coordinated
effort between BIOS and the kernel? Not to mention the final death of
all POST related initialization (at least in the case where the device
is not being used in Legacy mode)? [Heck, can we kill POST too?!?] I
haven't looked too closely at non-x86 arch's but unless they run an
x86 emulator, then the POST initialization is pretty much superfluous,
isn't it? The drivers will initialize when the device comes online?
I mean we do have more the 64k of memory now... ;-)]
Thanks,
Mike
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:23 ` Mike Travis
@ 2010-05-14 23:33 ` Jesse Barnes
2010-05-14 23:40 ` H. Peter Anvin
0 siblings, 1 reply; 49+ messages in thread
From: Jesse Barnes @ 2010-05-14 23:33 UTC (permalink / raw)
To: Mike Travis
Cc: Bjorn Helgaas, Mike Habeck, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 16:23:52 -0700
Mike Travis <travis@sgi.com> wrote:
> Thanks! I was hoping you weren't going to say we need to remove the
> need for these options.
No, I won't force that on you just yet. :)
> I think the dynamic provisioning thing would be a great feature for the
> future, but it seems to me that it would entail quite a bit of coordinated
> effort between BIOS and the kernel? Not to mention the final death of
> all POST related initialization (at least in the case where the device
> is not being used in Legacy mode)? [Heck, can we kill POST too?!?] I
> haven't looked too closely at non-x86 arch's but unless they run an
> x86 emulator, then the POST initialization is pretty much superfluous,
> isn't it? The drivers will initialize when the device comes online?
> I mean we do have more the 64k of memory now... ;-)]
Yeah, we can mostly ignore POST, though some BIOSes need it for their
boot time services. Even complex graphics devices tend not to need it
these days as long as you have a fully capable driver.
As for BIOS coordination for dynamic reallocation, yeah there'd be some
of that. I think the basic principles would be:
1) use BIOS allocations wherever possible
2) get an accurate list of available resources from the BIOS for
potential remapping later
3) allocate resources for BARs and devices as late as possible (e.g.
at driver bind time) to avoid allocating more than we need
But that's a good chunk of work, and as we've seen, PCs in particular
are really sensitive to having resources moved around too much, so step
(2) is probably the hardest part.
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:33 ` Jesse Barnes
@ 2010-05-14 23:40 ` H. Peter Anvin
2010-05-15 0:02 ` Jesse Barnes
0 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-14 23:40 UTC (permalink / raw)
To: Jesse Barnes
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/14/2010 04:33 PM, Jesse Barnes wrote:
>
> As for BIOS coordination for dynamic reallocation, yeah there'd be some
> of that. I think the basic principles would be:
> 1) use BIOS allocations wherever possible
> 2) get an accurate list of available resources from the BIOS for
> potential remapping later
> 3) allocate resources for BARs and devices as late as possible (e.g.
> at driver bind time) to avoid allocating more than we need
>
> But that's a good chunk of work, and as we've seen, PCs in particular
> are really sensitive to having resources moved around too much, so step
> (2) is probably the hardest part.
>
The real problem that I see, as outlined before, has nothing to do with
the BIOS, but rather the interdependencies between resources.
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:40 ` H. Peter Anvin
@ 2010-05-15 0:02 ` Jesse Barnes
0 siblings, 0 replies; 49+ messages in thread
From: Jesse Barnes @ 2010-05-15 0:02 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 16:40:55 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/14/2010 04:33 PM, Jesse Barnes wrote:
> >
> > As for BIOS coordination for dynamic reallocation, yeah there'd be some
> > of that. I think the basic principles would be:
> > 1) use BIOS allocations wherever possible
> > 2) get an accurate list of available resources from the BIOS for
> > potential remapping later
> > 3) allocate resources for BARs and devices as late as possible (e.g.
> > at driver bind time) to avoid allocating more than we need
> >
> > But that's a good chunk of work, and as we've seen, PCs in particular
> > are really sensitive to having resources moved around too much, so step
> > (2) is probably the hardest part.
> >
>
> The real problem that I see, as outlined before, has nothing to do with
> the BIOS, but rather the interdependencies between resources.
I was using the term "BIOS" loosely to refer to safe ranges to allocate
resources. Your topology example is a good one, there are definite
dependencies between devices, especially on large systems. But in both
simple and complex cases, we still need to have resource ranges
available, or we'll have no where to put things at all.
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 22:47 ` Jesse Barnes
2010-05-14 22:59 ` Mike Travis
@ 2010-05-14 23:20 ` H. Peter Anvin
2010-05-14 23:28 ` Jesse Barnes
1 sibling, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-14 23:20 UTC (permalink / raw)
To: Jesse Barnes
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/14/2010 03:47 PM, Jesse Barnes wrote:
>
> Wow and they're using cards that want to use I/O space? Funky. It's
> too late to get this into 2.6.34, but that can't be what you were
> expecting... I don't see a problem with getting something like this in
> for 2.6.35.
>
Most cards on the market provide I/O BARs as a convenience to legacy
BIOSes; they don't need the I/O BAR functionality from inside a
full-featured OS. There are a few, key, exceptions, mainly in the form
of legacy-interface devices like UARTs and VGA (note that VGA has its
own routing bits and is therefore unaffected by this problem.)
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:20 ` H. Peter Anvin
@ 2010-05-14 23:28 ` Jesse Barnes
2010-05-14 23:32 ` H. Peter Anvin
0 siblings, 1 reply; 49+ messages in thread
From: Jesse Barnes @ 2010-05-14 23:28 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 16:20:45 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/14/2010 03:47 PM, Jesse Barnes wrote:
> >
> > Wow and they're using cards that want to use I/O space? Funky. It's
> > too late to get this into 2.6.34, but that can't be what you were
> > expecting... I don't see a problem with getting something like this in
> > for 2.6.35.
> >
>
> Most cards on the market provide I/O BARs as a convenience to legacy
> BIOSes; they don't need the I/O BAR functionality from inside a
> full-featured OS. There are a few, key, exceptions, mainly in the form
> of legacy-interface devices like UARTs and VGA (note that VGA has its
> own routing bits and is therefore unaffected by this problem.)
Yeah, it's the "legacy" part that I'm questioning. I'm just lamenting
that it's dying off so slowly...
And yes, VGA is an unfortunate standard.
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:28 ` Jesse Barnes
@ 2010-05-14 23:32 ` H. Peter Anvin
2010-05-14 23:34 ` Jesse Barnes
0 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-14 23:32 UTC (permalink / raw)
To: Jesse Barnes
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/14/2010 04:28 PM, Jesse Barnes wrote:
> On Fri, 14 May 2010 16:20:45 -0700
> "H. Peter Anvin" <hpa@zytor.com> wrote:
>
>> On 05/14/2010 03:47 PM, Jesse Barnes wrote:
>>>
>>> Wow and they're using cards that want to use I/O space? Funky. It's
>>> too late to get this into 2.6.34, but that can't be what you were
>>> expecting... I don't see a problem with getting something like this in
>>> for 2.6.35.
>>>
>>
>> Most cards on the market provide I/O BARs as a convenience to legacy
>> BIOSes; they don't need the I/O BAR functionality from inside a
>> full-featured OS. There are a few, key, exceptions, mainly in the form
>> of legacy-interface devices like UARTs and VGA (note that VGA has its
>> own routing bits and is therefore unaffected by this problem.)
>
> Yeah, it's the "legacy" part that I'm questioning. I'm just lamenting
> that it's dying off so slowly...
>
> And yes, VGA is an unfortunate standard.
>
I'm not lamenting that fact, because my experience is that what ends up
replacing it is often far worse. Consider UARTs -- no MMU dependencies
at all, can be accessed with four lines of assembly, and compare it to
EHCI debug port, the driver for which is over 900 lines in the Linux
kernel -- and that assumes that you're already in flat mode.
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:32 ` H. Peter Anvin
@ 2010-05-14 23:34 ` Jesse Barnes
2010-05-14 23:39 ` H. Peter Anvin
0 siblings, 1 reply; 49+ messages in thread
From: Jesse Barnes @ 2010-05-14 23:34 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 16:32:07 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/14/2010 04:28 PM, Jesse Barnes wrote:
> > On Fri, 14 May 2010 16:20:45 -0700
> > "H. Peter Anvin" <hpa@zytor.com> wrote:
> >
> >> On 05/14/2010 03:47 PM, Jesse Barnes wrote:
> >>>
> >>> Wow and they're using cards that want to use I/O space? Funky. It's
> >>> too late to get this into 2.6.34, but that can't be what you were
> >>> expecting... I don't see a problem with getting something like this in
> >>> for 2.6.35.
> >>>
> >>
> >> Most cards on the market provide I/O BARs as a convenience to legacy
> >> BIOSes; they don't need the I/O BAR functionality from inside a
> >> full-featured OS. There are a few, key, exceptions, mainly in the form
> >> of legacy-interface devices like UARTs and VGA (note that VGA has its
> >> own routing bits and is therefore unaffected by this problem.)
> >
> > Yeah, it's the "legacy" part that I'm questioning. I'm just lamenting
> > that it's dying off so slowly...
> >
> > And yes, VGA is an unfortunate standard.
> >
>
> I'm not lamenting that fact, because my experience is that what ends up
> replacing it is often far worse. Consider UARTs -- no MMU dependencies
> at all, can be accessed with four lines of assembly, and compare it to
> EHCI debug port, the driver for which is over 900 lines in the Linux
> kernel -- and that assumes that you're already in flat mode.
Heh, you're so old and crufty!
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:34 ` Jesse Barnes
@ 2010-05-14 23:39 ` H. Peter Anvin
2010-05-15 0:00 ` Jesse Barnes
0 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-14 23:39 UTC (permalink / raw)
To: Jesse Barnes
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/14/2010 04:34 PM, Jesse Barnes wrote:
>> I'm not lamenting that fact, because my experience is that what ends up
>> replacing it is often far worse. Consider UARTs -- no MMU dependencies
>> at all, can be accessed with four lines of assembly, and compare it to
>> EHCI debug port, the driver for which is over 900 lines in the Linux
>> kernel -- and that assumes that you're already in flat mode.
>
> Heh, you're so old and crufty!
I know. Debugging is so 20th century.
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-14 23:39 ` H. Peter Anvin
@ 2010-05-15 0:00 ` Jesse Barnes
2010-05-15 0:14 ` H. Peter Anvin
0 siblings, 1 reply; 49+ messages in thread
From: Jesse Barnes @ 2010-05-15 0:00 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On Fri, 14 May 2010 16:39:59 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/14/2010 04:34 PM, Jesse Barnes wrote:
> >> I'm not lamenting that fact, because my experience is that what ends up
> >> replacing it is often far worse. Consider UARTs -- no MMU dependencies
> >> at all, can be accessed with four lines of assembly, and compare it to
> >> EHCI debug port, the driver for which is over 900 lines in the Linux
> >> kernel -- and that assumes that you're already in flat mode.
> >
> > Heh, you're so old and crufty!
>
> I know. Debugging is so 20th century.
Yeah, get with the times. Debugging on today's machines means you have
to look up register block offsets in ACPI, run some AML to setup your
debug port, and then load a microkernel onto the debug device to get
your vnc enabled debug console!
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-15 0:00 ` Jesse Barnes
@ 2010-05-15 0:14 ` H. Peter Anvin
0 siblings, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-15 0:14 UTC (permalink / raw)
To: Jesse Barnes
Cc: Mike Travis, Bjorn Helgaas, Mike Habeck, Ingo Molnar,
Thomas Gleixner, x86, Jacob Pan, Tejun Heo, LKML, Yinghai,
linux-pci@vger.kernel.org, Myron Stowe
On 05/14/2010 05:00 PM, Jesse Barnes wrote:
> On Fri, 14 May 2010 16:39:59 -0700
> "H. Peter Anvin" <hpa@zytor.com> wrote:
>
>> On 05/14/2010 04:34 PM, Jesse Barnes wrote:
>>>> I'm not lamenting that fact, because my experience is that what ends up
>>>> replacing it is often far worse. Consider UARTs -- no MMU dependencies
>>>> at all, can be accessed with four lines of assembly, and compare it to
>>>> EHCI debug port, the driver for which is over 900 lines in the Linux
>>>> kernel -- and that assumes that you're already in flat mode.
>>>
>>> Heh, you're so old and crufty!
>>
>> I know. Debugging is so 20th century.
>
> Yeah, get with the times. Debugging on today's machines means you have
> to look up register block offsets in ACPI, run some AML to setup your
> debug port, and then load a microkernel onto the debug device to get
> your vnc enabled debug console!
>
An excellent plan! Cannot fail!
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 19:38 ` Mike Habeck
2010-05-13 20:02 ` Bjorn Helgaas
@ 2010-05-13 20:36 ` Yinghai Lu
2010-05-13 20:34 ` H. Peter Anvin
1 sibling, 1 reply; 49+ messages in thread
From: Yinghai Lu @ 2010-05-13 20:36 UTC (permalink / raw)
To: Mike Habeck
Cc: Bjorn Helgaas, Mike Travis, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, x86, Jesse Barnes, Jacob Pan, Tejun Heo, LKML,
linux-pci@vger.kernel.org
On 05/13/2010 12:38 PM, Mike Habeck wrote:
>
>
> Nothing really breaks, it's more of a problem that the kernel uses
> up the rest of the I/O Space, and starts spitting out warning
> messages as it tries to assign the rest of the I/O BARs that the
> BIOS didn't assign, something like:
>
> pci 0010:03:00.0: BAR 5: can't allocate I/O resource [0x0-0x7f]
> pci 0012:05:00.0: BAR 5: can't allocate I/O resource [0x0-0x7f]
> ...
>
> And in using up all the I/O space, I think that could prevent a
> hotplug attach of a pci device requiring I/O space (although I
> believe most BIOSes pad the bridge decoders to support that).
> I'm not to familiar with how pci hotplug works on x86 so I may
> be wrong in what I just stated.
assume you have several IOHs on your system, and for hotplug support,
BIOS is supposed to balance the IO ioport range allocation
between IOHs.
mmio range should be much better, it could use mmio64
if your BIOS can not provide right _CRS for peer root buses, some
devices may get wrong allocation from kernel.
or BIOS set small BAR in the bridge, that could cause problem too.
YH
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [Patch 1/1] x86 pci: Add option to not assign BAR's if not already assigned
2010-05-13 20:36 ` Yinghai Lu
@ 2010-05-13 20:34 ` H. Peter Anvin
0 siblings, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2010-05-13 20:34 UTC (permalink / raw)
To: Yinghai Lu
Cc: Mike Habeck, Bjorn Helgaas, Mike Travis, Ingo Molnar,
Thomas Gleixner, x86, Jesse Barnes, Jacob Pan, Tejun Heo, LKML,
linux-pci@vger.kernel.org
On 05/13/2010 01:36 PM, Yinghai Lu wrote:
>
> mmio range should be much better, it could use mmio64
>
We're not talking about MMIO.
-hpa
^ permalink raw reply [flat|nested] 49+ messages in thread