* PCI<->PCI bridges, transparent resource fix
@ 2002-08-06 18:44 Benjamin Herrenschmidt
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-06 18:44 UTC (permalink / raw)
  To: Linux kernel mailing list

Sorry for those who already got this, it seems there is enough
interest/debate to discuss this here. So here we go...

----

You remember that old debate about how to handle PCI<->PCI bridges
resources that are considered "invalid", should those be transparent or
just closed, etc...

After much thinking (and experiments), I figured out that:

 - A "closed" resource could be considered "transparent" without problem.

 - The current code for setting up a transparent resource is broken in
a couple of ways. It makes assumptions about the layout of the parent
resources (0 being IO, 1 memory, 2 prefetchable memory), while this
is just not true, especially if your parent is the host bridge. It can
also end up setting up one resource as transparent and another
that is not, which can lead to an interesting mess-up of the resource tree
under some circumstances.

I have reworked the routine along those lines:

 - If the IO resource is invalid, consider it transparent by pointing
the bus resources to _all_ parent bus resources of type IO

 - If _both_ mem resources are invalid, consider it transparent by
pointing the bus resources to _all_ parent bus resources of type MEM

 - If one of the mem resources is invalid and the other valid, just
use the valid one and ignore the invalid one.

Here is a replacement for pci_read_bridge_bases() implementing that
in 2.4 (but the code should move to 2.5 without a hitch; I'm not asking
for a merge now, I'm asking for comments/suggestions/brown paper bags ;)

I've quickly tested it on a machine here for which one memory range
was incorrectly considered as transparent and it didn't do anything
bad. I would need you to verify that it still works on real transparent
bridges, or else help me figure out what I overlooked. The goal here
is to avoid having a bazillion per-bridge special cases.

Regards,
Ben.


static inline int __devinit
add_bus_resource(struct pci_bus *child, struct resource* res)
{
        int i;

        /* Find free slot */
        for(i=0; i<4; i++)
                if (child->resource[i] == NULL) {
                        child->resource[i] = res;
                        return 1;
                }
        return 0;
}

static int __devinit
setup_transparent(struct pci_bus *child, unsigned long req_flags)
{
        int i;
        int found = 0;
        
        /* Iterate parent resources for matching flags */
        for(i=0; i<4; i++) {
                struct resource* pres = child->parent->resource[i];
                
                if (pres && ((pres->flags & req_flags) != 0)) {
                        if (!add_bus_resource(child, pres)) {
                                printk(KERN_ERR "Out of resource slots "
                                       "for transparent bridge resources\n");
                                return 0;
                        }
                        found = 1;
                }
        }
        return found;
}

/*
 * The logic here is as follows:
 *
 * For each bridge base (IO, mem, mem+prefetch), if the resource appears
 * valid, it is added to the resource tree. If not, things are dealt
 * with differently for IO and mem.
 *
 * If the IO resource is considered invalid, it's marked transparent,
 * that is, all of the parent IO ranges are copied down. If the parent
 * has no IO ranges, it's considered closed: we don't provide an IO
 * resource for this bridge's children.
 *
 * If at least one of the memory resources is considered invalid, we have
 * to deal with one of these 3 cases:
 *
 *   - mem invalid, mem+prefetch invalid : This is the simplest case.
 * The bridge is either completely transparent for memory cycles or
 * completely closed. We copy down all mem resources including
 * mem+prefetch from the parent
 *
 *   - mem valid, mem+prefetch invalid : Here, we assume the bridge will
 * decode one memory region and is not transparent (the mem+prefetch one
 * is considered as closed). We don't copy any resource from the parent
 *
 *   - mem invalid, mem+prefetch valid : Does this case exist? For now, I
 * set it up as a non-transparent bridge like the above.
 *
 * An important goal here is to avoid mixing transparent and non
 * transparent resources of the same type. This messes up the resource
 * hierarchy and causes allocation failures
 */
 
void __devinit pci_read_bridge_bases(struct pci_bus *child)
{
        struct pci_dev *dev = child->self;
        u8 io_base_lo, io_limit_lo;
        u16 mem_base_lo, mem_limit_lo;
        unsigned long base, limit;
        struct resource *res;
        int i;
        int mem_transp = 0;
        
        if (!dev)               /* It's a host bus, nothing to read */
                return;

        for(i=0; i<4; i++)
                child->resource[i] = NULL;

        res = &dev->resource[PCI_BRIDGE_RESOURCES];
        pci_read_config_byte(dev, PCI_IO_BASE, &io_base_lo);
        pci_read_config_byte(dev, PCI_IO_LIMIT, &io_limit_lo);
        base = (io_base_lo & PCI_IO_RANGE_MASK) << 8;
        limit = (io_limit_lo & PCI_IO_RANGE_MASK) << 8;

        if ((io_base_lo & PCI_IO_RANGE_TYPE_MASK) == PCI_IO_RANGE_TYPE_32) {
                u16 io_base_hi, io_limit_hi;
                pci_read_config_word(dev, PCI_IO_BASE_UPPER16, &io_base_hi);
                pci_read_config_word(dev, PCI_IO_LIMIT_UPPER16, &io_limit_hi);
                base |= (unsigned long)(io_base_hi << 16);
                limit |= (unsigned long)(io_limit_hi << 16);
        }

        printk("bridge resource 0, base: %lx, limit: %lx\n", base, limit);
        if ((base || limit) && base <= limit) {
                res->flags = (io_base_lo & PCI_IO_RANGE_TYPE_MASK) |
                             IORESOURCE_IO;
                res->start = base;
                res->end = limit + 0xfff;
                res->name = child->name;
                if (!add_bus_resource(child, res))
                        printk(KERN_ERR "Out of resource slots for bridge "
                               "resource %d: closing...\n", 0);
        } else {
                if (setup_transparent(child, IORESOURCE_IO))
                        printk(KERN_ERR "Unknown bridge resource %d: "
                               "assuming transparent IO\n", 0);
                else
                        printk(KERN_ERR "Unknown bridge resource %d: "
                               "assuming closed\n", 0);
        }

        res = &dev->resource[PCI_BRIDGE_RESOURCES + 1];
        pci_read_config_word(dev, PCI_MEMORY_BASE, &mem_base_lo);
        pci_read_config_word(dev, PCI_MEMORY_LIMIT, &mem_limit_lo);
        base = (mem_base_lo & PCI_MEMORY_RANGE_MASK) << 16;
        limit = (mem_limit_lo & PCI_MEMORY_RANGE_MASK) << 16;

        printk("bridge resource 1, base: %lx, limit: %lx\n", base, limit);

        if (base && base <= limit) {
                res->flags = (mem_base_lo & PCI_MEMORY_RANGE_TYPE_MASK) |
                             IORESOURCE_MEM;
                res->start = base;
                res->end = limit + 0xfffff;
                res->name = child->name;
                if (!add_bus_resource(child, res))
                        printk(KERN_ERR "Out of resource slots for bridge "
                               "resource %d: closing...\n", 1);
        } else
                mem_transp |= 0x01;
        
        res = &dev->resource[PCI_BRIDGE_RESOURCES + 2];
        pci_read_config_word(dev, PCI_PREF_MEMORY_BASE, &mem_base_lo);
        pci_read_config_word(dev, PCI_PREF_MEMORY_LIMIT, &mem_limit_lo);
        base = (mem_base_lo & PCI_PREF_RANGE_MASK) << 16;
        limit = (mem_limit_lo & PCI_PREF_RANGE_MASK) << 16;

        if ((mem_base_lo & PCI_PREF_RANGE_TYPE_MASK) ==
            PCI_PREF_RANGE_TYPE_64) {
                u32 mem_base_hi, mem_limit_hi;
                pci_read_config_dword(dev, PCI_PREF_BASE_UPPER32,
                                      &mem_base_hi);
                pci_read_config_dword(dev, PCI_PREF_LIMIT_UPPER32,
                                      &mem_limit_hi);
#if BITS_PER_LONG == 64
                base |= ((long) mem_base_hi) << 32;
                limit |= ((long) mem_limit_hi) << 32;
#else
                if (mem_base_hi || mem_limit_hi) {
                        printk(KERN_ERR "PCI: Unable to handle 64-bit "
                               "address space for %s\n", child->name);
                        return;
                }
#endif
        }
        printk("bridge resource 2, base: %lx, limit: %lx\n", base, limit);

        if (base && base <= limit) {
                res->flags = (mem_base_lo & PCI_MEMORY_RANGE_TYPE_MASK) |
                             IORESOURCE_MEM | IORESOURCE_PREFETCH;
                res->start = base;
                res->end = limit + 0xfffff;
                res->name = child->name;
                if (!add_bus_resource(child, res))
                        printk(KERN_ERR "Out of resource slots for bridge "
                               "resource %d: closing...\n", 2);
        }
        else
                mem_transp |= 0x02;

        if (mem_transp == 0x3) {
                if (setup_transparent(child, IORESOURCE_MEM))
                        printk(KERN_ERR "Unknown bridge resource 1 & 2: "
                               "assuming transparent MEM\n");
        }
}


----------------- End of forwarded message -----------------




* Re: PCI<->PCI bridges, transparent resource fix
       [not found] <20020806192951.7E6B44829@dsl2.external.hp.com>
@ 2002-08-06 19:20 ` Benjamin Herrenschmidt
  2002-08-07  5:54   ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-06 19:20 UTC (permalink / raw)
  To: Grant Grundler, Linux kernel mailing list
  Cc: Jeff Garzik, David S. Miller, ink

>Benjamin Herrenschmidt wrote:
>> Closed means the resource isn't configured at all.
>
>ok.
>
>> >It's definitely true for PCI-PCI bridge.
>> >Host bridge support is expected to support this convention as well.
>> 
>> This is not possible in lots of cases
>
>Uhm...why not?
>Arch support can add as many fields as it wants to its own data structures.
>PCI-PCI bridge support doesn't need to see those, does it?

For PCI<->PCI, sure, but not host bridges. The problem is that this function
makes assumptions about its parent's resource layout, while this parent may
not be a PCI<->PCI bridge. We have a bunch of cases (embedded, pmac, ...)
where the layout of the host bridge doesn't match that of a PCI<->PCI
bridge.

Since the parent resources already contain the necessary information that
we represent with ordering (that is, the IORESOURCE_IO, IORESOURCE_MEM, ...
flags), it seems to me that it makes more sense to pick the parent resources
according to their flags rather than their position.

Also, when a host bridge has more than one MMIO range, which is typically
the case on pmac, if you have a transparent PCI<->PCI bridge, you really
want to show that all of these MMIO ranges are exposed to the children, thus
you want all of these resource pointers to get down to the bridge.

What my patch does is just that: don't rely on the parent's ordering, but
rather on the parent resource flags, and when, for a given type of region
(MMIO or IO), the bridge is considered transparent, then copy down _all_
resource pointers for this type of region.

The distinction between mem and prefetchable mem is, I think, irrelevant
when dealing with transparent bridges.
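
The flag-based selection described above can be tried outside the kernel. What follows is only a userspace sketch of the idea, not the patch itself: `struct res`, `struct bus`, `RES_IO`, `RES_MEM` and `inherit_by_flags()` are hypothetical stand-ins for the kernel's `struct resource`, `struct pci_bus`, `IORESOURCE_IO`, `IORESOURCE_MEM` and the patch's `setup_transparent()`.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's IORESOURCE_IO / IORESOURCE_MEM. */
#define RES_IO    0x100UL
#define RES_MEM   0x200UL
#define BUS_SLOTS 4

struct res {
        unsigned long start, end, flags;
};

struct bus {
        struct res *resource[BUS_SLOTS];
};

/*
 * Point free child slots at every parent resource whose flags match
 * 'type', wherever that resource sits in the parent's array.
 * Returns the number of pointers copied, or -1 if the child has no
 * free slot left.
 */
static int inherit_by_flags(struct bus *child, const struct bus *parent,
                            unsigned long type)
{
        int copied = 0;

        for (int i = 0; i < BUS_SLOTS; i++) {
                struct res *pres = parent->resource[i];
                int slot;

                if (pres == NULL || (pres->flags & type) == 0)
                        continue;
                /* Find the first free slot in the child. */
                for (slot = 0; slot < BUS_SLOTS; slot++)
                        if (child->resource[slot] == NULL)
                                break;
                if (slot == BUS_SLOTS)
                        return -1;
                child->resource[slot] = pres;
                copied++;
        }
        return copied;
}
```

With a host bus laid out as { MMIO, IO, MMIO, empty }, a position-based copy of slot 1 as "memory" would pick up the IO range, while the flag-based walk hands both MMIO ranges to the child regardless of where they sit.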


>> >If the host bridge wants to support additional resources, I'm
>> >sure that's possible and does not belong in the generic support.
>> 
>> Well, I see no other place than this function making this assumption,
>> so I'd rather fix the function.
>
>Maybe you haven't looked at other arches yet?

I know archs may call other functions for actually setting up the
bridges, I actually didn't look at that closely. This is the main
reason why I post that patch for discussion and not as something
to be included ;)

>parisc definitely assumes how resources 0,1,2 are used.
>I'm open to getting rid of that assumption but want to understand why.

Ok, so my answer to the "why" is to not make this assumption into
generic code, to let that code deal properly with the cases where
the parent of the PCI<->PCI bridge doesn't quite match this assumption.

Basically, the information that the ordering gives us (basically if
it's an IO, MEM or prefetchable MEM region) is already present in
the resource flags structure. Let's use that.

>> On OpenFirmware based machines, we
>> setup the host resources as they come in from the firmware. For All of
>> the PPC machines (a bunch of them), this convention is definitely not
>> respected.
>
>From another angle, I could call that a bug. PARISC also gets
>the IO and MMIO routing information from host firmware.

Maybe, though I don't remember seeing explicitly that there is a
requirement for host bridge resources ;)

>> Finally, on Pmacs, I had to deal with hosts exposing 2
>> separate MMIO windows.
>
>You are clearly talking about (CPU) architecture specific code.
>PCI-PCI bridges can only forward 3 ranges.

Again, I'm not arguing about what the PCI<->PCI bridge does here.
I'm concerned about the assumptions made by that code about what
the _parent_ of the PCI<->PCI bridge looks like, ie the host bridge.

>> Honestly, I don't see why we would impose such a convention. It's
>> not necessary and the code can be easily fixed.
>
>It's not necessary but it's simple and easy to understand.
>I guess I still don't understand why it needs to be fixed.

I'm having some trouble with the current code, because of the
host bridge being set up with a different layout for one, and
because the current code mixes transparent and non-transparent
regions when only one of the 2 MMIO regions is configured, causing
some interesting conflicts to happen when I have several neighbour
bridges and devices below them.

I'm afraid I may not have been clear here. There are 2 different
issues I'm dealing with in that patch: One is that I _think_ it
doesn't make sense to have a "half transparent" bridge (that is
transparent for MMIO but not prefetch MMIO or the opposite), so
if one of the 2 regions is set, don't assume we deal with a
transparent bridge. The other one is that when the bridge is
assumed transparent, copy down all the resource pointers of the
parent matching that resource type instead of relying on the
parent ordering, so if the parent (host) exposes 2 distinct
MMIO regions, then the transparent bridge will properly expose
the fact that it's actually forwarding MMIO transactions for those
2 regions.

>> Currently, that mean we are hard-coding yet another x86 crap in
>> the core kernel code :(
>
>Wrong. It's PCI specific. PCI-PCI Bridge spec spells out how
>the resources look and what's available.

Right, I must have been a bit nervous when writing the above
sentence, sorry about that =P

>...
>> >The two mem resources (1 & 2) are not equivalent. They are not
>> >exchangeable.
>> >If BIOS or Host bridge support isn't getting that right, that's a bug.
>> >We need a workaround in the Host bridge support.
>> 
>> That's not what I meant. I know they aren't. But if one of them is
>> closed, it makes no sense to configure the pci<->pci bridge as fully
>> transparent.
>
>By chance, is this comment confusing?
>	/*
>	 * Ugh. We don't know enough about this bridge. Just assume
>	 * that it's entirely transparent.
>	 */
>
>"entirely" just refers to that one range.

Ok. That's a matter of choice. If a bridge has an MMIO region well
defined (typically in the case I have to deal with, the normal one)
and the other "closed" (in my case the prefetch one), should the
bridge be considered as transparent or should we only forward the
defined MMIO region ?

If we decide it's fully transparent, we take the risk of having
the kernel assign children resources that won't actually be
forwarded by the bridge.

>Can you show me a messed up resource tree?
>I think I'd understand better what problem you are trying to solve.

I don't have the offending box at hand right now, I'll find something
tomorrow. I hope I've been more clear this time anyway ;)

Regards,
Ben.




* Re: PCI<->PCI bridges, transparent resource fix
       [not found] <20020807042402.A4840@jurassic.park.msu.ru>
@ 2002-08-06 20:31 ` Benjamin Herrenschmidt
  2002-08-07 16:03   ` Ivan Kokshaysky
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-06 20:31 UTC (permalink / raw)
  To: Ivan Kokshaysky, Grant Grundler, Linux kernel mailing list
  Cc: Jeff Garzik, David S. Miller

>Not true. Closed means closed by firmware - in the terms of the
>PCI-PCI bridge, this means that window base > limit. There are _only_
>two reasons for "closed" state:
>1. There is no io or memory behind the bridge, which is perfectly valid.
>   This bridge (and its respective bus) resource must be ignored (flags = 0).
>2. BIOS bug. There _are_ io/mem resources behind the bridge, but any access
>   to these resources will cause machine checks, lockups etc., depends on
>   platform.
>In either case "assuming transparent" is totally wrong if the bridge
>is normal (positive) decoder.
>
>Not configured (i.e. after reset) state for vast majority of bridges
>is base == limit == 0, which means minimal io (4K) or mem (1M)
>window enabled at address 0. Obviously, this is not acceptable for
>most architectures.

Ok, right, I was indeed talking about ranges that have been disabled
by the firmware.
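
For reference, the window states Ivan lists can be written down as a small classifier. This is a userspace sketch, not kernel code; `window_state`, `classify_io_window()` and `classify_mem_window()` are hypothetical names. It assumes the standard PCI-PCI bridge register layout (bits 7:4 of the 8-bit I/O registers give address bits 15:12, bits 15:4 of the 16-bit memory registers give address bits 31:20) and ignores the optional upper-16/upper-32 registers.

```c
#include <stdint.h>

enum window_state {
        WINDOW_CLOSED,          /* base > limit: disabled by firmware */
        WINDOW_RESET_DEFAULT,   /* base == limit == 0: minimal window at 0 */
        WINDOW_OPEN             /* any other setting with base <= limit */
};

struct window {
        uint32_t start;
        uint32_t end;           /* inclusive; not meaningful when closed */
        enum window_state state;
};

static struct window classify(uint32_t base, uint32_t limit,
                              uint32_t granule_mask)
{
        /* The low bits of the limit are implied to be all ones. */
        struct window w = { base, limit | granule_mask, WINDOW_OPEN };

        if (base > limit)
                w.state = WINDOW_CLOSED;
        else if (base == 0 && limit == 0)
                w.state = WINDOW_RESET_DEFAULT;
        return w;
}

/* 8-bit I/O base/limit registers, 4K granularity. */
static struct window classify_io_window(uint8_t base_reg, uint8_t limit_reg)
{
        return classify((uint32_t)(base_reg & 0xf0) << 8,
                        (uint32_t)(limit_reg & 0xf0) << 8, 0xfff);
}

/* 16-bit memory base/limit registers, 1M granularity. */
static struct window classify_mem_window(uint16_t base_reg, uint16_t limit_reg)
{
        return classify((uint32_t)(base_reg & 0xfff0) << 16,
                        (uint32_t)(limit_reg & 0xfff0) << 16, 0xfffff);
}
```

The reset state (both registers zero) decodes to 0x0..0xfff for I/O and 0x0..0xfffff for memory, which is the "minimal window enabled at address 0" case from the quoted text.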

>I'm a bit puzzled. How do these windows look from a) CPU b) PCI bus
>point of view?  And what was a reason for having more than one
>MMIO window per PCI controller?

Well, it's definitely not that specific since I know several host
bridges that have several programmable windows to PCI, and those
aren't PPC specific. In the case of pmacs, the UniNorth bridge
forwards up to 7 regions of 256MB (0x80000000..0x8fffffff,
0x90000000..0x9fffffff, etc... up to 0xe0000000..0xefffffff),
each of them being selected by a bit set up by the firmware.
It then has 16 additional regions of the form 0xfx000000 that can
also be individually selected, some of them being reserved for
IO and config space, but one of them being typically an additional
memory window to the PCI bus.
>> 
>> You are clearly talking about (CPU) architecture specific code.
>> PCI-PCI bridges can only forward 3 ranges.
>
>Surely.
>
>> > Currently, that mean we are hard-coding yet another x86 crap in
>> > the core kernel code :(
>
>PPC seems to be the only arch with such a weird host bridge
>design, so I'd talk about PPC rather than x86 crap ;-)

Yes, well ... ;) 
>
>> By chance, is this comment confusing?
>> 	/*
>> 	 * Ugh. We don't know enough about this bridge. Just assume
>> 	 * that it's entirely transparent.
>> 	 */
>> 
>> "entirely" just refers to that one range.
>
>The comment is _extremely_ confusing. P2P bridge specs are quite clear:
>the bridge can support the subtractive decoding (i.e. "transparent")
>mode, and the way how to place it in this mode is indeed not
>standardized. But, the subtractive decoding bridge _MUST_ have bit 0
>in the ProgIf set to 1. For some reasons, I don't believe that there are
>any exceptions. And even if they are, we can easily handle them with
>"quirks".

Ok, here we need Linus' answer. We did have a patch in the PPC tree
that was considering closed resources as really closed instead of
transparent, and we were told by Linus that there were non-standard
bridges and various issues in the x86 world with that, and that those
would remain transparent. I don't have pointers at hand, but I think
this was discussed on lkml several months ago.

I would _love_ being able to simplify my code even more by assuming
that the resources are closed (what I really need in most of my
cases), or the bridge fully transparent, according to such a bit.
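
The bit in question is cheap to test. A minimal sketch, assuming the usual 24-bit class code layout (base class, sub-class, programming interface) with PCI-PCI bridges at base class 0x06, sub-class 0x04; `bridge_is_subtractive()` and `CLASS_BRIDGE_PCI` are hypothetical names, not existing kernel symbols:

```c
#include <stdbool.h>
#include <stdint.h>

#define CLASS_BRIDGE_PCI 0x0604u   /* base class 0x06, sub-class 0x04 */

/*
 * class_code is the 24-bit value from config space: base class in
 * bits 23:16, sub-class in bits 15:8, programming interface in bits 7:0.
 */
static bool bridge_is_subtractive(uint32_t class_code)
{
        if (((class_code >> 8) & 0xffff) != CLASS_BRIDGE_PCI)
                return false;                   /* not a PCI-PCI bridge */
        return (class_code & 0x01) != 0;        /* ProgIf bit 0 */
}
```

A bridge reporting 0x060401 would be treated as transparent and one reporting 0x060400 as a normal positive decoder. Bridges that decode subtractively without setting the bit would still need a quirk, as Ivan notes.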

>> Earlier Benjamin wrote:
>> | - The current code for setting up a transparent resource is broken in
>> | a couple of ways. It makes assumptions about the layout of the parent
>> | resources (0 being IO, 1 memory, 2 prefetchable memory), while this
>> | is just not true, especially if your parent is the host bridge.
>
>Your parent is not a bridge - it's a pci_bus, and this resource layout
>is pci_bus specific. Note that it's _pointers_ to resources, either to
>standard PCI-PCI bridge resources, or arch specific resources in the case
>of the root bus.
>The "transparency" code is broken by no means though.
>I'm keen to fix that, but at this moment I've ~100K of pending
>alpha patches which are of higher priority for me... :-\

Well, I do have a case of conflict here where a device below a bridge
can't get its resources allocated because it's considered as conflicting
with the bridge itself. The problem disappears if I set the bridge
as closed (though I get another problem where the IO region is not
enabled by the firmware, but that's another matter, I can quirk here).

In most cases, I have to trust the firmware setup as a critical ASIC
(the "mac-io" ASIC) tends to be behind a PCI<->PCI bridge and I cannot
afford to have access to it disabled or moved at any time.

This is btw a problem with the current kernel code as the PCI probe
code will disable IO/MEM forwarding during probe on the bridge. That
means there is a small window of time during which, if I take an
interrupt for example, I'll be dead because my interrupt controller
(which is part of that "mac-io" ASIC) will be unreachable. But this
is a different issue.

>Basically, I suggest something like this for people trusting their
>firmware setup and therefore using pci_read_bridge_bases():
>1. Check bit 0 of the P2P bridge classcode/progif; if set, assume
>   transparent - set all 3 resource pointers to those of the parent
>   bus; ignore windows settings.

I'd set all 4 then, thus the bridge would really be seen as forwarding
all the regions of the host bridge, whatever they are.

Since pci_read_bridge_bases() is called by the arch code, I can implement
a version doing what you suggest in the arch code and use that. I'll
try and let you know.

>2. If cleared, check base/limit settings of windows 0,1,2. If they seem ok,
>   i.e base != 0 && base < limit, allocate these resources properly in
>   the resource tree.
>3. At the end of arch specific PCI setup, call either
>   pci_assign_unassigned_resources, or more fine-grained helpers from
>   pci/setup_bus.c. These routines will take care of 
>   a) unallocated P2P bridge resources;
>   b) regular resources, unassigned due to resource conflicts and other
>      BIOS bugs.

That makes sense. This would also allow us to spot that there is no device
below a PCI<->PCI bridge when doing that assignment of unassigned
resources, and thus to spot that the firmware may have indeed been
right to close them and not to bother.

>The above was one of the purposes of the new 2.5 PCI allocation code,
>although some (relatively small) changes are still needed...

Ok, I'll look into implementing that in the PPC arch code for now

Thanks for your comments,
Ben.




* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-07  5:54   ` Grant Grundler
@ 2002-08-06 21:02     ` Benjamin Herrenschmidt
  2002-08-07 18:30       ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-06 21:02 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Linux kernel mailing list, Jeff Garzik, David S. Miller, ink

>> Also, when a host bridge has more than one MMIO ranges, which is typically
>> the case on pmac, if you have a transparent PCI<->PCI bridge, you really
>> want to show that all of these MMIO ranges are exposed to the childs, thus
>> you want all of these resource pointers to get down to the bridge.
>
>This is wrong. PCI-PCI bridges can only forward one IO range, one MMIO
>range, and one MMIO-Prefetchable range. ("range" == address window).
>Any ranges in addition to that means it's not a PCI-PCI bridge.

Well, we are talking about transparent (subtractive decoding) bridges here,
right? Those don't care about ranges.
If my host bridge exposes one MMIO range at 0x90000000..0x9fffffff and one
at 0xf2000000..0xf2ffffff, a transparent bridge below that host bridge will
actually forward both of these ranges, right?

>By explicitly setting resource[1] of the parent to the MMIO range,
>arch specific code *knows* which MMIO range the PCI-PCI bridge
>will forward.
>
>Are you trying to address the following kind of problem?
>	o Host Bridge 00 forwards MMIO 0xf1000000-0xf1100000 and
>		MMIO 0xf1800000-0xf1900000.
>	o PCI-PCI Bridge 00:01.0 forwards MMIO 0xf1000000-0xf1100000
>	o PCI-PCI Bridge 00:02.0 forwards MMIO 0xf1800000-0xf1900000
>
>I'm hoping you'll have real data tomorrow for the problem machine.
>But a yes/no/"don't know" answer would be sufficient.

A typical setup found on mac is indeed to have the bridge forward
2 regions, but then, you may have a transparent bridge hanging
on the bus. I'm not talking about changing the behaviour of a bridge
that defines its forwarding MMIO region, I'm talking about
not copying parent resources blindly for _transparent_ bridges,
but instead doing it based on the flags.

The problematic machine I have at hand here is a bit different.
It does have 2 exposed MMIO regions, but doesn't have a problem
with transparent bridges because of that (which isn't the case
for other pmac setups).

It has the host exposing 2 MMIO regions:

80000000-9fffffff
and
f3000000-f3ffffff

On that bus, it has 2 PCI-PCI bridges setup by the firmware
in an "interesting" way:

The first one has
bridge resource 0, base: 1000, limit: 0
bridge resource 1, base: 80000000, limit: 80000000
bridge resource 2, base: 80000000, limit: 7ff00000

The second has
bridge resource 0, base: 1000, limit: 0
bridge resource 1, base: 80100000, limit: 8ff00000
bridge resource 2, base: 80100000, limit: 80000000

As you can see, for both of them, the firmware only enabled
one decoded MMIO region, and closed the IO and the prefetchable MEM
region. 

The current code would consider the prefetchable MEM region as
transparent, which seems plain wrong in this case. Thus my idea, which
was to change the routine to consider the bridge as either fully
transparent for MMIO if _both_ MMIO windows are closed, or not at all
(thus only using the MMIO window that is open; leaving the resource
for the other one closed shouldn't harm anybody).
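
To make the numbers concrete, here are the validity tests from the proposed pci_read_bridge_bases(), lifted into standalone helpers so the values printed above can be run through them. The conditions are copied from the patch; the helper names are hypothetical.

```c
#include <stdbool.h>

/* I/O test from the patch: reject 0/0 and any base above the limit. */
static bool io_window_valid(unsigned long base, unsigned long limit)
{
        return (base || limit) && base <= limit;
}

/* Memory test from the patch: base must be non-zero and <= limit. */
static bool mem_window_valid(unsigned long base, unsigned long limit)
{
        return base && base <= limit;
}

/* MMIO is treated as transparent only when both windows are invalid. */
static bool mmio_transparent(bool mem_valid, bool pref_valid)
{
        return !mem_valid && !pref_valid;
}
```

For both bridges listed above, the I/O window (base 1000, limit 0) and the prefetchable window fail the test while the plain memory window passes, so under this rule neither bridge is treated as transparent for MMIO.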

Now, Ivan claims we shouldn't do that, but we should look for the
bit that states if a bridge is fully transparent, use that, and if
not, keep the window closed at this point and eventually assign new
ones later on. That would be the best mechanism, I agree, but
that's also, iirc, what Linus didn't want because of issues with
non-standard bridges. (I have never seen such a bridge here, though
it may be worth digging in the list archives. I'm about to leave
for a few days without a good internet connection though, so I won't be
able to do that research before I'm back).

>...
>> The distinction between mem and prefetchable mem is, I think, irrelevant
>> when dealing with transparent bridges.
>
>Sorry - my gut feeling is it matters.  I need something better
>than "I think" before I can agree with that statement.
>For IO, cacheable vs un-cacheable addresses are worlds apart.

Well, please look at what a transparent bridge does at the HW level.
How would it matter if the bridge is doing subtractive decoding?

>> >Maybe you haven't looked at other arches yet?
>> 
>> I know archs may call other functions for actually setting up the
>> bridges, I actually didn't look at that closely. This is the main
>> reason why I post that patch for discussion and not as something
>> to be included ;)
>
>ok. When implementing parisc PCI support, I looked a lot at alpha,
>sparc64, and x86 code to understand how the pieces fit together.
>parisc introduced some new issues none of the above had to solve.
>Ivan's PCI code changes from 2.5.x took those problems into consideration
>and I helped test. Sounds like we need to iterate once the real
>problem is clear. You might consider looking at 2.5.30 before
>getting too hung up in a fix for 2.4.
>
>...
>> Basically, the information that the ordering gives us (basically if
>> it's an IO, MEM or prefetchable MEM region) is already present in
>> the resource flags structure. Let's use that.
>
>I'm sorry; I've still not understood the problem you are trying to fix
>by removing this assumption.

The problem fixed by removing the above assumption is when you have
a host bridge whose resource layout is different from that of a
PCI<->PCI bridge, and then stuff a transparent PCI<->PCI bridge
below it.

But as Ivan implies in his other email, I feel the whole point is
that the bridge is either fully transparent or not transparent at all.
The current way of picking "some" bridge resources as transparent
when they are actually closed seems wrong. In the case of a fully
transparent bridge, just copying down pointers to all the 4 parent
resources would work just fine for me.

>> >From another angle, I could call that a bug. PARISC also gets
>> >the IO and MMIO routing information from host firmware.
>> 
>> Maybe, though I don't remember seeing explicitely that there is a
>> requirement for host bridge resources ;)
>
>Lots of things required of the arch support aren't explicitly specified.
>Only a handful (or two) people on this planet ever muck with arch PCI
>support and no one has felt it was worth writing a HOW-TO.
>
>If you write up an initial draft for linux/Documentation/pci-bios.txt,
>I'll review and comment on it.

Heh, well, I'm afraid I may not have time for that now, and I carefully
avoid some of the matters with legacy IO address aliasing and other
similar cruft (that doesn't happen in practice on PPC machines) so I
wouldn't be able to document that.

>...
>> >It's not necessary but it's simple and easy to understand.
>> >I guess I still don't understand why it needs to be fixed.
>> 
>> I'm having some troubles with the current code, because of the
>> host bridge being set up with a different layout for one,
>
>This sounds like something you can fix in the arch specific code.

Sure. I wanted to raise the issue as the generic code seemed wrong
to me at that point. Of course I can (and will have to) fix that
in the arch code for now (that is for 2.4).

>> and
>> because the current code mixing transparent and non-transparent
>> regions
>
>regions aren't "transparent" - it's the bridge that's transparent.

Yup, here we agree. So why does the code in there do the
transparency thing on a per-region basis?

>> when only one of the 2 MMIO regions is configured, causing
>> some interesting conflicts to happen when I have several neighbour
>> bridges and devices below them.
>
>If different MMIO regions route to different sibling PCI-PCI bridges
>(both P-P bridges are children of the same parent), that is a problem
>we can't solve in the generic code.
>I suspect you'll have to add some cruft to your pcibios_fixup_bus()
>to handle this properly.

Why not just consider that a PCI<->PCI bridge with one window open
and the other closed is just that... that is, a PCI<->PCI bridge
forwarding 1 window, period. Currently, we make it a resource for
the window it forwards, then copy the other resource from the host
when we find it closed, which makes it "half transparent". That's
one of the things I'm trying to fix.

>> I'm afraid I may not have been clear here. There are 2 different
>> issues I'm dealing with that patch: One is that I _think_ it
>> doesn't make sense to have a "half transparent" bridge (that is
>> transparent for MMIO but not prefetch MMIO or the opposite), so
>> if one of the 2 regions is set, don't assume we deal with a
>> transparent bridge.
>
>pci_read_bridge_bases() handles each resource separately.
>I prefer Ivan's suggestions on dealing with transparent bridges.

I do too. Though given the kernel's resource management
mechanism, I believe it's quite safe to incorrectly consider a
bridge with all resources closed as being 'transparent',
as such a bridge typically has no devices below it (which is
why the firmware closed it). That would also make Linus happy
with his problem of considering bridges we don't fully understand
as transparent (if I can ever find his old email again...)

So that would give us a pci_read_bridge_bases() that does what
it does today, except for the infamous "else { consider
transparent }" case. Instead, just add bits to a mask.

Then, at the end of the function, if that mask indicates all
resources were closed, consider the bridge transparent
and copy all of the parent resources.
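
A minimal sketch of the mask idea, assuming simplified stand-in types rather than the kernel's struct pci_bus and struct resource. The names toy_resource, toy_bus, TOY_ALL_CLOSED and toy_read_bridge_bases are hypothetical, and a window is treated as "closed" simply when its base is above its limit:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct toy_resource {
	unsigned long start, end;
	int valid;			/* 0 means the window is closed */
};

struct toy_bus {
	struct toy_bus *parent;
	struct toy_resource *resource[3];	/* 0: IO, 1: MEM, 2: prefetchable MEM */
	struct toy_resource window[3];		/* storage for this bridge's own windows */
};

#define TOY_ALL_CLOSED 0x7u	/* one bit per window */

/*
 * base[i] > limit[i] stands for "window i was closed by the firmware".
 * Returns 1 if the bridge ended up being treated as transparent, 0 otherwise.
 */
static int toy_read_bridge_bases(struct toy_bus *child,
				 const unsigned long base[3],
				 const unsigned long limit[3])
{
	unsigned int closed = 0;
	int i;

	for (i = 0; i < 3; i++) {
		struct toy_resource *res = &child->window[i];

		child->resource[i] = res;
		if (base[i] <= limit[i]) {
			res->start = base[i];
			res->end = limit[i];
			res->valid = 1;
		} else {
			/* Closed window: record it instead of copying the parent. */
			res->valid = 0;
			closed |= 1u << i;
		}
	}

	if (closed != TOY_ALL_CLOSED)
		return 0;	/* at least one real window: not transparent */

	/* Every window is closed: point at the parent's resources. */
	for (i = 0; i < 3; i++)
		child->resource[i] = child->parent->resource[i];
	return 1;
}
```

With this shape, a bridge that has only its MEM window open keeps that one window and never picks up a copy of the parent's prefetchable range.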

What my previously proposed code does is to do it on a per resource
type basis (thus allowing transparent PIO and non-transparent
MMIO), but that may actually not be possible in real life (at
least not following the PCI<->PCI spec).

>
>> The other one is that when the bridge is
>> assumed transparent, copy down all the resource pointers of the
>> parent matching that resource type instead of relying on the
>> parent ordering, so if the parent (host) exposes 2 distinct
>> MMIO regions, then the transparent bridge will properly expose
>> the fact that it's actually forwarding MMIO transactions for those
>> 2 regions.
>
>The host can forward as many MMIO regions as it is capable of.
>The PCI-PCI bridge can only forward one (of each type).
>I'll keep telling you that until you get it.

I won't get it for a transparent bridge, sorry ;)

A transparent bridge does subtractive decoding. It will forward
a cycle to _any_ address that has not been claimed by another
device on the same segment. Thus, it will forward all of the
regions exposed by the host.
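
To make the decoding rule described above concrete, here is a minimal sketch in C. It models a bus segment as a list of positive decoders and lets a single subtractive bridge pick up whatever nobody claimed. The names toy_decoder, toy_route and TOY_SUBTRACTIVE_BRIDGE are made up for illustration, and real hardware does this with bus signalling rather than a table lookup:

```c
#include <stddef.h>

/* One positive decoder on the segment: it claims [start, end] inclusive. */
struct toy_decoder {
	unsigned long start, end;
	int id;
};

#define TOY_SUBTRACTIVE_BRIDGE (-1)

/*
 * Return the id of the positive decoder that claims addr. If no decoder
 * claims it, the cycle falls through to the subtractive bridge.
 */
static int toy_route(const struct toy_decoder *dec, size_t n,
		     unsigned long addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= dec[i].start && addr <= dec[i].end)
			return dec[i].id;
	return TOY_SUBTRACTIVE_BRIDGE;
}
```

In this model the subtractive bridge has no window of its own, which is why any address the host exposes and nobody else claims ends up behind it.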

>...
>> If a bridge has an MMIO region well
>> defined (typically in the case I have to deal with, the normal one)
>> and the other "closed" (in my case the prefetch one), should the
>> bridge be considered as transparent or should we only forward the
>> defined MMIO region ?
>
>Look at setup-bus.c again in 2.4.18. Each type of range is
>handled separately. It should only forward the MMIO region.
>If the prefetchable base and limit are not set correctly to indicate
>the range is "closed", it's a BIOS/Firmware bug which your arch support
>needs to work around.

They _are_ setup correctly !

But pci_read_bridge_bases will incorrectly assume a "closed" region is
transparent.
Thus we end up with both an explicit mem region and a copy of the parent
region, a "half transparent" bridge, which makes no sense.

>
>> If we decide it's fully transparent, we take the risk of having
>> the kernel assign resource for childs that won't actually be
>> forwarded by the bridge.
>
>yup.
>
>...
>> I don't have the offending box at hand right now, I'll find something
>> tomorrow. I hope I've been more clear this time anyway ;)
>
>Please do. Getting closer to the core problem at least. ;^)

There are several problems mixed here, which makes it a tad difficult ;)

One of them is the host having several MMIO regions and a transparent bridge,
one of them being the host not respecting the ordering of a PCI<->PCI
bridge (thus breaking the current transparent code in some cases), and
one of them being the current code considering bridges with only one of
the MMIO regions configured as "half transparent".

Ben.



* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-06 19:20 ` PCI<->PCI bridges, transparent resource fix Benjamin Herrenschmidt
@ 2002-08-07  5:54   ` Grant Grundler
  2002-08-06 21:02     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2002-08-07  5:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Linux kernel mailing list, Jeff Garzik, David S. Miller, ink

Benjamin Herrenschmidt wrote:
...
> Since the parent resources already contain the necessary information that
> we represent with ordering (that is the IORESOURCE_IO, IORESOURCE_MEM, ...
> flags), it seems to me that it makes more sense to pick the parent resources
> according to their flags rather than their position.

It does not make more sense.
The three fields are specified by PCI-PCI Bridge Specification.

> Also, when a host bridge has more than one MMIO range, which is typically
> the case on pmac, if you have a transparent PCI<->PCI bridge, you really
> want to show that all of these MMIO ranges are exposed to the children, thus
> you want all of these resource pointers to get down to the bridge.

This is wrong. PCI-PCI bridges can only forward one IO range, one MMIO
range, and one MMIO-Prefetchable range. ("range" == address window).
Any ranges in addition to that means it's not a PCI-PCI bridge.

By explicitly setting resource[1] of the parent to the MMIO range,
arch specific code *knows* which MMIO range the PCI-PCI bridge
will forward.

Are you trying to address the following kind of problem?
	o Host Bridge 00 forwards MMIO 0xf1000000-0xf1100000 and
		MMIO 0xf1800000-0xf1900000.
	o PCI-PCI Bridge 00:01.0 forwards MMIO 0xf1000000-0xf1100000
	o PCI-PCI Bridge 00:02.0 forwards MMIO 0xf1800000-0xf1900000

I'm hoping you'll have real data tomorrow for the problem machine.
But a yes/no/"don't know" answer would be sufficient.

...
> The distinction between mem and prefetchable mem is, I think, irrelevant
> when dealing with transparent bridges.

Sorry - my gut feeling is it matters.  I need something better
than "I think" before I can agree with that statement.
For IO, cacheable vs un-cacheable addresses are worlds apart.

> >Maybe you haven't looked at other arches yet?
> 
> I know archs may call other functions for actually setting up the
> bridges, I actually didn't look at that closely. This is the main
> reason why I post that patch for discussion and not as something
> to be included ;)

ok. When implementing parisc PCI support, I looked a lot at alpha,
sparc64, and x86 code to understand how the pieces fit together.
parisc introduced some new issues none of the above had to solve.
Ivan's PCI code changes from 2.5.x took those problems into consideration
and I helped test. Sounds like we need to iterate once the real
problem is clear. You might consider looking at 2.5.30 before
getting too hung up in a fix for 2.4.

...
> Basically, the information that the ordering gives us (basically if
> it's an IO, MEM or prefetchable MEM region) is already present in
> the resource flags structure. Let's use that.

I'm sorry; I've still not understood the problem you are trying to fix
by removing this assumption.

> >From another angle, I could call that a bug. PARISC also gets
> >the IO and MMIO routing information from host firmware.
> 
> Maybe, though I don't remember seeing explicitly that there is a
> requirement for host bridge resources ;)

Lots of things required of the arch support aren't explicitly specified.
Only a handful (or two) people on this planet ever muck with arch PCI
support and no one has felt it was worth writing a HOW-TO.

If you write up an initial draft for linux/Documentation/pci-bios.txt,
I'll review and comment on it.

...
> >It's not necessary but it's simple and easy to understand.
> >I guess I still don't understand why it needs to be fixed.
> 
> I'm having some trouble with the current code, because of the
> host bridge being set up with a different layout for one,

This sounds like something you can fix in the arch specific code.

> and
> because the current code mixing transparent and non-transparent
> regions

regions aren't "transparent" - it's the bridge that's transparent.

> when only one of the 2 MMIO regions is configured, causing
> some interesting conflicts to happen when I have several neighbour
> bridges and devices below them.

If different MMIO regions route to different sibling PCI-PCI bridges
(both P-P bridges are children of the same parent), that is a problem
we can't solve in the generic code.
I suspect you'll have to add some cruft to your pcibios_fixup_bus()
to handle this properly.

> I'm afraid I may not have been clear here. There are 2 different
> issues I'm dealing with in that patch: One is that I _think_ it
> doesn't make sense to have a "half transparent" bridge (that is
> transparent for MMIO but not prefetch MMIO or the opposite), so
> if one of the 2 regions is set, don't assume we deal with a
> transparent bridge.

pci_read_bridge_bases() handles each resource separately.
I prefer Ivan's suggestions on dealing with transparent bridges.


> The other one is that when the bridge is
> assumed transparent, copy down all the resource pointers of the
> parent matching that resource type instead of relying on the
> parent ordering, so if the parent (host) exposes 2 distinct
> MMIO regions, then the transparent bridge will properly expose
> the fact that it's actually forwarding MMIO transactions for those
> 2 regions.

The host can forward as many MMIO regions as it is capable of.
The PCI-PCI bridge can only forward one (of each type).
I'll keep telling you that until you get it.

...
> If a bridge has an MMIO region well
> defined (typically in the case I have to deal with, the normal one)
> and the other "closed" (in my case the prefetch one), should the
> bridge be considered as transparent or should we only forward the
> defined MMIO region ?

Look at setup-bus.c again in 2.4.18. Each type of range is
handled separately. It should only forward the MMIO region.
If the prefetchable base and limit are not set correctly to indicate
the range is "closed", it's a BIOS/Firmware bug which your arch support
needs to work around.

> If we decide it's fully transparent, we take the risk of having
> the kernel assign resource for childs that won't actually be
> forwarded by the bridge.

yup.

...
> I don't have the offending box at hand right now, I'll find something
> tomorrow. I hope I've been more clear this time anyway ;)

Please do. Getting closer to the core problem at least. ;^)

thanks,
grant


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-06 20:31 ` Benjamin Herrenschmidt
@ 2002-08-07 16:03   ` Ivan Kokshaysky
  2002-08-08  8:20     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Ivan Kokshaysky @ 2002-08-07 16:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

On Tue, Aug 06, 2002 at 10:31:34PM +0200, Benjamin Herrenschmidt wrote:
> Well, it's definitely not that specific since I know several host
> bridges that have several programmable windows to PCI, and those
> aren't PPC specific. In the case of pmacs, the UniNorth bridge
> forwards up to 7 regions of 256MB (0x80000000..0x8fffffff,
> 0x90000000..0x9fffffff, etc... up to 0xe0000000..0xefffffff),
> each of them being selected by a bit set up by the firmware.
> It then has an additional 16 regions of type 0xfx000000 that can
> also be individually selected, some of them being reserved for
> IO and config space, but one of them being typically an additional
> memory window to the PCI bus.

Ok. Assume that additional window is 0xf2000000-0xf2ffffff.
I'd try the following:
- set the _single_ memory resource of the root bus to 0x80000000-0xf2ffffff;
- create dummy memory type resource 0xf0000000-0xf1ffffff and "claim" it
  on the root bus. This will prevent all further allocations in the
  gap between two MMIO windows.
I think it should seriously simplify the things.
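
A rough sketch of why claiming a dummy resource over the gap works, using a toy first-fit allocator instead of the kernel's resource tree. The names range and toy_alloc are hypothetical, and the sketch assumes the busy list is sorted, non-overlapping and never ends at the top of the address space:

```c
#include <stddef.h>

struct range {
	unsigned long start, end;	/* inclusive bounds */
};

/*
 * First-fit search for `size` bytes inside `root`, skipping every range in
 * `busy` (sorted by start, non-overlapping, all inside root).
 * Returns the start address of the free block, or 0 if nothing fits.
 */
static unsigned long toy_alloc(const struct range *root,
			       const struct range *busy, size_t nbusy,
			       unsigned long size)
{
	unsigned long cursor = root->start;
	size_t i;

	for (i = 0; i < nbusy; i++) {
		/* Free space between the cursor and the next busy range? */
		if (busy[i].start > cursor && busy[i].start - cursor >= size)
			return cursor;
		if (busy[i].end >= cursor)
			cursor = busy[i].end + 1;
	}
	/* Free space between the last busy range and the end of the root? */
	if (cursor <= root->end && root->end - cursor + 1 >= size)
		return cursor;
	return 0;
}
```

With the root spanning 0x80000000-0xf2ffffff and the dummy range 0xf0000000-0xf1ffffff listed as busy, an allocation can only land in a real window. Without the dummy range, the same request could be placed in the gap that the host bridge does not forward.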

> Ok, here we need Linus' answer. We did have a patch in the PPC tree
> that was considering closed resources as really closed instead of
> transparent, and we were told by Linus that there were non-standard
> bridges and various issues in the x86 world with that, and that those
> would remain transparent. I don't have pointers at hand, but I think
> this was discussed on lkml several months ago.

I recall that. I do agree with Linus, but only about bridges with
class code 0x60401 (subtractive decoders). The details of operating in
subtractive decoding mode are beyond the scope of the P2P bridge specs,
and probably we don't want to know these details either. "Assuming
transparent" is a sufficient workaround in this case.
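
For reference, a small sketch of how a 24-bit class code such as 0x60401 splits into base class, sub-class and programming interface. The struct and function names are made up for illustration:

```c
#include <stdint.h>

/* The three bytes packed into a 24-bit PCI class code. */
struct toy_class_fields {
	uint8_t base;		/* 0x06: bridge */
	uint8_t sub;		/* 0x04: PCI-to-PCI bridge */
	uint8_t prog_if;	/* 0x01: subtractive decode */
};

static struct toy_class_fields toy_decode_class(uint32_t class24)
{
	struct toy_class_fields f;

	f.base = (class24 >> 16) & 0xff;
	f.sub = (class24 >> 8) & 0xff;
	f.prog_if = class24 & 0xff;
	return f;
}
```

Decoding 0x60401 this way gives base 0x06, sub-class 0x04 and ProgIf 0x01, while a plain positive-decode bridge reads 0x60400.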

Some additional notes. The P2P bridge specification says:
"The primary use of a subtractive decoding bridge is to connect
 a laptop system to a docking station and support legacy ISA devices
 in the docking station."

Indeed, that "transparency" code had been added to fix P2P bridge problems
on some Dell docking station (reported by Jamal) back in the 2.4.0-test
times. Unfortunately, lspci output of that machine hasn't been posted
(or I just missed that). However, I'm sure that the problematic bridge
did have ProgIf code 1, otherwise that type of machines would have
problems running Windows, as MS says:
  "A  bridge  indicates  that  it  performs  subtractive  decode  if  its
   Programming  Interface bit in the PCI Configuration Register is set to
   01h.  Not  all  PCI-to-PCI bridges support subtractive decode. Windows
   will  not  switch  a bridge from positive decode to subtractive decode
   (or  vice  versa) because there is no standard method defined for this
   action."

For those who are interested, the entire document (zipped .doc) is
http://download.microsoft.com/download/whistler/hwdev3/1.0/WXP/EN-US/pcibridge-cardbus.exe
Not in the human readable form, sorry. ;-)

> I'd set all 4 then, thus the bridge would really be seen as forwarding
> all the regions of the host bridge, whatever they are.

There are only 3, as Grant pointed out. :-)

> That makes sense. This would also allow us to spot that there is no device
> below a PCI<->PCI bridge when doing that assignment of unassigned
> resources, and thus to spot that the firmware may have indeed been
> right to close them and not to bother.

Exactly.

Ivan.


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-06 21:02     ` Benjamin Herrenschmidt
@ 2002-08-07 18:30       ` Grant Grundler
  2002-08-08 11:30         ` Ivan Kokshaysky
  0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2002-08-07 18:30 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Linux kernel mailing list, Jeff Garzik, David S. Miller, ink

Benjamin Herrenschmidt wrote:
> Well, we are talking about transparent (subtractive decoding) bridges here
> right ? Those don't care about ranges.

Of course they do. It's just the ranges are inverted.
PCI-PCI bridges do subtractive decoding for the secondary
PCI bus view of the world.

> If my host bridge exposes one MMIO range at 0x90000000..0x9fffffff and one
> at 0xf2000000..0xf2ffffff, a transparent bridge below that host bridge will
> actually forward both of these ranges, right ?

It can. But then it probably is forwarding a lot of other addresses as well.
I think the HW will be ok as long as two devices (behind different bridges)
don't respond to the same address. If the resource tree reflects what the
HW is doing, ranges should be replicated and we have to trust BIOS
(or other code) to know it shouldn't assign overlapping ranges
to devices.

...
> A typical setup found on mac is indeed to have the bridge forward
> 2 regions, but then, you may have a transparent bridge hanging
> on the bus. I'm not talking about changing the behaviour of a bridge
> that defines its forwarding MMIO region, I'm talking about
> not copying parent resources blindly for _transparent_ bridges,
> but instead do it based on the flags.

I think I understand the problem now. The resource mgt for
subtractive decoding bridges *assumes* devices below can't use
anything outside the one range assigned to the parent.
But a subtractively decoded bridge could route several ranges
per range type.

...
> As you can see, for both of them, the firmware only enable
> one decoded MMIO region, and closed the IO and the prefetchable MEM
> region. 
> 
> The current code would consider the prefetchable MEM region as
> transparent, which seems plain wrong in this case.

Agreed. 

> Now, Ivan claims we shouldn't do that, but we should look for the
> bit that states if a bridge is fully transparent, use that, and if
> not, keep the window closed at this point and eventually assign new
> ones later on. That would be the best mechanism, I agree, but
> that's also, iirc, what Linus didn't want because of issues with
> non-standard bridges.

Look on the x86 laptops. ISTR my Omnibook 800 having such a bridge.
And I think some bridges for dedicated graphics slots are "transparent".

Support for pmac seems to require knowing if a range is closed or
subtractively decoded. I suspect knowing that for x86 would be
better than the voodoo resource assignment that's going on now.

> (I have never seen such a bridge here though
> it may be worth digging in the list archives. I'm about to leave
> for a few days without good internet connexion though, I won't be
> able to do that research before I'm back).

np.

> Well, look what a transparent bridge does at the HW level please.
> How would it matter if the bridge is doing subtractive decoding ?

Right. address routing doesn't care about prefetchable attribute.


> But as Ivan implies in his other email, I feel the whole point is
> that the bridge is either fully transparent, or not transparent at all.

got it.

> The current way of picking "some" bridge resources as transparent
> when they are actually closed seems wrong. In the case of a fully
> transparent bridge, just copying down pointers to all the 4 parent
> resources would work just fine for me.

Maybe that's the right thing to do.
Send me a patch for 2.4.19 and I'll try it on the laptop.

...
> >If you write up an initial draft for linux/Documentation/pci-bios.txt,
> >I'll review and comment on it.
> 
> Heh, well, i'm afraid I may not have time for that now,

yeah...isn't that a common problem? :^/


> Then, at the end of the function, if that mask indicates all
> resources were closed, then consider the bridge transparent
> and copy all of the parent resources.

That sounds convoluted.

Ivan wrote:
| ...subtractive decoding bridge _MUST_ have bit 0 in the ProgIf set to 1.

It sounds easy to check at the top and in that case DTRT.
The "else" parts of later resource checks can go away.


> I won't get it for a transparent bridge, sorry ;)

sorry - I was still thinking conventional bridges.

> A transparent bridge does subtractive decoding. It will forward
> a cycle to _any_ address that has not been claimed by another
> device on the same segment. Thus, it will forward all of the
> regions exposed by the host

That doesn't sound quite right. I suspect subtractive decoding
means one has to blindly forward everything *outside* the range.
In that case, BIOS has to make sure only one device responds to any
address regardless of how many bridges are in the system.

What you suggest implies the bridge waits for someone else to "claim"
the transaction and I'm not convinced PCI spec would allow that.
Performance would certainly suffer if that were the case.

> There are several problems mixed here, which makes it a tad difficult ;)
> 
> One of them is the host having several MMIO regions and a transparent bridge,
> one of them being the host not respecting the ordering of a PCI<->PCI
> bridge (thus breaking the current transparent code in some cases), and
> one of them being the current code considering bridges with only one of
> the MMIO regions configured as "half transparent".

Well, I've always been told to keep patches as small as possible...maybe
you want to try again with a separate patch for each?

grant


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-07 16:03   ` Ivan Kokshaysky
@ 2002-08-08  8:20     ` Benjamin Herrenschmidt
  2002-08-08 13:21       ` Ivan Kokshaysky
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-08  8:20 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

>Ok. Assume that additional window is 0xf2000000-0xf2ffffff.
>I'd try the following:
>- set the _single_ memory resource of the root bus to 0x80000000-0xf2ffffff;
>- create dummy memory type resource 0xf0000000-0xf1ffffff and "claim" it
>  on the root bus. This will prevent all further allocations in the
>  gap between two MMIO windows.
>I think it should seriously simplify the things.

Unfortunately that wouldn't work as I actually have 3 host bridges
on these models, and the windows can be "mixed". One host can have
0x80000000 to 0x9fffffff (and one region at 0xfx000000), the next
one can have 0xa0000000 to 0xafffffff and another region at
0xfx000000, etc...

>> Ok, here we need Linus' answer. We did have a patch in the PPC tree
>> that was considering closed resources as really closed instead of
>> transparent, and we were told by Linus that there were non-standard
>> bridges and various issues in the x86 world with that, and that those
>> would remain transparent. I don't have pointers at hand, but I think
>> this was discussed on lkml several months ago.
>
>I recall that. I do agree with Linus, but only about bridges with
>class code 0x60401 (subtractive decoders). The details of operating in
>subtractive decoding mode are beyond the scope of the P2P bridge specs,
>and probably we don't want to know these details either. "Assuming
>transparent" is a sufficient workaround in this case.

Agreed.

>Some additional notes. The P2P bridge specification says:
>"The primary use of a subtractive decoding bridge is to connect
> a laptop system to a docking station and support legacy ISA devices
> in the docking station."
>
>Indeed, that "transparency" code had been added to fix P2P bridge problems
>on some Dell docking station (reported by Jamal) back in the 2.4.0-test
>times. Unfortunately, lspci output of that machine hasn't been posted
>(or I just missed that). However, I'm sure that the problematic bridge
>did have ProgIf code 1, otherwise that type of machines would have
>problems running Windows, as MS says:
>  "A  bridge  indicates  that  it  performs  subtractive  decode  if  its
>   Programming  Interface bit in the PCI Configuration Register is set to
>   01h.  Not  all  PCI-to-PCI bridges support subtractive decode. Windows
>   will  not  switch  a bridge from positive decode to subtractive decode
>   (or  vice  versa) because there is no standard method defined for this
>   action."
>
>For those who are interested, the entire document (zipped .doc) is
>http://download.microsoft.com/download/whistler/hwdev3/1.0/WXP/EN-US/pcibridge-cardbus.exe
>Not in the human readable form, sorry. ;-)
>
>> I'd set all 4 then, thus the bridge would really be seen as forwarding
>> all the regions of the host bridge, whatever they are.
>
>There are only 3, as Grant pointed out. :-)

Well, as I pointed out, I may actually need all 4 regions of the host ;)

Anyway, since we agree on copying down the parent regions, and the pci_bus
structure holds 4 resource slots, then let's copy them all down.

I'll write some code about that when I'm back from vacation and we'll
see what's up. I may end up adding a quirk call inside the
pci_read_bridge_bases() function so that its behaviour can be easily
overridden if we ever meet a bridge non-standard enough to be
transparent without having ProgIf code 1.

Ben.



* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-07 18:30       ` Grant Grundler
@ 2002-08-08 11:30         ` Ivan Kokshaysky
  2002-08-09  7:07           ` Grant Grundler
  2002-08-09  8:06           ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 16+ messages in thread
From: Ivan Kokshaysky @ 2002-08-08 11:30 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Benjamin Herrenschmidt, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

On Wed, Aug 07, 2002 at 12:30:25PM -0600, Grant Grundler wrote:
> Send me a patch for 2.4.19 and I'll try it on the laptop.

Appended - please do.

> Ivan wrote:
> | ...subtractive decoding bridge _MUST_ have bit 0 in the ProgIf set to 1.
> 
> It sounds easy to check at the top and in that case DTRT.
> The "else" parts of later resource checks can go away.

Exactly.

> What you suggest implies the bridge waits for someone else to "claim"
> the transaction and I'm not convinced PCI spec would allow that.

It allows that as a matter of fact.

> Performance would certainly suffer if that were the case.

Sure, performance sucks, and there are other bad side effects,
like the impossibility of peer-to-peer DMA behind such a bridge.

Ivan.

--- linux/drivers/pci/pci.c~	Fri Jun 28 14:46:21 2002
+++ linux/drivers/pci/pci.c	Thu Aug  8 14:57:25 2002
@@ -1073,6 +1073,14 @@ void __devinit pci_read_bridge_bases(str
 	if (!dev)		/* It's a host bus, nothing to read */
 		return;
 
+	if (dev->class & 1) {
+		printk("Subtractive decoding bridge %s -"
+			" assuming transparent\n", dev->name);
+		for(i = 0; i < 3; i++)
+			child->resource[i] = child->parent->resource[i];
+		return;
+	}
+
 	for(i=0; i<3; i++)
 		child->resource[i] = &dev->resource[PCI_BRIDGE_RESOURCES+i];
 
@@ -1095,13 +1103,6 @@ void __devinit pci_read_bridge_bases(str
 		res->start = base;
 		res->end = limit + 0xfff;
 		res->name = child->name;
-	} else {
-		/*
-		 * Ugh. We don't know enough about this bridge. Just assume
-		 * that it's entirely transparent.
-		 */
-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 0);
-		child->resource[0] = child->parent->resource[0];
 	}
 
 	res = child->resource[1];
@@ -1114,10 +1115,6 @@ void __devinit pci_read_bridge_bases(str
 		res->start = base;
 		res->end = limit + 0xfffff;
 		res->name = child->name;
-	} else {
-		/* See comment above. Same thing */
-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 1);
-		child->resource[1] = child->parent->resource[1];
 	}
 
 	res = child->resource[2];
@@ -1145,10 +1142,6 @@ void __devinit pci_read_bridge_bases(str
 		res->start = base;
 		res->end = limit + 0xfffff;
 		res->name = child->name;
-	} else {
-		/* See comments above */
-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 2);
-		child->resource[2] = child->parent->resource[2];
 	}
 }
 


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-08  8:20     ` Benjamin Herrenschmidt
@ 2002-08-08 13:21       ` Ivan Kokshaysky
  2002-08-09  6:29         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Ivan Kokshaysky @ 2002-08-08 13:21 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

On Thu, Aug 08, 2002 at 10:20:15AM +0200, Benjamin Herrenschmidt wrote:
> Unfortunately that wouldn't work as I actually have 3 host bridges
> on these models, and the windows can be "mixed". One host can have
> 0x80000000 to 0x9fffffff (and one region at 0xfx000000), the next
> one can have 0xa0000000 to 0xafffffff and another region at
> 0xfx000000, etc...

Please elaborate. I always thought that 3 host bridges (controllers,
hoses and so on) mean 3 physically separated PCI buses. In this
case the approach with only one root bus structure is plain wrong -
resource allocation won't work correctly.
You should have 3 root buses, and apply suggested workaround to all 3.
Or am I missing something?

> >There are only 3, as Grant pointed out. :-)
> 
> Well, as I pointed out, I may actually need all 4 regions of the host ;)

I still hope you won't :-)

> Anyway, since we agree on copying down the parent regions, and the pci_bus
> structure holds 4 resource slots, then let's copy them all down.

I think the 4th slot makes no sense in terms of the PCI bus and
should be killed. I guess that initially it was intended for cardbus
drivers (for 2nd IO window), but it seems that they don't use
pci_bus->resource pointers at all.

> I'll write some code about that when I'm back from vacation and we'll
> see what's up. I may end up adding a quirk call inside the
> pci_read_bridge_bases() function so that its behaviour can be easily
> overridden if we ever meet a bridge non-standard enough to be
> transparent without having ProgIf code 1

This can be easily done with generic quirks:

	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_XXX, PCI_DEVICE_ID_BAD_BRIDGE,
	  quirk_transparent_bridge }
...
static void __init
quirk_transparent_bridge(struct pci_dev *dev)
{
	dev->class |= 1;
}
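
As an illustration only, the following sketch shows how a table entry like the one above could be matched against a device and its hook applied. It does not reproduce the kernel's fixup machinery: toy_dev, toy_fixup, toy_apply_fixups and the ID values are all hypothetical, and the field is called class_code here where the kernel uses dev->class.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy device: only the fields the quirk needs. */
struct toy_dev {
	uint16_t vendor, device;
	uint32_t class_code;	/* 24-bit class code, ProgIf in the low byte */
};

struct toy_fixup {
	uint16_t vendor, device;
	void (*hook)(struct toy_dev *dev);
};

/* Placeholder IDs, invented for the example. */
#define TOY_VENDOR_ID_XXX		0x1234
#define TOY_DEVICE_ID_BAD_BRIDGE	0x5678

/* Force ProgIf bit 0 so later code treats the bridge as subtractive. */
static void toy_quirk_transparent_bridge(struct toy_dev *dev)
{
	dev->class_code |= 1;
}

static const struct toy_fixup toy_fixups[] = {
	{ TOY_VENDOR_ID_XXX, TOY_DEVICE_ID_BAD_BRIDGE,
	  toy_quirk_transparent_bridge },
	{ 0, 0, NULL }		/* terminator */
};

/* Run every hook whose vendor/device pair matches the device. */
static void toy_apply_fixups(struct toy_dev *dev)
{
	const struct toy_fixup *f;

	for (f = toy_fixups; f->hook; f++)
		if (f->vendor == dev->vendor && f->device == dev->device)
			f->hook(dev);
}
```

The point of doing it this way is that the bridge-reading code keeps a single test on the class code and the odd hardware is handled in one table.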


Ivan.


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-08 13:21       ` Ivan Kokshaysky
@ 2002-08-09  6:29         ` Benjamin Herrenschmidt
  2002-08-09 17:01           ` Ivan Kokshaysky
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-09  6:29 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

>On Thu, Aug 08, 2002 at 10:20:15AM +0200, Benjamin Herrenschmidt wrote:
>> Unfortunately that wouldn't work as I actually have 3 host bridges
>> on these models, and the windows can be "mixed". One host can have
>> 0x80000000 to 0x9fffffff (and one region at 0xfx000000), the next
>> one can have 0xa0000000 to 0xafffffff and another region at
>> 0xfx000000, etc...
>
>Please elaborate. I always thought that 3 host bridges (controllers,
>hoses and so on) mean 3 physically separated PCI buses. In this
>case the approach with only one root bus structure is plain wrong -
>resource allocation won't work correctly.
>You should have 3 root buses, and apply suggested workaround to all 3.
>Or am I missing something?

I do have 3 root busses, that's not a problem. But their resources
are all children of the global iomem_resource, which sorta represents
the system memory bus. (I do eventually declare additional resources
as children of this one, not behind any pci host bridge, like some
memory controller registers).

>> >There are only 3, as Grant pointed out. :-)
>> 
>> Well, as I pointed out, I may actually need all 4 regions of the host ;)
>
>I still hope you won't :-)

Well... at one point, I had more than that :( I added some code to
coalesce the ranges provided by the firmware and figured out it
mostly turned into 1 big range of 256 or 512MB, one small one in the
0xfx000000 region, and one IO. So that should fit. But nothing prevents
the firmware from setting things up differently.
But yes, at least on pmac, I think we now don't have more than 3 in
real life, though I can't speak for IBM high end stations.

We have a routine that takes whatever ranges are provided by openfirmware
and then populates the host bridge resources. This routine can fill up
to 4 slots and doesn't force any ordering on the way those are filled.
This may have to be changed, though currently, I think the case of
pci_read_bridge_bases() with a transparent bridge was the only common
routine we used that relied on this parent resource ordering assumption,
and this will be going away with your proposed patch.

>> Anyway, since we agree on copying down the parent regions, and the pci_bus
>> structure holds 4 resource slots, then let's copy them all down.
>
>I think 4th slot makes no sense in terms of the PCI bus and
>should be killed. I guess that initially it was intended for cardbus
>drivers (for 2nd IO window), but it seems that they don't use
>pci_bus->resource pointers at all.

They should probably then, but I haven't quite looked at the cardbus
code yet. I still think the resource management should be generic
enough not to rely on ordering & number of resources, as the actual
informations we want out of the parent resources are already encoded
in the flags (that is knowing if we deal with the parent IO window,
MEM window, or MEM+prefetch window). We have generic routines
working only on flags for finding parents when populating the
tree already.
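
To illustrate the flags-only lookup described above, here is a minimal
sketch. It is not the kernel's code: `struct res`, `struct bus`, the
`RES_*` flag values and `find_parent_window()` are all hypothetical
stand-ins, and it assumes the parent bus holds up to 4 resource pointers
in no particular order.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's definitions; values are illustrative. */
#define RES_IO        0x0100UL
#define RES_MEM       0x0200UL
#define RES_PREFETCH  0x1000UL
#define BUS_NUM_RES   4

struct res {
	unsigned long start, end, flags;
};

struct bus {
	struct res *resource[BUS_NUM_RES];
};

/*
 * Return the first parent window whose type bits match `want` exactly,
 * regardless of which slot it sits in. Returns NULL if there is none.
 */
static struct res *find_parent_window(const struct bus *parent,
				      unsigned long want)
{
	const unsigned long type_mask = RES_IO | RES_MEM | RES_PREFETCH;
	int i;

	for (i = 0; i < BUS_NUM_RES; i++) {
		struct res *r = parent->resource[i];

		if (r && (r->flags & type_mask) == want)
			return r;
	}
	return NULL;
}
```

With a host bridge that happens to put MEM in slot 0 and IO in slot 3,
asking for `RES_IO` still finds the IO window, which is the point of not
relying on slot ordering.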

But that isn't an urgent issue nor difficult to work around if
needed, so let's put that on hold until I can prove we really need
all of those ;)

>> I'll write some code about that when I'm back from vacation and we'll
>> see what's up. I may end up adding a quirk call inside the
>> pci_read_bridge_bases() function so that its behaviour can be easily
>> overridden if we ever meet a bridge non-standard enough to be
>> transparent without having ProgIf code 1.
>
>This can be easily done with generic quirks:
>
>	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_XXX, PCI_DEVICE_ID_BAD_BRIDGE,
>	  quirk_transparent_bridge }
>...
>static void __init
>quirk_transparent_bridge(struct pci_dev *dev)
>{
>	dev->class |= 1;
>}

Yah, that would make it.

Thanks for your enlightenment
Ben.




* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-08 11:30         ` Ivan Kokshaysky
@ 2002-08-09  7:07           ` Grant Grundler
  2002-08-09  8:06           ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 16+ messages in thread
From: Grant Grundler @ 2002-08-09  7:07 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Benjamin Herrenschmidt, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

Ivan Kokshaysky wrote:
> > It sounds easy to check at the top and in that case DTRT.
> > The "else" parts of later resource checks can go away.
> 
> Exactly.

I checked PCI 2.2 spec and "class code" is described in Appendix D.
For Base Class 6 (Bridges), Sub-Class 4 (PCI-PCI Bridges), Programming
Interface value of 0x01 says
	"Subtractive Decode PCI-to-PCI bridge. This interface code
	identifies the PCI-to-PCI bridge as a device that supports
	subtractive decoding in addition to all the currently defined
	functions of a PCI-to-PCI bridge."

Programming Interface byte is not specified for all other bridge types.
And the Linux code isn't checking the bridge type before calling
pci_read_bridge_bases().

Thinking maybe it's a de facto standard, I dumped the values
from my OB500 (< 1 year old), much older OB800, LPR1000, and
a prototype x86 machine (that's never going to be a product *sigh*).
And none of these seem to have 0x01 in the Prog Interface byte.
(I was looking for 06xx01xx in the third u32)
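
For reference, a small sketch of how that third u32 (config offset 0x08)
breaks down into base class, sub-class, programming interface and
revision. The helper names are made up for illustration; only the field
layout comes from the PCI spec.

```c
#include <stdint.h>

/* Fields of the config-space dword at offset 0x08 (class code + revision). */
static inline uint8_t class_base(uint32_t dw)   { return dw >> 24; }
static inline uint8_t class_sub(uint32_t dw)    { return (dw >> 16) & 0xff; }
static inline uint8_t class_progif(uint32_t dw) { return (dw >> 8) & 0xff; }

/* 1 if the dword describes a PCI-PCI bridge advertising subtractive decode. */
static int is_subtractive_p2p_bridge(uint32_t dw)
{
	return class_base(dw) == 0x06 && class_sub(dw) == 0x04 &&
	       class_progif(dw) == 0x01;
}
```

Fed the PCI bridge entries from the dumps below (06040003, 06040002,
06040001), this reports a Prog Interface of 0x00 for each; the trailing
byte there is the revision, not the programming interface.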

I'm wondering if the LXR8000 I ditched earlier this year had such
a bridge. But I'll never know.

I'll try the patch anyway on the OB800. But it'll have to
wait until tomorrow. Past my bedtime here.

hth,
grant


OB500:
grundler <513>for i in */*; do echo -n $i " "; cat $i | od -Ax -t x4 | grep ^00000 | sed 's/^0* //'; done
00/00.0  71908086 22100106 06000003 00004000
00/01.0  71918086 0220001f 06040003 00018000
00/07.0  71108086 0280000f 06010002 00800000
00/07.1  71118086 02800005 01018001 00004000
00/07.2  71128086 02800005 0c030001 00004000
00/07.3  71138086 02800003 06800003 00000000
00/0a.0  ac50104c 02100007 06070001 0002a808
00/0b.0  605510b7 02100017 02000010 00805008
00/0b.1  100710b7 02100010 07800010 00005008
00/0d.0  1998125d 02900007 04010000 00004000
00/11.0  06481095 82900000 01018f01 00000000
01/00.0  4c4d1002 02900087 03000064 00004208
grundler <515>lspci | fgrep -i bridge
00:00.0 Host bridge: Intel Corp. 440BX/ZX - 82443BX/ZX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corp. 440BX/ZX - 82443BX/ZX AGP bridge (rev 03)
00:07.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 02)
00:07.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 03)
00:0a.0 CardBus bridge: Texas Instruments PCI1410 PC card Cardbus Controller (rev 01)

OB800:
00.0  01041004 22800006 06000003 00000000
01.0  01021004 0280000f 06040003 00010000
02.0  01011004 02800003 ff000002 00000000
03.0  000310c8 02800003 03000001 00000000
04.0  ac15104c 02000007 06070001 00824008
04.1  ac15104c 02000007 06070001 00824008
06.0  01051004 02800005 0d000002 00000000

00:00.0 Host bridge: VLSI Technology Inc 82C535 (rev 03)
00:01.0 PCI bridge: VLSI Technology Inc 82C534 (rev 03)
00:02.0 Class ff00: VLSI Technology Inc 82C532 (rev 02)
00:03.0 VGA compatible controller: Neomagic Corporation NM2093 [MagicGraph 128ZV] (rev 01)
00:04.0 CardBus bridge: Texas Instruments PCI1131 (rev 01)
00:04.1 CardBus bridge: Texas Instruments PCI1131 (rev 01)
00:06.0 IRDA controller: VLSI Technology Inc 82C147 (rev 02)
00:00.0 Host bridge: VLSI Technology Inc 82C535 (rev 03)
	Flags: bus master, medium devsel, latency 0


LPR1000:
00/00.0  71928086 02000106 06000002 00004000
00/04.0  71108086 0280000f 06010002 00800000
00/04.1  71118086 02800005 01018001 00002000
00/04.2  71128086 02800005 0c030001 00002000
00/04.3  71138086 02800003 06800002 00000000
00/07.0  00241011 02800147 06040002 00013908
00/08.0  90f0113f 02000002 07008004 00000000
00/09.0  905510b7 02100157 02000000 00005008
00/0d.0  00b81013 02000003 03000045 00000000
01/02.0  12298086 02900157 02000005 00004208
01/04.0  000c1000 02000157 01000001 0000f708
grundler <504>lspci | fgrep -i bridge
00:00.0 Host bridge: Intel Corp. 440BX/ZX - 82443BX/ZX Host bridge (AGP disabled) (rev 02)
00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 02)
00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 02)
00:07.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 02)

ProtoFoster:
grundler@t11:/proc/bus/pci$ for i in */*; do echo -n $i " "; cat $i | od -Ax -t x4 | grep ^00000 | sed 's/^0* //'; done
00/00.0  00111166 00000000 06000022 00800010
00/00.1  00111166 00000000 06000000 00800010
00/00.2  00111166 00000000 06000000 00800010
00/00.3  00111166 00000000 06000000 00800010
00/02.0  25a115bc 02900157 ff000000 00804008
00/02.1  253115bc 02900143 ff000000 80804008
00/02.2  25a115bc 02900143 0c070000 00804008
00/03.0  12298086 02900157 0200000d 00004008
00/05.0  47521002 02900087 03000027 00004208
00/0f.0  02011166 22000147 06000093 00802000
00/0f.1  02121166 02000005 01018a93 00800000
00/0f.2  02201166 02900153 0c031005 00804008
00/0f.3  02251166 02000004 06010000 00800000
00/10.0  00101166 22b00002 06000003 00800000
00/10.2  00101166 22300002 06000003 00804000
00/11.0  00101166 22300002 06000003 00804000
00/11.2  00101166 22b00002 06000003 00800000
01/05.0  1219103c 04900000 08040012 00000000
19/04.0  03098086 04b00147 06040001 00014008
1a/02.0  00211000 02300157 01000001 00804808
1a/02.1  00211000 02300157 01000001 00804808

grundler@t11:/proc/bus/pci$ lspci | fgrep -i bridge
00:00.0 Host bridge: ServerWorks CMIC-HE (rev 22)
00:00.1 Host bridge: ServerWorks CMIC-HE
00:00.2 Host bridge: ServerWorks CMIC-HE
00:00.3 Host bridge: ServerWorks CMIC-HE
00:0f.0 Host bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.3 ISA bridge: ServerWorks: Unknown device 0225
00:10.0 Host bridge: ServerWorks CIOB30 (rev 03)
00:10.2 Host bridge: ServerWorks CIOB30 (rev 03)
00:11.0 Host bridge: ServerWorks CIOB30 (rev 03)
00:11.2 Host bridge: ServerWorks CIOB30 (rev 03)
19:04.0 PCI bridge: Intel Corp.: Unknown device 0309 (rev 01)



* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-08 11:30         ` Ivan Kokshaysky
  2002-08-09  7:07           ` Grant Grundler
@ 2002-08-09  8:06           ` Benjamin Herrenschmidt
  2002-08-09 17:16             ` Ivan Kokshaysky
  1 sibling, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-09  8:06 UTC (permalink / raw)
  To: Ivan Kokshaysky, Grant Grundler
  Cc: Linux kernel mailing list, Jeff Garzik, David S. Miller

>-	} else {
>-		/*
>-		 * Ugh. We don't know enough about this bridge. Just assume
>-		 * that it's entirely transparent.
>-		 */
>-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 0);
>-		child->resource[0] = child->parent->resource[0];
> 	}

BTW, in the case of really closed resources, you just removed the "else"
case. I don't have the kernel sources at hand at the moment (still
on vacation ;) so I can't check how pci_dev is initialized on alloc,
but shouldn't we make sure the resource pointer of the child is either
NULL or points to some properly zeroed-out resource structure?

I know the "closed resources" patch we used to have in some PPC kernel
trees did that explicitly in the "else" case here.

Ben.




* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-09  6:29         ` Benjamin Herrenschmidt
@ 2002-08-09 17:01           ` Ivan Kokshaysky
  2002-08-09 21:14             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Ivan Kokshaysky @ 2002-08-09 17:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

On Fri, Aug 09, 2002 at 08:29:30AM +0200, Benjamin Herrenschmidt wrote:
> I do have 3 root busses, that's not a problem. But their resources
> are all children of the global iomem_resources which sorta represents
> the system memory bus.

Child<->parent resource relationship between system and PCI buses
not only isn't required, but might be simply impossible in some situations.
Consider non-linear (or linear, but just not 1:1) mapping between
system and PCI bus addressing. Or even no mapping at all. ;-)
Yes, most architectures use global resources as parents of PCI resources -
just because it happens to work and is convenient. But this doesn't mean
that everybody must do the same.

> Well... at one point, I had more than that :( I added some code to
> coalesce the ranges provided by the firmware and figured out it
> mostly turned into 1 big range of 256 or 512Mb, one small in the
> 0xfx000000 region, and one IO. So that should fit. But nothing prevents
> the firmware from setting things up differently.

Exactly. One day, after firmware update, you may end up asking for
a few more resource slots. :-)
BTW, do you really need that additional small IOMEM range?

> They should probably then, but I haven't quite looked at the cardbus
> code yet. I still think the resource management should be generic
> enough not to rely on ordering & number of resources, as the actual
> information we want out of the parent resources is already encoded
> in the flags (that is knowing if we deal with the parent IO window,
> MEM window, or MEM+prefetch window). We have generic routines
> working only on flags for finding parents when populating the
> tree already.

This would add a lot of unneeded complexity to the code in
drivers/pci/setup-bus.c. Also, this would make impossible
configurations like this:
root bus windows	0x80000000-0x8fffffff
			0xf0000000-0xf0ffffff
pci-pci bridge window	0x80200000-0xf02fffff

which is perfectly valid in your setup unless you
place some device resources in the range 0x90000000-0xefffffff.
BTW, you can avoid that range with properly coded pcibios_align_resource() -
maybe this would be a cleaner solution than allocating a dummy resource.
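
A sketch of the constraint in that example, with hypothetical names
(`struct range`, `device_range_ok()`): a device range is usable only if
it sits inside the bridge window and also inside one of the root bus
windows, so anything placed in the hole between them is rejected.

```c
struct range {
	unsigned long long start, end;	/* inclusive bounds */
};

/* 1 if `inner` lies entirely within `outer`. */
static int range_contains(const struct range *outer,
			  const struct range *inner)
{
	return inner->start >= outer->start && inner->end <= outer->end;
}

/*
 * 1 if `dev` is forwarded by the bridge and is also decoded by one of
 * the root bus windows; 0 if it falls outside the bridge window or
 * into a gap between root windows.
 */
static int device_range_ok(const struct range *roots, int nroots,
			   const struct range *bridge,
			   const struct range *dev)
{
	int i;

	if (!range_contains(bridge, dev))
		return 0;
	for (i = 0; i < nroots; i++)
		if (range_contains(&roots[i], dev))
			return 1;
	return 0;
}
```

With the windows above, 0x80200000-0x802fffff and 0xf0200000-0xf02fffff
pass, while 0xa0000000-0xa00fffff fails because no root window decodes it.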

> But that isn't an urgent issue nor difficult to work around if
> needed, so let's put that on hold until I can prove we really need
> all of those ;)

Ok. But IMO, you're trying to expose your host bridge internals to
the generic code instead of hiding it...

Ivan.


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-09  8:06           ` Benjamin Herrenschmidt
@ 2002-08-09 17:16             ` Ivan Kokshaysky
  0 siblings, 0 replies; 16+ messages in thread
From: Ivan Kokshaysky @ 2002-08-09 17:16 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

On Fri, Aug 09, 2002 at 10:06:30AM +0200, Benjamin Herrenschmidt wrote:
> BTW, in the case of really closed resources, you just removed the "else"
> case. I don't have the kernel sources at hand at the moment (still
> on vacation ;) So I can't check how pci_dev is initialized on alloc,

It's zeroed.

> but shouldn't we make sure the resource pointer of the child is either
> NULL or points to some properly zeroed out resource structure ?

I'm not sure whether it could happen in current 2.4/2.5 code, but
if pci_read_bridge_bases() is called from hotplug code, and
bridge's window changes from "enabled" to "disabled" (card removed),
then yes, we must set resource.flags = 0.
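
Roughly, the idea would look like the sketch below. This is not the
actual pci_read_bridge_bases() code: `struct res`, `RES_IO` and
`read_io_window()` are stand-ins, only the 16-bit I/O Base/Limit
registers are considered, and a window is assumed closed when
base > limit.

```c
#include <stdint.h>

#define RES_IO 0x0100UL	/* illustrative flag value, not the kernel's */

struct res {
	unsigned long start, end, flags;
};

/*
 * Decode a bridge I/O window from the 8-bit I/O Base and I/O Limit
 * registers. The upper nibble of each register holds address bits
 * 15:12; the window is treated as closed when base > limit.
 */
static void read_io_window(uint8_t io_base, uint8_t io_limit,
			   struct res *res)
{
	unsigned long base  = (unsigned long)(io_base  & 0xf0) << 8;
	unsigned long limit = (unsigned long)(io_limit & 0xf0) << 8;

	if (base <= limit) {
		res->flags = RES_IO;
		res->start = base;
		res->end   = limit + 0xfff;
	} else {
		/* Window disabled: clear stale state from an earlier scan. */
		res->start = res->end = 0;
		res->flags = 0;
	}
}
```

Calling it a second time on the same `struct res` after the window has
been closed leaves flags at 0 instead of keeping the old range around.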

Ivan.


* Re: PCI<->PCI bridges, transparent resource fix
  2002-08-09 17:01           ` Ivan Kokshaysky
@ 2002-08-09 21:14             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-09 21:14 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Grant Grundler, Linux kernel mailing list, Jeff Garzik,
	David S. Miller

>Child<->parent resource relationship between system and PCI buses
>not only isn't required, but might be simply impossible in some situations.
>Consider non-linear (or linear, but just not 1:1) mapping between
>system and PCI bus addressing. Or even no mapping at all. ;-)
>Yes, most architectures use global resources as parents of PCI resources -
>just because it happens to work and is convenient. But this doesn't mean
>that everybody must do the same.

Right. Though it's convenient that way on ppc32 ;)

>> Well... at one point, I had more than that :( I added some code to
>> coalesce the ranges provided by the firmware and figured out it
>> mostly turned into 1 big range of 256 or 512Mb, one small in the
>> 0xfx000000 region, and one IO. So that should fit. But nothing prevents
>> the firmware from setting things up differently.
>
>Exactly. One day, after firmware update, you may end up asking for
>a few more resource slots. :-)

Yup, especially since Apple does the firmware :)

>BTW, do you really need that additional small IOMEM range?

Yup. It's really used by some devices on some machines, and it's nasty
to let the kernel relocate PCI devices when it turns out to be Apple's
ASICs. Especially when things get out of sync with the Open Firmware
device tree. So we need to keep the PCI setup as close as possible
to what is set by the firmware, while still having some freedom
for things like bus renumbering or reallocations of some PCI
devices.

>> They should probably then, but I haven't quite looked at the cardbus
>> code yet. I still think the resource management should be generic
>> enough not to rely on ordering & number of resources, as the actual
>> information we want out of the parent resources is already encoded
>> in the flags (that is knowing if we deal with the parent IO window,
>> MEM window, or MEM+prefetch window). We have generic routines
>> working only on flags for finding parents when populating the
>> tree already.
>
>This would add a lot of unneeded complexity to the code in
>drivers/pci/setup-bus.c. Also, this would make impossible
>configurations like this:
>root bus windows	0x80000000-0x8fffffff
>			0xf0000000-0xf0ffffff
>pci-pci bridge window	0x80200000-0xf02fffff

A bit of complexity on a rarely used code path (typically at boot
only most of the time) for more flexibility may be worth the trade ;)

>which is perfectly valid in your setup unless you
>place some device resources in the range 0x90000000-0xefffffff.
>BTW, you can avoid that range with properly coded pcibios_align_resource() -
>maybe this would be a cleaner solution than allocating a dummy resource.

Yup, nasty case, but does the current code handle it cleanly anyway ?

>> But that isn't an urgent issue nor difficult to work around if
>> needed, so let's put that on hold until I can prove we really need
>> all of those ;)
>
>Ok. But IMO, you're trying to expose your host bridge internals to
>the generic code instead of hiding it...

Maybe. I have a quite non-PCI centric view of it (maybe after dealing
a bit with embedded stuff). I want to represent my physical bus layout
with those, having a coherent tree, crossing a PCI segment or not.
It makes sense to me to expose the resources of the host as they are
really implemented.

Ben.




end of thread, other threads:[~2002-08-09 21:11 UTC | newest]

Thread overview: 16+ messages
     [not found] <20020806192951.7E6B44829@dsl2.external.hp.com>
2002-08-06 19:20 ` PCI<->PCI bridges, transparent resource fix Benjamin Herrenschmidt
2002-08-07  5:54   ` Grant Grundler
2002-08-06 21:02     ` Benjamin Herrenschmidt
2002-08-07 18:30       ` Grant Grundler
2002-08-08 11:30         ` Ivan Kokshaysky
2002-08-09  7:07           ` Grant Grundler
2002-08-09  8:06           ` Benjamin Herrenschmidt
2002-08-09 17:16             ` Ivan Kokshaysky
     [not found] <20020807042402.A4840@jurassic.park.msu.ru>
2002-08-06 20:31 ` Benjamin Herrenschmidt
2002-08-07 16:03   ` Ivan Kokshaysky
2002-08-08  8:20     ` Benjamin Herrenschmidt
2002-08-08 13:21       ` Ivan Kokshaysky
2002-08-09  6:29         ` Benjamin Herrenschmidt
2002-08-09 17:01           ` Ivan Kokshaysky
2002-08-09 21:14             ` Benjamin Herrenschmidt
2002-08-06 18:44 Benjamin Herrenschmidt
