public inbox for linux-kernel@vger.kernel.org
* The IO problem on multiple PCI busses
@ 2001-03-01 15:33 Benjamin Herrenschmidt
  2001-03-01 15:41 ` Benjamin Herrenschmidt
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Benjamin Herrenschmidt @ 2001-03-01 15:33 UTC
  To: linux-kernel, linuxppc-dev

Here's the return of an oooold problem for which we really need a
solution asap since it's now biting us in real life configurations...

So the problem happens when you have a machine with more than one PCI
host bridge. This is typically the case for all new Apple machines, as they
have 3 host bridges in one chip (2 of them are relevant: the AGP and the
PCI). I don't think the problem exists on x86 machines with real IO
cycles; at least in that case, the problem is different.

In order to generate IO cycles, the bridge provides us with a region in
CPU physical memory space (a 16MB region in our case) that translates
accesses to IO cycles on the PCI bus. Our implementation of inb/outb
currently relies on the kernel ioremap'ing one of these regions (the PCI
one) and using the ioremap result as a base (offset) inside the inb/outb
functions.
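
To make the current scheme concrete, here is a minimal userspace sketch of
it, not the actual PPC code. The names `io_base`, `set_io_base`,
`single_inb` and `single_outb` are made up for illustration, and a plain
pointer stands in for the result of ioremap'ing the bridge's IO region:

```c
#include <stdint.h>

/* Stand-in for the ioremap'ed IO window of the one chosen host bridge.
 * In a kernel this would come from ioremap(); here it is just a pointer
 * that the caller sets (hypothetical setup for this sketch). */
static volatile uint8_t *io_base;

static void set_io_base(volatile uint8_t *base)
{
    io_base = base;
}

/* The port number is simply an offset from the single base. */
static inline uint8_t single_inb(unsigned long port)
{
    return io_base[port];
}

static inline void single_outb(uint8_t val, unsigned long port)
{
    io_base[port] = val;
}
```

Since there is only one `io_base`, every port number is implicitly
relative to the one bridge that was mapped.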

So that means that the current design won't allow access to IOs located on
any bus but the one we arbitrarily chose (the PCI bus). That's fine in
most cases, until you decide to put a 3dfx or nvidia card in the AGP slot.
Those cards require some IO accesses to be done to the legacy VGA
addresses, and of course, our inb/outb functions can't do that.

Obviously, we can hack some driver specific thing that would use the
arch-specific code to retrieve the proper io base address for a given
host bridge, but that's a hack. I'm looking for a solution that would
cleanly apply to all archs that may potentially face this problem.

The problem also potentially exists for any PCI card that has PCI IOs on
anything but the main PCI bus.

One possibility is to limit our IO space to 64k per bus (to avoid
bloating) and then use a hacked ioremap to create a single virtually
contiguous kernel region that appends all those IO spaces together.
Accessing IOs on bus N would just be a matter of calculating an address
of the type 64k*N+offset and doing normal inb/outb on the result. The
arch PCI code could then properly fix up PCI IO resources for PCI drivers,
and we could add a function of the kind

 unsigned long pci_bus_io_offset(int busno);

that would return the offset to add to inb/outb when accessing IOs on the
N'th PCI bus.
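
A rough userspace sketch of that idea follows. Only `pci_bus_io_offset`
comes from the proposal above; `IO_SPACE_PER_BUS`, `MAX_IO_BUSES`,
`io_window`, `multi_inb` and `multi_outb` are assumptions for the example,
and a static array stands in for the virtually contiguous region:

```c
#include <assert.h>
#include <stdint.h>

#define IO_SPACE_PER_BUS 0x10000UL  /* 64k of IO space per bus */
#define MAX_IO_BUSES     4          /* hypothetical limit for this sketch */

/* Stand-in for the single contiguous region that would be built by
 * mapping each bridge's IO window back to back. */
static volatile uint8_t io_window[MAX_IO_BUSES * IO_SPACE_PER_BUS];

/* Offset to add to a port number when accessing IOs on bus 'busno'. */
unsigned long pci_bus_io_offset(int busno)
{
    assert(busno >= 0 && busno < MAX_IO_BUSES);
    return IO_SPACE_PER_BUS * (unsigned long)busno;
}

/* inb/outb stay as cheap as before: one base plus an offset. */
static inline uint8_t multi_inb(unsigned long port)
{
    return io_window[port];
}

static inline void multi_outb(uint8_t val, unsigned long port)
{
    io_window[port] = val;
}
```

A driver for a card on bus 1 would then do something like
`multi_outb(val, pci_bus_io_offset(1) + 0x3c0)`, and the same legacy port
on bus 0 stays reachable at offset 0.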

If we want to go a bit further, and allow ISA drivers that don't have a
pci_dev structure to work on legacy devices on any bus, we could provide
a set of functions of the type

 int isa_get_bus_count();
 unsigned long isa_get_bus_io_offset(int busno);

and eventually

 int isa_bus_to_pci_bus(int isa_busno);
 int pci_bus_to_isa_bus(int pci_busno);

if we want to figure out which PCI bus a given ISA bus is located on, if
any (-1 meaning no mapping exists).

Of course, the same problem exists for ISA memory (used by legacy VGA
modes). It's not a problem in real life currently since no powermac can
produce PCI cycles in the ISA memory range today, and non-powermac PPC
machines currently have no need for video cards on anything but the
main bus, but the potential issue is there, and the need for a solution
may pop up too.

I'm, of course, open to any comments about this (in fact, I'd really like
some feedback). One thing is that we also need to find a way to pass
that information to userland. Currently, we implement an arch-specific
syscall that retrieves the IO physical base of a given PCI bus. That may
be enough, but we may also want something that matches more closely what
we do in the kernel.

Regards,
Ben.


* Re: The IO problem on multiple PCI busses
@ 2001-03-02  1:22 Grant Grundler
  2001-03-02  2:19 ` David S. Miller
  0 siblings, 1 reply; 27+ messages in thread
From: Grant Grundler @ 2001-03-02  1:22 UTC
  To: Benjamin Herrenschmidt; +Cc: linux-kernel


Benjamin Herrenschmidt wrote:
> Hi Grant !
> 
> Alan Cox suggested I contact you about this. I'm trying to figure out a
> way to cleanly resolve the problem of doing IO accesses on machines with
> multiple PCI host bridges (and multiple IO bases when IO cycles are not
> generated by the CPU). I'd be glad if you could catch up on the
> "The IO problem on multiple PCI busses" thread on linux-kernel list
> and let us share your point of view.

To l-k, Benjamin wrote:
| I've looked at the parisc code (thanks Alan for pointing that out), and
| it seems they implement all inb/outb as quite big functions that decipher
| the address, retrieve the bus, and do the proper IO call. Unfortunately,
| that's a bit bloated, and I don't think I'll ever get other PPC
| maintainers to agree with such a mechanism (everybody seems to be quite
| concerned with IO speed, I admit including me).

Benjamin,
As the main author/maintainer of that code, let me explain why
it's so ugly. Hopefully this will give you insight into a "better"
(arch independent) solution. Apologies for the length.

For IO Port space, I didn't worry about the bloat. A nice side effect of
this bloat is it will discourage use of I/O Port space. That's good for
everyone, AFAICT. (I know some devices *only* support I/O port space and
I personally don't care about them. If someone who does care about one
wants to talk to me about it...fine...I'll help.)

[ Caveat: I've simplified the following *a lot* to keep it short. ]

parisc supports two different PCI host bus adapters, each with
variants that behave differently. All work under the model we are using
with one binary. One kernel binary is important since we want to make
installs easy for users.

Under Dino (GSCtoPCI), each PCI HBA has its own 64K I/O port space.
I/O port space transactions are generated by poking registers on Dino.
Yes - performance sucks - that's why HPUX (almost) exclusively
uses devices which support MMIO.

Under Elroy (aka LBA or RopesToPCI), we have two methods of accessing
I/O port space. One view of I/O space can be shared across all Elroys
which share the same IOMMU (aka SBA). This method distributes the 64K
I/O space over the 8 (or 16) "ropes" with rope 0 getting the first
8k (or 4k) and so on. The other view is each LBA has its own 64K
of I/O port space. The second view is mapped above 4GB and requires
a 64-bit kernel to access. In both cases, processor loads/stores from/to
the region will generate an I/O cycle on the respective PCI bus.
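
The arithmetic of the shared view can be sketched as follows. This
assumes equal, contiguous slices handed out in rope order, which is what
"and so on" suggests; the helper names `rope_slice_size` and
`rope_for_port` are invented for the example and are not from the parisc
code:

```c
#include <assert.h>
#include <stdint.h>

#define SHARED_IO_SIZE 0x10000u  /* the 64K I/O space shared under one SBA */

/* Size of the slice each rope gets: 8k with 8 ropes, 4k with 16. */
static unsigned rope_slice_size(unsigned nropes)
{
    assert(nropes == 8 || nropes == 16);
    return SHARED_IO_SIZE / nropes;
}

/* Which rope a given port falls on, assuming equal contiguous slices
 * with rope 0 owning the lowest addresses. */
static unsigned rope_for_port(uint16_t port, unsigned nropes)
{
    return port / rope_slice_size(nropes);
}
```

With 8 ropes, ports 0x0000-0x1fff land on rope 0 and 0x2000 is the first
port of rope 1; with 16 ropes, 0x2000 is the first port of rope 2.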

Generally speaking, parisc doesn't support VGA or ISA legacy crud on
its PCI busses. But I think those are orthogonal issues.


The inb/outb support hinges on this definition in include/asm-parisc/pci.h:
struct pci_port_ops {
          u8 (*inb)  (struct pci_hba_data *hba, u16 port);
         u16 (*inw)  (struct pci_hba_data *hba, u16 port);   
         u32 (*inl)  (struct pci_hba_data *hba, u16 port); 
        void (*outb) (struct pci_hba_data *hba, u16 port,  u8 data);
        void (*outw) (struct pci_hba_data *hba, u16 port, u16 data);
        void (*outl) (struct pci_hba_data *hba, u16 port, u32 data);
};

Code which uses this is in arch/parisc/kernel/pci.c at:
	http://puffin.external.hp.com/cvs/linux/arch/parisc/kernel/pci.c

(look for PCI_PORT_HBA usage)


In a nutshell, the HBA number is encoded in the upper 16 bits
of the 32-bit I/O port space address. The inb() *function* uses the
decoded HBA number to look up the matching pci_port_ops function table
and pci_hba_data * to pass in. PCI fixup_bus() code virtualizes the
I/O port addresses found by the generic PCI bus walk. inb() is a
function so drivers work under *all* parisc PCI HBAs with one binary.
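
A stripped-down userspace sketch of that dispatch is below. It is not the
code from pci.c: every `sketch_`/`SKETCH_` name, the `hba_table`, and the
memory-backed `mem_ops` are assumptions made for illustration, and only
inb/outb are shown:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_HBA       4  /* hypothetical limit */
#define SKETCH_PORT_HBA(a)   ((unsigned)((a) >> 16))        /* upper 16 bits */
#define SKETCH_PORT_ADDR(a)  ((uint16_t)((a) & 0xffffu))    /* lower 16 bits */

struct sketch_hba;

/* Per-HBA access methods, in the spirit of struct pci_port_ops. */
struct sketch_port_ops {
    uint8_t (*inb)(struct sketch_hba *hba, uint16_t port);
    void (*outb)(struct sketch_hba *hba, uint16_t port, uint8_t data);
};

struct sketch_hba {
    const struct sketch_port_ops *ops;
    uint8_t ports[0x10000];  /* fake 64K port space, one per HBA */
};

static struct sketch_hba *hba_table[SKETCH_MAX_HBA];

/* A trivial backend; a real one would poke bridge registers or MMIO. */
static uint8_t mem_inb(struct sketch_hba *hba, uint16_t port)
{
    return hba->ports[port];
}

static void mem_outb(struct sketch_hba *hba, uint16_t port, uint8_t data)
{
    hba->ports[port] = data;
}

static const struct sketch_port_ops mem_ops = {
    .inb = mem_inb,
    .outb = mem_outb,
};

/* What a bus fixup would do: fold the HBA number into the port address. */
static uint32_t virtualize_port(unsigned hba_num, uint16_t port)
{
    return ((uint32_t)hba_num << 16) | port;
}

/* Decode the HBA number, look up its ops table, and call through it. */
uint8_t hba_inb(uint32_t addr)
{
    unsigned n = SKETCH_PORT_HBA(addr);
    assert(n < SKETCH_MAX_HBA && hba_table[n] != NULL);
    return hba_table[n]->ops->inb(hba_table[n], SKETCH_PORT_ADDR(addr));
}

void hba_outb(uint8_t data, uint32_t addr)
{
    unsigned n = SKETCH_PORT_HBA(addr);
    assert(n < SKETCH_MAX_HBA && hba_table[n] != NULL);
    hba_table[n]->ops->outb(hba_table[n], SKETCH_PORT_ADDR(addr), data);
}
```

The cost relative to a single base-plus-offset access is a table lookup
and an indirect call per port access, which is the bloat discussed above;
what it buys is that each HBA type can use a completely different access
method behind the same inb/outb.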

This scheme allows us to support "PCI-like" busses as well.
Some parisc machines have both PCI and EISA slots which are completely
independent of each other. We'd like to keep the semantics of inb/outb
the same and support both at the same time. It might be possible
to do this by feeding the drivers different versions of inb/outb
definitions at compile time. But initial attempts to do this ran
into problems (which I don't remember the details of).


Last comment is regarding who *configures* the PCI devices. On legacy PDC
(parisc's "BIOS on steroids"), the PDC sets everything up but does
not enable everything (i.e. pci_enable_device will set bits in the
PCI_COMMAND cfg register). On card-mode Dino (GSC cards plugged into a
proprietary bus), the firmware doesn't know *anything* about the PCI
devices and the arch support has to set everything up - PCI MMIO space is
not currently supported there. And new servers (like L2000 or A500) with
"PAT PDC" only initialize PCI devices for boot. The OS has to initialize
the rest.

grant

Grant Grundler
parisc-linux {PCI|IOMMU|SMP} hacker
+1.408.447.7253


end of thread, other threads:[~2001-03-07  2:10 UTC | newest]

Thread overview: 27+ messages
2001-03-01 15:33 The IO problem on multiple PCI busses Benjamin Herrenschmidt
2001-03-01 15:41 ` Benjamin Herrenschmidt
2001-03-01 18:30 ` Alan Cox
2001-03-01 19:09 ` David S. Miller
2001-03-01 19:33   ` Dan Malek
2001-03-01 19:41     ` David S. Miller
2001-03-01 19:59       ` Dan Malek
2001-03-01 20:22         ` David S. Miller
2001-03-01 20:09       ` Benjamin Herrenschmidt
2001-03-01 20:27         ` David S. Miller
2001-03-02 11:25           ` Benjamin Herrenschmidt
2001-03-03  1:08             ` David S. Miller
2001-03-01 19:49   ` Benjamin Herrenschmidt
2001-03-01 20:21     ` David S. Miller
2001-03-01 22:26       ` Alan Cox
2001-03-02 11:20       ` Benjamin Herrenschmidt
2001-03-03  1:06         ` David S. Miller
2001-03-03  2:25           ` Benjamin Herrenschmidt
2001-03-03 11:01             ` Jeff Garzik
2001-03-03 17:28               ` Benjamin Herrenschmidt
2001-03-05 16:20             ` David Woodhouse
2001-03-06 23:01           ` Oliver Xymoron
2001-03-07  2:07           ` Tony Mantler
2001-03-05 23:21   ` Chris Wedgwood
  -- strict thread matches above, loose matches on Subject: below --
2001-03-02  1:22 Grant Grundler
2001-03-02  2:19 ` David S. Miller
2001-03-02 17:46   ` Grant Grundler
