* Re: __ioremap_at() in 2.4.0-test9-pre2
@ 2000-09-21 7:30 Iain Sandoe
0 siblings, 0 replies; 74+ messages in thread
From: Iain Sandoe @ 2000-09-21 7:30 UTC (permalink / raw)
To: paulus, Dan Malek; +Cc: Linux/PPC Development
I admit that I haven't followed all the points in this thread..
>> Yes....IMHO I think the PC is one of the worst architecture designs
>> ever, and making my PowerMac or anything else live within those
>> constraints isn't progress....
>
> Well, your powermac has a PCI bus, and PCI has an I/O space as well as
> a memory space (for better or for worse).
>
> I think my basic point is that a setup where you can't do inb(n) to
> read the byte at address n in PCI I/O space is broken. On systems
> with 1 PCI host bridge, this is unambiguous, on systems with >1 host
> bridge inb(n) should access address n in PCI I/O space on the first
> host bridge.
Whatever solution is chosen/evolves I have one request:
If we (Linux on PPC generally) want to be able to take part in audio on
Linux, it is (quite/highly) likely that we will have to adopt ALSA fairly
soon - possibly starting with 2.5.
This involves the import of 30+ (at a guess) drivers, with soft-RT
requirements. *All* of these are at present written for PC; to the best of
my knowledge none have been ported yet.
It seems quite a formidable challenge that could put us off the audio stage
for a long while if we can't make it as easy as possible.
Iain.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* __ioremap_at() in 2.4.0-test9-pre2
@ 2000-09-17 18:59 Geert Uytterhoeven
2000-09-19 3:59 ` Paul Mackerras
0 siblings, 1 reply; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-17 18:59 UTC (permalink / raw)
To: Linux/PPC Development
While reading the test9-pre2 diff, I saw
--- native-2.4.0-test9-pre1/include/asm-ppc/io.h Sat Jun 24 10:37:42 2000
+++ native-2.4.0-test9-pre2/include/asm-ppc/io.h Sun Sep 17 19:59:42 2000
@@ -123,6 +181,8 @@
*/
extern void *__ioremap(unsigned long address, unsigned long size,
unsigned long flags);
+extern void *__ioremap_at(unsigned long phys, unsigned long size,
+ unsigned long flags);
extern void *ioremap(unsigned long address, unsigned long size);
#define ioremap_nocache(addr, size) ioremap((addr), (size))
extern void iounmap(void *addr);
and wondered what __ioremap_at() is used for? It's nowhere defined.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-17 18:59 Geert Uytterhoeven
@ 2000-09-19 3:59 ` Paul Mackerras
2000-09-19 5:56 ` Michel Lanners
` (2 more replies)
0 siblings, 3 replies; 74+ messages in thread
From: Paul Mackerras @ 2000-09-19 3:59 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Linux/PPC Development
Geert Uytterhoeven writes:

[snip]
> +extern void *__ioremap_at(unsigned long phys, unsigned long size,
> + unsigned long flags);

> and wondered what __ioremap_at() is used for? It's nowhere defined.

Ah, did that leak into the patch? It didn't need to go in but it
doesn't hurt I guess.

What I am intending to do is to map the I/O space of all the PCI host
bridges in consecutive areas beginning at some address such as
0xff000000, with some amount of space such as 64kB or 1MB per bridge,
whatever is appropriate. Then we adjust the I/O port numbers in the
pci_dev structures by adding on host_bridge_nr * space_per_bridge. As
a side effect, isa_io_base (which is really pci_io_base) becomes a
constant.

To do this we need a version of ioremap which takes a virtual address
as well as a physical address. This is what __ioremap_at was intended
to be. I got as far as duplicating the __ioremap declaration before
deciding that I needed to think about it a bit more. :-)

Specifically I need some input from people working on 8xx and prep
systems as to whether this idea will cause any problems. It may make
it more difficult to use a BAT to map I/O regions. I notice that prep
still uses isa_io_base = 0x80000000 and maps the whole 256MB starting
at 0x80000000 with a BAT.

Paul.
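A minimal sketch of the port-number adjustment described in the message above, assuming a 1MB window per host bridge (the message also mentions 64kB as a possibility). The names `IO_SPACE_PER_BRIDGE`, `struct io_resource`, `fixup_io_resource()` and `fixup_all_io_resources()` are hypothetical stand-ins for illustration; they are not the kernel's `pci_dev` or any existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: I/O window reserved for each host bridge. */
#define IO_SPACE_PER_BRIDGE 0x100000UL   /* 1MB, an assumption */

/* Simplified stand-in for one I/O resource of a device (not struct pci_dev). */
struct io_resource {
    unsigned int  host_bridge_nr;  /* 0 for the first host bridge */
    unsigned long start;           /* first port, as read from the BAR */
    unsigned long end;             /* last port, inclusive */
};

/* Shift one bus-local port range into the window of its host bridge. */
static void fixup_io_resource(struct io_resource *res)
{
    unsigned long offset = res->host_bridge_nr * IO_SPACE_PER_BRIDGE;

    /* The bus-local range has to fit inside a single per-bridge window. */
    assert(res->start <= res->end && res->end < IO_SPACE_PER_BRIDGE);

    res->start += offset;
    res->end   += offset;
}

/* Apply the fixup to every resource in an array. */
static void fixup_all_io_resources(struct io_resource *res, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fixup_io_resource(&res[i]);
}
```

Devices behind the first bridge get an offset of zero, so legacy port numbers there are left unchanged; only devices behind the second and later bridges see their port numbers move.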
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 3:59 ` Paul Mackerras
@ 2000-09-19 5:56 ` Michel Lanners
2000-09-19 14:28 ` Dan Malek
2000-09-19 22:06 ` Matt Porter
2 siblings, 0 replies; 74+ messages in thread
From: Michel Lanners @ 2000-09-19 5:56 UTC (permalink / raw)
To: paulus; +Cc: geert, linuxppc-dev
Hi Paul,

On 19 Sep, this message from Paul Mackerras echoed through cyberspace:
> What I am intending to do is to map the I/O space of all the PCI host
> bridges in consecutive areas beginning at some address such as
> 0xff000000, with some amount of space such as 64kB or 1MB per bridge,
> whatever is appropriate. Then we adjust the I/O port numbers in the
> pci_dev structures by adding on host_bridge_nr * space_per_bridge. As
> a side effect, isa_io_base (which is really pci_io_base) becomes a
> constant.

Good idea. Keep in mind, however, that not all bridges reserve the same
amount of space for IO. So you either need to cope with that, or choose
smaller, fixed-size regions, and be prepared to squeeze all devices into
that space. Which shouldn't be a problem, since I can't see why someone
would want to access something larger than a few bytes via IO...

Michel

-------------------------------------------------------------------------
Michel Lanners            | " Read Philosophy. Study Art.
23, Rue Paul Henkes       |   Ask Questions. Make Mistakes.
L-1710 Luxembourg         |
email mlan@cpu.lu         |
http://www.cpu.lu/~mlan   |   Learn Always. "
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 3:59 ` Paul Mackerras
2000-09-19 5:56 ` Michel Lanners
@ 2000-09-19 14:28 ` Dan Malek
2000-09-19 18:31 ` Roman Zippel
2000-09-19 22:06 ` Matt Porter
2 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-19 14:28 UTC (permalink / raw)
To: paulus; +Cc: Geert Uytterhoeven, Linux/PPC Development
Paul Mackerras wrote:

> What I am intending to do is to map the I/O space of all the PCI host
> bridges in consecutive areas beginning at some address such as
> 0xff000000,

Hmmm.....

> To do this we need a version of ioremap which takes a virtual address
> as well as a physical address.

Hmmm.....

> Specifically I need some input from people working on 8xx and prep
> systems as to whether this idea will cause any problems.

OK. On the 8xx, I currently rely on the effect that ioremap will
map virt->phys 1:1 before the kernel VM is initialized. Often this
is space above 0xf0000000. Since I understand what you are trying
to do, I can probably change this with little effort.

This brings up another question....what happens if you ioremap before
VM is set up?

Mapping through BATs usually covers most of this, but more processors
arriving (the IBM 4xx) don't have BATs either so we rely on page
tables of some sort.

The PReP machines (and I thought most systems) flip back and forth
between VM not/enabled during start up, and have a few things like
UARTs mapped 1:1 virt->phys (at least for debug).

In general, I like what you are doing because I am struggling to find
a better I/O mapping solution for some of these embedded processors.
In particular I need a bus_to_virt() (or whatever we call it) that can
work on mapped addresses. I was thinking about fixing some virtual
address ranges so I could do this more easily, and you are sort of
doing the same.

> ..... It may make
> it more difficult to use a BAT to map I/O regions. I notice that prep
> still uses isa_io_base = 0x80000000 and maps the whole 256MB starting
> at 0x80000000 with a BAT.

Yes, that is a pretty nice feature......

I'm thinking.....(I'm thinking that Matt Porter should provide some
insight....he is used to working with Linux/PPC on systems with a
dozen or so PCI bridges....and likes it :-).

-- Dan
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 14:28 ` Dan Malek
@ 2000-09-19 18:31 ` Roman Zippel
2000-09-19 20:09 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: Roman Zippel @ 2000-09-19 18:31 UTC (permalink / raw)
To: Dan Malek; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Hi,

> OK. On the 8xx, I currently rely on the effect that ioremap will
> map virt->phys 1:1 before the kernel VM is initialized.

I'm curious, what needs ioremap before the VM is ready?

> Mapping through BATs usually covers most of this, but more processors
> arriving (the IBM 4xx) don't have BATs either so we rely on page
> tables of some sort.

Shouldn't it be possible, to add such stuff directly to the hash table
and add the official mapping later?

BTW the whole mm stuff really needs a big cleanup, but I don't really
want to touch it, as I could only test this on my APUS machine, which
has only the CPU in common with the other machines.

bye, Roman
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 18:31 ` Roman Zippel
@ 2000-09-19 20:09 ` Dan Malek
2000-09-19 23:42 ` Roman Zippel
0 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-19 20:09 UTC (permalink / raw)
To: Roman Zippel; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Roman Zippel wrote:

> I'm curious, what needs ioremap before the VM is ready?

The IMMR (internal memory map to almost everything in the chip) has to
be mapped to provide access to a variety of bits for initialization.
On some boards, the board control/status register has to be mapped and
configured. I hope people don't forget that this is done on other
platforms with BATs as well, it just isn't as obvious as the 4xx/8xx.

> Shouldn't it be possible, to add such stuff directly to the hash table
> and add the official mapping later?

In many cases (and certainly the 8xx) the mapping done early is assumed
to be the address used throughout the life of the system. Using one
set of mapping early, and then something else later is quite confusing
when you have global pointers like immr, you have to update internal
processor registers when it changes, and internal devices that you
previously initialized use the old value.

> BTW the whole mm stuff really needs a big cleanup,

Heh...This quote has been in e-mail messages for years :-).

I don't think we need lots of changes, but it should continue to
evolve into something more efficient.

-- Dan

--
I like MMUs because I don't have a real life.
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 20:09 ` Dan Malek
@ 2000-09-19 23:42 ` Roman Zippel
2000-09-20 0:10 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: Roman Zippel @ 2000-09-19 23:42 UTC (permalink / raw)
To: Dan Malek; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Hi,

> > Shouldn't it be possible, to add such stuff directly to the hash table
> > and add the official mapping later?
>
> In many cases (and certainly the 8xx) the mapping done early is assumed
> to be the address used throughout the life of the system. Using one
> set of mapping early, and then something else later is quite confusing
> when you have global pointers like immr, you have to update internal
> processor registers when it changes, and internal devices that you
> previously initialized use the old value.

Fixed addresses shouldn't be that much of a problem. Either you allocate
them somewhere after VMALLOC_END or you adjust VMALLOC_START. I think
currently the first is done. Anyway, IMO the map_page() call can be
delayed; we would only need a C implementation of parts of hashtable.S,
which might be useful for other stuff as well.

> > BTW the whole mm stuff really needs a big cleanup,
>
> Heh...This quote has been in e-mail messages for years :-).
>
> I don't think we need lots of changes, but it should continue to
> evolve into something more efficient.

Something I would like to throw out first is mem_pieces.c. It can be
replaced now with the new bootmem stuff, but that would require finding
some unused piece of memory for the bootmem map; everything else can
then be done nicely with bootmem. Most of the functions in mem_pieces.c
are also in bootmem.c now, except
mem_pieces_sort()/mem_pieces_coalesce(). But the funny thing here is,
these two functions are only called by get_mem_prop(), which is only
called by pmac_find_end_of_memory(), which is then completely confused
if it doesn't find a single piece of memory starting at zero.

I could do the generic part, but I simply don't know all the machine
specific details, besides what I can read/guess from the source (I'm
already responsible for all the bugs in the m68k implementation, so I
have some experience with it. :) )

bye, Roman
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 23:42 ` Roman Zippel
@ 2000-09-20 0:10 ` Dan Malek
2000-09-20 17:18 ` Roman Zippel
0 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-20 0:10 UTC (permalink / raw)
To: Roman Zippel; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Roman Zippel wrote:

> Fixed addresses shouldn't be that much of a problem.

They are no problem today....It would be nice if it stayed that way :-).

> .... we would only need a C implementation from parts hashtable.S,
> what might be useful for other stuff as well.

Except only the 7xx (601?, 604) use the hash table. It's not a generic
MMU place.

> Something I would like to throw out first is mem_pieces.c....
> ... pmac_find_end_of_memory(), which then is completely confused,

I am usually confused by this point, too :-). I then just tip-toe
away thinking there are better places I can spend my time.

-- Dan
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 0:10 ` Dan Malek
@ 2000-09-20 17:18 ` Roman Zippel
2000-09-20 18:11 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: Roman Zippel @ 2000-09-20 17:18 UTC (permalink / raw)
To: Dan Malek; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Hi,

> Except only the 7xx (601?, 604) use the hash table. It's not a generic
> MMU place.

Hmm, I think I have to order some new documentation. :-)
What are some important cpus I should look into (except 60x)?

bye, Roman
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 17:18 ` Roman Zippel
@ 2000-09-20 18:11 ` Dan Malek
2000-09-20 20:22 ` Roman Zippel
2000-09-20 20:41 ` David Edelsohn
0 siblings, 2 replies; 74+ messages in thread
From: Dan Malek @ 2000-09-20 18:11 UTC (permalink / raw)
To: Roman Zippel; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Roman Zippel wrote:

> Hmm, I think I have to order some new documentation. :-)
> What are some important cpus I should look into (except 60x)?

None of the embedded PowerPCs use a hash table, and the Book E
processors don't either. We avoid using the hash table when given
the option, like 603s. The "hardware assist" and subsequent hash
table is a PITA for Linux VM, and Cort has documented the performance
benefits of not using it.

-- Dan

--
I like MMUs because I don't have a real life.
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 18:11 ` Dan Malek
@ 2000-09-20 20:22 ` Roman Zippel
2000-09-20 20:41 ` David Edelsohn
1 sibling, 0 replies; 74+ messages in thread
From: Roman Zippel @ 2000-09-20 20:22 UTC (permalink / raw)
To: Dan Malek; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development
Hi,

On Wed, 20 Sep 2000, Dan Malek wrote:

> None of the embedded PowerPCs use a hash table, and the Book E
> processors don't either. We avoid using the hash table when given
> the option, like 603s. The "hardware assist" and subsequent hash
> table is a PITA for Linux VM, and Cort has documented the performance
> benefits of not using it.

Oh, I forgot about that part. On m68k we have special init functions for
that, which use alloc_bootmem_low_pages(). It should maybe be split for
ppc too, so that we have a special (and very simple) remap function that
is only used during very early initialization, and all the mem_init_done
checks could be removed.

bye, Roman
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 18:11 ` Dan Malek
2000-09-20 20:22 ` Roman Zippel
@ 2000-09-20 20:41 ` David Edelsohn
2000-09-21 2:16 ` Dan Malek
1 sibling, 1 reply; 74+ messages in thread
From: David Edelsohn @ 2000-09-20 20:41 UTC (permalink / raw)
To: Dan Malek; +Cc: Roman Zippel, paulus, Geert Uytterhoeven, Linux/PPC Development
>>>>> Dan Malek writes:

Dan> None of the embedded PowerPCs use a hash table, and the Book E
Dan> processors don't either. We avoid using the hash table when given
Dan> the option, like 603s. The "hardware assist" and subsequent hash
Dan> table is a PITA for Linux VM, and Cort has documented the performance
Dan> benefits of not using it.

Avoiding the hash table on PowerPC only is an advantage because
the PowerPC Linux kernel implements some "optimizations" which do not
match the PowerPC architecture very well. Avoiding the hash tables
recovers some of the performance lost because of the earlier work, so it
is a relative benefit but not an absolute benefit over a better kernel
design.

David
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 20:41 ` David Edelsohn
@ 2000-09-21 2:16 ` Dan Malek
2000-09-21 2:26 ` David Edelsohn
0 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-21 2:16 UTC (permalink / raw)
To: David Edelsohn
Cc: Roman Zippel, paulus, Geert Uytterhoeven, Linux/PPC Development
David Edelsohn wrote:

> Avoiding the hash table on PowerPC only is an advantage because
> the PowerPC Linux kernel implements some "optimizations" which do not
> match the PowerPC architecture very well.

Right....I thought I clearly said "Linux VM". If the main platform
of focus and development was PowerPC (although I thought it was :-),
the Linux VM subsystem would look a little different....

-- Dan
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 2:16 ` Dan Malek
@ 2000-09-21 2:26 ` David Edelsohn
2000-09-21 2:40 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: David Edelsohn @ 2000-09-21 2:26 UTC (permalink / raw)
To: Dan Malek; +Cc: Roman Zippel, paulus, Geert Uytterhoeven, Linux/PPC Development
Apparently I was too ambiguous in my reply. The mis-optimization
is PowerPC Linux-specific, this is not a general Linux kernel or Linux VM
problem.

David
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 2:26 ` David Edelsohn
@ 2000-09-21 2:40 ` Dan Malek
2000-09-21 3:53 ` David Edelsohn
0 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-21 2:40 UTC (permalink / raw)
To: David Edelsohn
Cc: Roman Zippel, paulus, Geert Uytterhoeven, Linux/PPC Development
David Edelsohn wrote:

> ...... The mis-optimization
> is PowerPC Linux-specific, this is not a general Linux kernel or Linux VM
> problem.

Really (seriously)? I always thought if the Linux VM used a different
structure and method of accessing page tables (upper entries are more
than just a software address pointer for example), we could really
optimize the PowerPC kernel.

Toss out some ideas......I'm listening :-).

-- Dan
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 2:40 ` Dan Malek
@ 2000-09-21 3:53 ` David Edelsohn
0 siblings, 0 replies; 74+ messages in thread
From: David Edelsohn @ 2000-09-21 3:53 UTC (permalink / raw)
To: Dan Malek; +Cc: Roman Zippel, paulus, Geert Uytterhoeven, Linux/PPC Development
>>>>> Dan Malek writes:

Dan> Really (seriously)? I always thought if the Linux VM used a different
Dan> structure and method of accessing page tables (upper entries are more
Dan> than just a software address pointer for example), we could really
Dan> optimize the PowerPC kernel.

Dan> Toss out some ideas......I'm listening :-).

PowerPC Linux appears to be generating many hash table misses
because it allocates a new VSID rather than unmapping multiple pages from
the page table. This also means that PowerPC Linux cannot be exploiting
the dirty bit in the page/hash table entry and presumably encounters
double misses on write faults.

In the K42 Research Operating System, on which I work, we assume
that hash table misses are so infrequent that we handle them as in-core
page faults. With a hash table 4 times the size of physical memory, and a
good spread of entries across them, this seems reasonable. On the PowerPC
architecture, one normally should not be encountering enough hash table
misses so that handling them quickly is an issue.

The software TLB trick using the Linux VM page table is neat, but
with speculative execution and superscalar, highly-pipelined processors,
handling them in SW means that you suffer a huge performance penalty
because you introduce a barrier / bubble on every TLB miss. With hardware
TLB handling, the processor can freeze the pipeline and handle the miss
with much reduced cost.

David
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 3:59 ` Paul Mackerras
2000-09-19 5:56 ` Michel Lanners
2000-09-19 14:28 ` Dan Malek
@ 2000-09-19 22:06 ` Matt Porter
2000-09-19 22:58 ` Paul Mackerras
2000-09-20 12:08 ` Geert Uytterhoeven
2 siblings, 2 replies; 74+ messages in thread
From: Matt Porter @ 2000-09-19 22:06 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Geert Uytterhoeven, Linux/PPC Development
On Tue, Sep 19, 2000 at 02:59:02PM +1100, Paul Mackerras wrote:
> What I am intending to do is to map the I/O space of all the PCI host
> bridges in consecutive areas beginning at some address such as
> 0xff000000, with some amount of space such as 64kB or 1MB per bridge,
> whatever is appropriate. Then we adjust the I/O port numbers in the
> pci_dev structures by adding on host_bridge_nr * space_per_bridge. As
> a side effect, isa_io_base (which is really pci_io_base) becomes a
> constant.

Specifically, what problem are you trying to solve with this
implementation? I gather that we're talking about the legacy I/O
problem that the kernel has. What are the cases where you need
to use in*/out* calls targeting devices on host bridges other
than the "primary" one?

I recently did a port to the SBS K2 cPCI board which involves the
IBM CPC710 dual host bridge. Due to the multi host bridge legacy
I/O difficulty, a documented assumption is that legacy I/O calls
would only be used on the "primary" host bridge. I realize there
are plenty of drivers (like de4x5) that insist on using inw/outw
(and thus break on host bridge 2) but these drivers should be
fixed.

In the long run, the legacy I/O compatibility calls need to be
completely wiped out for non-x86 architectures. It is all
memory mapped after all and only a few drivers like serial and
IDE would need separate low level access paths for x86/non-x86
architectures.

If we want a short term solution why not just per architecture
fixups to the pci_dev entries with the appropriate offsets when
on a multi host bridge machine? This allows each architecture/board
maximum flexibility with their memory map and BAT settings.

> To do this we need a version of ioremap which takes a virtual address
> as well as a physical address. This is what __ioremap_at was intended
> to be. I got as far as duplicating the __ioremap declaration before
> deciding that I needed to think about it a bit more. :-)
>
> Specifically I need some input from people working on 8xx and prep
> systems as to whether this idea will cause any problems. It may make
> it more difficult to use a BAT to map I/O regions. I notice that prep
> still uses isa_io_base = 0x80000000 and maps the whole 256MB starting
> at 0x80000000 with a BAT.

If we can't cover I/O with a BAT then it will definitely have some
ramifications with serial ports in the legacy I/O range among other
things.

--
Matt Porter
MontaVista Software, Inc.
mporter@mvista.com
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 22:06 ` Matt Porter
@ 2000-09-19 22:58 ` Paul Mackerras
2000-09-20 6:12 ` Matt Porter
` (2 more replies)
2000-09-20 12:08 ` Geert Uytterhoeven
1 sibling, 3 replies; 74+ messages in thread
From: Paul Mackerras @ 2000-09-19 22:58 UTC (permalink / raw)
To: Linux/PPC Development
Matt Porter writes:

> On Tue, Sep 19, 2000 at 02:59:02PM +1100, Paul Mackerras wrote:
> > What I am intending to do is to map the I/O space of all the PCI host
> > bridges in consecutive areas beginning at some address such as
> > 0xff000000, with some amount of space such as 64kB or 1MB per bridge,
> > whatever is appropriate. Then we adjust the I/O port numbers in the
> > pci_dev structures by adding on host_bridge_nr * space_per_bridge. As
> > a side effect, isa_io_base (which is really pci_io_base) becomes a
> > constant.
>
> Specifically, what problem are you trying to solve with this
> implementation? I gather that we're talking about the legacy I/O
> problem that the kernel has. What are the cases where you need
> to use in*/out* calls targeting devices on host bridges other
> than the "primary" one?

This is the situation: you have a machine with 2 or more PCI host
bridges. You plug a board into the PCI bus behind the 2nd host
bridge. The board has registers in PCI I/O space. An address is
assigned for those registers in the BAR in config space. If you read
the BAR and then do an inb from that port number, you don't get the
I/O port on your board.

One solution that has been proposed is to set the base I/O port number
in the pci_dev structure to be actually the virtual address where you
can access that I/O port. I don't like that solution because it means
that drivers for legacy PC-style devices can't do inb/outb to the
usual well-known port numbers and find the device they expect. For
example, inb(0x3f8) won't access the first serial port (this is on
machines such as prep and some chrp which have a lot of PC-style
devices).

My solution is to allocate say 1MB of I/O space per host bridge and
then adjust the pci_dev structures for the devices behind the 2nd and
subsequent host bridges. So for example a board that has I/O ports at
0x1000 behind the 2nd host bridge would end up with its
pci_dev->resource[0].start == 0x101000. The virtual <-> physical
mappings are set up so that the 2nd host bridge's I/O space is mapped
in starting at 0xff100000. The result is that doing inb(0x101000)
accesses the device as expected.

> I recently did a port to the SBS K2 cPCI board which involves the
> IBM CPC710 dual host bridge. Due to the multi host bridge legacy
> I/O difficulty, a documented assumption is that legacy I/O calls
> would only be used on the "primary" host bridge. I realize there
> are plenty of drivers (like de4x5) that insist on using inw/outw
> (and thus break on host bridge 2) but these drivers should be
> fixed.

Fixed how? I mean, how are you generically going to access PCI I/O
space without using inw/outw etc.? On intel that is the only possible
way to do it so I just cannot see that you will persuade driver
authors that they shouldn't use inw/outw.

> In the long run, the legacy I/O compatibility calls need to be
> completely wiped out for non-x86 architectures. It is all
> memory mapped after all and only a few drivers like serial and
> IDE would need separate low level access paths for x86/non-x86
> architectures.

I disagree. There will always be PCI devices with registers in PCI
I/O space. I really don't want to see a situation where drivers get
littered with ifdefs because you have to use different access
functions for accessing PCI I/O space on non-intel machines and intel
machines. That would be just totally counter-productive.

> If we can't cover I/O with a BAT then it will definitely have some
> ramifications with serial ports in the legacy I/O range among other
> things.

Why? Are you concerned about very early debugging? If not, what?

Paul.

--
Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
paulus@linuxcare.com.au, http://www.linuxcare.com.au/
Linuxcare. Support for the revolution.
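The worked example in the message above (port 0x1000 behind the 2nd host bridge becomes port 0x101000, reached at virtual address 0xff101000) can be checked with a small sketch. It assumes the example values from the thread: a virtual base of 0xff000000 and 1MB per bridge. The macro and function names, and the two-bridge limit, are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_IO_VIRT_BASE    UINT32_C(0xff000000) /* example base from the thread */
#define IO_SPACE_PER_BRIDGE UINT32_C(0x100000)   /* 1MB per bridge, as in the example */
#define NR_HOST_BRIDGES     UINT32_C(2)          /* assumption for this sketch */

/* Virtual address that an inb(port) would touch under this layout. */
static uint32_t io_port_to_virt(uint32_t port)
{
    /* Ports beyond the last bridge window are not mapped. */
    assert(port < NR_HOST_BRIDGES * IO_SPACE_PER_BRIDGE);
    return PCI_IO_VIRT_BASE + port;
}

/* Which host bridge a global port number belongs to (0 = first bridge). */
static uint32_t io_port_to_bridge(uint32_t port)
{
    return port / IO_SPACE_PER_BRIDGE;
}

/* The bus-local port number, i.e. what the BAR on that bus contains. */
static uint32_t io_port_to_local(uint32_t port)
{
    return port % IO_SPACE_PER_BRIDGE;
}
```

Under these assumptions a legacy port such as 0x3f8 stays at 0x3f8 on the first bridge and lands at 0xff0003f8, while 0x101000 decodes to bridge 1, local port 0x1000.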
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 22:58 ` Paul Mackerras
@ 2000-09-20 6:12 ` Matt Porter
2000-09-20 12:15 ` Geert Uytterhoeven
2000-09-20 23:08 ` Paul Mackerras
2000-09-20 8:34 ` Roman Zippel
2000-09-20 15:56 ` Dan Malek
2 siblings, 2 replies; 74+ messages in thread
From: Matt Porter @ 2000-09-20 6:12 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Linux/PPC Development

On Wed, Sep 20, 2000 at 09:58:07AM +1100, Paul Mackerras wrote:
>
> Matt Porter writes:
>
> > On Tue, Sep 19, 2000 at 02:59:02PM +1100, Paul Mackerras wrote:
> > > What I am intending to do is to map the I/O space of all the PCI host
> > > bridges in consecutive areas beginning at some address such as
> > > 0xff000000, with some amount of space such as 64kB or 1MB per bridge,
> > > whatever is appropriate. Then we adjust the I/O port numbers in the
> > > pci_dev structures by adding on host_bridge_nr * space_per_bridge. As
> > > a side effect, isa_io_base (which is really pci_io_base) becomes a
> > > constant.
> >
> > Specifically, what problem are you trying to solve with this
> > implementation? I gather that we're talking about the legacy I/O
> > problem that the kernel has. What are the cases where you need
> > to use in*/out* calls targetting devices on host bridges other
> > than the "primary" one?
>
> This is the situation: you have a machine with 2 or more PCI host
> bridges. You plug a board into the PCI bus behind the 2nd host
> bridge. The board has registers in PCI I/O space. An address is
> assigned for those registers in the BAR in config space. If you read
> the BAR and then do an inb from that port number, you don't get the
> I/O port on your board.

Ok then, that's what I described.

> One solution that has been proposed is to set the base I/O port number
> in the pci_dev structure to be actually the virtual address where you
> can access that I/O port. I don't like that solution because it means
> that drivers for legacy PC-style devices can't do inb/outb to the
> usual well-known port numbers and find the device they expect. For
> example, inb(0x3f8) won't access the first serial port (this is on
> machines such as prep and some chrp which have a lot of PC-style
> devices).

Ick...agree here. Of course, the serial driver could be changed to
do other than inb/outb's for other archs and be passed the memory
mapped address of the ports.

> My solution is to allocate say 1MB of I/O space per host bridge and
> then adjust the pci_dev structures for the devices behind the 2nd and
> subsequent host bridges. So for example a board that has I/O ports at
> 0x1000 behind the 2nd host bridge would end up with its
> pci_dev->resource[0].start == 0x101000. The virtual <-> physical
> mappings are set up so that the 2nd host bridge's I/O space is mapped
> in starting at 0xff100000. The result is that doing inb(0x101000)
> accesses the device as expected.

This looks to be a good solution if we can't resolve to make the kernel
less legacy-oriented. I changed the default K2 board memory map to
do a similar thing by orienting the 2nd host bridge right after the
1st in the physical map. The virt->phys mapping was then 1:1 and things
work as you describe above. Obviously, this approach won't work when
you have separate bridge ASICs since you can't get the memory map
granularity that I had in the dual host bridge package.

> > I recently did a port to the SBS K2 cPCI board which involves the
> > IBM CPC710 dual host bridge. Due to the multi host bridge legacy
> > I/O difficulty, a documented assumption is that legacy I/O calls
> > would only be used on the "primary" host bridge. I realize there
> > are plenty of drivers (like de4x5) that insist on using inw/outw
> > (and thus break on host bridge 2) but these drivers should be
> > fixed.
>
> Fixed how? I mean, how are you generically going to access PCI I/O
> space without using inw/outw etc.? On intel that is the only possible
> way to do it so I just cannot see that you will persuade driver
> authors that they shouldn't use inw/outw.

Well, on drivers like the de4x5 you switch to use the mem bar instead
of the I/O bar and then it's generic. There are lots of drivers that
given the choice, chose to use I/O calls. These can easily be fixed.

> > In the long run, the legacy I/O compatibility calls need to be
> > completely wiped out for non-x86 architectures. It is all
> > memory mapped after all and only a few drivers like serial and
> > IDE would need separate low level access paths for x86/non-x86
> > architectures.
>
> I disagree. There will always be PCI devices with registers in PCI
> I/O space. I really don't want to see a situation where drivers get
> littered with ifdefs because you have to use different access
> functions for accessing PCI I/O space on non-intel machines and intel
> machines. That would be just totally counter-productive.

PCI I/O is memory mapped on everything but x86. In most cases PCI
devices provide bars that map the same set of registers into both
I/O space and mem space. That leaves only the true legacy devices
which have no memory bars to be concerned with. It's a smaller
set of drivers that need different low level access methods than
you're making it out to be.

> > If we can't cover I/O with a BAT then it will definitely have some
> > ramifications with serial ports in the legacy I/O range among other
> > things.
>
> Why? Are you concerned about very early debugging? If not, what?

Sure, that is a convenience of the BAT mapping. It can all be fixed
though if that's the answer you're looking for. :)

--
Matt Porter
MontaVista Software, Inc.
mporter@mvista.com

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
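The port-number adjustment Paul proposes in the message above can be
sketched roughly as follows. This is only an illustration: the 1MB window
and the 0xff000000 virtual base are the figures from his examples, while
the macro and function names are hypothetical and do not come from any
kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the examples in the thread; names are hypothetical. */
#define IO_VIRT_BASE        0xff000000u /* virtual base of bridge 0's I/O space */
#define IO_SPACE_PER_BRIDGE 0x00100000u /* 1MB of I/O space per host bridge */

/* Port number as it would be stored in pci_dev->resource[].start:
 * the value read from the BAR plus host_bridge_nr * space_per_bridge. */
static uint32_t adjusted_port(uint32_t host_bridge_nr, uint32_t bar_port)
{
    assert(bar_port < IO_SPACE_PER_BRIDGE);
    return host_bridge_nr * IO_SPACE_PER_BRIDGE + bar_port;
}

/* Virtual address that inb()/outb() on that port would end up touching,
 * given that the bridge windows are mapped back to back. */
static uint32_t port_to_virt(uint32_t port)
{
    return IO_VIRT_BASE + port;
}
```

With these assumptions, a board at 0x1000 behind the 2nd bridge
(host_bridge_nr 1) gets port 0x101000, which lands at virtual address
0xff101000, while inb(0x3f8) on the first bridge still resolves to
0xff0003f8.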
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 6:12 ` Matt Porter
@ 2000-09-20 12:15 ` Geert Uytterhoeven
2000-09-20 23:08 ` Paul Mackerras
1 sibling, 0 replies; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-20 12:15 UTC (permalink / raw)
To: Matt Porter; +Cc: Paul Mackerras, Linux/PPC Development

On Tue, 19 Sep 2000, Matt Porter wrote:
> On Wed, Sep 20, 2000 at 09:58:07AM +1100, Paul Mackerras wrote:
> > Matt Porter writes:
> > > On Tue, Sep 19, 2000 at 02:59:02PM +1100, Paul Mackerras wrote:
> > One solution that has been proposed is to set the base I/O port number
> > in the pci_dev structure to be actually the virtual address where you
> > can access that I/O port. I don't like that solution because it means
> > that drivers for legacy PC-style devices can't do inb/outb to the
> > usual well-known port numbers and find the device they expect. For
> > example, inb(0x3f8) won't access the first serial port (this is on
> > machines such as prep and some chrp which have a lot of PC-style
> > devices).

Another option would be to use a translation table. If you limit I/O
space to 64 kB (like on PC), you waste only 256 kB of memory (assuming
a table with 32-bit pointers). With a multi-level tree or a smarter
pointer scheme, you can limit the waste even more. Since I/O accesses
are intrinsically slow, the overhead is minimal.

I agree this solution is not optimal, though.

> Ick...agree here. Of course, the serial driver could be changed to
> do other than inb/outb's for other archs and be passed the memory
> mapped address of the ports.

These days serial.c does support MMIO on PCI devices. In fact it
supports everything that looks even remotely like a NS16550 (hence it
should be renamed 16550.c), both in I/O and in memory space.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
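The 256 kB figure in Geert's message follows from one 32-bit pointer per
port number in a 64 kB I/O space. A small sketch of that arithmetic, plus
one possible coarser table, is below. The block-granular variant is an
assumption made for illustration (Geert only mentions "a multi-level tree
or a smarter pointer scheme"), and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IO_SPACE_SIZE   0x10000u /* 64 kB of port numbers, as on a PC */
#define PORTS_PER_BLOCK 0x100u   /* hypothetical block size for the coarser table */
#define NUM_BLOCKS      (IO_SPACE_SIZE / PORTS_PER_BLOCK)

/* Flat table: one 32-bit pointer per port number. */
static size_t flat_table_bytes(void)
{
    return (size_t)IO_SPACE_SIZE * sizeof(uint32_t);
}

/* Coarser table: one 32-bit base address per block of ports.
 * A base of 0 is used here to mean "block not mapped". */
static uint32_t block_base[NUM_BLOCKS];

/* Translate a port number to the address an accessor would use. */
static uint32_t port_to_virt(uint16_t port)
{
    uint32_t base = block_base[port / PORTS_PER_BLOCK];

    assert(base != 0); /* looking up an unmapped block is a caller bug */
    return base + port % PORTS_PER_BLOCK;
}
```

The flat table costs 65536 * 4 bytes = 256 kB, matching the figure in the
message. The block table in this sketch costs 256 * 4 bytes = 1 kB, at the
price of requiring each block of 256 ports to be contiguous in the
address space.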
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 6:12 ` Matt Porter
2000-09-20 12:15 ` Geert Uytterhoeven
@ 2000-09-20 23:08 ` Paul Mackerras
2000-09-21 20:12 ` Matt Porter
1 sibling, 1 reply; 74+ messages in thread
From: Paul Mackerras @ 2000-09-20 23:08 UTC (permalink / raw)
To: Matt Porter; +Cc: Linux/PPC Development

Matt Porter writes:

> Ick...agree here. Of course, the serial driver could be changed to
> do other than inb/outb's for other archs and be passed the memory
> mapped address of the ports.

And the parallel port driver, and the floppy driver, and the
keyboard/mouse drivers, and the ...

> This looks to be a good solution if we can't resolve to make the kernel
> less legacy-oriented. I changed the default K2 board memory map to

If we only supported powermacs, and not preps or chrps, then we
probably could make the kernel less "legacy-oriented". :-) :-)

> do a similar thing by orienting the 2nd host bridge right after the
> 1st in the physical map. The virt->phys mapping was then 1:1 and things
> work as you describe above. Obviously, this approach won't work when
> you have separate bridge ASICs since you can't get the memory map
> granularity that I had in the dual host bridge package.

So we set up the virtual -> physical mapping to suit ourselves. Not a
problem.

> Well, on drivers like the de4x5 you switch to use the mem bar instead
> of the I/O bar and then it's generic. There are lots of drivers that
> given the choice, chose to use I/O calls. These can easily be fixed.

But what is wrong with using registers in I/O space and accessing them
with in*/out*? I don't understand the aversion some people have shown
to using in*/out*. They are the accessor functions for PCI I/O space,
that's all. Whatever happens, we need accessor functions for PCI I/O
space. We can call them something different or make them more
complicated but that would just create extra work for ourselves and
make it harder to port drivers between architectures.

> PCI I/O is memory mapped on everything but x86. In most cases PCI

So??? What we have is a level of abstraction that hides (from driver
authors) exactly how PCI memory and I/O space are accessed. We have
readb/writeb etc. for PCI memory space and inb/outb etc. for PCI I/O
space. The fact that on PPC both are mapped into the processor's
address space and accessed using loads and stores is something that
drivers should not care in the slightest about. And in fact these
spaces are not accessed this way on some other non-intel platforms;
e.g. sparc64 uses special ldasi/stasi instructions.

> devices provide bars that map the same set of registers into both
> I/O space and mem space. That leaves only the true legacy devices
> which have no memory bars to be concerned with. It's a smaller
> set of drivers that need different low level access methods than
> you're making it out to be.

The fact remains that a lot of drivers will continue to use inb/outb
etc., since they are developed primarily on the intel platform. It
makes sense for us to have an inb/outb that works similarly to intel,
so that we can use as many drivers as possible with as little pain as
possible.

Paul.

--
Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
paulus@linuxcare.com.au, http://www.linuxcare.com.au/
Linuxcare. Support for the revolution.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 23:08 ` Paul Mackerras
@ 2000-09-21 20:12 ` Matt Porter
0 siblings, 0 replies; 74+ messages in thread
From: Matt Porter @ 2000-09-21 20:12 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Linux/PPC Development

On Thu, Sep 21, 2000 at 10:08:37AM +1100, Paul Mackerras wrote:
>
> Matt Porter writes:
>
> > Ick...agree here. Of course, the serial driver could be changed to
> > do other than inb/outb's for other archs and be passed the memory
> > mapped address of the ports.
>
> And the parallel port driver, and the floppy driver, and the
> keyboard/mouse drivers, and the ...

I'm not trying to say it's easy... :) It is a limited set of drivers
though.

> > This looks to be a good solution if we can't resolve to make the kernel
> > less legacy-oriented. I changed the default K2 board memory map to
>
> If we only supported powermacs, and not preps or chrps, then we
> probably could make the kernel less "legacy-oriented". :-) :-)
>
> > do a similar thing by orienting the 2nd host bridge right after the
> > 1st in the physical map. The virt->phys mapping was then 1:1 and things
> > work as you describe above. Obviously, this approach won't work when
> > you have separate bridge ASICs since you can't get the memory map
> > granularity that I had in the dual host bridge package.
>
> So we set up the virtual -> physical mapping to suit ourselves. Not a
> problem.

Agreed. I think you've convinced me that the virtual contortions
of I/O space is the best way to go if we have to live with in*/out*.

> > Well, on drivers like the de4x5 you switch to use the mem bar instead
> > of the I/O bar and then it's generic. There are lots of drivers that
> > given the choice, chose to use I/O calls. These can easily be fixed.
>
> But what is wrong with using registers in I/O space and accessing them
> with in*/out*? I don't understand the aversion some people have shown
> to using in*/out*. They are the accessor functions for PCI I/O space,
> that's all. Whatever happens, we need accessor functions for PCI I/O
> space. We can call them something different or make them more
> complicated but that would just create extra work for ourselves and
> make it harder to port drivers between architectures.

We will need accessor functions, but the parameters of in*/out* are
limiting for other architectures. That's been my point. I'm trying
to be forward looking here so we don't do this all over again.

> > PCI I/O is memory mapped on everything but x86. In most cases PCI
>
> So??? What we have is a level of abstraction that hides (from driver
> authors) exactly how PCI memory and I/O space are accessed. We have
> readb/writeb etc. for PCI memory space and inb/outb etc. for PCI I/O
> space. The fact that on PPC both are mapped into the processor's
> address space and accessed using loads and stores is something that
> drivers should not care in the slightest about. And in fact these
> spaces are not accessed this way on some other non-intel platforms;
> e.g. sparc64 uses special ldasi/stasi instructions.

Ok.

> > devices provide bars that map the same set of registers into both
> > I/O space and mem space. That leaves only the true legacy devices
> > which have no memory bars to be concerned with. It's a smaller
> > set of drivers that need different low level access methods than
> > you're making it out to be.
>
> The fact remains that a lot of drivers will continue to use inb/outb
> etc., since they are developed primarily on the intel platform. It
> makes sense for us to have an inb/outb that works similarly to intel,
> so that we can use as many drivers as possible with as little pain as
> possible.

I'm worn out. I've been convinced that we'll have to live with it.

--
Matt Porter
MontaVista Software, Inc.
mporter@mvista.com

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 22:58 ` Paul Mackerras
2000-09-20 6:12 ` Matt Porter
@ 2000-09-20 8:34 ` Roman Zippel
2000-09-20 22:54 ` Paul Mackerras
2000-09-20 15:56 ` Dan Malek
2 siblings, 1 reply; 74+ messages in thread
From: Roman Zippel @ 2000-09-20 8:34 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Linux/PPC Development

Hi,

> For
> example, inb(0x3f8) won't access the first serial port (this is on
> machines such as prep and some chrp which have a lot of PC-style
> devices).

You know that the serial driver supports pci cards? So it's somehow a bad
example and even for broken drivers it shouldn't be that much of a problem
to add:

#if __broken__
#define xxx_inb(base, port) inb(port)
#else
#define xxx_inb(base, port) readb(base + port)
#endif

All the 8390 based drivers do something like this. (Although currently
they abuse inb/outb for that.)

bye, Roman

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 8:34 ` Roman Zippel
@ 2000-09-20 22:54 ` Paul Mackerras
0 siblings, 0 replies; 74+ messages in thread
From: Paul Mackerras @ 2000-09-20 22:54 UTC (permalink / raw)
To: Roman Zippel; +Cc: Linux/PPC Development

Roman Zippel writes:

> > example, inb(0x3f8) won't access the first serial port (this is on
> > machines such as prep and some chrp which have a lot of PC-style
> > devices).
>
> You know that the serial driver supports pci cards? So it's somehow a bad
> example and even for broken drivers it shouldn't be that much of a problem
> to add:

You know that prep and chrp machines usually have a super-I/O chip
that has serial ports at I/O 0x3f8 and 0x2f8? How does support for
pci cards in the serial driver help you there?

> #if __broken__
> #define xxx_inb(base, port) inb(port)
> #else
> #define xxx_inb(base, port) readb(base + port)
> #endif

I can just see Ted Ts'o adding that extra cruft to the serial driver,
and then Linus accepting the patch. Not.

> All the 8390 based drivers do something like this. (Although currently
> they abuse inb/outb for that.)

Inb/outb are the *correct* things to use for accessing PCI I/O space.

Paul.

--
Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
paulus@linuxcare.com.au, http://www.linuxcare.com.au/
Linuxcare. Support for the revolution.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 22:58 ` Paul Mackerras
2000-09-20 6:12 ` Matt Porter
2000-09-20 8:34 ` Roman Zippel
@ 2000-09-20 15:56 ` Dan Malek
2000-09-20 23:22 ` Paul Mackerras
2 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-20 15:56 UTC (permalink / raw)
To: paulus; +Cc: Linux/PPC Development

Paul Mackerras wrote:

> .... The virtual <-> physical
> mappings are set up so that the 2nd host bridge's I/O space is mapped
> in starting at 0xff100000. The result is that doing inb(0x101000)
> accesses the device as expected.

You still have to be careful here. Drivers that are written to
do this may also make the assumption they can store that "address"
in a 16-bit (signed, even worse) variable. You will have to change
drivers (or structures) in this case.

I don't so much care if in/out is used/abused, but I think device
drivers should be written such that they ask for addresses through
the PCI (or other I/O) subsystem, rather than just grab BARs and
expect to use them. The x86 in/out was stupid back in 1983, and is
even less useful today. It is lots easier to adopt a memory mapped
model and adapt it to x86, than for the rest of us to keep trying to
create contortions of an address map just so we can use poorly written
x86 device drivers. I think it is easier to update a driver, which
I have done on a couple of occasions, to be more portable than to
try and find ways to use it without making any changes. The authors
of the drivers have even accepted this in some cases :-).

These address mapping hacks were fine years ago when there was
only one PCI bus at a fixed address in the system. Today, there
are lots of busses, with transparent (or not) bridges, and we have
a PCI subsystem in Linux that is maturing into a really useful set of
functions. The embedded CompactPCI systems are way ahead of workstations
in terms of complexity of PCI (and other) bus structures. In these
environments we rely heavily on the virtual mapping of I/O, and if
you have a BAR it doesn't mean much to your driver. I would like to
see us look toward the future, to a model where we map I/O through
the VM subsystem, instead of trying to find hacks to support addressing
assumptions that just aren't valid any longer.

-- Dan

--
I like MMUs because I don't have a real life.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
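Dan's warning about 16-bit storage can be made concrete with a small
sketch. The structures and helper names below are hypothetical and are
not taken from any real driver; the port value 0x101000 is the adjusted
port number from Paul's example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver state that keeps the I/O base in 16 bits,
 * which is only safe for port numbers below 0x10000. */
struct narrow_dev {
    uint16_t iobase;
};

/* The same state with a field wide enough for an adjusted port number. */
struct wide_dev {
    unsigned long iobase;
};

/* Returns what the driver would later pass to inb()/outb(). */
static unsigned long store_narrow(unsigned long port)
{
    struct narrow_dev d;

    d.iobase = (uint16_t)port; /* silently drops the upper bits */
    return d.iobase;
}

static unsigned long store_wide(unsigned long port)
{
    struct wide_dev d;

    d.iobase = port;
    assert(d.iobase == port); /* nothing is lost */
    return d.iobase;
}
```

Under these assumptions, storing 0x101000 in the narrow field yields
0x1000, so the driver would end up talking to whatever sits at port
0x1000 on the first host bridge instead of its own device.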
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 15:56 ` Dan Malek
@ 2000-09-20 23:22 ` Paul Mackerras
2000-09-21 2:13 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: Paul Mackerras @ 2000-09-20 23:22 UTC (permalink / raw)
To: Dan Malek; +Cc: Linux/PPC Development

Dan Malek writes:

> You still have to be careful here. Drivers that are written to
> do this may also make the assumption they can store that "address"
> in a 16-bit (signed, even worse) variable. You will have to change
> drivers (or structures) in this case.

Sure. Fortunately PPC isn't the only architecture where I/O ports can
be > 16 bits, I believe sparc64 runs into this too. In fact sparc64
needs 32-bit interrupt numbers rather than 8-bit, as well.

> I don't so much care if in/out is used/abused, but I think device
> drivers should be written such that they ask for addresses through
> the PCI (or other I/O) subsystem, rather than just grab BARs and

Definitely, for drivers for PCI devices.

> expect to use them. The x86 in/out was stupid back in 1983, and is
> even less useful today. It is lots easier to adopt a memory mapped
> model and adapt it to x86, than for the rest of us to keep trying to
> create contortions of an address map just so we can use poorly written

"Virtual != physical" is "contortions" ???

> x86 device drivers. I think it is easier to update a driver, which
> I have done on a couple of occasions, to be more portable than to
> try and find ways to use it without making any changes. The authors
> of the drivers have even accepted this in some cases :-).

I suggest you post a patch on linux-kernel to change all the device
drivers to use memory-mapped I/O. I wish you luck. :-) :-)

> These address mapping hacks were fine years ago when there was
> only one PCI bus at a fixed address in the system. Today, there
> are lots of busses, with transparent (or not) bridges, and we have
> a PCI subsystem in Linux that is maturing into a really useful set of
> functions. The embedded CompactPCI systems are way ahead of workstations
> in terms of complexity of PCI (and other) bus structures. In these
> environments we rely heavily on the virtual mapping of I/O, and if
> you have a BAR it doesn't mean much to your driver. I would like to
> see us look toward the future, to a model where we map I/O through
> the VM subsystem, instead of trying to find hacks to support addressing
> assumptions that just aren't valid any longer.

You'll have to explain this a little more. When you say "virtual
mapping of I/O", do you mean I/O devices in PCI memory space or in PCI
I/O space? We (of course) already map I/O registers in PCI memory
space through the VM subsystem. Are you talking about PCI I/O space?
What sorts of VM mapping tricks do you want to do for PCI I/O space,
and why?

Paul.

--
Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
paulus@linuxcare.com.au, http://www.linuxcare.com.au/
Linuxcare. Support for the revolution.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 23:22 ` Paul Mackerras
@ 2000-09-21 2:13 ` Dan Malek
2000-09-21 2:35 ` Paul Mackerras
0 siblings, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-21 2:13 UTC (permalink / raw)
To: paulus; +Cc: Linux/PPC Development

Paul Mackerras wrote:

> "Virtual != physical" is "contortions" ???

No, I mean the evolution of address maps of platforms has resulted
in holes, restrictions, and just some weird things unique to any one
of them. People make assumptions that a particular device always
resides at a certain address, so they either hard code that or take
short cuts based upon those assumptions. I think virtual != physical
is fine and it should _always_ be that way. Further, a BAR is a
physical PCI address, you should never assume you can use that directly.

> I suggest you post a patch on linux-kernel to change all the device
> drivers to use memory-mapped I/O. I wish you luck. :-) :-)

As others have mentioned, we don't use all of the drivers in this
manner. There are some legacy drivers that have worked well given
the PReP/CHRP/PMac mapping hacks we have done in the past. With the new
PowerMacs in particular, we now have a few drivers that need a little
more work. As I said, I have updated some of these. I will probably
update some more in the future. We need to adapt those that are
immediately necessary, and work with the other non-x86 folks to see
what they are doing about it. I agree, you can't roll in and change
all of this overnight, but if you don't try to move in a better
direction to address the complexity, you end up with something that
looks like MS/DOS (or worse, Windows :-).

> You'll have to explain this a little more. When you say "virtual
> mapping of I/O", do you mean I/O devices in PCI memory space or in PCI
> I/O space?

I mean everywhere. The PCI (or ISA, or any bus) should have a resource
map (or data base or whatever you want to call it) of devices, addresses
and attributes. A driver should ask for these to be mapped (at some
arbitrary virtual address) and then use the supplied virtual address.
A driver should never simply 'inb(SERIAL_PORT_STATUS)' using some #define,
or even with a partial address offset from some unknown base address.
I don't have all of the "how do we do this" answers, yet, but we have
to break the habit of memory mapping assumptions. In some cases, you
can't autodetect or flexibly move things around, and this should be
hidden inside the platform specific resource manager, not assumed by
device drivers that we want to keep portable.

> .... We (of course) already map I/O registers in PCI memory
> space through the VM subsystem. Are you talking about PCI I/O space?

We actually VM map everything right now, even if it is through BATs
or page tables. There are just assumptions built in that people
"utilize" :-).

> What sorts of VM mapping tricks do you want to do for PCI I/O space,
> and why?

I don't think I would call it "tricks", but we need some layers of
translation and flexibility. The "trick" you have been proposing for
PMac will work fine there, but won't work many other places because
the bridges or systems don't have the flexibility. My point is that
you can do that on the PMac, but that assumption shouldn't find its
way into the in/out read/write macros. The in/out macros should either
map to in/out x86 instructions, or simply a memory access with any
barrier instructions necessary. When a driver asks for the address of
that serial port on PCI bus 1, you can give them the 0xff10xxxx address.
When that same driver asks that question on a 8260 with PowerSPAN PCI
bridge, it will get a very different address. In this latter case,
if they ask for the serial port on PCI bus 2, they are likely to get
something that isn't even a reasonable address calculation from the
previous. Done correctly, you could even make some drivers switch from
using I/O space to using memory mapped space, depending upon how the
system resources can be allocated, without changing the driver.
Unfortunately, too much of this information is coded into drivers today.

Treat I/O in the kernel just like you would if using mmap() from a
user application. You request access to a device, you get an address,
you use it. I don't personally like in/out read/write macros doing
address arithmetic to help me out. Mapping these macros to processor
specific I/O instructions is OK (although asking me to use in/out on
PowerPC is kind of like asking the x86 person to use 'mtspr').

Although it doesn't result in portable drivers, people have asked to
get ready to use mapped addresses to devices so they can manage their
own memory barriers and take advantage of deep FIFOs in bridges for
throughput rather than use any of the I/O macros. This would also
allow it.

-- Dan

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
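The "request access, get an address, use it" model Dan describes in the
message above could look something like the toy sketch below. Nothing
here is an existing kernel interface: io_request(), io_read8(),
io_write8() and the array standing in for device registers are all
hypothetical, and exist only to show that the accessors do no address
arithmetic of their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque "thing" the driver receives and hands back to the accessors. */
typedef volatile uint8_t *io_handle_t;

/* Stand-in for device registers: two buses, 4 kB of I/O space each. */
#define NUM_BUSES   2
#define BUS_IO_SIZE 0x1000
static uint8_t fake_io_space[NUM_BUSES][BUS_IO_SIZE];

/* Platform-specific "resource manager": only this code knows where a
 * (bus, offset) pair actually lives. Returns NULL if it cannot map it. */
static io_handle_t io_request(unsigned int bus, size_t offset)
{
    if (bus >= NUM_BUSES || offset >= BUS_IO_SIZE)
        return NULL;
    return &fake_io_space[bus][offset];
}

/* Accessors use the handle exactly as given, with no further adjustment. */
static uint8_t io_read8(io_handle_t h)
{
    assert(h != NULL);
    return *h;
}

static void io_write8(io_handle_t h, uint8_t value)
{
    assert(h != NULL);
    *h = value;
}
```

A driver written against such an interface would call io_request() once
at probe time and keep the returned handle. The same driver source could
then run on platforms that place the device at entirely different
addresses, which is the portability argument made in the message.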
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 2:13 ` Dan Malek
@ 2000-09-21 2:35 ` Paul Mackerras
2000-09-21 3:57 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: Paul Mackerras @ 2000-09-21 2:35 UTC (permalink / raw)
To: Dan Malek; +Cc: Linux/PPC Development

Dan Malek writes:

> No, I mean the evolution of address maps of platforms has resulted
> in holes, restrictions, and just some weird things unique to any one
> of them.

At the physical address level? Can't we hide that at the virtual
address level?

> People make assumptions that a particular device always
> resides at a certain address, so they either hard code that or take
> short cuts based upon those assumptions.

If you have a super-I/O chip with a serial port at I/O address 0x3f8
(for example), you just have to know that number, there's nothing that
is going to tell you.

> As others have mentioned, we don't use all of the drivers in this
> manner. There are some legacy drivers that have worked well given
> the PReP/CHRP/PMac mapping hacks we have done in the past. With the new
> PowerMacs in particular, we now have a few drivers that need a little
> more work. As I said, I have updated some of these.

Which ones in particular?

> I mean everywhere. The PCI (or ISA, or any bus) should have a resource
> map (or data base or whatever you want to call it) of devices, addresses
> and attributes. A driver should ask for these to be mapped (at some
> arbitrary virtual address) and then use the supplied virtual address.

Thereby assuming that all I/O is memory-mapped, making the driver
non-portable to intel machines.

> A driver should never simply 'inb(SERIAL_PORT_STATUS)' using some #define,

Why not?

> I don't think I would call it "tricks", but we need some layers of
> translation and flexibility. The "trick" you have been proposing for
> PMac will work fine there, but won't work many other places because
> the bridges or systems don't have the flexibility. My point is that

Huh? All I am proposing is that we set up the virtual -> physical
mapping in a certain way. The I/O space of a host bridge has to be
accessible somewhere in the physical address space, that's the only
way it can be accessible. If the bridge connects the address lines up
in a strange way (e.g. the prep mapping option which puts 64 (I
think?) ports in each 4kB page) then inb/outb will have to cope with
that. I hope it doesn't become necessary.

> you can do that on the PMac, but that assumption shouldn't find it's
> way into the in/out read/write macros. The in/out macros should either
> map to in/out x86 instructions, or simply a memory access with any
> barrier instructions necessary. When a driver asks for the address of
> that serial port on PCI bus 1, you can give them the 0xff10xxxx address.

No, that's broken. That's what I don't want. That's an extra
unnecessary incompatibility with intel. Like it or not, not all
devices are PCI, and most drivers are developed and tested on intel
machines.

> When that same driver asks that question on a 8260 with PowerSPAN PCI
> bridge, it will get a very different address. In this latter case,
> if they ask for the serial port on PCI bus 2, they are likely to get
> something that isn't even a reasonable address calculation from the
> previous. Done correctly, you could even make some drivers switch from
> using I/O space to using memory mapped space, depending upon how the
> system resources can be allocated, without changing the driver.
> Unfortunately, too much of this information is coded into drivers today.

The access functions for PCI memory space will always be distinct from
the access functions for I/O space, because intel uses different
instructions. Sorry.

> Although it doesn't result in portable drivers, people have asked to
> get ready to use mapped addresses to devices so they can manage their
> own memory barriers and take advantage of deep FIFOs in bridges for
> throughput rather than use any of the I/O macros. This would also
> allow it.

That's fine for devices with registers in PCI memory space. For
registers in PCI I/O space there are more constraints which mean that
you can't do these optimizations.

Paul.

--
Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax
paulus@linuxcare.com.au, http://www.linuxcare.com.au/
Linuxcare. Support for the revolution.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-21 2:35 ` Paul Mackerras @ 2000-09-21 3:57 ` Dan Malek 2000-09-21 5:06 ` Paul Mackerras 0 siblings, 1 reply; 74+ messages in thread From: Dan Malek @ 2000-09-21 3:57 UTC (permalink / raw) To: paulus; +Cc: Linux/PPC Development Paul Mackerras wrote: > At the physical address level? Can't we hide that at the virtual > address level? Yes....IMHO I think the PC is one of the worst architecture designs ever, and making my PowerMac or anything else live within those contraints isn't progress.... > If you have a super-I/O chip with a serial port at I/O address 0x3f8 > (for example), you just have to know that number, there's nothing that > is going to tell you. Yes, _someone_ has to know, but when that is hardcoded into a driver, it isn't portable. It's not at that address if it isn't on the first ISA bridge of the first PCI bus, either. That's the basis of my suggestion that drivers don't assume where things are mapped. The one interface/function definition lacking in my proposal is a way for a serial driver to ask a resource manager for the proper mapped address to use to get to these things. On a PC, it will get 0x3f8, but it shouldn't assume that. > Which ones in particular? That I have modified or that need changing? The ones I have modified in the past are PCI Ethernet drivers that assumed they could hit I/O in the lower 64K of memory. They wouldn't work on an embedded system that had a PCI/PCI bridge, and they also had some byte swapping problems. You mentioned PMacs that require add-in serial cards on the second PCI bus. I am having lots of trouble with the SCSI adapters and I would like to get to my FireWire ports on the G4. These are a combination of read/write trying to perform address arithmetic, bridges not behaving, and believing too much OF is telling us. I also have some A/V cards that will probably never run in an x86 that I have to write drivers for. 
> Thereby assuming that all I/O is memory-mapped, making the driver > non-portable to intel machines. Well, not really. The only thing not memory mapped on a PC are the small ISA and PCI I/O spaces. All of the PCI cards I have been using recently (which admittedly are mostly audio/video or high performance controllers) use PCI memory mapped access. What I am really suggesting is even if you know it is I/O space that requires in/out instructions, you should request an access address. On a PC with a serial port in the Super I/O on the PCI bus you will still get 0x3f8 (or whatever it is, I never memorized these). I don't know what you get on a PC with more than one PCI bus.... > > > A driver should never simply 'inb(SERIAL_PORT_STATUS)' using some #define, > > Why not? Well, this is exactly why we are all discussing this right now. It doesn't work on anything except a PC. It won't even work with the changes you are suggesting (except for the local serial on a PMac). It is marginally more useful if you adopt the inb(BAR) and do the mapping as you suggest. That will be a PMac solution today for someone that plugs something into a PCI slot, but it will have to be hacked again when the next machine hits the street. > Huh? All I am proposing is that we set up the virtual -> physical > mapping in a certain way. Right, which works on the PMac, and requires drivers somehow find an address that will work with that mapping. > .... The I/O space of a host bridge has to be > accessible somewhere in the physical address space, that's the only > way it can be accessible. Yes, but somewhere isn't always mappable as conveniently as you can do it on the PMac today. > ..... then inb/outb will have to cope with > that. I hope it doesn't become necessary. I don't think inb/outb should ever have to "cope" with address calculations..... > No, that's broken. That's what I don't want. Well, which is it? Broken, or not what you want? 
All I'm suggesting is that the address value you give to inb/outb
is exactly what it needs to use, and it has to be stored in 32 (or
64) bits. Any solution that maps multiple ISA busses has to do this,
and PCI I/O has to do it anyway because of the address range.

> ..... That's an extra
> unnecessary incompatibility with intel.

I don't see the incompatibility. If you want to support drivers
across a wide variety of platforms with multiple busses of any kind,
you have to provide an address that exists outside of the 64K ISA
window. If you assume drivers can exist that hard code well known
ISA/PCI offsets into inb/outb, they will _only_ work on PC platforms.

> ..... Like it or not, not all
> devices are PCI, and most drivers are developed and tested on intel
> machines.

Well, there are lots of drivers on embedded processors that aren't
PCI and aren't Intel, but that isn't part of this discussion. I fail
to see why suggesting a driver requests (somehow) the address (port,
whatever) and uses that in the appropriate in/out read/write macro is
incompatible with anything. Even better, you can map it however you
wish on a PMac, Matt can do this thing on PReP, I can use it on my
cPCI 8260, someone else can use it on a PC, and we can all use the
same driver.

> The access functions for PCI memory space will always be distinct from
> the access functions for I/O space, because intel uses different
> instructions. Sorry.

Yes, but you (and probably everyone else because of my poor writing)
missed the point. The Linux I/O macros map to appropriate machine
specific functions. If I _choose_ (for some reason because of some
platform specific feature) to hand a driver an I/O address more suitable
for what I know it can do, this abstraction allows that to happen.

> That's fine for devices with registers in PCI memory space. For
> registers in PCI I/O space there are more constraints which mean that
> you can't do these optimizations.

Yes, you are correct.
Just like the people that asked for those features. Darwin is
correct here as well.....

All I am suggesting are a few small things, that can grow into something
much better. If we can find a way for drivers to stop making mapping
assumptions about the address space by asking for a "thing" to use
in the in/out or whatever macros, and this "thing" (which would be
an address on PowerPC or port on PC I/O) is not further adjusted by
arithmetic in these macros, we end up with something more portable
to more people. You can still do your addressing mapping on the PMac,
but you haven't forced me to find a way to perform address computations
in these macros on other platforms by providing only a portion of the
necessary information.

    -- Dan

* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 3:57 ` Dan Malek
@ 2000-09-21 5:06 ` Paul Mackerras
2000-09-21 6:51 ` Dan Malek
2000-09-21 13:44 ` Geert Uytterhoeven
0 siblings, 2 replies; 74+ messages in thread
From: Paul Mackerras @ 2000-09-21 5:06 UTC (permalink / raw)
To: Dan Malek; +Cc: Linux/PPC Development

Dan Malek writes:

> Yes....IMHO I think the PC is one of the worst architecture designs
> ever, and making my PowerMac or anything else live within those
> constraints isn't progress....

Well, your powermac has a PCI bus, and PCI has an I/O space as well as
a memory space (for better or for worse).

I think my basic point is that a setup where you can't do inb(n) to
read the byte at address n in PCI I/O space is broken. On systems
with 1 PCI host bridge, this is unambiguous, on systems with >1 host
bridge inb(n) should access address n in PCI I/O space on the first
host bridge.

> Yes, _someone_ has to know, but when that is hardcoded into a driver,
> it isn't portable. It's not at that address if it isn't on the first
> ISA bridge of the first PCI bus, either. That's the basis of my
> suggestion that drivers don't assume where things are mapped. The

In the case of I/O space, there isn't any mapping. Address n in I/O
space is accessed with inb(n).

> What I am really suggesting is even if you know it is I/O space
> that requires in/out instructions, you should request an access
> address.

If you get a memory-mapped address then you should access it with
readb/writeb.

> On a PC with a serial port in the Super I/O on the PCI
> bus you will still get 0x3f8 (or whatever it is, I never memorized
> these). I don't know what you get on a PC with more than one
> PCI bus....

Since an intel CPU has only a single I/O space (just as it has a
single physical memory space) I assume that each PCI host bridge has a
window that passes accesses to I/O ports in certain ranges through to
the PCI bus behind it. Hopefully the ranges are all distinct. :-)

We could do that too, we would just have to make sure that we assigned
PCI I/O addresses so that no two bridges had devices in the same 4k
range, then we could set up the virtual->physical mapping to give the
illusion of a single I/O space.

> > > A driver should never simply 'inb(SERIAL_PORT_STATUS)' using some #define,
> >
> > Why not?
>
> Well, this is exactly why we are all discussing this right now. It
> doesn't work on anything except a PC.

It doesn't work on anything except a PC, or a prep system, or a chrp,
or an alpha system, or a sun ultra 5, or anything else where the
designer has used a super-i/o chip because it is cheap and gives them
all the usual things they want. In fact it works almost everywhere
except on powermacs and embedded systems. :-)

> It won't even work with the
> changes you are suggesting (except for the local serial on a PMac).

Yes it will, why won't it? In fact it won't work for on-board serial
on a pmac.

> It is marginally more useful if you adopt the inb(BAR) and do the
> mapping as you suggest. That will be a PMac solution today for
> someone that plugs something into a PCI slot, but it will have to
> be hacked again when the next machine hits the street.

Huh??? the drivers won't have to be changed, they just go on doing
inb(pci_dev->resource[0].start) or whatever and that will go on
working. If the next machine has a different PCI host bridge, then we
will need code to support it (whether or not we adopt my proposal).

> > Huh? All I am proposing is that we set up the virtual -> physical
> > mapping in a certain way.
>
> Right, which works on the PMac, and requires drivers somehow find
> an address that will work with that mapping.

No, you must have misunderstood me somewhere I think. For PCI memory
space accesses, nothing is changed. For PCI I/O space accesses,
drivers get the base I/O address from the pci_dev->resource fields and
use the in/out family of macros.
Drivers don't have to look anywhere
except in pci_dev->resource[] and they use read*/write* for memory
space, in*/out* for I/O space.

> > .... The I/O space of a host bridge has to be
> > accessible somewhere in the physical address space, that's the only
> > way it can be accessible.
>
> Yes, but somewhere isn't always mappable as conveniently as you
> can do it on the PMac today.

Can you give me an example of that? If it's in physical address
space, how the heck could it not be mappable to virtual space?

> I don't think inb/outb should ever have to "cope" with address
> calculations.....

inb(n) should do whatever is necessary to access address n in PCI I/O
space.

> All I'm suggesting is that the address value you give to inb/outb
> is exactly what it needs to use, and it has to be stored in 32 (or
> 64) bits. Any solution that maps multiple ISA busses has to do this,

I don't believe there are any systems with multiple ISA buses. That
would be an abomination. :-)

> Yes, but you (and probably everyone else because of my poor writing)
> missed the point. The Linux I/O macros map to appropriate machine
> specific functions. If I _choose_ (for some reason because of some
> platform specific feature) to hand a driver an I/O address more suitable
> for what I know it can do, this abstraction allows that to happen.

Which the driver accesses how? with read*/write* or with in*/out*?

I would be quite happy with an ioportremap that said "give me an
address that will let me access this region of PCI I/O space using
readb/writeb". I suspect that the number of cases where that would be
useful would be quite small though, since most devices where it would
matter would have registers in PCI memory space.

> All I am suggesting are a few small things, that can grow into something
> much better.
> If we can find a way for drivers to stop making mapping
> assumptions about the address space by asking for a "thing" to use
> in the in/out or whatever macros, and this "thing" (which would be
> an address on PowerPC or port on PC I/O) is not further adjusted by
> arithmetic in these macros, we end up with something more portable
> to more people. You can still do your addressing mapping on the PMac,
> but you haven't forced me to find a way to perform address computations
> in these macros on other platforms by providing only a portion of the
> necessary information.

As far as your driver is concerned, it wants to access a register at
address n in PCI I/O space, so it does inb(n). It wants to access a
register at address n in PCI memory space, it does readb(ioremap(n))
(in simple terms). What address computations do you need to do? What
other information do you need?

This discussion doesn't seem to be getting anywhere, either I am
misunderstanding you or you are misunderstanding me. If you had a
concrete example of a situation where you see a problem that might
help clarify the issues.

Paul.

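For readers following along, here is a minimal user-space sketch of the
"illusion of a single I/O space" idea from the message above. It is only
a simulation under stated assumptions: plain arrays stand in for the I/O
windows of two host bridges, and a small lookup table plays the role the
MMU mapping would play in the kernel. Every name in it (map_io_page,
sim_inb, sim_outb, bridge0_io, bridge1_io) is hypothetical and is not a
kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IO_PAGE_SHIFT 12                    /* 4k granularity, as in the text */
#define IO_PAGE_SIZE  (1u << IO_PAGE_SHIFT)
#define IO_PAGES      16                    /* 16 * 4k = 64k of port numbers */

/* Each host bridge exposes its own I/O space somewhere else (simulated). */
uint8_t bridge0_io[IO_PAGES * IO_PAGE_SIZE];
uint8_t bridge1_io[IO_PAGES * IO_PAGE_SIZE];

/* One entry per 4k page of the merged port space; stands in for the MMU. */
uint8_t *io_page[IO_PAGES];

/* Route one 4k page of port numbers to a bridge. A page that is already
 * claimed is refused, because no two bridges may share a 4k range. */
int map_io_page(unsigned page, uint8_t *bridge_io)
{
    assert(page < IO_PAGES);
    if (io_page[page] != NULL)
        return -1;
    io_page[page] = bridge_io + (size_t)page * IO_PAGE_SIZE;
    return 0;
}

/* inb(n): the driver passes the plain port number n. */
uint8_t sim_inb(uint16_t port)
{
    uint8_t *page = io_page[port >> IO_PAGE_SHIFT];
    assert(page != NULL);                   /* unmapped port range */
    return page[port & (IO_PAGE_SIZE - 1)];
}

/* outb(v, n): same lookup, write direction. */
void sim_outb(uint8_t val, uint16_t port)
{
    uint8_t *page = io_page[port >> IO_PAGE_SHIFT];
    assert(page != NULL);
    page[port & (IO_PAGE_SIZE - 1)] = val;
}
```

With page 0 routed to the first bridge and page 1 to the second, a driver
that does sim_inb(0x3f8) reaches the first bridge and one that does
sim_inb(0x1010) reaches the second, without either driver knowing which
bridge it is behind. Note that the table lookup here is exactly the kind
of per-access work that the other side of the thread wants to avoid; in
the real proposal the MMU would do it for free.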
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 5:06 ` Paul Mackerras
@ 2000-09-21 6:51 ` Dan Malek
2000-09-21 14:03 ` Geert Uytterhoeven
` (2 more replies)
2000-09-21 13:44 ` Geert Uytterhoeven
1 sibling, 3 replies; 74+ messages in thread
From: Dan Malek @ 2000-09-21 6:51 UTC (permalink / raw)
To: paulus; +Cc: Linux/PPC Development

I actually think we are in nearly violent agreement, and I am
getting way too tired tonight to continue much further.....

Paul Mackerras wrote:

> Well, your powermac has a PCI bus, and PCI has an I/O space as well as
> a memory space (for better or for worse).

Agree fully.

> I think my basic point is that a setup where you can't do inb(n) to
> read the byte at address n in PCI I/O space is broken.

I agree. I am not suggesting you shouldn't. I'm just discussing
what 'n' should be :-).

> .... On systems
> with 1 PCI host bridge, this is unambiguous, on systems with >1 host
> bridge inb(n) should access address n in PCI I/O space on the first
> host bridge.

Only if 'n' is a hard coded (or nearly) number that the programmer
assumed would exist on all systems.

> In the case of I/O space, there isn't any mapping. Address n in I/O
> space is accessed with inb(n).

Of course it is mapped. On an x86 it is a hardware wire that selects
one of two address spaces. On other systems it selects an address
range that causes the PCI bridge to generate the I/O cycle instead of
a memory cycle on the PCI bus. You can effectively think of in/out
and read/write as selecting the most significant address bit of the
I/O bus. Some of the confusion may be an overloaded use of the word
'map', but we will solve that with examples :-).

> If you get a memory-mapped address then you should access it with
> readb/writeb.

If you are accessing an I/O bus memory space, you use readb/writeb.
If you are accessing an I/O bus I/O space, you use inb/outb.
The "handle" (address) you use in the in/out or read/write will be
mapped through the MMU of any processor other than the x86. On the
x86, the in/out will use a value that makes sense with the in/out
instructions.

> We could do that too, we would just have to make sure that we assigned
> PCI I/O addresses so that no two bridges had devices in the same 4k
> range, then we could set up the virtual->physical mapping to give the
> illusion of a single I/O space.

I think we agree that we just use the PCI bridges to the best of
their ability, and let the MMU do the rest. There are combinations
of this that are more efficient on some systems than others. I have
no illusion of requiring a single I/O space (that's what MMUs are for :-).

> It doesn't work on anything except a PC....
> ..... In fact it works almost everywhere
> except on powermacs and embedded systems. :-)

OK, ok :-)....You copy a PC, you get a PC, I get the point :-).

> Huh??? the drivers won't have to be changed, they just go on doing
> inb(pci_dev->resource[0].start) or whatever

Ahhhh...OK....here we go...examples :-). I contend that access is
wrong...

Somewhere (and I thought it was in that resource structure), you need
the BAR of that device on its PCI bus. You also need something that
indicates how that device is mapped through PCI bridges. If
pci_dev->resource[0].start is the BAR of the device
this isn't likely to work on many platforms. I believe what a
device needs to do is something like:

    base = how_do_I_get_to(pci_dev, resource0);
    inb(base);

Or, even better (if you don't know the spaces):

    requires_io = is_pcidev_io(pci_dev, resource0);
    base = how_do_I_get_to(pci_dev, resource0);
    if (requires_io)
        inb(base)
    else
        readb(base)

Yes, you can map the PCI spaces through the MMU and hack up the
pci_dev resources to make the address work.
I believe you need to
have this abstraction, not assume in/out or read/write will perform
address computation, and have hooks into the platform specific
support to efficiently "map" this as resources allow.

You can extrapolate this into other busses, and I am sure somehow
get something like the ISA serial port to return 0x3f8 (I memorized
this now :-) for the PC, or whatever is appropriate for other systems.
You can even dynamically manage the I/O (memory or I/O to the I/O bus)
resources because you have some idea about what is actually used.

Although this example is pretty simple (address mapping usually is),
when you start adding things like interrupt routing, inter-device
DMA, hot swapping, and backplane networking there are more things
a driver just can't assume to be simply pulled from a data structure.

> ..... Drivers don't have to look anywhere
> except in pci_dev->resource[] and they use read*/write* for memory
> space, in*/out* for I/O space.

I just don't think pci_dev->resource is the place to look, nor is
using "assumed" values in any access. Just break the habit of doing
this. I should be able to use usb_dev, or vme_dev, or firewire_dev
(well, perhaps just vme_dev :-) as easily as pci_dev. There should
probably be a higher level naming abstraction above this (like OF :-)
so you can just ask for the serial port and not care where that
device exists.

> Can you give me an example of that? If it's in physical address
> space, how the heck could it not be mappable to virtual space?

My point was only that I may not be able to map it into nice 64K
or 1M offsets like you do on the PMac. It depends upon how the
bus is allocated among on-board devices and trying to use single
MMU entries to map larger spaces. You can map anything to nearly
anywhere, but for I/O you like to find a more efficient solution,
and some mapping has alignment restrictions based upon attributes
(cache, byte swapping, etc.).
The address computations may become
more than just simple add/shift, and require more complex operations
when it is just easier to provide another address.

> inb(n) should do whatever is necessary to access address n in PCI I/O
> space.

Ummm...no :-). inb is an x86 instruction and you have to use it on
that platform. It's a wart they have to live with. I think Linux
should have an isa_io() macro or something (that works like I want :-),
but we have sort of implied inb/outb will do that for us....

> I don't believe there are any systems with multiple ISA buses. That
> would be an abomination. :-)

How about microchannel :-).

> I would be quite happy with an ioportremap that said "give me an
> address that will let me access this region of PCI I/O space using
> readb/writeb".

That's not really what I meant, although it would work great on
everything but x86....I just think if you use in/out, you should
still have to ask "give me something to access this region".

> As far as your driver is concerned, it wants to access a register at
> address n in PCI I/O space, so it does inb(n). It wants to access a
> register at address n in PCI memory space, it does readb(ioremap(n))
> (in simple terms). What address computations do you need to do?

None, if 'n' or the result of ioremap() (I don't like that function
much either :-), is ready to be used. I just don't like doing all of
the arithmetic in the in/out read/write macros. That should all be
done more intelligently by some platform functions only once.

> This discussion doesn't seem to be getting anywhere,

Is it any better now? I am really tired.....more later...

Thanks.

    -- Dan

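The how_do_I_get_to() / is_pcidev_io() pseudocode in the message above
can be fleshed out into something compilable. What follows is a
user-space sketch only, under loud assumptions: the two function names
come from the message, but their signatures, the sim_* structures, the
handle_inb/handle_readb accessors, and the array-backed "windows" are all
invented here for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_SIZE 0x10000u

/* Hypothetical stand-ins; none of these are real kernel types. */
enum sim_space { SPACE_IO, SPACE_MEM };

struct sim_resource {
    enum sim_space space;   /* which PCI space the BAR decodes */
    uint32_t bar;           /* bus-relative address, as read from the BAR */
};

struct sim_bridge {
    uint8_t *io_window;     /* where platform code mapped this bridge's I/O */
    uint8_t *mem_window;    /* where platform code mapped its memory space */
};

struct sim_pci_dev {
    struct sim_bridge *bridge;
    struct sim_resource resource[2];
};

/* Simulated backing store for one bridge. */
uint8_t io_backing[WINDOW_SIZE];
uint8_t mem_backing[WINDOW_SIZE];

bool is_pcidev_io(const struct sim_pci_dev *dev, int n)
{
    return dev->resource[n].space == SPACE_IO;
}

/* Platform-specific part: turn a BAR into a ready-to-use handle, once. */
volatile uint8_t *how_do_I_get_to(const struct sim_pci_dev *dev, int n)
{
    const struct sim_resource *r = &dev->resource[n];
    uint8_t *window = (r->space == SPACE_IO) ? dev->bridge->io_window
                                             : dev->bridge->mem_window;
    assert(r->bar < WINDOW_SIZE);
    return window + r->bar;
}

/* Accessors do no arithmetic; they use the handle exactly as given. */
uint8_t handle_inb(volatile uint8_t *handle)   { return *handle; }
uint8_t handle_readb(volatile uint8_t *handle) { return *handle; }

/* Driver side: ask first, then access. */
uint8_t read_first_register(const struct sim_pci_dev *dev, int n)
{
    volatile uint8_t *base = how_do_I_get_to(dev, n);
    return is_pcidev_io(dev, n) ? handle_inb(base) : handle_readb(base);
}
```

The design point being illustrated is that all knowledge of bridges and
windows lives in how_do_I_get_to(), which a driver would call once at
probe time, while the per-access path stays a plain load.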
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 6:51 ` Dan Malek
@ 2000-09-21 14:03 ` Geert Uytterhoeven
2000-09-21 22:40 ` Benjamin Herrenschmidt
2000-09-22 3:53 ` Dan Malek
2000-09-21 20:22 ` Matt Porter
2000-09-22 3:49 ` Paul Mackerras
2 siblings, 2 replies; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-21 14:03 UTC (permalink / raw)
To: Dan Malek; +Cc: paulus, Linux/PPC Development

On Thu, 21 Sep 2000, Dan Malek wrote:
> > I think my basic point is that a setup where you can't do inb(n) to
> > read the byte at address n in PCI I/O space is broken.
>
> I agree. I am not suggesting you shouldn't. I'm just discussing
> what 'n' should be :-).

`n' is the offset as accepted by the bus bridge, be it a host bridge,
a PCI-PCI bridge, or a PCI-ISA bridge (with subtractive decoding (`claim
all accesses that are not claimed by any other device on the bus') for
legacy I/O).

The first bridge (`host bridge 1') takes I/O addresses 0..n1-1, the next
one n1..n2-1, and so on.

> > We could do that too, we would just have to make sure that we assigned
> > PCI I/O addresses so that no two bridges had devices in the same 4k
> > range, then we could set up the virtual->physical mapping to give the
> > illusion of a single I/O space.
>
> I think we agree that we just use the PCI bridges to the best of
> their ability, and let the MMU do the rest. There are combinations
> of this that are more efficient on some systems than others. I have
> no illusion of requiring a single I/O space (that's what MMUs are for :-).

And how to access PCI I/O space from user space? There the MMU doesn't help,
since the user application (usually XFree86) just looks at the BARs from
/proc/bus/pci/...

> > Huh??? the drivers won't have to be changed, they just go on doing
> > inb(pci_dev->resource[0].start) or whatever
>
> Ahhhh...OK....here we go...examples :-). I contend that access is
> wrong...
>
> Somewhere (and I thought it was in that resource structure), you need
> the BAR of that device on its PCI bus. You also need something that
> indicates how that device is mapped through PCI bridges. If
> pci_dev->resource[0].start is the BAR of the device
> this isn't likely to work on many platforms. I believe what a
> device needs to do is something like:
>
>     base = how_do_I_get_to(pci_dev, resource0);
>     inb(base);
>
> Or, even better (if you don't know the spaces):
>
>     requires_io = is_pcidev_io(pci_dev, resource0);
>     base = how_do_I_get_to(pci_dev, resource0);
>     if (requires_io)
>         inb(base)
>     else
>         readb(base)

This is very similar to what many people already suggested on linux-kernel
years ago: inb() and friends should take an additional argument pci_dev *.

> Yes, you can map the PCI spaces through the MMU and hack up the
> pci_dev resources to make the address work. I believe you need to
> have this abstraction, not assume in/out or read/write will perform
> address computation, and have hooks into the platform specific
> support to efficiently "map" this as resources allow.

For kernel space. This doesn't work for user space, unless you mmap
/dev/pci_{io,mem}_space, which don't exist at the moment.

> You can extrapolate this into other busses, and I am sure somehow
> get something like the ISA serial port to return 0x3f8 (I memorized
> this now :-) for the PC, or whatever is appropriate for other systems.

Currently the serial driver relies on the arch-specific #define
SERIAL_PORT_DFNS to know which legacy ports to probe. This should at least
become machine-specific, to support PowerMacs.

BTW, I have Linus' tree only here, and I see it still has STD_COM_FLAGS for
ttyS[0-2] and STD_COM4_FLAGS for ttyS3. The difference between these is that
STD_COM_FLAGS contains ASYNC_SKIP_TEST to skip some presence detect. IIRC,
this was the reason serial.c found a bogus ttyS2 on my LongTrail. So touching
non-existent ports on non-PCs can give weird results...

> > inb(n) should do whatever is necessary to access address n in PCI I/O
> > space.
>
> Ummm...no :-). inb is an x86 instruction and you have to use it on
> that platform. It's a wart they have to live with. I think Linux
> should have an isa_io() macro or something (that works like I want :-),
> but we have sort of implied inb/outb will do that for us....
>
> > I don't believe there are any systems with multiple ISA buses. That
> > would be an abomination. :-)
>
> How about microchannel :-).

Microchannel is something different. inb() resp. readb() and friends are
explicitly meant for PCI I/O resp. memory space only.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

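The linux-kernel suggestion mentioned above, that inb() and friends take
an extra pci_dev * argument, might look roughly like the following
user-space simulation. This is a sketch under assumptions, not an
existing interface: dev_inb, dev_outb and the sim_* types are
hypothetical, the bridges' I/O spaces are plain arrays, and treating a
NULL device as a legacy access on the first host bridge is a choice made
for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IO_SPACE_SIZE 0x10000u

/* Hypothetical stand-ins: each host bridge has a private 64k I/O space. */
struct sim_host_bridge {
    uint8_t io[IO_SPACE_SIZE];
};

struct sim_pci_dev {
    struct sim_host_bridge *bridge;   /* bridge this device sits behind */
};

struct sim_host_bridge bridges[2];

/* Pick the I/O space to use; NULL is assumed to mean "legacy port on
 * the first host bridge". */
static struct sim_host_bridge *bridge_of(const struct sim_pci_dev *dev)
{
    struct sim_host_bridge *hb = dev ? dev->bridge : &bridges[0];
    assert(hb != NULL);
    return hb;
}

/* inb() with a device argument: the device names the bridge, and the
 * port is the untranslated address within that bridge's I/O space. */
uint8_t dev_inb(const struct sim_pci_dev *dev, uint16_t port)
{
    return bridge_of(dev)->io[port];
}

void dev_outb(const struct sim_pci_dev *dev, uint8_t val, uint16_t port)
{
    bridge_of(dev)->io[port] = val;
}
```

Here two devices behind different bridges can both use port 0x1000
without colliding, at the price of changing the signature that every
existing driver calls.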
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 14:03 ` Geert Uytterhoeven
@ 2000-09-21 22:40 ` Benjamin Herrenschmidt
2000-09-22 3:53 ` Dan Malek
1 sibling, 0 replies; 74+ messages in thread
From: Benjamin Herrenschmidt @ 2000-09-21 22:40 UTC (permalink / raw)
To: Geert Uytterhoeven, Linux/PPC Development; +Cc: paulus

>
>And how to access PCI I/O space from user space? There the MMU doesn't help,
>since the user application (usually XFree86) just looks at the BARs from
>/proc/bus/pci/...

There are two ways:

 - 2.4 and my 2.2.x kernels have a new syscall that returns, for a given
   PCI card, the io base for that card (the offset that must be added to
   the BAR in physical space). AFAIK, Kostas is working on adding support
   for this in XFree.

 - DaveM and I discussed a better solution based on some mmap'ing
   mechanisms & ioctls on /proc/bus/pci entries. It would obviously
   require some arch-specific changes for IO (which are mmap'ed for us
   but not for x86). I had no time to look more in depth at this, but
   this is probably the way to go. We also need an ioctl that would give
   XFree access to pci_enable_device()/pci_disable_device() and possibly
   to the device PM features.

>This is very similar to what many people already suggested on linux-kernel
>years ago: inb() and friends should take an additional argument pci_dev *.

Which could be NULL for legacy.. yup. Well, IOs are slow, so a bit of
overhead is ok ;)

>For kernel space. This doesn't work for user space, unless you mmap
>/dev/pci_{io,mem}_space, which don't exist at the moment.

DaveM made a patch some months ago. I'll ask him if he made any
progress. In the meantime, we have the syscall.

Ben.

* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 14:03 ` Geert Uytterhoeven
2000-09-21 22:40 ` Benjamin Herrenschmidt
@ 2000-09-22 3:53 ` Dan Malek
2000-09-22 11:58 ` Geert Uytterhoeven
1 sibling, 1 reply; 74+ messages in thread
From: Dan Malek @ 2000-09-22 3:53 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: paulus, Linux/PPC Development

Geert Uytterhoeven wrote:

> And how to access PCI I/O space from user space? There the MMU doesn't help,

Sure. I do it all of the time. I have an 860 with a Tundra PCI
bridge (any system but x86 will do this too). I just open /dev/mem,
and map the physical address of the device on the bus. In this
case the Tundra maps physical 0x80000000 to the PCI/ISA I/O space,
and 0xc0000000 to PCI memory. I even have an ISA bridge downstream
that I access ISA boards in the user application. Piece of cake.

In the user application I get some virtual address like 0x34010000
or whatever, and that is mapped through the MMU to the 0x80000000
physical address. You can do this on any processor but the x86.
The virtual address is likely to be different every time, but I
don't care...

> since the user application (usually XFree86) just looks at the BARs from
> /proc/bus/pci/...

Well, there lies the challenge. I _know_ where I am going, so I can
just hard code the addresses. This is the reason I keep asking for
the one single thing that I think is going to make the biggest
difference. We need a mechanism that will tell you how to find these
devices (like give you a virtual address pointer). Reading BARs through
config register access of bridges doesn't provide enough information
to allow you to do in/out, or read/write, or whatever you want to
do. We need a platform specific method of containing this information,
mapping it, and providing it to portable drivers.

> This is very similar to what many people already suggested on linux-kernel
> years ago: inb() and friends should take an additional argument pci_dev *.

Yeah, but it is easier to talk about here. I discovered it is easier
to do something in the machine dependent part of the tree and let Linus
see it work than try to convince him it is a good thing before you
start :-). Hmmm...sounds like me :-). I don't have good enough filters
for linux-kernel any more. Too much noise out there.

> For kernel space. This doesn't work for user space, unless you mmap
> /dev/pci_{io,mem}_space, which don't exist at the moment.

I'm not sure it's a good thing to mmap from user space like this,
although I do it quite a bit because it is easier than writing a
driver. There were discussions not long ago about some interfaces
that would work properly. You would identify a device by more than
just an address, I believe the PCI bus/dev/function was discussed.

> Microchannel is something different.

I was just kidding :-).

    -- Dan

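The /dev/mem technique described above boils down to open() plus mmap()
at a page-aligned physical offset. The sketch below shows only that
mechanism, with assumptions stated up front: map_phys, unmap_phys and
struct phys_map are names made up for this example, and the physical
addresses quoted in the message (0x80000000 and 0xc0000000) are specific
to that one 860/Tundra board. Error handling is kept to the minimum
needed to be safe.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: remembers what to munmap() later. */
struct phys_map {
    void *map_base;             /* page-aligned start of the mapping */
    size_t map_len;             /* length actually mapped */
    volatile uint8_t *regs;     /* pointer to the requested address */
};

/* Map len bytes starting at offset phys of the file at path (normally
 * "/dev/mem"). mmap() wants a page-aligned offset, so round down and
 * remember how far into the page the requested address lies. */
int map_phys(const char *path, off_t phys, size_t len, struct phys_map *out)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && len > 0 && out != NULL);

    off_t aligned = phys & ~((off_t)page - 1);
    size_t skew = (size_t)(phys - aligned);

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    void *p = mmap(NULL, len + skew, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, aligned);
    close(fd);                  /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    out->map_base = p;
    out->map_len = len + skew;
    out->regs = (volatile uint8_t *)p + skew;
    return 0;
}

void unmap_phys(struct phys_map *m)
{
    munmap(m->map_base, m->map_len);
}
```

On the board from the message, a call such as
map_phys("/dev/mem", 0x80000000 + 0x3f8, 8, &m) would be the equivalent
of what is described there, after which m.regs[0] reads the first
register. It needs root and is only meaningful on hardware where that
physical window really is the bridge's I/O space, which is the
portability problem the thread goes on to discuss.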
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-22 3:53 ` Dan Malek
@ 2000-09-22 11:58 ` Geert Uytterhoeven
2000-09-22 18:46 ` Dan Malek
0 siblings, 1 reply; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-22 11:58 UTC (permalink / raw)
To: Dan Malek; +Cc: paulus, Linux/PPC Development

On Thu, 21 Sep 2000, Dan Malek wrote:
> Geert Uytterhoeven wrote:
> > And how to access PCI I/O space from user space? There the MMU doesn't help,
>
> Sure. I do it all of the time. I have an 860 with a Tundra PCI
> bridge (any system but x86 will do this too). I just open /dev/mem,
> and map the physical address of the device on the bus. In this
> case the Tundra maps physical 0x80000000 to the PCI/ISA I/O space,
> and 0xc0000000 to PCI memory. I even have an ISA bridge downstream
> that I access ISA boards in the user application. Piece of cake.
>
> In the user application I get some virtual address like 0x34010000
> or whatever, and that is mapped through the MMU to the 0x80000000
> physical address. You can do this on any processor but the x86.
> The virtual address is likely to be different every time, but I
> don't care...

I know about that one, I use it on my LongTrail as well.

But you suggested the use of the MMU to map all I/O spaces from all bridges
into one merged and consecutive universal I/O space. That works fine in kernel
space, but not in user space, because you still need the correct _physical_
address when mmap'ing /dev/mem.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-22 11:58 ` Geert Uytterhoeven
@ 2000-09-22 18:46 ` Dan Malek
2000-09-22 20:06 ` Frank Rowand
2000-09-23 21:38 ` Matt Porter
0 siblings, 2 replies; 74+ messages in thread
From: Dan Malek @ 2000-09-22 18:46 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: paulus, Linux/PPC Development

Geert Uytterhoeven wrote:

> But you suggested the use of the MMU to map all I/O spaces from all bridges
> into one merged and consecutive universal I/O space.

Well, Paul is suggesting that, I don't really care and I think it
is a detail that doesn't matter. I prefer the approach:

    addr_to_use = tell_me_where()
    inb(addr_to_use)

but I guess I am wrong on this :-). Everyone seems to like hacking
addresses either in the PCI resources structures or within the macros
rather than doing it as a one time operation in some well contained
platform specific function. I already know I can't use a simple
mapping/arithmetic solution on a platform I have, so I am going to
pay a high penalty for people using in/out on PCI I/O space. Fortunately,
I can use drivers for devices that have PCI memory, although I will
have to modify them, and most of the drivers will be new and unique.

> ... but not in user space, because you still need the correct _physical_
> address when mmap'ing /dev/mem.

Right, and I don't really think we should be promoting that kind of
access at the user level. To do so you either need some more complex
method of finding this physical address, and I don't know the outcome
of some of these discussions that took place a while ago, or you should
write a driver to just do it (open the device and mmap that descriptor).

    -- Dan

--
I like MMUs because I don't have a real life.

* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-22 18:46 ` Dan Malek
@ 2000-09-22 20:06 ` Frank Rowand
2000-09-23 21:38 ` Matt Porter
1 sibling, 0 replies; 74+ messages in thread
From: Frank Rowand @ 2000-09-22 20:06 UTC (permalink / raw)
To: Dan Malek; +Cc: Geert Uytterhoeven, paulus, Linux/PPC Development

Dan Malek wrote:
>
> Geert Uytterhoeven wrote:
>
> > But you suggested the use of the MMU to map all I/O spaces from all bridges
> > into one merged and consecutive universal I/O space.
>
> Well, Paul is suggesting that, I don't really care and I think it
> is a detail that doesn't matter. I prefer the approach:
>
>     addr_to_use = tell_me_where()
>     inb(addr_to_use)
>
> but I guess I am wrong on this :-). Everyone seems to like hacking
> addresses either in the PCI resources structures or within the macros
> rather than doing it as a one time operation in some well contained
> platform specific function. I already know I can't use a simple

I've been staying out of this discussion because I didn't want to just
add noise. But what the heck...

I think that the flexibility of the tell_me_where() approach is very
appealing. It allows for arbitrary future designs of hardware topologies,
including complex fabrics. (OK, so I don't know of any planned fabric
designs for PowerPC, just other architectures.)

I do know that the IBM 405 processors have split PCI I/O space into two
non-contiguous blocks in the physical address map. My current implementation
only allows access to the first block. Adding the second block is going
to be a bit "painful" (extra code complexity, etc) for me.

> mapping/arithmetic solution on a platform I have, so I am going to
> pay a high penalty for people using in/out on PCI I/O space. Fortunately,
> I can use drivers for devices that have PCI memory, although I will
> have to modify them, and most of the drivers will be new and unique.

-Frank
--
Frank Rowand <frank_rowand@mvista.com>
MontaVista Software, Inc

* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-22 18:46 ` Dan Malek 2000-09-22 20:06 ` Frank Rowand @ 2000-09-23 21:38 ` Matt Porter 1 sibling, 0 replies; 74+ messages in thread From: Matt Porter @ 2000-09-23 21:38 UTC (permalink / raw) To: Dan Malek; +Cc: Geert Uytterhoeven, paulus, Linux/PPC Development On Fri, Sep 22, 2000 at 02:46:34PM -0400, Dan Malek wrote: > > Geert Uytterhoeven wrote: > > > But you suggested the use of the MMU to map all I/O spaces from all bridges > > into one merged and consecutive universal I/O space. > > Well, Paul is suggesting that, I don't really care and I think it > is a detail that doesn't matter. I prefer the approach: > > addr_to_use = tell_me_where() > inb(addr_to_use) > > but I guess I am wrong on this :-). Everyone seems to like hacking > addresses either in the PCI resources structures or within the macros > rather than doing it as a one time operation in some well contained > platform specific function. I already know I can't use a simple > mapping/arithmetic solution on a platform I have, so I am going to > pay a high penalty for people using in/out on PCI I/O space. Fortunately, > I can use drivers for devices that have PCI memory, although I will > have to modify them, and most of the drivers will be new and unique. Paul's major concern is to not disturb the legacy drivers because of the unknown level of effort to "move that mountain" of developers to enhance them with a new I/O access scheme like you're suggesting. I would honestly like to see something like the above. I said that if we have to stick with the legacy inb usage then the MMU handled fixup would be preferred. > > ... but not in user space, because you still need the correct _physical_ > > address when mmap'ing /dev/mem. > > Right, and I don't really think we should be promoting that kind of > access at the user level. 
To do so you either need some more complex > method of finding this physical address, and I don't know the outcome > of some of these discussions that took place a while ago, or you should > write a driver to just do it (open the device and mmap that descriptor). Absolutely, if people want to mmap /dev/mem they'll have to do the hard work of getting the correct physical address on their own. -- Matt Porter MontaVista Software, Inc. mporter@mvista.com ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-21 6:51 ` Dan Malek 2000-09-21 14:03 ` Geert Uytterhoeven @ 2000-09-21 20:22 ` Matt Porter 2000-09-22 3:49 ` Paul Mackerras 2 siblings, 0 replies; 74+ messages in thread From: Matt Porter @ 2000-09-21 20:22 UTC (permalink / raw) To: Dan Malek; +Cc: paulus, Linux/PPC Development On Thu, Sep 21, 2000 at 02:51:14AM -0400, Dan Malek wrote: > > I actually think we are in nearly violent agreement, and I am getting > way too tired tonight to continue much further..... > > Paul Mackerras wrote: > > We could do that too, we would just have to make sure that we assigned > > PCI I/O addresses so that no two bridges had devices in the same 4k > > range, then we could set up the virtual->physical mapping to give the > > illusion of a single I/O space. > > I think we agree that we just use the PCI bridges to the best of > their ability, and let the MMU do the reset. There are combinations > of this that are more efficient on some systems that others. I have > no illusion of requiring a single I/O space (that's what MMUs are for :-). I think that for a lot of bridges we won't have to have the MMU do the heavy lifting. Now that I understand the virtual mapping approach better I'm in agreement. -- Matt Porter MontaVista Software, Inc. mporter@mvista.com ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-21 6:51 ` Dan Malek 2000-09-21 14:03 ` Geert Uytterhoeven 2000-09-21 20:22 ` Matt Porter @ 2000-09-22 3:49 ` Paul Mackerras 2000-09-22 4:16 ` Dan Malek 2000-09-23 12:34 ` Geert Uytterhoeven 2 siblings, 2 replies; 74+ messages in thread From: Paul Mackerras @ 2000-09-22 3:49 UTC (permalink / raw) To: Dan Malek; +Cc: Linux/PPC Development Dan Malek writes: > > I think my basic point is that a setup where you can't do inb(n) to > > read the byte at address n in PCI I/O space is broken. > > I agree. I am not suggesting you shouldn't. I'm just discussing > what 'n' should be :-). I think the only thing that makes sense is to say that `n' is the value that the device sees on the PCI AD lines during the address cycle. It's the value that gets compared with the value in the device's BARs. This discussion has been useful, it has convinced me that we really should have just one I/O space (or at least the illusion of a single I/O space), not one I/O space per host bridge. We should arrange that PCI boards behind different host bridges are not assigned I/O addresses in the same 4k range and then set up the virtual -> physical mapping appropriately. What I mean by that is that for each 4k range, say starting at i, we identify which bridge `b' has stuff in that 4k range and then map virtual _IO_BASE + i to physical io_base(b) + i, where io_base(b) is the I/O physical base address for bridge b. In other words we are mapping the physical addresses where each bridge has its I/O space to the same virtual area, and just mapping through the 4k regions where each bridge has stuff. > The "handle" (address) you use in the in/out or read/write will be > mapped through the MMU of any processor other than the x86. On the Not correct actually, on sparc64 it is actually a physical address, for read/write at least. > > Huh??? 
the drivers won't have to be changed, they just go on doing > > inb(pci_dev->resource[0].start) or whatever > > Ahhhh...OK....here we go...examples :-). I contend that access is > wrong... No, it's correct. I was going through my old emails yesterday and I found this email from Dave Miller to linux-kernel: > Does this means that there is no way to mmap the PCI IO space on > any platform other than ia32? > > One needs to be a bit more specific for me to give you an > answer :-) > > Inside the kernel: > > 1) PCI I/O space is accessed by obtaining the base address via the > appropriate pci_dev->resource[xxx] value, and feeding that directly > into inb/inw/inl and friends. > > 2) PCI MEM space is accessed by obtaining the opaque MEM base cookie > in pci_dev->resource[xxx], mapping it with ioremap(cookie), and > feeding what you obtain from that to readw and friends. When > done with the MEM space area, you iounmap it. So if I'm wrong, I'm in good company. :-) If you think I'm wrong, I suggest that it is actually people like Dave Miller and Linus that you need to be convincing, and that this discussion should move to linux-kernel. > > inb(n) should do whatever is necessary to access address n in PCI I/O > > space. > > Ummm...no :-). inb is an x86 instruction and you have to use it on > that platform. It's a wart they have to live with. I think Linux Ummm...no :-). inb is the accessor function for reading bytes from PCI I/O space. It happens to have the same name as an x86 instruction for hysterical raisins, that's all. Next you'll be telling me that cli is an x86 instruction and therefore we shouldn't use it in our drivers. (And yes, I know it's better to do spin_lock_irqsave.) > None, if 'n' or the result of ioreamp() (I don't like that function > much either :-), is ready to be used. I just don't like doing all of > the arithmetic in the in/out read/write macros. That should all be > done more intelligently by some platform functions only once. 
Given how long I/O accesses take (hundreds of ns, at the minimum) the cost of adding a constant is truly negligible. Paul. -- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. +61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
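The 4k-range scheme Paul describes in the message above can be sketched in plain C. This is only a user-space illustration under stated assumptions, not kernel code: `struct host_bridge`, `struct io_map`, `io_map_claim()` and `io_port_to_phys()` are hypothetical names, the 64k size of the I/O space is assumed, and any bridge base addresses passed in are made up. The sketch records which bridge owns each 4k page and computes the physical address that virtual `_IO_BASE + i` would be mapped to, i.e. `io_base(b) + i`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IO_PAGE_SHIFT 12                     /* 4k pages, as in the message */
#define IO_SPACE_SIZE 0x10000u               /* assumption: 64k of PCI I/O space */
#define IO_NPAGES     (IO_SPACE_SIZE >> IO_PAGE_SHIFT)
#define NO_BRIDGE     (-1)

/* Hypothetical per-bridge data: CPU physical base of its I/O window. */
struct host_bridge {
    uint32_t io_phys_base;
};

/* page_owner[p] is the index of the bridge with devices in 4k page p. */
struct io_map {
    const struct host_bridge *bridges;
    int page_owner[IO_NPAGES];
};

static void io_map_init(struct io_map *m, const struct host_bridge *bridges)
{
    unsigned p;

    assert(m != NULL && bridges != NULL);
    m->bridges = bridges;
    for (p = 0; p < IO_NPAGES; p++)
        m->page_owner[p] = NO_BRIDGE;
}

/*
 * Claim the 4k pages covering ports [start, start + len) for bridge b.
 * Returns 0 on success, -1 if the range is invalid or if another bridge
 * already has devices in one of those pages.
 */
static int io_map_claim(struct io_map *m, int b, uint32_t start, uint32_t len)
{
    unsigned p, first, last;

    if (len == 0 || start >= IO_SPACE_SIZE || len > IO_SPACE_SIZE - start)
        return -1;
    first = start >> IO_PAGE_SHIFT;
    last = (start + len - 1) >> IO_PAGE_SHIFT;
    for (p = first; p <= last; p++)
        if (m->page_owner[p] != NO_BRIDGE && m->page_owner[p] != b)
            return -1;
    for (p = first; p <= last; p++)
        m->page_owner[p] = b;
    return 0;
}

/*
 * Compute the physical address backing virtual _IO_BASE + port,
 * i.e. io_base(b) + port for the owning bridge b.
 * Returns 0 and stores the address in *phys, or -1 if no bridge owns the page.
 */
static int io_port_to_phys(const struct io_map *m, uint32_t port, uint32_t *phys)
{
    int b;

    if (port >= IO_SPACE_SIZE)
        return -1;
    b = m->page_owner[port >> IO_PAGE_SHIFT];
    if (b == NO_BRIDGE)
        return -1;
    *phys = m->bridges[b].io_phys_base + port;
    return 0;
}
```

With bridge 0 at a made-up physical base of 0xf0000000 claiming ports 0x3f8-0x3ff, `io_port_to_phys()` for port 0x3f8 yields 0xf00003f8. A second bridge trying to claim a port in the same 4k page is rejected, which is the assignment constraint the scheme depends on.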
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-22 3:49 ` Paul Mackerras
@ 2000-09-22 4:16 ` Dan Malek
2000-09-23 12:34 ` Geert Uytterhoeven
1 sibling, 0 replies; 74+ messages in thread
From: Dan Malek @ 2000-09-22 4:16 UTC (permalink / raw)
To: paulus; +Cc: Linux/PPC Development
Paul Mackerras wrote:
> I think the only thing that makes sense is to say that `n' is the
> value that the device sees on the PCI AD lines during the address
> cycle. It's the value that gets compared with the value in the
> device's BARs.
Fine....
> So if I'm wrong, I'm in good company. :-)
You are not wrong today. You can make something work within the
constraints chosen. This discussion has taken place many times over
the past many years of computing history. Fortunately, I will be sitting
on a beach and not worrying about it before 15 year old (or more)
technology gets re-invented here again :-).
> .....it is actually people like Dave Miller and Linus that you
> need to be convincing, and that this discussion should move to
> linux-kernel.
Not worth it....I have already spent too much time discussing it here.
> Given how long I/O accesses take (hundreds of ns, at the minimum) the
> cost of adding a constant is truly negligible.
...until you realize the trick memory map is costing lots of time
in the TLB miss handler...If you can coerce the bridges to map nicely
into a single big TLB entry or BAT so all you have is a simple
arithmetic operation and bus cycle, this will work great. With busses
increasing in speed, it's way below hundreds of ns, and that memory
cycle to read the io base is going to be a large part of the cycle time.
I'm done now...it was fun :-).
-- Dan
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-22 3:49 ` Paul Mackerras 2000-09-22 4:16 ` Dan Malek @ 2000-09-23 12:34 ` Geert Uytterhoeven 2000-09-27 10:37 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 74+ messages in thread From: Geert Uytterhoeven @ 2000-09-23 12:34 UTC (permalink / raw) To: Paul Mackerras; +Cc: Dan Malek, Linux/PPC Development On Fri, 22 Sep 2000, Paul Mackerras wrote: > No, it's correct. I was going through my old emails yesterday and I > found this email from Dave Miller to linux-kernel: I remember that one :-) > > Does this means that there is no way to mmap the PCI IO space on > > any platform other than ia32? > > > > One needs to be a bit more specific for me to give you an > > answer :-) > > > > Inside the kernel: > > > > 1) PCI I/O space is accessed by obtaining the base address via the > > appropriate pci_dev->resource[xxx] value, and feeding that directly > > into inb/inw/inl and friends. > > > > 2) PCI MEM space is accessed by obtaining the opaque MEM base cookie > > in pci_dev->resource[xxx], mapping it with ioremap(cookie), and > > feeding what you obtain from that to readw and friends. When > > done with the MEM space area, you iounmap it. Life would be much simpler if PCI I/O space used a similar opaque IO base cookie with a corresponding ioportremap(cookie) function (looks a lot like Dan's tell_me_where() function, which I didn't realize until now :-), before feeding everything to inb() and friends. On ia32, ioportremap() would evaluate to the identity. No way we can convince The Others to use this approach? It does sound logical :-) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ** Sent via the linuxppc-dev mail list. 
See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-23 12:34 ` Geert Uytterhoeven @ 2000-09-27 10:37 ` Benjamin Herrenschmidt 2000-09-28 9:59 ` Geert Uytterhoeven 2000-09-28 23:24 ` Frank Rowand 0 siblings, 2 replies; 74+ messages in thread From: Benjamin Herrenschmidt @ 2000-09-27 10:37 UTC (permalink / raw) To: Geert Uytterhoeven, Linux/PPC Development, Paul Mackerras, Dan Malek >Life would be much simpler if PCI I/O space used a similar opaque IO base >cookie with a corresponding ioportremap(cookie) function (looks a lot like >Dan's tell_me_where() function, which I didn't realize until now :-), before >feeding everything to inb() and friends. On ia32, ioportremap() would evaluate >to the identity. > >No way we can convince The Others to use this approach? It does sound logical >:-) I'm not sure it would help. It would probably allow a kind of "mapping on demand" of the IO region on memory mapped IOs platforms, but unless we add another parameter to ioportremap telling it the pci_dev (or at least the bus number), we can't "guess" on which IO bus the device is and which physical base we must use. Ben. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-27 10:37 ` Benjamin Herrenschmidt
@ 2000-09-28 9:59 ` Geert Uytterhoeven
2000-09-28 19:19 ` Benjamin Herrenschmidt
2000-09-29 0:22 ` Paul Mackerras
2000-09-28 23:24 ` Frank Rowand
1 sibling, 2 replies; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-28 9:59 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux/PPC Development, Paul Mackerras, Dan Malek
On Wed, 27 Sep 2000, Benjamin Herrenschmidt wrote:
> >Life would be much simpler if PCI I/O space used a similar opaque IO base
> >cookie with a corresponding ioportremap(cookie) function (looks a lot like
> >Dan's tell_me_where() function, which I didn't realize until now :-), before
> >feeding everything to inb() and friends. On ia32, ioportremap() would evaluate
> >to the identity.
> >
> >No way we can convince The Others to use this approach? It does sound logical
> >:-)
>
> I'm not sure it would help. It would probably allow a kind of "mapping on
> demand" of the IO region on memory mapped IOs platforms, but unless we
> add another parameter to ioportremap telling it the pci_dev (or at least
> the bus number), we can't "guess" on which IO bus the device is and which
> physical base we must use.
But we can find out the IO bus by looking in which region the physical address
is located, right? Or do we have the same region on different IO busses?
That would be really weird! Different IO busses should decode different
regions.
The ioportremap() function would move all overhead from looking up the IO bus
and physical base from inb() and friends to ioportremap().
So instead of doing
    u8 inb(unsigned int phys_offset)
    {
        if (phys_offset >= region1_start && phys_offset <= region1_end)
            return in_8(region1_base+phys_offset);
        else if (phys_offset >= region2_start && phys_offset <= region2_end)
            return in_8(region2_base+phys_offset);
        else
            ...
    }
we can do
    unsigned int ioportremap(unsigned int phys_offset, unsigned int size)
    {
        if (phys_offset >= region1_start && phys_offset <= region1_end)
            return region1_base;
        else if (phys_offset >= region2_start && phys_offset <= region2_end)
            return region2_base;
        else
            ...
        /* perhaps do some ioremap() as well, if this wasn't set up at
           machine init */
    }
    static inline u8 inb(unsigned int virtual_offset)
    {
        return in_8(virtual_offset);
    }
where virtual_offset is made from adding the magic cookie returned by
ioportremap and the offset inside the ioportremap'ed region.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 74+ messages in thread
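The point of Geert's ioportremap() proposal, paying for the region lookup once instead of on every access, can be tried out with a small self-contained C sketch. Everything here is an assumption made for illustration: the two 4k windows, the arrays that simulate the memory-mapped I/O windows, and the accessor name `io_read8()`, which stands in for inb() to avoid suggesting this is the real kernel accessor. Unlike the pseudocode in the message, this version returns a pointer to the requested port itself rather than the region base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One I/O window: ports [start, end] are reached through memory at base. */
struct io_region {
    uint32_t start;
    uint32_t end;      /* inclusive */
    uint8_t *base;     /* where the window is mapped (simulated below) */
};

/* Simulated backing store for two memory-mapped I/O windows (assumption). */
static uint8_t bus0_io[0x1000];
static uint8_t bus1_io[0x1000];

static const struct io_region regions[] = {
    { 0x0000, 0x0fff, bus0_io },
    { 0x1000, 0x1fff, bus1_io },
};

/*
 * One-time lookup: find the window that contains ports [port, port + size)
 * and return a cookie (here simply a pointer) for the first port.
 * Returns NULL if the range is empty or not fully inside one window.
 */
static uint8_t *ioportremap(uint32_t port, uint32_t size)
{
    size_t i;

    if (size == 0)
        return NULL;
    for (i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        const struct io_region *r = &regions[i];

        if (port >= r->start && port <= r->end &&
            size - 1 <= r->end - port)
            return r->base + (port - r->start);
    }
    return NULL;
}

/* Per-access path: no region decoding left, just a load through the cookie. */
static inline uint8_t io_read8(const uint8_t *cookie)
{
    assert(cookie != NULL);
    return *(const volatile uint8_t *)cookie;
}
```

A driver would call `ioportremap()` once at probe time and keep the cookie, so the per-access cost is a single load. A request that straddles two windows is refused rather than silently split.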
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-28 9:59 ` Geert Uytterhoeven @ 2000-09-28 19:19 ` Benjamin Herrenschmidt 2000-09-28 23:33 ` Benjamin Herrenschmidt ` (2 more replies) 2000-09-29 0:22 ` Paul Mackerras 1 sibling, 3 replies; 74+ messages in thread From: Benjamin Herrenschmidt @ 2000-09-28 19:19 UTC (permalink / raw) To: Geert Uytterhoeven, linuxppc-dev >> I'm not sure it would help. It would probably allow a kind of "mapping on >> demand" of the IO region on memory mapped IOs platforms, but unless we >> add another parameter to ioportremap telling it the pci_dev (or at least >> the bus number), we can't "guess" on which IO bus the device is and which >> physical base we must use. > >But we can find out the IO bus by looking in which region the physical address >is located, right? Or do we have the same region on different IO busses? >That would be really weird! Different IO busses should decode different >regions. That mean that you intend to feed a physical address to ioportremap and not an IO address ? Hrm... that mean we need to have the physical address in the device IO resources instead of the content of the IO BAR. Well, how should this work for legacy devices, then ? By assuming addresses below 64k are legacy IO (hard coded) addresses ? I don't like it too much. The more I think about it, the more I want to separate legacy IO macros and PCI IO macros :) I personally would like is: Each bus define resources which are made of bus addresses on this bus (not CPU physical address). This is true for both IO and memory resources. The resources would contain the exact content of the BARs and this would probably allow to keep the resource management on a given bus a lot simpler (without fixup's, hooks, tricky calculations, ...) On most archs, the PCI mem bus address and CPU physical mem address will be the same (but not on some PRePs, AFAIK). 
Then, we can have separate: - isa_io_remap(range), - isa_mem_remap(range), - pci_io_remap(pci_bus, resource_addr) - pci_mem_remap(pci_bus, resource_addr) On most platforms, pci_mem_remap would be a simple #define of ioremap. PReP could handle adding 0xc0000000 there (well, if I understand how PReP work correctly), and all weird combinations can be handled just fine. This also allow us to have the platform support code for isa_io_remap() and isa_mem_remap() artificially put slices of the ISA IO space on various PCI IO busses & devices (for ex. the VGA ranges could be on an AGP bus while the legacy serial port ranges would be on a PCI bus with an ISA bridge). AFAIK, this scheme should be able to handle pretty much all kind of PCI/ ISA busses out there, including multiple hosts with PCI mem at different locations, etc... The BIG issue here is to adapt all drivers... Ben. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
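A rough C sketch of the `pci_io_remap(pci_bus, resource_addr)` and `isa_io_remap(range)` idea from the message above follows. It is an illustration under assumptions, not an existing kernel interface: the `_sketch` names, the `struct pci_bus_sketch` layout and any base addresses passed in are hypothetical, no real mapping is created (only the address arithmetic is shown), and the memory-space variants are left out. Resources keep the raw bus address from the BAR, and the bus argument supplies the missing information about which I/O window to use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bus description of its PCI I/O window. */
struct pci_bus_sketch {
    uint32_t io_phys_base;   /* CPU physical address of the window */
    uint32_t io_size;        /* size of the bus's I/O space */
};

/* The bus chosen by platform code to carry legacy (ISA) I/O ports. */
static const struct pci_bus_sketch *isa_bus_sketch;

/*
 * Translate a bus-relative I/O address (the BAR content) on a given bus
 * into a CPU physical address. Returns 0 on success, -1 on a bad argument.
 */
static int pci_io_remap_sketch(const struct pci_bus_sketch *bus,
                               uint32_t bar_addr, uint32_t *cpu_addr)
{
    assert(cpu_addr != NULL);
    if (bus == NULL || bar_addr >= bus->io_size)
        return -1;
    *cpu_addr = bus->io_phys_base + bar_addr;
    return 0;
}

/* Legacy ports have no pci_dev, so they go to the designated ISA bus. */
static int isa_io_remap_sketch(uint32_t port, uint32_t *cpu_addr)
{
    return pci_io_remap_sketch(isa_bus_sketch, port, cpu_addr);
}
```

With two buses that both decode port 0x3f8, the same BAR value resolves to two different CPU addresses depending on the bus passed in, which is the ambiguity the extra parameter is meant to remove.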
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-28 19:19 ` Benjamin Herrenschmidt
@ 2000-09-28 23:33 ` Benjamin Herrenschmidt
2000-09-29 5:08 ` Dan Malek
2000-09-29 11:37 ` Geert Uytterhoeven
2 siblings, 0 replies; 74+ messages in thread
From: Benjamin Herrenschmidt @ 2000-09-28 23:33 UTC (permalink / raw)
To: linuxppc-dev, Geert Uytterhoeven, paulus
Earlier today, I wrote:
>
>Then, we can have separate:
> - isa_io_remap(range),
> - isa_mem_remap(range),
> - pci_io_remap(pci_bus, resource_addr)
> - pci_mem_remap(pci_bus, resource_addr)
>
> .../...
>
Obviously this would be a long term plan and need more thinking and then
endless discussions on the linux-kernel list...
That's why I liked, as a temporary workaround, what Paul proposed, which
was basically to have the kernel hack an MMU mapping that puts all IO
busses together, the first one being the default for legacy devices
(possibly selected via a kernel arg).
This way, inb/outb would work for legacy devices on this bus and for
normal PCI IO "apertures" by using resource[x]->start as a base (like all
PCI drivers currently do AFAIK).
We would have to fix up the IO resources of all other IO busses, but I think
it's the least painful solution for 2.4.
Ben.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-28 19:19 ` Benjamin Herrenschmidt 2000-09-28 23:33 ` Benjamin Herrenschmidt @ 2000-09-29 5:08 ` Dan Malek 2000-09-29 11:37 ` Geert Uytterhoeven 2 siblings, 0 replies; 74+ messages in thread From: Dan Malek @ 2000-09-29 5:08 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, linuxppc-dev Benjamin Herrenschmidt wrote: > That mean that you intend to feed a physical address to ioportremap and > not an IO address ? Hrm... that mean we need to have the physical address > in the device IO resources instead of the content of the IO BAR. Don't confuse the contents of pci_dev->resource[].start with the value of a BAR you get using a PCI configuration cycle. These are not likely to be the same. The platform dependent PCI "fixup" functions are going to munge the pci_dev->resource[].start with knowledge of bridge mapping and potentially processor mapping. I don't understand why we don't just finish the job of adding _IO_BASE to this and be done with it, then we don't require different in/out macros for the different platforms (except x86). > The more I think about it, the more I want to separate legacy IO macros > and PCI IO macros :) This is just an example that illustrates we need to know more about the resource utilization and mapping of devices within the software that uses these devices. Using separate macros is just implicitly providing information we should have passed as a more flexible parameter, which is the bus mapping knowledge. > AFAIK, this scheme should be able to handle pretty much all kind of PCI/ > ISA busses out there, including multiple hosts with PCI mem at different > locations, etc... Just keep in mind that the address mapping of the processor to the bus is the easy part. There are also interrupt routing and other attributes that have to be considered (prefetch, pipelines, etc.). 
To further complicate the issue, high performance embedded systems will also employ inter-device DMA, so you need to be able to understand the views of the bus from their perspective so a driver can instruct devices to perform these functions. -- Dan ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-28 19:19 ` Benjamin Herrenschmidt 2000-09-28 23:33 ` Benjamin Herrenschmidt 2000-09-29 5:08 ` Dan Malek @ 2000-09-29 11:37 ` Geert Uytterhoeven 2000-09-29 17:12 ` Kostas Gewrgiou ` (3 more replies) 2 siblings, 4 replies; 74+ messages in thread From: Geert Uytterhoeven @ 2000-09-29 11:37 UTC (permalink / raw) To: Benjamin Herrenschmidt, Michel Lanners; +Cc: linuxppc-dev On Thu, 28 Sep 2000, Benjamin Herrenschmidt wrote: > >> I'm not sure it would help. It would probably allow a kind of "mapping on > >> demand" of the IO region on memory mapped IOs platforms, but unless we > >> add another parameter to ioportremap telling it the pci_dev (or at least > >> the bus number), we can't "guess" on which IO bus the device is and which > >> physical base we must use. > > > >But we can find out the IO bus by looking in which region the physical > address > >is located, right? Or do we have the same region on different IO busses? > >That would be really weird! Different IO busses should decode different > >regions. > > That mean that you intend to feed a physical address to ioportremap and > not an IO address ? Hrm... that mean we need to have the physical address > in the device IO resources instead of the content of the IO BAR. Well, > how should this work for legacy devices, then ? By assuming addresses > below 64k are legacy IO (hard coded) addresses ? I don't like it too much. No, I meant I/O addresses. Each PCI bus should decode different I/O addresses, just like it decodes different memory addresses. Say the first bus decodes I/O 0x0000-0x0fff, the second one decodes 0x1000-0x1fff, and so on. If the different busses have different physical addresses, you have to use the region decoding from my previous mail. Hence ioportremap() would move the burden (and overhead) of the region decoding from inb() and friends to one single place: ioportremap(). 
> The more I think about it, the more I want to separate legacy IO macros > and PCI IO macros :) That's impossible since legacy I/O macros _are_ PCI I/O macros. On Fri, 29 Sep 2000, Benjamin Herrenschmidt wrote: > >Essentially we have two cases with inb/outb - ISA devices and PCI > >devices. The ISA devices we handle by making sure that inb(0x3f8) > >hits I/O address 0x3f8 on the first PCI host bridge. The PCI devices > >we handle by setting the pci_dev->resource[].start values to account > >for any mapping we need to do. > > > >My proposal would simplify inb/outb by making _IO_BASE a constant > >rather than a variable. > > Well, the ISA space would be on a fixed bus, but not necessarily bus 0 > (please ;) IIRC, it has to be on bus 0, according to the PCI spec. > On uninorth, with the new bus remap code, the PCI slots will be bus 2 or > something like that. Ideally, the arch would tell you which bus to map > first, possibly overriden by a kernel arg. On a PC, the PCI bus (with ISA) is bus 0, while AGP is on bus 1. Hmm, I'm starting to wonder how legacy VGA works then, since the primary video card these days is usually on the AGP bus, i.e. bus 1. But it's quite possible this is some weird legacy quirk again... On Thu, 28 Sep 2000, Michel Lanners wrote: > On 28 Sep, this message from Geert Uytterhoeven echoed through cyberspace: > >> I'm not sure it would help. It would probably allow a kind of "mapping on > >> demand" of the IO region on memory mapped IOs platforms, but unless we > >> add another parameter to ioportremap telling it the pci_dev (or at least > >> the bus number), we can't "guess" on which IO bus the device is and which > >> physical base we must use. > > > > But we can find out the IO bus by looking in which region the physical address > > is located, right? Or do we have the same region on different IO busses? > > That would be really weird! Different IO busses should decode different > > regions. 
> > No, they map the same bus-view IO space (bus addr. 0x0 - 0xsomething) to > different windows in the processor's memory space. > > In other words, you can have IO port 0x3f8 on each of the PCI buses, but > it will be accessed at different addresses from the processor's point of > view. Hence you have to fixup any pci_dev on the non-primary bus so its resources reflect the `real' I/O addresses, as seen from the CPU. So I/O address 0x3f8 (CPU view) is decoded by the first bus and accesses 0x3f8 on the first bus. Say the second bus decodes, 0x1000-0x1fff. Then I/O address 0x13f8 (CPU view) is decoded by the second bus and accesses 0x3f8 on the second bus. Unless I'm totally mistaken (cfr. the `problem' with AGP video cards on PC above)... [ I think I really need the output of lspci -vv on a PC with PCI, AGP and a PCI-PCI bridge (and lots of cards) to bring some clarity... ] Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 11:37 ` Geert Uytterhoeven @ 2000-09-29 17:12 ` Kostas Gewrgiou 2000-09-29 17:18 ` Benjamin Herrenschmidt ` (2 subsequent siblings) 3 siblings, 0 replies; 74+ messages in thread From: Kostas Gewrgiou @ 2000-09-29 17:12 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: Benjamin Herrenschmidt, Michel Lanners, linuxppc-dev On Fri, 29 Sep 2000, Geert Uytterhoeven wrote: > On a PC, the PCI bus (with ISA) is bus 0, while AGP is on bus 1. > > Hmm, I'm starting to wonder how legacy VGA works then, since the primary video > card these days is usually on the AGP bus, i.e. bus 1. But it's quite possible > this is some weird legacy quirk again... >From the RAC.Notes in the xf4 docs: ... Systems that host more than one bus system link these together using bridges. Bridges are a concern to RAC as they might block or pass specific resources. PCI-PCI bridges may be set up to pass VGA resources to the secondary bus. PCI-ISA buses pass any resources not decoded on the primary PCI bus to the ISA bus. This way VGA resources (although exclusive on the ISA bus) can be shared by ISA and PCI cards. Currently HOST-PCI bridges are not yet handled by RACY as they require specific drivers. ... xf4 plays a lot with this (enabling/disabling cards/bridges) to get multihead support with cards that use VGA resources. Kostas ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 11:37 ` Geert Uytterhoeven 2000-09-29 17:12 ` Kostas Gewrgiou @ 2000-09-29 17:18 ` Benjamin Herrenschmidt 2000-09-29 21:35 ` Michel Lanners 2000-09-30 0:11 ` Matt Porter 3 siblings, 0 replies; 74+ messages in thread From: Benjamin Herrenschmidt @ 2000-09-29 17:18 UTC (permalink / raw) To: Geert Uytterhoeven, linuxppc-dev > >No, I meant I/O addresses. Each PCI bus should decode different I/O addresses, >just like it decodes different memory addresses. > >Say the first bus decodes I/O 0x0000-0x0fff, the second one decodes >0x1000-0x1fff, and so on. If the different busses have different physical >addresses, you have to use the region decoding from my previous mail. Hence >ioportremap() would move the burden (and overhead) of the region decoding from >inb() and friends to one single place: ioportremap(). Well, I see why we didn't understand each other then. That's not what happens actually. At least on Uni-North based Macs, each bus has it's own IO address space assigned from 0x0000 to 0xffff at least. They are accessible at different locations in the CPU physical address space, but they have the same IO address. So in this case, we need some fixup of the resources too. >> The more I think about it, the more I want to separate legacy IO macros >> and PCI IO macros :) > >That's impossible since legacy I/O macros _are_ PCI I/O macros. Provided that those macros all access a single IO space. Which is not the case. I don't say those macros shouldn't resolve to the same code in most cases. >> Well, the ISA space would be on a fixed bus, but not necessarily bus 0 >> (please ;) > >IIRC, it has to be on bus 0, according to the PCI spec. What Apple does and what the PCI spec says are different things. 
Users will want to put legacy serial cards in the PCI slots for example, or PCMCIA cards with legacy stuffs on them, and on UniNorth machines (at least), those are on a bus number than can be 0,1 or 2 depending of various thing (kernel versions, machine model, ...) >On a PC, the PCI bus (with ISA) is bus 0, while AGP is on bus 1. Well, Apple decided they didn't need to hard-wire things that way, and so the AGP is the first hose inside UniNorth and the external PCI is the second. I could tweak the UniNorth probe code to put the second hose as bus 0 since I'm doing bus number remapping anyway... >Hmm, I'm starting to wonder how legacy VGA works then, since the primary video >card these days is usually on the AGP bus, i.e. bus 1. But it's quite possible >this is some weird legacy quirk again... Dunno. >Hence you have to fixup any pci_dev on the non-primary bus so its resources >reflect the `real' I/O addresses, as seen from the CPU. There is no choice since the BAR content overlap as I told you. The machine can have several IO spaces, all having the same 0x0000->0x1ffff range decoded. >So I/O address 0x3f8 (CPU view) is decoded by the first bus and accesses 0x3f8 >on the first bus. It's decoded on whatever bus you decide is the first one, at least on UniNorth-like machines where you have the choice. Ben. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 11:37 ` Geert Uytterhoeven 2000-09-29 17:12 ` Kostas Gewrgiou 2000-09-29 17:18 ` Benjamin Herrenschmidt @ 2000-09-29 21:35 ` Michel Lanners 2000-09-30 0:11 ` Matt Porter 3 siblings, 0 replies; 74+ messages in thread From: Michel Lanners @ 2000-09-29 21:35 UTC (permalink / raw) To: geert; +Cc: bh40, linuxppc-dev On 29 Sep, this message from Geert Uytterhoeven echoed through cyberspace: > On Thu, 28 Sep 2000, Michel Lanners wrote: >> On 28 Sep, this message from Geert Uytterhoeven echoed through cyberspace: >> >> I'm not sure it would help. It would probably allow a kind of "mapping on >> >> demand" of the IO region on memory mapped IOs platforms, but unless we >> >> add another parameter to ioportremap telling it the pci_dev (or at least >> >> the bus number), we can't "guess" on which IO bus the device is and which >> >> physical base we must use. >> > >> > But we can find out the IO bus by looking in which region the physical address >> > is located, right? Or do we have the same region on different IO busses? >> > That would be really weird! Different IO busses should decode different >> > regions. >> >> No, they map the same bus-view IO space (bus addr. 0x0 - 0xsomething) to >> different windows in the processor's memory space. >> >> In other words, you can have IO port 0x3f8 on each of the PCI buses, but >> it will be accessed at different addresses from the processor's point of >> view. > > Hence you have to fixup any pci_dev on the non-primary bus so its resources > reflect the `real' I/O addresses, as seen from the CPU. Yes, you will have to correct the pci_dev IO resources of all devices on buses other than what you assign as being bus 0. In fact, the way Paul and Ben intend to do it is to add a fixed offset in inb() and friends, and to have an MMU mapping such that all IO regions appear as a single contiguous region within the processor's virtual space.
In other words, IO on bus 1 needs to be offset (in pci_dev) by the size of bus 0's IO space, and so on. > So I/O address 0x3f8 (CPU view) is decoded by the first bus and accesses 0x3f8 > on the first bus. > Say the second bus decodes, 0x1000-0x1fff. Then I/O address 0x13f8 (CPU view) > is decoded by the second bus and accesses 0x3f8 on the second bus. Yes, except that the offset to bus 1 will more likely be something like 0x10000 or more, i.e. inb(0x103f8) gives you 0x3f8 on bus 1. Remember we don't want to overlap the IO spaces of the different buses, but rather concatenate them. This saves the hassle of 'correcting' the firmware's IO resource assignments. > [ I think I really need the output of lspci -vv on a PC with PCI, AGP and a > PCI-PCI bridge (and lots of cards) to bring some clarity... ] Sure you want to see that? ;-) Michel ------------------------------------------------------------------------- Michel Lanners | " Read Philosophy. Study Art. 23, Rue Paul Henkes | Ask Questions. Make Mistakes. L-1710 Luxembourg | email mlan@cpu.lu | http://www.cpu.lu/~mlan | Learn Always. " ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
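The concatenation scheme described in the message above can be sketched in plain C. This is only an illustration, not kernel code: the 0x10000 window size is the figure suggested in the message, while `struct io_resource`, `fixup_io_resource()` and the two lookup helpers are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of each host bridge's I/O window in the CPU view. */
#define IO_WINDOW_SIZE 0x10000u

/* Hypothetical stand-in for one I/O resource range of a device. */
struct io_resource {
    uint32_t start;   /* first port, inclusive */
    uint32_t end;     /* last port, inclusive */
};

/* Shift a bus-local port range into the concatenated CPU view. */
static void fixup_io_resource(struct io_resource *res, unsigned int bus_index)
{
    uint32_t offset = bus_index * IO_WINDOW_SIZE;

    /* The range must be well formed and fit inside one window. */
    assert(res->start <= res->end && res->end < IO_WINDOW_SIZE);
    res->start += offset;
    res->end += offset;
}

/* Which bus window does a CPU-view port fall into? */
static unsigned int io_port_bus(uint32_t cpu_port)
{
    return cpu_port / IO_WINDOW_SIZE;
}

/* Which bus-local port does a CPU-view port correspond to? */
static uint32_t io_port_local(uint32_t cpu_port)
{
    return cpu_port % IO_WINDOW_SIZE;
}
```

With these assumptions, a device at ports 0x3f8-0x3ff on bus 1 ends up with resources 0x103f8-0x103ff, matching the inb(0x103f8) example in the message.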
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 11:37 ` Geert Uytterhoeven ` (2 preceding siblings ...) 2000-09-29 21:35 ` Michel Lanners @ 2000-09-30 0:11 ` Matt Porter 3 siblings, 0 replies; 74+ messages in thread From: Matt Porter @ 2000-09-30 0:11 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: Benjamin Herrenschmidt, Michel Lanners, linuxppc-dev On Fri, Sep 29, 2000 at 01:37:45PM +0200, Geert Uytterhoeven wrote: > > On Thu, 28 Sep 2000, Benjamin Herrenschmidt wrote: > > >> I'm not sure it would help. It would probably allow a kind of "mapping on > > >> demand" of the IO region on memory mapped IOs platforms, but unless we > > >> add another parameter to ioportremap telling it the pci_dev (or at least > > >> the bus number), we can't "guess" on which IO bus the device is and which > > >> physical base we must use. > > > > > >But we can find out the IO bus by looking in which region the physical > > address > > >is located, right? Or do we have the same region on different IO busses? > > >That would be really weird! Different IO busses should decode different > > >regions. > > > > That mean that you intend to feed a physical address to ioportremap and > > not an IO address ? Hrm... that mean we need to have the physical address > > in the device IO resources instead of the content of the IO BAR. Well, > > how should this work for legacy devices, then ? By assuming addresses > > below 64k are legacy IO (hard coded) addresses ? I don't like it too much. > > No, I meant I/O addresses. Each PCI bus should decode different I/O addresses, > just like it decodes different memory addresses. > > Say the first bus decodes I/O 0x0000-0x0fff, the second one decodes > 0x1000-0x1fff, and so on. If the different busses have different physical > addresses, you have to use the region decoding from my previous mail. Hence > ioportremap() would move the burden (and overhead) of the region decoding from > inb() and friends to one single place: ioportremap(). 
No, each PCI host bridge segment contains an independent address space and may well have addresses that are identical numerically to addresses on other host bridge segments (I hate that "host" terminology). The host bridges are what map these to sane unique values on the processor bus. -- Matt Porter MontaVista Software, Inc. mporter@mvista.com ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-28 9:59 ` Geert Uytterhoeven 2000-09-28 19:19 ` Benjamin Herrenschmidt @ 2000-09-29 0:22 ` Paul Mackerras 2000-09-29 0:40 ` Benjamin Herrenschmidt 2000-09-29 4:29 ` Dan Malek 1 sibling, 2 replies; 74+ messages in thread From: Paul Mackerras @ 2000-09-29 0:22 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: Linux/PPC Development Geert Uytterhoeven writes: > But we can find out the IO bus by looking in which region the physical address > is located, right? Or do we have the same region on different IO busses? > That would be really weird! Different IO busses should decode different > regions. > > The ioportremap() function would move all overhead from looking up the IO bus > and physical base from inb() and friends to ioportremap(). So instead of doing
>
> u8 inb(unsigned int phys_offset)
> {
>     if (phys_offset >= region1_start && phys_offset <= region1_end)
>         return in_8(region1_base+phys_offset);
>     else if (phys_offset >= region2_start && phys_offset <= region2_end)
>         return in_8(region2_base+phys_offset);
>     else
>         ...
> }
Well, we don't do that now anyway, and no one was suggesting we should. To the extent that anything like that was needed, we would use the MMU to do the necessary translations by setting up the virt -> phys mapping appropriately. My idea in suggesting ioportremap is that it would give you an address that you can use with readb/writeb (or in_8/out_8 if you like), not inb/outb. I personally don't think the ioportremap idea has much value though. Essentially we have two cases with inb/outb - ISA devices and PCI devices. The ISA devices we handle by making sure that inb(0x3f8) hits I/O address 0x3f8 on the first PCI host bridge. The PCI devices we handle by setting the pci_dev->resource[].start values to account for any mapping we need to do. My proposal would simplify inb/outb by making _IO_BASE a constant rather than a variable. Paul. -- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc.
+61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
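The proposal in the message above, a constant _IO_BASE with the MMU doing the translation, can be illustrated with a small user-space sketch. Everything here is an assumption made for the example: a static array stands in for the virtual window that the MMU would map onto the host bridges, and `sketch_inb()`, `sketch_outb()`, `SKETCH_IO_BASE` and `SKETCH_IO_SIZE` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed total size: two concatenated 64 KiB bus windows. */
#define SKETCH_IO_SIZE 0x20000u

/* Stand-in for the virtual region the MMU would map onto the bridges. */
static uint8_t sketch_io_window[SKETCH_IO_SIZE];

/* Fixed base, playing the role of a constant _IO_BASE. */
#define SKETCH_IO_BASE ((volatile uint8_t *)sketch_io_window)

/* Read one byte from port 'port': a single add, no region decoding. */
static uint8_t sketch_inb(uint32_t port)
{
    assert(port < SKETCH_IO_SIZE);
    return SKETCH_IO_BASE[port];
}

/* Write one byte to port 'port' (value first, as with outb). */
static void sketch_outb(uint8_t value, uint32_t port)
{
    assert(port < SKETCH_IO_SIZE);
    SKETCH_IO_BASE[port] = value;
}
```

The point of the sketch is that the accessors contain no per-bus branches: any bus-specific offset lives in the port number handed to them, which is what adjusting pci_dev->resource[].start would provide.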
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 0:22 ` Paul Mackerras @ 2000-09-29 0:40 ` Benjamin Herrenschmidt 2000-09-29 1:17 ` Paul Mackerras 2000-09-29 4:29 ` Dan Malek 1 sibling, 1 reply; 74+ messages in thread From: Benjamin Herrenschmidt @ 2000-09-29 0:40 UTC (permalink / raw) To: paulus, Linux/PPC Development >Essentially we have two cases with inb/outb - ISA devices and PCI >devices. The ISA devices we handle by making sure that inb(0x3f8) >hits I/O address 0x3f8 on the first PCI host bridge. The PCI devices >we handle by setting the pci_dev->resource[].start values to account >for any mapping we need to do. > >My proposal would simplify inb/outb by making _IO_BASE a constant >rather than a variable. Well, the ISA space would be on a fixed bus, but not necessarily bus 0 (please ;) On uninorth, with the new bus remap code, the PCI slots will be bus 2 or something like that. Ideally, the arch would tell you which bus to map first, possibly overridden by a kernel arg. ben. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 0:40 ` Benjamin Herrenschmidt @ 2000-09-29 1:17 ` Paul Mackerras 2000-09-29 4:22 ` Dan Malek 0 siblings, 1 reply; 74+ messages in thread From: Paul Mackerras @ 2000-09-29 1:17 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Linux/PPC Development Benjamin Herrenschmidt writes: > Well, the ISA space would be on a fixed bus, but not necessarily bus 0 > (please ;) > > On uninorth, with the new bus remap code, the PCI slots will be bus 2 or > something like that. Ideally, the arch would tell you which bus to map > first, possibly overriden by a kernel arg. The ISA space would be on the "primary" host bridge, whatever that means. :-) I don't see why it matters on powermacs which is the primary bridge though, since we fortunately don't have any legacy ISA devices. Hum... except for cards in the PC-card slot... So we need to make sure that the bus with the cardbus controller is behind the "primary" bridge. The issue here is that the pcmcia/cardbus stuff will assign a range of I/O addresses to a pcmcia card (starting at say I/O port n) and then expect that inb(n) will access the card. Paul. -- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. +61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 1:17 ` Paul Mackerras @ 2000-09-29 4:22 ` Dan Malek 0 siblings, 0 replies; 74+ messages in thread From: Dan Malek @ 2000-09-29 4:22 UTC (permalink / raw) To: paulus; +Cc: Benjamin Herrenschmidt, Linux/PPC Development Paul Mackerras wrote: > I don't see why it matters on powermacs which is the primary bridge > though, since we fortunately don't have any legacy ISA devices. > Hum... except for cards in the PC-card slot... So we need to make > sure that the bus with the cardbus controller is behind the "primary" > bridge. I thought this whole discussion was started because someone wanted a 16550-like serial card on the PCI bus.... Isn't the cardbus only on the PowerBook/iBook, which only have one PCI bus? Of course, this will change with the new G4 PowerBooks.... > The issue here is that the pcmcia/cardbus stuff will assign a range > of I/O addresses to a pcmcia card (starting at say I/O port n) and > then expect that inb(n) will access the card. That should work, and most of the modules have 'ioport=' configuration option in any case.... -- Dan ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 0:22 ` Paul Mackerras 2000-09-29 0:40 ` Benjamin Herrenschmidt @ 2000-09-29 4:29 ` Dan Malek 2000-09-29 4:36 ` Paul Mackerras 1 sibling, 1 reply; 74+ messages in thread From: Dan Malek @ 2000-09-29 4:29 UTC (permalink / raw) To: paulus; +Cc: Geert Uytterhoeven, Linux/PPC Development Paul Mackerras wrote: > My proposal would simplify inb/outb by making _IO_BASE a constant > rather than a variable. That's the way it used to be, then all of the different ports made a mess of io.h because everyone wanted it different. Then it became a variable (sort of) and the mess moved to platform dependent files, now it is coming back.....Why will it be different this time :-)? -- Dan ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 4:29 ` Dan Malek @ 2000-09-29 4:36 ` Paul Mackerras 2000-09-29 5:40 ` Dan Malek 2000-09-29 19:07 ` Frank Rowand 0 siblings, 2 replies; 74+ messages in thread From: Paul Mackerras @ 2000-09-29 4:36 UTC (permalink / raw) To: Dan Malek; +Cc: Geert Uytterhoeven, Linux/PPC Development Dan Malek writes: > Paul Mackerras wrote: > > > My proposal would simplify inb/outb by making _IO_BASE a constant > > rather than a variable. > > That's the way it used to be, then all of the different ports made > a mess of io.h because everyone wanted it different. Then it became > a variable (sort of) and the mess moved to platform dependent files, > now it is coming back.....Why will it be different this time :-)? Because we'll make it the same value for all ports. Anybody got any objections to 0xff000000? Paul. -- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. +61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 4:36 ` Paul Mackerras @ 2000-09-29 5:40 ` Dan Malek 2000-09-29 19:07 ` Frank Rowand 1 sibling, 0 replies; 74+ messages in thread From: Dan Malek @ 2000-09-29 5:40 UTC (permalink / raw) To: paulus; +Cc: Geert Uytterhoeven, Linux/PPC Development Paul Mackerras wrote: > Because we'll make it the same value for all ports. Anybody got any > objections to 0xff000000? You know I will :-). Most of the embedded processors already map internal resources in this space, which of course can be changed. You know one of the reasons we map legacy I/O 1:1 to the processor is for debug and initialization. Why don't you pick an address like 0x80000000? This will work on at least some existing platforms without any changes (Prep, Chrp, and embedded come to mind :-). -- Dan ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 4:36 ` Paul Mackerras 2000-09-29 5:40 ` Dan Malek @ 2000-09-29 19:07 ` Frank Rowand 2000-09-30 1:39 ` Paul Mackerras 1 sibling, 1 reply; 74+ messages in thread From: Frank Rowand @ 2000-09-29 19:07 UTC (permalink / raw) To: paulus; +Cc: Dan Malek, Geert Uytterhoeven, Linux/PPC Development Paul Mackerras wrote: > > Dan Malek writes: > > > Paul Mackerras wrote: > > > > > My proposal would simplify inb/outb by making _IO_BASE a constant > > > rather than a variable. > > > > That's the way it used to be, then all of the different ports made > > a mess of io.h because everyone wanted it different. Then it became > > a variable (sort of) and the mess moved to platform dependent files, > > now it is coming back.....Why will it be different this time :-)? > > Because we'll make it the same value for all ports. Anybody got any > objections to 0xff000000? > > Paul. > > -- > Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. > +61 2 6262 8990 tel, +61 2 6262 8991 fax > paulus@linuxcare.com.au, http://www.linuxcare.com.au/ > Linuxcare. Support for the revolution. > Yes. It is equivalently mapped on the PPC 405 at 0xe8000000. -Frank -- Frank Rowand <frank_rowand@mvista.com> MontaVista Software, Inc ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-29 19:07 ` Frank Rowand @ 2000-09-30 1:39 ` Paul Mackerras 2000-09-30 22:50 ` Frank Rowand 0 siblings, 1 reply; 74+ messages in thread From: Paul Mackerras @ 2000-09-30 1:39 UTC (permalink / raw) To: frowand; +Cc: Dan Malek, Geert Uytterhoeven, Linux/PPC Development Frank Rowand writes: > Paul Mackerras wrote: [snip] > > Because we'll make it the same value for all ports. Anybody got any > > objections to 0xff000000? > Yes. It is equivalently mapped on the PPC 405 at 0xe8000000. But that's a physical address, not necessarily a virtual address, right? I was talking about virtual address 0xff000000. Any particular reason why you have to have virtual == physical? (If there is, my response will probably be "fix it". :-) Paul. -- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. +61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-30 1:39 ` Paul Mackerras @ 2000-09-30 22:50 ` Frank Rowand 2000-10-01 1:09 ` Dan Malek 0 siblings, 1 reply; 74+ messages in thread From: Frank Rowand @ 2000-09-30 22:50 UTC (permalink / raw) To: paulus; +Cc: frowand, Dan Malek, Geert Uytterhoeven, Linux/PPC Development Paul Mackerras wrote: > > Frank Rowand writes: > > > Paul Mackerras wrote: > [snip] > > > Because we'll make it the same value for all ports. Anybody got any > > > objections to 0xff000000? > > > Yes. It is equivalently mapped on the PPC 405 at 0xe8000000. > > But that's a physical address, not necessarily a virtual address, > right? I was talking about virtual address 0xff000000. Any > particular reason why you have to have virtual == physical? (If there > is, my response will probably be "fix it". :-) If the answer ends up "fix it", that's fine. Dan Malek has been providing me lots of review of the code I've been doing for the 405, helping me to learn the Linux PowerPC way of doing things so I have lots of practice at fixing things. The address 0xe8000000 is both the physical and the virtual address ("equivalently mapped"). I can move the virtual address to 0xff000000 (kernel people can do anything, right?). This raises a question about equivalent mapping though. Everything above ioremap_base is equivalently mapped by ioremap(). In 2.4.0-test2 ioremap_base is initialized in MMU_init():
    0xe8000000  #ifdef CONFIG_IBM405
    0xfffff000  #ifdef CONFIG_POWER4
    0xf0000000  _MACH_prep
    0xf8000000  _MACH_Pmac
    0xe0000000  _MACH_8260
    0xf8000000  everything else
Is it wise to hardcode a virtual address of 0xff000000 to a specific object? If that is done, then an ioremap of physical address 0xff000000 will also have the same virtual address. What am I missing? So to answer your question, I arbitrarily chose 0xe8000000 as ioremap_base because it seemed within reason, given other existing values. I can move ioremap_base to lots of other places if I need to. > > Paul.
> > -- > Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. > +61 2 6262 8990 tel, +61 2 6262 8991 fax > paulus@linuxcare.com.au, http://www.linuxcare.com.au/ > Linuxcare. Support for the revolution. Thanks, Frank -- Frank Rowand <frank_rowand@mvista.com> MontaVista Software, Inc ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-30 22:50 ` Frank Rowand @ 2000-10-01 1:09 ` Dan Malek 2000-10-01 8:16 ` Paul Mackerras 0 siblings, 1 reply; 74+ messages in thread From: Dan Malek @ 2000-10-01 1:09 UTC (permalink / raw) To: frowand; +Cc: paulus, Geert Uytterhoeven, Linux/PPC Development Frank Rowand wrote: > This raises a question about equivalent mapping though. Everything above > ioremap_base is equivalently mapped by ioremap(). In 2.4.0-test2 > ioremap_base is initialized in MMU_init(): This "equivalent" mapping is going to be the greatest challenge. The reason this happens is because we need access through the MMU before the kernel has initialized the kernel VM allocator. Parts of the early kernel initialization that have to access control/status registers of various types are going to be mapped 1:1. After some of these proposed changes, these mappings are going to be destroyed and remapped (to get the "standard" 0xff000000 address). This is really bad for the integrated devices since any that used the original mapping, either in the hardware registers or in data structures, must be tracked down, changed and initialized all over again. Either the "iobase" variable is going to continue to live, or io.h is going to be filled with lots of #ifdefs with different constants (again). > So to answer your question, I arbitrarily chose 0xe8000000 as ioremap_base > because it seemed within reason, given other existing values. I can move > ioremap_base to lots of other places if I need to. The 0xff000000 simply isn't workable on many of the embedded 8xx and cPCI prep/chrp boards for the reason I mentioned above. If Paul wants to do this on the PMacs to assist the problem he is trying to solve, that's fine. I don't see any value, and only lots of time spent in both coding and processor cycles, in trying to make this a common VM mapping across all PowerPC platforms. -- Dan ** Sent via the linuxppc-dev mail list.
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-10-01 1:09 ` Dan Malek @ 2000-10-01 8:16 ` Paul Mackerras 2000-10-01 21:30 ` Dan Malek 0 siblings, 1 reply; 74+ messages in thread From: Paul Mackerras @ 2000-10-01 8:16 UTC (permalink / raw) To: Dan Malek; +Cc: frowand, Linux/PPC Development Dan Malek writes: > This "equivalent" mapping is going to be the greatest challenge. The > reason this happens is because we need access through the MMU before > the kernel has initialized the kernel VM allocator. Parts of the early > kernel initialization that have to access control/status registers of > various types are going to be mapped 1:1. After some of these proposed > changes, these mappings are going to be destroyed and remapped (to get > the "standard" 0xff000000 address). This is really bad for the Ummm, no, I think we may have a misunderstanding here. First, you can use ioremap before the kernel VM allocator is set up, and you can continue to use the virtual addresses you get by doing so. So no mappings are going to be destroyed and remapped. All we do for early ioremaps of addresses < ioremap_base is to allocate addresses starting at ioremap_base and going down. And yes, we will need to do that for addresses >= 0xff000000 too (in fact 0xfe000000 if we are using CONFIG_HIGHMEM). Actually, there's really no reason why we shouldn't do that for all physical addresses. So the code in ioremap would become something like this:
	if (mem_init_done) {
		struct vm_struct *area;
		area = get_vm_area(size, VM_IOREMAP);
		if (area == 0)
			return NULL;
		v = VMALLOC_VMADDR(area->addr);
	} else {
		v = (ioremap_bot -= size);
	}
(leaving out the "if (p >= ioremap_base)" test), and we could initialize ioremap_bot to 0xff000000 (or 0xfe000000 in the highmem case). Secondly, all your integrated devices would be using memory-mapped I/O, so the question of what _IO_BASE is, and how inb/outb work, is pretty much irrelevant to you, isn't it? Paul.
-- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. +61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
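The early-boot branch of the ioremap fragment in the message above (handing out virtual addresses downward from ioremap_bot before the VM allocator exists) can be tried out in isolation. The following is a minimal user-space sketch under stated assumptions: `sketch_ioremap_bot` and `sketch_early_ioremap_addr()` are hypothetical names, addresses are plain 32-bit integers, and the rounding to 4 KiB pages is an addition for the example rather than something taken from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the rounding step. */
#define SKETCH_PAGE_SIZE 0x1000u

/* Next free virtual address; starts at the 0xff000000 value from the thread. */
static uint32_t sketch_ioremap_bot = 0xff000000u;

/* Reserve 'size' bytes of virtual space below the current bottom. */
static uint32_t sketch_early_ioremap_addr(uint32_t size)
{
    /* Round the request up to a whole number of pages (assumption). */
    size = (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);

    /* Reject empty requests and requests that would run past address 0. */
    assert(size != 0 && size <= sketch_ioremap_bot);

    /* Same step as "v = (ioremap_bot -= size);" in the message. */
    sketch_ioremap_bot -= size;
    return sketch_ioremap_bot;
}
```

Because the allocator only ever moves the bottom downward, addresses handed out early stay valid later, which is the property the message relies on when it says no mappings need to be destroyed and remapped.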
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-10-01 8:16 ` Paul Mackerras @ 2000-10-01 21:30 ` Dan Malek 2000-10-01 22:50 ` Paul Mackerras 0 siblings, 1 reply; 74+ messages in thread From: Dan Malek @ 2000-10-01 21:30 UTC (permalink / raw) To: paulus; +Cc: frowand, Linux/PPC Development Paul Mackerras wrote: > Ummm, no, I think we may have a misunderstanding here. > > First, you can use ioremap before the kernel VM allocator is set up, Yes, and I do that. The problem is many of the embedded boards map stuff above 0xff000000 (or 0xf0000000), which overlaps your "standard" PowerPC in/out mapping. On systems that have PCI bridges, legacy hardware, and want to use the standard drivers that in/out, I have to put the IO_BASE someplace else, or remap everything later when the VM allocator is initialized. -- Dan ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-10-01 21:30 ` Dan Malek @ 2000-10-01 22:50 ` Paul Mackerras 2000-10-02 9:04 ` Dan Malek 0 siblings, 1 reply; 74+ messages in thread From: Paul Mackerras @ 2000-10-01 22:50 UTC (permalink / raw) To: Dan Malek; +Cc: frowand, Linux/PPC Development Dan Malek writes: > Yes, and I do that. The problem is many of the embedded boards map stuff > above 0xff000000 (or 0xf0000000), which overlaps your "standard" PowerPC > in/out mapping. Now, are those addresses physical or virtual? My point is that we can do what we like with the virtual address assignments. If you are saying "but we have device registers at physical address 0xff000000" my response is "so why is that a problem?". Paul. -- Paul Mackerras, Senior Open Source Researcher, Linuxcare, Inc. +61 2 6262 8990 tel, +61 2 6262 8991 fax paulus@linuxcare.com.au, http://www.linuxcare.com.au/ Linuxcare. Support for the revolution. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-10-01 22:50 ` Paul Mackerras @ 2000-10-02 9:04 ` Dan Malek 0 siblings, 0 replies; 74+ messages in thread From: Dan Malek @ 2000-10-02 9:04 UTC (permalink / raw) To: paulus; +Cc: frowand, Linux/PPC Development Paul Mackerras wrote: > Now, are those addresses physical or virtual? Both. They get mapped 1:1 in the mmu initialization, ioremap, and work while the mmu is disabled. Simple, easy, debuggers work, everyone is happy. > ... If you are saying "but we have device registers at > physical address 0xff000000" my response is "so why is that a > problem?". Because you are taking something that works just fine and complicating it for me without adding any value. The whole PCI subsystem implementation needs lots of work, and adding a little VM mapping doesn't solve much of the problem. This is probably a bad time for me to comment on this because I am trying to get a PPC750 cPCI board running with 2.4. It worked great in a 2.2 kernel, with all of its multiple bridges and Prep memory map. It does the same stupid thing as a PMac right now, claiming everything is a mapping collision and you can't get there from here. The interrupt routing is all screwed up as well. There were a bunch of PCI updates from Matt Porter in the 2.2 kernel for this, and I don't know why they didn't make it into 2.4. I suspect if they did we could utilize it for PMacs too and just get on with life. Yeah, I can add more code to head.S, have multiple mappings for the same thing, and add more "fixup" functions for those places where it has to be adjusted during the initialization. The kernel is bigger, executing more code, and making it more complicated for others to understand....just so I can map my PCI like a PMac..... I've had enough of PCI for today. Maybe after I sleep on this for a few hours I'll see the light :-). Oh....and then there is his CONFIG_ALL_PPC thing..... -- Dan ** Sent via the linuxppc-dev mail list. 
* Re: __ioremap_at() in 2.4.0-test9-pre2 2000-09-27 10:37 ` Benjamin Herrenschmidt 2000-09-28 9:59 ` Geert Uytterhoeven @ 2000-09-28 23:24 ` Frank Rowand 1 sibling, 0 replies; 74+ messages in thread From: Frank Rowand @ 2000-09-28 23:24 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Geert Uytterhoeven, Linux/PPC Development, Paul Mackerras, Dan Malek Benjamin Herrenschmidt wrote: > > >Life would be much simpler if PCI I/O space used a similar opaque IO base > >cookie with a corresponding ioportremap(cookie) function (looks a lot like > >Dan's tell_me_where() function, which I didn't realize until now :-), before > >feeding everything to inb() and friends. On ia32, ioportremap() would evaluate > >to the identity. > > > >No way we can convince The Others to use this approach? It does sound logical > >:-) > > I'm not sure it would help. It would probably allow a kind of "mapping on > demand" of the IO region on memory mapped IOs platforms, but unless we > add another parameter to ioportremap telling it the pci_dev (or at least > the bus number), we can't "guess" on which IO bus the device is and which > physical base we must use. > > Ben. > Yes, we should add pci_dev to ioportremap(), which can then extract the bus number from pci_dev->pci_bus. -Frank -- Frank Rowand <frank_rowand@mvista.com> MontaVista Software, Inc ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 74+ messages in thread
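The suggestion in the message above, passing the device to ioportremap() so it can pick the right bus, might look roughly like the sketch below. None of this is an existing kernel interface: `struct sketch_pci_dev`, `sketch_ioportremap()` and the per-bus base addresses are made up for illustration, and treating a NULL device as "legacy port on the primary bus" is an assumption about how the legacy case raised earlier in the thread could be handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down device descriptor: only the bus number matters here. */
struct sketch_pci_dev {
    unsigned int bus_number;
};

/* Assumed CPU-side base of each bus's I/O window; the values are invented. */
static const uint32_t sketch_io_base[] = {
    0x80000000u, 0x90000000u, 0xa0000000u
};
#define SKETCH_NUM_BUSES (sizeof(sketch_io_base) / sizeof(sketch_io_base[0]))

/*
 * Translate a bus-local I/O port into a CPU-side address.
 * dev == NULL is taken to mean a legacy port on the primary bus (bus 0).
 * Returns 0 if the device sits on a bus this table does not know about.
 */
static uint32_t sketch_ioportremap(const struct sketch_pci_dev *dev, uint32_t port)
{
    unsigned int bus = dev ? dev->bus_number : 0;

    /* Each bus is assumed to decode a 64 KiB I/O space. */
    assert(port <= 0xffffu);
    if (bus >= SKETCH_NUM_BUSES)
        return 0;
    return sketch_io_base[bus] + port;
}
```

The same port number, 0x3f8 say, then yields a different cookie for each bus, which is the ambiguity the extra parameter is meant to resolve.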
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 5:06 ` Paul Mackerras
2000-09-21 6:51 ` Dan Malek
@ 2000-09-21 13:44 ` Geert Uytterhoeven
2000-09-21 22:41 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-21 13:44 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Dan Malek, Linux/PPC Development

On Thu, 21 Sep 2000, Paul Mackerras wrote:
> Dan Malek writes:
>
> > Yes....IMHO I think the PC is one of the worst architecture designs
> > ever, and making my PowerMac or anything else live within those
> > contraints isn't progress....
>
> Well, your powermac has a PCI bus, and PCI has an I/O space as well as
> a memory space (for better or for worse).
>
> I think my basic point is that a setup where you can't do inb(n) to
> read the byte at address n in PCI I/O space is broken. On systems
> with 1 PCI host bridge, this is unambiguous, on systems with >1 host
> bridge inb(n) should access address n in PCI I/O space on the first
> host bridge.

And how do you handle accesses to PCI I/O space on the other busses?

> > Yes, _someone_ has to know, but when that is hardcoded into a driver,
> > it isn't portable. It's not at that address if it isn't on the first
> > ISA bridge of the first PCI bus, either. That's the basis of my
> > suggestion that drivers don't assume where things are mapped. The
>
> In the case of I/O space, there isn't any mapping. Address n in I/O
> space is accessed with inb(n).

`The I/O space' is the union of all I/O spaces behind all bridges.

> > On a PC with a serial port in the Super I/O on the PCI
> > bus you will still get 0x3f8 (or whatever it is, I never memorized
> > these). I don't know what you get on a PC with more than one
> > PCI bus....
>
> Since an intel CPU has only a single I/O space (just as it has a
> single physical memory space) I assume that each PCI host bridge
> has a window that passes accesses to I/O ports in certain ranges
> through to the PCI bus behind it. Hopefully the ranges are all
> distinct. :-)

That's indeed how it's supposed to work (AFAIK).

> We could do that too, we would just have to make sure that we assigned
> PCI I/O addresses so that no two bridges had devices in the same 4k
> range, then we could set up the virtual->physical mapping to give the
> illusion of a single I/O space.

I think the mapping from 8 (not 64, IIRC) consecutive I/O port addresses to 1
4K page was meant exactly to solve this problem.

> > > > A driver should never simply 'inb(SERIAL_PORT_STATUS)' using some #define,
> > >
> > > Why not?
> >
> > Well, this is exactly why we are all discussing this right now. It
> > doesn't work on anything except a PC.
>
> It doesn't work on anything except a PC, or a prep system, or a chrp,
> or an alpha system, or a sun ultra 5, or anything else where the
> designer has used a super-i/o chip because it is cheap and gives them
> all the usual things they want. In fact it works almost everywhere
> except on powermacs and embedded systems. :-)

Even some embedded systems have Super I/Os :-)

But: Super I/O is legacy I/O, and always present on the first bus (starting at
I/O space address 0). I never heard of a system with a Super I/O on a different
bus, but of course no one prevents me from building a system like that...

Legacy I/O is also limited to 10 bit addresses. This knowledge could optimize
the size of my translation table (cfr. my previous mail).

> > I don't think inb/outb should ever have to "cope" with address
> > calculations.....
>
> inb(n) should do whatever is necessary to access address n in PCI I/O
> space.

Yes indeed.

> > All I'm suggesting is that the address value you give to inb/outb
> > is exactly what it needs to use, and it has to be stored in 32 (or
> > 64) bits. Any solution that maps multiple ISA busses has to do this,
>
> I don't believe there are any systems with multiple ISA buses. That
> would be an abomination. :-)

IIRC the spec allows only one PCI/ISA bridge, and it has to be on the first PCI
bus so legacy I/O accesses work.

And we still didn't mention the nightmare called ISA memory space... Where is
your legacy VGA memory?

Gr{oetje,eeting}s,

        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
        -- Linus Torvalds
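
The "illusion of a single I/O space" idea from the message above (no two
bridges own devices in the same 4k range, so the port number alone selects
the bridge) can be sketched in plain C. This is only an illustration under
stated assumptions, not the kernel's implementation: every name here
(sketch_io_window, sketch_io_claim, sketch_inb) is hypothetical, the port
space is assumed to be 64k, and each bridge's I/O window is assumed to be an
already-mapped byte array in which port n sits at offset n. A real inb() would
also need the appropriate ordering barriers.

```c
#include <stddef.h>
#include <stdint.h>

#define IO_PAGE_SHIFT  12                 /* 4k granularity, as in the mail */
#define IO_SPACE_SIZE  0x10000u           /* assumption: 64k of port numbers */
#define IO_NR_PAGES    (IO_SPACE_SIZE >> IO_PAGE_SHIFT)

/*
 * Hypothetical lookup table: for each 4k page of port numbers, the base
 * of the I/O window of the bridge that owns that page (NULL = unclaimed).
 */
static volatile uint8_t *sketch_io_window[IO_NR_PAGES];

/*
 * Give the 4k page containing 'port' to the bridge whose window starts
 * at 'window'.  Returns -1 if the port is out of range or the page is
 * already owned, which enforces the "no two bridges in the same 4k
 * range" rule.
 */
static int sketch_io_claim(unsigned int port, volatile uint8_t *window)
{
	unsigned int page = port >> IO_PAGE_SHIFT;

	if (port >= IO_SPACE_SIZE || sketch_io_window[page] != NULL)
		return -1;
	sketch_io_window[page] = window;
	return 0;
}

/*
 * Read the byte at I/O address 'port' on whichever bridge owns it.
 * Unclaimed or out-of-range ports read as 0xff in this sketch.
 */
static uint8_t sketch_inb(unsigned int port)
{
	volatile uint8_t *window;

	if (port >= IO_SPACE_SIZE)
		return 0xff;
	window = sketch_io_window[port >> IO_PAGE_SHIFT];
	return window ? window[port] : 0xff;
}
```

With such a table a driver keeps calling inb(n) with the plain port number;
which bridge answers is decided once, when PCI I/O addresses are assigned,
rather than in the driver. The 16-entry table is a consequence of the 64k and
4k assumptions above; the observation that legacy I/O only uses 10-bit
addresses would let the legacy part collapse into the first entry.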
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 13:44 ` Geert Uytterhoeven
@ 2000-09-21 22:41 ` Benjamin Herrenschmidt
2000-09-22 21:59 ` Michel Lanners
0 siblings, 1 reply; 74+ messages in thread
From: Benjamin Herrenschmidt @ 2000-09-21 22:41 UTC (permalink / raw)
To: Geert Uytterhoeven, Linux/PPC Development

>And we still didn't mention the nightmare called ISA memory space... Where is
>your legacy VGA memory?

I think we simply don't have access to it. Maybe on Grackle machines, we
could try to hack the bridge config (MPC106), but I don't know if Apple's
bandit can generate PCI mem cycles at 0.

Well, weren't some of the early Apple PPC servers running AIX based on
Bandit and using legacy VGA text mode ?
In this case, it may be possible to configure Bandit to open a window to
the ISA memory space...

Ben.
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-21 22:41 ` Benjamin Herrenschmidt
@ 2000-09-22 21:59 ` Michel Lanners
0 siblings, 0 replies; 74+ messages in thread
From: Michel Lanners @ 2000-09-22 21:59 UTC (permalink / raw)
To: bh40; +Cc: geert, linuxppc-dev

On 22 Sep, this message from Benjamin Herrenschmidt echoed through cyberspace:
> Well, weren't some of the early Apple PPC servers running AIX based on
> Bandit and using legacy VGA text mode ?
> In this case, it may be possible to configure Bandit to open a window to
> the ISA memory space...

Get Apple to open up the documentation about bandit ;-)

Michel

-------------------------------------------------------------------------
Michel Lanners          | " Read Philosophy. Study Art.
23, Rue Paul Henkes     |   Ask Questions. Make Mistakes.
L-1710 Luxembourg       |
email mlan@cpu.lu       |
http://www.cpu.lu/~mlan |   Learn Always. "
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-19 22:06 ` Matt Porter
2000-09-19 22:58 ` Paul Mackerras
@ 2000-09-20 12:08 ` Geert Uytterhoeven
2000-09-20 16:31 ` Matt Porter
1 sibling, 1 reply; 74+ messages in thread
From: Geert Uytterhoeven @ 2000-09-20 12:08 UTC (permalink / raw)
To: Matt Porter; +Cc: Paul Mackerras, Linux/PPC Development

On Tue, 19 Sep 2000, Matt Porter wrote:
> would only be used on the "primary" host bridge. I realize there
> are plenty of drivers (like de4x5) that insist on using inw/outw
> (and thus break on host bridge 2) but these drivers should be
> fixed.

Is there still a need for de4x5? Since Jeff Garzik fixed the Tulip driver,
that one works fine on PPC.

Gr{oetje,eeting}s,

        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
        -- Linus Torvalds
* Re: __ioremap_at() in 2.4.0-test9-pre2
2000-09-20 12:08 ` Geert Uytterhoeven
@ 2000-09-20 16:31 ` Matt Porter
0 siblings, 0 replies; 74+ messages in thread
From: Matt Porter @ 2000-09-20 16:31 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Linux/PPC Development

On Wed, Sep 20, 2000 at 02:08:42PM +0200, Geert Uytterhoeven wrote:
> On Tue, 19 Sep 2000, Matt Porter wrote:
> > would only be used on the "primary" host bridge. I realize there
> > are plenty of drivers (like de4x5) that insist on using inw/outw
> > (and thus break on host bridge 2) but these drivers should be
> > fixed.
>
> Is there still a need for de4x5? Since Jeff Garzik fixed the Tulip driver,
> that one works fine on PPC.

Yep, there's a little workaround for a buggy SROM on some embedded boards
in the de4x5 that I haven't yet moved to the tulip driver. Of course,
it's written using I/O as well. :-/

--
Matt Porter
MontaVista Software, Inc.
mporter@mvista.com