* pte_pagenr/MAP_NR deleted in pre6
@ 2000-08-10 17:18 Kanoj Sarcar
2000-08-11 2:24 ` David S. Miller
` (2 more replies)
0 siblings, 3 replies; 36+ messages in thread
From: Kanoj Sarcar @ 2000-08-10 17:18 UTC (permalink / raw)
To: linux-mm, linux-kernel; +Cc: rmk, nico, davem, davidm, alan
Thought I would send out a quick note about a change I put into test6.
Basically, to make it easier to implement DISCONTIGMEM systems, the
concepts of page/mem_map number/index have been killed from the generic
(non architecture specific) parts of the kernel. This includes MAP_NR,
pte_pagenr and max_mapnr (although max_mapnr is used by a lot of
architectures, it is not used by the generic kernel anymore).
Two new macros have been born to replace the above ones. The first is
virt_to_page (thusly named by Linus!), which takes a kernel direct
mapped address as input and provides the corresponding struct page. The
other one is VALID_PAGE(), which, given a page struct, determines whether
it is a valid page struct and represents _physical_ memory.
Both virt_to_page and VALID_PAGE are in include/asm*/page.h. I have
tried to make sure there were no mistakes when making the changes for
the various architectures, but I am sure I goofed up a few cases, so
apologies in advance.
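
A minimal user-space sketch of what the two macros amount to on an
architecture with one contiguous mem_map. PAGE_OFFSET, NR_PAGES, the
toy struct page and the model_pa() helper are assumptions made for
illustration; the real definitions live in each architecture's page.h
and may differ.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT  12
#define PAGE_OFFSET 0xC0000000UL  /* assumed start of the kernel direct mapping */
#define NR_PAGES    16            /* toy machine with 16 pages of RAM */

struct page { int count; };       /* stand-in for the kernel's struct page */

static struct page mem_map[NR_PAGES];
static unsigned long max_mapnr = NR_PAGES;

/* Direct-mapped kernel virtual address -> physical address (hypothetical helper). */
#define model_pa(kaddr) ((unsigned long)(kaddr) - PAGE_OFFSET)

/* Takes the place of the old "mem_map + MAP_NR(kaddr)" idiom. */
#define virt_to_page(kaddr) (mem_map + (model_pa(kaddr) >> PAGE_SHIFT))

/* True only when the struct page lies inside the mem_map array. */
#define VALID_PAGE(page) ((unsigned long)((page) - mem_map) < max_mapnr)
```

Generic code that only ever sees struct page pointers no longer cares
how an architecture numbers or lays out its mem_map, which is the point
of the change.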
Also, as I have suggested before, the pte_page implementation in
sparc/sparc64 should be cleaned up, as should the usages of MAP_NR in
the arm code. Russell, Linus has not put in the final patch that will
allow DISCONTIGMEM systems to lay out their mem_map arrays however
they see fit. I have resent it to him; if that is put in, we can get
down to simplifying most of the DISCONTIG arch code.
Thanks.
Kanoj
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-10 17:18 pte_pagenr/MAP_NR deleted in pre6 Kanoj Sarcar
@ 2000-08-11 2:24 ` David S. Miller
  2000-08-14 0:34 ` Anton Blanchard
  2000-08-11 11:50 ` Roman Zippel
  2000-08-15 16:19 ` Stephen C. Tweedie
  2 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2000-08-11 2:24 UTC (permalink / raw)
To: kanoj; +Cc: linux-mm, linux-kernel, rmk, nico, davidm, alan

   Also, as I have suggested before, the pte_page implementation in
   sparc/sparc64 should be cleaned up

I took care of sparc64 and have asked Anton to deal with sparc32.

Later,
David S. Miller
davem@redhat.com

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-11 2:24 ` David S. Miller
@ 2000-08-14 0:34 ` Anton Blanchard
  0 siblings, 0 replies; 36+ messages in thread
From: Anton Blanchard @ 2000-08-14 0:34 UTC (permalink / raw)
To: David S. Miller; +Cc: kanoj, linux-mm, linux-kernel, rmk, nico, davidm, alan

> I took care of sparc64 and have asked Anton to deal with sparc32.

sparc32 has also been fixed.

Anton

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-10 17:18 pte_pagenr/MAP_NR deleted in pre6 Kanoj Sarcar
  2000-08-11 2:24 ` David S. Miller
@ 2000-08-11 11:50 ` Roman Zippel
  2000-08-11 13:20 ` Russell King
  2000-08-15 16:19 ` Stephen C. Tweedie
  2 siblings, 1 reply; 36+ messages in thread
From: Roman Zippel @ 2000-08-11 11:50 UTC (permalink / raw)
To: Kanoj Sarcar; +Cc: linux-mm, linux-kernel

Hi,

> Also, as I have suggested before, the pte_page implementation in
> sparc/sparc64 should be cleaned up, and the usages of MAP_NR in the
> arm code. Russell, Linus has not put in the final patch that will
> allow DISCONTIGMEM systems to lay out their mem_map arrays however
> they see fit, I have resent it to him, if that is put in, we can get
> down to simplifying most of the DISCONTIG arch code.

Can you send me that patch? I'd like to check whether it can be used
for the m68k port. m68k still has its own support for discontinuous
mem, and from what I've seen so far I'm not really convinced yet to
give it up.

Small summary: m68k maps everything together into one virtual mapping
and uses the virtual address as index into memmap. That has the
advantage that the address conversion stuff is concentrated in
__va/__pa and the rest stays simple (e.g. we don't have to deal with
multiple nodes). The only problem is that the generic code must not
assume that a mem zone is a physically continuous area (which is mostly
true; there are currently only two places, and they are easy to fix).

bye, Roman

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-11 11:50 ` Roman Zippel
@ 2000-08-11 13:20 ` Russell King
  2000-08-11 14:56 ` Roman Zippel
  2000-08-11 17:21 ` Kanoj Sarcar
  0 siblings, 2 replies; 36+ messages in thread
From: Russell King @ 2000-08-11 13:20 UTC (permalink / raw)
To: Roman Zippel; +Cc: Kanoj Sarcar, linux-mm, linux-kernel

Roman Zippel writes:
> Can you send me that patch? I'd like to check it, if it can be used for
> the m68k port. m68k still has its own support for discontinuous mem and
> from what I've seen so far I'm not really convinced yet to give it up.

I don't see anything wrong in continuing with this. ARM also does
this in addition to support for the discontig mem stuff. Why?

The general discontig code is ok so long as you have a lot of RAM
in node 0. However, since all allocations currently come from node
0, if this node is small, then there is a chance that you will run
out of memory at bootup, and then not be able to continue (and
because we both use fbcon, there is no message visible to the user,
and hence no diagnostics).

Continuing with the single node but many "areas" that ARM follows, and
from what it sounds like m68k does, means that you can allocate from
any "area", and therefore don't hit this restriction.

One way out of this would be if the NUMA stuff can have the "allocations
only from node 0" feature turned off, and then I'd be happy to let the
ARM version be replaced totally by the discontig case.

Russell King  rmk@arm.linux.org.uk
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-11 13:20 ` Russell King
@ 2000-08-11 14:56 ` Roman Zippel
  2000-08-12 9:18 ` Bjorn Wesen
  2000-08-11 17:21 ` Kanoj Sarcar
  1 sibling, 1 reply; 36+ messages in thread
From: Roman Zippel @ 2000-08-11 14:56 UTC (permalink / raw)
To: Russell King; +Cc: Kanoj Sarcar, linux-mm, linux-kernel

Russell King wrote:

> > Can you send me that patch? I'd like to check it, if it can be used for
> > the m68k port. m68k still has its own support for discontinuous mem and
> > from what I've seen so far I'm not really convinced yet to give it up.
>
> I don't see anything wrong in continuing with this. ARM also does
> this in addition to support for the discontig mem stuff. Why?

My problem is that I'm not really familiar with the high memory
support. The problem here is that the relation between virtual address /
physical address / page struct / memmap+index is hardly documented, and
it gets more interesting when a page struct might also represent an i/o
area (for direct i/o).

> The general discontig code is ok so long as you have a lot of RAM
> in node 0. However, since all allocations currently come from node
> 0, if this node is small, then there is a chance that you will run
> out of memory at bootup, and then not be able to continue (and
> because we both use fbcon, there is no message visible to the user,
> and hence no diagnostics).

Another problem on m68k: I can make almost no assumptions about the
memory layout to play some clever tricks. If I remember correctly I had
some problems with the memmap layout, since lots of code assumed a
continuous memmap and there were some tricks to get the above
relationship right.

bye, Roman

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-11 14:56 ` Roman Zippel
@ 2000-08-12 9:18 ` Bjorn Wesen
  0 siblings, 0 replies; 36+ messages in thread
From: Bjorn Wesen @ 2000-08-12 9:18 UTC (permalink / raw)
To: Roman Zippel; +Cc: linux-mm, linux-kernel

On Fri, 11 Aug 2000, Roman Zippel wrote:
> The problem here is that the relation between virtual address / physical
> address / page struct / memmap+index is hardly documented and it gets
> more interesting when a page struct might also represent an i/o area

Amen to that - I'm doing a 2.4 port currently and our architecture has
all DRAM at a pseudo-physical address 0xc0000000. Figuring out how to
not make mem_map start at 0 and waste a lot of struct page's to cover
everything up to 0xc0000000 and beyond, and what the __pa/__va things
should do with regard to the pseudo-0xc0000000, took some hours of
groping around the archs and bootmem/zone code :) then it suddenly
worked.. and like, "wow, don't touch it again!" :)

(luckily I found a comment in mm/numa.c about exactly that and that
m68k and arm used it - you could never have been led to believe that
looking through the non-commented source :)

The relationships between virtual/logical/physical etc. can be extremely
confusing - our CPU has physical DRAM at 0x40000000 but it is segmented
into 0xc0000000 in kernel-mode, while the paged virtual memory is at 0.
Heh. Fortunately the 0x40000000 business can be largely ignored since it
is only visible inside the TLB - for all other purposes the DRAM is at
0xc...

So what I ended up doing was to make __pa/__va convert between 0xc.. and
0x4.., put PAGE_OFFSET == 0xc.., max/min_low_pfn's at 0xc..., mem_map
indexes start at 0 (corresponding to 0xc).. seems to work so far :)

It does not help of course that all archs do the bootmem and zone
initialization in their own ways :)

-Bjorn
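
The layout Bjorn describes can be modelled in a few lines. This is only
a sketch under the addresses given in his message (DRAM at 0x40000000
in the TLB, visible at 0xc0000000 in kernel mode); the function names
and the 4 KB page size are assumptions, not code from his port.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12           /* assumed page size of 4 KB */
#define PAGE_OFFSET 0xC0000000u  /* where the kernel sees DRAM */
#define PHYS_DRAM   0x40000000u  /* where the TLB sees DRAM */

/* Kernel-visible address -> address as seen inside the TLB. */
static inline uint32_t model_pa(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET + PHYS_DRAM;
}

/* TLB-side address -> kernel-visible address. */
static inline uint32_t model_va(uint32_t paddr)
{
    return paddr - PHYS_DRAM + PAGE_OFFSET;
}

/* mem_map index 0 corresponds to PAGE_OFFSET, so no struct pages
 * are wasted on the hole below the start of DRAM. */
static inline uint32_t model_map_index(uint32_t vaddr)
{
    return (vaddr - PAGE_OFFSET) >> PAGE_SHIFT;
}
```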

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-11 13:20 ` Russell King
  2000-08-11 14:56 ` Roman Zippel
@ 2000-08-11 17:21 ` Kanoj Sarcar
  2000-08-14 9:29 ` Roman Zippel
  1 sibling, 1 reply; 36+ messages in thread
From: Kanoj Sarcar @ 2000-08-11 17:21 UTC (permalink / raw)
To: Russell King; +Cc: Roman Zippel, linux-mm, linux-kernel

>
> Roman Zippel writes:
> > Can you send me that patch? I'd like to check it, if it can be used for

http://oss.sgi.com/projects/numa/download/map.patch

And even if it doesn't help m68k, it definitely will help mips64, ia64
and ARM (from what I am understanding from Russell). So, unless it is
_breaking_ m68k, I would rather see the patch go in ...

> > the m68k port. m68k still has its own support for discontinuous mem and
> > from what I've seen so far I'm not really convinced yet to give it up.
>
> I don't see anything wrong in continuing with this. ARM also does
> this in addition to support for the discontig mem stuff. Why?
>
> The general discontig code is ok so long as you have a lot of RAM
> in node 0. However, since all allocations currently come from node
> 0, if this node is small, then there is a chance that you will run
> out of memory at bootup, and then not be able to continue (and
> because we both use fbcon, there is no message visible to the user,
> and hence no diagnostics).

Note: the biggest component of bootmem allocation is the mem_map for
that node, which happens on specific nodes. I agree, other allocations
happen out of node 0, but if there is a chance that on specific
architectures we might run out of memory on node 0, we can fix this,
although I would like to hear details offline ...

>
> Continuing with the single node but many "areas" that ARM follows, and
> from what it sounds like m68k does, means that you can allocate from
> any "area", and therefore don't hit this restriction.
>
> One way out of this would be if the NUMA stuff can have the "allocations
> only from node 0" feature turned off, and then I'd be happy to let the
> ARM version be replaced totally by the discontig case.

This is not NUMA, this is regular DISCONTIG. One option while doing
alloc_bootmem (ie, no node specified) is to do the allocation from
node 0, since no other node can be guaranteed to exist. If this sounds
too constricting, we can modify alloc_bootmem to try allocating from
all nodes for which alloc_bootmem_node() has already been done.
Shouldn't be too hard to implement, and the changes are completely in
the bootmem allocator code. Let's talk offline (along with Roman) if
you are interested.

Kanoj
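
The fallback Kanoj proposes could look roughly like the toy model
below. It is a sketch only: the struct, the initialized flag (standing
in for "this node's bootmem allocator has already been set up") and
both function names are hypothetical and do not mirror the real bootmem
allocator.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4

struct node_bootmem {
    int    initialized;  /* nonzero once this node's boot allocator is usable */
    size_t free_bytes;   /* bytes still available on this node */
};

static struct node_bootmem nodes[MAX_NODES];

/* Try one specific node; returns the node id on success, -1 on failure. */
static int alloc_from_node(int nid, size_t size)
{
    if (!nodes[nid].initialized || nodes[nid].free_bytes < size)
        return -1;
    nodes[nid].free_bytes -= size;
    return nid;
}

/* Node-less allocation: start at node 0, then fall back to every other
 * node that has been set up, instead of failing when node 0 is small. */
static int alloc_any_node(size_t size)
{
    for (int nid = 0; nid < MAX_NODES; nid++)
        if (alloc_from_node(nid, size) == nid)
            return nid;
    return -1;
}
```

With a small node 0, a large request lands on the first later node that
can hold it, which is the failure Russell describes being avoided.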

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-11 17:21 ` Kanoj Sarcar
@ 2000-08-14 9:29 ` Roman Zippel
  0 siblings, 0 replies; 36+ messages in thread
From: Roman Zippel @ 2000-08-14 9:29 UTC (permalink / raw)
To: Kanoj Sarcar; +Cc: Russell King, linux-mm, linux-kernel

Hi,

> And even if it doesn't help m68k, it definitely will help mips64, ia64
> and ARM (from what I am understanding from Russell). So, unless it is
> _breaking_ m68k, I would rather see the patch go in ...

No, it doesn't :) and I think I can start thinking to make it usable
under m68k.

bye, Roman

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-10 17:18 pte_pagenr/MAP_NR deleted in pre6 Kanoj Sarcar
  2000-08-11 2:24 ` David S. Miller
  2000-08-11 11:50 ` Roman Zippel
@ 2000-08-15 16:19 ` Stephen C. Tweedie
  2000-08-16 8:25 ` Roman Zippel
  2 siblings, 1 reply; 36+ messages in thread
From: Stephen C. Tweedie @ 2000-08-15 16:19 UTC (permalink / raw)
To: Kanoj Sarcar; +Cc: linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

Hi,

On Thu, Aug 10, 2000 at 10:18:49AM -0700, Kanoj Sarcar wrote:

> Thought I would send out a quick note about a change I put into test6.
> Basically, to make it easier to implement DISCONTIGMEM systems, the
> concepts of page/mem_map number/index have been killed from the generic
> (non architecture specific) parts of the kernel.

Excellent, this will make it _tons_ easier for me to create new zones
of mem_map arrays on the fly to allow us to create struct pages for
PCI IO-aperture memory (necessary for kiobuf mappings of IO memory).

Thanks!
--Stephen

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-15 16:19 ` Stephen C. Tweedie
@ 2000-08-16 8:25 ` Roman Zippel
  2000-08-16 17:13 ` Kanoj Sarcar
  2000-08-16 18:17 ` Stephen C. Tweedie
  0 siblings, 2 replies; 36+ messages in thread
From: Roman Zippel @ 2000-08-16 8:25 UTC (permalink / raw)
To: Stephen C. Tweedie
Cc: Kanoj Sarcar, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

Hi,

> Excellent, this will make it _tons_ easier for me to create new zones
> of mem_map arrays on the fly to allow us to create struct pages for
> PCI IO-aperture memory (necessary for kiobuf mappings of IO memory).

A related question: do you already have an idea what the driver
interface for that could look like? I mean, some drivers need a virtual
address, some need the physical address for dma and some of them might
need bounce buffers. E.g. I don't know how to get (quickly) from a page
struct which represents an io mapping to the physical address. Will we
add some generic functions for this which can be used by drivers, or
even let the drivers only specify their requirements and have the
buffer code generate an appropriate io request? I have a few ideas, but
I don't know if concrete plans already exist.

bye, Roman

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 8:25 ` Roman Zippel
@ 2000-08-16 17:13 ` Kanoj Sarcar
  2000-08-16 18:20 ` Stephen C. Tweedie
  2000-08-16 18:17 ` Stephen C. Tweedie
  1 sibling, 1 reply; 36+ messages in thread
From: Kanoj Sarcar @ 2000-08-16 17:13 UTC (permalink / raw)
To: Roman Zippel
Cc: Stephen C. Tweedie, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

>
> Hi,
>
> > Excellent, this will make it _tons_ easier for me to create new zones
> > of mem_map arrays on the fly to allow us to create struct pages for
> > PCI IO-aperture memory (necessary for kiobuf mappings of IO memory).
>
> A related question: do you already have an idea how the driver interface
> for that could look like? I mean, some drivers need a virtual address,
> some need the physical address for dma and some of them might need
> bounce buffers. E.g. I don't know how to get (quickly) from a page
> struct which represents an io mapping to the physical address. Will we
> add some generic functions for this which can be used by drivers or even
> let the drivers only specify its requirements and the buffer code will
> generate an appropriate io request. I have a few ideas, but I don't know
> if already concrete plans exists.
>
> bye, Roman
>

FWIW, Linus was mildly suggesting I implement page_to_phys, to complement
virt_to_page. I didn't see an immediate need for it, so I just did the
bit I am interested in for now. If you look, most of the mk_pte()
definitions should actually use page_to_phys ...

Of course, I am talking about struct pages that represent memory, not
io devices. I don't think either one of us was thinking about that ...

I also thought about whether page_to_phys would be useful for drivers,
and decided against it, since the PCI-DMA apis, which are quite a
standard now, want to go to the PCI bus addresses instead of physical
addresses.

BTW, I am not sure I understand when you say "some drivers need a virtual
address, some need the physical address for dma and some of them might need
bounce buffers". I believe the goal should be to pass in either a. struct
page or b. physical address; then the driver makes the PCI-DMA calls to
determine whether it can dma directly into the page (or transparently
get a page which it can dma into, with the PCI-DMA layer handling the
bouncing completely). Passing virtual addresses into drivers is not
good, if you think about the i386 class machines which can not direct
map the entire memory (hence would need kmap addresses for high pages).

Finally, whether the drivers accept virtual addresses or struct pages,
they should not be trying to interpret their input, but rather treat
the input as opaque cookies, to be passed on to the PCI-DMA layer ...

Kanoj

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 17:13 ` Kanoj Sarcar
@ 2000-08-16 18:20 ` Stephen C. Tweedie
  2000-08-16 18:24 ` David S. Miller
  ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Stephen C. Tweedie @ 2000-08-16 18:20 UTC (permalink / raw)
To: Kanoj Sarcar
Cc: Roman Zippel, Stephen C. Tweedie, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

Hi,

On Wed, Aug 16, 2000 at 10:13:21AM -0700, Kanoj Sarcar wrote:
>
> FWIW, Linus was mildly suggesting I implement page_to_phys, to complement
> virt_to_page.

It's part of what is necessary if we want to push kiobufs into the
driver layers. page_to_pfn is needed for PAE36 support so that
PCI64 or dual-address-cycle drivers can handle physical addresses
longer than 32 bits.

> BTW, I am not sure I understand when you say "some drivers need a virtual
> address, some need the physical address for dma and some of them might need
> bounce buffers". I believe, the goal should be to pass in either a. struct
> page or b. physical address

Yes, but different drivers have different requirements on those struct
page *s. Drivers which do programmed IO need to be able to turn the
page into a kernel virtual address. Drivers which can access >32-bit
addresses need to turn the page into an index which fits inside 32
bits. Drivers which do DMA but only to <4GB addresses need bounce
buffers.

That is irrelevant as far as the kiobuf data structure is concerned,
but it is very important for the internals of the drivers, so this
sort of functionality must be made available for drivers to use
internally as needed.

Cheers,
 Stephen

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 18:20 ` Stephen C. Tweedie
@ 2000-08-16 18:24 ` David S. Miller
  2000-08-16 19:53 ` Stephen C. Tweedie
  2000-08-16 18:47 ` Kanoj Sarcar
  2000-08-16 22:22 ` Kanoj Sarcar
  2 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2000-08-16 18:24 UTC (permalink / raw)
To: sct; +Cc: kanoj, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

   Drivers which do DMA but only to <4GB addresses need bounce
   buffers.

This is only true in an architecture specific sense (ie. x86 systems
are the ones which have this particular restriction).

Which is one of the reasons I wish the bounce buffer stuff went into
the place it belongs, behind the pci_dma API. If we move to a
page+offset model for drivers, we could do exactly this and also
handle cases like ix86 PAE.

Later,
David S. Miller
davem@redhat.com

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 18:24 ` David S. Miller
@ 2000-08-16 19:53 ` Stephen C. Tweedie
  0 siblings, 0 replies; 36+ messages in thread
From: Stephen C. Tweedie @ 2000-08-16 19:53 UTC (permalink / raw)
To: David S. Miller
Cc: sct, kanoj, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

Hi,

On Wed, Aug 16, 2000 at 11:24:23AM -0700, David S. Miller wrote:
>
> Which is one of the reasons I wish the bounce buffer stuff went into
> the place it belongs, behind the pci_dma API. If we move to a
> page+offset model for drivers, we could do exactly this and also
> handle cases like ix86 PAE.

Fine --- it is pretty easy to generate a scatterlist from a kiobuf so
that drivers using a kiovec API can use the existing pci_dma support.

Cheers,
 Stephen

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 18:20 ` Stephen C. Tweedie
  2000-08-16 18:24 ` David S. Miller
@ 2000-08-16 18:47 ` Kanoj Sarcar
  2000-08-16 18:39 ` David S. Miller
  2000-08-16 22:22 ` Kanoj Sarcar
  2 siblings, 1 reply; 36+ messages in thread
From: Kanoj Sarcar @ 2000-08-16 18:47 UTC (permalink / raw)
To: Stephen C. Tweedie
Cc: Roman Zippel, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

>
> Hi,
>
> On Wed, Aug 16, 2000 at 10:13:21AM -0700, Kanoj Sarcar wrote:
> >
> > FWIW, Linus was mildly suggesting I implement page_to_phys, to complement
> > virt_to_page.
>
> It's part of what is necessary if we want to push kiobufs into the
> driver layers. page_to_pfn is needed for PAE36 support so that
> PCI64 or dual-address-cycle drivers can handle physical addresses
> longer than 32 bits.
>
> > BTW, I am not sure I understand when you say "some drivers need a virtual
> > address, some need the physical address for dma and some of them might need
> > bounce buffers". I believe, the goal should be to pass in either a. struct
> > page or b. physical address
>
> Yes, but different drivers have different requirements on those struct
> page *s. Drivers which do programmed IO need to be able to turn the
> page into a kernel virtual address. Drivers which can access >32-bit
> addresses need to turn the page into an index which fits inside 32
> bits. Drivers which do DMA but only to <4GB addresses need bounce
> buffers.
>
> That is irrelevant as far as the kiobuf data structure is concerned,
> but it is very important for the internals of the drivers, so this
> sort of functionality must be made available for drivers to use
> internally as needed.
>
> Cheers,
>  Stephen
>

It might be easier all around if we could all agree on what drivers
need to do. As David Miller points out, whether a driver can dma into
>32-bit addresses etc is also a function of the architecture, so this
is best hidden under the per architecture PCI-DMA layer. So, if the
driver writer codes according to this, he will transparently get the
best performance for any architecture ...

I guess finally, drivers will either get one or a list of

1. struct page or
2. pfn or
3. paddr_t (unsigned long long on PAE36, unsigned long on other
   platforms)

The PCI-DMA layer should be able to handle this type of input. The
driver must not attempt to convert this to PCI bus addresses. The
driver must call an architecture hook, like kmap(), to get a kernel
virtual address for the underlying page. It should be able to do
without needing the physical address of the page; the PCI-DMA routines
will know how to do that.

kiobufs might need to get some hooks into PCI-DMA, but shouldn't this
suffice, mostly? Or is this being too restrictive for some drivers?

Kanoj

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 18:47 ` Kanoj Sarcar
@ 2000-08-16 18:39 ` David S. Miller
  2000-08-16 19:30 ` Stephen C. Tweedie
  0 siblings, 1 reply; 36+ messages in thread
From: David S. Miller @ 2000-08-16 18:39 UTC (permalink / raw)
To: kanoj; +Cc: sct, roman

   I guess finally, drivers will either get one or a list of

   1. struct page or

Make this "struct page and offset", a page is not enough by itself to
indicate all the necessary information, you need an offset within the
page as well.

Later,
David S. Miller
davem@redhat.com

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 18:39 ` David S. Miller
@ 2000-08-16 19:30 ` Stephen C. Tweedie
  0 siblings, 0 replies; 36+ messages in thread
From: Stephen C. Tweedie @ 2000-08-16 19:30 UTC (permalink / raw)
To: David S. Miller
Cc: kanoj, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

Hi,

On Wed, Aug 16, 2000 at 11:39:17AM -0700, David S. Miller wrote:
>
>    I guess finally, drivers will either get one or a list of
>
>    1. struct page or
>
> Make this "struct page and offset", a page is not enough by itself to
> indicate all the necessary information, you need an offset within the
> page as well.

That's exactly what a kiobuf is --- a vector of struct page *s, plus
an arbitrary offset and length.

--Stephen
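
To make the "vector of pages plus offset and length" idea concrete,
here is a simplified model that walks such a buffer one page-sized
piece at a time, the way a driver building a scatterlist might. The
toy_kiobuf and toy_segment types and the toy_kiobuf_segment() helper
are illustrative assumptions; the real struct kiobuf has more fields
and this is not its API.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

struct page;                 /* opaque: callers only pass pointers around */

struct toy_kiobuf {
    size_t        nr_pages;  /* number of entries in maplist */
    size_t        offset;    /* byte offset into maplist[0], assumed < PAGE_SIZE */
    size_t        length;    /* total number of bytes described */
    struct page **maplist;   /* the vector of struct page pointers */
};

struct toy_segment {
    struct page *page;       /* page holding this piece */
    size_t       offset;     /* where the piece starts inside the page */
    size_t       len;        /* bytes of the piece inside the page */
};

/* Fill *seg with the piece that falls in page idx.
 * Returns 1 on success, 0 when idx is past the end of the buffer. */
static int toy_kiobuf_segment(const struct toy_kiobuf *kb, size_t idx,
                              struct toy_segment *seg)
{
    size_t end = kb->offset + kb->length;  /* measured from start of page 0 */
    size_t page_start = idx * PAGE_SIZE;
    size_t page_end;

    if (kb->length == 0 || idx >= kb->nr_pages || page_start >= end)
        return 0;

    page_end = end - page_start;
    if (page_end > PAGE_SIZE)
        page_end = PAGE_SIZE;

    seg->page = kb->maplist[idx];
    seg->offset = (idx == 0) ? kb->offset : 0;  /* only page 0 starts mid-page */
    seg->len = page_end - seg->offset;
    return 1;
}
```

Each (page, offset, len) triple is the page+offset unit David asks for,
and none of it requires the driver to see a physical or bus address.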

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 18:20 ` Stephen C. Tweedie
  2000-08-16 18:24 ` David S. Miller
  2000-08-16 18:47 ` Kanoj Sarcar
@ 2000-08-16 22:22 ` Kanoj Sarcar
  2000-08-17 9:11 ` Stephen C. Tweedie
  2 siblings, 1 reply; 36+ messages in thread
From: Kanoj Sarcar @ 2000-08-16 22:22 UTC (permalink / raw)
To: Stephen C. Tweedie
Cc: Roman Zippel, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

>
> Hi,
>
> On Wed, Aug 16, 2000 at 10:13:21AM -0700, Kanoj Sarcar wrote:
> >
> > FWIW, Linus was mildly suggesting I implement page_to_phys, to complement
> > virt_to_page.
>
> It's part of what is necessary if we want to push kiobufs into the
> driver layers. page_to_pfn is needed to for PAE36 support so that
> PCI64 or dual-address-cycle drivers can handle physical addresses
> longer than 32 bits long.
>

While we are on this topic, something like

#define page_to_phys(page) \
	((((page)-(page)->zone->zone_mem_map) << PAGE_SHIFT) \
	+ ((page)->zone->zone_start_paddr))

should work on all platforms on 2.4. (You might have to add in an
unsigned long long somewhere in there for PAE36).

Kanoj

* Re: pte_pagenr/MAP_NR deleted in pre6
  2000-08-16 22:22 ` Kanoj Sarcar
@ 2000-08-17 9:11 ` Stephen C. Tweedie
  2000-08-17 19:07 ` Kanoj Sarcar
  0 siblings, 1 reply; 36+ messages in thread
From: Stephen C. Tweedie @ 2000-08-17 9:11 UTC (permalink / raw)
To: Kanoj Sarcar
Cc: Stephen C. Tweedie, Roman Zippel, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

Hi,

On Wed, Aug 16, 2000 at 03:22:07PM -0700, Kanoj Sarcar wrote:

> > It's part of what is necessary if we want to push kiobufs into the
> > driver layers. page_to_pfn is needed to for PAE36 support so that
> > PCI64 or dual-address-cycle drivers can handle physical addresses
> > longer than 32 bits long.
>
> While we are on this topic, something like
>
> #define page_to_phys(page) \
> 	((((page)-(page)->zone->zone_mem_map) << PAGE_SHIFT) \
> 	+ ((page)->zone->zone_start_paddr))
>
> should work on all platforms on 2.4. (You might have to add in an
> unsigned long long somewhere in there for PAE36).

The long long is exactly what we need to avoid: PAE36 still has
pointers as 32-bit values. Only ptes get the 64-bit treatment.

Adding a BUG() test to detect illegal accesses to >4GB pages on PAE36
would be fine. If we have the appropriate bounce buffer support in
place in pci_dma or wherever suits it, then by the time a driver is
doing page_to_phys() it should already have created the appropriate
bounce buffers and so the BUG() test is fine.

For DAC/PCI64 drivers, though, we need a separate macro like
page_to_pfn so that we can identify the physical address via a 32-bit
value. The driver can then shift that into a 64-bit long long if it
wants to --- there's no need to introduce new 64-bit macros into the
mm just for this special case.

Cheers,
 Stephen
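
Stephen's page_to_pfn suggestion can be sketched as follows. This is a
user-space model, not kernel code: the zone_start_pfn field and the
pfn_to_bus64() helper are hypothetical, and the 4 KB page size is an
assumption. The frame number stays a 32-bit value, and only a driver
that wants a 64-bit address does the widening shift.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

struct page;

struct zone {
    struct page *zone_mem_map;    /* first struct page of this zone */
    uint32_t     zone_start_pfn;  /* hypothetical: frame number of that page */
};

struct page {
    struct zone *zone;
};

/* Page frame number: fits in 32 bits even when the address does not. */
static inline uint32_t page_to_pfn(const struct page *page)
{
    return (uint32_t)(page - page->zone->zone_mem_map)
           + page->zone->zone_start_pfn;
}

/* What a DAC/PCI64 driver might do with it: widen first, then shift. */
static inline uint64_t pfn_to_bus64(uint32_t pfn)
{
    return (uint64_t)pfn << PAGE_SHIFT;
}
```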
* Re: pte_pagenr/MAP_NR deleted in pre6
From: Kanoj Sarcar @ 2000-08-17 19:07 UTC
To: Stephen C. Tweedie
Cc: Roman Zippel, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

> Hi,
>
> On Wed, Aug 16, 2000 at 03:22:07PM -0700, Kanoj Sarcar wrote:
>
> > > It's part of what is necessary if we want to push kiobufs into the
> > > driver layers.  page_to_pfn is needed for PAE36 support so that
> > > PCI64 or dual-address-cycle drivers can handle physical addresses
> > > longer than 32 bits.
> >
> > While we are on this topic, something like
> >
> > #define page_to_phys(page) \
> > 	((((page)-(page)->zone->zone_mem_map) << PAGE_SHIFT) \
> > 	+ ((page)->zone->zone_start_paddr))
> >
> > should work on all platforms on 2.4. (You might have to add in an
> > unsigned long long somewhere in there for PAE36).
>
> The long long is exactly what we need to avoid: PAE36 still has
> pointers as 32-bit values.  Only ptes get the 64-bit treatment.
>
> Adding a BUG() test to detect illegal accesses to >4GB pages on PAE36
> would be fine.  If we have the appropriate bounce buffer support in
> place in pci_dma or wherever suits it, then by the time a driver is
> doing page_to_phys() it should already have created the appropriate
> bounce buffers and so the BUG() test is fine.
>
> For DAC/PCI64 drivers, though, we need a separate macro like
> page_to_pfn so that we can identify the physical address via a 32-bit

Or, use a 64 bit value to represent physical addresses. Which is why
I am proposing paddr_t. In that case,

#define page_to_phys(page) ((((unsigned long long)((page)-(page)->zone->zone_mem_map)) \
	<< PAGE_SHIFT) + ((page)->zone->zone_start_paddr))

This would "work" (on i386) despite the fact that zone->zone_start_paddr
is "unsigned long" not "unsigned long long". Things would be much easier
with paddr_t.

#define page_to_phys(page) ((((paddr_t)((page)-(page)->zone->zone_mem_map)) \
	<< PAGE_SHIFT) + ((page)->zone->zone_start_paddr))

and we would change zone->zone_start_paddr to be paddr_t too.

> value.  The driver can then shift that into a 64-bit long long if it
> wants to --- there's no need to introduce new 64-bit macros into the
> mm just for this special case.

Whatever you do, you either have to introduce paddr_t (which to me
seems more intuitive) or page_to_pfn. We can argue one way or
another, but paddr_t might give you type checking for free too ...

Kanoj

>
> Cheers,
>  Stephen
>

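[Editor's note] The placement of the cast in the macros above matters: the page index has to be widened before the shift, otherwise the high bits are already gone by the time the result is converted. The sketch below models a 32-bit "unsigned long" with uint32_t and assumes paddr_t is a 64-bit unsigned type, which is how the proposal reads but is not spelled out in the thread. Both helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                 /* assumption: 4 KB pages */

typedef uint64_t paddr_t;             /* assumption: 64-bit physical address */

/* Shift performed in 32 bits: wraps for pages at or above 4 GB. */
static paddr_t phys_shift_then_widen(uint32_t page_index, uint32_t zone_start)
{
    return (paddr_t)((page_index << PAGE_SHIFT) + zone_start);
}

/* Widen first, as both macros in the message do: no bits are lost. */
static paddr_t phys_widen_then_shift(uint32_t page_index, uint32_t zone_start)
{
    return ((paddr_t)page_index << PAGE_SHIFT) + zone_start;
}
```

For a page index of 0x100000 (the frame at exactly 4 GB from the zone start) the first helper wraps to the zone start itself, while the second yields the full 33-bit address.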
* Re: pte_pagenr/MAP_NR deleted in pre6
From: David S. Miller @ 2000-08-17 19:01 UTC
To: kanoj
Cc: sct

   Whatever you do, you either have to introduce paddr_t (which to me
   seems more intuitive) or page_to_pfn. We can argue one way or
   another, but paddr_t might give you type checking for free too ...

My only two gripes about paddr_t: the first is that long long is not
only expensive but has also been known to be buggy on 32-bit platforms.

The next gripe is that it will make many clueless driver
etc. developers (who don't even read documentation, but write a large
portion of the vendor Linux drivers :-) try to do things
like "void *p = (void *) (PAGE_OFFSET + x->paddr);" and expect
this to work, or maybe they'll even pass it to virt_to_bus or similar.

If people don't think these two things will be an issue, fine with
me. :-)

Which reminds me, we need to schedule a field day early 2.5.x where
virt_to_bus and bus_to_virt are exterminated, this is the only way we
can move to drivers using page+offset correctly, forcing them through
interfaces such as the pci_dma API instead.

Later,
David S. Miller
davem@redhat.com

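[Editor's note] The second gripe can be illustrated with plain integers. On a 32-bit kernel only low physical memory has a direct-mapped virtual address, so adding PAGE_OFFSET to an arbitrary 64-bit physical address and casting to a pointer silently truncates. The sketch assumes the default i386 layout (PAGE_OFFSET of 0xC0000000 and 896 MB of directly mapped memory); both helper names are hypothetical and addresses are modelled as integers rather than real pointers.

```c
#include <stdint.h>

#define PAGE_OFFSET  0xC0000000u      /* assumption: i386 3G/1G split */
#define LOWMEM_BYTES (896u << 20)     /* assumption: 896 MB direct map */

/* The naive conversion from the message: the upper bits just vanish. */
static uint32_t phys_to_virt_naive(uint64_t paddr)
{
    return (uint32_t)(PAGE_OFFSET + paddr);
}

/* A guarded variant: returns 0 when no direct mapping exists. */
static uint32_t phys_to_virt_checked(uint64_t paddr)
{
    if (paddr >= LOWMEM_BYTES)
        return 0;
    return PAGE_OFFSET + (uint32_t)paddr;
}
```

With the naive version, physical address 0x100001000 (just above 4 GB) and physical address 0x1000 produce the same "virtual address", which is exactly the kind of bug that stays hidden on a small-memory test box.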
* Re: pte_pagenr/MAP_NR deleted in pre6
From: Alan Cox @ 2000-08-17 19:19 UTC
To: David S. Miller
Cc: kanoj, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

>    Whatever you do, you either have to introduce paddr_t (which to me
>    seems more intuitive) or page_to_pfn. We can argue one way or
>    another, but paddr_t might give you type checking for free too ...
>
> My only two gripes about paddr_t is that long long is not only
> expensive but has been also known to be buggy on 32-bit platforms.

Except for the x86 36bit abortion do we need a long long paddr_t on any
32bit platform ?

> Which reminds me, we need to schedule a field day early 2.5.x where
> virt_to_bus and bus_to_virt are exterminated, this is the only way we
> can move to drivers using page+offset correctly, forcing them through
> interface such as the pci_dma API instead.

So you'll be adding an isa_alloc_consistant, mca_alloc_consistent,
m68k_motherboard_alloc_consistent , ....

And then of course I need virt_to_bus/bus_to_virt to poke at things like
hardware on a PC and to access the ROMs.

It's not trivial to exterminate. It really isn't. The PCI API is a tiny
subset of uses for those functions.

* Re: pte_pagenr/MAP_NR deleted in pre6
From: David S. Miller @ 2000-08-17 19:20 UTC
To: alan
Cc: kanoj, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

   > My only two gripes about paddr_t is that long long is not only
   > expensive but has been also known to be buggy on 32-bit platforms.

   Except for the x86 36bit abortion do we need a long long paddr_t on any
   32bit platform ?

Sparc32, mips32...

   > Which reminds me, we need to schedule a field day early 2.5.x where
   > virt_to_bus and bus_to_virt are exterminated, this is the only way we
   > can move to drivers using page+offset correctly, forcing them through
   > interface such as the pci_dma API instead.

   So you'll be adding an isa_alloc_consistant, mca_alloc_consistent,
   m68k_motherboard_alloc_consistent , ....

I'll probably be adding isa_virt_to_bus, because when it is in fact
"ISA like" the driver already knows that it must be certain that the
physical address is below the 16MB mark, right?  Then the cases left on
x86 are MCA (which can use the ISA interface) and PCI drivers which
must be updated to use the PCI dma API.

Just like I did for SBUS, the m68k folks can deal with their issues
any way they like.

   Its not trivial to exterminate.

I think it is.  What's not trivial is getting bozos to clean up their
drivers.  For example, BTTV still doesn't use the PCI dma stuff simply
because nobody wishes to use their brains a little bit and encapsulate
the user DMA stuff into a common spot (it's duplicated in 4 or 5
drivers) which uses scatter gather lists with the DMA API.

Later,
David S. Miller
davem@redhat.com

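[Editor's note] The "below the 16MB mark" condition comes from ISA DMA using 24-bit addresses. A hedged sketch of the range check an ISA-style helper would rely on is given below; isa_dma_reachable() is a hypothetical name, not an existing kernel function, and the thread does not say how isa_virt_to_bus would be implemented.

```c
#include <stddef.h>
#include <stdint.h>

#define ISA_DMA_LIMIT (16u * 1024u * 1024u)   /* 24-bit ISA addressing */

/*
 * Returns 1 when the whole buffer [phys, phys + len) lies below 16 MB,
 * 0 otherwise. Written so that phys + len is never computed directly,
 * which avoids wraparound for large values.
 */
static int isa_dma_reachable(uint32_t phys, size_t len)
{
    return len != 0 && phys < ISA_DMA_LIMIT && len <= ISA_DMA_LIMIT - phys;
}
```

A buffer that starts below the limit but crosses it is rejected as well, since the device cannot address its tail.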
* Re: pte_pagenr/MAP_NR deleted in pre6
From: Alan Cox @ 2000-08-17 19:33 UTC
To: David S. Miller
Cc: alan, kanoj, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

> I'll probably be adding isa_virt_to_bus, because when it is in fact
> "ISA like" the driver already knows that it must be certain that the

isa_alloc_consistent makes sense actually. It's needed for ISA bus masters
on ancient MIPS and other crap.

> physical address is below the 16MB mark right?  Then the cases left on

16MB for ISA - except on a few late 486 era boxes with magic extensions
(which we'd finally be able to use).

> x86 are MCA (which can use the ISA interface) and PCI drivers which

No, the MCA bus is 32bit - it's closer to PCI than ISA. mca_alloc_consistent
is doable and if some loon ever does do old IBM Power boxes it will be needed
as they apparently aren't cache coherent MCA.

> drivers.  For example, BTTV still doesn't use the PCI dma stuff simply
> because nobody wishes to use their brains a little bit and encapsulate
> the user DMA stuff into a common spot (it's duplicated in 4 or 5
> drivers) which uses scatter gather lists with the DMA api.

BTTV doesn't use it because the current stuff works and for post 2.4 using
mmap_kiovec() and similar stuff probably will be a better solution - that
will also help us to push PCI bug awareness into pci, not drivers.

Also mmap_kiovec will let i810 and some other sound cards do scatter
gather buffers sensibly.

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Kanoj Sarcar @ 2000-08-17 19:36 UTC
To: David S. Miller
Cc: alan, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

>    Date: Thu, 17 Aug 2000 20:19:59 +0100 (BST)
>    From: Alan Cox <alan@lxorguk.ukuu.org.uk>
>
>    > My only two gripes about paddr_t is that long long is not only
>    > expensive but has been also known to be buggy on 32-bit platforms.
>
>    Except for the x86 36bit abortion do we need a long long paddr_t on any
>    32bit platform ?
>
> Sparc32, mips32...
>

Not for Indys on mips32. Is there a mips32 port on another machine
(currently in Linux, or port ongoing) that requires this?

Kanoj

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Ralf Baechle @ 2000-09-07 14:31 UTC
To: Kanoj Sarcar
Cc: David S. Miller, alan, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

On Thu, Aug 17, 2000 at 12:36:51PM -0700, Kanoj Sarcar wrote:

> >    Except for the x86 36bit abortion do we need a long long paddr_t on any
> >    32bit platform ?
> >
> > Sparc32, mips32...
> >
>
> Not for Indys on mips32. Is there a mips32 port on another machine
> (currently in Linux, or port ongoing) that requires this?

No.  Right now mips32 assumes that all memory is accessible in KSEG0, which
limits it to 512MB - epsilon.  I don't know of any 32-bit CPU configuration
which supports more memory than that, and for 64-bit processors the policy
should be to use mips64 - it's so much saner.

  Ralf

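[Editor's note] The 512 MB figure follows from the MIPS32 address map: KSEG0 is the unmapped, cached window at virtual 0x80000000 that covers the first 0x20000000 bytes of physical memory. The sketch below only restates that arithmetic; phys_in_kseg0() and phys_to_kseg0() are hypothetical helper names and do not come from the MIPS port.

```c
#include <stdint.h>

#define KSEG0_BASE 0x80000000u        /* start of the unmapped cached window */
#define KSEG0_SIZE 0x20000000u        /* 512 MB of physical memory */

/* 1 if the physical address is reachable through KSEG0. */
static int phys_in_kseg0(uint32_t phys)
{
    return phys < KSEG0_SIZE;
}

/* Virtual address for a KSEG0-reachable physical address. */
static uint32_t phys_to_kseg0(uint32_t phys)
{
    return KSEG0_BASE + phys;
}
```

Under the assumption that all RAM is reached this way, every physical address fits in 29 bits, so a 64-bit paddr_t buys nothing on such a configuration.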
* Re: pte_pagenr/MAP_NR deleted in pre6
From: Kanoj Sarcar @ 2000-08-17 19:50 UTC
To: David S. Miller
Cc: alan, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

>    So you'll be adding an isa_alloc_consistant, mca_alloc_consistent,
>    m68k_motherboard_alloc_consistent , ....
>
> I'll probably be adding isa_virt_to_bus, because when it is in fact
> "ISA like" the driver already knows that it must be certain that the
> physical address is below the 16MB mark right?  Then the cases left on
> x86 are MCA (which can use the ISA interface) and PCI drivers which
> must be updated to use the PCI dma API.
>

Just a minor nit.

So, unlike system vendors adding in dma mapping registers for PCI32
devices to dma anywhere into their >32 bit physical address space, you
are assuming no vendor will ever have a mapping scheme for ISA devices
that let them get over the 16MB mark?

Of course, I am not aware of ISA that much anyway (and I hope I don't
have to!), so please ignore this if it doesn't make sense.

Kanoj

* Re: pte_pagenr/MAP_NR deleted in pre6
From: David S. Miller @ 2000-08-17 19:41 UTC
To: kanoj
Cc: alan, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

   So, unlike system vendors adding in dma mapping registers for PCI32
   devices to dma anywhere into their >32 bit physical address space, you
   are assuming no vendor will ever have a mapping scheme for ISA devices
   that let them get over the 16MB mark?

ISA is a dead hardware technology and therefore how it works is pretty
much fixed in stone.

Perhaps some older MIPS machines supporting ISA could benefit from
an API similar to the PCI dma stuff, as Alan mentioned.  But that is
the only case which has any merit in my mind.

Later,
David S. Miller
davem@redhat.com

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Ralf Baechle @ 2000-09-07 14:26 UTC
To: David S. Miller
Cc: kanoj, alan, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

On Thu, Aug 17, 2000 at 12:41:52PM -0700, David S. Miller wrote:

> ISA is a dead hardware technology and therefore how it works is pretty
> much fixed in stone.
>
> Perhaps some older MIPS machines supporting ISA could benefit from
> an API similar to the PCI dma stuff, as Alan mentioned.  But that is
> the only case which has any merit in my mind.

ISA isn't really a consideration for MIPS.  All that ISA hardware could
be supported by treating it the same as on an x86 system.  That's not
the most efficient approach, but it is justified given the importance of
ISA for MIPS boxes - nearly NIL.

  Ralf

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Alan Cox @ 2000-08-17 19:56 UTC
To: Kanoj Sarcar
Cc: David S. Miller, alan, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

> So, unlike system vendors adding in dma mapping registers for PCI32
> devices to dma anywhere into their >32 bit physical address space, you
> are assuming no vendor will ever have a mapping scheme for ISA devices
> that let them get over the 16MB mark?

They did. Even on a few x86 boards. Supporting those bits of weirdness
is not important. If the ISA 16MB window is offset someone can wrap it
in their arch specific isa_alloc_consistent code.

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Alan Cox @ 2000-08-17 19:24 UTC
To: Alan Cox
Cc: David S. Miller, kanoj, sct, roman, linux-mm, linux-kernel, rmk, nico, davidm

> > can move to drivers using page+offset correctly, forcing them through
> > interface such as the pci_dma API instead.
>
> So you'll be adding an isa_alloc_consistant, mca_alloc_consistent,
> m68k_motherboard_alloc_consistent , ....
>
> And then of course I need virt_to_bus/bus_to_virt to poke at things like
> hardware on a PC and to access the roms.

Umm wait - for those it's hidden inside the ioremap, so private... so it's
just the mca/zorro/whateverbus ones.

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Kanoj Sarcar @ 2000-08-17 19:32 UTC
To: David S. Miller
Cc: sct, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

>    Whatever you do, you either have to introduce paddr_t (which to me
>    seems more intuitive) or page_to_pfn. We can argue one way or
>    another, but paddr_t might give you type checking for free too ...
>
> My only two gripes about paddr_t is that long long is not only
> expensive but has been also known to be buggy on 32-bit platforms.

Yeah, yeah, I know ... (I didn't know about the buggy bit though).
OTOH, paddr_t is such an intuitive concept (and without any
disadvantages on any platform other than i386-PAE), it's unfortunate
if it gets shot down just because of this ...

>
> The next gripe is that it will make many clueless driver
> etc. developers (who don't read documentation even, but write a large
> portion of the vendor Linux drivers :-) will try to do things
> like "void *p = (void *) (PAGE_OFFSET + x->paddr);" and expect
> this to work, or maybe they'll even pass it to virt_to_bus or similar.

Wait! You are saying you have a scheme that will prevent writers
from writing buggy code that happens to work only on 32MB i386 ...
Go ahead, I am all ears :-)

Basically, with all the pci-dma and sct's alternate page_to_pfn
suggestion interfaces, you still cannot claim that people will
not do

void *p = (void *) (PAGE_OFFSET + page_to_pfn(p) << PAGE_SHIFT)

and check that code in because it works on their 1GB i386 box. No?

Kanoj

> If people don't think these two things will be an issue, fine with
> me. :-)
>
> Which reminds me, we need to schedule a field day early 2.5.x where
> virt_to_bus and bus_to_virt are exterminated, this is the only way we
> can move to drivers using page+offset correctly, forcing them through
> interface such as the pci_dma API instead.
>
> Later,
> David S. Miller
> davem@redhat.com
>

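[Editor's note] The expression in the message above is offered as an example of code that should not be written, and it has a second problem beyond the one being discussed: in C, binary + binds more tightly than <<, so as typed it shifts the sum rather than adding the shifted frame number. The sketch below spells out both groupings with explicit parentheses. It models addresses as 32-bit integers and assumes the default i386 PAGE_OFFSET of 0xC0000000; the function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT  12                /* assumption: 4 KB pages */
#define PAGE_OFFSET 0xC0000000u       /* assumption: i386 3G/1G split */

/* How C parses "PAGE_OFFSET + pfn << PAGE_SHIFT": the sum is shifted. */
static uint32_t direct_map_as_parsed(uint32_t pfn)
{
    return (PAGE_OFFSET + pfn) << PAGE_SHIFT;
}

/* What the expression was presumably meant to compute. */
static uint32_t direct_map_as_intended(uint32_t pfn)
{
    return PAGE_OFFSET + (pfn << PAGE_SHIFT);
}
```

Even the intended form is only meaningful for frames inside the direct-mapped region, which is the point being argued: no choice between paddr_t and page_to_pfn stops a driver author from writing it.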
* Re: pte_pagenr/MAP_NR deleted in pre6
From: David S. Miller @ 2000-08-17 19:30 UTC
To: kanoj
Cc: sct, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

BTW, I've sed s/vger.rutgers.edu/vger.redhat.com/

   Wait! You are saying you have a scheme that will prevent writers
   from writing buggy code that happens to work only on 32Mb i386 ...
   Go ahead, I am all ears :-)

I understand your point, but please understand mine.

One might laugh, but after I read and really considered some of the
points made by the author of "Writing Solid Code" in that book, I
realized that one of my jobs as someone creating an API is that I
should be trying as hard as possible to design it such that it is next
to impossible to misuse it.

Secondly, I learned that I shouldn't be adding APIs spuriously
because they will end up being maintained forever, re: the
kern_addr_looks_ok silliness :-)

So anyways, I was probably being overly anal for this particular case.

Later,
David S. Miller
davem@redhat.com

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Kanoj Sarcar @ 2000-08-17 20:00 UTC
To: David S. Miller
Cc: sct, roman, linux-mm, linux-kernel, rmk, nico, davidm, alan

>    From: Kanoj Sarcar <kanoj@google.engr.sgi.com>
>    Date: Thu, 17 Aug 2000 12:32:35 -0700 (PDT)
>
> BTW, I've sed s/vger.rutgers.edu/vger.redhat.com/
>
>    Wait! You are saying you have a scheme that will prevent writers
>    from writing buggy code that happens to work only on 32Mb i386 ...
>    Go ahead, I am all ears :-)
>
> I understand your point, but please understand mine.
>
> One might laugh, but after I read and really considered some of the
> points made by the author of "Writing Solid Code" in that book, I
> realized that one of my jobs as someone creating an API is that I
> should be trying as hard as possible to design it such that it is next
> to impossible to misuse it.

Unfortunately, where there's a will to misuse, there usually is a
way :-( And that's doubly hard to accept after all the hard work that
gets into creating the API ...

Kanoj

* Re: pte_pagenr/MAP_NR deleted in pre6
From: Stephen C. Tweedie @ 2000-08-16 18:17 UTC
To: Roman Zippel
Cc: Stephen C. Tweedie, Kanoj Sarcar, linux-mm, linux-kernel, rmk, nico, davem, davidm, alan

Hi,

On Wed, Aug 16, 2000 at 10:25:08AM +0200, Roman Zippel wrote:
>
> > Excellent, this will make it _tons_ easier for me to create new zones
> > of mem_map arrays on the fly to allow us to create struct pages for
> > PCI IO-aperture memory (necessary for kiobuf mappings of IO memory).
>
> A related question: do you already have an idea how the driver interface
> for that could look like? I mean, some drivers need a virtual address,
> some need the physical address for dma and some of them might need
> bounce buffers.

It's even more complicated than that --- you can't even assume that
the pages concerned have got valid pointers in _any_ address space,
because they might be high memory pages on PAE36 which exist above the
4GB boundary and which aren't mapped into virtual memory anywhere.

We will need to make sure that there is a clean way to convert any
struct page * into

(a) a kernel virtual address (that's easy, kmap() does it already);

(b) a physical address (which can be translated easily into a bus
    address); or

(c) a page frame number which can identify pages above 4GB even though
    a ulong pointer/address can't cope with such pages as addresses
    directly.

However, the kiobuf code will not do anything fancy with _any_ of this
--- it will continue just to carry struct page *s.  It will be up to
the users of the kiobufs to do anything further with them.

I already have bounce buffer support for kiobufs in 2.2 (as a quick
hack to let highmem raw IO work on 2.2; 2.4 is much cleaner and
doesn't need that particular hack).  I'll make sure that 2.4 has a
clean way of doing bounce buffers too, probably by means of a
clone_kiobuf() function which creates a new kiobuf by cloning the
pages of the original if they satisfy some constraint (such as <1GB,
<4GB), and pre/post-copying them if they do not.

Cheers,
 Stephen

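[Editor's note] The constraint test behind the clone_kiobuf() idea can be sketched in isolation. The message gives no interface for it, so everything below is an assumption: pages are represented by 32-bit frame numbers, the limit is a byte address such as 1 GB or 4 GB, and mark_bounce_pages() is a hypothetical helper that only decides which pages would need a bounce copy. It does not allocate or copy anything.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12                 /* assumption: 4 KB pages */

/*
 * Flags every page whose last byte lies at or above limit_bytes and
 * returns how many such pages there are. Pages below the limit could
 * be shared with the clone; flagged ones would be pre/post-copied.
 */
static size_t mark_bounce_pages(const uint32_t *pfns, size_t n,
                                uint64_t limit_bytes,
                                unsigned char *needs_bounce)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        /* End address of the page, computed in 64 bits. */
        uint64_t end = ((uint64_t)pfns[i] + 1) << PAGE_SHIFT;

        needs_bounce[i] = end > limit_bytes;
        count += needs_bounce[i];
    }
    return count;
}
```

Working from frame numbers rather than byte addresses keeps the input 32-bit even for pages above 4 GB, in line with option (c) in the message.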
Thread overview: 36+ messages (newest: 2000-09-07 14:31 UTC)
2000-08-10 17:18 pte_pagenr/MAP_NR deleted in pre6 Kanoj Sarcar
2000-08-11  2:24 ` David S. Miller
2000-08-14  0:34 ` Anton Blanchard
2000-08-11 11:50 ` Roman Zippel
2000-08-11 13:20 ` Russell King
2000-08-11 14:56 ` Roman Zippel
2000-08-12  9:18 ` Bjorn Wesen
2000-08-11 17:21 ` Kanoj Sarcar
2000-08-14  9:29 ` Roman Zippel
2000-08-15 16:19 ` Stephen C. Tweedie
2000-08-16  8:25 ` Roman Zippel
2000-08-16 17:13 ` Kanoj Sarcar
2000-08-16 18:20 ` Stephen C. Tweedie
2000-08-16 18:24 ` David S. Miller
2000-08-16 19:53 ` Stephen C. Tweedie
2000-08-16 18:47 ` Kanoj Sarcar
2000-08-16 18:39 ` David S. Miller
2000-08-16 19:30 ` Stephen C. Tweedie
2000-08-16 22:22 ` Kanoj Sarcar
2000-08-17  9:11 ` Stephen C. Tweedie
2000-08-17 19:07 ` Kanoj Sarcar
2000-08-17 19:01 ` David S. Miller
2000-08-17 19:19 ` Alan Cox
2000-08-17 19:20 ` David S. Miller
2000-08-17 19:33 ` Alan Cox
2000-08-17 19:36 ` Kanoj Sarcar
2000-09-07 14:31 ` Ralf Baechle
2000-08-17 19:50 ` Kanoj Sarcar
2000-08-17 19:41 ` David S. Miller
2000-09-07 14:26 ` Ralf Baechle
2000-08-17 19:56 ` Alan Cox
2000-08-17 19:24 ` Alan Cox
2000-08-17 19:32 ` Kanoj Sarcar
2000-08-17 19:30 ` David S. Miller
2000-08-17 20:00 ` Kanoj Sarcar
2000-08-16 18:17 ` Stephen C. Tweedie