* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch @ 2001-03-05 22:08 Manfred Spraul 2001-03-05 22:52 ` David Brownell 0 siblings, 1 reply; 24+ messages in thread From: Manfred Spraul @ 2001-03-05 22:08 UTC (permalink / raw) To: zaitcev; +Cc: linux-usb-devel, linux-kernel, david-b > And mm/slab.c changes semantics when CONFIG_SLAB_DEBUG > is set: it ignores SLAB_HWCACHE_ALIGN. That seems more like > the root cause of the problem to me! > HWCACHE_ALIGN does not guarantee a certain byte alignment. And additionally it's not even guaranteed that kmalloc() uses that HWCACHE_ALIGN. Uhci is broken, not my slab code ;-) > I think that the pci_alloc_consistent patch that Johannes sent >by for "uhci.c" would be a better approach. Though I'd like >to see that be more general ... say, making mm/slab.c know >about such things. Add a simple abstraction, and that should >be it -- right? :-) I looked at it, and there are 2 problems that make it virtually impossible to integrate kmem_cache_alloc() with pci memory alloc without a major redesign: * pci_alloc_consistent returns 2 values, kmem_cache_alloc() only one. This one would be possible to work around. * the slab allocator heavily relies on the 'struct page' structure, but it's not guaranteed that it exists for pci_alloced memory. I'd switch to pci_alloc_consistent with some optimizations to avoid wasting a complete page for each DMA header. (I haven't seen Johannes patch, but we discussed the problem 6 weeks ago and that proposal was the end of the thread) -- Manfred ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch 2001-03-05 22:08 SLAB vs. pci_alloc_xxx in usb-uhci patch Manfred Spraul @ 2001-03-05 22:52 ` David Brownell 2001-03-05 23:20 ` Russell King 0 siblings, 1 reply; 24+ messages in thread From: David Brownell @ 2001-03-05 22:52 UTC (permalink / raw) To: Manfred Spraul, zaitcev; +Cc: linux-usb-devel, linux-kernel > > And mm/slab.c changes semantics when CONFIG_SLAB_DEBUG > > is set: it ignores SLAB_HWCACHE_ALIGN. That seems more like > > the root cause of the problem to me! > > HWCACHE_ALIGN does not guarantee a certain byte alignment. And > additionally it's not even guaranteed that kmalloc() uses that > HWCACHE_ALIGN. > Uhci is broken, not my slab code ;-) If so, the slab code docs/specs are broken too ... I just checked in my development tree (2.4.2-ac7 plus goodies) and what's written is that HWCACHE_ALIGN is to "Align the objects in this cache to a hardware cacheline." Which contradicts what you just wrote; it's supposed to always align, not do so only when convenient. Clearly, something in mm/slab.c needs to change even if it's just changing the spec for kmem_cache_create() behavior. > I'd switch to pci_alloc_consistent with some optimizations to avoid > wasting a complete page for each DMA header. (I haven't seen Johannes > patch, but we discussed the problem 6 weeks ago and that proposal was > the end of the thread) I didn't see that thread. I agree, pci_alloc_consistent() already has a signature that's up to the job. The change you suggest would need to affect every architecture's implementation of that code ... which somehow seems not the best solution at this time. Perhaps we should have different functions for pci consistent alloc at the "hardware" level (what we have now) and at the "slab" level. That's sort of what Johannes' patch did, except it was specific to that one driver rather than being a generic utility. Of course, I'd rather that such a generic package have all the debug support (except broken redzoning :-) that the current slab code does. - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch 2001-03-05 22:52 ` David Brownell @ 2001-03-05 23:20 ` Russell King 2001-03-06 2:09 ` Alan Cox 2001-03-06 4:53 ` David S. Miller 0 siblings, 2 replies; 24+ messages in thread From: Russell King @ 2001-03-05 23:20 UTC (permalink / raw) To: David Brownell; +Cc: Manfred Spraul, zaitcev, linux-usb-devel, linux-kernel On Mon, Mar 05, 2001 at 02:52:24PM -0800, David Brownell wrote: > I didn't see that thread. I agree, pci_alloc_consistent() already has > a signature that's up to the job. The change you suggest would need > to affect every architecture's implementation of that code ... which > somehow seems not the best solution at this time. Needless to say that USB is currently broken for the architectures that need pci_alloc_consistent. A while ago, I looked at what was required to convert the OHCI driver to pci_alloc_consistent, and it turns out that the current interface is highly sub-optimal. It looks good on the face of it, but it _really_ does need sub-page allocations to make sense for USB. At the time, I didn't feel like creating a custom sub-allocator just for USB, and since then I haven't had the inclination nor motivation to go back to trying to get my USB mouse or iPAQ communicating via USB. (I've not used this USB port for 3 years anyway). I'd be good to get it done "properly" at some point though. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch 2001-03-05 23:20 ` Russell King @ 2001-03-06 2:09 ` Alan Cox 2001-03-06 2:29 ` [linux-usb-devel] " David Brownell 2001-03-06 4:53 ` David S. Miller 1 sibling, 1 reply; 24+ messages in thread From: Alan Cox @ 2001-03-06 2:09 UTC (permalink / raw) To: Russell King Cc: David Brownell, Manfred Spraul, zaitcev, linux-usb-devel, linux-kernel > At the time, I didn't feel like creating a custom sub-allocator just > for USB, and since then I haven't had the inclination nor motivation > to go back to trying to get my USB mouse or iPAQ communicating via USB. > (I've not used this USB port for 3 years anyway). > > I'd be good to get it done "properly" at some point though. Something like struct pci_pool *pci_alloc_consistent_pool(int objectsize, int align) pci_alloc_pool_consistent(pool,.. pci_free_pool_consistent(pool,.. ?? Where the pool allocator does page grabbing and chaining > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch 2001-03-06 2:09 ` Alan Cox @ 2001-03-06 2:29 ` David Brownell 0 siblings, 0 replies; 24+ messages in thread From: David Brownell @ 2001-03-06 2:29 UTC (permalink / raw) To: linux-usb-devel, Russell King Cc: Manfred Spraul, zaitcev, linux-usb-devel, linux-kernel > > At the time, I didn't feel like creating a custom sub-allocator just > > for USB, ... > > > > I'd be good to get it done "properly" at some point though. > > Something like > > struct pci_pool *pci_alloc_consistent_pool(int objectsize, int align) struct pci_pool * pci_create_consistent_pool (struct pci_dev *dev, int size, int align) and similar for freeing the pool ... pci_alloc_consistent() needs the device, presumably since some devices may need to dma into specific memory. I'd probably want at least "flags" from kmem_cache_create(). > pci_alloc_pool_consistent(pool,.. > pci_free_pool_consistent(pool,.. These should have signatures just like pci_alloc_consistent() and pci_free_consistent() except they take the pci_pool, not a pci_dev. Oh, and likely GFP_ flags to control blocking. > Where the pool allocator does page grabbing and chaining Given an agreement on API, I suspect Johannes' patch could get quickly generalized. Then debugging support (like in slab.c) could be added later. - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch 2001-03-05 23:20 ` Russell King 2001-03-06 2:09 ` Alan Cox @ 2001-03-06 4:53 ` David S. Miller 2001-03-09 16:18 ` SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] David Brownell 1 sibling, 1 reply; 24+ messages in thread From: David S. Miller @ 2001-03-06 4:53 UTC (permalink / raw) To: Russell King Cc: David Brownell, Manfred Spraul, zaitcev, linux-usb-devel, linux-kernel Russell King writes: > A while ago, I looked at what was required to convert the OHCI driver > to pci_alloc_consistent, and it turns out that the current interface is > highly sub-optimal. It looks good on the face of it, but it _really_ > does need sub-page allocations to make sense for USB. > > At the time, I didn't feel like creating a custom sub-allocator just > for USB, and since then I haven't had the inclination nor motivation > to go back to trying to get my USB mouse or iPAQ communicating via USB. > (I've not used this USB port for 3 years anyway). Gerard Roudier wrote for the sym53c8xx driver the exact thing UHCI/OHCI need for this. I think people are pissing their pants over the pci_alloc_consistent interface for no reason. It gives PAGE<<order sized/aligned chunks back to the caller at the request of Linus so that drivers did not have to guess "is this 16-byte aligned..." etc. Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-06 4:53 ` David S. Miller @ 2001-03-09 16:18 ` David Brownell [not found] ` <3AA91B2C.BEB85D8C@colorfullife.com> 0 siblings, 1 reply; 24+ messages in thread From: David Brownell @ 2001-03-09 16:18 UTC (permalink / raw) To: David S. Miller, Russell King Cc: Manfred Spraul, zaitcev, linux-usb-devel, linux-kernel David S. Miller writes: > Russell King writes: > > A while ago, I looked at what was required to convert the OHCI driver > > to pci_alloc_consistent, and it turns out that the current interface is > > highly sub-optimal. It looks good on the face of it, but it _really_ > > does need sub-page allocations to make sense for USB. > > > > At the time, I didn't feel like creating a custom sub-allocator just > > for USB, ... > > Gerard Roudier wrote for the sym53c8xx driver the exact thing > UHCI/OHCI need for this. I looked at that, and it didn't seem it'd drop in very easily. I'd be interested in feedback on the API below. It can handle the USB host controllers (EHCI/OHCI/UHCI) on currently unsupported hardware (the second bug I noted). Questions: is it general enough for other drivers to use? Should I package it as a patch for 2.4? There's also a simple implementation, which works (to limited testing) but would IMO better be replaced by something using the innards of the slab allocator (mm/slab.c). That'd need new APIs inside that allocator ... those changes seem like 2.5 fixes, unlike the slab allocator bug(s) I pointed out. (And which Manfred seems to have gone silent on.) - Dave /* LOGICALLY: <linux/pci.h> */ /* * This works like a pci-dma-consistent kmem_cache_t would, and handles * alignment guarantees for its (small) memory blocks. Many PCI device * drivers need such functionality. */ struct pci_pool; /* Create a pool - flags are SLAB_* */ extern struct pci_pool * pci_pool_create (const char *name, struct pci_dev *pdev, int size, int align, int flags); /* Get memory from a pool - flags are GFP_KERNEL or GFP_ATOMIC */ extern void * pci_pool_alloc (struct pci_pool *pool, int flags, dma_addr_t *handle); /* Return memory too a pool */ extern void pci_pool_free (struct pci_pool *pool, void *vaddr, dma_addr_t handle); /* Destroy a pool */ extern void pci_pool_destroy (struct pci_pool *pool); /* Convert a DMA mapping to its cpu address (as returned by pci_pool_alloc). * Don't assume this is cheap, although on some platforms it may be simple * macros adding a constant to the DMA handle. */ extern void * pci_pool_dma_to_cpu (struct pci_pool *pool, dma_addr_t handle); /* LOGICALLY: drivers/linux/pci.c */ /* There should be some memory pool allocator that this code can use ... it'd be a layer (just like kmem_cache_alloc and kmalloc) with an API that exposes PCI devices, dma address mappings, and hardware alignment requirements. Until that happens, this quick'n'dirty implementation of the API should work, and should be fast enough for small pools that use at most a few pages (and small allocation units). */ struct pci_pool { char name [32]; struct pci_dev *dev; struct list_head pages; int size; int align; int flags; int blocks_offset; int blocks_per_page; }; /* pci consistent pages allocated in units of LOGICAL_PAGE_SIZE, layout: * - pci_page (always in the 'slab') * - bitmap (with blocks_per_page bits) * - blocks (starting at blocks_offset) * * this can easily be optimized, but the best fix would be to * make this just a bus-specific front end to mm/slab.c logic. */ struct pci_page { struct list_head pages; dma_addr_t dma; unsigned long bitmap [0]; }; #define LOGICAL_PAGE_SIZE PAGE_SIZE /* irq-safe; protects any pool's pages and bitmaps */ static spinlock_t pci_page_lock = SPIN_LOCK_UNLOCKED; #define POISON_BYTE 0xa5 /** * pci_pool_create - Creates a pool of pci consistent memory blocks, for dma. * @name: name of pool, for diagnostics * @pdev: pci device that will be doing the DMA * @size: size of the blocks in this pool. * @align: alignment requirement for blocks; must be a power of two * @flags: SLAB_* (or GFP_*) flags (not all are supported). * * Returns a pci allocation pool with the requested characteristics, or * null if one can't be created. * * Memory from the pool will always be aligned to at least L1_CACHE_BYTES, * or more strictly if requested. The actual size be larger than requested. * Such memory will all have "consistent" DMA mappings, accessible by both * devices and their drivers without using cache flushing primitives. * * FIXME: probably need a way to specify "objects must not cross boundaries * of N Bytes", required by some device interfaces. */ extern struct pci_pool * pci_pool_create (const char *name, struct pci_dev *pdev, int size, int align, int flags) { struct pci_pool *retval; size_t temp; if ((LOGICAL_PAGE_SIZE - sizeof (struct pci_page)) < size) return 0; if (!(retval = kmalloc (sizeof *retval, flags))) return retval; if (align < L1_CACHE_BYTES) align = L1_CACHE_BYTES; if (size < align) size = align; else if (size % align) size = (size + align - 1) & ~(align - 1); strncpy (retval->name, name, sizeof retval->name); retval->name [sizeof retval->name - 1] = 0; retval->dev = pdev; INIT_LIST_HEAD (&retval->pages); retval->size = size; retval->align = align; retval->flags = flags; temp = sizeof (struct pci_page); if (temp % align) temp = (temp + align - 1) & ~(align - 1); retval->blocks_offset = temp; retval->blocks_per_page = (LOGICAL_PAGE_SIZE - temp) / size; temp -= sizeof (struct pci_page); /* bitmap size */ while ((temp * 8) < retval->blocks_per_page) { temp += align; retval->blocks_offset += align; retval->blocks_per_page = (LOGICAL_PAGE_SIZE - retval->blocks_offset) / size; } dbg ("create %s/%s, size %d align %d offset %d per-page %d", pdev->slot_name, retval->name, size, align, retval->blocks_offset, retval->blocks_per_page); return retval; } EXPORT_SYMBOL (pci_pool_create); static struct pci_page * pci_alloc_page (struct pci_pool *pool) { void *vaddr; struct pci_page *page; dma_addr_t dma; vaddr = pci_alloc_consistent (pool->dev, LOGICAL_PAGE_SIZE, &dma); page = (struct pci_page *) vaddr; if (page) { page->dma = dma; memset (page->bitmap, 0, sizeof (long) * ((pool->blocks_per_page + BITS_PER_LONG - 1) / BITS_PER_LONG)); if (pool->flags & SLAB_POISON) memset (vaddr + pool->blocks_offset, POISON_BYTE, pool->size * pool->blocks_per_page); list_add (&page->pages, &pool->pages); } return page; } static inline int is_page_busy (int blocks, struct pci_page *page) { int i, bit; for (i = 0; i < blocks; i += BITS_PER_LONG) { bit = ffs (page->bitmap [i / BITS_PER_LONG]); if (!bit || (i + bit) > blocks) continue; return 1; } return 0; } static void pci_free_page (struct pci_pool *pool, struct pci_page *page) { dma_addr_t dma = page->dma; #ifdef CONFIG_SLAB_DEBUG if (is_page_busy (pool->blocks_per_page, page)) { printk (KERN_ERR "pcipool %s/%s, free_page %p leak\n", pool->dev->slot_name, pool->name, page); return; } #endif if (pool->flags & SLAB_POISON) memset (page, POISON_BYTE, LOGICAL_PAGE_SIZE); pci_free_consistent (pool->dev, LOGICAL_PAGE_SIZE, (void *) page, dma); } /** * pci_pool_destroy - destroys a pool of pci memory blocks. * @pool: pci pool that will be freed * * Caller guarantees no more memory from the pool is in use, * and nothing will try to use the pool after this call. */ extern void pci_pool_destroy (struct pci_pool *pool) { unsigned long flags; struct pci_page *page; dbg ("destroy %s/%s", pool->dev->slot_name, pool->name); spin_lock_irqsave (&pci_page_lock, flags); while (!list_empty (&pool->pages)) { page = list_entry (pool->pages.next, struct pci_page, pages); list_del (&page->pages); pci_free_page (pool, page); } spin_unlock_irqrestore (&pci_page_lock, flags); kfree (pool); } EXPORT_SYMBOL (pci_pool_destroy); /** * pci_pool_alloc - allocate a block of consistent memory * @pool: pci pool that will produce the block * @mem_flags: GFP_KERNEL or GFP_ATOMIC * @handle: pointer to dma address of block * * This returns the kernel virtual address of the block, and reports * its dma address through the handle. */ extern void * pci_pool_alloc (struct pci_pool *pool, int mem_flags, dma_addr_t *handle) { unsigned long flags; struct list_head *entry; struct pci_page *page; int map, block; size_t offset; void *retval; spin_lock_irqsave (&pci_page_lock, flags); list_for_each (entry, &pool->pages) { int i; page = list_entry (entry, struct pci_page, pages); for (map = 0, i = 0; i < pool->blocks_per_page; i += BITS_PER_LONG, map++) { if (page->bitmap [map] == ~0UL) continue; block = ffz (page->bitmap [map]); if ((i + block) < pool->blocks_per_page) { set_bit (block, &page->bitmap [map]); offset = (BITS_PER_LONG * map) + block; offset *= pool->size; goto ready; } } } if (!(page = pci_alloc_page (pool))) { if (mem_flags == GFP_KERNEL) { /* FIXME: drop spinlock, block till woken * or timeout, then restart */ printk (KERN_WARNING "pci_pool_alloc %s/%s, can't block yet\n", pool->dev->slot_name, pool->name); } retval = 0; goto done; } offset = 0; set_bit (0, &page->bitmap [0]); ready: offset += pool->blocks_offset; retval = offset + (void *) page; *handle = offset + page->dma; done: spin_unlock_irqrestore (&pci_page_lock, flags); return retval; } EXPORT_SYMBOL (pci_pool_alloc); static struct pci_page * find_page (struct pci_pool *pool, dma_addr_t dma) { unsigned long flags; struct list_head *entry; struct pci_page *page; spin_lock_irqsave (&pci_page_lock, flags); list_for_each (entry, &pool->pages) { page = list_entry (pool->pages.next, struct pci_page, pages); if (dma < page->dma) continue; if (dma < (page->dma + LOGICAL_PAGE_SIZE)) goto done; } page = 0; done: spin_unlock_irqrestore (&pci_page_lock, flags); return page; } #ifdef pci_pool_dma_to_cpu # undef pci_pool_dma_to_cpu #endif /** * pci_pool_dma_to_cpu - maps a block's dma address to a kernel virtual address. * @pool: the pci pool holding the block * @dma: dma address of the allocated block * * Convert a DMA mapping to its cpu address (as returned by pci_pool_alloc), or * to null if the pool does not have a mapping for that particular dma address. * * Don't assume this is cheap, although on some platforms it can be optimized * into macros that add some constant to the DMA address. */ extern void * pci_pool_dma_to_cpu (struct pci_pool *pool, dma_addr_t dma) { struct pci_page *page; if ((page = find_page (pool, dma)) == 0) return 0; return (dma - page->dma) + (void *)page; } EXPORT_SYMBOL (pci_pool_dma_to_cpu); /** * pci_pool_free - free an entry from a pci pool * @pool: the pci pool holding the block * @vaddr: virtual address of block * @dma: dma address of block * * Caller promises neither device nor driver will touch this block unless * it re-allocated. */ extern void pci_pool_free (struct pci_pool *pool, void *vaddr, dma_addr_t dma) { struct pci_page *page; unsigned long flags; int block; if ((page = find_page (pool, dma)) == 0) { printk (KERN_ERR "pci_pool_free %s/%s, invalid dma %x\n", pool->dev->slot_name, pool->name, dma); return; } #ifdef CONFIG_SLAB_DEBUG if (((dma - page->dma) + (void *)page) != vaddr) { printk (KERN_ERR "pci_pool_free %s/%s, invalid vaddr %p\n", pool->dev->slot_name, pool->name, vaddr); return; } #endif if (pool->flags & SLAB_POISON) memset (vaddr, POISON_BYTE, pool->size); block = dma - page->dma; block /= pool->size; spin_lock_irqsave (&pci_page_lock, flags); clear_bit (block % BITS_PER_LONG, &page->bitmap [block / BITS_PER_LONG]); /* FIXME: if someone's waiting here, wakeup ... else */ if (!is_page_busy (pool->blocks_per_page, page)) { list_del (&page->pages); pci_free_page (pool, page); } spin_unlock_irqrestore (&pci_page_lock, flags); } EXPORT_SYMBOL (pci_pool_free); ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <3AA91B2C.BEB85D8C@colorfullife.com>]
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] [not found] ` <3AA91B2C.BEB85D8C@colorfullife.com> @ 2001-03-09 18:21 ` David S. Miller 2001-03-09 18:35 ` [linux-usb-devel] " Johannes Erdfelt 2001-03-09 18:35 ` Alan Cox 2001-03-09 18:29 ` David Brownell 1 sibling, 2 replies; 24+ messages in thread From: David S. Miller @ 2001-03-09 18:21 UTC (permalink / raw) To: Manfred Spraul Cc: David Brownell, Russell King, zaitcev, linux-usb-devel, linux-kernel Manfred Spraul writes: > Do lots of drivers need the reverse mapping? It wasn't on my todo list > yet. I am against any API which provides this. It can be extremely expensive to do this on some architectures, and since the rest of the PCI dma API does not provide such an interface neither should the pool routines. Drivers can keep track of this kind of information themselves, and that is what I tell every driver author to do who complains of a lack of a "bus_to_virt()" type thing, it's just lazy programming. Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 18:21 ` David S. Miller @ 2001-03-09 18:35 ` Johannes Erdfelt 2001-03-09 19:42 ` David Brownell 2001-03-09 18:35 ` Alan Cox 1 sibling, 1 reply; 24+ messages in thread From: Johannes Erdfelt @ 2001-03-09 18:35 UTC (permalink / raw) To: linux-usb-devel Cc: Manfred Spraul, David Brownell, Russell King, zaitcev, linux-kernel On Fri, Mar 09, 2001, David S. Miller <davem@redhat.com> wrote: > Manfred Spraul writes: > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > > yet. > > I am against any API which provides this. It can be extremely > expensive to do this on some architectures, and since the rest > of the PCI dma API does not provide such an interface neither > should the pool routines. The API I hacked together for uhci.c didn't have this. > Drivers can keep track of this kind of information themselves, > and that is what I tell every driver author to do who complains > of a lack of a "bus_to_virt()" type thing, it's just lazy > programming. Once I worked around not having a "bus_to_virt()" type thing I was happier with the resulting code. I completely agree. JE ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 18:35 ` [linux-usb-devel] " Johannes Erdfelt @ 2001-03-09 19:42 ` David Brownell 2001-03-09 20:07 ` David S. Miller 0 siblings, 1 reply; 24+ messages in thread From: David Brownell @ 2001-03-09 19:42 UTC (permalink / raw) To: Johannes Erdfelt, linux-usb-devel Cc: Manfred Spraul, Russell King, zaitcev, linux-kernel > > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > > > yet. > > > > I am against any API which provides this. It can be extremely > > expensive to do this on some architectures, The implementation I posted needed no architecture-specific knowledge. If cost is the issue, fine -- this makes it finite, (not infinite), and some drivers can eliminate that cost. > > and since the rest > > of the PCI dma API does not provide such an interface neither > > should the pool routines. > > The API I hacked together for uhci.c didn't have this. But it didn't handle the OHCI done-list processing, and we've heard a lot more about pci_*_consistent being needed with OHCI than with UHCI; it's more common on non-Intel architectures. Given that some hardware must return the dma addresses, why should it be a good thing to have an API that doesn't expose the notion of a reverse mapping? At this level -- not the lower level code touching hardware PTEs. - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 19:42 ` David Brownell @ 2001-03-09 20:07 ` David S. Miller 2001-03-09 21:14 ` David Brownell 0 siblings, 1 reply; 24+ messages in thread From: David S. Miller @ 2001-03-09 20:07 UTC (permalink / raw) To: David Brownell Cc: Johannes Erdfelt, linux-usb-devel, Manfred Spraul, Russell King, zaitcev, linux-kernel David Brownell writes: > Given that some hardware must return the dma addresses, why > should it be a good thing to have an API that doesn't expose > the notion of a reverse mapping? At this level -- not the lower > level code touching hardware PTEs. Because its' _very_ expensive on certain machines. You have to do 1 or more I/O accesses to get at the PTEs. If you add this reverse notion to just one API (the dma pool one) then people will complain (rightly) that there is not orthogonality in the API since the other mapping functions do not provide it. No, it is unacceptable. Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 20:07 ` David S. Miller @ 2001-03-09 21:14 ` David Brownell 2001-03-09 22:34 ` Pete Zaitcev 0 siblings, 1 reply; 24+ messages in thread From: David Brownell @ 2001-03-09 21:14 UTC (permalink / raw) To: David S. Miller Cc: Johannes Erdfelt, linux-usb-devel, Manfred Spraul, Russell King, zaitcev, linux-kernel > > Given that some hardware must return the dma addresses, why > > should it be a good thing to have an API that doesn't expose > > the notion of a reverse mapping? At this level -- not the lower > > level code touching hardware PTEs. > > Because its' _very_ expensive on certain machines. You have to do > 1 or more I/O accesses to get at the PTEs. Except, I said this was NOT at that level. Those costs don't need to be incurred, but you are reasoning as if they did. > If you add this reverse notion to just one API (the dma pool one) then > people will complain (rightly) that there is not orthogonality in the > API since the other mapping functions do not provide it. "Orthogonality" is the wrong word there. In fact, this is a highly orthogonal approach: each layer deals with distinct problems. (Which is why I'd ignore that complaint.) There's a bunch of functionality drivers need to have, and which the pci_*_consistent() layer APIs (rightly) don't provide. Just like a kmem_cache provides functionality that's not visible through the generic page allocator code; except that this needs to work with the pci-specific page allocator. It feels to me like you're being inconsistent here, objecting to a library API for some functionality (mapping) yet not for any of the other functionality (alignment, small size, poisoning and so on). And yet when Pete Zaitcev described what that mapping code actually involved, you didn't object. So you've succeeded in confusing me. Care to unconfuse? - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 21:14 ` David Brownell @ 2001-03-09 22:34 ` Pete Zaitcev 0 siblings, 0 replies; 24+ messages in thread From: Pete Zaitcev @ 2001-03-09 22:34 UTC (permalink / raw) To: David Brownell Cc: David S. Miller, Johannes Erdfelt, linux-usb-devel, Manfred Spraul, Russell King, zaitcev, linux-kernel > Date: Fri, 09 Mar 2001 13:14:03 -0800 > From: David Brownell <david-b@pacbell.net> >[...] > It feels to me like you're being inconsistent here, objecting > to a library API for some functionality (mapping) yet not for > any of the other functionality (alignment, small size, poisoning > and so on). And yet when Pete Zaitcev described what that > mapping code actually involved, you didn't object. So you've > succeeded in confusing me. Care to unconfuse? I did not propose an API or library which would be equal amond equals with first rate citizens of pci_alloc_xxx and friends. I pointed out that driver can do tracking of reverse mappings at very little cost by using offset [Alan remarked to that how hash can use page number]; so, one may say that I supported DaveM's viewpoint. No wonder he did not object. -- Pete ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 18:21 ` David S. Miller 2001-03-09 18:35 ` [linux-usb-devel] " Johannes Erdfelt @ 2001-03-09 18:35 ` Alan Cox 1 sibling, 0 replies; 24+ messages in thread From: Alan Cox @ 2001-03-09 18:35 UTC (permalink / raw) To: linux-usb-devel Cc: Manfred Spraul, David Brownell, Russell King, zaitcev, linux-kernel > Drivers can keep track of this kind of information themselves, > and that is what I tell every driver author to do who complains > of a lack of a "bus_to_virt()" type thing, it's just lazy > programming. I'd agree. There are _good_ reasons for having reverse mappings especially on certain architectures, but thats for stuff like cache management. Its stuff the drivers have no business poking their noses directly into ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] [not found] ` <3AA91B2C.BEB85D8C@colorfullife.com> 2001-03-09 18:21 ` David S. Miller @ 2001-03-09 18:29 ` David Brownell 2001-03-09 19:14 ` Pete Zaitcev 2001-03-09 20:07 ` David S. Miller 1 sibling, 2 replies; 24+ messages in thread From: David Brownell @ 2001-03-09 18:29 UTC (permalink / raw) To: Manfred Spraul Cc: David S. Miller, Russell King, zaitcev, linux-usb-devel, linux-kernel > > unlike the slab allocator bug(s) I pointed out. (And which > > Manfred seems to have gone silent on.) > > which bugs? See my previous email ... its behavior contradicts its spec, and I'd sent a patch. You said you wanted kmalloc to have an "automagic redzoning" feature, which would involve one more change (to the flags used in kmalloc init when magic redzoning is in effect). I'd expected a response. > > * this can easily be optimized, but the best fix would be to > > * make this just a bus-specific front end to mm/slab.c logic. > ^^^^ > > Adding that new frond end was already on my todo list for 2.5, but it > means modifying half of mm/slab.c. Exactly why I think we need a usable solution changing that half! And why I asked for feedback about the API, not a focus on this particular implementation. > > if (align < L1_CACHE_BYTES) > > align = L1_CACHE_BYTES; > > Why? To see who was awake, of course! That shouldn't be there. > > /* Convert a DMA mapping to its cpu address (as returned by pci_pool_alloc). > > * Don't assume this is cheap, although on some platforms it may be simple > > * macros adding a constant to the DMA handle. > > */ > > extern void * > > pci_pool_dma_to_cpu (struct pci_pool *pool, dma_addr_t handle); > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > yet. Some hardware (like OHCI) talks to drivers using those dma handles. - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 18:29 ` David Brownell @ 2001-03-09 19:14 ` Pete Zaitcev 2001-03-09 19:37 ` David Brownell ` (2 more replies) 2001-03-09 20:07 ` David S. Miller 1 sibling, 3 replies; 24+ messages in thread From: Pete Zaitcev @ 2001-03-09 19:14 UTC (permalink / raw) To: David Brownell Cc: Manfred Spraul, David S. Miller, Russell King, zaitcev, linux-usb-devel, linux-kernel > Date: Fri, 09 Mar 2001 10:29:22 -0800 > From: David Brownell <david-b@pacbell.net> > > > extern void * > > > pci_pool_dma_to_cpu (struct pci_pool *pool, dma_addr_t handle); > > > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > > yet. > > Some hardware (like OHCI) talks to drivers using those dma handles. I wonder if it may be feasible to allocate a bunch of contiguous pages. Then, whenever the hardware returns a bus address, subtract the remembered bus address of the zone start, add the offset to the virtual and voila. -- Pete ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 19:14 ` Pete Zaitcev @ 2001-03-09 19:37 ` David Brownell 2001-03-09 19:04 ` Gérard Roudier 2001-03-09 20:00 ` David S. Miller 2001-03-09 21:38 ` [linux-usb-devel] " Alan Cox 2 siblings, 1 reply; 24+ messages in thread From: David Brownell @ 2001-03-09 19:37 UTC (permalink / raw) To: Pete Zaitcev Cc: Manfred Spraul, David S. Miller, Russell King, zaitcev, linux-usb-devel, linux-kernel > > > > extern void * > > > > pci_pool_dma_to_cpu (struct pci_pool *pool, dma_addr_t handle); > > > > > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > > > yet. > > > > Some hardware (like OHCI) talks to drivers using those dma handles. > > I wonder if it may be feasible to allocate a bunch of contiguous > pages. Then, whenever the hardware returns a bus address, subtract > the remembered bus address of the zone start, add the offset to > the virtual and voila. That's effectively what the implementation I posted is doing. Simple math ... as soon as you get the right "logical page", and that page size could become a per-pool tunable. Currently one logical page is PAGE_SIZE; there are some issues to deal with in terms of not crossing page boundaries. There can be multiple such pages, known to the pool allocator and hidden from the device drivers. I'd expect most USB host controllers wouldn't allocate more than one or two pages, so the cost of this function would typically be small. - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 19:37 ` David Brownell @ 2001-03-09 19:04 ` Gérard Roudier 2001-03-09 22:42 ` David Brownell 0 siblings, 1 reply; 24+ messages in thread From: Gérard Roudier @ 2001-03-09 19:04 UTC (permalink / raw) To: David Brownell Cc: Pete Zaitcev, Manfred Spraul, David S. Miller, Russell King, linux-usb-devel, linux-kernel On Fri, 9 Mar 2001, David Brownell wrote: > > > > > extern void * > > > > > pci_pool_dma_to_cpu (struct pci_pool *pool, dma_addr_t handle); > > > > > > > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > > > > yet. > > > > > > Some hardware (like OHCI) talks to drivers using those dma handles. > > > > I wonder if it may be feasible to allocate a bunch of contiguous > > pages. Then, whenever the hardware returns a bus address, subtract > > the remembered bus address of the zone start, add the offset to > > the virtual and voila. > > That's effectively what the implementation I posted is doing. > > Simple math ... as soon as you get the right "logical page", > and that page size could become a per-pool tunable. Currently > one logical page is PAGE_SIZE; there are some issues to > deal with in terms of not crossing page boundaries. > > There can be multiple such pages, known to the pool allocator > and hidden from the device drivers. I'd expect most USB host > controllers wouldn't allocate more than one or two pages, so > the cost of this function would typically be small. Just for information to people that want to complexify the pci_alloc_consistent() interface thats looks simple and elegant to me: (Hopefully, I am not off topic here) 1) The sym53c8xx driver and friends expect this simple interface to return naturally aligned memory chunks. It mostly allocates 1 page at a time. 2) The sym* drivers use a very simple allocator that keeps track of bus addresses for each chunk (page sized). The object file of the allocator as seen in sym2 is as tiny as 3.4K unstripped and 2.5K stripped. 3) The drivers need reverse virtual addresses for the DSA bus addresses and implements a simplistic hashed list that costs no more than 10 lines of trivial C code. Btw, as a result, if pci_alloc_consistent() will ever return not naturally aligned addresses the sym* drivers will be broken for Linux. This leaves me very surprised by all this noise given the _no_ issue I have seen using the pci_alloc_consistent() interface. It looked to me a _lot_ more simple to use than the equivalent interfaces I have had to suffer with other O/Ses. Now, if modern programmers are expecting Java-like interfaces for writing kernel software, it is indeed another story. :-) Gérard. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 19:04 ` Gérard Roudier @ 2001-03-09 22:42 ` David Brownell 2001-03-09 21:07 ` Gérard Roudier 0 siblings, 1 reply; 24+ messages in thread From: David Brownell @ 2001-03-09 22:42 UTC (permalink / raw) To: Gérard Roudier Cc: Pete Zaitcev, Manfred Spraul, David S. Miller, Russell King, linux-usb-devel, linux-kernel Gérard -- > Just for information to people that want to complexify the > pci_alloc_consistent() interface thats looks simple and elegant to me: I certainly didn't propose that! Just a layer on top of the pci_alloc_consistent code -- used as a page allocator, just like you used it. > The object file of the allocator as seen in sym2 is as tiny as 3.4K > unstripped and 2.5K stripped. What I sent along just compiled to 2.3 KB ... stripped, and "-O". Maybe smaller with normal kernel flags. The reverse mapping code hast to be less than 0.1KB. I looked at your code, but it didn't seem straightforward to reuse. I think the allocation and deallocation costs can be pretty comparable in the two implementations. Your implementation might even fit behind the API I sent. They're both layers over pci_*_consistent (and both have address-to-address mappings, implemented much the same). > Now, if modern programmers are expecting Java-like interfaces for writing > kernel software, it is indeed another story. :-) Only if when you wrote "Java-like" you really meant "reusable"! :) - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 22:42 ` David Brownell @ 2001-03-09 21:07 ` Gérard Roudier 2001-03-10 3:11 ` David Brownell 0 siblings, 1 reply; 24+ messages in thread From: Gérard Roudier @ 2001-03-09 21:07 UTC (permalink / raw) To: David Brownell Cc: Pete Zaitcev, Manfred Spraul, David S. Miller, Russell King, linux-usb-devel, linux-kernel On Fri, 9 Mar 2001, David Brownell wrote: > Gérard -- > > > Just for information to people that want to complexify the > > pci_alloc_consistent() interface thats looks simple and elegant to me: > > I certainly didn't propose that! Just a layer on top of the > pci_alloc_consistent code -- used as a page allocator, just > like you used it. > > > > The object file of the allocator as seen in sym2 is as tiny as 3.4K > > unstripped and 2.5K stripped. > > What I sent along just compiled to 2.3 KB ... stripped, and "-O". > Maybe smaller with normal kernel flags. The reverse mapping > code hast to be less than 0.1KB. If reverse mapping means bus_to_virt(), then I would suggest not to provide it since it is a confusing interface. OTOH, only a few drivers need or want to retrieve the virtual address that lead to some bus dma address and they should check that this virtual address is still valid prior to using it. As I wrote, some trivial hashed list can be used by such drivers (as sym* do). > I looked at your code, but it didn't seem straightforward to reuse. > I think the allocation and deallocation costs can be pretty comparable > in the two implementations. Your implementation might even fit behind > the API I sent. They're both layers over pci_*_consistent (and both > have address-to-address mappings, implemented much the same). I wanted the code as short as possible since the driver code is already very large. On the other hand there are bunches of #ifdef to deal with all still alive kernel versions. As a result, the code may well not be general nor clean enough to be moved to the kernel. Just what it actually does is fairly simple. > > Now, if modern programmers are expecting Java-like interfaces for writing > > kernel software, it is indeed another story. :-) > > Only if when you wrote "Java-like" you really meant "reusable"! :) Hmmm... 'reusable' implies 'usable'... Does 'usable' apply to Java applications ? :-) Gérard. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 21:07 ` Gérard Roudier @ 2001-03-10 3:11 ` David Brownell 0 siblings, 0 replies; 24+ messages in thread From: David Brownell @ 2001-03-10 3:11 UTC (permalink / raw) To: Gérard Roudier Cc: Pete Zaitcev, Manfred Spraul, David S. Miller, Russell King, linux-usb-devel, linux-kernel > > The reverse mapping > > code hast to be less than 0.1KB. > > If reverse mapping means bus_to_virt(), then I would suggest not to > provide it since it is a confusing interface. OTOH, only a few drivers > need or want to retrieve the virtual address that lead to some bus dma Your SCSI code went the other way; the logic is about the same. That's easy enough ... I'm not going to argue that point any longer. The driver might even have Real Intelligence to apply. But I wonder how many assumptions drivers will end up making about those dma mappings. It may be important to expose the "logical" page size to the driver ("don't cross 4k boundaries"); currently it's hidden. Other than that L1_CACHE goof, it seems this was the main thing needing to change in the I API sent by. Sound right? Implementation would be a different question. I'm not in the least attached to what I sent by, but some implementation is needed. Slab-like, or buddy? :) > Does 'usable' apply to Java applications ? :-) Servers and other non-gui tools? I don't see why not. You can make good systems software in many languages. There are advantages to not having those classes of memory-related bugs. I'm looking forward to GCC 3.0 with GCJ, compiling Java just like C. But that's OT. - Dave ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 19:14 ` Pete Zaitcev 2001-03-09 19:37 ` David Brownell @ 2001-03-09 20:00 ` David S. Miller 2001-03-09 21:38 ` [linux-usb-devel] " Alan Cox 2 siblings, 0 replies; 24+ messages in thread From: David S. Miller @ 2001-03-09 20:00 UTC (permalink / raw) To: Pete Zaitcev Cc: David Brownell, Manfred Spraul, Russell King, linux-usb-devel, linux-kernel Pete Zaitcev writes: > > Some hardware (like OHCI) talks to drivers using those dma handles. > > I wonder if it may be feasible to allocate a bunch of contiguous > pages. Then, whenever the hardware returns a bus address, subtract > the remembered bus address of the zone start, add the offset to > the virtual and voila. Thankfully, some people are not lazy :-) Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [linux-usb-devel] Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 19:14 ` Pete Zaitcev 2001-03-09 19:37 ` David Brownell 2001-03-09 20:00 ` David S. Miller @ 2001-03-09 21:38 ` Alan Cox 2 siblings, 0 replies; 24+ messages in thread From: Alan Cox @ 2001-03-09 21:38 UTC (permalink / raw) To: linux-usb-devel Cc: David Brownell, Manfred Spraul, David S. Miller, Russell King, zaitcev, linux-kernel > I wonder if it may be feasible to allocate a bunch of contiguous > pages. Then, whenever the hardware returns a bus address, subtract > the remembered bus address of the zone start, add the offset to > the virtual and voila. Even if not you can hash by page number not low bits so the hash is way smaller. You (in most cases) can also write the entry number in the relevant tables onto the end of the object in spare space (or in front of it) Something as trivial as struct usb_thingy { u32 thing_id; u32 flags; struct usb_thingy *next; #ifndef __LP64__ u32 pad #endif struct usb_controller_goo goo; } Alan ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] 2001-03-09 18:29 ` David Brownell 2001-03-09 19:14 ` Pete Zaitcev @ 2001-03-09 20:07 ` David S. Miller 1 sibling, 0 replies; 24+ messages in thread From: David S. Miller @ 2001-03-09 20:07 UTC (permalink / raw) To: David Brownell Cc: Manfred Spraul, Russell King, zaitcev, linux-usb-devel, linux-kernel David Brownell writes: > > Do lots of drivers need the reverse mapping? It wasn't on my todo list > > yet. > > Some hardware (like OHCI) talks to drivers using those dma handles. Drivers for such hardware will this keep track of the information necessary to make this reverse mapping. Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2001-03-10 3:25 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-03-05 22:08 SLAB vs. pci_alloc_xxx in usb-uhci patch Manfred Spraul
2001-03-05 22:52 ` David Brownell
2001-03-05 23:20 ` Russell King
2001-03-06 2:09 ` Alan Cox
2001-03-06 2:29 ` [linux-usb-devel] " David Brownell
2001-03-06 4:53 ` David S. Miller
2001-03-09 16:18 ` SLAB vs. pci_alloc_xxx in usb-uhci patch [RFC: API] David Brownell
[not found] ` <3AA91B2C.BEB85D8C@colorfullife.com>
2001-03-09 18:21 ` David S. Miller
2001-03-09 18:35 ` [linux-usb-devel] " Johannes Erdfelt
2001-03-09 19:42 ` David Brownell
2001-03-09 20:07 ` David S. Miller
2001-03-09 21:14 ` David Brownell
2001-03-09 22:34 ` Pete Zaitcev
2001-03-09 18:35 ` Alan Cox
2001-03-09 18:29 ` David Brownell
2001-03-09 19:14 ` Pete Zaitcev
2001-03-09 19:37 ` David Brownell
2001-03-09 19:04 ` Gérard Roudier
2001-03-09 22:42 ` David Brownell
2001-03-09 21:07 ` Gérard Roudier
2001-03-10 3:11 ` David Brownell
2001-03-09 20:00 ` David S. Miller
2001-03-09 21:38 ` [linux-usb-devel] " Alan Cox
2001-03-09 20:07 ` David S. Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox