* Re: PCI DMA to small buffers on cache-incoherent arch
@ 2002-06-11 5:31 David Brownell
2002-06-11 5:44 ` David S. Miller
0 siblings, 1 reply; 102+ messages in thread
From: David Brownell @ 2002-06-11 5:31 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-kernel
> From: Roland Dreier <roland@topspin.com>
> Date: 10 Jun 2002 21:39:27 -0700
>
> David> How about allocating struct something using pci_pool?
>
> The problem is the driver can't safely touch field1 or field2 near the
> DMA (it might pull the cache line back in too soon, or dirty the cache
> line and have it written back on top of DMA'ed data)
>
> Oh duh, I see, then go to making the thing a pointer to the
> dma area.
If it's going to be a pointer, it might as well be normal kmalloc data,
avoiding obfuscation of the memory model used by USB device drivers.
(They don't have to worry about DMA addresses, which IMO is a feature.)
Seems like there are two basic options for how to handle all this.
(a) Expose the dma/cache interaction to device drivers, so
they can manage (and prevent) bad ones like the cacheline
sharing scenarios.
(b) Require drivers to provide I/O buffers that are slab/page/...
pointers from some API that prevents such problems.
I think (a) is more or less a variant of the existing alignment
requirements that get embedded in ABIs, but which still show up
in code that's very close to the hardware. Many device driver
folk are familiar with them, or performance tuners, but not so
many folk doing USB device drivers would be. (They are thankfully
far from host controller hardware and its quirks!)
Personally I'd avoid (b) since I like to allocate memory all at
once in most cases ... lots fewer error cases to cope with (or,
more typically _not_ cope with :).
One question is whether to support only one of them, or allow both.
In either case the DMA-mapping.txt will need to touch on the issue.
- Dave
^ permalink raw reply [flat|nested] 102+ messages in thread* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 5:31 PCI DMA to small buffers on cache-incoherent arch David Brownell @ 2002-06-11 5:44 ` David S. Miller 2002-06-11 15:12 ` David Brownell 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 5:44 UTC (permalink / raw) To: david-b; +Cc: linux-kernel From: David Brownell <david-b@pacbell.net> Date: Mon, 10 Jun 2002 22:31:45 -0700 One question is whether to support only one of them, or allow both. In either case the DMA-mapping.txt will need to touch on the issue. Another important issue is from where do these issues originate? This stuff rarely happens most of the time because block buffers and networking buffers are what are fed to the chip. I still think it is crucial that we not put this alignment garbage into the drivers themselves. Driver writers are going to get it wrong or be confused by it. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 5:44 ` David S. Miller @ 2002-06-11 15:12 ` David Brownell 2002-06-11 15:44 ` Oliver Neukum 2002-06-12 3:25 ` David S. Miller 0 siblings, 2 replies; 102+ messages in thread From: David Brownell @ 2002-06-11 15:12 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel > One question is whether to support only one of them, or allow both. > In either case the DMA-mapping.txt will need to touch on the issue. > > Another important issue is from where do these issues originate? > > This stuff rarely happens most of the time because block buffers and > networking buffers are what are fed to the chip. I think the examples Oliver found related mostly to device control and status tracking. > I still think it is crucial that we not put this alignment garbage > into the drivers themselves. Driver writers are going to get it > wrong or be confused by it. I guess I don't see a way around exposing some of it. Do you? Even the (b) option requires drivers to know they can't pass pointers to the inside of a structure, AND that memory right after the end of an I/O buffer is unavailable. That exposes the issue. Although it doesn't provide help to manage it, as (a) does, or detect it. Should the dma mapping APIs try to detect the "DMA buffer starts in middle of non-coherent cacheline" case, and fail? That might be worth checking, catching some of these errors, even if it ignores the corresponding "ends in middle of non-coherent cacheline" case. And it'd handle that "it's a runtime issue on some HW" concern. Or then there's David Woodhouse's option (disable caching on those pages while the DMA mapping is active) which seems good, except for the fact that this issue is most common for buffers that are a lot smaller than one page ... so lots of otherwise cacheable data would suddenly get very slow. :) - Dave ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 15:12 ` David Brownell @ 2002-06-11 15:44 ` Oliver Neukum 2002-06-12 3:25 ` David S. Miller 1 sibling, 0 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-11 15:44 UTC (permalink / raw) To: David Brownell, David S. Miller; +Cc: linux-kernel Am Dienstag, 11. Juni 2002 17:12 schrieb David Brownell: > > One question is whether to support only one of them, or allow both. > > In either case the DMA-mapping.txt will need to touch on the issue. > > > > Another important issue is from where do these issues originate? > > > > This stuff rarely happens most of the time because block buffers and > > networking buffers are what are fed to the chip. > > I think the examples Oliver found related mostly to device control > and status tracking. I did not look for others. It would seem that SCSI and networking are affected as well. SCSI due to the sense buffer. > Or then there's David Woodhouse's option (disable caching on those > pages while the DMA mapping is active) which seems good, except for > the fact that this issue is most common for buffers that are a lot > smaller than one page ... so lots of otherwise cacheable data would > suddenly get very slow. :) There might be several buffers in one page. You'd have to count DMA users in struct page. Secondly are you sure that a physical page won't be accessed by several different virtual addresses ? On some architectures that would kill us. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 15:12 ` David Brownell 2002-06-11 15:44 ` Oliver Neukum @ 2002-06-12 3:25 ` David S. Miller 2002-06-11 17:33 ` Benjamin Herrenschmidt 2002-06-12 6:25 ` PCI DMA to small buffers on cache-incoherent arch David Brownell 1 sibling, 2 replies; 102+ messages in thread From: David S. Miller @ 2002-06-12 3:25 UTC (permalink / raw) To: david-b; +Cc: linux-kernel From: David Brownell <david-b@pacbell.net> Date: Tue, 11 Jun 2002 08:12:35 -0700 Should the dma mapping APIs try to detect the "DMA buffer starts in middle of non-coherent cacheline" case, and fail? That might be worth checking, catching some of these errors, even if it ignores the corresponding "ends in middle of non-coherent cacheline" case. And it'd handle that "it's a runtime issue on some HW" concern. This brings back another issue, returning failure from pci_map_*() and friends which currently cannot happen. Or then there's David Woodhouse's option (disable caching on those pages while the DMA mapping is active) which seems good, except for the fact that this issue is most common for buffers that are a lot smaller than one page ... so lots of otherwise cacheable data would suddenly get very slow. :) Remember please that specifically the DMA mapping APIs encourage use of consistent memory for small data objects. It is specifically because non-consistent DMA accesses to small bits are going to be very slow (ie. the PCI controller is going to prefetch further cache lines for no reason, for example). The non-consistent end of the APIs is meant for long contiguous buffers, not small chunks. This is one of the reasons I want to fix this by making people use either consistent memory or PCI pools (which is consistent memory too). ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 3:25 ` David S. Miller @ 2002-06-11 17:33 ` Benjamin Herrenschmidt 2002-06-12 9:42 ` David S. Miller 2002-06-12 6:25 ` PCI DMA to small buffers on cache-incoherent arch David Brownell 1 sibling, 1 reply; 102+ messages in thread From: Benjamin Herrenschmidt @ 2002-06-11 17:33 UTC (permalink / raw) To: David S. Miller, david-b; +Cc: linux-kernel >Remember please that specifically the DMA mapping APIs encourage use >of consistent memory for small data objects. It is specifically >because non-consistent DMA accesses to small bits are going to be very >slow (ie. the PCI controller is going to prefetch further cache lines >for no reason, for example). The non-consistent end of the APIs is >meant for long contiguous buffers, not small chunks. > >This is one of the reasons I want to fix this by making people use >either consistent memory or PCI pools (which is consistent memory >too). Please don't limit the API design to PCI ;) There are more and more embedded CPUs out there with their own bunch of on-chip devices that are neither consistent nor PCI based, their drivers will have the exact same problem to deal with though. Ben. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 17:33 ` Benjamin Herrenschmidt @ 2002-06-12 9:42 ` David S. Miller 2002-06-12 14:14 ` David Brownell 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-12 9:42 UTC (permalink / raw) To: benh; +Cc: david-b, linux-kernel From: Benjamin Herrenschmidt <benh@kernel.crashing.org> Date: Tue, 11 Jun 2002 19:33:47 +0200 >This is one of the reasons I want to fix this by making people use >either consistent memory or PCI pools (which is consistent memory >too). Please don't limit the API design to PCI ;) When I say "PCI pools" think "struct device pools" because that will what it will be in the end. That is Linus's long range plan, everything you see as pci_* will become dev_*. So I'm not really limiting the design in any way. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 9:42 ` David S. Miller @ 2002-06-12 14:14 ` David Brownell 2002-06-12 15:00 ` Benjamin Herrenschmidt 2002-06-12 18:44 ` Roland Dreier 0 siblings, 2 replies; 102+ messages in thread From: David Brownell @ 2002-06-12 14:14 UTC (permalink / raw) To: David S. Miller; +Cc: benh, linux-kernel > Please don't limit the API design to PCI ;) > > When I say "PCI pools" think "struct device pools" because that > will what it will be in the end. > > That is Linus's long range plan, everything you see as pci_* > will become dev_*. I think the guts of it could happen in the 2.5 series soonish though, so it'd only be "long range" in that quaint Wall Street sense of "finishes sometime after next quarter". :) Just to put some flesh on the discussion ... here's the rough shape of what I was thinking, as more wood for the fire: - Associated with a "struct device" would be some kind of memory allocator/mapper object that would expose every (?) primitive in DMA-mapping.txt, but not PCI-specific. That would basically be a vtable, using the device for state. struct dev_dma_whatsit { /* page-level "consistent" allocators * should they continue to be nonblocking? */ void *alloc_consistent(struct device *dev, size_t *size, dma_addr_t *dma); void free_consistent(struct device *dev, size_t size, void *addr, dma_addr_t dma); /* "streaming mapping" calls * return BAD_DMA_ADDR on error */ dma_addr_t map_single(struct device *dev, void *addr, size_t size, int dir); void unmap_single(struct device *dev, dma_addr_t, size_t size, int dir); void sync_single(struct device *dev, dma_addr_t, size_t size, int dir); /* Also: dma masks, page and scatterlist * map/unmap/sync, dma64_addr_t, ... */ }; Those would be designed so they could be shared between devices, invisibly to code talking to a struct device. Example: all USB devices connected to a given bus would normally delegate to the PCI device underlying that bus. (Except for non-PCI host controllers, of course!) And the PCI versions of those methods have rather obvious implementations ... :) - There'd be dev_*() routines that in many cases just call directly to those methods, or failed cleanly if the whatsit wasn't set up ("used the default allocator"), or omitted key methods. Many could be inlinable functions: dev_alloc_consistent(...) dev_free_consistent(...) dev_map_single(...) ... etc In some cases they'd mostly be layered over those primitives: /* kmalloc analogue */ void *dev_kmalloc(struct device *dev, size_t size, int mem_flags, dma_addr_t *dma); void dev_kfree (struct device *dev, void *mem, dma_addr_t dma); /* pci_pool/kmem_cache_t analogues */ struct dev_pool *dev_pool_create(const char *name, struct device *dev, size_t size, size_t allocation, int mem_flags); void dev_pool_destroy(struct dev_pool *pool); void *dev_pool_alloc(struct dev_pool *pool, int mem_flags, dma_addr_t *dma); void dev_pool_free(struct dev_pool *pool, void *mem, dma_addr_t dma); - Some of the other driver APIs would want to change to leverage these primitives. They might want to provide wrappers to make the dev_*() calls given the subsystem device API. Again using USB for an, one _possible_ change would be to make urb->transfer_buffer be a dma_addr_t ... where today it's a void*. (Discussions of such details belong on the linux-usb-devel list, for the most part.) That'd be a pretty major change to the USB API, since it'd force device drivers to manage DMA mappings. While usb-storage might benefit from sglist support, other drivers might not see much of a win. OK, that's enough trouble for this morning... ;) - Dave ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 14:14 ` David Brownell @ 2002-06-12 15:00 ` Benjamin Herrenschmidt 2002-06-12 18:44 ` Roland Dreier 1 sibling, 0 replies; 102+ messages in thread From: Benjamin Herrenschmidt @ 2002-06-12 15:00 UTC (permalink / raw) To: David Brownell, David S. Miller; +Cc: linux-kernel > Those would be designed so they could be shared between > devices, invisibly to code talking to a struct device. > > Example: all USB devices connected to a given bus would > normally delegate to the PCI device underlying that bus. > (Except for non-PCI host controllers, of course!) > > And the PCI versions of those methods have rather obvious > implementations ... :) Yup. The simplest way I see here is to have a device automatically inherit, by default, his parent device ones. The arch will provide default "root" allocators, so a fully coherent or fully non-coherent arch may not have to do anything but these. Now, if a given bus in the arch need specific quirks, it's up to the bus device node to put it's own versions to be used by it's childs. Ben. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 14:14 ` David Brownell 2002-06-12 15:00 ` Benjamin Herrenschmidt @ 2002-06-12 18:44 ` Roland Dreier 2002-06-12 19:13 ` David Brownell 1 sibling, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-12 18:44 UTC (permalink / raw) To: David Brownell; +Cc: David S. Miller, benh, linux-kernel >>>>> "David" == David Brownell <david-b@pacbell.net> writes: David> Again using USB for an, one _possible_ change would be to David> make urb->transfer_buffer be a dma_addr_t ... where today David> it's a void*. (Discussions of such details belong on the David> linux-usb-devel list, for the most part.) That'd be a David> pretty major change to the USB API, since it'd force device David> drivers to manage DMA mappings. While usb-storage might David> benefit from sglist support, other drivers might not see David> much of a win. Let's go back to the beginning of this discussion since I think we're losing sight of the original problem. In general I certainly support the idea of making the DMA mapping stuff device generic instead of tied to PCI. For example, this would really clean up the handling of on-chip peripherals for some of the PowerPC 4xx CPUs I work with (in fact on the linuxppc-embedded list there has been discussion about the abuse of constants like PCI_DMA_FROMDEVICE in contexts that have nothing to do with PCI). However, this discussion started when I fixed some problems I was having with USB on my IBM PowerPC 440GP system (which is not cache coherent). I posted a patch to linux-usb-devel that got rid of the use of unaligned buffers for DMA. Most of the changes got rid of DMA into variables on the stack, which everyone agreed were clear bugs. I also changed some code that did DMA into the middle of a struct in kmalloc()'ed memory. As a concrete example, in drivers/usb/hub.h I made the change: struct usb_hub { struct usb_device *dev; struct urb *urb; /* Interrupt polling pipe */ - char buffer[(USB_MAXCHILDREN + 1 + 7) / 8]; /* add 1 bit for hub status change */ - /* and add 7 bits to round up to byte boundary */ + char *buffer; int error; int nerrors; and then after the kmalloc of struct usb_hub *hub, I added a kmalloc of hub->buffer. There were two objections to this change. First, there were questions about whether the change was really needed. I think we have all agreed that DMA into buffer is not necessarily safe. Second, people felt that splitting the allocation of hub in two introduced needless complication and that it would be better to align buffer within the structure. To this end I proposed a __dma_buffer macro; cache coherent architectures would just do #define __dma_buffer while non-cache-coherent architectures would define #define __dma_buffer __dma_buffer_line(__LINE__) #define __dma_buffer_line(line) __dma_buffer_expand_line(line) #define __dma_buffer_expand_line(line) \ __attribute__ ((aligned(L1_CACHE_BYTES))); \ char __dma_pad_ ## line [0] __attribute__ ((aligned(L1_CACHE_BYTES))) Then the USB hub fix would just amount to changing the declaration of buffer to char buffer[(USB_MAXCHILDREN + 1 + 7) / 8] __dma_buffer; /* add 1 bit for hub status change */ and no other code would change. This seems to me to be pretty close to the minimal change to make things correct. You would get zero bloat on cache coherent architectures, and only one line of code needs to be touched. We can debate about whether this puts too much knowledge about DMA into the driver; I would argue that device-specific allocators are worse. The other objection to __dma_buffer seems to be that alignment requirements can't be calculated at compile time; however this problem seems to exist for kmalloc() as well (since slab caches have to be initialized well before every bus is discovered, not to mention the fact that we might have new alignment requirements coming from hot-plugged devices). It seems to me that both kmalloc() and __dma_buffer both need to align for the worst possible case, and no one objects to kmalloc(). (Well, you might say that device drivers should use device specific allocators, but skbuffs and other generic buffers don't know what device they'll end up at so they need to come from a generic allocator) The current USB driver design seems pretty reasonable: only the HCD drivers need to know about DMA mappings, and other USB drivers just pass buffer addresses. I don't think you would get much support for forcing every driver to handle its own DMA mapping. I would like to see both dev_map_xxx etc. and something like __dma_buffer go into the kernel. I think they both have their uses. Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 18:44 ` Roland Dreier @ 2002-06-12 19:13 ` David Brownell 2002-06-12 19:58 ` Oliver Neukum ` (3 more replies) 0 siblings, 4 replies; 102+ messages in thread From: David Brownell @ 2002-06-12 19:13 UTC (permalink / raw) To: Roland Dreier; +Cc: David S. Miller, benh, linux-kernel > Let's go back to the beginning of this discussion since I think we're > losing sight of the original problem. I think actually a couple were unearthed, but you're right to focus. > In general I certainly support > the idea of making the DMA mapping stuff device generic instead of > tied to PCI. PCI was a good place to start (focus ... :) but clearly it shouldn't be the only bus architecture with such support. Note that there are actually two related approaches in the DMA-mapping.txt APIs ... one is DMA mapping, the other is "consistent memory". Both should be made generic rather than PCI-specific, not just mapping APIs. > However, this discussion started when I fixed some problems I was > having with USB on my IBM PowerPC 440GP system (which is not cache > coherent). ... which basically led to discussion of options I summarized a while back as (a) expose the issues to drivers via macros, or (b) expose them through some other API. And DaveM wanted to focus on (b) options that involve exposing consistent memory to drivers as buffers (sized in multiples of cachelines, where that matters) rather than DMA-mapping them. Based on the discussion, I think the answer for now is to go with the (b) variant you had originally started with, using kmalloc for the buffers. The __dma_buffer style macro didn't seem popular; though I agree that it's not clear kmalloc() really solves it today. (Given DaveM's SPARC example, the minimum size value it returns would need to be 128 bytes ... which clearly isn't so.) > The current USB driver design seems pretty reasonable: only the HCD > drivers need to know about DMA mappings, and other USB drivers just > pass buffer addresses. I don't think you would get much support for > forcing every driver to handle its own DMA mapping. Me either. But I suspect that it'd be good to have that as an option; maybe just add a transfer_dma field to the URB, and have the HCD use that, instead of creating a mapping, when transfer_buffer is null. That'd certainly be a better approach for supporting sglist in the usb-storage code than the alternatives I've heard so far. > I would like to see both dev_map_xxx etc. and something like > __dma_buffer go into the kernel. I think they both have their uses. Got Patch? Actually, I know you do, I shouldn't ask. :) - Dave ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 19:13 ` David Brownell @ 2002-06-12 19:58 ` Oliver Neukum 2002-06-12 22:51 ` David S. Miller 2002-06-13 4:57 ` David Brownell 2002-06-12 22:46 ` David S. Miller ` (2 subsequent siblings) 3 siblings, 2 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-12 19:58 UTC (permalink / raw) To: David Brownell, Roland Dreier Cc: David S. Miller, benh, linux-kernel, linux-usb-devel > Based on the discussion, I think the answer for now is to go with > the (b) variant you had originally started with, using kmalloc for > the buffers. The __dma_buffer style macro didn't seem popular; > though I agree that it's not clear kmalloc() really solves it > today. (Given DaveM's SPARC example, the minimum size value it > returns would need to be 128 bytes ... which clearly isn't so.) Perhaps I might point out that we need a solution for 2.4. We certainly could go for kmalloc. But given the sparc64 example, it won't work. If you do it with alignment in your own buffer, you can control which fields are accessed. We'd need to export a worst case minimum size for kmalloc. Ugly, very ugly. Tons of kmalloc( size>MIN_DMA ? size : MIN_DMA,... But it seems that this would mean quite a painful change in many drivers. Which alignment macros wouldn't mean. In fact in the common case there'd be no code change at all. So if we take stability as a guideline, the alignment macros win without question. [..] > That'd certainly be a better approach for supporting sglist in the > usb-storage code than the alternatives I've heard so far. Could you elaborate ? This sounds interesting. > > I would like to see both dev_map_xxx etc. and something like > > __dma_buffer go into the kernel. I think they both have their uses. > > Got Patch? Actually, I know you do, I shouldn't ask. :) So perhaps something could be done in time for 2.4.20 ? Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 19:58 ` Oliver Neukum @ 2002-06-12 22:51 ` David S. Miller 2002-06-12 23:17 ` Oliver Neukum 2002-06-13 4:57 ` David Brownell 1 sibling, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-12 22:51 UTC (permalink / raw) To: oliver; +Cc: david-b, roland, benh, linux-kernel, linux-usb-devel From: Oliver Neukum <oliver@neukum.name> Date: Wed, 12 Jun 2002 21:58:55 +0200 Perhaps I might point out that we need a solution for 2.4. For 2.4.x just do the DMA alignment macro idea. That would be the easiest change to verify. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 22:51 ` David S. Miller @ 2002-06-12 23:17 ` Oliver Neukum 0 siblings, 0 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-12 23:17 UTC (permalink / raw) To: David S. Miller; +Cc: david-b, roland, benh, linux-kernel, linux-usb-devel Am Donnerstag, 13. Juni 2002 00:51 schrieb David S. Miller: > From: Oliver Neukum <oliver@neukum.name> > Date: Wed, 12 Jun 2002 21:58:55 +0200 > > Perhaps I might point out that we need a solution for 2.4. > > For 2.4.x just do the DMA alignment macro idea. That would > be the easiest change to verify. OK, there we have something to work on. Apart from the drivers I've already mailed, struct scsi_cmnd is affected. kmalloc might be a problem. Could we agree on a name for the macro ? Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 19:58 ` Oliver Neukum 2002-06-12 22:51 ` David S. Miller @ 2002-06-13 4:57 ` David Brownell 1 sibling, 0 replies; 102+ messages in thread From: David Brownell @ 2002-06-13 4:57 UTC (permalink / raw) To: Oliver Neukum; +Cc: linux-kernel, linux-usb-devel >>That'd certainly be a better approach for supporting sglist in the >>usb-storage code than the alternatives I've heard so far. > > Could you elaborate ? This sounds interesting. It's been discussed on linux-usb-devel more than once. It's really a secondary or tertiary consideration. The top performance issue for usb-storage in USB 2.0 is that it doesn't yet use bulk queueing for its requests ... which means that it wastes at least half the bandwidth available to it (staying in the low tens of MByte/sec vs twenties or even higher). [*] The sglist issue is a "how to create fewer DMA mappings" issue. There are specific sglist DMA mapping calls, but they can't be used without adding to the usbcore API ... either sglist, usable only by usb-storage AFAICT, or something more general like dma_addr_t. In any case it's an optimization issue that doesn't seem like it should be at the front of anyone's list for some time. - Dave [*] Yes, patches have been available to fix it. Not updated after the block I/O changes in 2.5, but not hard to do so. Worth updating and integrating after the HCD stuff settles down so we can make sure that all three major HCDs do the bulk queuing correctly in both success and failure paths. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 19:13 ` David Brownell 2002-06-12 19:58 ` Oliver Neukum @ 2002-06-12 22:46 ` David S. Miller 2002-06-13 5:13 ` David Brownell 2002-06-12 22:50 ` David S. Miller 2002-06-13 0:23 ` [PATCH] 2.4 add __dma_buffer alignment macro Roland Dreier 3 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-12 22:46 UTC (permalink / raw) To: david-b; +Cc: roland, benh, linux-kernel From: David Brownell <david-b@pacbell.net> Date: Wed, 12 Jun 2002 12:13:08 -0700 > The current USB driver design seems pretty reasonable: only the HCD > drivers need to know about DMA mappings, and other USB drivers just > pass buffer addresses. I don't think you would get much support for > forcing every driver to handle its own DMA mapping. Me either. But I suspect that it'd be good to have that as an option; maybe just add a transfer_dma field to the URB, and have the HCD use that, instead of creating a mapping, when transfer_buffer is null. That'd certainly be a better approach for supporting sglist in the usb-storage code than the alternatives I've heard so far. Another solution for the "small chunks" issue is to in fact keep all of the real DMA buffers in the host controller driver. Ie. a pool of tiny consitent DMA memory bounce buffers. That is an idea for discussion, I have no particular preference. This could actually improve performance for things like the input layer USB drivers. On several systems multiple cache lines are fetched when a keyboard key is typed, and this is because we use streaming buffers instead of consistent ones for these transfers. We have two problems we want to solve, the DMA alignment stuff and using consistent memory for these small buffers. Therefore moving to consistent memory (by whatever mechanism the USB desires to implement this) is the way to go. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 22:46 ` David S. Miller @ 2002-06-13 5:13 ` David Brownell 2002-06-13 9:38 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: David Brownell @ 2002-06-13 5:13 UTC (permalink / raw) To: David S. Miller; +Cc: roland, benh, linux-kernel David S. Miller wrote: > From: David Brownell <david-b@pacbell.net> > Date: Wed, 12 Jun 2002 12:13:08 -0700 > > > The current USB driver design seems pretty reasonable: only the HCD > > drivers need to know about DMA mappings, and other USB drivers just > > pass buffer addresses. I don't think you would get much support for > > forcing every driver to handle its own DMA mapping. > > Me either. But I suspect that it'd be good to have that as an option; > maybe just add a transfer_dma field to the URB, and have the HCD use > that, instead of creating a mapping, when transfer_buffer is null. > That'd certainly be a better approach for supporting sglist in the > usb-storage code than the alternatives I've heard so far. > > Another solution for the "small chunks" issue is to in fact keep all > of the real DMA buffers in the host controller driver. Ie. a pool > of tiny consitent DMA memory bounce buffers. That is an idea for > discussion, I have no particular preference. Interesting idea. I'd rather have the "hcd framework" layer manage such stuff than have each of N host controller drivers trying to do the same thing though ... less likely to turn into bug heaven. > This could actually improve performance for things like the input > layer USB drivers. On several systems multiple cache lines are > fetched when a keyboard key is typed, and this is because we use > streaming buffers instead of consistent ones for these transfers. And that's exactly the problem scenario, since it uses the "automagic resubmit" model with interrupt transfers. Each of the HCDs needs to handle that differently -- different behaviors, different bugs, yeech. I won't go into it now, but a queued transfer model would work a lot better. > We have two problems we want to solve, the DMA alignment stuff and > using consistent memory for these small buffers. Therefore moving to > consistent memory (by whatever mechanism the USB desires to implement > this) is the way to go. Right, the alignment stuff is a correctness issue, the consistent memory issue is a performance concern. I like to think that 2.5 will have a lot less correctness issues in the USB stack, so it can start to pay more attention to performance concerns. - Dave ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-13 5:13 ` David Brownell @ 2002-06-13 9:38 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-13 9:38 UTC (permalink / raw) To: david-b; +Cc: roland, benh, linux-kernel From: David Brownell <david-b@pacbell.net> Date: Wed, 12 Jun 2002 22:13:17 -0700 David S. Miller wrote: > We have two problems we want to solve, the DMA alignment stuff and > using consistent memory for these small buffers. Therefore moving to > consistent memory (by whatever mechanism the USB desires to implement > this) is the way to go. Right, the alignment stuff is a correctness issue, the consistent memory issue is a performance concern. I like to think that 2.5 will have a lot less correctness issues in the USB stack, so it can start to pay more attention to performance concerns. I want to reemphasize that by going to consistent memory we solve both problems, in particular the alignment stuff becomes a non-issue. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 19:13 ` David Brownell 2002-06-12 19:58 ` Oliver Neukum 2002-06-12 22:46 ` David S. Miller @ 2002-06-12 22:50 ` David S. Miller 2002-06-13 0:23 ` [PATCH] 2.4 add __dma_buffer alignment macro Roland Dreier 3 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-12 22:50 UTC (permalink / raw) To: david-b; +Cc: roland, benh, linux-kernel From: David Brownell <david-b@pacbell.net> Date: Wed, 12 Jun 2002 12:13:08 -0700 > In general I certainly support > the idea of making the DMA mapping stuff device generic instead of > tied to PCI. PCI was a good place to start (focus ... :) but clearly it shouldn't be the only bus architecture with such support. Note that there are actually two related approaches in the DMA-mapping.txt APIs ... one is DMA mapping, the other is "consistent memory". Both should be made generic rather than PCI-specific, not just mapping APIs. I tried to imply this, if I didn't it is my error. The intention is that every interface in DMA-mapping.txt will have a dev_* counterpart when we move away from pci_dev to the generic device struct. One idea I want people to avoid is that we end up trying to do DMA operations from a USB generic device struct. This will lead to multiple levels of indirection to do the PCI dma operation (USB device --> USB bus --> PCI bus --> PCI bus DMA ops) and that will be ugly as hell. Based on the discussion, I think the answer for now is to go with the (b) variant you had originally started with, using kmalloc for the buffers. The __dma_buffer style macro didn't seem popular; though I agree that it's not clear kmalloc() really solves it today. (Given DaveM's SPARC example, the minimum size value it returns would need to be 128 bytes ... which clearly isn't so.) Let's ignore the sparc64 issue for now. It's really a red herring because as Pavel mentioned I could just use the largest size. In actuality the "sparc64 issue" was partially a straw-man. Sparc64 has none of the DMA alignment issues as the PCI controller provides coherency of partial cacheline transfers, we just need to flush pending writes and prefetched reads out of the PCI controllers around non-consistent DMA transfers. Franks a lot, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 102+ messages in thread
* [PATCH] 2.4 add __dma_buffer alignment macro 2002-06-12 19:13 ` David Brownell ` (2 preceding siblings ...) 2002-06-12 22:50 ` David S. Miller @ 2002-06-13 0:23 ` Roland Dreier 2002-06-13 1:07 ` David S. Miller 3 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-13 0:23 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: David Brownell, Oliver Neukum, linux-kernel Following up on the "PCI DMA to small buffers on cache-incoherent arch" discussion, here is a patch that adds <asm/dma_buffer.h>, which contains a __dma_buffer alignment macro for DMA into buffers in the middle of structures. I only implemented for i386 and ppc, I don't know enough about other architectures to get them right. Thanks, Roland Documentation/DMA-mapping.txt | 27 +++++++++++++++++++++++++++ include/asm-alpha/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-arm/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-cris/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-i386/dma_buffer.h | 23 +++++++++++++++++++++++ include/asm-ia64/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-m68k/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-mips/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-mips64/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-parisc/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-ppc/dma_buffer.h | 38 ++++++++++++++++++++++++++++++++++++++ include/asm-s390/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-s390x/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-sh/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-sparc/dma_buffer.h | 22 ++++++++++++++++++++++ include/asm-sparc64/dma_buffer.h | 22 ++++++++++++++++++++++ 16 files changed, 374 insertions(+) diff -Naur linux-2.4.18.orig/Documentation/DMA-mapping.txt linux-2.4.18/Documentation/DMA-mapping.txt --- linux-2.4.18.orig/Documentation/DMA-mapping.txt Mon Feb 25 11:37:51 2002 +++ linux-2.4.18/Documentation/DMA-mapping.txt Wed Jun 12 13:38:30 2002 @@ -67,6 +67,33 @@ networking subsystems make sure that the buffers they use are valid for you to DMA from/to. +Note that on non-cache-coherent architectures, having a DMA buffer +that shares a cache line with other data can lead to memory +corruption. The __dma_buffer macro defined in <asm/dma_buffer.h> +exists to allow safe DMA buffers to be declared easily and portably as +part of larger structures without causing bloat on cache-coherent +architectures. Of course these structures must be contained in memory +that can be used for DMA as described above. + +To use __dma_buffer, just declare a struct like: + + struct mydevice { + int field1; + char buffer[BUFFER_SIZE] __dma_buffer; + int field2; + }; + +If this is used in code like: + + struct mydevice *dev; + dev = kmalloc(sizeof *dev, GFP_KERNEL); + +then dev->buffer will be safe for DMA on all architectures. On a +cache-coherent architecture the members of dev will be aligned exactly +as they would have been without __dma_buffer; on a non-cache-coherent +architecture buffer and field2 will be aligned so that buffer does not +share a cache line with any other data. + DMA addressing limitations Does your device have any DMA addressing limitations? For example, is diff -Naur linux-2.4.18.orig/include/asm-alpha/dma_buffer.h linux-2.4.18/include/asm-alpha/dma_buffer.h --- linux-2.4.18.orig/include/asm-alpha/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-alpha/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_ALPHA_DMA_BUFFER_H +#define _ASM_ALPHA_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for alpha + +#endif /* _ASM_ALPHA_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-arm/dma_buffer.h linux-2.4.18/include/asm-arm/dma_buffer.h --- linux-2.4.18.orig/include/asm-arm/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-arm/dma_buffer.h Wed Jun 12 13:45:45 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_ARM_DMA_BUFFER_H +#define _ASM_ARM_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for arm + +#endif /* _ASM_ARM_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-cris/dma_buffer.h linux-2.4.18/include/asm-cris/dma_buffer.h --- linux-2.4.18.orig/include/asm-cris/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-cris/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_CRIS_DMA_BUFFER_H +#define _ASM_CRIS_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for cris + +#endif /* _ASM_CRIS_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-i386/dma_buffer.h linux-2.4.18/include/asm-i386/dma_buffer.h --- linux-2.4.18.orig/include/asm-i386/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-i386/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,23 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_I386_DMA_BUFFER_H +#define _ASM_I386_DMA_BUFFER_H + +/* We have cache-coherent DMA so __dma_buffer is defined to be a NOP */ +#define __dma_buffer + +#endif /* _ASM_I386_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-ia64/dma_buffer.h linux-2.4.18/include/asm-ia64/dma_buffer.h --- linux-2.4.18.orig/include/asm-ia64/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-ia64/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_IA64_DMA_BUFFER_H +#define _ASM_IA64_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for ia64 + +#endif /* _ASM_IA64_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-m68k/dma_buffer.h linux-2.4.18/include/asm-m68k/dma_buffer.h --- linux-2.4.18.orig/include/asm-m68k/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-m68k/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_M68K_DMA_BUFFER_H +#define _ASM_M68K_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for m68k + +#endif /* _ASM_M68K_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-mips/dma_buffer.h linux-2.4.18/include/asm-mips/dma_buffer.h --- linux-2.4.18.orig/include/asm-mips/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-mips/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_MIPS_DMA_BUFFER_H +#define _ASM_MIPS_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for mips + +#endif /* _ASM_MIPS_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-mips64/dma_buffer.h linux-2.4.18/include/asm-mips64/dma_buffer.h --- linux-2.4.18.orig/include/asm-mips64/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-mips64/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_MIPS64_DMA_BUFFER_H +#define _ASM_MIPS64_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for mips64 + +#endif /* _ASM_MIPS64_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-parisc/dma_buffer.h linux-2.4.18/include/asm-parisc/dma_buffer.h --- linux-2.4.18.orig/include/asm-parisc/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-parisc/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_PARISC_DMA_BUFFER_H +#define _ASM_PARISC_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for parisc + +#endif /* _ASM_PARISC_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-ppc/dma_buffer.h linux-2.4.18/include/asm-ppc/dma_buffer.h --- linux-2.4.18.orig/include/asm-ppc/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-ppc/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,38 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. */ + +#ifdef __KERNEL__ +#ifndef _ASM_PPC_DMA_BUFFER_H +#define _ASM_PPC_DMA_BUFFER_H + +#include <linux/config.h> + +#ifdef CONFIG_NOT_COHERENT_CACHE + +#include <asm/cache.h> + +#define __dma_buffer __dma_buffer_line(__LINE__) +#define __dma_buffer_line(line) __dma_buffer_expand_line(line) +#define __dma_buffer_expand_line(line) \ + __attribute__ ((aligned(L1_CACHE_BYTES))); \ + char __dma_pad_ ## line [0] __attribute__ ((aligned(L1_CACHE_BYTES))) + +#else /* CONFIG_NOT_COHERENT_CACHE */ + +/* We have cache-coherent DMA so __dma_buffer is defined to be a NOP */ +#define __dma_buffer + +#endif /* CONFIG_NOT_COHERENT_CACHE */ + +#endif /* _ASM_PPC_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-s390/dma_buffer.h linux-2.4.18/include/asm-s390/dma_buffer.h --- linux-2.4.18.orig/include/asm-s390/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-s390/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_S390_DMA_BUFFER_H +#define _ASM_S390_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for s390 + +#endif /* _ASM_S390_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-s390x/dma_buffer.h linux-2.4.18/include/asm-s390x/dma_buffer.h --- linux-2.4.18.orig/include/asm-s390x/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-s390x/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_S390X_DMA_BUFFER_H +#define _ASM_S390X_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for s390x + +#endif /* _ASM_S390X_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-sh/dma_buffer.h linux-2.4.18/include/asm-sh/dma_buffer.h --- linux-2.4.18.orig/include/asm-sh/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-sh/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_SH_DMA_BUFFER_H +#define _ASM_SH_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for sh + +#endif /* _ASM_SH_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-sparc/dma_buffer.h linux-2.4.18/include/asm-sparc/dma_buffer.h --- linux-2.4.18.orig/include/asm-sparc/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-sparc/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_SPARC_DMA_BUFFER_H +#define _ASM_SPARC_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for sparc + +#endif /* _ASM_SPARC_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ diff -Naur linux-2.4.18.orig/include/asm-sparc64/dma_buffer.h linux-2.4.18/include/asm-sparc64/dma_buffer.h --- linux-2.4.18.orig/include/asm-sparc64/dma_buffer.h Wed Dec 31 16:00:00 1969 +++ linux-2.4.18/include/asm-sparc64/dma_buffer.h Wed Jun 12 13:38:30 2002 @@ -0,0 +1,22 @@ +/* + * __dma_buffer macro for safe DMA into struct members + * + * See "What memory is DMA'able?" in Documentation/DMA-mapping.txt + * for more information. + * + * Copyright (c) 2002 Roland Dreier <roland@digitalvampire.org> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifdef __KERNEL__ +#ifndef _ASM_SPARC64_DMA_BUFFER_H +#define _ASM_SPARC64_DMA_BUFFER_H + +#warning <asm/dma_buffer.h> is not implemented yet for sparc64 + +#endif /* _ASM_SPARC64_DMA_BUFFER_H */ +#endif /* __KERNEL__ */ ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH] 2.4 add __dma_buffer alignment macro 2002-06-13 0:23 ` [PATCH] 2.4 add __dma_buffer alignment macro Roland Dreier @ 2002-06-13 1:07 ` David S. Miller 2002-06-13 1:19 ` Roland Dreier 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-13 1:07 UTC (permalink / raw) To: roland; +Cc: marcelo, david-b, oliver, linux-kernel Just use asm/dma.h, no need to make a new file. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: [PATCH] 2.4 add __dma_buffer alignment macro 2002-06-13 1:07 ` David S. Miller @ 2002-06-13 1:19 ` Roland Dreier 0 siblings, 0 replies; 102+ messages in thread From: Roland Dreier @ 2002-06-13 1:19 UTC (permalink / raw) To: David S. Miller; +Cc: marcelo, david-b, oliver, linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: David> Just use asm/dma.h, no need to make a new file. I could do that, however my thinking was that if I added it to an existing file then it would add to the existing nested include bloat. For example <asm/dma.h> for i386 has 10 inline functions and includes <linux/config.h>, <linux/spinlock.h>, <asm/io.h> and <linux/delay.h>. Seems kind of excessive just to get one macro that ends up being the empty string, and I would prefer not to force people to include all that just to declare a structure. Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 3:25 ` David S. Miller 2002-06-11 17:33 ` Benjamin Herrenschmidt @ 2002-06-12 6:25 ` David Brownell 2002-06-12 6:24 ` David S. Miller 1 sibling, 1 reply; 102+ messages in thread From: David Brownell @ 2002-06-12 6:25 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel David S. Miller wrote: > From: David Brownell <david-b@pacbell.net> > > Should the dma mapping APIs try to detect the "DMA buffer starts in > middle of non-coherent cacheline" case, and fail? > > This brings back another issue, returning failure from pci_map_*() > and friends which currently cannot happen. An issue that'll get fixed, yes? :) I'd suspect ((dma_addr_t)0) would be a reasonable error return. At least some hardware treats such values like software would treat null pointers. No call syntax change necessary, which might be good or bad depending on how you feel tomorrow. > Remember please that specifically the DMA mapping APIs encourage use > of consistent memory for small data objects. ... > ... The non-consistent end of the APIs is > meant for long contiguous buffers, not small chunks. And in between, a nice huge grey area to play with and argue about! > This is one of the reasons I want to fix this by making people use > either consistent memory or PCI pools (which is consistent memory > too). For that model, I would prefer tools more like a kmalloc than the pci_pool, which is most like a kmem_cache_t. The particular objects in question are a bit small to use page-or-bigger allocators, too. The problem for APIs like USB is that they haven't yet exposed DMA addresses. Doing that, giving folk a choice from the "non-consistent end of the APIs", would be a big change. Oh no -- I just had an evil thought. Now that we have the device model code partially in place, shouldn't we have DMA memory calls talk in terms of "struct device" not "struct pci_device"? That'd be the way to have _one_ API for dma mapping (and consistent memory allocation), working for PCI, USB, and any other bus framework that comes along. Nah ... forget I said that. Too logical, it could never happen! ;) - Dave ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 6:25 ` PCI DMA to small buffers on cache-incoherent arch David Brownell @ 2002-06-12 6:24 ` David S. Miller 2002-06-12 7:06 ` David Brownell 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-12 6:24 UTC (permalink / raw) To: david-b; +Cc: linux-kernel From: David Brownell <david-b@pacbell.net> Date: Tue, 11 Jun 2002 23:25:09 -0700 I'd suspect ((dma_addr_t)0) would be a reasonable error return. At least some hardware treats such values like software would treat null pointers. No call syntax change necessary, which might be good or bad depending on how you feel tomorrow. 0 is a valid PCI dma address on many platforms. This is part of the problem. > Remember please that specifically the DMA mapping APIs encourage use > of consistent memory for small data objects. ... > ... The non-consistent end of the APIs is > meant for long contiguous buffers, not small chunks. And in between, a nice huge grey area to play with and argue about! Not gray area, fully intentional! From the beginning that was meant to be one of the distinctions between consistent and streaming DMA memory. For that model, I would prefer tools more like a kmalloc than the pci_pool, which is most like a kmem_cache_t. The particular objects in question are a bit small to use page-or-bigger allocators, too. Huh? The whole idea is that it is memory for PCI dma, it has to be PCI in nature. If you want a kmalloc'ish thing, simple use pci_alloc_consistent and carve up the pages you get internally. The problem for APIs like USB is that they haven't yet exposed DMA addresses. Doing that, giving folk a choice from the "non-consistent end of the APIs", would be a big change. But that is the direction I'd like things to go in. A lot of problems have arisen because the USB layer likes to internally let drivers do DMA on any gob of memory whatsoever. That has to stop, it really does. Oh no -- I just had an evil thought. Now that we have the device model code partially in place, shouldn't we have DMA memory calls talk in terms of "struct device" not "struct pci_device"? That'd be the way to have _one_ API for dma mapping (and consistent memory allocation), working for PCI, USB, and any other bus framework that comes along. Sure, that's the idea. Just change pci_alloc_consistent to dev_alloc_consistent whatever. It's all still the same problem though. The USB drivers have to stop DMA'ing to arbitrary little gobs of memory. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 6:24 ` David S. Miller @ 2002-06-12 7:06 ` David Brownell 2002-06-12 9:22 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: David Brownell @ 2002-06-12 7:06 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel David S. Miller wrote: > From: David Brownell <david-b@pacbell.net> > Date: Tue, 11 Jun 2002 23:25:09 -0700 > > I'd suspect ((dma_addr_t)0) would be a reasonable error return. > At least some hardware treats such values like software would > treat null pointers. No call syntax change necessary, which > might be good or bad depending on how you feel tomorrow. > > 0 is a valid PCI dma address on many platforms. This is part > of the problem. OK, so a new call syntax would be desirable. Or reserving page zero even on such platforms. > > Remember please that specifically the DMA mapping APIs encourage use > > of consistent memory for small data objects. ... > > ... The non-consistent end of the APIs is > > meant for long contiguous buffers, not small chunks. > > And in between, a nice huge grey area to play with and argue about! > > Not gray area, fully intentional! From the beginning that was meant > to be one of the distinctions between consistent and streaming DMA > memory. I realize that, but it's still a grey area ... meaning only that the specific choice, in a set of systems, is one of the design tradeoffs. Costs can vary a lot, but near either end the tradeoffs get clearer. > For that model, I would prefer tools more like a kmalloc than the > pci_pool, which is most like a kmem_cache_t. The particular objects > in question are a bit small to use page-or-bigger allocators, too. > > Huh? The whole idea is that it is memory for PCI dma, it has to be > PCI in nature. If you want a kmalloc'ish thing, simple use > pci_alloc_consistent and carve up the pages you get internally. And the reason that logic doesn't lead us to rip kmalloc() out of the kernel (in favor of the page allocator) is ... what? Every driver shouldn't _need_ to write yet another malloc() clone, that's all I'm saying; that'd be a waste. > The problem for APIs like USB is that they haven't yet exposed DMA > addresses. Doing that, giving folk a choice from the "non-consistent > end of the APIs", would be a big change. > > But that is the direction I'd like things to go in. A lot of problems > have arisen because the USB layer likes to internally let drivers do > DMA on any gob of memory whatsoever. That has to stop, it really does. I'll accept that for argument -- though I'll ask what other problems (than this shared-cacheline scenario) you're thinking of. This is the only one that I recall seeming like it might argue for an API change. (And USB is just one example, too. Oliver turned up similar cases in the SCSI layer, as I recall. Likely a lot of it can be traced down to widely-available hardware that wouldn't recognize a NUMA if it tripped on one.) > Oh no -- I just had an evil thought. Now that we have the device model > code partially in place, shouldn't we have DMA memory calls talk in > terms of "struct device" not "struct pci_device"? That'd be the way > to have _one_ API for dma mapping (and consistent memory allocation), > working for PCI, USB, and any other bus framework that comes along. > > Sure, that's the idea. Just change pci_alloc_consistent to > dev_alloc_consistent whatever. Change, and fix up that lack of error codes on the mapping calls, and maybe a few other things ... :) Certainly hanging some sort of memory allocator object off struct device should be pretty cheap. At least for USB, all devices on the same bus could share the same allocator. Not that I know PCI that well, but I suspect that could be true there too ... more sharing, less internal fragmentation, and more efficient use of various resources. - Dave ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 7:06 ` David Brownell @ 2002-06-12 9:22 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-12 9:22 UTC (permalink / raw) To: david-b; +Cc: linux-kernel From: David Brownell <david-b@pacbell.net> Date: Wed, 12 Jun 2002 00:06:39 -0700 David S. Miller wrote: > 0 is a valid PCI dma address on many platforms. This is part > of the problem. OK, so a new call syntax would be desirable. Or reserving page zero even on such platforms. Or, better yet, define a PCI_BAD_ADDR per-arch. > Huh? The whole idea is that it is memory for PCI dma, it has to be > PCI in nature. If you want a kmalloc'ish thing, simple use > pci_alloc_consistent and carve up the pages you get internally. And the reason that logic doesn't lead us to rip kmalloc() out of the kernel (in favor of the page allocator) is ... what? Every driver shouldn't _need_ to write yet another malloc() clone, that's all I'm saying; that'd be a waste. That is why we have the PCI pool thing, it's PCI alloc consistent + SLAB carving. Certainly hanging some sort of memory allocator object off struct device should be pretty cheap. At least for USB, all devices on the same bus could share the same allocator. Not that I know PCI that well, but I suspect that could be true there too ... more sharing, less internal fragmentation, and more efficient use of various resources. These instances we are talking about want to allocate multiple instance of the same object (thus the same size). They are not allocating things like variable length strings and whatnot. That is why I keep suggesting PCI pools, it's the perfect kind of setup (and the easiest to implement). A generic kmalloc() out of pages is more to implement, certainly. If you want to hide this behind a usb_dma_pool_create, usb_dma_pool_alloc etc. kind of interface that calls back into the host controller layer to get the pci_dev etc., that's fine. Now enough talking and someone implement this, talk is cheap. ^ permalink raw reply [flat|nested] 102+ messages in thread
* PCI DMA to small buffers on cache-incoherent arch
@ 2002-06-08 20:38 Roland Dreier
2002-06-08 13:58 ` Anton Blanchard
2002-06-08 23:03 ` David S. Miller
0 siblings, 2 replies; 102+ messages in thread
From: Roland Dreier @ 2002-06-08 20:38 UTC (permalink / raw)
To: linux-kernel
I recently fixed some problems in the 2.4 USB stack that caused
crashes on the PowerPC 440GP (and probably other cache-incoherent
architectures). However, some parts of my fix have raised some
questions on the linux-usb-devel and linuxppc-embedded mailing lists
and I would like to raise these issues here so that we can get a
definitive answer. I would especially appreciate comments from David
Miller and other PCI DMA experts.
The problem that caused crashes on cache-incoherent architectures (my
specific system uses a PPC 440GP but this should apply in general) was
the following. The USB stack was doing PCI DMA into buffers that were
allocated on the stack, which causes stack corruption: on the PPC
440GP, pci_map_single() with PCI_DMA_FROMDEVICE just invalidates the
cache for the region being mapped. However, if a buffer is smaller
than a cache line, then two bad things can happen.
First, there may be valid data outside the buffer but in the same
cache line that has not been flushed to main memory yet. In that case
when the cache is invalidated the new data is lost and any access to
that memory will get the old data from main memory. Second, access to
the cache line after the cache has been invalidated but before the DMA
has completed will pull the cache line back into processor cache and
the DMA buffer will have invalid data.
The solution to this was simply to use kmalloc() to allocate buffers
instead of using automatic variables. On the PPC 440GP this is
definitely safe because the 440GP's has 32 byte cache lines and
kmalloc() will never return a buffer that is smaller than 32 bytes or
not 32-byte aligned. However, this leads to my first question: is
this safe on all architectures? Could there be a cache-incoherent
architecture where kmalloc() returned a buffer smaller than a cache
line?
The second question is related to this. There are other parts of the
USB stack where structures are defined:
struct something {
/* ... some members ... */
char buffer[SMALLER_THAN_L1_CACHE_LINE_SIZE];
/* ... some more members ... */
};
Then a struct something is kmalloc()'ed and buffer is used to DMA
into. However, even though the overall structure is aligned on a
cache line boundary, buffer is not aligned. It seems to me that this
is not safe because access to members near buffer during a DMA could
cause corruption (as detailed above). In my patch I changed code like
this to look like
struct something {
/* ... some members ... */
char *buffer;
/* ... some more members ... */
};
and then kmalloc()'ed buffer separately when kmalloc()'ing a struct
something.
However, there is some question about whether changing these buffers
is really necessary. Code like the above doesn't cause immediately
obvious crashes the way buffers on the stack do (since we usually get
lucky and don't touch the members near buffer around the DMA access).
I felt that relying on this coincidence is not safe and should be
fixed at the same time.
DMA-mapping.txt says kmalloc()'ed memory is OK for DMAs and does not
mention cache alignment. So the question is: did I misunderstand the
cache coherency issues or is my patch correct? At a higher level: how
is this supposed to work? Should code doing DMA into a smallish
buffer do
buffer = kmalloc(max(BUFFER_SIZE, L1_CACHE_BYTES), GFP_XXX);
or is a kmalloc()'ed buffer always safe? Does the code doing DMA need
to worry about the cache alignment of its buffer?
In any case this probably needs to be documented better. Once I
understand the answer I'll write up a patch to DMA-mapping.txt so no
one has to rediscover all this.
Thanks,
Roland
^ permalink raw reply [flat|nested] 102+ messages in thread* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-08 20:38 Roland Dreier @ 2002-06-08 13:58 ` Anton Blanchard 2002-06-09 0:52 ` Roland Dreier 2002-06-08 23:03 ` David S. Miller 1 sibling, 1 reply; 102+ messages in thread From: Anton Blanchard @ 2002-06-08 13:58 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-kernel > The problem that caused crashes on cache-incoherent architectures (my > specific system uses a PPC 440GP but this should apply in general) was > the following. The USB stack was doing PCI DMA into buffers that were > allocated on the stack, which causes stack corruption: on the PPC > 440GP, pci_map_single() with PCI_DMA_FROMDEVICE just invalidates the > cache for the region being mapped. However, if a buffer is smaller > than a cache line, then two bad things can happen. Yes, DMAing to the stack is definitely a bug, thats mentioned in Documentation/DMA-mapping.txt. We used to vmalloc our kernel stacks on ppc64 and that picked up all sorts of DMA violations. I just checked 2.5 and noticed the scsi code is _still_ DMAing to the stack! Maybe it would be worth having a debug option for x86 where it uses vmalloc for kernel stack allocation :) Anyway attached is a patch from Todd Inglett that I updated for 2.5. Anton ===== drivers/scsi/scsi_scan.c 1.13 vs edited ===== --- 1.13/drivers/scsi/scsi_scan.c Fri May 31 11:17:30 2002 +++ edited/drivers/scsi/scsi_scan.c Sun Jun 9 07:28:25 2002 @@ -368,7 +368,6 @@ unsigned int dev; unsigned int lun; unsigned char *scsi_result; - unsigned char scsi_result0[256]; Scsi_Device *SDpnt; Scsi_Device *SDtail; @@ -390,9 +389,7 @@ */ scsi_initialize_queue(SDpnt, shpnt); SDpnt->request_queue.queuedata = (void *) SDpnt; - /* Make sure we have something that is valid for DMA purposes */ - scsi_result = ((!shpnt->unchecked_isa_dma) - ? &scsi_result0[0] : kmalloc(512, GFP_DMA)); + scsi_result = kmalloc(512, GFP_DMA); } if (scsi_result == NULL) { @@ -532,10 +529,9 @@ kfree((char *) SDpnt); } - /* If we allocated a buffer so we could do DMA, free it now */ - if (scsi_result != &scsi_result0[0] && scsi_result != NULL) { - kfree(scsi_result); - } { + kfree(scsi_result); + + { Scsi_Device *sdev; Scsi_Cmnd *scmd; ===== drivers/scsi/sr_ioctl.c 1.13 vs edited ===== --- 1.13/drivers/scsi/sr_ioctl.c Thu May 23 23:18:39 2002 +++ edited/drivers/scsi/sr_ioctl.c Sun Jun 9 07:30:15 2002 @@ -344,7 +344,12 @@ Scsi_CD *SCp = cdi->handle; u_char sr_cmd[10]; int result, target = minor(cdi->dev); - unsigned char buffer[32]; + unsigned char *buffer = kmalloc(512, GFP_DMA); + + if (buffer == NULL) { + printk("SCSI DMA pool exhausted."); + return -ENOMEM; + } memset(sr_cmd, 0, sizeof(sr_cmd)); @@ -417,6 +422,7 @@ return -EINVAL; } + kfree(buffer); #if 0 if (result) printk("DEBUG: sr_audio: result for ioctl %x: %x\n", cmd, result); ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-08 13:58 ` Anton Blanchard @ 2002-06-09 0:52 ` Roland Dreier 0 siblings, 0 replies; 102+ messages in thread From: Roland Dreier @ 2002-06-09 0:52 UTC (permalink / raw) To: Anton Blanchard; +Cc: linux-kernel >>>>> "Anton" == Anton Blanchard <anton@samba.org> writes: Anton> I just checked 2.5 and noticed the scsi code is _still_ Anton> DMAing to the stack! Maybe it would be worth having a debug Anton> option for x86 where it uses vmalloc for kernel stack Anton> allocation :) Good catches, just a few comments on the patch. Anton> Anyway attached is a patch from Todd Inglett that I updated Anton> for 2.5. ===== drivers/scsi/scsi_scan.c 1.13 vs edited ===== - /* Make sure we have something that is valid for DMA purposes */ - scsi_result = ((!shpnt->unchecked_isa_dma) - ? &scsi_result0[0] : kmalloc(512, GFP_DMA)); + scsi_result = kmalloc(512, GFP_DMA); I don't think you should always use GFP_DMA here... probably better to do: scsi_result = kmalloc(512, shpnt->unchecked_isa_dma ? GFP_DMA : GFP_ATOMIC); And similarly this: ===== drivers/scsi/sr_ioctl.c 1.13 vs edited ===== + unsigned char *buffer = kmalloc(512, GFP_DMA); should probably use GFP_KERNEL (or possibly a check of unchecked_isa_dma as above, though this doesn't seem to be needed based on the existing code). Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-08 20:38 Roland Dreier 2002-06-08 13:58 ` Anton Blanchard @ 2002-06-08 23:03 ` David S. Miller 2002-06-09 0:40 ` Roland Dreier 2002-06-09 1:30 ` Albert D. Cahalan 1 sibling, 2 replies; 102+ messages in thread From: David S. Miller @ 2002-06-08 23:03 UTC (permalink / raw) To: roland; +Cc: linux-kernel From: Roland Dreier <roland@topspin.com> Date: 08 Jun 2002 13:38:53 -0700 The USB stack was doing PCI DMA into buffers that were allocated on the stack This is illegal. which causes stack corruption: on the PPC 440GP, pci_map_single() with PCI_DMA_FROMDEVICE just invalidates the cache for the region being mapped. However, if a buffer is smaller than a cache line, then two bad things can happen. There is no allocation scheme legal for PCI DMA which gives you smaller than a cacheline of data, this includes SLAB. This is why stack buffers and the like are illegal for PCI DMA. The solution to this was simply to use kmalloc() to allocate buffers instead of using automatic variables. Right. However, this leads to my first question: is this safe on all architectures? It must be. If the architecture allows SLAB to give smaller than cacheline sized data, it must handle PCI DMA map/unmap flushing in an appropriate fashion (ie. handle sub-cacheline buffers). DMA-mapping.txt says kmalloc()'ed memory is OK for DMAs and does not mention cache alignment. It doesn't mention cache alignment because that is an implementation specific issue. The user of the interfaces need not be concerned about any of this. There need be no changes to the documentation. If you do as the documentation states (use kmalloc or get_free_page to get your buffers) then it will just work. Franks a lot, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-08 23:03 ` David S. Miller @ 2002-06-09 0:40 ` Roland Dreier 2002-06-09 0:53 ` David S. Miller 2002-06-09 1:30 ` Albert D. Cahalan 1 sibling, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-09 0:40 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: David> There is no allocation scheme legal for PCI DMA which gives David> you smaller than a cacheline of data, this includes SLAB. David> This is why stack buffers and the like are illegal for PCI David> DMA. David> If the architecture allows SLAB to give smaller than David> cacheline sized data, it must handle PCI DMA map/unmap David> flushing in an appropriate fashion (ie. handle David> sub-cacheline buffers). Thanks, that's great information. However, there is one question you didn't cover. How about using a sub-cache-line piece of a kmalloc()'ed buffer (eg the case I described of using one member from a struct as a DMA buffer)? As far as I can see there is no guarantee that this will always work. Do you agree that this should be treated as a bug and fixed when it is found? Or should we leave that usage unless it is observed causing problems (since we almost always get lucky and don't touch the rest of the cache line near the DMA)? Thanks, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 0:40 ` Roland Dreier @ 2002-06-09 0:53 ` David S. Miller 2002-06-09 1:26 ` Roland Dreier 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-09 0:53 UTC (permalink / raw) To: roland; +Cc: linux-kernel From: Roland Dreier <roland@topspin.com> Date: 08 Jun 2002 17:40:24 -0700 Or should we leave that usage unless it is observed causing problems (since we almost always get lucky and don't touch the rest of the cache line near the DMA)? I think passing in a 4 byte chunk and assuming the rest of the cacheline is unmodified is a valid expectation the more I think about it. This means what MIPS is doing is wrong. For partial cacheline bits it can't do the invalidate thing. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 0:53 ` David S. Miller @ 2002-06-09 1:26 ` Roland Dreier 2002-06-09 5:29 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-09 1:26 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: Roland> Or should we leave that usage unless it is observed Roland> causing problems (since we almost always get lucky and Roland> don't touch the rest of the cache line near the DMA)? David> I think passing in a 4 byte chunk and assuming the rest of David> the cacheline is unmodified is a valid expectation the more David> I think about it. Just to make sure I'm reading this correctly, you're saying that as long as a buffer is OK for DMA, it should be OK to use a sub-cache-line chunk as a DMA buffer via pci_map_single(), and accessing the rest of the cache line should be OK at any time before, during and after the DMA. David> This means what MIPS is doing is wrong. For partial David> cacheline bits it can't do the invalidate thing. If I understand you, this means non-cache-coherent PPC is wrong as well -- pci_map_single() goes through consistent_sync() and turns into: case PCI_DMA_FROMDEVICE: /* invalidate only */ invalidate_dcache_range(start, end); break; What alternate implementation are you proposing? Thanks, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 1:26 ` Roland Dreier @ 2002-06-09 5:29 ` David S. Miller 2002-06-09 6:16 ` Roland Dreier ` (2 more replies) 0 siblings, 3 replies; 102+ messages in thread From: David S. Miller @ 2002-06-09 5:29 UTC (permalink / raw) To: roland; +Cc: linux-kernel From: Roland Dreier <roland@topspin.com> Date: 08 Jun 2002 18:26:12 -0700 Just to make sure I'm reading this correctly, you're saying that as long as a buffer is OK for DMA, it should be OK to use a sub-cache-line chunk as a DMA buffer via pci_map_single(), and accessing the rest of the cache line should be OK at any time before, during and after the DMA. Yes. David> This means what MIPS is doing is wrong. For partial David> cacheline bits it can't do the invalidate thing. If I understand you, this means non-cache-coherent PPC is wrong as well -- pci_map_single() goes through consistent_sync() and turns into: case PCI_DMA_FROMDEVICE: /* invalidate only */ invalidate_dcache_range(start, end); break; What alternate implementation are you proposing? For non-cacheline aligned chunks in the range "start" to "end" you must perform a cache writeback and invalidate. To preserve the data outside of the DMA range. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 5:29 ` David S. Miller @ 2002-06-09 6:16 ` Roland Dreier 2002-06-10 16:03 ` Roland Dreier 2002-06-09 6:45 ` Oliver Neukum 2002-06-09 9:51 ` Paul Mackerras 2 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-09 6:16 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: Roland> Just to make sure I'm reading this correctly, you're Roland> saying that as long as a buffer is OK for DMA, it should Roland> be OK to use a sub-cache-line chunk as a DMA buffer via Roland> pci_map_single(), and accessing the rest of the cache line Roland> should be OK at any time before, during and after the DMA. David> Yes. Roland> What alternate implementation are you proposing? David> For non-cacheline aligned chunks in the range "start" to David> "end" you must perform a cache writeback and invalidate. To David> preserve the data outside of the DMA range. Doesn't this still have a problem if you touch data in the same cache line as the DMA buffer after the pci_map but before the DMA takes place? The CPU will pull the cache line back in and it might not see the data the DMA brought in. It seems to me that to be totally safe, pci_unmap would have to save the non-aligned part outside the buffer to temporary storage, do an invalidate, and then copy back the non-aligned part. In any case if this is what pci_map is supposed to do then we have to fix up several architectures at least... Thanks, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 6:16 ` Roland Dreier @ 2002-06-10 16:03 ` Roland Dreier 2002-06-11 14:04 ` David Woodhouse 0 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-10 16:03 UTC (permalink / raw) To: linux-kernel >>>>> "Roland" == Roland Dreier <roland@topspin.com> writes: David> For non-cacheline aligned chunks in the range "start" to David> "end" you must perform a cache writeback and invalidate. To David> preserve the data outside of the DMA range. Roland> Doesn't this still have a problem if you touch data in the Roland> same cache line as the DMA buffer after the pci_map but Roland> before the DMA takes place? The CPU will pull the cache Roland> line back in and it might not see the data the DMA brought Roland> in. Roland> It seems to me that to be totally safe, pci_unmap would Roland> have to save the non-aligned part outside the buffer to Roland> temporary storage, do an invalidate, and then copy back Roland> the non-aligned part. Replying to myself.... Anyway, I realized that even my idea above is wrong. I don't see _any_ safe way to share a cache line between a DMA buffer and other data. Access to the cache line might pull the cache line back in and write it back at any time, which could corrupt the DMA'ed data. I don't see a way to hide the existence of cache lines etc. from the driver. Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 16:03 ` Roland Dreier @ 2002-06-11 14:04 ` David Woodhouse 0 siblings, 0 replies; 102+ messages in thread From: David Woodhouse @ 2002-06-11 14:04 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-kernel roland@topspin.com said: > Replying to myself.... Anyway, I realized that even my idea above is > wrong. I don't see _any_ safe way to share a cache line between a DMA > buffer and other data. Access to the cache line might pull the cache > line back in and write it back at any time, which could corrupt the > DMA'ed data. I don't see a way to hide the existence of cache lines > etc. from the driver. Access to a cache line can't be shared safely. Disable the cache for that page while its ownership is in doubt and you can happily share the RAM though. If we find another CPU which fetches cache lines preemptively, we may have to do that anyway. -- dwmw2 ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 5:29 ` David S. Miller 2002-06-09 6:16 ` Roland Dreier @ 2002-06-09 6:45 ` Oliver Neukum 2002-06-10 4:24 ` David S. Miller 2002-06-09 9:51 ` Paul Mackerras 2 siblings, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-09 6:45 UTC (permalink / raw) To: David S. Miller, roland; +Cc: linux-kernel Am Sonntag, 9. Juni 2002 07:29 schrieb David S. Miller: > From: Roland Dreier <roland@topspin.com> > Date: 08 Jun 2002 18:26:12 -0700 > > Just to make sure I'm reading this correctly, you're saying that as > long as a buffer is OK for DMA, it should be OK to use a > sub-cache-line chunk as a DMA buffer via pci_map_single(), and > accessing the rest of the cache line should be OK at any time before, > during and after the DMA. > > Yes. Does this mean that this piece of memory does have to be declared uncacheable until DMA is finished ? How else do you solve th problem of validity during DMA and especially after DMA ? Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 6:45 ` Oliver Neukum @ 2002-06-10 4:24 ` David S. Miller 2002-06-10 10:00 ` Oliver Neukum 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-10 4:24 UTC (permalink / raw) To: Oliver.Neukum; +Cc: roland, linux-kernel From: Oliver Neukum <Oliver.Neukum@lrz.uni-muenchen.de> Date: Sun, 9 Jun 2002 08:45:49 +0200 Am Sonntag, 9. Juni 2002 07:29 schrieb David S. Miller: > From: Roland Dreier <roland@topspin.com> > Date: 08 Jun 2002 18:26:12 -0700 > > Just to make sure I'm reading this correctly, you're saying that as > long as a buffer is OK for DMA, it should be OK to use a > sub-cache-line chunk as a DMA buffer via pci_map_single(), and > accessing the rest of the cache line should be OK at any time before, > during and after the DMA. > > Yes. Does this mean that this piece of memory does have to be declared uncacheable until DMA is finished ? How else do you solve th problem of validity during DMA and especially after DMA ? You flush either before/after depending upon whether the cpu caches are writeback in nature or not, and the cpu is not allowed to touch those addresses while the device is doing the DMA. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 4:24 ` David S. Miller @ 2002-06-10 10:00 ` Oliver Neukum 2002-06-10 10:24 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-10 10:00 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel > Does this mean that this piece of memory does have to be declared > uncacheable until DMA is finished ? > How else do you solve th problem of validity during DMA and > especially after DMA ? > > You flush either before/after depending upon whether the cpu caches > are writeback in nature or not, and the cpu is not allowed to touch > those addresses while the device is doing the DMA. So we need some kind of cache_aligned macro in our USB data structures if they contain a buffer. Which macro would we have to use ? Is there a macro conditional to incoherent architectures so we don't have to waste RAM needlessly ? Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 10:00 ` Oliver Neukum @ 2002-06-10 10:24 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-10 10:24 UTC (permalink / raw) To: oliver; +Cc: linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Mon, 10 Jun 2002 12:00:14 +0200 > You flush either before/after depending upon whether the cpu caches > are writeback in nature or not, and the cpu is not allowed to touch > those addresses while the device is doing the DMA. So we need some kind of cache_aligned macro in our USB data structures if they contain a buffer. Which macro would we have to use ? For now use SMP_CACHE_BYTES I guess. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 5:29 ` David S. Miller 2002-06-09 6:16 ` Roland Dreier 2002-06-09 6:45 ` Oliver Neukum @ 2002-06-09 9:51 ` Paul Mackerras 2002-06-09 10:54 ` Benjamin Herrenschmidt 2002-06-10 4:27 ` David S. Miller 2 siblings, 2 replies; 102+ messages in thread From: Paul Mackerras @ 2002-06-09 9:51 UTC (permalink / raw) To: David S. Miller; +Cc: roland, linux-kernel David S. Miller writes: > From: Roland Dreier <roland@topspin.com> > Date: 08 Jun 2002 18:26:12 -0700 > > Just to make sure I'm reading this correctly, you're saying that as > long as a buffer is OK for DMA, it should be OK to use a > sub-cache-line chunk as a DMA buffer via pci_map_single(), and > accessing the rest of the cache line should be OK at any time before, > during and after the DMA. > > Yes. No. Not on processors where the cache doesn't snoop DMA accesses to memory. > What alternate implementation are you proposing? > > For non-cacheline aligned chunks in the range "start" to "end" you > must perform a cache writeback and invalidate. To preserve the data > outside of the DMA range. This is the problem scenario. Suppose we are doing DMA to a buffer B and also independently accessing a variable X which is not part of B. Suppose that X and the beginning of B are both in cache line C. CPU I/O device 1. write back & invalidate cache line(s) containing B 2. start DMA 3. read X: cache line C now present in cache 4. DMA write to B: cache line C updated in memory 5. write X: cache line C now dirty in cache 6. Signal DMA completion Now at this point the driver calls pci_unmap_single or whatever. What is pci_unmap_single to do? If it does nothing, or does a writeback, we lose the DMA data. If it does an invalidate we lose the value written to X. Clearly, neither is correct. The bottom line is that we can only have one writer to any given cache line at a time. If a buffer is being used for DMA from a device to memory, the cpu MUST NOT write to any cache line that overlaps the buffer. Fundamentally, this is because the hardware lacks the ability to transfer the "ownership" of the cache line between the cpu and the DMA device on demand, and so we must manage that in software. The only safe way to allow the cpu to write to the cache line during a DMA transfer is active is to pause the DMA, invalidate the cache line, do the write, write back and invalidate the cache line, and restart the DMA. Paul. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 9:51 ` Paul Mackerras @ 2002-06-09 10:54 ` Benjamin Herrenschmidt 2002-06-10 4:27 ` David S. Miller 1 sibling, 0 replies; 102+ messages in thread From: Benjamin Herrenschmidt @ 2002-06-09 10:54 UTC (permalink / raw) To: paulus, David S. Miller; +Cc: roland, linux-kernel > >Now at this point the driver calls pci_unmap_single or whatever. What >is pci_unmap_single to do? If it does nothing, or does a writeback, >we lose the DMA data. If it does an invalidate we lose the value >written to X. Clearly, neither is correct. > >The bottom line is that we can only have one writer to any given cache >line at a time. If a buffer is being used for DMA from a device to >memory, the cpu MUST NOT write to any cache line that overlaps the >buffer. Note that this typically happen with network drivers, some infos in the skbuf getting lost when they happen to share a cache line with the data portion of the skbuf. When writing a driver specific to a non-coherent CPU, it can be worked around by reserving enough space, but "generic" PCI drivers are still affected. On those architectures, the core skbuf alloc routines should probably make sure the data portion don't share a cache line with other informations. Ben. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 9:51 ` Paul Mackerras 2002-06-09 10:54 ` Benjamin Herrenschmidt @ 2002-06-10 4:27 ` David S. Miller 2002-06-10 15:59 ` Roland Dreier 2002-06-10 18:07 ` William Jhun 1 sibling, 2 replies; 102+ messages in thread From: David S. Miller @ 2002-06-10 4:27 UTC (permalink / raw) To: paulus; +Cc: roland, linux-kernel From: Paul Mackerras <paulus@samba.org> Date: Sun, 9 Jun 2002 19:51:58 +1000 (EST) This is the problem scenario. Suppose we are doing DMA to a buffer B and also independently accessing a variable X which is not part of B. Suppose that X and the beginning of B are both in cache line C. I see what the problem is. Hmmm... I'm trying to specify this such that knowledge of cachelines and whatnot don't escape the arch specific code, ho hum... Looks like that isn't possible. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 4:27 ` David S. Miller @ 2002-06-10 15:59 ` Roland Dreier 2002-06-10 17:03 ` Tom Rini 2002-06-10 18:07 ` William Jhun 1 sibling, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-10 15:59 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: Paul> This is the problem scenario. Suppose we are doing DMA to a Paul> buffer B and also independently accessing a variable X which Paul> is not part of B. Suppose that X and the beginning of B are Paul> both in cache line C. David> I see what the problem is. Hmmm... David> I'm trying to specify this such that knowledge of David> cachelines and whatnot don't escape the arch specific code, David> ho hum... Looks like that isn't possible. So is the consensus now that in general drivers should make sure any buffers passed to pci_map/unmap are aligned to SMP_CACHE_BYTES (and a multiple of SMP_CACHE_BYTES in size)? In other words if a driver uses an unaligned buffer it should be fixed unless we can prove (and document in a comment :) that it won't cause problems on an arch without cache coherency and with a writeback cache. Thanks, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 15:59 ` Roland Dreier @ 2002-06-10 17:03 ` Tom Rini 2002-06-10 17:22 ` Oliver Neukum 2002-06-10 17:28 ` Roland Dreier 0 siblings, 2 replies; 102+ messages in thread From: Tom Rini @ 2002-06-10 17:03 UTC (permalink / raw) To: Roland Dreier; +Cc: David S. Miller, linux-kernel On Mon, Jun 10, 2002 at 08:59:48AM -0700, Roland Dreier wrote: > >>>>> "David" == David S Miller <davem@redhat.com> writes: > > Paul> This is the problem scenario. Suppose we are doing DMA to a > Paul> buffer B and also independently accessing a variable X which > Paul> is not part of B. Suppose that X and the beginning of B are > Paul> both in cache line C. > > David> I see what the problem is. Hmmm... > > David> I'm trying to specify this such that knowledge of > David> cachelines and whatnot don't escape the arch specific code, > David> ho hum... Looks like that isn't possible. > > So is the consensus now that in general drivers should make sure any > buffers passed to pci_map/unmap are aligned to SMP_CACHE_BYTES (and a > multiple of SMP_CACHE_BYTES in size)? In other words if a driver uses > an unaligned buffer it should be fixed unless we can prove (and > document in a comment :) that it won't cause problems on an arch > without cache coherency and with a writeback cache. And how about we don't call it SMP_CACHE_BYTES too? The processors where this matters certainly aren't doing SMP... -- Tom Rini (TR1265) http://gate.crashing.org/~trini/ ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 17:03 ` Tom Rini @ 2002-06-10 17:22 ` Oliver Neukum 2002-06-10 17:29 ` Tom Rini 2002-06-10 17:57 ` Russell King 2002-06-10 17:28 ` Roland Dreier 1 sibling, 2 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-10 17:22 UTC (permalink / raw) To: Tom Rini, Roland Dreier; +Cc: David S. Miller, linux-kernel > > So is the consensus now that in general drivers should make sure any > > buffers passed to pci_map/unmap are aligned to SMP_CACHE_BYTES (and a > > multiple of SMP_CACHE_BYTES in size)? In other words if a driver uses > > an unaligned buffer it should be fixed unless we can prove (and > > document in a comment :) that it won't cause problems on an arch > > without cache coherency and with a writeback cache. > > And how about we don't call it SMP_CACHE_BYTES too? The processors > where this matters certainly aren't doing SMP... Definitely we should call it something different so we can limit it to architectures that need it. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 17:22 ` Oliver Neukum @ 2002-06-10 17:29 ` Tom Rini 2002-06-10 17:39 ` Oliver Neukum 2002-06-10 19:03 ` Roland Dreier 2002-06-10 17:57 ` Russell King 1 sibling, 2 replies; 102+ messages in thread From: Tom Rini @ 2002-06-10 17:29 UTC (permalink / raw) To: Oliver Neukum; +Cc: Roland Dreier, David S. Miller, linux-kernel On Mon, Jun 10, 2002 at 07:22:26PM +0200, Oliver Neukum wrote: > > > > So is the consensus now that in general drivers should make sure any > > > buffers passed to pci_map/unmap are aligned to SMP_CACHE_BYTES (and a > > > multiple of SMP_CACHE_BYTES in size)? In other words if a driver uses > > > an unaligned buffer it should be fixed unless we can prove (and > > > document in a comment :) that it won't cause problems on an arch > > > without cache coherency and with a writeback cache. > > > > And how about we don't call it SMP_CACHE_BYTES too? The processors > > where this matters certainly aren't doing SMP... > > Definitely we should call it something different so we can limit it to > architectures that need it. No. We should just make it come out to a nop for arches that don't need it. Otherwise we'll end up with ugly things like: #ifdef CONFIG_NOT_CACHE_COHERENT ... #else ... #endif All over things like USB... -- Tom Rini (TR1265) http://gate.crashing.org/~trini/ ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 17:29 ` Tom Rini @ 2002-06-10 17:39 ` Oliver Neukum 2002-06-10 19:03 ` Roland Dreier 1 sibling, 0 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-10 17:39 UTC (permalink / raw) To: Tom Rini; +Cc: Roland Dreier, David S. Miller, linux-kernel Am Montag, 10. Juni 2002 19:29 schrieb Tom Rini: > On Mon, Jun 10, 2002 at 07:22:26PM +0200, Oliver Neukum wrote: > > > > So is the consensus now that in general drivers should make sure > > > > any buffers passed to pci_map/unmap are aligned to SMP_CACHE_BYTES > > > > (and a multiple of SMP_CACHE_BYTES in size)? In other words if a > > > > driver uses an unaligned buffer it should be fixed unless we can > > > > prove (and document in a comment :) that it won't cause problems > > > > on an arch without cache coherency and with a writeback cache. > > > > > > And how about we don't call it SMP_CACHE_BYTES too? The processors > > > where this matters certainly aren't doing SMP... > > > > Definitely we should call it something different so we can limit it to > > architectures that need it. > > No. We should just make it come out to a nop for arches that don't need > it. Otherwise we'll end up with ugly things like: > #ifdef CONFIG_NOT_CACHE_COHERENT > ... > #else > ... > #endif > > All over things like USB... You are right. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 17:29 ` Tom Rini 2002-06-10 17:39 ` Oliver Neukum @ 2002-06-10 19:03 ` Roland Dreier 2002-06-10 19:14 ` Tom Rini 1 sibling, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-10 19:03 UTC (permalink / raw) To: Tom Rini; +Cc: Oliver Neukum, David S. Miller, linux-kernel >>>>> "Tom" == Tom Rini <trini@kernel.crashing.org> writes: Tom> No. We should just make it come out to a nop for arches that Tom> don't need it. Otherwise we'll end up with ugly things like: Tom> #ifdef CONFIG_NOT_CACHE_COHERENT ... #else ... #endif Tom> All over things like USB... Good point. How about the following: add a file to each arch named say, <asm/dma_buffer.h>, that defines a macro __dma_buffer. This macro would be used as follows to mark DMA buffers (example taken from <linux/usb.h>): struct usb_device { /* ... stuff deleted ... */ struct usb_bus *bus; /* Bus we're part of */ struct usb_device_descriptor descriptor __dma_buffer; /* Descriptor */ struct usb_config_descriptor *config; /* All of the configs */ /* ... more stuff deleted ... */ }; Then cache-coherent architectures like i386 can just do #define __dma_buffer while PPC can do #ifdef CONFIG_NOT_CACHE_COHERENT #define __dma_buffer __dma_buffer_line(__LINE__) #define __dma_buffer_line(line) __dma_buffer_expand_line(line) #define __dma_buffer_expand_line(line) \ __attribute__ ((aligned(SMP_CACHE_BYTES))); \ char __dma_pad_ ## line [0] __attribute__ ((aligned(SMP_CACHE_BYTES))) #else /* CONFIG_NOT_CACHE_COHERENT */ #define __dma_buffer #endif /* CONFIG_NOT_CACHE_COHERENT */ C purists will point out that this is not guaranteed to work since the compiler can reorder structure members. However I'm sure that there are many, many other places in the kernel where we are counting on gcc not to reorder structures. Comments? Thanks, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 19:03 ` Roland Dreier @ 2002-06-10 19:14 ` Tom Rini 2002-06-10 19:21 ` Roland Dreier 0 siblings, 1 reply; 102+ messages in thread From: Tom Rini @ 2002-06-10 19:14 UTC (permalink / raw) To: Roland Dreier; +Cc: Oliver Neukum, David S. Miller, linux-kernel On Mon, Jun 10, 2002 at 12:03:19PM -0700, Roland Dreier wrote: > >>>>> "Tom" == Tom Rini <trini@kernel.crashing.org> writes: > > Tom> No. We should just make it come out to a nop for arches that > Tom> don't need it. Otherwise we'll end up with ugly things like: > Tom> #ifdef CONFIG_NOT_CACHE_COHERENT ... #else ... #endif > Tom> All over things like USB... > > Good point. How about the following: add a file to each arch named > say, <asm/dma_buffer.h>, that defines a macro __dma_buffer. This > macro would be used as follows to mark DMA buffers (example taken from > <linux/usb.h>): [snip] > Comments? SMP_CACHE_BYTES is non-sensical on 4xx (and 8xx) since they don't do SMP.. -- Tom Rini (TR1265) http://gate.crashing.org/~trini/ ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 19:14 ` Tom Rini @ 2002-06-10 19:21 ` Roland Dreier 2002-06-10 19:26 ` Tom Rini 0 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-10 19:21 UTC (permalink / raw) To: Tom Rini; +Cc: Oliver Neukum, linux-kernel >>>>> "Tom" == Tom Rini <trini@kernel.crashing.org> writes: Tom> SMP_CACHE_BYTES is non-sensical on 4xx (and 8xx) since they Tom> don't do SMP.. True but <asm/cache.h> defines it anyway... of course it would be no problem to use L1_CACHE_BYTES and in fact that probably makes sense because we're talking about PPC-only macros (other arches would have their own definition). Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 19:21 ` Roland Dreier @ 2002-06-10 19:26 ` Tom Rini 0 siblings, 0 replies; 102+ messages in thread From: Tom Rini @ 2002-06-10 19:26 UTC (permalink / raw) To: Roland Dreier; +Cc: Oliver Neukum, linux-kernel On Mon, Jun 10, 2002 at 12:21:44PM -0700, Roland Dreier wrote: > >>>>> "Tom" == Tom Rini <trini@kernel.crashing.org> writes: > > Tom> SMP_CACHE_BYTES is non-sensical on 4xx (and 8xx) since they > Tom> don't do SMP.. > > True but <asm/cache.h> defines it anyway... Right, to L1_CACHE_BYTES even. :) So we should probably use that directly. > of course it would be no > problem to use L1_CACHE_BYTES and in fact that probably makes sense > because we're talking about PPC-only macros (other arches would have > their own definition). Well, ARM (whom we (PPC)) borrowed some ideas from will set it to L1_CACHE_BYTES too, from the sound of rmk. So it might even be consistent among the non coherent processors. :) -- Tom Rini (TR1265) http://gate.crashing.org/~trini/ ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 17:22 ` Oliver Neukum 2002-06-10 17:29 ` Tom Rini @ 2002-06-10 17:57 ` Russell King 1 sibling, 0 replies; 102+ messages in thread From: Russell King @ 2002-06-10 17:57 UTC (permalink / raw) To: Oliver Neukum; +Cc: Tom Rini, Roland Dreier, David S. Miller, linux-kernel On Mon, Jun 10, 2002 at 07:22:26PM +0200, Oliver Neukum wrote: > Definitely we should call it something different so we can limit it to > architectures that need it. DMA_ALIGN, DMA_ALIGN_BYTES or DMA_CACHE_BYTES. Hmm, thinking about this some more (and just rambling). On some ARMs (maybe other CPUs as well) each cache line has two dirty bits - one for each half. If only half the cache line is dirty, it will only write out the dirty half when evicting it. In this case, we'd want: #define L1_CACHE_BYTES 32 #define DMA_CACHE_BYTES 16 which will probably work as long as we handle stuff carefully in the architecture specific layer. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 17:03 ` Tom Rini 2002-06-10 17:22 ` Oliver Neukum @ 2002-06-10 17:28 ` Roland Dreier 1 sibling, 0 replies; 102+ messages in thread From: Roland Dreier @ 2002-06-10 17:28 UTC (permalink / raw) To: Tom Rini; +Cc: David S. Miller, linux-kernel >>>>> "Tom" == Tom Rini <trini@kernel.crashing.org> writes: Roland> So is the consensus now that in general drivers should Roland> make sure any buffers passed to pci_map/unmap are aligned Roland> to SMP_CACHE_BYTES (and a multiple of SMP_CACHE_BYTES in Roland> size)? In other words if a driver uses an unaligned Roland> buffer it should be fixed unless we can prove (and Roland> document in a comment :) that it won't cause problems on Roland> an arch without cache coherency and with a writeback Roland> cache. Tom> And how about we don't call it SMP_CACHE_BYTES too? The Tom> processors where this matters certainly aren't doing SMP... Fair enough... there is of course L1_CACHE_BYTES but I'm not positive that's always the right thing. If we want to introduce a new constant then we will have to touch every arch (which is not necessarily a killer but it means "fixed" drivers won't compile for everyone until their arch is fixed). What would you propose? I don't have strong feelings about the exact form of a solution but I would like to decide something so we can have a standard way of fixing drivers that use unaligned DMA buffers (and convincing maintainers to apply the patches). Thanks, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 4:27 ` David S. Miller 2002-06-10 15:59 ` Roland Dreier @ 2002-06-10 18:07 ` William Jhun 2002-06-10 18:29 ` William Jhun ` (3 more replies) 1 sibling, 4 replies; 102+ messages in thread From: William Jhun @ 2002-06-10 18:07 UTC (permalink / raw) To: David S. Miller; +Cc: paulus, roland, linux-kernel On Sun, Jun 09, 2002 at 09:27:05PM -0700, David S. Miller wrote: > From: Paul Mackerras <paulus@samba.org> > Date: Sun, 9 Jun 2002 19:51:58 +1000 (EST) > > This is the problem scenario. Suppose we are doing DMA to a buffer B > and also independently accessing a variable X which is not part of B. > Suppose that X and the beginning of B are both in cache line C. > > I see what the problem is. Hmmm... > > I'm trying to specify this such that knowledge of cachelines and > whatnot don't escape the arch specific code, ho hum... Looks like > that isn't possible. Perhaps provide macros in asm/pci.h that will: - Take a buffer size and add an appropriate amount (one cache line for alignment and the remainder to fill out the last cache line) to be used for kmalloc(), etc, eg: #define DMA_SIZE_ROUNDUP(size) \ ((size + 2*SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1)) - Take a buffer address (as returned from kmalloc() with the modified size from above) and round it up to a cacheline boundary, eg: #define DMA_BUFFER_ALIGN(ptr) \ (((unsigned long)ptr + SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1)) These two, in conjunction, would provide a buffer that's aligned on a cacheline boundary and ends on a cacheline boundary. Kind of ugly, but would be sufficient and would hide the cacheline size specifics. Cache-coherent platforms would just returned the original argument. Thanks, Will ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 18:07 ` William Jhun @ 2002-06-10 18:29 ` William Jhun 2002-06-10 18:33 ` Mark Zealey ` (2 subsequent siblings) 3 siblings, 0 replies; 102+ messages in thread From: William Jhun @ 2002-06-10 18:29 UTC (permalink / raw) To: David S. Miller; +Cc: paulus, roland, linux-kernel On Mon, Jun 10, 2002 at 11:07:40AM -0700, William Jhun wrote: > Perhaps provide macros in asm/pci.h that will: > > - Take a buffer size and add an appropriate amount (one cache line for > alignment and the remainder to fill out the last cache line) to be > used for kmalloc(), etc, eg: Err, I should say any case where the buffer may not be cache-aligned (this would enable the use of DMA to stack...). For allocation routines that give a cache-aligned buffer, this is obviously not needed (but would just add one cacheline to the size). > > #define DMA_SIZE_ROUNDUP(size) \ > ((size + 2*SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1)) > ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 18:07 ` William Jhun 2002-06-10 18:29 ` William Jhun @ 2002-06-10 18:33 ` Mark Zealey 2002-06-10 18:44 ` Oliver Neukum 2002-06-11 3:10 ` David S. Miller 2002-06-12 11:47 ` David S. Miller 3 siblings, 1 reply; 102+ messages in thread From: Mark Zealey @ 2002-06-10 18:33 UTC (permalink / raw) To: linux-kernel On Mon, Jun 10, 2002 at 11:07:40AM -0700, William Jhun wrote: > On Sun, Jun 09, 2002 at 09:27:05PM -0700, David S. Miller wrote: > > From: Paul Mackerras <paulus@samba.org> > > Date: Sun, 9 Jun 2002 19:51:58 +1000 (EST) > > > > This is the problem scenario. Suppose we are doing DMA to a buffer B > > and also independently accessing a variable X which is not part of B. > > Suppose that X and the beginning of B are both in cache line C. > > > > I see what the problem is. Hmmm... > > > > I'm trying to specify this such that knowledge of cachelines and > > whatnot don't escape the arch specific code, ho hum... Looks like > > that isn't possible. > > Perhaps provide macros in asm/pci.h that will: > > - Take a buffer size and add an appropriate amount (one cache line for > alignment and the remainder to fill out the last cache line) to be > used for kmalloc(), etc, eg: > > #define DMA_SIZE_ROUNDUP(size) \ > ((size + 2*SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1)) > > - Take a buffer address (as returned from kmalloc() with the modified > size from above) and round it up to a cacheline boundary, eg: > > #define DMA_BUFFER_ALIGN(ptr) \ > (((unsigned long)ptr + SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1)) > > These two, in conjunction, would provide a buffer that's aligned on a > cacheline boundary and ends on a cacheline boundary. Kind of ugly, but > would be sufficient and would hide the cacheline size specifics. > Cache-coherent platforms would just returned the original argument. Why not just make some dmalloc() macro in pci.h which will do the nessecory magic resizing and alignments? seems a lot easier to do... -- Mark Zealey mark@zealos.org; mark@itsolve.co.uk ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 18:33 ` Mark Zealey @ 2002-06-10 18:44 ` Oliver Neukum 0 siblings, 0 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-10 18:44 UTC (permalink / raw) To: Mark Zealey, linux-kernel > > These two, in conjunction, would provide a buffer that's aligned on a > > cacheline boundary and ends on a cacheline boundary. Kind of ugly, but > > would be sufficient and would hide the cacheline size specifics. > > Cache-coherent platforms would just returned the original argument. > > Why not just make some dmalloc() macro in pci.h which will do the > nessecory magic resizing and alignments? seems a lot easier to do... That won't be enough. We need a solution for buffers that are parts of structures. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 18:07 ` William Jhun 2002-06-10 18:29 ` William Jhun 2002-06-10 18:33 ` Mark Zealey @ 2002-06-11 3:10 ` David S. Miller 2002-06-11 4:04 ` Roland Dreier 2002-06-12 11:47 ` David S. Miller 3 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 3:10 UTC (permalink / raw) To: wjhun; +Cc: paulus, roland, linux-kernel Wait a second, forget all of this cache alignment crap. If we can avoid drivers seeing it, we should by all means necessary. We should just tell people to use PCI pools and be done with it. That way all the complexity about buffer alignment and all this other crapola lives strictly inside of the PCI pool code. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 3:10 ` David S. Miller @ 2002-06-11 4:04 ` Roland Dreier 2002-06-11 4:16 ` Brad Hards 2002-06-11 4:21 ` David S. Miller 0 siblings, 2 replies; 102+ messages in thread From: Roland Dreier @ 2002-06-11 4:04 UTC (permalink / raw) To: David S. Miller; +Cc: wjhun, paulus, linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: David> Wait a second, forget all of this cache alignment crap. If David> we can avoid drivers seeing it, we should by all means David> necessary. David> We should just tell people to use PCI pools and be done David> with it. That way all the complexity about buffer David> alignment and all this other crapola lives strictly inside David> of the PCI pool code. That's fine but there are drivers (USB, etc) doing struct something { int field1; char dma_buffer[SMALLER_THAN_CACHE_LINE]; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); Do they have to change to struct something { int field1; char *dma_buffer; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); dev->dma_buffer = kmalloc(SMALLER_THAN_CACHE_LINE, GFP_KERNEL); (This is always safe because as you said kmalloc can never return a slab that's not safe for DMA) I don't see how PCI pools help here. Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:04 ` Roland Dreier @ 2002-06-11 4:16 ` Brad Hards 2002-06-11 4:24 ` Roland Dreier 2002-06-11 4:21 ` David S. Miller 1 sibling, 1 reply; 102+ messages in thread From: Brad Hards @ 2002-06-11 4:16 UTC (permalink / raw) To: Roland Dreier, David S. Miller; +Cc: linux-kernel On Tue, 11 Jun 2002 14:04, Roland Dreier wrote: > >>>>> "David" == David S Miller <davem@redhat.com> writes: > > David> Wait a second, forget all of this cache alignment crap. If > David> we can avoid drivers seeing it, we should by all means > David> necessary. > > David> We should just tell people to use PCI pools and be done > David> with it. That way all the complexity about buffer > David> alignment and all this other crapola lives strictly inside > David> of the PCI pool code. > > That's fine but there are drivers (USB, etc) doing > > struct something { > int field1; > char dma_buffer[SMALLER_THAN_CACHE_LINE]; > int field2; > }; > > struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); > > Do they have to change to > > struct something { > int field1; > char *dma_buffer; > int field2; > }; > > struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); > dev->dma_buffer = kmalloc(SMALLER_THAN_CACHE_LINE, GFP_KERNEL); > > (This is always safe because as you said kmalloc can never return a > slab that's not safe for DMA) I don't see how PCI pools help here. In the USB case, the "something" represents a device, and the "dma_buffer" is something you want to send-to / receive-from the device. kmallocing the transfer buffers is a problem for suspend. Doing the "you get the transfer buffer with the device" is really useful, because you know that the device configuration will be returned. But we might need to re-init the device after a suspend, and [I've been told that] memory allocation may deadlock under these circumstances. Would it be enought to move the transfer buffers to the start of the device struct, and then pad it up to a cacheline? -- http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:16 ` Brad Hards @ 2002-06-11 4:24 ` Roland Dreier 2002-06-11 4:24 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-11 4:24 UTC (permalink / raw) To: Brad Hards; +Cc: David S. Miller, linux-kernel >>>>> "Brad" == Brad Hards <bhards@bigpond.net.au> writes: Brad> Would it be enought to move the transfer buffers to the Brad> start of the device struct, and then pad it up to a Brad> cacheline? Something like the __dma_buffer macro I posted earlier makes it even simpler. I'll make a patch that adds <asm/dma_buffer.h> so we have something concrete to discuss. Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:24 ` Roland Dreier @ 2002-06-11 4:24 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-11 4:24 UTC (permalink / raw) To: roland; +Cc: bhards, linux-kernel From: Roland Dreier <roland@topspin.com> Date: 10 Jun 2002 21:24:35 -0700 Something like the __dma_buffer macro I posted earlier makes it even simpler. I'll make a patch that adds <asm/dma_buffer.h> so we have something concrete to discuss. Why don't you just allocate the "struct something" from PCI pools? If you don't have the pci_dev from that point, make some callback into the host adapter driver so you can get at it. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:04 ` Roland Dreier 2002-06-11 4:16 ` Brad Hards @ 2002-06-11 4:21 ` David S. Miller 2002-06-11 4:39 ` Roland Dreier 1 sibling, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 4:21 UTC (permalink / raw) To: roland; +Cc: wjhun, paulus, linux-kernel From: Roland Dreier <roland@topspin.com> Date: 10 Jun 2002 21:04:30 -0700 That's fine but there are drivers (USB, etc) doing struct something { int field1; char dma_buffer[SMALLER_THAN_CACHE_LINE]; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); Do they have to change to How about allocating struct something using pci_pool? ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:21 ` David S. Miller @ 2002-06-11 4:39 ` Roland Dreier 2002-06-11 4:38 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-11 4:39 UTC (permalink / raw) To: David S. Miller; +Cc: wjhun, paulus, linux-kernel struct something { int field1; char dma_buffer[SMALLER_THAN_CACHE_LINE]; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); David> How about allocating struct something using pci_pool? The problem is the driver can't safely touch field1 or field2 near the DMA (it might pull the cache line back in too soon, or dirty the cache line and have it written back on top of DMA'ed data) Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:39 ` Roland Dreier @ 2002-06-11 4:38 ` David S. Miller 2002-06-11 6:23 ` Oliver Neukum 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 4:38 UTC (permalink / raw) To: roland; +Cc: wjhun, paulus, linux-kernel From: Roland Dreier <roland@topspin.com> Date: 10 Jun 2002 21:39:27 -0700 David> How about allocating struct something using pci_pool? The problem is the driver can't safely touch field1 or field2 near the DMA (it might pull the cache line back in too soon, or dirty the cache line and have it written back on top of DMA'ed data) Oh duh, I see, then go to making the thing a pointer to the dma area. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 4:38 ` David S. Miller @ 2002-06-11 6:23 ` Oliver Neukum 2002-06-11 6:38 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-11 6:23 UTC (permalink / raw) To: David S. Miller, roland; +Cc: wjhun, paulus, linux-kernel Am Dienstag, 11. Juni 2002 06:38 schrieb David S. Miller: > From: Roland Dreier <roland@topspin.com> > Date: 10 Jun 2002 21:39:27 -0700 > > David> How about allocating struct something using pci_pool? > > The problem is the driver can't safely touch field1 or field2 near > the DMA (it might pull the cache line back in too soon, or dirty the > cache line and have it written back on top of DMA'ed data) > > Oh duh, I see, then go to making the thing a pointer to the > dma area. That would mean doing things like memory allocations for one single byte. Also it would affect things like the scsi layer (sense buffer in the structure). And it would be additional overhead for everybody for the benefit of a small minority. An alignment macro could be turned into a nop on some architectures. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 6:23 ` Oliver Neukum @ 2002-06-11 6:38 ` David S. Miller 2002-06-11 7:38 ` Oliver Neukum 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 6:38 UTC (permalink / raw) To: oliver; +Cc: roland, wjhun, paulus, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Tue, 11 Jun 2002 08:23:52 +0200 That would mean doing things like memory allocations for one single byte. Also it would affect things like the scsi layer (sense buffer in the structure). And it would be additional overhead for everybody for the benefit of a small minority. An alignment macro could be turned into a nop on some architectures. The problem is people are going to get it wrong if we expose all of this cacheline crap to the drivers. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 6:38 ` David S. Miller @ 2002-06-11 7:38 ` Oliver Neukum 2002-06-11 7:36 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-11 7:38 UTC (permalink / raw) To: David S. Miller; +Cc: roland, wjhun, paulus, linux-kernel Am Dienstag, 11. Juni 2002 08:38 schrieb David S. Miller: > From: Oliver Neukum <oliver@neukum.name> > Date: Tue, 11 Jun 2002 08:23:52 +0200 > > That would mean doing things like memory allocations for one single > byte. Also it would affect things like the scsi layer (sense buffer > in the structure). > And it would be additional overhead for everybody for the benefit > of a small minority. An alignment macro could be turned into a nop > on some architectures. > > The problem is people are going to get it wrong if we expose all of > this cacheline crap to the drivers. You don't have to fully expose it. We make a simple rule like: If you want to do DMA to a variable it needs "DMA_ALIGN after the name in the declaration". Architectures could define it, with generic nop. People will get it wrong by doing DMA to members of structures otherwise. There's no really painless way to solve this. You have to introduce a new rule anyway. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 7:38 ` Oliver Neukum @ 2002-06-11 7:36 ` David S. Miller 2002-06-11 7:43 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 7:36 UTC (permalink / raw) To: oliver; +Cc: roland, wjhun, paulus, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Tue, 11 Jun 2002 09:38:52 +0200 You don't have to fully expose it. We make a simple rule like: If you want to do DMA to a variable it needs "DMA_ALIGN after the name in the declaration". Architectures could define it, with generic nop. People will get it wrong by doing DMA to members of structures otherwise. There's no really painless way to solve this. You have to introduce a new rule anyway. The DMA_ALIGN attribute doesn't work, on some systems the PCI cacheline size is determined at boot time not compile time. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 7:36 ` David S. Miller @ 2002-06-11 7:43 ` David S. Miller 2002-06-11 8:07 ` Oliver Neukum ` (2 more replies) 0 siblings, 3 replies; 102+ messages in thread From: David S. Miller @ 2002-06-11 7:43 UTC (permalink / raw) To: oliver; +Cc: roland, wjhun, paulus, linux-kernel From: "David S. Miller" <davem@redhat.com> Date: Tue, 11 Jun 2002 00:36:25 -0700 (PDT) The DMA_ALIGN attribute doesn't work, on some systems the PCI cacheline size is determined at boot time not compile time. Another note, it could be per-PCI controller what this cacheline size is. We'll need to pass in a pdev to the alignment interfaces to do this correctly. So none of this can be done at compile time folks. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 7:43 ` David S. Miller @ 2002-06-11 8:07 ` Oliver Neukum 2002-06-11 8:15 ` David S. Miller 2002-06-11 15:57 ` William Jhun 2002-06-12 9:06 ` Pavel Machek 2 siblings, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-11 8:07 UTC (permalink / raw) To: David S. Miller; +Cc: roland, wjhun, paulus, linux-kernel Am Dienstag, 11. Juni 2002 09:43 schrieb David S. Miller: > From: "David S. Miller" <davem@redhat.com> > Date: Tue, 11 Jun 2002 00:36:25 -0700 (PDT) > > The DMA_ALIGN attribute doesn't work, on some systems the PCI > cacheline size is determined at boot time not compile time. > > Another note, it could be per-PCI controller what this cacheline size > is. We'll need to pass in a pdev to the alignment interfaces to > do this correctly. Could you please explain this ? I thought this was a problem of a CPU dirtying a cache line that overlaps with an area being DMAed into. So the determining factor should be the granularity of the dirty status of the CPU. Are there really PCI controllers which have to physically write much more than is transfered ? Now really puzzeled Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 8:07 ` Oliver Neukum @ 2002-06-11 8:15 ` David S. Miller 2002-06-11 12:06 ` Oliver Neukum 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-11 8:15 UTC (permalink / raw) To: oliver; +Cc: roland, wjhun, paulus, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Tue, 11 Jun 2002 10:07:19 +0200 Are there really PCI controllers which have to physically write much more than is transfered ? On sparc64 the cacheline size can be either 64 or 128 bytes. It's a bus characteristic, so we have to get at the PCI controller info. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 8:15 ` David S. Miller @ 2002-06-11 12:06 ` Oliver Neukum 2002-06-11 12:04 ` David S. Miller 0 siblings, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-11 12:06 UTC (permalink / raw) To: David S. Miller; +Cc: roland, wjhun, paulus, linux-kernel Am Dienstag, 11. Juni 2002 10:15 schrieb David S. Miller: > From: Oliver Neukum <oliver@neukum.name> > Date: Tue, 11 Jun 2002 10:07:19 +0200 > > Are there really PCI controllers which have to physically write > much more than is transfered ? > > On sparc64 the cacheline size can be either 64 or 128 bytes. > It's a bus characteristic, so we have to get at the PCI > controller info. A sparc64 is unlikely to be short on memory, or is it ? What's wrong with always aligning on 128 bytes on sparc64 ? A runtime check would be expensive. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 12:06 ` Oliver Neukum @ 2002-06-11 12:04 ` David S. Miller 2002-06-11 14:23 ` Oliver Neukum 2002-06-11 18:26 ` Roland Dreier 0 siblings, 2 replies; 102+ messages in thread From: David S. Miller @ 2002-06-11 12:04 UTC (permalink / raw) To: oliver; +Cc: roland, wjhun, paulus, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Tue, 11 Jun 2002 14:06:14 +0200 A sparc64 is unlikely to be short on memory, or is it ? What's wrong with always aligning on 128 bytes on sparc64 ? A runtime check would be expensive. Maybe on arch FOO, target X needs no alignment when using PCI controller Y, but for PCI controller Z it does need alignment. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 12:04 ` David S. Miller @ 2002-06-11 14:23 ` Oliver Neukum 2002-06-14 4:14 ` David S. Miller 2002-06-11 18:26 ` Roland Dreier 1 sibling, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-11 14:23 UTC (permalink / raw) To: David S. Miller; +Cc: roland, wjhun, paulus, linux-kernel Am Dienstag, 11. Juni 2002 14:04 schrieb David S. Miller: > From: Oliver Neukum <oliver@neukum.name> > Date: Tue, 11 Jun 2002 14:06:14 +0200 > > A sparc64 is unlikely to be short on memory, or is it ? > What's wrong with always aligning on 128 bytes on sparc64 ? > A runtime check would be expensive. > > Maybe on arch FOO, target X needs no alignment when using PCI > controller Y, but for PCI controller Z it does need alignment. Still does that justify the overhead and the complications ? Couldn't we provide for the worst case in a generic kernel and make it a compile time option ? If I understand you correctly, we even couldn't use kmalloc() for allocating the buffers. IMHO you cannot expose that to driver writers and hope to get a useful result. So what are the alternatives ? We could either use a bounce buffer or disable caching for the page in question, which has its own set of problems. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 14:23 ` Oliver Neukum @ 2002-06-14 4:14 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-14 4:14 UTC (permalink / raw) To: oliver; +Cc: roland, wjhun, paulus, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Tue, 11 Jun 2002 16:23:30 +0200 Still does that justify the overhead and the complications ? Couldn't we provide for the worst case in a generic kernel and make it a compile time option ? I already posted that using the worst case scenerio would be an acceptable solution to this problem, I consider it a closed issue and you can ignore my earlier gripes on this issue. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 12:04 ` David S. Miller 2002-06-11 14:23 ` Oliver Neukum @ 2002-06-11 18:26 ` Roland Dreier 2002-06-11 17:29 ` Benjamin Herrenschmidt 2002-06-11 23:00 ` Thunder from the hill 1 sibling, 2 replies; 102+ messages in thread From: Roland Dreier @ 2002-06-11 18:26 UTC (permalink / raw) To: David S. Miller; +Cc: oliver, wjhun, paulus, linux-kernel >>>>> "David" == David S Miller <davem@redhat.com> writes: David> Maybe on arch FOO, target X needs no alignment when using David> PCI controller Y, but for PCI controller Z it does need David> alignment. I'm not sure I understand this objection. kmalloc() is will return a buffer that is safe for DMA for all controllers. Any compile-time static alignment of structure members would similarly do worst-case alignment. Also I have to admit that I don't quite understand the issue you're raising here. Obviously the CPU's cache line size doesn't change based on the PCI controller. Are you referring to PCI device's CLS register? I don't see how that could matter since the bus master shouldn't ever write beyond the buffer the CPU gives it. In any case, given drivers that have: struct something { int field1; char dma_buffer[SMALLER_THAN_CACHELINE]; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); /* do DMA into dev->dma_buffer */ I know of several ways to make this safe: 1) Add a static alignment macro and change the declaration to struct something { int field1; char dma_buffer[SMALLER_THAN_CACHELINE] __dma_buffer; int field2; }; __dma_buffer would be a NOP on cache coherent architectures (i386, etc) but might introduce some alignment not strictly necessary on architectures (???) 2) Change the code to struct something { int field1; char *dma_buffer; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); dev->dma_buffer = kmalloc(SMALLER_THAN_CACHELINE, GFP_KERNEL); /* do DMA into dev->dma_buffer */ This always uses strictly more memory than 1) above, complicates the code, may introduce bugs (do we know if anyone did *dev_copy = *dev anywhere?). 3) Change the code to struct something { int field1; char *dma_buffer; int field2; }; struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); /* find pci_device */ dev->dma_buffer = aligned_pci_alloc(pci_device, SMALLER_THAN_CACHELINE); /* do DMA into dev->dma_buffer */ This may use less memory than 1) or 2) above on some architectures but will use more than 1) on cache-coherent architectures. It makes the code even more complex since now the code that allocates the dma buffer has to know which PCI device will use it (for example, in USB, the hub driver is separated from the HCD driver, which is who knows about the PCI bus). 4) David Woodhouse's suggestion of turning off caching for pages where unaligned DMA is in progress. This may be doable but seems quite complex. Also, drivers will probably some way of aligning their buffers to avoid turning of caching, which means that this approach and the above are complementary. Best, Roland ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 18:26 ` Roland Dreier @ 2002-06-11 17:29 ` Benjamin Herrenschmidt 2002-06-12 12:02 ` Oliver Neukum 2002-06-11 23:00 ` Thunder from the hill 1 sibling, 1 reply; 102+ messages in thread From: Benjamin Herrenschmidt @ 2002-06-11 17:29 UTC (permalink / raw) To: Roland Dreier, David S. Miller; +Cc: oliver, wjhun, paulus, linux-kernel >This may use less memory than 1) or 2) above on some architectures but >will use more than 1) on cache-coherent architectures. It makes the >code even more complex since now the code that allocates the dma >buffer has to know which PCI device will use it (for example, in USB, >the hub driver is separated from the HCD driver, which is who knows >about the PCI bus). What about an arch that can be both coherent or incoherent (the same kernel binary would boot both) ? I can also imagine quite a bunch of embedded stuffs where you may have both coherent and non-coherent devices depending on the bus the live on or on bridge bugs. For your example, I don't buy it. You could well design the USB urb allocation in such a way that they are passed down the controller of a given device. Ben. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 17:29 ` Benjamin Herrenschmidt @ 2002-06-12 12:02 ` Oliver Neukum 2002-06-11 20:06 ` Benjamin Herrenschmidt 2002-06-12 12:02 ` David S. Miller 0 siblings, 2 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-12 12:02 UTC (permalink / raw) To: Benjamin Herrenschmidt, Roland Dreier, David S. Miller Cc: wjhun, paulus, linux-kernel > For your example, I don't buy it. You could well design the USB urb > allocation in such a way that they are passed down the controller of > a given device. Urbs are not the problem. An urb abstractly speaking is just a description of io. It does not contain a buffer, just a pointer to it. However many drivers allocate some of these buffers together with their device descriptors, which would, if a special allocator must be used, become impossible. Usbcore could allocate bounce buffers, but performance would suck. If I understand both Davids correctly this is the solution. Buffers for dma must be allocated seperately using a special allocation function which is given the device so it can allocate correctly. David B wants a bus specific pointer to a function in the generic driver structure, right ? Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 12:02 ` Oliver Neukum @ 2002-06-11 20:06 ` Benjamin Herrenschmidt 2002-06-12 12:02 ` David S. Miller 1 sibling, 0 replies; 102+ messages in thread From: Benjamin Herrenschmidt @ 2002-06-11 20:06 UTC (permalink / raw) To: Oliver Neukum, Roland Dreier, David S. Miller; +Cc: wjhun, paulus, linux-kernel >If I understand both Davids correctly this is the solution. >Buffers for dma must be allocated seperately using a special allocation >function which is given the device so it can allocate correctly. >David B wants a bus specific pointer to a function in the generic >driver structure, right ? Then let's have those as part of the generic device struct, with the default ones pointing to the parent bus ones. That way, a couple of generic ones could be set at the root of the device tree for fully coherent or fully incoherent archs, and bus drivers would have the ability to affect their child devices ones. Ben. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 12:02 ` Oliver Neukum 2002-06-11 20:06 ` Benjamin Herrenschmidt @ 2002-06-12 12:02 ` David S. Miller 1 sibling, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-12 12:02 UTC (permalink / raw) To: oliver; +Cc: benh, roland, wjhun, paulus, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Wed, 12 Jun 2002 14:02:53 +0200 If I understand both Davids correctly this is the solution. Buffers for dma must be allocated seperately using a special allocation function which is given the device so it can allocate correctly. David B wants a bus specific pointer to a function in the generic driver structure, right ? Eventually it could be exactly that, via the generic driver struct. For now it can be a usb specific thing, and when the generic device stuff is ready we just go: #define usb_pool_alloc(...) dev_pool_alloc(...) without having to touch any of the driver call sites. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 18:26 ` Roland Dreier 2002-06-11 17:29 ` Benjamin Herrenschmidt @ 2002-06-11 23:00 ` Thunder from the hill 2002-06-11 23:56 ` Roland Dreier 1 sibling, 1 reply; 102+ messages in thread From: Thunder from the hill @ 2002-06-11 23:00 UTC (permalink / raw) To: Roland Dreier Cc: David S. Miller, oliver, wjhun, paulus, Linux Kernel Mailing List Hi, On 11 Jun 2002, Roland Dreier wrote: > 3) Change the code to > > struct something { > int field1; > char *dma_buffer; > int field2; > }; > > struct something *dev = kmalloc(sizeof *dev, GFP_KERNEL); > /* find pci_device */ > dev->dma_buffer = aligned_pci_alloc(pci_device, SMALLER_THAN_CACHELINE); > /* do DMA into dev->dma_buffer */ You introduce a possible null pointer dereference here, don't you? // Assume this fails: struct something *dev = kmalloc(sizeof(struct something), GFP_KERNEL); // dev is a NULL pointer now. dev->dma_buffer = aligned_pci_alloc(pci_device, SMALLER_THAN_CACHELINE); // Big bang Just wondering... Regards, Thunder -- German attitude becoming | Thunder from the hill at ngforever rightaway popular: | "Get outa my way, | free inhabitant not directly for I got a mobile phone!" | belonging anywhere ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 23:00 ` Thunder from the hill @ 2002-06-11 23:56 ` Roland Dreier 2002-06-12 0:09 ` Thunder from the hill 0 siblings, 1 reply; 102+ messages in thread From: Roland Dreier @ 2002-06-11 23:56 UTC (permalink / raw) To: Thunder from the hill Cc: David S. Miller, oliver, wjhun, paulus, Linux Kernel Mailing List >>>>> "Thunder" == Thunder from the hill <thunder@ngforever.de> writes: Thunder> You introduce a possible null pointer dereference here, Thunder> don't you? I left out the error checking for the allocations everywhere in my email. It wasn't real code and I thought that testing for NULL all over the place would obscure the real point. R. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 23:56 ` Roland Dreier @ 2002-06-12 0:09 ` Thunder from the hill 0 siblings, 0 replies; 102+ messages in thread From: Thunder from the hill @ 2002-06-12 0:09 UTC (permalink / raw) To: Roland Dreier Cc: Thunder from the hill, David S. Miller, oliver, wjhun, paulus, Linux Kernel Mailing List Hi, On 11 Jun 2002, Roland Dreier wrote: > I left out the error checking for the allocations everywhere in my > email. It wasn't real code and I thought that testing for NULL all > over the place would obscure the real point. Sure thing. It was just my newly introduced paranoia, I have some things like them to fix. Regards, Thunder -- German attitude becoming | Thunder from the hill at ngforever rightaway popular: | "Get outa my way, | free inhabitant not directly for I got a mobile phone!" | belonging anywhere ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 7:43 ` David S. Miller 2002-06-11 8:07 ` Oliver Neukum @ 2002-06-11 15:57 ` William Jhun 2002-06-12 9:06 ` Pavel Machek 2 siblings, 0 replies; 102+ messages in thread From: William Jhun @ 2002-06-11 15:57 UTC (permalink / raw) To: David S. Miller; +Cc: oliver, roland, paulus, linux-kernel On Tue, Jun 11, 2002 at 12:43:05AM -0700, David S. Miller wrote: > From: "David S. Miller" <davem@redhat.com> > Date: Tue, 11 Jun 2002 00:36:25 -0700 (PDT) > > The DMA_ALIGN attribute doesn't work, on some systems the PCI > cacheline size is determined at boot time not compile time. > > Another note, it could be per-PCI controller what this cacheline size > is. We'll need to pass in a pdev to the alignment interfaces to > do this correctly. > > So none of this can be done at compile time folks. Why is this a problem? So you just make a static inline that takes the pdev and does the Right Thing at runtime... It's implementation- dependent whether it's a compile-time macro or a more elaborate inline... ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-11 7:43 ` David S. Miller 2002-06-11 8:07 ` Oliver Neukum 2002-06-11 15:57 ` William Jhun @ 2002-06-12 9:06 ` Pavel Machek 2002-06-12 20:16 ` David S. Miller 2 siblings, 1 reply; 102+ messages in thread From: Pavel Machek @ 2002-06-12 9:06 UTC (permalink / raw) To: David S. Miller; +Cc: oliver, roland, wjhun, paulus, linux-kernel Hi! > The DMA_ALIGN attribute doesn't work, on some systems the PCI > cacheline size is determined at boot time not compile time. > > Another note, it could be per-PCI controller what this cacheline size > is. We'll need to pass in a pdev to the alignment interfaces to > do this correctly. > > So none of this can be done at compile time folks. But upper bound is certainly known at compile time, right? Pavel -- (about SSSCA) "I don't say this lightly. However, I really think that the U.S. no longer is classifiable as a democracy, but rather as a plutocracy." --hpa ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 9:06 ` Pavel Machek @ 2002-06-12 20:16 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-12 20:16 UTC (permalink / raw) To: pavel; +Cc: oliver, roland, wjhun, paulus, linux-kernel From: Pavel Machek <pavel@ucw.cz> Date: Wed, 12 Jun 2002 11:06:39 +0200 But upper bound is certainly known at compile time, right? That's true. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-10 18:07 ` William Jhun ` (2 preceding siblings ...) 2002-06-11 3:10 ` David S. Miller @ 2002-06-12 11:47 ` David S. Miller 2002-06-12 12:08 ` Oliver Neukum 2002-06-12 16:09 ` William Jhun 3 siblings, 2 replies; 102+ messages in thread From: David S. Miller @ 2002-06-12 11:47 UTC (permalink / raw) To: wjhun; +Cc: paulus, roland, linux-kernel From: William Jhun <wjhun@ayrnetworks.com> Date: Mon, 10 Jun 2002 11:07:40 -0700 On Sun, Jun 09, 2002 at 09:27:05PM -0700, David S. Miller wrote: > I'm trying to specify this such that knowledge of cachelines and > whatnot don't escape the arch specific code, ho hum... Looks like > that isn't possible. Perhaps provide macros in asm/pci.h that will: You don't understand, I think. I want to avoid the drivers doing any of the "align this, align that" stuff. I want the allocation to do it for them, that way the code is in one place. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 11:47 ` David S. Miller @ 2002-06-12 12:08 ` Oliver Neukum 2002-06-14 4:41 ` David S. Miller 2002-06-12 16:09 ` William Jhun 1 sibling, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-12 12:08 UTC (permalink / raw) To: David S. Miller, wjhun; +Cc: paulus, roland, linux-kernel Am Mittwoch, 12. Juni 2002 13:47 schrieb David S. Miller: > From: William Jhun <wjhun@ayrnetworks.com> > Date: Mon, 10 Jun 2002 11:07:40 -0700 > > On Sun, Jun 09, 2002 at 09:27:05PM -0700, David S. Miller wrote: > > I'm trying to specify this such that knowledge of cachelines and > > whatnot don't escape the arch specific code, ho hum... Looks like > > that isn't possible. > > Perhaps provide macros in asm/pci.h that will: > > You don't understand, I think. I want to avoid the drivers doing > any of the "align this, align that" stuff. I want the allocation > to do it for them, that way the code is in one place. That means that all buffers must be allocated seperately. Is it worth that ? How about skbuffs ? Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 12:08 ` Oliver Neukum @ 2002-06-14 4:41 ` David S. Miller 0 siblings, 0 replies; 102+ messages in thread From: David S. Miller @ 2002-06-14 4:41 UTC (permalink / raw) To: oliver; +Cc: wjhun, paulus, roland, linux-kernel From: Oliver Neukum <oliver@neukum.name> Date: Wed, 12 Jun 2002 14:08:26 +0200 That means that all buffers must be allocated seperately. Is it worth that ? How about skbuffs ? skbuffs are fine, they are already minimally aligned to SMP_CACHE_BYTES and also have the same minimum length. The only problematic case are as in the eepro100.c driver where it tries to use part of the skbuff data area for the receive descriptor. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-12 11:47 ` David S. Miller 2002-06-12 12:08 ` Oliver Neukum @ 2002-06-12 16:09 ` William Jhun 1 sibling, 0 replies; 102+ messages in thread From: William Jhun @ 2002-06-12 16:09 UTC (permalink / raw) To: David S. Miller; +Cc: paulus, roland, linux-kernel On Wed, Jun 12, 2002 at 04:47:59AM -0700, David S. Miller wrote: > From: William Jhun <wjhun@ayrnetworks.com> > Date: Mon, 10 Jun 2002 11:07:40 -0700 > > On Sun, Jun 09, 2002 at 09:27:05PM -0700, David S. Miller wrote: > > I'm trying to specify this such that knowledge of cachelines and > > whatnot don't escape the arch specific code, ho hum... Looks like > > that isn't possible. > > Perhaps provide macros in asm/pci.h that will: > > You don't understand, I think. I want to avoid the drivers doing > any of the "align this, align that" stuff. I want the allocation > to do it for them, that way the code is in one place. No, I do understand your point. And this does not bring "knowledge of cachelines and whatnot" into the driver; those "macros" could similarly be calls to arch-specific code that acts based on a pdev. I was simply trying to think of a compromise between that and massively changing the interface by which a driver obtains buffers. And I assume alloc_skb() and others would need to change otherwise. How would you specify if your skb data needs to be PCI DMA-able? What about net drivers not using DMA at all? Thanks, Will ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-08 23:03 ` David S. Miller 2002-06-09 0:40 ` Roland Dreier @ 2002-06-09 1:30 ` Albert D. Cahalan 2002-06-09 5:29 ` David S. Miller 1 sibling, 1 reply; 102+ messages in thread From: Albert D. Cahalan @ 2002-06-09 1:30 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel David S. Miller writes: > From: Roland Dreier <roland@topspin.com> >> which causes stack corruption: on the PPC >> 440GP, pci_map_single() with PCI_DMA_FROMDEVICE just invalidates the >> cache for the region being mapped. However, if a buffer is smaller >> than a cache line, then two bad things can happen. > > There is no allocation scheme legal for PCI DMA which gives you > smaller than a cacheline of data, this includes SLAB. This is why > stack buffers and the like are illegal for PCI DMA. > >> The solution to this was simply to use kmalloc() to allocate buffers >> instead of using automatic variables. > > Right. > >> However, this leads to my first question: is this safe on all >> architectures? > > It must be. If the architecture allows SLAB to give smaller than > cacheline sized data, it must handle PCI DMA map/unmap flushing > in an appropriate fashion (ie. handle sub-cacheline buffers). >> >> DMA-mapping.txt says kmalloc()'ed memory is OK for DMAs and does not >> mention cache alignment. > > It doesn't mention cache alignment because that is an implementation > specific issue. The user of the interfaces need not be concerned > about any of this. > > There need be no changes to the documentation. If you do as the > documentation states (use kmalloc or get_free_page to get your > buffers) then it will just work. On a non-SMP system, would it be OK to map all the memory without memory coherency enabled? You seem to be implying that one only needs to implement some mechanism in pci_map_single() to handle flushing cache lines (write back, then invalidate). This would be useful for Macs. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 1:30 ` Albert D. Cahalan @ 2002-06-09 5:29 ` David S. Miller 2002-06-09 6:33 ` Albert D. Cahalan 0 siblings, 1 reply; 102+ messages in thread From: David S. Miller @ 2002-06-09 5:29 UTC (permalink / raw) To: acahalan; +Cc: linux-kernel From: "Albert D. Cahalan" <acahalan@cs.uml.edu> Date: Sat, 8 Jun 2002 21:30:31 -0400 (EDT) On a non-SMP system, would it be OK to map all the memory without memory coherency enabled? You seem to be implying that one only needs to implement some mechanism in pci_map_single() to handle flushing cache lines (write back, then invalidate). This would be useful for Macs. It's just avoiding flushing by effecting flushing the cache after every load/store the cpu does, so of course it would work. It would be slow as hell, but it would work. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 5:29 ` David S. Miller @ 2002-06-09 6:33 ` Albert D. Cahalan 2002-06-09 6:50 ` Oliver Neukum 2002-06-09 8:48 ` Russell King 0 siblings, 2 replies; 102+ messages in thread From: Albert D. Cahalan @ 2002-06-09 6:33 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel David S. Miller writes: > From: "Albert D. Cahalan" <acahalan@cs.uml.edu> >> On a non-SMP system, would it be OK to map all the memory >> without memory coherency enabled? You seem to be implying that >> one only needs to implement some mechanism in pci_map_single() >> to handle flushing cache lines (write back, then invalidate). >> >> This would be useful for Macs. > > It's just avoiding flushing by effecting flushing the cache after > every load/store the cpu does, so of course it would work. > > It would be slow as hell, but it would work. I don't see why it would have to be slow. Mapping the memory with coherency disabled will reduce bus traffic for all the regular kernel code doing non-DMA stuff. (coherency requires that the CPU generate address-only bus operations) The obvious concern is that some kernel code might touch a cache line after the invalidate but before the DMA is done. If that could be considered a bug, then every non-SMP Mac could gain some speed by turning off coherency. Maybe the x86 MTRRs allow for something similar. For device --> memory DMA: 1. write back cache lines that cross unaligned boundries 2. start the DMA 3. invalidate the above cache lines 4. invalidate cache lines that are fully inside the DMA For memory --> device DMA: 1. write back all cache lines affected by the DMA 2. start the DMA 3. invalidate the above cache lines ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 6:33 ` Albert D. Cahalan @ 2002-06-09 6:50 ` Oliver Neukum 2002-06-09 6:57 ` Albert D. Cahalan 2002-06-09 8:48 ` Russell King 1 sibling, 1 reply; 102+ messages in thread From: Oliver Neukum @ 2002-06-09 6:50 UTC (permalink / raw) To: Albert D. Cahalan, David S. Miller; +Cc: linux-kernel > For memory --> device DMA: > > 1. write back all cache lines affected by the DMA > 2. start the DMA > 3. invalidate the above cache lines Why the third step ? That data should still be valid. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 6:50 ` Oliver Neukum @ 2002-06-09 6:57 ` Albert D. Cahalan 2002-06-09 7:15 ` Oliver Neukum 0 siblings, 1 reply; 102+ messages in thread From: Albert D. Cahalan @ 2002-06-09 6:57 UTC (permalink / raw) To: Oliver Neukum; +Cc: David S. Miller, linux-kernel Oliver Neukum writes: >> For memory --> device DMA: >> >> 1. write back all cache lines affected by the DMA >> 2. start the DMA >> 3. invalidate the above cache lines > > Why the third step ? That data should still > be valid. I made a mistake, but perhaps it is a good one. There is no need to invalidate the cache lines, but I'd guess that commonly the data won't be used again. Doing the invalidate would free up some cache lines, meaning that the CPU would have empty slots to use for other stuff. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 6:57 ` Albert D. Cahalan @ 2002-06-09 7:15 ` Oliver Neukum 0 siblings, 0 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-09 7:15 UTC (permalink / raw) To: Albert D. Cahalan; +Cc: David S. Miller, linux-kernel Am Sonntag, 9. Juni 2002 08:57 schrieb Albert D. Cahalan: > Oliver Neukum writes: > >> For memory --> device DMA: > >> > >> 1. write back all cache lines affected by the DMA > >> 2. start the DMA > >> 3. invalidate the above cache lines > > > > Why the third step ? That data should still > > be valid. > > I made a mistake, but perhaps it is a good one. > There is no need to invalidate the cache lines, > but I'd guess that commonly the data won't be > used again. Doing the invalidate would free up > some cache lines, meaning that the CPU would > have empty slots to use for other stuff. Then this should be made commonly available and could be used eg. on kfree. The choice of kicking the data out or not could then be made. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 6:33 ` Albert D. Cahalan 2002-06-09 6:50 ` Oliver Neukum @ 2002-06-09 8:48 ` Russell King 2002-06-09 15:42 ` Albert D. Cahalan 2002-06-09 23:26 ` Oliver Neukum 1 sibling, 2 replies; 102+ messages in thread From: Russell King @ 2002-06-09 8:48 UTC (permalink / raw) To: Albert D. Cahalan; +Cc: David S. Miller, linux-kernel On Sun, Jun 09, 2002 at 02:33:35AM -0400, Albert D. Cahalan wrote: > For device --> memory DMA: > > 1. write back cache lines that cross unaligned boundries What if some of the cache lines inside the DMA region are dirty and... > 2. start the DMA get evicted now when the CPU is doing something else? > 3. invalidate the above cache lines > 4. invalidate cache lines that are fully inside the DMA You really need to: 1. write back cache lines that cross unaligned boundries 3. invalidate the above cache lines 2. start the DMA 4. invalidate cache lines that are fully inside the DMA which is precisely we do on ARM. > For memory --> device DMA: > > 1. write back all cache lines affected by the DMA > 2. start the DMA > 3. invalidate the above cache lines What's the point of (3) ? The data in memory hasn't been changed by DMA. What if we were writing out a page that was mmaped into a process, and the process wrote to that page while we were DMAing ? -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 8:48 ` Russell King @ 2002-06-09 15:42 ` Albert D. Cahalan 2002-06-09 23:26 ` Oliver Neukum 1 sibling, 0 replies; 102+ messages in thread From: Albert D. Cahalan @ 2002-06-09 15:42 UTC (permalink / raw) To: Russell King; +Cc: David S. Miller, linux-kernel Russell King writes: > On Sun, Jun 09, 2002 at 02:33:35AM -0400, Albert D. Cahalan wrote: >> For device --> memory DMA: >> >> 1. write back cache lines that cross unaligned boundries > > What if some of the cache lines inside the DMA region are dirty and... > >> 2. start the DMA > > get evicted now when the CPU is doing something else? Ugh. Yes, I guess the very act of starting the DMA could evict a few. In that case, your method won't work either. It sure would be nice to get the DMA started early, but that would require some way to keep new stuff out of the cache. Well gee, the MPC7400 has such an ability. I don't know if it's fast, but one could do: 1. write back cache lines that cross unaligned boundries 2. sync 3. set L2CR[L2DO] and L2CR[L2IO] (and maybe HID0[DLOCK] too) 4. start the DMA 5. invalidate all cache lines affected by the DMA 6. undo step 3 (clear L2CR[L2DO], L2CR[L2IO], HID0[DLOCK]) Most likely the above needs another sync or two. > You really need to: > > 1. write back cache lines that cross unaligned boundries > 3. invalidate the above cache lines > 2. start the DMA ------ write-back happens during DMA; data is ruined ------ > 4. invalidate cache lines that are fully inside the DMA > > which is precisely we do on ARM. >> For memory --> device DMA: >> >> 1. write back all cache lines affected by the DMA >> 2. start the DMA >> 3. invalidate the above cache lines > > What's the point of (3) ? The data in memory hasn't been changed by DMA. > What if we were writing out a page that was mmaped into a process, and > the process wrote to that page while we were DMAing ? As noted in another email, that was a mistake. It might be good to invalidate anyway, since that frees up cache lines for other uses. The process won't get a chance to touch the page unless you have SMP or (maybe) preemption. ^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: PCI DMA to small buffers on cache-incoherent arch 2002-06-09 8:48 ` Russell King 2002-06-09 15:42 ` Albert D. Cahalan @ 2002-06-09 23:26 ` Oliver Neukum 1 sibling, 0 replies; 102+ messages in thread From: Oliver Neukum @ 2002-06-09 23:26 UTC (permalink / raw) To: Russell King, Albert D. Cahalan; +Cc: David S. Miller, linux-kernel Am Sonntag, 9. Juni 2002 10:48 schrieb Russell King: > On Sun, Jun 09, 2002 at 02:33:35AM -0400, Albert D. Cahalan wrote: > > For device --> memory DMA: > > > > 1. write back cache lines that cross unaligned boundries > > What if some of the cache lines inside the DMA region are dirty and... > > > 2. start the DMA > > get evicted now when the CPU is doing something else? > > > 3. invalidate the above cache lines > > 4. invalidate cache lines that are fully inside the DMA > > You really need to: > > 1. write back cache lines that cross unaligned boundries > 3. invalidate the above cache lines > 2. start the DMA > 4. invalidate cache lines that are fully inside the DMA Starting DMA has to be the very last step. Regards Oliver ^ permalink raw reply [flat|nested] 102+ messages in thread
end of thread, other threads:[~2002-06-14 4:46 UTC | newest] Thread overview: 102+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-06-11 5:31 PCI DMA to small buffers on cache-incoherent arch David Brownell 2002-06-11 5:44 ` David S. Miller 2002-06-11 15:12 ` David Brownell 2002-06-11 15:44 ` Oliver Neukum 2002-06-12 3:25 ` David S. Miller 2002-06-11 17:33 ` Benjamin Herrenschmidt 2002-06-12 9:42 ` David S. Miller 2002-06-12 14:14 ` David Brownell 2002-06-12 15:00 ` Benjamin Herrenschmidt 2002-06-12 18:44 ` Roland Dreier 2002-06-12 19:13 ` David Brownell 2002-06-12 19:58 ` Oliver Neukum 2002-06-12 22:51 ` David S. Miller 2002-06-12 23:17 ` Oliver Neukum 2002-06-13 4:57 ` David Brownell 2002-06-12 22:46 ` David S. Miller 2002-06-13 5:13 ` David Brownell 2002-06-13 9:38 ` David S. Miller 2002-06-12 22:50 ` David S. Miller 2002-06-13 0:23 ` [PATCH] 2.4 add __dma_buffer alignment macro Roland Dreier 2002-06-13 1:07 ` David S. Miller 2002-06-13 1:19 ` Roland Dreier 2002-06-12 6:25 ` PCI DMA to small buffers on cache-incoherent arch David Brownell 2002-06-12 6:24 ` David S. Miller 2002-06-12 7:06 ` David Brownell 2002-06-12 9:22 ` David S. Miller -- strict thread matches above, loose matches on Subject: below -- 2002-06-08 20:38 Roland Dreier 2002-06-08 13:58 ` Anton Blanchard 2002-06-09 0:52 ` Roland Dreier 2002-06-08 23:03 ` David S. Miller 2002-06-09 0:40 ` Roland Dreier 2002-06-09 0:53 ` David S. Miller 2002-06-09 1:26 ` Roland Dreier 2002-06-09 5:29 ` David S. Miller 2002-06-09 6:16 ` Roland Dreier 2002-06-10 16:03 ` Roland Dreier 2002-06-11 14:04 ` David Woodhouse 2002-06-09 6:45 ` Oliver Neukum 2002-06-10 4:24 ` David S. Miller 2002-06-10 10:00 ` Oliver Neukum 2002-06-10 10:24 ` David S. Miller 2002-06-09 9:51 ` Paul Mackerras 2002-06-09 10:54 ` Benjamin Herrenschmidt 2002-06-10 4:27 ` David S. Miller 2002-06-10 15:59 ` Roland Dreier 2002-06-10 17:03 ` Tom Rini 2002-06-10 17:22 ` Oliver Neukum 2002-06-10 17:29 ` Tom Rini 2002-06-10 17:39 ` Oliver Neukum 2002-06-10 19:03 ` Roland Dreier 2002-06-10 19:14 ` Tom Rini 2002-06-10 19:21 ` Roland Dreier 2002-06-10 19:26 ` Tom Rini 2002-06-10 17:57 ` Russell King 2002-06-10 17:28 ` Roland Dreier 2002-06-10 18:07 ` William Jhun 2002-06-10 18:29 ` William Jhun 2002-06-10 18:33 ` Mark Zealey 2002-06-10 18:44 ` Oliver Neukum 2002-06-11 3:10 ` David S. Miller 2002-06-11 4:04 ` Roland Dreier 2002-06-11 4:16 ` Brad Hards 2002-06-11 4:24 ` Roland Dreier 2002-06-11 4:24 ` David S. Miller 2002-06-11 4:21 ` David S. Miller 2002-06-11 4:39 ` Roland Dreier 2002-06-11 4:38 ` David S. Miller 2002-06-11 6:23 ` Oliver Neukum 2002-06-11 6:38 ` David S. Miller 2002-06-11 7:38 ` Oliver Neukum 2002-06-11 7:36 ` David S. Miller 2002-06-11 7:43 ` David S. Miller 2002-06-11 8:07 ` Oliver Neukum 2002-06-11 8:15 ` David S. Miller 2002-06-11 12:06 ` Oliver Neukum 2002-06-11 12:04 ` David S. Miller 2002-06-11 14:23 ` Oliver Neukum 2002-06-14 4:14 ` David S. Miller 2002-06-11 18:26 ` Roland Dreier 2002-06-11 17:29 ` Benjamin Herrenschmidt 2002-06-12 12:02 ` Oliver Neukum 2002-06-11 20:06 ` Benjamin Herrenschmidt 2002-06-12 12:02 ` David S. Miller 2002-06-11 23:00 ` Thunder from the hill 2002-06-11 23:56 ` Roland Dreier 2002-06-12 0:09 ` Thunder from the hill 2002-06-11 15:57 ` William Jhun 2002-06-12 9:06 ` Pavel Machek 2002-06-12 20:16 ` David S. Miller 2002-06-12 11:47 ` David S. Miller 2002-06-12 12:08 ` Oliver Neukum 2002-06-14 4:41 ` David S. Miller 2002-06-12 16:09 ` William Jhun 2002-06-09 1:30 ` Albert D. Cahalan 2002-06-09 5:29 ` David S. Miller 2002-06-09 6:33 ` Albert D. Cahalan 2002-06-09 6:50 ` Oliver Neukum 2002-06-09 6:57 ` Albert D. Cahalan 2002-06-09 7:15 ` Oliver Neukum 2002-06-09 8:48 ` Russell King 2002-06-09 15:42 ` Albert D. Cahalan 2002-06-09 23:26 ` Oliver Neukum
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox