* Mem-2-Mem DMA - Generalized API @ 2007-06-24 19:39 Clifford Wolf 2007-06-24 20:21 ` Arnd Bergmann 0 siblings, 1 reply; 19+ messages in thread From: Clifford Wolf @ 2007-06-24 19:39 UTC (permalink / raw) To: linuxppc-embedded Hi, I've sent this mail to lkml some days ago: http://lkml.org/lkml/2007/6/23/27 and got that answer from Clemens Koller pointing to this mailing list: http://lkml.org/lkml/2007/6/23/128 short summary: I'm working on an MPC8349E based project and as some of you might know this chip has a four channel (bus-) memory-to-memory DMA controller. Unfortunately the linux kernel is atm lacking a generic interface for such DMA controllers. Instead of directly programming the DMA controller in my peripheral driver I'd prefer implementing a generic api/scheduler and would like to discuss the implementation details. Such an api would make device drivers more portable and would allow easy sharing of the DMA controller among many device drivers. <end of short summary> imo there should be register/unregister functions for the driver(s) handling the DMA controller(s) as well as a generic function for submitting dma transfers. Since at least the MPC8349E DMA controller has support for scatter/gather dma modes and can run source/destination optionally in 8, 16 or 32 fifo mode this function would have a more complicated interface than memcpy(). The function for submitting dma transfers could be passed a function pointer (as well as a void data pointer for passing metadata) which is called in interrupt context as soon as the transfer has been completed. Clemens pointed out that it could also be of use to additionally get some kind of progress notifications so the partial data can already be processed before the entire dma transfer has been completed. In addition to the bare-bone 'callback in interrupt context' api I would also like to have a small wrapper which just sleeps until the transfer has been finished. 
I would be very interested in your feedback, thoughts and wishes regarding such an api to avoid rewrite-cycles later on. yours, - clifford -- "The generation of random numbers is too important to be left to chance." - Robert R. Coveyou, Oak Ridge National Laboratory. ^ permalink raw reply [flat|nested] 19+ messages in thread
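The submit-with-callback interface described above can be sketched in plain C. This is a user-space mock, not kernel code: `dma_submit()`, `dma_submit_wait()` and `struct dma_request` are invented names for illustration, and the "controller" is simulated with memcpy(), so completion happens synchronously rather than from an interrupt handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback invoked when a transfer completes; in the real API this
 * would run in interrupt context. */
typedef void (*dma_done_cb_t)(void *data);

struct dma_request {
    void *src;
    void *dst;
    size_t len;
    dma_done_cb_t done;   /* completion callback (may be NULL) */
    void *cb_data;        /* opaque metadata handed back to the callback */
};

/* Submit a transfer.  A real backend would program the controller and
 * invoke ->done from its completion interrupt; the mock completes
 * synchronously with a CPU copy. */
static int dma_submit(struct dma_request *req)
{
    memcpy(req->dst, req->src, req->len);
    if (req->done)
        req->done(req->cb_data);
    return 0;
}

/* The "small wrapper which just sleeps": with a synchronous mock there
 * is nothing to sleep on, so it degenerates to a plain call. */
static int dma_submit_wait(struct dma_request *req)
{
    return dma_submit(req);
}
```

The point of the shape is that callers which can tolerate asynchrony use the callback, while simple callers use the blocking wrapper and never see the completion machinery.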
* Re: Mem-2-Mem DMA - Generalized API 2007-06-24 19:39 Mem-2-Mem DMA - Generalized API Clifford Wolf @ 2007-06-24 20:21 ` Arnd Bergmann 2007-06-25 8:03 ` Clifford Wolf 2007-06-25 11:03 ` Matt Sealey 0 siblings, 2 replies; 19+ messages in thread From: Arnd Bergmann @ 2007-06-24 20:21 UTC (permalink / raw) To: linuxppc-embedded On Sunday 24 June 2007, Clifford Wolf wrote: > I'm working on an MPC8349E based project and as some of you might know this > chip has a four channel (bus-) memory-to-memory DMA controller. > > Unfortunately the linux kernel is atm lacking a generic interface for such > DMA controllers. So what's wrong with the include/linux/dmaengine.h API? I thought it was designed to cover this sort of DMA controller? Arnd <>< ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-06-24 20:21 ` Arnd Bergmann @ 2007-06-25 8:03 ` Clifford Wolf 2007-06-25 11:03 ` Matt Sealey 1 sibling, 0 replies; 19+ messages in thread From: Clifford Wolf @ 2007-06-25 8:03 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-embedded Hi, On Sun, Jun 24, 2007 at 10:21:57PM +0200, Arnd Bergmann wrote: > On Sunday 24 June 2007, Clifford Wolf wrote: > > I'm working on an MPC8349E based project and as some of you might know this > > chip has a four channel (bus-) memory-to-memory DMA controller. > > > > Unfortunately the linux kernel is atm lacking a generic interface for such > > DMA controllers. > > So what's wrong with the include/linux/dmaengine.h API? I thought it was > designed to cover this sort of DMA controller? nothing. I was just too blind to find it.. ;-) though there are some points: at first glance it seems like this api does not support scatter/gather and fifo mode, right? in fact that's no problem at all for my project but it would be a pity to lose that hardware functionality because of the api.. i have also had a quick look at the ioatdma driver and it appears to me that it can only operate on address regions which are visible on the pci bus. The MPC8349E dma can operate on everything which is visible on the coherent local bus, i.e. everything that is also visible to the cpu. there seems to be no way to specify the bus a dma channel is needed for when requesting a channel through this interface. It also appears to me that the dmaengine.h API is not capable of overcommitting. I.e. assigning a small pool of dma channels to a big pool of drivers in the hope that not all of the drivers are doing dma transfers at the same time (and schedule transfers if this assumption turns out to be wrong). 
Wouldn't it be better to let the backend handle stuff like binding dma channels to specific cpus and let the user just commit dma requests which are then scheduled to the dma channel which fits the needs best (or done by the CPU if no dma channel exists which would be capable of doing this kind of transfer)? yours, - clifford -- "The generation of random numbers is too important to be left to chance." - Robert R. Coveyou, Oak Ridge National Laboratory. ^ permalink raw reply [flat|nested] 19+ messages in thread
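The overcommitting idea from the mail above — a small pool of channels shared by many drivers, with requests spooled when all channels are busy — can be modelled in a few lines. All names and the fixed-size FIFO are illustrative, and a kernel version would need locking and would kick the queued transfer from the completion interrupt rather than from chan_put().

```c
#include <assert.h>

/* Toy model: NCHAN hardware channels shared by an arbitrary number of
 * requesters.  A request either grabs a free channel immediately or is
 * spooled and served when a channel is released. */
#define NCHAN 2
#define QLEN  8

static int chan_busy[NCHAN];
static int queue[QLEN], q_head, q_tail; /* FIFO of waiting requester ids */
static int served[16];                  /* per-requester "transfer started" flag */

static int chan_get(void)
{
    for (int i = 0; i < NCHAN; i++)
        if (!chan_busy[i]) { chan_busy[i] = 1; return i; }
    return -1;                          /* pool exhausted */
}

static void request_transfer(int requester)
{
    int c = chan_get();
    if (c >= 0)
        served[requester] = 1;          /* transfer starts immediately */
    else
        queue[q_tail++ % QLEN] = requester; /* spooled until a channel frees */
}

static void chan_put(int c)
{
    if (q_head != q_tail) {             /* hand the channel straight on */
        served[queue[q_head++ % QLEN]] = 1;
        return;
    }
    chan_busy[c] = 0;
}
```

This is the scheduling behaviour being argued for: the driver count can exceed the channel count, and only the timing (immediate vs. spooled) changes, never the API.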
* Re: Mem-2-Mem DMA - Generalized API 2007-06-24 20:21 ` Arnd Bergmann 2007-06-25 8:03 ` Clifford Wolf @ 2007-06-25 11:03 ` Matt Sealey 2007-06-25 12:53 ` Clemens Koller 1 sibling, 1 reply; 19+ messages in thread From: Matt Sealey @ 2007-06-25 11:03 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-embedded IOAT and Intel's DMA engine driver is very IOAT specific in places.. I had a peek at it as I have a little interest in the concept; at least the two platforms Genesi has been supporting (Pegasos and Efika) have quite competent DMA engines which are woefully underused (i.e. not at all). There exists a Marvell DMA driver somewhere (I have a copy, someone on this list posted it about a year ago) and while the MPC5200B doesn't have explicit support for DMA from memory to memory (although memory to SRAM might work in chunks, or memory to a FIFO wired as a loopback like in the docs..??) There is so much you can do with most SoC DMA controllers, and it's not even limited to PowerPC (most ARM/XScale SoCs have very capable devices inside too). I can only imagine that nobody got excited over IOAT because the entire programming interface stinks of "offloading gigabit ethernet" and not much else. -- Matt Sealey <matt@genesi-usa.com> Genesi, Manager, Developer Relations Arnd Bergmann wrote: > On Sunday 24 June 2007, Clifford Wolf wrote: >> I'm working on an MPC8349E based project and as some of you might know this >> chip has a four channel (bus-) memory-to-memory DMA controller. >> >> Unfortunately the linux kernel is atm lacking a generic interface for such >> DMA controllers. > > So what's wrong with the include/linux/dmaengine.h API? I thought it was > designed to cover this sort of DMA controller? > > Arnd <>< > _______________________________________________ > Linuxppc-embedded mailing list > Linuxppc-embedded@ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc-embedded ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 11:03 ` Matt Sealey @ 2007-06-25 12:53 ` Clemens Koller 2007-06-25 14:31 ` Matt Sealey 0 siblings, 1 reply; 19+ messages in thread From: Clemens Koller @ 2007-06-25 12:53 UTC (permalink / raw) To: Matt Sealey; +Cc: Arnd Bergmann, linuxppc-embedded Hello, Matt! Matt Sealey schrieb: > IOAT and Intel's DMA engine driver is very IOAT specific in places.. > > I had a peek at it as I have a little interest in the concept; at least the > two platforms Genesi has been supporting (Pegasos and Efika) have quite > competant DMA engines which are woefully underused (i.e. not at all). True. > There exists a Marvell DMA driver somewhere (I have a copy, someone on > this list posted it about a year ago) and while the MPC5200B doesn't have > explicit support for DMA from memory to memory (although memory to SRAM > might work in chunks, or memory to a FIFO wired as a loopback like in > the docs..??) > > There is so much you can do with most SoC DMA controllers, and it's not > even limited to PowerPC (most ARM/XScale SoCs have very capable devices > inside too). I can only imagine that nobody got excited over IOAT because > the entire programming interface stinks of "offloading gigabit ethernet" > and not much else. The main question remains: Is it possible to have a flexible cross platform DMA API which handles even complex requests and does scheduling, prioritizing, queuing, locking, (re-)building/caching of SG lists... automagically. It could fall back to the CPU's memcpy if the hardware doesn't have the ability to use the DMA machine because all channels are already busy, or the requested memory isn't DMAable, or the request is just too small for it to make sense to set up a DMA channel. Filling memory with zero is also a simple task for a DMA engine. (Thinking about malloc() and memset()) The problem is IMHO similar to video acceleration. 
Within the Xorg's XAA/EXA/whatever framework, the drivers accelerate certain calls if the hardware has the capability to do so. Other calls fall back to some default non accelerated memcpy() & friends. Sounds like a lot of fun... replacing kernel's and libc's memcpy() with memcpy_with_dma_if_possible(). :-) Best regards, -- Clemens Koller __________________________________ R&D Imaging Devices Anagramm GmbH Rupert-Mayer-Straße 45/1 Linhof Werksgelände D-81379 München Tel.089-741518-50 Fax 089-741518-19 http://www.anagramm-technology.com ^ permalink raw reply [flat|nested] 19+ messages in thread
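The fallback policy Clemens describes (use the engine only when a channel is free, the memory is DMAable, and the copy is big enough to amortize channel setup) boils down to a dispatcher like the following sketch. The threshold and predicate names are made up for illustration, and the CPU stands in for the hardware on both paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Below this size, channel setup cost is assumed to beat the transfer
 * itself; the value is purely illustrative. */
#define DMA_MIN_LEN 4096

static int dma_channels_free;     /* simulated channel availability */
static int used_dma;              /* records which path the last copy took */

static int buffer_is_dmaable(const void *p)
{
    (void)p;
    return 1;                     /* a real check would test the address range */
}

/* memcpy_with_dma_if_possible(), roughly: pick the DMA path only when
 * all preconditions hold, otherwise fall back to the CPU. */
static void smart_copy(void *dst, const void *src, size_t len)
{
    if (len >= DMA_MIN_LEN && dma_channels_free > 0 &&
        buffer_is_dmaable(dst) && buffer_is_dmaable(src)) {
        used_dma = 1;             /* would queue to the DMA engine here */
        memcpy(dst, src, len);    /* mock: CPU stands in for the hardware */
    } else {
        used_dma = 0;             /* plain CPU fallback */
        memcpy(dst, src, len);
    }
}
```

The caller never learns which path was taken, which is exactly the XAA/EXA-style transparency the mail compares it to.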
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 12:53 ` Clemens Koller @ 2007-06-25 14:31 ` Matt Sealey 2007-06-25 17:00 ` Olof Johansson 2007-06-25 18:01 ` Clifford Wolf 0 siblings, 2 replies; 19+ messages in thread From: Matt Sealey @ 2007-06-25 14:31 UTC (permalink / raw) To: Clemens Koller; +Cc: Arnd Bergmann, linuxppc-embedded Clemens Koller wrote: > Hello, Matt! > >> There is so much you can do with most SoC DMA controllers, and it's not >> even limited to PowerPC (most ARM/XScale SoCs have very capable devices >> inside too). I can only imagine that nobody got excited over IOAT because >> the entire programming interface stinks of "offloading gigabit ethernet" >> and not much else. > > The main question remains: Is it possible to have a flexible cross platform > DMA API which handles even complex requests and does scheduling, > prioritizing, queuing, locking, (re-)building/caching of SG lists... automagically. I would think so. I think there is a fairly generic example in many parts of the Linux kernel. Dare I say the Via Unichrome AGP subsystem? And a bunch of the ARM/OMAP platforms..? A lot of the code is even identical, I wonder why it isn't some library rather than platform drivers. > Filling memory with zero is also a simple task for a DMA engine. > (Thinking about malloc() and memset()) Also xor and logical operations, byte swapping huge chunks of data, that kind of thing. Most DMA engines in SoCs have cute features like that. I think BestComm can even calculate CRCs for IP packets. > The problem is IMHO similar to video acceleration. Within the > Xorg's XAA/EXA/whatever framework, the drivers accelerate certain > calls if the hardware has the capability to do so. Other calls fall back > to some default non accelerated memcpy() & friends. > > Sounds like a lot of fun... replacing kernel's and libc's memcpy() with > memcpy_with_dma_if_possible(). :-) Indeed. 
I wonder if we could pull apart the IOAT/DMA stuff and genericise it (it should be possible) or simply add to it, or if making a powerpc specific dma engine abstraction would be an easier idea. Probably the latter to be merged with the former at a later date would be easier to manage. Take inspiration but don't be bound by Intel's weird "new" (i.e. 15 year old) concept? -- Matt Sealey <matt@genesi-usa.com> Genesi, Manager, Developer Relations ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 14:31 ` Matt Sealey @ 2007-06-25 17:00 ` Olof Johansson 2007-06-25 17:48 ` Clifford Wolf 2007-06-25 18:01 ` Clifford Wolf 1 sibling, 1 reply; 19+ messages in thread From: Olof Johansson @ 2007-06-25 17:00 UTC (permalink / raw) To: Matt Sealey; +Cc: Arnd Bergmann, linuxppc-embedded On Mon, Jun 25, 2007 at 03:31:59PM +0100, Matt Sealey wrote: > > Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise > it (it should be possible) or simply add to it, or if making a powerpc > specific dma engine abstraction would be an easier idea. It's hard to anticipate all possible uses of a framework when you first write it. So, when you first write it with one device in mind, it's fairly obvious that it might not fit well with the second device that will use it. That's the case of drivers/dma and IOAT today. That's the case with the dma engine framework today. Expand it, extend it, fix it and improve it. Don't duplicate, re-invent and re-implement. -Olof ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 17:00 ` Olof Johansson @ 2007-06-25 17:48 ` Clifford Wolf 0 siblings, 0 replies; 19+ messages in thread From: Clifford Wolf @ 2007-06-25 17:48 UTC (permalink / raw) To: Olof Johansson; +Cc: Arnd Bergmann, linuxppc-embedded Hi, On Mon, Jun 25, 2007 at 12:00:03PM -0500, Olof Johansson wrote: > That's the case with the dma engine framework today. Expand it, extend > it, fix it and improve it. Don't duplicate, re-invent and re-implement. I'm not sure if I can agree with that. The core idea behind the dma engine framework seems to be to statically assign dma channels to device drivers. I think that the channels should be dynamically assigned. Imo writing an alternative implementation is much easier than hacking that into the existing framework. Especially since the existing framework has only one backend driver and only one user. yours, - clifford -- /* You are not expected to understand this */ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 14:31 ` Matt Sealey 2007-06-25 17:00 ` Olof Johansson @ 2007-06-25 18:01 ` Clifford Wolf 2007-06-25 21:20 ` Matt Sealey 2007-07-04 9:05 ` Clifford Wolf 1 sibling, 2 replies; 19+ messages in thread From: Clifford Wolf @ 2007-06-25 18:01 UTC (permalink / raw) To: Matt Sealey; +Cc: Arnd Bergmann, linuxppc-embedded Hi, On Mon, Jun 25, 2007 at 03:31:59PM +0100, Matt Sealey wrote: > > The main question remains: Is it possible to have a flexible cross platform > > DMA API which handles even complex requests and does scheduling, > > prioritizing, queuing, locking, (re-)building/caching of SG lists... automagically. > > I would think so. I think there is a fairly generic example in many parts > of the Linux kernel. Dare I say the Via Unichrome AGP subsystem? And a > bunch of the ARM/OMAP platforms..? A lot of the code is even identical, > I wonder why it isn't some library rather than platform drivers. I've put a 'draft header file' of an api as I would have expected it online: http://www.clifford.at/priv/dmatransfer.h I'd love to hear your feedback on it. One issue I'm absolutely not sure about atm is the different busses and their address spaces. The design in the header file is working directly on 'bus addresses' (the thing accessible through /dev/mem). Does anyone know a case where this may be insufficient? > > Filling memory with zero is also a simple task for a DMA engine. > > (Thinking about malloc() and memset()) > > Also xor and logical operations, byte swapping huge chunks of data, that > kind of thing. Most DMA engines in SoCs have cute features like that. I > think BestComm can even calculate CRCs for IP packets. I haven't added it yet but such things could be encoded using the DMATRANSFER_CHUNK_* and DMATRANSFER_* flags. 
However, I don't think that implementing stuff like memset() in a dma controller is any good because that would just flood the memory bus, which would then in 99% of all cases block the cpu until the dma is finished. It would however cost less power than doing that in the CPU. ;-) > Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise > it (it should be possible) or simply add to it, or if making a powerpc > specific dma engine abstraction would be an easier idea. I don't think that this would actually be powerpc specific in any way. But since such general purpose dma controllers are more common in embedded hardware this still seems to be the right place to discuss the issue. yours, - clifford -- Relax, its only ONES and ZEROS! ^ permalink raw reply [flat|nested] 19+ messages in thread
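The chunk-based descriptor from the draft header can be illustrated with a minimal scatter/gather model: a transfer is a list of input chunks and a list of output chunks, and the engine streams bytes from one side to the other. Field names loosely follow the spirit of the draft (`bus_address`, `bytecount`) but are otherwise hypothetical, with a CPU loop standing in for the controller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct chunk {
    unsigned char *bus_address;   /* a real API would carry bus addresses */
    size_t bytecount;
};

/* Gather from all input chunks, scatter across all output chunks, in
 * order.  Returns the number of bytes moved; a real backend would also
 * validate that the input and output totals match. */
static size_t do_transfer(struct chunk *in, int n_in,
                          struct chunk *out, int n_out)
{
    size_t moved = 0, off = 0;    /* off = write offset in current out chunk */
    int oi = 0;
    for (int i = 0; i < n_in; i++) {
        size_t done = 0;
        while (done < in[i].bytecount && oi < n_out) {
            size_t room = out[oi].bytecount - off;
            size_t n = in[i].bytecount - done;
            if (n > room)
                n = room;
            memcpy(out[oi].bus_address + off, in[i].bus_address + done, n);
            done += n; off += n; moved += n;
            if (off == out[oi].bytecount) { oi++; off = 0; }
        }
    }
    return moved;
}
```

Chunk boundaries on the input and output side are independent, which is what lets the same descriptor format express gather (many in, one out), scatter (one in, many out), or both at once.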
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 18:01 ` Clifford Wolf @ 2007-06-25 21:20 ` Matt Sealey 2007-07-04 9:05 ` Clifford Wolf 1 sibling, 0 replies; 19+ messages in thread From: Matt Sealey @ 2007-06-25 21:20 UTC (permalink / raw) To: Clifford Wolf; +Cc: Arnd Bergmann, linuxppc-embedded Clifford Wolf wrote: > Hi, > > However, i don't think that implementing stuff like memset() in a dma > controller is any good because that would just flood the memory bus which > would then block in 99% of all cases the cpu until the dma is finished. > > It would however cost less power than doing that in the CPU. ;-) At least while the DMA transfer is happening, you could preempt to some other task. Would it flood the memory bus? When a DMA transfer happens would it really do it in such a way that it would stall the CPU on a memory access far more than it would usually? I think it would have to be extremely badly designed to be even ABLE to do that, or at least, you'd be doing some incredibly unwise things to be able to flood it like that. >> Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise >> it (it should be possible) or simply add to it, or if making a powerpc >> specific dma engine abstraction would be an easier idea. > > I don't think that this would actually be powerpc specific in any way. But > since such general purpose dma controllers are more common in embedded > hardware this still seams to be the right place to discuss the issue. I meant powerpc platform (as in ARCH=powerpc) specific. Rather than dropping it in the global drivers, just keep it as a library for powerpc. Everyone else can get it later with a move into the full tree. As long as the headers have common, easy to get to names that do not conflict with anything preexisting, it would not affect anything. 
Taking IOAT as an example and fixing its weirdness would be a better start than making a whole new API, but I think doing development *WITH* IOAT and potentially trashing all.. umm.. 4 users, and the weird Linux kernel tripling of development cost when heavily updating an actively maintained subsystem that everyone else wants to touch, that would be detrimental. We don't want to break Intel and we don't want to be tracking Intel's patches or having extra weirdness break in (or for the number of users of that DMA system to explode underneath New DMA Development) -- Matt Sealey <matt@genesi-usa.com> Genesi, Manager, Developer Relations ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-06-25 18:01 ` Clifford Wolf 2007-06-25 21:20 ` Matt Sealey @ 2007-07-04 9:05 ` Clifford Wolf 2007-07-04 10:11 ` Clemens Koller ` (2 more replies) 1 sibling, 3 replies; 19+ messages in thread From: Clifford Wolf @ 2007-07-04 9:05 UTC (permalink / raw) To: linuxppc-embedded Hi, On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote: > I've put a 'draft header file' of an api as I would have expected it > online: [...] Ok, so here comes the first implementation: (I also have other projects, so it took a while.. ;-) http://www.clifford.at/priv/dmatransfer-20070704.diff This is just for the MPC8349 DMA now, registers are still hardcoded in the driver instead of being taken from the platform files and support for scatter-gather is still missing and the Kconfig integration isn't checking if we are building for the mpc8349 (or even ppc) yet. But I think the direction of the API is pretty clear. The patch also contains a hackish demo client (dma_demo_client.ko) which is performing some dma transfers in the 256th MB of physical memory. So it should only be used on a machine with 256MB of memory booted with mem=255M (but changing that should be trivial). The demo client shows well how the API works and how much overhead the API adds. Any feedback this time? yours, - clifford -- #!/usr/bin/perl $p="1"x1002;for$c(2..1000){if($p=~/^.{$c}0/){next;};printf"%3d\%s", $c,++$x%14?" ":"\n";while($p=~s/^((.{$c})+)1/${1}0/){}}$_="lPSFZQ". "SJNFTZBUZ<IUUQ:;;XXX.DMJGGPSE.BU;QSJNF;>\n";y:B-Zl;:a-x M/:;print; ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-04 9:05 ` Clifford Wolf @ 2007-07-04 10:11 ` Clemens Koller 2007-07-07 5:24 ` Timur Tabi 2007-07-07 13:08 ` Arnd Bergmann 2 siblings, 0 replies; 19+ messages in thread From: Clemens Koller @ 2007-07-04 10:11 UTC (permalink / raw) To: Clifford Wolf; +Cc: linuxppc-embedded Clifford Wolf schrieb: > On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote: >> I've put a 'draft header file' of an api as I would have expected it >> online: [...] > > Ok, so here comes the first implementation: > (I also have other projects, so it took a while.. ;-) > > http://www.clifford.at/priv/dmatransfer-20070704.diff > > This is just for the MPC8349 DMA now, registers are still hardcoded in the > driver instead of beeing taken from the platform files and support for > scatter-gather is still missing and the Kconfig integration isn't checking > if we are building for the mpc8349 (or even ppc) yet. But I think the > direction of the API is pretty clear. That looks good. It should be useful on other PowerQUICC's DMA engines and maybe even for the MPC5200 BestComm, too, with some changes. > The patch also contains a hackish demo client (dma_demo_client.ko) which is > performing some dma transfers in the 256th MB of physical memory. So it > should only be used on a machine with 256MB of memory bootet with mem=255M > (but changing that should be trivial). The demo client shows well how the > API works and how much overhead the API adds. > > Any feedback this time? Sorry, I'm currently busy with some hardware design work. But if you want to test some code, I can get you an SSH account on my MPC8540 platform. Best regards, -- Clemens Koller __________________________________ R&D Imaging Devices Anagramm GmbH Rupert-Mayer-Straße 45/1 Linhof Werksgelände D-81379 München Tel.089-741518-50 Fax 089-741518-19 http://www.anagramm-technology.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-04 9:05 ` Clifford Wolf 2007-07-04 10:11 ` Clemens Koller @ 2007-07-07 5:24 ` Timur Tabi 2007-07-07 8:41 ` Clifford Wolf 2007-07-07 13:08 ` Arnd Bergmann 2 siblings, 1 reply; 19+ messages in thread From: Timur Tabi @ 2007-07-07 5:24 UTC (permalink / raw) To: Clifford Wolf; +Cc: linuxppc-embedded Clifford Wolf wrote: > Hi, > > On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote: >> I've put a 'draft header file' of an api as I would have expected it >> online: [...] > > Ok, so here comes the first implementation: > (I also have other projects, so it took a while.. ;-) > > http://www.clifford.at/priv/dmatransfer-20070704.diff Why aren't you following the DMA API in include/linux/dmaengine.h? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-07 5:24 ` Timur Tabi @ 2007-07-07 8:41 ` Clifford Wolf 0 siblings, 0 replies; 19+ messages in thread From: Clifford Wolf @ 2007-07-07 8:41 UTC (permalink / raw) To: Timur Tabi; +Cc: linuxppc-embedded Hi, On Sat, Jul 07, 2007 at 12:24:43AM -0500, Timur Tabi wrote: > Clifford Wolf wrote: >> Hi, >> On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote: >>> I've put a 'draft header file' of an api as I would have expected it >>> online: [...] >> Ok, so here comes the first implementation: >> (I also have other projects, so it took a while.. ;-) >> http://www.clifford.at/priv/dmatransfer-20070704.diff > > Why aren't you following the DMA API in include/linux/dmaengine.h? because dmaengine statically assigns dma channels to drivers. it does that because it is in fact not a generic api but only designed for speeding up TCP using the Intel I/OAT engine. If more drivers tried to use it, one would need to decide at compile time which driver should be the one that may use the dmaengine. My dmatransfer api assigns the dma channels dynamically and there is no limit on how many drivers could use dmatransfer instead of ioremap and memcpy. Also my framework has support for scatter/gather dma, fifo modes and all that stuff (but the mpc8349dma driver doesn't implement it yet). 'Extending' dmaengine in a way so it supports all this would result in replacing every public api function of dmaengine and having a much more complicated backend. The only current dmaengine user is the tcp/ip code which is just a few (less than 10 if i remember correctly) lines of code. Imo dmaengine should be replaced by an entirely different api instead of hacking around its shortcomings. That's why I propose the dmatransfer api. yours, - clifford -- for(var d,i=<>just</>,j=function(){d~=i~(defined(i=next[*],i)?" ":" ");},just,another,SPL,hacker;defined i||({debug d;return 0;});j()); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-04 9:05 ` Clifford Wolf 2007-07-04 10:11 ` Clemens Koller 2007-07-07 5:24 ` Timur Tabi @ 2007-07-07 13:08 ` Arnd Bergmann 2007-07-07 13:27 ` Clifford Wolf ` (2 more replies) 2 siblings, 3 replies; 19+ messages in thread From: Arnd Bergmann @ 2007-07-07 13:08 UTC (permalink / raw) To: linuxppc-embedded; +Cc: Clifford Wolf On Wednesday 04 July 2007, Clifford Wolf wrote: > Ok, so here comes the first implementation: > (I also have other projects, so it took a while.. ;-) > > http://www.clifford.at/priv/dmatransfer-20070704.diff > > This is just for the MPC8349 DMA now, registers are still hardcoded in the > driver instead of beeing taken from the platform files and support for > scatter-gather is still missing and the Kconfig integration isn't checking > if we are building for the mpc8349 (or even ppc) yet. But I think the > direction of the API is pretty clear. Is this a superset of what the dmaengine API can do? If not, I would guess that things can get really confusing for the user. Is it possible to wrap the existing dmaengine code inside of a dmatransfer implementation? You should probably have the authors of the dmaengine on Cc in your next version of the driver, if you intend to replace their code. 
> + > +/*** A dmatransfer is described using input and output chunks ***/ > + > +// set one of this bits in fifo mode an none in linear mode Please follow the common kernel commenting style, which allows /* one-line comment */ /* * multi-line * comment */ /** * identifier - description in kerneldoc format * @name: struct member or function argument * @name2: another one * * long description here */ > +struct dmatransfer > +{ > + struct kref refcount; > + > + // callbacks (may be called in interrupt context) > + dmatransfer_progress_callback_t progress_callback; > + dmatransfer_finish_callback_t finish_callback; > + dmatransfer_release_callback_t release_callback; > + > + // private data from the requestor > + union { > + void *data_voidp; > + __u32 data_u32; > + __u64 data_u64; > + } data; This could simply be an unsigned long, I guess. We tend to use unsigned long for generic data, unless it always is a pointer. > + __u32 flags; > + > + // how many input and output junks > + int n_input_chunks; > + int n_output_chunks; unsigned? > + // these are used internally in dmatransfer_request_* > + enum { > + DMATRANSFER_ASYNC, > + DMATRANSFER_WAITING, > + DMATRANSFER_FINISHED > + } status; > + wait_queue_head_t wq; > + > + // these are used internally in the backends > + struct dmatransfer_backend *backend; > + struct list_head backend_spool; > + void *backend_data_voidp; > + > + // this array first contains all input and output chunk descriptions (in > this order) + struct dmatransfer_chunk chunks[]; > +}; > + > +// This functions may return -EINVAL when the requested transfer can't be > +// done in hardware and DMATRANSFER_DMAONLY is set and -EAGAIN if there > are +// no free DMA channels and DMATRANSFER_NOSPOOL is set for the > transfer. 
+extern int dmatransfer_request_async(struct dmatransfer > *transfer); Put the description in front of the function definition in kerneldoc format > +extern int dmatransfer_request_wait(struct dmatransfer *transfer); > +extern struct dmatransfer *dmatransfer_malloc(int chunks_n, int flags); maybe better dmatransfer_alloc instead of dmatransfer_malloc? To me, malloc sort of implies that it only does a pure allocation, not also the initialization. > +extern void dmatransfer_release(struct kref *kref); There is a recent tendency to always leave out the 'extern' keyword for function declarations. > +/*** Backend drivers register themself using this struct ***/ > + > +enum dmatransfer_quality { > + DMATRANSFER_QUALITY_NO = 0, > + DMATRANSFER_QUALITY_SOFT = 1, > + DMATRANSFER_QUALITY_SPOOL = 2, > + DMATRANSFER_QUALITY_GOOD = 3, > + DMATRANSFER_QUALITY_BEST = 4, > +}; > + > +struct dmatransfer_backend > +{ > + enum dmatransfer_quality (*check)(struct dmatransfer *transfer); > + void (*perform)(struct dmatransfer *transfer); > + > + // performing transfers is 'reading' and unregistering is 'writing' > + struct rw_semaphore unreg_sem; > + > + // internally used by the dmatransfer core > + struct list_head backend_list; > +}; > + > +extern void dmatransfer_finish(struct dmatransfer *transfer); > + > +extern void dmatransfer_register_backend(struct dmatransfer_backend > *backend); + > +// This function may sleep until all pending dma transfers for this > backend +// have been finished. 
> +extern void dmatransfer_unregister_backend(struct dmatransfer > *backend); + > +#endif /* DMATRANSFER_H */ > + > +#include <linux/dmatransfer.h> > +#include <linux/module.h> > +#include <linux/sched.h> > + > +static rwlock_t backend_list_lock = RW_LOCK_UNLOCKED; > +static LIST_HEAD(backend_list); > + > +static int dmatransfer_request_worker(struct dmatransfer *transfer) > +{ > + enum dmatransfer_quality best_quality = DMATRANSFER_QUALITY_NO; > + struct dmatransfer_backend *best_backend = 0; > + int error_ret = -EINVAL; > + struct list_head *p; > + > + // WARNING: > + // There is a non-critical race in this codepath: backend->check() may > return + // that the DMA request can be performed without spooling for two > concurrently + // running dmatransfer_request_worker() instances. If the > backend has only one + // free channel this will result in one request > beeing executed without spooling + // and the other one beeing spooled. So > DMATRANSFER_NOSPOOL does not guarantee + // that the request isn't spooled > - just that we _try_ to avoid spooling. + You appear to have a lot of overly long lines. Try to limit your code to less than 80 characters per line. > +struct dmatransfer *dmatransfer_malloc(int chunks_n, int flags) > +{ > + size_t transfer_size = sizeof(struct dmatransfer) + 2*sizeof(struct > dmatransfer_chunk); + struct dmatransfer *transfer = kmalloc(transfer_size, > flags); > + memset(transfer, 0, transfer_size); Use kzalloc instead of kmalloc+memset. 
> +void dmatransfer_register_backend(struct dmatransfer_backend *backend) > +{ > + unsigned long irqflags; > + > + init_rwsem(&backend->unreg_sem); > + > + write_lock_irqsave(&backend_list_lock, irqflags); > + INIT_LIST_HEAD(&backend->backend_list); > + list_add(&backend->backend_list, &backend_list); > + write_unlock_irqrestore(&backend_list_lock, irqflags); > +} > + > +void dmatransfer_unregister_backend(struct dmatransfer_backend *backend) > +{ > + unsigned long irqflags; > + write_lock_irqsave(&backend_list_lock, irqflags); > + list_del(&backend->backend_list); > + write_unlock_irqrestore(&backend_list_lock, irqflags); > + > + // make sure all pending requests have finished before returning > + down_write(&backend->unreg_sem); > + up_write(&backend->unreg_sem); > +} This usage of rw semaphores looks fishy. > +EXPORT_SYMBOL(dmatransfer_request_async); > +EXPORT_SYMBOL(dmatransfer_request_wait); > + > +EXPORT_SYMBOL(dmatransfer_malloc); > +EXPORT_SYMBOL(dmatransfer_release); > + > +EXPORT_SYMBOL(dmatransfer_finish); > + > +EXPORT_SYMBOL(dmatransfer_register_backend); > +EXPORT_SYMBOL(dmatransfer_unregister_backend); Please put the EXPORT_SYMBOL lines directly below the respective symbol definitions. Also, make them EXPORT_SYMBOL_GPL() or explain why you don't. For new subsystems, we normally don't use non-GPL exports any more. > +#define CHANNEL_N 4 > + > +static LIST_HEAD(spool); > +static struct dmatransfer *spool_current_transfers[CHANNEL_N]; > +static spinlock_t spool_lock = SPIN_LOCK_UNLOCKED; use DEFINE_SPINLOCK instead of assigning to SPIN_LOCK_UNLOCKED. > +#define BIT(_name, _n) (1<<(_n)) I personally don't like such macros. Why don't you just define named constants for the bits in their right positions? 
s/BIT(TE, 7)/DMASR_TE/

> +	iowrite32(transfer->chunks[0].bus_address, reg_map[i]+REG_OFFSET_DMASAR);
> +	iowrite32(transfer->chunks[1].bus_address, reg_map[i]+REG_OFFSET_DMADAR);
> +	iowrite32(transfer->chunks[1].bytecount, reg_map[i]+REG_OFFSET_DMABCR);
> +
> +	iowrite32(BIT(TE, 7) | BIT(EOSI, 1) | BIT(EOCDI, 0), reg_map[i]+REG_OFFSET_DMASR);
> +
> +	dmamr = 0;
> +	dmamr |= BIT(EOTIE, 7);
> +	dmamr |= BIT(CTM, 2);
> +	iowrite32(dmamr, reg_map[i]+REG_OFFSET_DMAMR);
> +
> +	dmamr |= BIT(CS, 0);
> +	iowrite32(dmamr, reg_map[i]+REG_OFFSET_DMAMR);
> +	}
> +}

iowrite32 is currently only well-defined for PCI devices, which I believe
this device is not. Use out_be32 or out_le32 instead.

> +	if ((dmasr & (BIT(TE, 7) | BIT(EOCDI, 0))) != 0) {
> +		if ((dmasr & BIT(TE, 7)) != 0) {
> +			printk(KERN_ERR "MPC8349 DMA Transfer Error on DMA channel #%d.\n", i);
> +			BUG();
> +		}

BUG() may be a little harsh here, especially since you are holding spinlocks.
It would be better to try to recover here, unless you expect actual data
corruption, in which case a full panic() might be more appropriate.

	Arnd <><

^ permalink raw reply	[flat|nested] 19+ messages in thread
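[Editor's note: combining the two review comments above, the channel-start sequence could be rewritten along these lines. This is only a sketch; the DMAMR bit names (DMAMR_EOTIE, DMAMR_CTM, DMAMR_CS) are assumed constants in the same style as the DMASR ones, not definitions from the patch.]

```
out_be32(reg_map[i] + REG_OFFSET_DMASAR, transfer->chunks[0].bus_address);
out_be32(reg_map[i] + REG_OFFSET_DMADAR, transfer->chunks[1].bus_address);
out_be32(reg_map[i] + REG_OFFSET_DMABCR, transfer->chunks[1].bytecount);

/* clear any stale status bits before starting the channel */
out_be32(reg_map[i] + REG_OFFSET_DMASR, DMASR_TE | DMASR_EOSI | DMASR_EOCDI);

dmamr = DMAMR_EOTIE | DMAMR_CTM;
out_be32(reg_map[i] + REG_OFFSET_DMAMR, dmamr);
out_be32(reg_map[i] + REG_OFFSET_DMAMR, dmamr | DMAMR_CS);
```

Note that out_be32() takes (addr, value), the opposite argument order from iowrite32(value, addr).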
* Re: Mem-2-Mem DMA - Generalized API 2007-07-07 13:08 ` Arnd Bergmann @ 2007-07-07 13:27 ` Clifford Wolf 2007-07-07 13:28 ` Arnd Bergmann 2007-07-07 13:34 ` Clifford Wolf 2007-07-11 9:35 ` Clifford Wolf 2 siblings, 1 reply; 19+ messages in thread From: Clifford Wolf @ 2007-07-07 13:27 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-embedded

Hi,

Thanks for your feedback. I will incorporate it into my code and release a
new version on Tuesday when I'm back in the office.

On Sat, Jul 07, 2007 at 03:08:03PM +0200, Arnd Bergmann wrote:
> > +void dmatransfer_unregister_backend(struct dmatransfer_backend *backend)
> > +{
> > +	unsigned long irqflags;
> > +	write_lock_irqsave(&backend_list_lock, irqflags);
> > +	list_del(&backend->backend_list);
> > +	write_unlock_irqrestore(&backend_list_lock, irqflags);
> > +
> > +	// make sure all pending requests have finished before returning
> > +	down_write(&backend->unreg_sem);
> > +	up_write(&backend->unreg_sem);
> > +}
>
> This usage of rw semaphores looks fishy.

yep. do you have a better idea how to implement this easily?

> > +	if ((dmasr & (BIT(TE, 7) | BIT(EOCDI, 0))) != 0) {
> > +		if ((dmasr & BIT(TE, 7)) != 0) {
> > +			printk(KERN_ERR "MPC8349 DMA Transfer Error on DMA channel #%d.\n", i);
> > +			BUG();
> > +		}
>
> BUG() may be a little harsh here, especially since you are holding spinlocks.
> It would be better to try to recover here, unless you expect actual data
> corruption, in which case a full panic() might be more appropriate.

in this way dmatransfer() is designed to be used like memcpy(): you don't
try and check for errors afterwards, you only start a dma transfer if you
know it is ok and can't fail. This codepath triggers if 'the impossible'
happens: a dma transfer actually fails. Since there is no datapath to
communicate this error back to the caller it is almost sure that we have
lost data. I will change that to use panic() instead.

yours,
 - clifford

--
Some languages are designed to solve a problem.
Others are designed to prove a point. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-07 13:27 ` Clifford Wolf @ 2007-07-07 13:28 ` Arnd Bergmann 0 siblings, 0 replies; 19+ messages in thread From: Arnd Bergmann @ 2007-07-07 13:28 UTC (permalink / raw) To: linuxppc-embedded; +Cc: Clifford Wolf

On Saturday 07 July 2007, Clifford Wolf wrote:
> > > +	// make sure all pending requests have finished before returning
> > > +	down_write(&backend->unreg_sem);
> > > +	up_write(&backend->unreg_sem);
> > > +}
> >
> > This usage of rw semaphores looks fishy.
>
> yep. do you have a better idea how to implement this easily?
>

I'd guess what you really want is reference counting. Every request that
gets assigned to a backend should get a kref on that backend and give
that up once it gets released itself. I guess you can then have one more
reference that you get in dmatransfer_register_backend and release in
dmatransfer_unregister_backend. When you release the last reference, you
call complete(), and dmatransfer_unregister_backend() ends with a
wait_for_completion().

	Arnd <><

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-07 13:08 ` Arnd Bergmann 2007-07-07 13:27 ` Clifford Wolf @ 2007-07-07 13:34 ` Clifford Wolf 2007-07-11 9:35 ` Clifford Wolf 2 siblings, 0 replies; 19+ messages in thread From: Clifford Wolf @ 2007-07-07 13:34 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-embedded

Hi again,

On Sat, Jul 07, 2007 at 03:08:03PM +0200, Arnd Bergmann wrote:
> Is this a superset of what the dmaengine API can do? If not, I would
> guess that things can get really confusing for the user.

dmatransfer should replace the dmaengine API. dmaengine has only one
client (the tcp/ip network stack - just a few lines of code) and provides
only one driver (ioatdma). I would have added an ioatdma driver to
dmatransfer but unfortunately I don't have the hardware.

> Is it possible to wrap the existing dmaengine code inside of a
> dmatransfer implementation?

sure - but that wouldn't make much sense. dmaengine gives away exclusive
use rights to dma channels, so if dmatransfer were using dmaengine, no one
else could use it.

> You should probably have the authors of the dmaengine on Cc in
> your next version of the driver, if you intend to replace their
> code.

I'll write them a personal mail to get things rolling as soon as the next
dmatransfer release with the cleanups you suggested and scatter/gather as
well as fifo support is out.

yours,
 - clifford

--
ocaml graphics.cma <( echo 'open Graphics;;open_graph " 640x480"let complex_mul(a,b)(c,d)=(a*.c-.b*.d,a*.d+.b*.c)let complex_add(a,b)(c ,d)=(a+.c,b+.d);;let rec mandel c n=if n>0 then let z=mandel c(n-1) in complex_add(complex_mul z z)c else (0.0,0.0);; for x=0 to 640 do for y=0 to 480 do let c=((float_of_int(x-450))/.200.0,(float_of_int (y-240))/.200.0) in let cabs2(a,b)=(a*.a)+.(b*.b)in if cabs2(mandel c 50)<4.0 then plot x y done done;;read_line()' )

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API 2007-07-07 13:08 ` Arnd Bergmann 2007-07-07 13:27 ` Clifford Wolf 2007-07-07 13:34 ` Clifford Wolf @ 2007-07-11 9:35 ` Clifford Wolf 2 siblings, 0 replies; 19+ messages in thread From: Clifford Wolf @ 2007-07-11 9:35 UTC (permalink / raw) To: linuxppc-embedded, Arnd Bergmann

Hi again,

The updated status of my DMA Transfer API can be found at:
http://www.clifford.at/priv/dmatransfer-20070711.diff

I have incorporated your feedback and would appreciate it if you could have
another look over the updated sourcecode. Changes include:

- Many cosmetic / coding style related changes
- Added MPC8349 Scatter/Gather DMA Support
- Added a 'softdma' driver providing software emulation
- Added DMATRANSFER_MAYFAIL flag, panic() if a DMA Transfer
  without this flag set fails.
- Added /proc/dmatransfer with statistic data

I also have two questions. In my current source there is the following
hack:

--snip--
	wmb();
	dma_cache_wback_inv(sg_list, sizeof(struct chain_desc) * segment_n);

#if 1
	/* FIXME: dma_cache_wback_inv() should have done this already! */
	{
		void *p = sg_list;
		while (p < ((void*)sg_list) + sizeof(struct chain_desc) * segment_n) {
			__asm__ __volatile__ ("dcbf 0,%0" : : "r" (p));
			p += 4;
		}
		__asm__ __volatile__ ("sync");
	}
#endif
--snap--

In my understanding, dma_cache_wback_inv() should do exactly the same
thing as my loop with PowerPC inline assembly. However, if I change the
'#if 1' to '#if 0' in this source it stops working. What am I doing wrong?
What function should I use instead of dma_cache_wback_inv() in this place?

> BUG() may be a little harsh here, especially since you are holding spinlocks.
> It would be better to try to recover here, unless you expect actual data
> corruption, in which case a full panic() might be more appropriate.

So what exactly would be the correct usecase of BUG() over panic()?
My impression was that BUG() would be the right choice for errors caused by
bad programming, while panic() would be for errors caused by bad hardware.

yours,
 - clifford

--
For extra security, this message has been encrypted with double-ROT13.

^ permalink raw reply	[flat|nested] 19+ messages in thread
end of thread, other threads:[~2007-07-11 9:31 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-06-24 19:39 Mem-2-Mem DMA - Generalized API Clifford Wolf 2007-06-24 20:21 ` Arnd Bergmann 2007-06-25 8:03 ` Clifford Wolf 2007-06-25 11:03 ` Matt Sealey 2007-06-25 12:53 ` Clemens Koller 2007-06-25 14:31 ` Matt Sealey 2007-06-25 17:00 ` Olof Johansson 2007-06-25 17:48 ` Clifford Wolf 2007-06-25 18:01 ` Clifford Wolf 2007-06-25 21:20 ` Matt Sealey 2007-07-04 9:05 ` Clifford Wolf 2007-07-04 10:11 ` Clemens Koller 2007-07-07 5:24 ` Timur Tabi 2007-07-07 8:41 ` Clifford Wolf 2007-07-07 13:08 ` Arnd Bergmann 2007-07-07 13:27 ` Clifford Wolf 2007-07-07 13:28 ` Arnd Bergmann 2007-07-07 13:34 ` Clifford Wolf 2007-07-11 9:35 ` Clifford Wolf