* Mem-2-Mem DMA - Generalized API
@ 2007-06-24 19:39 Clifford Wolf
2007-06-24 20:21 ` Arnd Bergmann
0 siblings, 1 reply; 19+ messages in thread
From: Clifford Wolf @ 2007-06-24 19:39 UTC (permalink / raw)
To: linuxppc-embedded
Hi,
I've sent this mail to lkml some days ago:
http://lkml.org/lkml/2007/6/23/27
and got that answer from Clemens Koller pointing to this mailing list:
http://lkml.org/lkml/2007/6/23/128
short summary:
I'm working on an MPC8349E based project and, as some of you might know, this
chip has a four-channel (bus-) memory-to-memory DMA controller.
Unfortunately the Linux kernel is currently lacking a generic interface for such
DMA controllers.
Instead of directly programming the DMA controller in my peripheral driver
I'd prefer implementing a generic API/scheduler and would like to discuss
the implementation details. Such an API would make device drivers more
portable and would allow easy sharing of the DMA controller among many
device drivers.
<end of short summary>
IMO there should be register/unregister functions for the driver(s)
handling the DMA controller(s), as well as a generic function for submitting
DMA transfers.
Since at least the MPC8349E DMA controller has support for scatter/gather
DMA modes and can optionally run source/destination in 8, 16 or 32 FIFO
mode, this function would have a more complicated interface than memcpy().
The function for submitting DMA transfers could be passed a function
pointer (as well as a void data pointer for passing metadata) which is
called in interrupt context as soon as the transfer has been completed.
Clemens pointed out that it could also be useful to additionally get some
kind of progress notification, so that partial data can already be processed
before the entire DMA transfer has been completed.
In addition to the bare-bones 'callback in interrupt context' API I would
also like to have a small wrapper which just sleeps until the transfer has
finished.
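To make the proposed shape concrete, here is a minimal user-space sketch of such an interface. All names here are hypothetical, and the "transfer" completes synchronously; a real driver would complete transfers from the DMA interrupt handler and the blocking wrapper would sleep on a wait queue instead of spinning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request descriptor: completion callback plus an opaque
 * data pointer, roughly as described above. */
struct dma_request {
    const void *src;
    void *dst;
    size_t len;
    void (*on_complete)(struct dma_request *req, void *data);
    void *data;
    int done;
};

/* Stand-in for a real controller: the "transfer" is just a memcpy that
 * completes immediately, then fires the callback (in a real driver the
 * callback would run from the IRQ handler, in interrupt context). */
static int dma_submit(struct dma_request *req)
{
    memcpy(req->dst, req->src, req->len);
    req->done = 1;
    if (req->on_complete)
        req->on_complete(req, req->data);
    return 0;
}

/* The small blocking wrapper: submit, then wait until the transfer is
 * done (this busy-wait would be a wait_event() in kernel context). */
static int dma_transfer_wait(struct dma_request *req)
{
    int ret = dma_submit(req);
    while (!ret && !req->done)
        ;
    return ret;
}
```

The point of the sketch is only the division of labor: one non-blocking entry point that takes a callback plus metadata pointer, and a thin synchronous wrapper layered on top of it.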
I would be very interested in your feedback, thoughts and wishes regarding
such an API, to avoid rewrite cycles later on.
yours,
- clifford
--
"The generation of random numbers is too important to be left to chance."
- Robert R. Coveyou, Oak Ridge National Laboratory.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Mem-2-Mem DMA - Generalized API
2007-06-24 19:39 Mem-2-Mem DMA - Generalized API Clifford Wolf
@ 2007-06-24 20:21 ` Arnd Bergmann
2007-06-25 8:03 ` Clifford Wolf
2007-06-25 11:03 ` Matt Sealey
0 siblings, 2 replies; 19+ messages in thread
From: Arnd Bergmann @ 2007-06-24 20:21 UTC (permalink / raw)
To: linuxppc-embedded
On Sunday 24 June 2007, Clifford Wolf wrote:
> I'm working on an MPC8349E based project and as some of you might know this
> chip has a four channel (bus-) memory-to-memory DMA controller.
>
> Unfortunately the Linux kernel is currently lacking a generic interface for such
> DMA controllers.
So what's wrong with the include/linux/dmaengine.h API? I thought it was
designed to cover this sort of DMA controller?
Arnd <><
* Re: Mem-2-Mem DMA - Generalized API
2007-06-24 20:21 ` Arnd Bergmann
@ 2007-06-25 8:03 ` Clifford Wolf
2007-06-25 11:03 ` Matt Sealey
1 sibling, 0 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-06-25 8:03 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: linuxppc-embedded
Hi,
On Sun, Jun 24, 2007 at 10:21:57PM +0200, Arnd Bergmann wrote:
> On Sunday 24 June 2007, Clifford Wolf wrote:
> > I'm working on an MPC8349E based project and as some of you might know this
> > chip has a four channel (bus-) memory-to-memory DMA controller.
> >
> > Unfortunately the Linux kernel is currently lacking a generic interface for such
> > DMA controllers.
>
> So what's wrong with the include/linux/dmaengine.h API? I thought it was
> designed to cover this sort of DMA controller?
Nothing. I was just too blind to find it... ;-)
Though there are some points:
At first glance it seems this API does not support scatter/gather
and FIFO mode, right? In fact that's no problem at all for my project, but
it would be a pity to lose that hardware functionality because of the API.
I have also had a quick look at the ioatdma driver and it appears to me that
it can only operate on address regions which are visible on the PCI bus.
The MPC8349E DMA can operate on everything which is visible on the coherent
local bus, i.e. everything that is also visible to the CPU. There seems to
be no way to specify the bus a DMA channel is needed for when requesting a
channel through this interface.
It also appears to me that the dmaengine.h API is not capable of
overcommitting, i.e. assigning a small pool of DMA channels to a big pool of
drivers in the hope that not all of the drivers are doing DMA transfers at
the same time (and scheduling transfers if this assumption turns out to be
wrong).
Wouldn't it be better to let the backend handle things like binding DMA
channels to specific CPUs, and let the user just submit DMA requests which
are then scheduled to the DMA channel that fits the need best (or performed
by the CPU if no DMA channel exists which would be capable of doing this kind
of transfer)?
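A toy model of that kind of dynamic assignment with overcommit, in user-space C with invented names: a request is started on a free channel when one exists and spooled otherwise, and each completion pulls the next spooled request onto the freed channel.

```c
#include <assert.h>
#include <stddef.h>

#define NCHANNELS 2
#define MAXQUEUE  8

struct xfer { int started; int finished; };

static struct xfer *channel[NCHANNELS];  /* transfer running per channel */
static struct xfer *queue[MAXQUEUE];     /* spooled (overcommitted) requests */
static int qhead, qtail;

/* Try to start a transfer on a free channel; spool it otherwise. */
static void submit(struct xfer *x)
{
    for (int i = 0; i < NCHANNELS; i++) {
        if (!channel[i]) {
            channel[i] = x;
            x->started = 1;
            return;
        }
    }
    queue[qtail % MAXQUEUE] = x;  /* all channels busy: spool the request */
    qtail++;
}

/* Called when a channel's transfer completes (the IRQ handler in real
 * life): mark it done and start the next spooled request on the channel. */
static void complete(int ch)
{
    channel[ch]->finished = 1;
    channel[ch] = NULL;
    if (qhead != qtail) {
        struct xfer *next = queue[qhead % MAXQUEUE];
        qhead++;
        channel[ch] = next;
        next->started = 1;
    }
}
```

This is exactly the scheduling that a statically assigned channel-per-driver model cannot express: more clients than channels, with the backend deciding who runs when.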
yours,
- clifford
--
"The generation of random numbers is too important to be left to chance."
- Robert R. Coveyou, Oak Ridge National Laboratory.
* Re: Mem-2-Mem DMA - Generalized API
2007-06-24 20:21 ` Arnd Bergmann
2007-06-25 8:03 ` Clifford Wolf
@ 2007-06-25 11:03 ` Matt Sealey
2007-06-25 12:53 ` Clemens Koller
1 sibling, 1 reply; 19+ messages in thread
From: Matt Sealey @ 2007-06-25 11:03 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: linuxppc-embedded
IOAT and Intel's DMA engine driver is very IOAT-specific in places...
I had a peek at it as I have a little interest in the concept; at least the
two platforms Genesi has been supporting (Pegasos and Efika) have quite
competent DMA engines which are woefully underused (i.e. not at all).
There exists a Marvell DMA driver somewhere (I have a copy; someone on
this list posted it about a year ago), and while the MPC5200B doesn't have
explicit support for DMA from memory to memory, memory to SRAM
might work in chunks, or memory to a FIFO wired as a loopback like in
the docs..??
There is so much you can do with most SoC DMA controllers, and it's not
even limited to PowerPC (most ARM/XScale SoCs have very capable devices
inside too). I can only imagine that nobody got excited over IOAT because
the entire programming interface stinks of "offloading gigabit ethernet"
and not much else.
--
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations
Arnd Bergmann wrote:
> On Sunday 24 June 2007, Clifford Wolf wrote:
>> I'm working on an MPC8349E based project and as some of you might know this
>> chip has a four channel (bus-) memory-to-memory DMA controller.
>>
>> Unfortunately the linux kernel is atm lacking a generic interface for such
>> DMA controllers.
>
> So what's wrong with the include/linux/dmaengine.h API? I thought it was
> designed to cover this sort of DMA controller?
>
> Arnd <><
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 11:03 ` Matt Sealey
@ 2007-06-25 12:53 ` Clemens Koller
2007-06-25 14:31 ` Matt Sealey
0 siblings, 1 reply; 19+ messages in thread
From: Clemens Koller @ 2007-06-25 12:53 UTC (permalink / raw)
To: Matt Sealey; +Cc: Arnd Bergmann, linuxppc-embedded
Hello, Matt!
Matt Sealey schrieb:
> IOAT and Intel's DMA engine driver is very IOAT specific in places..
>
> I had a peek at it as I have a little interest in the concept; at least the
> two platforms Genesi has been supporting (Pegasos and Efika) have quite
> competent DMA engines which are woefully underused (i.e. not at all).
True.
> There exists a Marvell DMA driver somewhere (I have a copy, someone on
> this list posted it about a year ago) and while the MPC5200B doesn't have
> explicit support for DMA from memory to memory (although memory to SRAM
> might work in chunks, or memory to a FIFO wired as a loopback like in
> the docs..??)
>
> There is so much you can do with most SoC DMA controllers, and it's not
> even limited to PowerPC (most ARM/XScale SoCs have very capable devices
> inside too). I can only imagine that nobody got excited over IOAT because
> the entire programming interface stinks of "offloading gigabit ethernet"
> and not much else.
The main question remains: is it possible to have a flexible cross-platform
DMA API which handles even complex requests and does scheduling, prioritizing,
queuing, locking, (re-)building/caching of SG lists... automagically?
It could fall back to the CPU's memcpy if the hardware doesn't have
the ability to use the DMA machine because all channels are already busy,
or the requested memory isn't DMAable, or the request is just too small
for setting up a DMA channel to make sense.
Filling memory with zero is also a simple task for a DMA engine.
(Thinking about malloc() and memset())
The problem is IMHO similar to video acceleration. Within the
Xorg's XAA/EXA/whatever framework, the drivers accelerate certain
calls if the hardware has the capability to do so. Other calls fall back
to some default non accelerated memcpy() & friends.
Sounds like a lot of fun... replacing kernel's and libc's memcpy() with
memcpy_with_dma_if_possible(). :-)
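A rough sketch of that fallback idea (the helper names are invented, and the threshold is a made-up placeholder, not a measured number): use the engine only when a channel is available and the copy is large enough to amortize the setup cost, otherwise fall back to a plain memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DMA_MIN_BYTES 4096  /* below this, setup cost outweighs any win */

/* Stub for "is a hardware channel free and is this memory DMAable?" */
static int dma_channel_available(const void *dst, const void *src, size_t n)
{
    (void)dst; (void)src; (void)n;
    return 0;  /* pretend the engine is busy, forcing the fallback path */
}

/* Stand-in for programming the controller and waiting for completion. */
static int dma_hw_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* The hybrid copy Clemens suggests: DMA if possible, CPU otherwise. */
static void *memcpy_with_dma_if_possible(void *dst, const void *src, size_t n)
{
    if (n >= DMA_MIN_BYTES && dma_channel_available(dst, src, n))
        dma_hw_copy(dst, src, n);
    else
        memcpy(dst, src, n);  /* CPU fallback */
    return dst;
}
```

Callers keep the familiar memcpy() contract either way; the decision of *how* the bytes move stays entirely inside the helper.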
Best regards,
--
Clemens Koller
__________________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Straße 45/1
Linhof Werksgelände
D-81379 München
Tel.089-741518-50
Fax 089-741518-19
http://www.anagramm-technology.com
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 12:53 ` Clemens Koller
@ 2007-06-25 14:31 ` Matt Sealey
2007-06-25 17:00 ` Olof Johansson
2007-06-25 18:01 ` Clifford Wolf
0 siblings, 2 replies; 19+ messages in thread
From: Matt Sealey @ 2007-06-25 14:31 UTC (permalink / raw)
To: Clemens Koller; +Cc: Arnd Bergmann, linuxppc-embedded
Clemens Koller wrote:
> Hello, Matt!
>
>> There is so much you can do with most SoC DMA controllers, and it's not
>> even limited to PowerPC (most ARM/XScale SoCs have very capable devices
>> inside too). I can only imagine that nobody got excited over IOAT because
>> the entire programming interface stinks of "offloading gigabit ethernet"
>> and not much else.
>
> The main question remains: Is it possible to have a flexible cross platform
> DMA API which handles even complex requests and does scheduling,
> prioritizing, queuing, locking, (re-)building/caching of SG lists... automagically?
I would think so. I think there is a fairly generic example in many parts
of the Linux kernel. Dare I say the Via Unichrome AGP subsystem? And a
bunch of the ARM/OMAP platforms..? A lot of the code is even identical,
I wonder why it isn't some library rather than platform drivers.
> Filling memory with zero is also a simple task for a DMA engine.
> (Thinking about malloc() and memset())
Also xor and logical operations, byte swapping huge chunks of data, that
kind of thing. Most DMA engines in SoCs have cute features like that. I
think BestComm can even calculate CRCs for IP packets.
> The problem is IMHO similar to video acceleration. Within the
> Xorg's XAA/EXA/whatever framework, the drivers accelerate certain
> calls if the hardware has the capability to do so. Other calls fall back
> to some default non accelerated memcpy() & friends.
>
> Sounds like a lot of fun... replacing kernel's and libc's memcpy() with
> memcpy_with_dma_if_possible(). :-)
Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise
it (it should be possible) or simply add to it, or if making a powerpc
specific dma engine abstraction would be an easier idea.
Probably the latter, to be merged with the former at a later date, would
be easier to manage. Take inspiration, but don't be bound by Intel's
weird "new" (i.e. 15-year-old) concept?
--
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 14:31 ` Matt Sealey
@ 2007-06-25 17:00 ` Olof Johansson
2007-06-25 17:48 ` Clifford Wolf
2007-06-25 18:01 ` Clifford Wolf
1 sibling, 1 reply; 19+ messages in thread
From: Olof Johansson @ 2007-06-25 17:00 UTC (permalink / raw)
To: Matt Sealey; +Cc: Arnd Bergmann, linuxppc-embedded
On Mon, Jun 25, 2007 at 03:31:59PM +0100, Matt Sealey wrote:
>
> Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise
> it (it should be possible) or simply add to it, or if making a powerpc
> specific dma engine abstraction would be an easier idea.
It's hard to anticipate all possible uses of a framework when you first
write it. So, when you first write it with one device in mind, it's
fairly obvious that it might not fit well with the second device that
will use it. That's the case with drivers/dma and the IOAT-based dmaengine
framework today. Expand it, extend it, fix it and improve it. Don't
duplicate, re-invent and re-implement.
-Olof
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 17:00 ` Olof Johansson
@ 2007-06-25 17:48 ` Clifford Wolf
0 siblings, 0 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-06-25 17:48 UTC (permalink / raw)
To: Olof Johansson; +Cc: Arnd Bergmann, linuxppc-embedded
Hi,
On Mon, Jun 25, 2007 at 12:00:03PM -0500, Olof Johansson wrote:
> That's the case with the dma engine framework today. Expand it, extend
> it, fix it and improve it. Don't duplicate, re-invent and re-implement.
I'm not sure if I can agree with that.
The core idea behind the dma engine framework seems to be to statically
assign DMA channels to device drivers. I think that the channels should
be assigned dynamically.
IMO writing an alternative implementation is much easier than hacking that
into the existing framework, especially since the existing framework has
only one backend driver and only one user.
yours,
- clifford
--
/* You are not expected to understand this */
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 14:31 ` Matt Sealey
2007-06-25 17:00 ` Olof Johansson
@ 2007-06-25 18:01 ` Clifford Wolf
2007-06-25 21:20 ` Matt Sealey
2007-07-04 9:05 ` Clifford Wolf
1 sibling, 2 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-06-25 18:01 UTC (permalink / raw)
To: Matt Sealey; +Cc: Arnd Bergmann, linuxppc-embedded
Hi,
On Mon, Jun 25, 2007 at 03:31:59PM +0100, Matt Sealey wrote:
> > The main question remains: Is it possible to have a flexible cross platform
> > DMA API which handles even complex requests and does scheduling,
> > prioritizing, queuing, locking, (re-)building/caching of SG lists... automagically?
>
> I would think so. I think there is a fairly generic example in many parts
> of the Linux kernel. Dare I say the Via Unichrome AGP subsystem? And a
> bunch of the ARM/OMAP platforms..? A lot of the code is even identical,
> I wonder why it isn't some library rather than platform drivers.
I've put a 'draft header file' of an API as I would have expected it
online:
http://www.clifford.at/priv/dmatransfer.h
I'd love to hear your feedback on it.
One issue I'm absolutely not sure about at the moment is the different buses
and their address spaces. The design in the header file works directly on
'bus addresses' (the thing accessible through /dev/mem). Does anyone know a
case where this may be insufficient?
> > Filling memory with zero is also a simple task for a DMA engine.
> > (Thinking about malloc() and memset())
>
> Also xor and logical operations, byte swapping huge chunks of data, that
> kind of thing. Most DMA engines in SoCs have cute features like that. I
> think BestComm can even calculate CRCs for IP packets.
I haven't added it yet, but such things could be encoded using the
DMATRANSFER_CHUNK_* and DMATRANSFER_* flags.
However, I don't think that implementing stuff like memset() in a DMA
controller is any good, because it would just flood the memory bus, which
would in 99% of all cases block the CPU until the DMA is finished.
It would however cost less power than doing it on the CPU. ;-)
> Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise
> it (it should be possible) or simply add to it, or if making a powerpc
> specific dma engine abstraction would be an easier idea.
I don't think that this would actually be powerpc-specific in any way. But
since such general-purpose DMA controllers are more common in embedded
hardware, this still seems to be the right place to discuss the issue.
yours,
- clifford
--
Relax, its only ONES and ZEROS!
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 18:01 ` Clifford Wolf
@ 2007-06-25 21:20 ` Matt Sealey
2007-07-04 9:05 ` Clifford Wolf
1 sibling, 0 replies; 19+ messages in thread
From: Matt Sealey @ 2007-06-25 21:20 UTC (permalink / raw)
To: Clifford Wolf; +Cc: Arnd Bergmann, linuxppc-embedded
Clifford Wolf wrote:
> Hi,
>
> However, i don't think that implementing stuff like memset() in a dma
> controller is any good because that would just flood the memory bus which
> would then block in 99% of all cases the cpu until the dma is finished.
>
> It would however cost less power than doing that in the CPU. ;-)
At least while the DMA transfer is happening, you could preempt to some
other task. Would it flood the memory bus? When a DMA transfer happens
would it really do it in such a way that it would stall the CPU on a
memory access far more than it would usually?
I think it would have to be extremely badly designed to be even ABLE to do
that, or at least, you'd be doing some incredibly unwise things to be able
to flood it like that.
>> Indeed. I wonder if we could pull apart the IOAT/DMA stuff and genericise
>> it (it should be possible) or simply add to it, or if making a powerpc
>> specific dma engine abstraction would be an easier idea.
>
> I don't think that this would actually be powerpc specific in any way. But
> since such general purpose dma controllers are more common in embedded
> hardware this still seams to be the right place to discuss the issue.
I meant powerpc platform (as in ARCH=powerpc) specific. Rather than dropping
it in the global drivers, just keep it as a library for powerpc. Everyone
else can get it later with a move into the full tree. As long as the headers
have common, easy to get to names that do not conflict with anything
preexisting, it would not affect anything.
Taking IOAT as an example and fixing its weirdness would be a better start
than making a whole new API. But doing development *WITH* IOAT means
potentially trashing all... umm... 4 of its users, plus the weird Linux
kernel tripling of development cost when heavily updating an actively
maintained subsystem that everyone else wants to touch; that would be
detrimental. We don't want to break Intel, and we don't want to be tracking
Intel's patches or having extra weirdness break in (or for the number of
users of that DMA system to explode underneath New DMA Development).
--
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations
* Re: Mem-2-Mem DMA - Generalized API
2007-06-25 18:01 ` Clifford Wolf
2007-06-25 21:20 ` Matt Sealey
@ 2007-07-04 9:05 ` Clifford Wolf
2007-07-04 10:11 ` Clemens Koller
` (2 more replies)
1 sibling, 3 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-07-04 9:05 UTC (permalink / raw)
To: linuxppc-embedded
Hi,
On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote:
> I've put a 'draft header file' of an api as I would have expected it
> online: [...]
Ok, so here comes the first implementation:
(I also have other projects, so it took a while.. ;-)
http://www.clifford.at/priv/dmatransfer-20070704.diff
This is just for the MPC8349 DMA for now: registers are still hardcoded in
the driver instead of being taken from the platform files, support for
scatter-gather is still missing, and the Kconfig integration isn't checking
whether we are building for the mpc8349 (or even ppc) yet. But I think the
direction of the API is pretty clear.
The patch also contains a hackish demo client (dma_demo_client.ko) which
performs some DMA transfers in the 256th MB of physical memory. So it
should only be used on a machine with 256MB of memory booted with mem=255M
(but changing that should be trivial). The demo client shows well how the
API works and how much overhead the API adds.
Any feedback this time?
yours,
- clifford
--
#!/usr/bin/perl
$p="1"x1002;for$c(2..1000){if($p=~/^.{$c}0/){next;};printf"%3d\%s",
$c,++$x%14?" ":"\n";while($p=~s/^((.{$c})+)1/${1}0/){}}$_="lPSFZQ".
"SJNFTZBUZ<IUUQ:;;XXX.DMJGGPSE.BU;QSJNF;>\n";y:B-Zl;:a-x M/:;print;
* Re: Mem-2-Mem DMA - Generalized API
2007-07-04 9:05 ` Clifford Wolf
@ 2007-07-04 10:11 ` Clemens Koller
2007-07-07 5:24 ` Timur Tabi
2007-07-07 13:08 ` Arnd Bergmann
2 siblings, 0 replies; 19+ messages in thread
From: Clemens Koller @ 2007-07-04 10:11 UTC (permalink / raw)
To: Clifford Wolf; +Cc: linuxppc-embedded
Clifford Wolf schrieb:
> On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote:
>> I've put a 'draft header file' of an api as I would have expected it
>> online: [...]
>
> Ok, so here comes the first implementation:
> (I also have other projects, so it took a while.. ;-)
>
> http://www.clifford.at/priv/dmatransfer-20070704.diff
>
> This is just for the MPC8349 DMA now, registers are still hardcoded in the
> driver instead of being taken from the platform files and support for
> scatter-gather is still missing and the Kconfig integration isn't checking
> if we are building for the mpc8349 (or even ppc) yet. But I think the
> direction of the API is pretty clear.
That looks good. It should be useful on other PowerQUICC's DMA engines
and maybe even for the MPC5200 BestComm, too, with some changes.
> The patch also contains a hackish demo client (dma_demo_client.ko) which is
> performing some dma transfers in the 256th MB of physical memory. So it
> should only be used on a machine with 256MB of memory booted with mem=255M
> (but changing that should be trivial). The demo client shows well how the
> API works and how much overhead the API adds.
>
> Any feedback this time?
Sorry, I'm currently busy with some hardware design work.
But if you want to test some code, I can get you an SSH account on my
MPC8540 platform.
Best regards,
--
Clemens Koller
__________________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Straße 45/1
Linhof Werksgelände
D-81379 München
Tel.089-741518-50
Fax 089-741518-19
http://www.anagramm-technology.com
* Re: Mem-2-Mem DMA - Generalized API
2007-07-04 9:05 ` Clifford Wolf
2007-07-04 10:11 ` Clemens Koller
@ 2007-07-07 5:24 ` Timur Tabi
2007-07-07 8:41 ` Clifford Wolf
2007-07-07 13:08 ` Arnd Bergmann
2 siblings, 1 reply; 19+ messages in thread
From: Timur Tabi @ 2007-07-07 5:24 UTC (permalink / raw)
To: Clifford Wolf; +Cc: linuxppc-embedded
Clifford Wolf wrote:
> Hi,
>
> On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote:
>> I've put a 'draft header file' of an api as I would have expected it
>> online: [...]
>
> Ok, so here comes the first implementation:
> (I also have other projects, so it took a while.. ;-)
>
> http://www.clifford.at/priv/dmatransfer-20070704.diff
Why aren't you following the DMA API in include/linux/dmaengine.h?
* Re: Mem-2-Mem DMA - Generalized API
2007-07-07 5:24 ` Timur Tabi
@ 2007-07-07 8:41 ` Clifford Wolf
0 siblings, 0 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-07-07 8:41 UTC (permalink / raw)
To: Timur Tabi; +Cc: linuxppc-embedded
Hi,
On Sat, Jul 07, 2007 at 12:24:43AM -0500, Timur Tabi wrote:
> Clifford Wolf wrote:
>> Hi,
>> On Mon, Jun 25, 2007 at 08:01:10PM +0200, Clifford Wolf wrote:
>>> I've put a 'draft header file' of an api as I would have expected it
>>> online: [...]
>> Ok, so here comes the first implementation:
>> (I also have other projects, so it took a while.. ;-)
>> http://www.clifford.at/priv/dmatransfer-20070704.diff
>
> Why aren't you following the DMA API in include/linux/dmaengine.h?
Because dmaengine statically assigns DMA channels to drivers. It does that
because it is in fact not a generic API, but designed only for speeding up
TCP using the Intel I/OAT engine. If more drivers tried to use it, one
would need to decide at compile time which driver should be the one that
may use the dmaengine.
My dmatransfer API assigns the DMA channels dynamically, and there is no
limit on how many drivers could use dmatransfer instead of ioremap and memcpy.
Also, my framework has support for scatter/gather DMA, FIFO modes and all
that stuff (but the mpc8349dma driver doesn't implement it yet).
'Extending' dmaengine in a way that supports all this would mean
replacing every public API function of dmaengine and having a much more
complicated backend. The only current dmaengine user is the TCP/IP code,
which is just a few (fewer than 10, if I remember correctly) lines of code.
IMO dmaengine should be replaced by an entirely different API instead of
hacking around its shortcomings. That's why I propose the dmatransfer API.
yours,
- clifford
--
for(var d,i=<>just</>,j=function(){d~=i~(defined(i=next[*],i)?" ":"
");},just,another,SPL,hacker;defined i||({debug d;return 0;});j());
* Re: Mem-2-Mem DMA - Generalized API
2007-07-04 9:05 ` Clifford Wolf
2007-07-04 10:11 ` Clemens Koller
2007-07-07 5:24 ` Timur Tabi
@ 2007-07-07 13:08 ` Arnd Bergmann
2007-07-07 13:27 ` Clifford Wolf
` (2 more replies)
2 siblings, 3 replies; 19+ messages in thread
From: Arnd Bergmann @ 2007-07-07 13:08 UTC (permalink / raw)
To: linuxppc-embedded; +Cc: Clifford Wolf
On Wednesday 04 July 2007, Clifford Wolf wrote:
> Ok, so here comes the first implementation:
> (I also have other projects, so it took a while.. ;-)
>
> http://www.clifford.at/priv/dmatransfer-20070704.diff
>
> This is just for the MPC8349 DMA now, registers are still hardcoded in the
> driver instead of being taken from the platform files and support for
> scatter-gather is still missing and the Kconfig integration isn't checking
> if we are building for the mpc8349 (or even ppc) yet. But I think the
> direction of the API is pretty clear.
Is this a superset of what the dmaengine API can do? If not, I would
guess that things can get really confusing for the user.
Is it possible to wrap the existing dmaengine code inside of a
dmatransfer implementation?
You should probably have the authors of the dmaengine on Cc in
your next version of the driver, if you intend to replace their
code.
> +
> +/*** A dmatransfer is described using input and output chunks ***/
> +
> +// set one of these bits in fifo mode and none in linear mode
Please follow the common kernel commenting style, which allows
/* one-line comment */
/*
* multi-line
* comment
*/
/**
* identifier - description in kerneldoc format
* @name: struct member or function argument
* @name2: another one
*
* long description here
*/
> +struct dmatransfer
> +{
> + struct kref refcount;
> +
> + // callbacks (may be called in interrupt context)
> + dmatransfer_progress_callback_t progress_callback;
> + dmatransfer_finish_callback_t finish_callback;
> + dmatransfer_release_callback_t release_callback;
> +
> + // private data from the requestor
> + union {
> + void *data_voidp;
> + __u32 data_u32;
> + __u64 data_u64;
> + } data;
This could simply be an unsigned long, I guess. We tend to
use unsigned long for generic data, unless it always is a pointer.
> + __u32 flags;
> +
> + // how many input and output chunks
> + int n_input_chunks;
> + int n_output_chunks;
unsigned?
> + // these are used internally in dmatransfer_request_*
> + enum {
> + DMATRANSFER_ASYNC,
> + DMATRANSFER_WAITING,
> + DMATRANSFER_FINISHED
> + } status;
> + wait_queue_head_t wq;
> +
> + // these are used internally in the backends
> + struct dmatransfer_backend *backend;
> + struct list_head backend_spool;
> + void *backend_data_voidp;
> +
> + // this array first contains all input and output chunk descriptions
> + // (in this order)
> + struct dmatransfer_chunk chunks[];
> +};
> +
> +// This function may return -EINVAL when the requested transfer can't be
> +// done in hardware and DMATRANSFER_DMAONLY is set, and -EAGAIN if there
> +// are no free DMA channels and DMATRANSFER_NOSPOOL is set for the transfer.
> +extern int dmatransfer_request_async(struct dmatransfer *transfer);
Put the description in front of the function definition in kerneldoc format
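Applied to the declaration quoted above, the kerneldoc version might read like this (the wording is mine, derived from the original comment, so treat it as a sketch rather than the actual patch):

```c
struct dmatransfer;  /* full definition lives in dmatransfer.h */

/**
 * dmatransfer_request_async - submit a DMA transfer without blocking
 * @transfer: fully initialized transfer descriptor
 *
 * Returns 0 on success, -EINVAL when the requested transfer cannot be done
 * in hardware and DMATRANSFER_DMAONLY is set, or -EAGAIN when there are no
 * free DMA channels and DMATRANSFER_NOSPOOL is set for the transfer.
 */
int dmatransfer_request_async(struct dmatransfer *transfer);
```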
> +extern int dmatransfer_request_wait(struct dmatransfer *transfer);
> +extern struct dmatransfer *dmatransfer_malloc(int chunks_n, int flags);
maybe better dmatransfer_alloc instead of dmatransfer_malloc?
To me, malloc sort of implies that it only does a pure allocation,
not also the initialization.
> +extern void dmatransfer_release(struct kref *kref);
There is a recent tendency to always leave out the 'extern' keyword
for function declarations.
> +/*** Backend drivers register themselves using this struct ***/
> +
> +enum dmatransfer_quality {
> + DMATRANSFER_QUALITY_NO = 0,
> + DMATRANSFER_QUALITY_SOFT = 1,
> + DMATRANSFER_QUALITY_SPOOL = 2,
> + DMATRANSFER_QUALITY_GOOD = 3,
> + DMATRANSFER_QUALITY_BEST = 4,
> +};
> +
> +struct dmatransfer_backend
> +{
> + enum dmatransfer_quality (*check)(struct dmatransfer *transfer);
> + void (*perform)(struct dmatransfer *transfer);
> +
> + // performing transfers is 'reading' and unregistering is 'writing'
> + struct rw_semaphore unreg_sem;
> +
> + // internally used by the dmatransfer core
> + struct list_head backend_list;
> +};
> +
> +extern void dmatransfer_finish(struct dmatransfer *transfer);
> +
> +extern void dmatransfer_register_backend(struct dmatransfer_backend *backend);
> +
> +// This function may sleep until all pending dma transfers for this backend
> +// have been finished.
> +extern void dmatransfer_unregister_backend(struct dmatransfer_backend *backend);
> +
> +#endif /* DMATRANSFER_H */
> +
> +#include <linux/dmatransfer.h>
> +#include <linux/module.h>
> +#include <linux/sched.h>
> +
> +static rwlock_t backend_list_lock = RW_LOCK_UNLOCKED;
> +static LIST_HEAD(backend_list);
> +
> +static int dmatransfer_request_worker(struct dmatransfer *transfer)
> +{
> + enum dmatransfer_quality best_quality = DMATRANSFER_QUALITY_NO;
> + struct dmatransfer_backend *best_backend = 0;
> + int error_ret = -EINVAL;
> + struct list_head *p;
> +
> + // WARNING:
> + // There is a non-critical race in this codepath: backend->check() may
> + // return that the DMA request can be performed without spooling for two
> + // concurrently running dmatransfer_request_worker() instances. If the
> + // backend has only one free channel this will result in one request
> + // being executed without spooling and the other one being spooled. So
> + // DMATRANSFER_NOSPOOL does not guarantee that the request isn't spooled
> + // - just that we _try_ to avoid spooling.
> +
You appear to have a lot of overly long lines. Try to limit your code to less
than 80 characters per line.
> +struct dmatransfer *dmatransfer_malloc(int chunks_n, int flags)
> +{
> + size_t transfer_size = sizeof(struct dmatransfer) +
> + 2*sizeof(struct dmatransfer_chunk);
> + struct dmatransfer *transfer = kmalloc(transfer_size, flags);
> + memset(transfer, 0, transfer_size);
Use kzalloc instead kmalloc+memset.
> +void dmatransfer_register_backend(struct dmatransfer_backend *backend)
> +{
> + unsigned long irqflags;
> +
> + init_rwsem(&backend->unreg_sem);
> +
> + write_lock_irqsave(&backend_list_lock, irqflags);
> + INIT_LIST_HEAD(&backend->backend_list);
> + list_add(&backend->backend_list, &backend_list);
> + write_unlock_irqrestore(&backend_list_lock, irqflags);
> +}
> +
> +void dmatransfer_unregister_backend(struct dmatransfer_backend *backend)
> +{
> + unsigned long irqflags;
> + write_lock_irqsave(&backend_list_lock, irqflags);
> + list_del(&backend->backend_list);
> + write_unlock_irqrestore(&backend_list_lock, irqflags);
> +
> + // make sure all pending requests have finished before returning
> + down_write(&backend->unreg_sem);
> + up_write(&backend->unreg_sem);
> +}
This usage of rw semaphores looks fishy.
> +EXPORT_SYMBOL(dmatransfer_request_async);
> +EXPORT_SYMBOL(dmatransfer_request_wait);
> +
> +EXPORT_SYMBOL(dmatransfer_malloc);
> +EXPORT_SYMBOL(dmatransfer_release);
> +
> +EXPORT_SYMBOL(dmatransfer_finish);
> +
> +EXPORT_SYMBOL(dmatransfer_register_backend);
> +EXPORT_SYMBOL(dmatransfer_unregister_backend);
Please put the EXPORT_SYMBOL lines directly below the respective symbol definitions.
Also, make them EXPORT_SYMBOL_GPL() or explain why you don't. For new subsystems,
we normally don't use non-GPL exports any more.
> +#define CHANNEL_N 4
> +
> +static LIST_HEAD(spool);
> +static struct dmatransfer *spool_current_transfers[CHANNEL_N];
> +static spinlock_t spool_lock = SPIN_LOCK_UNLOCKED;
Use DEFINE_SPINLOCK() instead of assigning SPIN_LOCK_UNLOCKED.
> +#define BIT(_name, _n) (1<<(_n))
I personally don't like such macros. Why don't you just define named
constants for the bits in their right positions?
s/BIT(TE, 7)/DMASR_TE/
> + iowrite32(transfer->chunks[0].bus_address, reg_map[i]+REG_OFFSET_DMASAR);
> + iowrite32(transfer->chunks[1].bus_address, reg_map[i]+REG_OFFSET_DMADAR);
> + iowrite32(transfer->chunks[1].bytecount, reg_map[i]+REG_OFFSET_DMABCR);
> +
> + iowrite32(BIT(TE, 7) | BIT(EOSI, 1) | BIT(EOCDI, 0), reg_map[i]+REG_OFFSET_DMASR);
> +
> + dmamr = 0;
> + dmamr |= BIT(EOTIE, 7);
> + dmamr |= BIT(CTM, 2);
> + iowrite32(dmamr, reg_map[i]+REG_OFFSET_DMAMR);
> +
> + dmamr |= BIT(CS, 0);
> + iowrite32(dmamr, reg_map[i]+REG_OFFSET_DMAMR);
> + }
> +}
iowrite32 is currently only well-defined for PCI devices, which I believe
this device is not. Use out_be32 or out_le32 instead.
> + if ((dmasr & (BIT(TE, 7) | BIT(EOCDI, 0))) != 0) {
> + if ((dmasr & BIT(TE, 7)) != 0) {
> + printk(KERN_ERR "MPC8349 DMA Transfer Error on DMA channel #%d.\n", i);
> + BUG();
> + }
BUG() may be a little harsh here, especially since you are holding spinlocks.
It would be better to try to recover here, unless you expect actual data
corruption, in which case a full panic() might be more appropriate.
Arnd <><
* Re: Mem-2-Mem DMA - Generalized API
2007-07-07 13:08 ` Arnd Bergmann
@ 2007-07-07 13:27 ` Clifford Wolf
2007-07-07 13:28 ` Arnd Bergmann
2007-07-07 13:34 ` Clifford Wolf
2007-07-11 9:35 ` Clifford Wolf
2 siblings, 1 reply; 19+ messages in thread
From: Clifford Wolf @ 2007-07-07 13:27 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: linuxppc-embedded
Hi,
Thanks for your feedback. I will incorporate it into my code and release a new
version on Tuesday when I'm back in the office.
On Sat, Jul 07, 2007 at 03:08:03PM +0200, Arnd Bergmann wrote:
> > +void dmatransfer_unregister_backend(struct dmatransfer_backend *backend)
> > +{
> > + unsigned long irqflags;
> > + write_lock_irqsave(&backend_list_lock, irqflags);
> > + list_del(&backend->backend_list);
> > + write_unlock_irqrestore(&backend_list_lock, irqflags);
> > +
> > + // make sure all pending requests have finished before returning
> > + down_write(&backend->unreg_sem);
> > + up_write(&backend->unreg_sem);
> > +}
>
> This usage of rw semaphores looks fishy.
yep. do you have a better idea how to implement this easily?
> > + if ((dmasr & (BIT(TE, 7) | BIT(EOCDI, 0))) != 0) {
> > + if ((dmasr & BIT(TE, 7)) != 0) {
> > + printk(KERN_ERR "MPC8349 DMA Transfer Error on DMA channel #%d.\n", i);
> > + BUG();
> > + }
>
> BUG() may be a little harsh here, especially since you are holding spinlocks.
> It would be better to try to recover here, unless you expect actual data
> corruption, in which case a full panic() might be more appropriate.
dmatransfer() is designed to be used like memcpy(): you don't check for
errors afterwards, you only start a dma transfer if you know it is ok and
can't fail.
This codepath triggers if 'the impossible' happens: a dma transfer
actually fails. Since there is no datapath to communicate this error back
to the caller it is almost certain that we have lost data. I will change that
to use panic() instead.
yours,
- clifford
--
Some languages are designed to solve a problem. Others are designed to
prove a point.
* Re: Mem-2-Mem DMA - Generalized API
2007-07-07 13:27 ` Clifford Wolf
@ 2007-07-07 13:28 ` Arnd Bergmann
0 siblings, 0 replies; 19+ messages in thread
From: Arnd Bergmann @ 2007-07-07 13:28 UTC (permalink / raw)
To: linuxppc-embedded; +Cc: Clifford Wolf
On Saturday 07 July 2007, Clifford Wolf wrote:
> > > + // make sure all pending requests have finished before returning
> > > + down_write(&backend->unreg_sem);
> > > + up_write(&backend->unreg_sem);
> > > +}
> >
> > This usage of rw semaphores looks fishy.
>
> yep. do you have a better idea how to implement this easily?
>
I'd guess what you really want is reference counting. Every request
that gets assigned to a backend should get a kref on that backend
and give that up once it gets released itself.
I guess you can then have one more reference that you get in
dmatransfer_register_backend and release in dmatransfer_unregister_backend.
When you release the last reference, you call complete(),
and dmatransfer_unregister_backend() ends with a wait_for_completion().
Arnd <><
* Re: Mem-2-Mem DMA - Generalized API
2007-07-07 13:08 ` Arnd Bergmann
2007-07-07 13:27 ` Clifford Wolf
@ 2007-07-07 13:34 ` Clifford Wolf
2007-07-11 9:35 ` Clifford Wolf
2 siblings, 0 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-07-07 13:34 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: linuxppc-embedded
Hi again,
On Sat, Jul 07, 2007 at 03:08:03PM +0200, Arnd Bergmann wrote:
> Is this a superset of what the dmaengine API can do? If not, I would
> guess that things can get really confusing for the user.
dmatransfer should replace the dmaengine API. dmaengine has only one client
(the tcp/ip network stack - just a few lines of code) and provides only one
driver (ioatdma).
I would have added an ioatdma driver to dmatransfer but unfortunately I
don't have the hardware.
> Is it possible to wrap the existing dmaengine code inside of a
> dmatransfer implementation?
sure - but that wouldn't make much sense. dmaengine gives away exclusive
use rights to dma channels. so if dmatransfer is using dmaengine, no one
else could use it.
> You should probably have the authors of the dmaengine on Cc in
> your next version of the driver, if you intend to replace their
> code.
I'll write them a personal mail to get things rolling as soon as the next
dmatransfer release with the cleanups you suggested and scatter/gather as
well as fifo support is out.
yours,
- clifford
--
ocaml graphics.cma <( echo 'open Graphics;;open_graph " 640x480"let
complex_mul(a,b)(c,d)=(a*.c-.b*.d,a*.d+.b*.c)let complex_add(a,b)(c
,d)=(a+.c,b+.d);;let rec mandel c n=if n>0 then let z=mandel c(n-1)
in complex_add(complex_mul z z)c else (0.0,0.0);; for x=0 to 640 do
for y=0 to 480 do let c=((float_of_int(x-450))/.200.0,(float_of_int
(y-240))/.200.0) in let cabs2(a,b)=(a*.a)+.(b*.b)in if cabs2(mandel
c 50)<4.0 then plot x y done done;;read_line()' )
* Re: Mem-2-Mem DMA - Generalized API
2007-07-07 13:08 ` Arnd Bergmann
2007-07-07 13:27 ` Clifford Wolf
2007-07-07 13:34 ` Clifford Wolf
@ 2007-07-11 9:35 ` Clifford Wolf
2 siblings, 0 replies; 19+ messages in thread
From: Clifford Wolf @ 2007-07-11 9:35 UTC (permalink / raw)
To: linuxppc-embedded, Arnd Bergmann
Hi again,
The updated status of my DMA Transfer API can be found at:
http://www.clifford.at/priv/dmatransfer-20070711.diff
I have incorporated your feedback and would appreciate it if you could have
another look over the updated sourcecode. Changes include:
- Many cosmetic / coding style related changes
- Added MPC8349 Scatter/Gather DMA Support
- Added a 'softdma' driver providing software emulation
- Added DMATRANSFER_MAYFAIL flag, panic() if a DMA Transfer without
this flag set fails.
- Added /proc/dmatransfer with statistic data
I also have two questions. In my current source there is the following
hack:
--snip--
wmb();
dma_cache_wback_inv(sg_list, sizeof(struct chain_desc) * segment_n);
#if 1
/* FIXME: dma_cache_wback_inv() should have done this already! */
{
    void *p = sg_list;
    while (p < ((void *)sg_list) +
            sizeof(struct chain_desc) * segment_n) {
        __asm__ __volatile__ ("dcbf 0,%0" : : "r" (p));
        p += 4;
    }
    __asm__ __volatile__ ("sync");
}
#endif
--snap--
In my understanding, dma_cache_wback_inv() should do exactly the same thing
as my loop with PowerPC inline assembly. However, if I change the '#if 1'
to '#if 0' in this source it stops working.
What am I doing wrong? What function should I use instead of
dma_cache_wback_inv() in this place?
> BUG() may be a little harsh here, especially since you are holding spinlocks.
> It would be better to try to recover here, unless you expect actual data
> corruption, in which case a full panic() might be more appropriate.
So what exactly would be the correct use case for BUG() over panic()?
My impression was that BUG() would be the right choice for errors coming
from bad programming while panic() would be for errors caused by bad
hardware.
yours,
- clifford
--
For extra security, this message has been encrypted with double-ROT13.