* cobalt & dma
From: Ran Shalit @ 2015-11-17 7:39 UTC
To: linux-media

Hello,

I intend to use the cobalt driver as a reference for a new PCI V4L2 driver,
which is required to use several inputs simultaneously. For this, cobalt
seems like the best starting point.
read/write streaming will probably be sufficient (at least for the
first debugging).
The configuration in my case is i7 core <-- pci ---> fpga.
I see that the DMA implementation is quite complex, and I would like to
ask for some tips regarding the following DMA-related points:

1. Is it possible to do the read/write without DMA (for debugging, as a start)?
What changes are required for read without DMA (I assume DMA is used
by default in read/write)?
Is it done by using #include <media/videobuf2-vmalloc.h> instead of
#include <media/videobuf2-dma*>?

2. I find it difficult to understand the cobalt_dma_start_streaming()
implementation, which has many cobalt-specific memory writes using
iowrite32().
How can I work out how to implement DMA for my specific platform/device,
and what to implement?

Best Regards,
Ran
* Re: cobalt & dma
From: Hans Verkuil @ 2015-11-17 7:53 UTC
To: Ran Shalit, linux-media

On 11/17/2015 08:39 AM, Ran Shalit wrote:
> Hello,
>
> I intend to use the cobalt driver as a reference for a new PCI V4L2 driver,
> which is required to use several inputs simultaneously. For this, cobalt
> seems like the best starting point.
> read/write streaming will probably be sufficient (at least for the
> first debugging).
> The configuration in my case is i7 core <-- pci ---> fpga.
> I see that the DMA implementation is quite complex, and I would like to
> ask for some tips regarding the following DMA-related points:
>
> 1. Is it possible to do the read/write without DMA (for debugging, as a start)?

No. All video capture/output devices use DMA, since it would be prohibitively
expensive for the CPU to do otherwise. So just dig in and implement it.

> What changes are required for read without DMA (I assume DMA is used
> by default in read/write)?
> Is it done by using #include <media/videobuf2-vmalloc.h> instead of
> #include <media/videobuf2-dma*>?

No. The vmalloc variant is typically used for USB devices. For PCI(e) you'll
use videobuf2-dma-contig if the DMA engine requires physically contiguous memory,
or videobuf2-dma-sg if the DMA engine supports scatter-gather DMA. You can
start with dma-contig since the DMA code tends to be simpler, but it is
harder to get the required physically contiguous memory once memory
fragmentation sets in, so you may not be able to allocate the buffers.
dma-sg works much better with virtual memory.

> 2. I find it difficult to understand the cobalt_dma_start_streaming()
> implementation, which has many cobalt-specific memory writes using
> iowrite32().
> How can I work out how to implement DMA for my specific platform/device,
> and what to implement?

Read include/media/videobuf2-core.h. There is also an LWN article somewhere
(albeit somewhat outdated by now).

Don't expect to write three lines of code and have everything work. You *do*
have to write the code for your DMA hardware; there is no way around that.

Regards,

Hans
* Re: cobalt & dma
From: Ran Shalit @ 2015-11-17 13:15 UTC
To: Hans Verkuil; +Cc: linux-media

On Tue, Nov 17, 2015 at 9:53 AM, Hans Verkuil <hverkuil@xs4all.nl> wrote:
> On 11/17/2015 08:39 AM, Ran Shalit wrote:
>> 1. Is it possible to do the read/write without DMA (for debugging, as a start)?
>
> No. All video capture/output devices use DMA, since it would be prohibitively
> expensive for the CPU to do otherwise. So just dig in and implement it.

Hi,

Does the cobalt, or any other PCI V4L device, have a chip datasheet
available, so that we can reverse engineer it and gain more
understanding of the register reads/writes for the DMA transactions?
I searched, but it seems that the PCIe chip datasheets for these
devices are not available anywhere.

Best Regards,
Ran
* Re: cobalt & dma
From: Steven Toth @ 2015-11-17 13:32 UTC
To: Ran Shalit; +Cc: Hans Verkuil, linux-media

> Does the cobalt, or any other PCI V4L device, have a chip datasheet
> available, so that we can reverse engineer it and gain more
> understanding of the register reads/writes for the DMA transactions?
> I searched, but it seems that the PCIe chip datasheets for these
> devices are not available anywhere.

Generally you wouldn't need it, and I'm not sure having it would help.

Get to grips with the fundamentals and don't worry about cobalt registers.

DMA programming is highly chip specific, but in general terms it's
highly similar in concept on any PCIe controller. Every
driver+controller pair has virtual/physical bus addresses that need to be
understood, scatter-gather lists to create and program into the
h/w, interrupts to service, buffer/transfer completion to identify,
and transfer sizes to manage.

Look hard enough at any of the PCI/E drivers in the media tree and
you'll see each of them implementing their own versions of the above.

--
Steven Toth - Kernel Labs
http://www.kernellabs.com
* Re: cobalt & dma
From: Hans Verkuil @ 2015-11-17 13:54 UTC
To: Ran Shalit; +Cc: linux-media

On 11/17/15 14:15, Ran Shalit wrote:
> Hi,
>
> Does the cobalt, or any other PCI V4L device, have a chip datasheet
> available, so that we can reverse engineer it and gain more
> understanding of the register reads/writes for the DMA transactions?
> I searched, but it seems that the PCIe chip datasheets for these
> devices are not available anywhere.

Sorry, no, it's not publicly available.

But they all work along the same lines: each DMA descriptor has a
PCI DMA address (where the data should be written to in memory), the length
(in bytes) of the DMA transfer, and a pointer to the next DMA descriptor (chaining
descriptors together). Finally, there is some bit to trigger an interrupt when
the full frame has been transferred.

Regards,

Hans
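The descriptor layout Hans describes can be sketched in plain C. This is a userspace illustration only and not the cobalt layout: the struct demo_dma_desc, its field widths, the DEMO_DESC_IRQ flag, and the function demo_build_chain() are all hypothetical names invented for this sketch. A real driver would take the chunk addresses from the scatter-gather table that videobuf2-dma-sg provides and write the descriptors in whatever format the FPGA's DMA engine expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: raise an interrupt once this descriptor completes. */
#define DEMO_DESC_IRQ 0x1u

/* Hypothetical descriptor layout; a real device defines its own. */
struct demo_dma_desc {
	uint64_t dst_addr; /* bus address the device writes the data to */
	uint32_t length;   /* length of this transfer in bytes */
	uint32_t flags;    /* DEMO_DESC_IRQ on the last descriptor of a frame */
	uint64_t next;     /* bus address of the next descriptor, 0 = end of chain */
};

/* One physically contiguous piece of a frame buffer. */
struct demo_sg_chunk {
	uint64_t bus_addr;
	uint32_t length;
};

/*
 * Fill descs[0..n-1] so that they describe one frame spread over n chunks.
 * descs_bus_addr is the bus address at which the device sees descs[0].
 * Returns the total number of bytes covered by the chain.
 */
uint64_t demo_build_chain(struct demo_dma_desc *descs, uint64_t descs_bus_addr,
			  const struct demo_sg_chunk *chunks, size_t n)
{
	uint64_t total = 0;
	size_t i;

	assert(descs != NULL && chunks != NULL && n > 0);

	for (i = 0; i < n; i++) {
		descs[i].dst_addr = chunks[i].bus_addr;
		descs[i].length = chunks[i].length;
		descs[i].flags = 0;
		/* Chain to the descriptor that follows this one in memory. */
		descs[i].next = descs_bus_addr + (i + 1) * sizeof(descs[0]);
		total += chunks[i].length;
	}

	/* The last descriptor ends the chain and signals "frame done". */
	descs[n - 1].next = 0;
	descs[n - 1].flags = DEMO_DESC_IRQ;

	return total;
}
```

With dma-contig a frame would be a single chunk and the chain a single descriptor, which is one reason the DMA code tends to be simpler in that case.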
* Re: cobalt & dma
From: Ran Shalit @ 2015-11-17 21:43 UTC
To: Hans Verkuil; +Cc: linux-media

On Tue, Nov 17, 2015 at 3:54 PM, Hans Verkuil <hverkuil@xs4all.nl> wrote:
> Sorry, no, it's not publicly available.
>
> But they all work along the same lines: each DMA descriptor has a
> PCI DMA address (where the data should be written to in memory), the length
> (in bytes) of the DMA transfer, and a pointer to the next DMA descriptor (chaining
> descriptors together). Finally, there is some bit to trigger an interrupt when
> the full frame has been transferred.

Thank you all very much for all this valuable information!
I must admit that when I look at the source examples, it seems quite
complex (at least much more complex than the drivers I am familiar
with, where most of the time I take a functional example and
work out what to change and how, or write simple drivers...).
If there are any other tips and ideas about debug/testing/development
steps for a PCI V4L device driver, please tell me.

Thank you all very much,
Ran
* Re: cobalt & dma
From: Ran Shalit @ 2015-11-20 14:49 UTC
To: Hans Verkuil; +Cc: linux-media

Hello,

> No. All video capture/output devices use DMA, since it would be prohibitively
> expensive for the CPU to do otherwise. So just dig in and implement it.

I am trying to better understand how the read() operation actually uses
DMA, but I can't yet work it out from the code.

> No. The vmalloc variant is typically used for USB devices. For PCI(e) you'll
> use videobuf2-dma-contig if the DMA engine requires physically contiguous memory,
> or videobuf2-dma-sg if the DMA engine supports scatter-gather DMA. You can
> start with dma-contig since the DMA code tends to be simpler, but it is
> harder to get the required physically contiguous memory once memory
> fragmentation sets in, so you may not be able to allocate the buffers.
> dma-sg works much better with virtual memory.

1. I tried to understand the videobuf2 code with regard to read():

read() ->
    vb2_read() ->
        __vb2_perform_fileio() ->
            vb2_internal_dqbuf() & copy_to_user()

Where is the DMA-contiguous memory actually allocated? Is it done with
the calloc() call in userspace (as shown in the V4L2 API
example)? As I understand it, calloc/malloc are not guaranteed to return
contiguous memory.
How do I know whether the attempt to allocate contiguous memory has failed?

2. Does the call to copy_to_user() result in a performance degradation of
read() compared to the mmap() method?

Best Regards,
Ran
* Re: cobalt & dma
From: Hans Verkuil @ 2015-11-20 14:55 UTC
To: Ran Shalit; +Cc: linux-media

On 11/20/2015 03:49 PM, Ran Shalit wrote:
> 1. I tried to understand the videobuf2 code with regard to read():
>
> read() ->
>     vb2_read() ->
>         __vb2_perform_fileio() ->
>             vb2_internal_dqbuf() & copy_to_user()
>
> Where is the DMA-contiguous memory actually allocated? Is it done with
> the calloc() call in userspace (as shown in the V4L2 API
> example)? As I understand it, calloc/malloc are not guaranteed to return
> contiguous memory.
> How do I know whether the attempt to allocate contiguous memory has failed?

The actual allocation happens in videobuf2-vmalloc/dma-contig/dma-sg, depending
on the flavor of buffers you want (virtual memory, DMA into physically contiguous
memory, or DMA into scatter-gather memory). The alloc operation is the one that
allocates the memory.

> 2. Does the call to copy_to_user() result in a performance degradation of
> read() compared to the mmap() method?

Correct. But if you use the vb2 framework then you get stream I/O and the
read/write operations for free. vb2_read() sits on top of the stream I/O
implementation. It basically requests buffers and then loops, queuing and
dequeuing buffers and calling copy_to_user() to copy the data into the read()
buffer. This is (very) inefficient, and applications should use the V4L2
stream I/O mechanism directly.

Regards,

Hans
* Re: cobalt & dma
From: Ran Shalit @ 2015-11-20 16:14 UTC
To: Hans Verkuil; +Cc: linux-media

>> Where is the DMA-contiguous memory actually allocated? Is it done with
>> the calloc() call in userspace (as shown in the V4L2 API
>> example)? As I understand it, calloc/malloc are not guaranteed to return
>> contiguous memory.
>> How do I know whether the attempt to allocate contiguous memory has failed?
>
> The actual allocation happens in videobuf2-vmalloc/dma-contig/dma-sg, depending
> on the flavor of buffers you want (virtual memory, DMA into physically contiguous
> memory, or DMA into scatter-gather memory). The alloc operation is the one that
> allocates the memory.

Thank you very much for your time.

Just to be sure I understand the general mechanism of DMA with regard
to the read() operation in the case of contiguous memory,
I have tried to draw the general sequence as I understand it from the code
and from reading about this issue:

read() into user memory buffer ->
    vb2_read() ->
        __vb2_perform_fileio() ->
            dequeue a buffer with vb2_internal_dqbuf() from
            contiguous DMA memory (kernel) ->
                copy_to_user() then copies from the contiguous
                DMA memory (kernel) into the user buffer (userspace)

1. Is the above sequence correct?
2. When talking about contiguous DMA memory (or scatter-gather), we
are actually always referring to memory allocated in the kernel, right?

Best Regards,
Ran
* Re: cobalt & dma
From: Hans Verkuil @ 2015-11-20 16:25 UTC
To: Ran Shalit; +Cc: linux-media

On 11/20/2015 05:14 PM, Ran Shalit wrote:
> Just to be sure I understand the general mechanism of DMA with regard
> to the read() operation in the case of contiguous memory,
> I have tried to draw the general sequence as I understand it from the code
> and from reading about this issue:
>
> read() into user memory buffer ->
>     vb2_read() ->
>         __vb2_perform_fileio() ->
>             dequeue a buffer with vb2_internal_dqbuf() from
>             contiguous DMA memory (kernel) ->
>                 copy_to_user() then copies from the contiguous
>                 DMA memory (kernel) into the user buffer (userspace)
>
> 1. Is the above sequence correct?

Yes.

> 2. When talking about contiguous DMA memory (or scatter-gather), we
> are actually always referring to memory allocated in the kernel, right?

Usually. With the V4L2_MEMORY_USERPTR stream I/O mode it is userspace that
allocates the memory, but when using physically contiguous DMA this particular
streaming mode is normally not supported.

With V4L2_MEMORY_MMAP it is always the kernel that allocates the memory.

Regards,

Hans