* [PATCH/RFC v1 0/1] Buffer sync for non cache-coherent architectures
@ 2010-01-19 15:28 Pawel Osciak
2010-01-19 15:28 ` [PATCH v1 1/1] V4L: Add sync before a hardware operation to videobuf Pawel Osciak
0 siblings, 1 reply; 3+ messages in thread
From: Pawel Osciak @ 2010-01-19 15:28 UTC (permalink / raw)
To: linux-media; +Cc: p.osciak, m.szyprowski, kyungmin.park
Hello,
this is the initial patch for buffer data synchronization for architectures
with non-coherent cache.
=====================
Rationale
=====================
Architectures with non-coherent CPU cache (e.g. ARM) require a sync both before
and after a hardware operation. Until now, videobuf could work properly for
cache-coherent memory areas only, which for ARM actually means no cache at all.
This is not only reduces functionality, but hinders performance.
We would like to add support for video buffers present in CPU-cached areas as
well, which is especially important for OUTPUT buffers, but valid for CAPTURE
buffers as well (see DMA_FROM_DEVICE below).
We have isolated 4 different types of sync operations. Three of them are the
ones defined in pci.h and dma-mapping.h (enum dma_data_direction):
- DMA_FROM_DEVICE - the buffer is to be used as a destination buffer, the
data contained within before the operation is unimportant.
Although the data in memory is not important, we have to prevent a possible
future writeback to it resulting in an accidental overwrite of contents
produced by hardware after an operation. This requires cache invalidation
before the operation.
- DMA_TO_DEVICE - the data in buffer has been touched by (prepared using)
the CPU and will be used as valid source for an operation. This data
may still be in cache but not yet in memory. This requires a writeback.
- DMA_BIDIRECTIONAL - the buffer will be used as both source and destination.
This requires both writeback and cache invalidation.
There is one more operation we are considering, although not really
cache-related:
- FINISH - the operation is finished and the data in buffer (put there by
hardware) will be used further.
Operations required here may include:
* Making the memory pages involved in the operation dirty.
* Copying data back from a bounce buffer.
They are not strictly cache-related, but valid from the point of view
of our proposed approach.
>From the point of view of the videobuf framework, the following scenarios can be
isolated, depending on when the sync has been called and current buffer type:
- before hardware operation
- OUTPUT: DMA_TO_DEVICE
- CAPTURE: DMA_FROM_DEVICE
- after hardware operation
- OUTPUT: nothing
- CAPTURE: FINISH
DMA_BIDIRECTIONAL would take place for OUTPUT+CAPTURE buffers, which are not
(yet?) used in videobuf.
=====================
sync() in videobuf
=====================
videobuf includes a sync operation, declared in struct videobuf_qtype_ops,
which can be implemented by each memory type-specific module. It is used for
different purposes than cache management though, namely for operations like
copying data back from bounce buffers after an operation.
Only videobuf-dma-sg does anything in its sync currently.
Operations required to be done before hardware operation is started are
currently performed in iolock(), but it is not usable for cache coherency
management. The reason for this is that iolock() is intended to be run once
per buffer, not once per each hardware operation. Cache coherency operations
have to be performed before every operation though, also on previously
iolock()ed buffers.
We believe that the existing videobuf sync() operation can be extended for cache
management. This requires adding sync() calls before each hardware operation
as well. It is left to the callee to determine the kind of sync requested,
taking into account the guidelines below.
No new flags/states/etc. have to be added to videobuf for it to support all
kinds of syncs mentioned above. Current sync type can be determined in
memory-specific code in two steps:
1. CAPTURE or OUTPUT - based on buffer type in struct videobuf_queue.
2. before or after operation - based on buffer state in struct videobuf_buffer:
VIDEOBUF_DONE or VIDEOBUF_ERROR -> after, otherwise -> before.
Example (pseudo)code for a sync() operation:
#define is_sync_after(vb) \
(vb->state == VIDEOBUF_DONE || vb->state == VIDEOBUF_ERROR)
int videobuf_foo_sync(struct videobuf_queue *q, struct videobuf_buffer *vb)
{
/* ... */
if (is_sync_after(vb) {
/* Sync after operation */
if (q->type == V4L2_BUF_TYPE_VIDEO_CAPTURE) {
do_sync_finish();
} else if (q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT) {
/* nothing, unless we are missing something */
}
} else {
/* Sync before operation */
if (q->type == V4L2_BUF_TYPE_VIDEO_CAPTURE) {
do_sync_from_device();
} else if (q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT) {
do_sync_to_device();
}
}
/* ... */
}
=====================
About this patch
=====================
This is a very small patch, just a starting point for future, memory-specific
development. No existing functionality is changed. These are the only changes
that - in our opinion - are required in the core framework. Each memory-specific
videobuf submodule will have to implement (or ignore) additional sync
functionality depending on its memory type.
A small patch for dma-sg, which is the only type that has its own sync is
included. The only thing it does is recognizing whether the sync is a
"post-operation" one and in that case the same code as previously is called, so
the functionality remains unchanged.
We will be posting cache coherency patches for dma-contig in the near future.
Best regards
--
Pawel Osciak
Linux Platform Group
Samsung Poland R&D Center
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v1 1/1] V4L: Add sync before a hardware operation to videobuf.
2010-01-19 15:28 [PATCH/RFC v1 0/1] Buffer sync for non cache-coherent architectures Pawel Osciak
@ 2010-01-19 15:28 ` Pawel Osciak
2010-05-05 20:57 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 3+ messages in thread
From: Pawel Osciak @ 2010-01-19 15:28 UTC (permalink / raw)
To: linux-media; +Cc: p.osciak, m.szyprowski, kyungmin.park
Architectures with non-coherent CPU cache (e.g. ARM) may require a cache
flush or invalidation before starting a hardware operation if the data in
a video buffer being queued has been touched by the CPU.
This patch adds calls to sync before a hardware operation that are expected
to be interpreted and handled by each memory type-specific module.
Whether it is a sync before or after the operation can be determined from
the current buffer state: VIDEOBUF_DONE and VIDEOBUF_ERROR indicate a sync
called after an operation.
Signed-off-by: Pawel Osciak <p.osciak@samsung.com>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
drivers/media/video/videobuf-core.c | 9 +++++++++
drivers/media/video/videobuf-dma-sg.c | 6 ++++++
2 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/drivers/media/video/videobuf-core.c b/drivers/media/video/videobuf-core.c
index bb0a1c8..e56c67a 100644
--- a/drivers/media/video/videobuf-core.c
+++ b/drivers/media/video/videobuf-core.c
@@ -561,6 +561,8 @@ int videobuf_qbuf(struct videobuf_queue *q,
goto done;
}
+ CALL(q, sync, q, buf);
+
list_add_tail(&buf->stream, &q->stream);
if (q->streaming) {
spin_lock_irqsave(q->irqlock, flags);
@@ -761,6 +763,8 @@ static ssize_t videobuf_read_zerocopy(struct videobuf_queue *q,
if (0 != retval)
goto done;
+ CALL(q, sync, q, q->read_buf);
+
/* start capture & wait */
spin_lock_irqsave(q->irqlock, flags);
q->ops->buf_queue(q, q->read_buf);
@@ -826,6 +830,8 @@ ssize_t videobuf_read_one(struct videobuf_queue *q,
goto done;
}
+ CALL(q, sync, q, q->read_buf);
+
spin_lock_irqsave(q->irqlock, flags);
q->ops->buf_queue(q, q->read_buf);
spin_unlock_irqrestore(q->irqlock, flags);
@@ -893,6 +899,9 @@ static int __videobuf_read_start(struct videobuf_queue *q)
err = q->ops->buf_prepare(q, q->bufs[i], field);
if (err)
return err;
+
+ CALL(q, sync, q, q->read_buf);
+
list_add_tail(&q->bufs[i]->stream, &q->stream);
}
spin_lock_irqsave(q->irqlock, flags);
diff --git a/drivers/media/video/videobuf-dma-sg.c b/drivers/media/video/videobuf-dma-sg.c
index fa78555..2b153f8 100644
--- a/drivers/media/video/videobuf-dma-sg.c
+++ b/drivers/media/video/videobuf-dma-sg.c
@@ -50,6 +50,9 @@ MODULE_LICENSE("GPL");
#define dprintk(level, fmt, arg...) if (debug >= level) \
printk(KERN_DEBUG "vbuf-sg: " fmt , ## arg)
+#define is_sync_after(vb) \
+ (vb->state == VIDEOBUF_DONE || vb->state == VIDEOBUF_ERROR)
+
/* --------------------------------------------------------------------- */
struct scatterlist*
@@ -516,6 +519,9 @@ static int __videobuf_sync(struct videobuf_queue *q,
BUG_ON(!mem);
MAGIC_CHECK(mem->magic,MAGIC_SG_MEM);
+ if (!is_sync_after(buf))
+ return 0;
+
return videobuf_dma_sync(q,&mem->dma);
}
--
1.6.4.2.253.g0b1fac
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v1 1/1] V4L: Add sync before a hardware operation to videobuf.
2010-01-19 15:28 ` [PATCH v1 1/1] V4L: Add sync before a hardware operation to videobuf Pawel Osciak
@ 2010-05-05 20:57 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 3+ messages in thread
From: Mauro Carvalho Chehab @ 2010-05-05 20:57 UTC (permalink / raw)
To: Pawel Osciak; +Cc: linux-media, m.szyprowski, kyungmin.park
Pawel Osciak wrote:
> Architectures with non-coherent CPU cache (e.g. ARM) may require a cache
> flush or invalidation before starting a hardware operation if the data in
> a video buffer being queued has been touched by the CPU.
>
> This patch adds calls to sync before a hardware operation that are expected
> to be interpreted and handled by each memory type-specific module.
>
> Whether it is a sync before or after the operation can be determined from
> the current buffer state: VIDEOBUF_DONE and VIDEOBUF_ERROR indicate a sync
> called after an operation.
Hi Pawel,
After analyzing this patch, maybe the better is to add a check for dma
coherency. So, instead of directly calling sync,the better is to check
if !dma_is_consistent(), to avoid adding a penalty on architectures
where the cache is coherent.
>
> diff --git a/drivers/media/video/videobuf-core.c b/drivers/media/video/videobuf-core.c
> index bb0a1c8..e56c67a 100644
> --- a/drivers/media/video/videobuf-core.c
> +++ b/drivers/media/video/videobuf-core.c
> @@ -561,6 +561,8 @@ int videobuf_qbuf(struct videobuf_queue *q,
> goto done;
> }
>
> + CALL(q, sync, q, buf);
> +
> list_add_tail(&buf->stream, &q->stream);
> if (q->streaming) {
> spin_lock_irqsave(q->irqlock, flags);
> @@ -761,6 +763,8 @@ static ssize_t videobuf_read_zerocopy(struct videobuf_queue *q,
> if (0 != retval)
> goto done;
>
> + CALL(q, sync, q, q->read_buf);
> +
> /* start capture & wait */
> spin_lock_irqsave(q->irqlock, flags);
> q->ops->buf_queue(q, q->read_buf);
> @@ -826,6 +830,8 @@ ssize_t videobuf_read_one(struct videobuf_queue *q,
> goto done;
> }
>
> + CALL(q, sync, q, q->read_buf);
> +
> spin_lock_irqsave(q->irqlock, flags);
> q->ops->buf_queue(q, q->read_buf);
> spin_unlock_irqrestore(q->irqlock, flags);
> @@ -893,6 +899,9 @@ static int __videobuf_read_start(struct videobuf_queue *q)
> err = q->ops->buf_prepare(q, q->bufs[i], field);
> if (err)
> return err;
> +
> + CALL(q, sync, q, q->read_buf);
> +
> list_add_tail(&q->bufs[i]->stream, &q->stream);
> }
> spin_lock_irqsave(q->irqlock, flags);
> diff --git a/drivers/media/video/videobuf-dma-sg.c b/drivers/media/video/videobuf-dma-sg.c
> index fa78555..2b153f8 100644
> --- a/drivers/media/video/videobuf-dma-sg.c
> +++ b/drivers/media/video/videobuf-dma-sg.c
> @@ -50,6 +50,9 @@ MODULE_LICENSE("GPL");
> #define dprintk(level, fmt, arg...) if (debug >= level) \
> printk(KERN_DEBUG "vbuf-sg: " fmt , ## arg)
>
> +#define is_sync_after(vb) \
> + (vb->state == VIDEOBUF_DONE || vb->state == VIDEOBUF_ERROR)
> +
> /* --------------------------------------------------------------------- */
>
> struct scatterlist*
> @@ -516,6 +519,9 @@ static int __videobuf_sync(struct videobuf_queue *q,
> BUG_ON(!mem);
> MAGIC_CHECK(mem->magic,MAGIC_SG_MEM);
>
> + if (!is_sync_after(buf))
> + return 0;
> +
> return videobuf_dma_sync(q,&mem->dma);
> }
>
--
Cheers,
Mauro
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2010-05-05 20:57 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-19 15:28 [PATCH/RFC v1 0/1] Buffer sync for non cache-coherent architectures Pawel Osciak
2010-01-19 15:28 ` [PATCH v1 1/1] V4L: Add sync before a hardware operation to videobuf Pawel Osciak
2010-05-05 20:57 ` Mauro Carvalho Chehab
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox