* [PATCH v2 0/3] add dma noncoherent API
@ 2025-06-27 10:19 Xu Yang
2025-06-27 10:19 ` [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API Xu Yang
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-27 10:19 UTC (permalink / raw)
To: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, xu.yang_2,
mingo, tglx, andriy.shevchenko, viro, thomas.weissschuh
Cc: linux-media, linux-kernel, linux-usb, imx, jun.li
On architectures where there is no coherent caching such as ARM it's
proved that using dma_alloc_noncontiguous API and handling manually
the cache flushing will significantly improve performance.
Refer to:
commit 20e1dbf2bbe2 ("media: uvcvideo: Use dma_alloc_noncontiguous API")
commit 68d0c3311ec1 ("media: stk1160: use dma_alloc_noncontiguous API")
However, it's obvious that there is significant code duplication between
these two commits. Beside, a potential user USB Monitor may read outdated
data before the driver do dma sync for cpu which will make the data
unreliable.
To reduce code duplication and avoid USB Monitor result unreliable, this
series will introduce dma noncoherent API to usb core. And the USB core
layer will manage synchronization itself.
Then the last 2 patches have used the API.
I have tested uvcvideo driver. But I haven't tested stk1160 driver as I
don't have such boards. @Ezequiel Garcia, @Dafna Hirschfeld do you have
time to test it? Your support on this would be greatly appreciated.
Changes in v2:
- handle it in USB core
v1:
- https://lore.kernel.org/linux-usb/20250614132446.251218-1-xu.yang_2@nxp.com/
Xu Yang (3):
usb: core: add dma-noncoherent buffer alloc and free API
media: uvcvideo: use usb_alloc_noncoherent/usb_free_noncoherent()
media: stk1160: use usb_alloc_noncoherent/usb_free_noncoherent()
drivers/media/usb/stk1160/stk1160-v4l.c | 4 --
drivers/media/usb/stk1160/stk1160-video.c | 43 ++++--------
drivers/media/usb/stk1160/stk1160.h | 7 --
drivers/media/usb/uvc/uvc_video.c | 56 ++++------------
drivers/usb/core/hcd.c | 30 +++++++++
drivers/usb/core/usb.c | 80 +++++++++++++++++++++++
include/linux/usb.h | 9 +++
7 files changed, 144 insertions(+), 85 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-27 10:19 [PATCH v2 0/3] add dma noncoherent API Xu Yang
@ 2025-06-27 10:19 ` Xu Yang
2025-06-27 11:22 ` Andy Shevchenko
2025-06-27 14:23 ` Alan Stern
2025-06-27 10:19 ` [PATCH v2 2/3] media: uvcvideo: use usb_alloc_noncoherent/usb_free_noncoherent() Xu Yang
` (2 subsequent siblings)
3 siblings, 2 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-27 10:19 UTC (permalink / raw)
To: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, xu.yang_2,
mingo, tglx, andriy.shevchenko, viro, thomas.weissschuh
Cc: linux-media, linux-kernel, linux-usb, imx, jun.li
This will add usb_alloc_noncoherent() and usb_free_noncoherent()
functions to support alloc and free buffer in a dma-noncoherent way.
To explicit manage the memory ownership for the kernel and device,
this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
and call it at proper time. The management requires the user save
sg_table returned by usb_alloc_noncoherent() to urb->sgt.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
---
drivers/usb/core/hcd.c | 30 ++++++++++++++++
drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
include/linux/usb.h | 9 +++++
3 files changed, 119 insertions(+)
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index c22de97432a0..5fa00d32afb8 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
}
EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
+static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
+ struct urb *urb)
+{
+ enum dma_data_direction dir;
+
+ if (!urb->sgt)
+ return;
+
+ dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+ invalidate_kernel_vmap_range(urb->transfer_buffer,
+ urb->transfer_buffer_length);
+ dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
+}
+
+static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
+ struct urb *urb)
+{
+ enum dma_data_direction dir;
+
+ if (!urb->sgt)
+ return;
+
+ dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+ flush_kernel_vmap_range(urb->transfer_buffer,
+ urb->transfer_buffer_length);
+ dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
+}
+
/*-------------------------------------------------------------------------*/
/* may be called in any context with a valid urb->dev usecount
@@ -1516,6 +1544,7 @@ int usb_hcd_submit_urb (struct urb *urb, gfp_t mem_flags)
atomic_inc(&urb->use_count);
atomic_inc(&urb->dev->urbnum);
usbmon_urb_submit(&hcd->self, urb);
+ usb_dma_noncoherent_sync_for_device(hcd, urb);
/* NOTE requirements on root-hub callers (usbfs and the hub
* driver, for now): URBs' urb->transfer_buffer must be
@@ -1632,6 +1661,7 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
status = -EREMOTEIO;
unmap_urb_for_dma(hcd, urb);
+ usb_dma_noncoherent_sync_for_cpu(hcd, urb);
usbmon_urb_complete(&hcd->self, urb, status);
usb_anchor_suspend_wakeups(anchor);
usb_unanchor_urb(urb);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index 118fa4c93a79..b78e61f38d0b 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -1030,6 +1030,86 @@ void usb_free_coherent(struct usb_device *dev, size_t size, void *addr,
}
EXPORT_SYMBOL_GPL(usb_free_coherent);
+/**
+ * usb_alloc_noncoherent - allocate dma-noncoherent buffer for URB_NO_xxx_DMA_MAP
+ * @dev: device the buffer will be used with
+ * @size: requested buffer size
+ * @mem_flags: affect whether allocation may block
+ * @dma: used to return DMA address of buffer
+ * @dir: dma transfer direction
+ * @table: used to return sg_table of allocated memory
+ *
+ * Return: Either null (indicating no buffer could be allocated), or the
+ * cpu-space pointer to a buffer that may be used to perform DMA to the
+ * specified device. Such cpu-space buffers are returned along with the DMA
+ * address (through the pointer provided).
+ *
+ * To explicit manage the memory ownership for the kernel vs the device by
+ * usb core, the user needs save sg_table to urb->sgt. Then usb core will
+ * do dma sync for cpu and device properly.
+ *
+ * When the buffer is no longer used, free it with usb_free_noncoherent().
+ */
+void *usb_alloc_noncoherent(struct usb_device *dev, size_t size,
+ gfp_t mem_flags, dma_addr_t *dma,
+ enum dma_data_direction dir,
+ struct sg_table **table)
+{
+ struct device *dmadev;
+ struct sg_table *sgt;
+ void *buffer;
+
+ if (!dev || !dev->bus)
+ return NULL;
+
+ dmadev = bus_to_hcd(dev->bus)->self.sysdev;
+
+ sgt = dma_alloc_noncontiguous(dmadev, size, dir, mem_flags, 0);
+ if (!sgt)
+ return NULL;
+
+ buffer = dma_vmap_noncontiguous(dmadev, size, sgt);
+ if (!buffer) {
+ dma_free_noncontiguous(dmadev, size, sgt, dir);
+ return NULL;
+ }
+
+ *table = sgt;
+ *dma = sg_dma_address(sgt->sgl);
+
+ return buffer;
+}
+EXPORT_SYMBOL_GPL(usb_alloc_noncoherent);
+
+/**
+ * usb_free_noncoherent - free memory allocated with usb_alloc_noncoherent()
+ * @dev: device the buffer was used with
+ * @size: requested buffer size
+ * @addr: CPU address of buffer
+ * @dir: dma transfer direction
+ * @table: describe the allocated and DMA mapped memory,
+ *
+ * This reclaims an I/O buffer, letting it be reused. The memory must have
+ * been allocated using usb_alloc_noncoherent(), and the parameters must match
+ * those provided in that allocation request.
+ */
+void usb_free_noncoherent(struct usb_device *dev, size_t size,
+ void *addr, enum dma_data_direction dir,
+ struct sg_table *table)
+{
+ struct device *dmadev;
+
+ if (!dev || !dev->bus)
+ return;
+ if (!addr)
+ return;
+
+ dmadev = bus_to_hcd(dev->bus)->self.sysdev;
+ dma_vunmap_noncontiguous(dmadev, addr);
+ dma_free_noncontiguous(dmadev, size, table, dir);
+}
+EXPORT_SYMBOL_GPL(usb_free_noncoherent);
+
/*
* Notifications of device and interface registration
*/
diff --git a/include/linux/usb.h b/include/linux/usb.h
index e8662843e68c..fee0e3cfad4e 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1619,6 +1619,7 @@ struct urb {
void *transfer_buffer; /* (in) associated data buffer */
dma_addr_t transfer_dma; /* (in) dma addr for transfer_buffer */
struct scatterlist *sg; /* (in) scatter gather buffer list */
+ struct sg_table *sgt; /* (in) scatter gather table for noncoherent buffer */
int num_mapped_sgs; /* (internal) mapped sg entries */
int num_sgs; /* (in) number of entries in the sg list */
u32 transfer_buffer_length; /* (in) data buffer length */
@@ -1824,6 +1825,14 @@ void *usb_alloc_coherent(struct usb_device *dev, size_t size,
void usb_free_coherent(struct usb_device *dev, size_t size,
void *addr, dma_addr_t dma);
+enum dma_data_direction;
+
+void *usb_alloc_noncoherent(struct usb_device *dev, size_t size,
+ gfp_t mem_flags, dma_addr_t *dma, enum dma_data_direction dir,
+ struct sg_table **table);
+void usb_free_noncoherent(struct usb_device *dev, size_t size, void *addr,
+ enum dma_data_direction dir, struct sg_table *table);
+
/*-------------------------------------------------------------------*
* SYNCHRONOUS CALL SUPPORT *
*-------------------------------------------------------------------*/
--
2.34.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 2/3] media: uvcvideo: use usb_alloc_noncoherent/usb_free_noncoherent()
2025-06-27 10:19 [PATCH v2 0/3] add dma noncoherent API Xu Yang
2025-06-27 10:19 ` [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API Xu Yang
@ 2025-06-27 10:19 ` Xu Yang
2025-06-27 10:19 ` [PATCH v2 3/3] media: stk1160: " Xu Yang
2025-06-27 11:25 ` [PATCH v2 0/3] add dma noncoherent API Andy Shevchenko
3 siblings, 0 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-27 10:19 UTC (permalink / raw)
To: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, xu.yang_2,
mingo, tglx, andriy.shevchenko, viro, thomas.weissschuh
Cc: linux-media, linux-kernel, linux-usb, imx, jun.li
This will use USB noncoherent API to alloc/free urb buffers, then
uvc driver needn't to do dma sync operations by itself.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
---
drivers/media/usb/uvc/uvc_video.c | 56 ++++++++-----------------------
1 file changed, 14 insertions(+), 42 deletions(-)
diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index e3567aeb0007..614cf4781221 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -1280,15 +1280,6 @@ static inline struct device *uvc_stream_to_dmadev(struct uvc_streaming *stream)
return bus_to_hcd(stream->dev->udev->bus)->self.sysdev;
}
-static int uvc_submit_urb(struct uvc_urb *uvc_urb, gfp_t mem_flags)
-{
- /* Sync DMA. */
- dma_sync_sgtable_for_device(uvc_stream_to_dmadev(uvc_urb->stream),
- uvc_urb->sgt,
- uvc_stream_dir(uvc_urb->stream));
- return usb_submit_urb(uvc_urb->urb, mem_flags);
-}
-
/*
* uvc_video_decode_data_work: Asynchronous memcpy processing
*
@@ -1310,7 +1301,7 @@ static void uvc_video_copy_data_work(struct work_struct *work)
uvc_queue_buffer_release(op->buf);
}
- ret = uvc_submit_urb(uvc_urb, GFP_KERNEL);
+ ret = usb_submit_urb(uvc_urb->urb, GFP_KERNEL);
if (ret < 0)
dev_err(&uvc_urb->stream->intf->dev,
"Failed to resubmit video URB (%d).\n", ret);
@@ -1736,12 +1727,6 @@ static void uvc_video_complete(struct urb *urb)
/* Re-initialise the URB async work. */
uvc_urb->async_operations = 0;
- /* Sync DMA and invalidate vmap range. */
- dma_sync_sgtable_for_cpu(uvc_stream_to_dmadev(uvc_urb->stream),
- uvc_urb->sgt, uvc_stream_dir(stream));
- invalidate_kernel_vmap_range(uvc_urb->buffer,
- uvc_urb->stream->urb_size);
-
/*
* Process the URB headers, and optionally queue expensive memcpy tasks
* to be deferred to a work queue.
@@ -1750,7 +1735,7 @@ static void uvc_video_complete(struct urb *urb)
/* If no async work is needed, resubmit the URB immediately. */
if (!uvc_urb->async_operations) {
- ret = uvc_submit_urb(uvc_urb, GFP_ATOMIC);
+ ret = usb_submit_urb(uvc_urb->urb, GFP_ATOMIC);
if (ret < 0)
dev_err(&stream->intf->dev,
"Failed to resubmit video URB (%d).\n", ret);
@@ -1765,17 +1750,15 @@ static void uvc_video_complete(struct urb *urb)
*/
static void uvc_free_urb_buffers(struct uvc_streaming *stream)
{
- struct device *dma_dev = uvc_stream_to_dmadev(stream);
+ struct usb_device *udev = stream->dev->udev;
struct uvc_urb *uvc_urb;
for_each_uvc_urb(uvc_urb, stream) {
if (!uvc_urb->buffer)
continue;
- dma_vunmap_noncontiguous(dma_dev, uvc_urb->buffer);
- dma_free_noncontiguous(dma_dev, stream->urb_size, uvc_urb->sgt,
- uvc_stream_dir(stream));
-
+ usb_free_noncoherent(udev, stream->urb_size, uvc_urb->buffer,
+ uvc_stream_dir(stream), uvc_urb->sgt);
uvc_urb->buffer = NULL;
uvc_urb->sgt = NULL;
}
@@ -1786,26 +1769,13 @@ static void uvc_free_urb_buffers(struct uvc_streaming *stream)
static bool uvc_alloc_urb_buffer(struct uvc_streaming *stream,
struct uvc_urb *uvc_urb, gfp_t gfp_flags)
{
- struct device *dma_dev = uvc_stream_to_dmadev(stream);
-
- uvc_urb->sgt = dma_alloc_noncontiguous(dma_dev, stream->urb_size,
- uvc_stream_dir(stream),
- gfp_flags, 0);
- if (!uvc_urb->sgt)
- return false;
- uvc_urb->dma = uvc_urb->sgt->sgl->dma_address;
-
- uvc_urb->buffer = dma_vmap_noncontiguous(dma_dev, stream->urb_size,
- uvc_urb->sgt);
- if (!uvc_urb->buffer) {
- dma_free_noncontiguous(dma_dev, stream->urb_size,
- uvc_urb->sgt,
- uvc_stream_dir(stream));
- uvc_urb->sgt = NULL;
- return false;
- }
+ struct usb_device *udev = stream->dev->udev;
- return true;
+ uvc_urb->buffer = usb_alloc_noncoherent(udev, stream->urb_size,
+ gfp_flags, &uvc_urb->dma,
+ uvc_stream_dir(stream),
+ &uvc_urb->sgt);
+ return !!uvc_urb->buffer;
}
/*
@@ -1953,6 +1923,7 @@ static int uvc_init_video_isoc(struct uvc_streaming *stream,
urb->complete = uvc_video_complete;
urb->number_of_packets = npackets;
urb->transfer_buffer_length = size;
+ urb->sgt = uvc_urb->sgt;
for (i = 0; i < npackets; ++i) {
urb->iso_frame_desc[i].offset = i * psize;
@@ -2009,6 +1980,7 @@ static int uvc_init_video_bulk(struct uvc_streaming *stream,
size, uvc_video_complete, uvc_urb);
urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP;
urb->transfer_dma = uvc_urb->dma;
+ urb->sgt = uvc_urb->sgt;
uvc_urb->urb = urb;
}
@@ -2120,7 +2092,7 @@ static int uvc_video_start_transfer(struct uvc_streaming *stream,
/* Submit the URBs. */
for_each_uvc_urb(uvc_urb, stream) {
- ret = uvc_submit_urb(uvc_urb, gfp_flags);
+ ret = usb_submit_urb(uvc_urb->urb, gfp_flags);
if (ret < 0) {
dev_err(&stream->intf->dev,
"Failed to submit URB %u (%d).\n",
--
2.34.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 3/3] media: stk1160: use usb_alloc_noncoherent/usb_free_noncoherent()
2025-06-27 10:19 [PATCH v2 0/3] add dma noncoherent API Xu Yang
2025-06-27 10:19 ` [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API Xu Yang
2025-06-27 10:19 ` [PATCH v2 2/3] media: uvcvideo: use usb_alloc_noncoherent/usb_free_noncoherent() Xu Yang
@ 2025-06-27 10:19 ` Xu Yang
2025-06-27 11:25 ` [PATCH v2 0/3] add dma noncoherent API Andy Shevchenko
3 siblings, 0 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-27 10:19 UTC (permalink / raw)
To: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, xu.yang_2,
mingo, tglx, andriy.shevchenko, viro, thomas.weissschuh
Cc: linux-media, linux-kernel, linux-usb, imx, jun.li
This will use USB noncoherent API to alloc/free urb buffers, then
stk1160 driver needn't to do dma sync operations by itself.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
---
drivers/media/usb/stk1160/stk1160-v4l.c | 4 ---
drivers/media/usb/stk1160/stk1160-video.c | 43 ++++++-----------------
drivers/media/usb/stk1160/stk1160.h | 7 ----
3 files changed, 11 insertions(+), 43 deletions(-)
diff --git a/drivers/media/usb/stk1160/stk1160-v4l.c b/drivers/media/usb/stk1160/stk1160-v4l.c
index 5ba3d9c4b3fb..715ce1dcb304 100644
--- a/drivers/media/usb/stk1160/stk1160-v4l.c
+++ b/drivers/media/usb/stk1160/stk1160-v4l.c
@@ -232,10 +232,6 @@ static int stk1160_start_streaming(struct stk1160 *dev)
/* submit urbs and enables IRQ */
for (i = 0; i < dev->isoc_ctl.num_bufs; i++) {
- struct stk1160_urb *stk_urb = &dev->isoc_ctl.urb_ctl[i];
-
- dma_sync_sgtable_for_device(stk1160_get_dmadev(dev), stk_urb->sgt,
- DMA_FROM_DEVICE);
rc = usb_submit_urb(dev->isoc_ctl.urb_ctl[i].urb, GFP_KERNEL);
if (rc) {
stk1160_err("cannot submit urb[%d] (%d)\n", i, rc);
diff --git a/drivers/media/usb/stk1160/stk1160-video.c b/drivers/media/usb/stk1160/stk1160-video.c
index 9cbd957ecc90..416cb74377eb 100644
--- a/drivers/media/usb/stk1160/stk1160-video.c
+++ b/drivers/media/usb/stk1160/stk1160-video.c
@@ -298,9 +298,7 @@ static void stk1160_process_isoc(struct stk1160 *dev, struct urb *urb)
static void stk1160_isoc_irq(struct urb *urb)
{
int i, rc;
- struct stk1160_urb *stk_urb = urb->context;
- struct stk1160 *dev = stk_urb->dev;
- struct device *dma_dev = stk1160_get_dmadev(dev);
+ struct stk1160 *dev = urb->context;
switch (urb->status) {
case 0:
@@ -315,10 +313,6 @@ static void stk1160_isoc_irq(struct urb *urb)
return;
}
- invalidate_kernel_vmap_range(stk_urb->transfer_buffer,
- urb->transfer_buffer_length);
- dma_sync_sgtable_for_cpu(dma_dev, stk_urb->sgt, DMA_FROM_DEVICE);
-
stk1160_process_isoc(dev, urb);
/* Reset urb buffers */
@@ -327,7 +321,6 @@ static void stk1160_isoc_irq(struct urb *urb)
urb->iso_frame_desc[i].actual_length = 0;
}
- dma_sync_sgtable_for_device(dma_dev, stk_urb->sgt, DMA_FROM_DEVICE);
rc = usb_submit_urb(urb, GFP_ATOMIC);
if (rc)
stk1160_err("urb re-submit failed (%d)\n", rc);
@@ -365,11 +358,9 @@ void stk1160_cancel_isoc(struct stk1160 *dev)
static void stk_free_urb(struct stk1160 *dev, struct stk1160_urb *stk_urb)
{
- struct device *dma_dev = stk1160_get_dmadev(dev);
-
- dma_vunmap_noncontiguous(dma_dev, stk_urb->transfer_buffer);
- dma_free_noncontiguous(dma_dev, stk_urb->urb->transfer_buffer_length,
- stk_urb->sgt, DMA_FROM_DEVICE);
+ usb_free_noncoherent(dev->udev, stk_urb->urb->transfer_buffer_length,
+ stk_urb->transfer_buffer, DMA_FROM_DEVICE,
+ stk_urb->sgt);
usb_free_urb(stk_urb->urb);
stk_urb->transfer_buffer = NULL;
@@ -410,32 +401,19 @@ void stk1160_uninit_isoc(struct stk1160 *dev)
static int stk1160_fill_urb(struct stk1160 *dev, struct stk1160_urb *stk_urb,
int sb_size, int max_packets)
{
- struct device *dma_dev = stk1160_get_dmadev(dev);
-
stk_urb->urb = usb_alloc_urb(max_packets, GFP_KERNEL);
if (!stk_urb->urb)
return -ENOMEM;
- stk_urb->sgt = dma_alloc_noncontiguous(dma_dev, sb_size,
- DMA_FROM_DEVICE, GFP_KERNEL, 0);
-
- /*
- * If the buffer allocation failed, we exit but return 0 since
- * we allow the driver working with less buffers
- */
- if (!stk_urb->sgt)
- goto free_urb;
- stk_urb->transfer_buffer = dma_vmap_noncontiguous(dma_dev, sb_size,
- stk_urb->sgt);
+ stk_urb->transfer_buffer = usb_alloc_noncoherent(dev->udev, sb_size,
+ GFP_KERNEL, &stk_urb->dma,
+ DMA_FROM_DEVICE, &stk_urb->sgt);
if (!stk_urb->transfer_buffer)
- goto free_sgt;
+ goto free_urb;
- stk_urb->dma = stk_urb->sgt->sgl->dma_address;
stk_urb->dev = dev;
return 0;
-free_sgt:
- dma_free_noncontiguous(dma_dev, sb_size, stk_urb->sgt, DMA_FROM_DEVICE);
- stk_urb->sgt = NULL;
+
free_urb:
usb_free_urb(stk_urb->urb);
stk_urb->urb = NULL;
@@ -494,12 +472,13 @@ int stk1160_alloc_isoc(struct stk1160 *dev)
urb->transfer_buffer = dev->isoc_ctl.urb_ctl[i].transfer_buffer;
urb->transfer_buffer_length = sb_size;
urb->complete = stk1160_isoc_irq;
- urb->context = &dev->isoc_ctl.urb_ctl[i];
+ urb->context = dev;
urb->interval = 1;
urb->start_frame = 0;
urb->number_of_packets = max_packets;
urb->transfer_flags = URB_ISO_ASAP | URB_NO_TRANSFER_DMA_MAP;
urb->transfer_dma = dev->isoc_ctl.urb_ctl[i].dma;
+ urb->sgt = dev->isoc_ctl.urb_ctl[i].sgt;
k = 0;
for (j = 0; j < max_packets; j++) {
diff --git a/drivers/media/usb/stk1160/stk1160.h b/drivers/media/usb/stk1160/stk1160.h
index 7b498d14ed7a..4cbcb0a03bab 100644
--- a/drivers/media/usb/stk1160/stk1160.h
+++ b/drivers/media/usb/stk1160/stk1160.h
@@ -16,8 +16,6 @@
#include <media/videobuf2-v4l2.h>
#include <media/v4l2-device.h>
#include <media/v4l2-ctrls.h>
-#include <linux/usb.h>
-#include <linux/usb/hcd.h>
#define STK1160_VERSION "0.9.5"
#define STK1160_VERSION_NUM 0x000905
@@ -195,8 +193,3 @@ void stk1160_select_input(struct stk1160 *dev);
/* Provided by stk1160-ac97.c */
void stk1160_ac97_setup(struct stk1160 *dev);
-
-static inline struct device *stk1160_get_dmadev(struct stk1160 *dev)
-{
- return bus_to_hcd(dev->udev->bus)->self.sysdev;
-}
--
2.34.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-27 10:19 ` [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API Xu Yang
@ 2025-06-27 11:22 ` Andy Shevchenko
2025-06-30 7:26 ` Xu Yang
2025-06-27 14:23 ` Alan Stern
1 sibling, 1 reply; 18+ messages in thread
From: Andy Shevchenko @ 2025-06-27 11:22 UTC (permalink / raw)
To: Xu Yang
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, viro, thomas.weissschuh, linux-media, linux-kernel,
linux-usb, imx, jun.li
On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> functions to support alloc and free buffer in a dma-noncoherent way.
>
> To explicit manage the memory ownership for the kernel and device,
> this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> and call it at proper time. The management requires the user save
> sg_table returned by usb_alloc_noncoherent() to urb->sgt.
...
> +/**
> + * usb_alloc_noncoherent - allocate dma-noncoherent buffer for URB_NO_xxx_DMA_MAP
> + * @dev: device the buffer will be used with
> + * @size: requested buffer size
> + * @mem_flags: affect whether allocation may block
> + * @dma: used to return DMA address of buffer
> + * @dir: dma transfer direction
> + * @table: used to return sg_table of allocated memory
> + *
> + * Return: Either null (indicating no buffer could be allocated), or the
> + * cpu-space pointer to a buffer that may be used to perform DMA to the
> + * specified device. Such cpu-space buffers are returned along with the DMA
> + * address (through the pointer provided).
Return section should be last in the kernel-doc (this requirement is
documented).
> + * To explicit manage the memory ownership for the kernel vs the device by
> + * usb core, the user needs save sg_table to urb->sgt. Then usb core will
> + * do dma sync for cpu and device properly.
> + *
> + * When the buffer is no longer used, free it with usb_free_noncoherent().
Here and everywhere else in your series, pay respect to acronyms by using them
as acronyms:
dma --> DMA
cpu --> CPU
usb --> USB
and so on...
> + */
> +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size,
> + gfp_t mem_flags, dma_addr_t *dma,
> + enum dma_data_direction dir,
> + struct sg_table **table)
> +{
> + struct device *dmadev;
> + struct sg_table *sgt;
> + void *buffer;
> +
> + if (!dev || !dev->bus)
When !dev case is possible?
> + return NULL;
> +
> + dmadev = bus_to_hcd(dev->bus)->self.sysdev;
> +
> + sgt = dma_alloc_noncontiguous(dmadev, size, dir, mem_flags, 0);
> + if (!sgt)
> + return NULL;
> +
> + buffer = dma_vmap_noncontiguous(dmadev, size, sgt);
> + if (!buffer) {
> + dma_free_noncontiguous(dmadev, size, sgt, dir);
> + return NULL;
> + }
> +
> + *table = sgt;
> + *dma = sg_dma_address(sgt->sgl);
> +
> + return buffer;
> +}
...
> +void usb_free_noncoherent(struct usb_device *dev, size_t size,
> + void *addr, enum dma_data_direction dir,
> + struct sg_table *table)
> +{
> + struct device *dmadev;
> +
> + if (!dev || !dev->bus)
Ditto.
> + return;
> + if (!addr)
> + return;
> +
> + dmadev = bus_to_hcd(dev->bus)->self.sysdev;
> + dma_vunmap_noncontiguous(dmadev, addr);
> + dma_free_noncontiguous(dmadev, size, table, dir);
> +}
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 0/3] add dma noncoherent API
2025-06-27 10:19 [PATCH v2 0/3] add dma noncoherent API Xu Yang
` (2 preceding siblings ...)
2025-06-27 10:19 ` [PATCH v2 3/3] media: stk1160: " Xu Yang
@ 2025-06-27 11:25 ` Andy Shevchenko
2025-06-30 7:28 ` Xu Yang
3 siblings, 1 reply; 18+ messages in thread
From: Andy Shevchenko @ 2025-06-27 11:25 UTC (permalink / raw)
To: Xu Yang
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, viro, thomas.weissschuh, linux-media, linux-kernel,
linux-usb, imx, jun.li
On Fri, Jun 27, 2025 at 06:19:36PM +0800, Xu Yang wrote:
> On architectures where there is no coherent caching such as ARM it's
> proved that using dma_alloc_noncontiguous API and handling manually
dma_alloc_noncontiguous()
> the cache flushing will significantly improve performance.
> Refer to:
> commit 20e1dbf2bbe2 ("media: uvcvideo: Use dma_alloc_noncontiguous API")
> commit 68d0c3311ec1 ("media: stk1160: use dma_alloc_noncontiguous API")
>
> However, it's obvious that there is significant code duplication between
> these two commits. Beside, a potential user USB Monitor may read outdated
> data before the driver do dma sync for cpu which will make the data
DMA
CPU
> unreliable.
>
> To reduce code duplication and avoid USB Monitor result unreliable, this
> series will introduce dma noncoherent API to usb core. And the USB core
DMA
USB
> layer will manage synchronization itself.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-27 10:19 ` [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API Xu Yang
2025-06-27 11:22 ` Andy Shevchenko
@ 2025-06-27 14:23 ` Alan Stern
2025-06-29 23:39 ` Laurent Pinchart
2025-06-30 8:18 ` Xu Yang
1 sibling, 2 replies; 18+ messages in thread
From: Alan Stern @ 2025-06-27 14:23 UTC (permalink / raw)
To: Xu Yang
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, andriy.shevchenko, viro, thomas.weissschuh, linux-media,
linux-kernel, linux-usb, imx, jun.li
On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> functions to support alloc and free buffer in a dma-noncoherent way.
>
> To explicit manage the memory ownership for the kernel and device,
> this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> and call it at proper time. The management requires the user save
> sg_table returned by usb_alloc_noncoherent() to urb->sgt.
>
> Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
> ---
> drivers/usb/core/hcd.c | 30 ++++++++++++++++
> drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
> include/linux/usb.h | 9 +++++
> 3 files changed, 119 insertions(+)
>
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index c22de97432a0..5fa00d32afb8 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> }
> EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
>
> +static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
> + struct urb *urb)
> +{
> + enum dma_data_direction dir;
> +
> + if (!urb->sgt)
> + return;
> +
> + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
Are the following operations really necessary if the direction is OUT?
There are no bidirectional URBs, and an OUT transfer never modifies the
contents of the transfer buffer so the buffer contents will be the same
after the URB completes as they were when the URB was submitted.
> + invalidate_kernel_vmap_range(urb->transfer_buffer,
> + urb->transfer_buffer_length);
> + dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
> +}
This entire routine should be inserted at the appropriate place in
usb_hcd_unmap_urb_for_dma() instead of being standalone.
> +static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
> + struct urb *urb)
> +{
> + enum dma_data_direction dir;
> +
> + if (!urb->sgt)
> + return;
> +
> + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> + flush_kernel_vmap_range(urb->transfer_buffer,
> + urb->transfer_buffer_length);
> + dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
> +}
Likewise, this code belongs inside usb_hcd_map_urb_for_dma().
Also, the material that this routine replaces in the uvc and stk1160
drivers do not call flush_kernel_vmap_range(). Why did you add that
here? Was this omission a bug in those drivers?
Alan Stern
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-27 14:23 ` Alan Stern
@ 2025-06-29 23:39 ` Laurent Pinchart
2025-06-30 6:48 ` Ricardo Ribalda
2025-06-30 8:45 ` Xu Yang
2025-06-30 8:18 ` Xu Yang
1 sibling, 2 replies; 18+ messages in thread
From: Laurent Pinchart @ 2025-06-29 23:39 UTC (permalink / raw)
To: Alan Stern
Cc: Xu Yang, ezequiel, mchehab, hdegoede, gregkh, mingo, tglx,
andriy.shevchenko, viro, thomas.weissschuh, linux-media,
linux-kernel, linux-usb, imx, jun.li, Ricardo Ribalda
On Fri, Jun 27, 2025 at 10:23:36AM -0400, Alan Stern wrote:
> On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> > This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> > functions to support alloc and free buffer in a dma-noncoherent way.
> >
> > To explicit manage the memory ownership for the kernel and device,
> > this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> > and call it at proper time. The management requires the user save
> > sg_table returned by usb_alloc_noncoherent() to urb->sgt.
> >
> > Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
> > ---
> > drivers/usb/core/hcd.c | 30 ++++++++++++++++
> > drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
> > include/linux/usb.h | 9 +++++
> > 3 files changed, 119 insertions(+)
> >
> > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > index c22de97432a0..5fa00d32afb8 100644
> > --- a/drivers/usb/core/hcd.c
> > +++ b/drivers/usb/core/hcd.c
> > @@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> > }
> > EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
> >
> > +static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
> > + struct urb *urb)
> > +{
> > + enum dma_data_direction dir;
> > +
> > + if (!urb->sgt)
> > + return;
> > +
> > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
>
> Are the following operations really necessary if the direction is OUT?
> There are no bidirectional URBs, and an OUT transfer never modifies the
> contents of the transfer buffer so the buffer contents will be the same
> after the URB completes as they were when the URB was submitted.
The arch part of dma_sync_sgtable_for_cpu(DMA_TO_DEVICE) is a no-op on
all architectures but microblaze, mips, parisc and powerpc (at least in
some configurations of those architectures).
The IOMMU DMA mapping backend calls into the arch-specific code, and
also handles swiotlb, which is a no-op for DMA_TO_DEVICE. There's also
some IOMMU-related arch-specific handling for sparc.
I think dma_sync_sgtable_for_cpu() should be called for the
DMA_TO_DEVICE direction, to ensure proper operation in those uncommon
but real cases where platforms need to perform some operation. It has a
non-zero cost on other platforms, as the CPU will need to go through a
few function calls to end up in no-ops and then go back up the call
stack.
invalidate_kernel_vmap_range() may not be needed. I don't recall why it
was added. The call was introduced in
commit 20e1dbf2bbe2431072571000ed31dfef09359c08
Author: Ricardo Ribalda <ribalda@chromium.org>
Date: Sat Mar 13 00:55:20 2021 +0100
media: uvcvideo: Use dma_alloc_noncontiguous API
Ricardo, do we need to invalidate the vmap range in the DMA_TO_DEVICE
case ?
> > + invalidate_kernel_vmap_range(urb->transfer_buffer,
> > + urb->transfer_buffer_length);
> > + dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
In the DMA_FROM_DEVICE case, shouldn't the vmap range should be
invalidated after calling dma_sync_sgtable_for_cpu() ? Otherwise I think
speculative reads coming between invalidation and dma sync could result
in data corruption.
> > +}
>
> This entire routine should be inserted at the appropriate place in
> usb_hcd_unmap_urb_for_dma() instead of being standalone.
>
> > +static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
> > + struct urb *urb)
> > +{
> > + enum dma_data_direction dir;
> > +
> > + if (!urb->sgt)
> > + return;
> > +
> > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > + flush_kernel_vmap_range(urb->transfer_buffer,
> > + urb->transfer_buffer_length);
> > + dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
> > +}
>
> Likewise, this code belongs inside usb_hcd_map_urb_for_dma().
>
> Also, the material that this routine replaces in the uvc and stk1160
> drivers do not call flush_kernel_vmap_range(). Why did you add that
> here? Was this omission a bug in those drivers?
>
> Alan Stern
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-29 23:39 ` Laurent Pinchart
@ 2025-06-30 6:48 ` Ricardo Ribalda
2025-06-30 8:23 ` Laurent Pinchart
2025-06-30 8:45 ` Xu Yang
1 sibling, 1 reply; 18+ messages in thread
From: Ricardo Ribalda @ 2025-06-30 6:48 UTC (permalink / raw)
To: Laurent Pinchart, Christoph Hellwig
Cc: Alan Stern, Xu Yang, ezequiel, mchehab, hdegoede, gregkh, mingo,
tglx, andriy.shevchenko, viro, thomas.weissschuh, linux-media,
linux-kernel, linux-usb, imx, jun.li
On Mon, 30 Jun 2025 at 01:39, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> On Fri, Jun 27, 2025 at 10:23:36AM -0400, Alan Stern wrote:
> > On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> > > This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> > > functions to support alloc and free buffer in a dma-noncoherent way.
> > >
> > > To explicit manage the memory ownership for the kernel and device,
> > > this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> > > and call it at proper time. The management requires the user save
> > > sg_table returned by usb_alloc_noncoherent() to urb->sgt.
> > >
> > > Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
> > > ---
> > > drivers/usb/core/hcd.c | 30 ++++++++++++++++
> > > drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
> > > include/linux/usb.h | 9 +++++
> > > 3 files changed, 119 insertions(+)
> > >
> > > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > > index c22de97432a0..5fa00d32afb8 100644
> > > --- a/drivers/usb/core/hcd.c
> > > +++ b/drivers/usb/core/hcd.c
> > > @@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> > > }
> > > EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
> > >
> > > +static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
> > > + struct urb *urb)
> > > +{
> > > + enum dma_data_direction dir;
> > > +
> > > + if (!urb->sgt)
> > > + return;
> > > +
> > > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> >
> > Are the following operations really necessary if the direction is OUT?
> > There are no bidirectional URBs, and an OUT transfer never modifies the
> > contents of the transfer buffer so the buffer contents will be the same
> > after the URB completes as they were when the URB was submitted.
>
> The arch part of dma_sync_sgtable_for_cpu(DMA_TO_DEVICE) is a no-op on
> all architectures but microblaze, mips, parisc and powerpc (at least in
> some configurations of those architectures).
>
> The IOMMU DMA mapping backend calls into the arch-specific code, and
> also handles swiotlb, which is a no-op for DMA_TO_DEVICE. There's also
> some IOMMU-related arch-specific handling for sparc.
>
> I think dma_sync_sgtable_for_cpu() should be called for the
> DMA_TO_DEVICE direction, to ensure proper operation in those uncommon
> but real cases where platforms need to perform some operation. It has a
> non-zero cost on other platforms, as the CPU will need to go through a
> few function calls to end up in no-ops and then go back up the call
> stack.
>
> invalidate_kernel_vmap_range() may not be needed. I don't recall why it
> was added. The call was introduced in
>
> commit 20e1dbf2bbe2431072571000ed31dfef09359c08
> Author: Ricardo Ribalda <ribalda@chromium.org>
> Date: Sat Mar 13 00:55:20 2021 +0100
>
> media: uvcvideo: Use dma_alloc_noncontiguous API
>
> Ricardo, do we need to invalidate the vmap range in the DMA_TO_DEVICE
> case ?
That change came from Christoph
https://lore.kernel.org/linux-media/20210128150955.GA30563@lst.de/
"""
Given that we vmap the addresses this also needs
flush_kernel_vmap_range / invalidate_kernel_vmap_range calls for
VIVT architectures.
"""
>
> > > + invalidate_kernel_vmap_range(urb->transfer_buffer,
> > > + urb->transfer_buffer_length);
> > > + dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
>
> In the DMA_FROM_DEVICE case, shouldn't the vmap range should be
> invalidated after calling dma_sync_sgtable_for_cpu() ? Otherwise I think
> speculative reads coming between invalidation and dma sync could result
> in data corruption.
>
> > > +}
> >
> > This entire routine should be inserted at the appropriate place in
> > usb_hcd_unmap_urb_for_dma() instead of being standalone.
> >
> > > +static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
> > > + struct urb *urb)
> > > +{
> > > + enum dma_data_direction dir;
> > > +
> > > + if (!urb->sgt)
> > > + return;
> > > +
> > > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > > + flush_kernel_vmap_range(urb->transfer_buffer,
> > > + urb->transfer_buffer_length);
> > > + dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
> > > +}
> >
> > Likewise, this code belongs inside usb_hcd_map_urb_for_dma().
> >
> > Also, the material that this routine replaces in the uvc and stk1160
> > drivers do not call flush_kernel_vmap_range(). Why did you add that
> > here? Was this omission a bug in those drivers?
> >
> > Alan Stern
>
> --
> Regards,
>
> Laurent Pinchart
--
Ricardo Ribalda
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-27 11:22 ` Andy Shevchenko
@ 2025-06-30 7:26 ` Xu Yang
0 siblings, 0 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-30 7:26 UTC (permalink / raw)
To: Andy Shevchenko
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, viro, thomas.weissschuh, linux-media, linux-kernel,
linux-usb, imx, jun.li
Hi Andy,
Thanks for your comments!
On Fri, Jun 27, 2025 at 02:22:59PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> > This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> > functions to support alloc and free buffer in a dma-noncoherent way.
> >
> > To explicit manage the memory ownership for the kernel and device,
> > this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> > and call it at proper time. The management requires the user save
> > sg_table returned by usb_alloc_noncoherent() to urb->sgt.
>
> ...
>
> > +/**
> > + * usb_alloc_noncoherent - allocate dma-noncoherent buffer for URB_NO_xxx_DMA_MAP
> > + * @dev: device the buffer will be used with
> > + * @size: requested buffer size
> > + * @mem_flags: affect whether allocation may block
> > + * @dma: used to return DMA address of buffer
> > + * @dir: dma transfer direction
> > + * @table: used to return sg_table of allocated memory
> > + *
> > + * Return: Either null (indicating no buffer could be allocated), or the
> > + * cpu-space pointer to a buffer that may be used to perform DMA to the
> > + * specified device. Such cpu-space buffers are returned along with the DMA
> > + * address (through the pointer provided).
>
> Return section should be last in the kernel-doc (this requirement is
> documented).
Okay. I'll improve it.
>
> > + * To explicit manage the memory ownership for the kernel vs the device by
> > + * usb core, the user needs save sg_table to urb->sgt. Then usb core will
> > + * do dma sync for cpu and device properly.
> > + *
> > + * When the buffer is no longer used, free it with usb_free_noncoherent().
>
> Here and everywhere else in your series, pay respect to acronyms by using them
> as acronyms:
>
> dma --> DMA
> cpu --> CPU
> usb --> USB
>
> and so on...
Okay.
>
>
> > + */
> > +void *usb_alloc_noncoherent(struct usb_device *dev, size_t size,
> > + gfp_t mem_flags, dma_addr_t *dma,
> > + enum dma_data_direction dir,
> > + struct sg_table **table)
> > +{
> > + struct device *dmadev;
> > + struct sg_table *sgt;
> > + void *buffer;
> > +
> > + if (!dev || !dev->bus)
>
> When !dev case is possible?
Not exactly sure, but it depends on the user. This sanity test is also
carried over from usb_alloc_coherent() to avoid any surprise.
>
> > + return NULL;
> > +
> > + dmadev = bus_to_hcd(dev->bus)->self.sysdev;
> > +
> > + sgt = dma_alloc_noncontiguous(dmadev, size, dir, mem_flags, 0);
> > + if (!sgt)
> > + return NULL;
> > +
> > + buffer = dma_vmap_noncontiguous(dmadev, size, sgt);
> > + if (!buffer) {
> > + dma_free_noncontiguous(dmadev, size, sgt, dir);
> > + return NULL;
> > + }
> > +
> > + *table = sgt;
> > + *dma = sg_dma_address(sgt->sgl);
> > +
> > + return buffer;
> > +}
>
> ...
>
> > +void usb_free_noncoherent(struct usb_device *dev, size_t size,
> > + void *addr, enum dma_data_direction dir,
> > + struct sg_table *table)
> > +{
> > + struct device *dmadev;
> > +
> > + if (!dev || !dev->bus)
>
> Ditto.
>
> > + return;
> > + if (!addr)
> > + return;
> > +
> > + dmadev = bus_to_hcd(dev->bus)->self.sysdev;
> > + dma_vunmap_noncontiguous(dmadev, addr);
> > + dma_free_noncontiguous(dmadev, size, table, dir);
> > +}
Thanks,
Xu Yang
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 0/3] add dma noncoherent API
2025-06-27 11:25 ` [PATCH v2 0/3] add dma noncoherent API Andy Shevchenko
@ 2025-06-30 7:28 ` Xu Yang
0 siblings, 0 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-30 7:28 UTC (permalink / raw)
To: Andy Shevchenko
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, viro, thomas.weissschuh, linux-media, linux-kernel,
linux-usb, imx, jun.li
On Fri, Jun 27, 2025 at 02:25:41PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 27, 2025 at 06:19:36PM +0800, Xu Yang wrote:
> > On architectures where there is no coherent caching such as ARM it's
> > proved that using dma_alloc_noncontiguous API and handling manually
>
> dma_alloc_noncontiguous()
Okay.
>
> > the cache flushing will significantly improve performance.
>
> > Refer to:
> > commit 20e1dbf2bbe2 ("media: uvcvideo: Use dma_alloc_noncontiguous API")
> > commit 68d0c3311ec1 ("media: stk1160: use dma_alloc_noncontiguous API")
> >
> > However, it's obvious that there is significant code duplication between
> > these two commits. Beside, a potential user USB Monitor may read outdated
> > data before the driver do dma sync for cpu which will make the data
>
> DMA
> CPU
Okay.
>
> > unreliable.
> >
> > To reduce code duplication and avoid USB Monitor result unreliable, this
> > series will introduce dma noncoherent API to usb core. And the USB core
>
> DMA
> USB
Okay.
Thanks,
Xu Yang
>
> > layer will manage synchronization itself.
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-27 14:23 ` Alan Stern
2025-06-29 23:39 ` Laurent Pinchart
@ 2025-06-30 8:18 ` Xu Yang
2025-06-30 14:16 ` Alan Stern
1 sibling, 1 reply; 18+ messages in thread
From: Xu Yang @ 2025-06-30 8:18 UTC (permalink / raw)
To: Alan Stern
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, andriy.shevchenko, viro, thomas.weissschuh, linux-media,
linux-kernel, linux-usb, imx, jun.li
Hi Alan,
On Fri, Jun 27, 2025 at 10:23:36AM -0400, Alan Stern wrote:
> On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> > This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> > functions to support alloc and free buffer in a dma-noncoherent way.
> >
> > To explicit manage the memory ownership for the kernel and device,
> > this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> > and call it at proper time. The management requires the user save
> > sg_table returned by usb_alloc_noncoherent() to urb->sgt.
> >
> > Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
> > ---
> > drivers/usb/core/hcd.c | 30 ++++++++++++++++
> > drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
> > include/linux/usb.h | 9 +++++
> > 3 files changed, 119 insertions(+)
> >
> > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > index c22de97432a0..5fa00d32afb8 100644
> > --- a/drivers/usb/core/hcd.c
> > +++ b/drivers/usb/core/hcd.c
> > @@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> > }
> > EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
> >
> > +static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
> > + struct urb *urb)
> > +{
> > + enum dma_data_direction dir;
> > +
> > + if (!urb->sgt)
> > + return;
> > +
> > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
>
> Are the following operations really necessary if the direction is OUT?
> There are no bidirectional URBs, and an OUT transfer never modifies the
> contents of the transfer buffer so the buffer contents will be the same
> after the URB completes as they were when the URB was submitted.
According to Laurent's reply email, it may be needed for some archs.
>
> > + invalidate_kernel_vmap_range(urb->transfer_buffer,
> > + urb->transfer_buffer_length);
> > + dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
> > +}
>
> This entire routine should be inserted at the appropriate place in
> usb_hcd_unmap_urb_for_dma() instead of being standalone.
Okay.
>
> > +static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
> > + struct urb *urb)
> > +{
> > + enum dma_data_direction dir;
> > +
> > + if (!urb->sgt)
> > + return;
> > +
> > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > + flush_kernel_vmap_range(urb->transfer_buffer,
> > + urb->transfer_buffer_length);
> > + dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
> > +}
>
> Likewise, this code belongs inside usb_hcd_map_urb_for_dma().
Okay.
>
> Also, the material that this routine replaces in the uvc and stk1160
> drivers do not call flush_kernel_vmap_range(). Why did you add that
> here? Was this omission a bug in those drivers?
According to dma-api.rst:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/tree/Documentation/core-api/dma-api.rst?h=linux-6.15.y#n664
"Once a non-contiguous allocation is mapped using this function, the
flush_kernel_vmap_range() and invalidate_kernel_vmap_range() APIs must
be used to manage the coherency between the kernel mapping, the device
and user space mappings (if any)."
Possibly the uvc and stk1160 missed calling it, but since they won't be
the only user of the USB core, so we'd better call these APIs.
Thanks,
Xu Yang
>
> Alan Stern
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-30 6:48 ` Ricardo Ribalda
@ 2025-06-30 8:23 ` Laurent Pinchart
2025-06-30 13:37 ` Christoph Hellwig
0 siblings, 1 reply; 18+ messages in thread
From: Laurent Pinchart @ 2025-06-30 8:23 UTC (permalink / raw)
To: Ricardo Ribalda
Cc: Christoph Hellwig, Alan Stern, Xu Yang, ezequiel, mchehab,
hdegoede, gregkh, mingo, tglx, andriy.shevchenko, viro,
thomas.weissschuh, linux-media, linux-kernel, linux-usb, imx,
jun.li
On Mon, Jun 30, 2025 at 08:48:23AM +0200, Ricardo Ribalda wrote:
> On Mon, 30 Jun 2025 at 01:39, Laurent Pinchart wrote:
> > On Fri, Jun 27, 2025 at 10:23:36AM -0400, Alan Stern wrote:
> > > On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> > > > This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> > > > functions to support alloc and free buffer in a dma-noncoherent way.
> > > >
> > > > To explicit manage the memory ownership for the kernel and device,
> > > > this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> > > > and call it at proper time. The management requires the user save
> > > > sg_table returned by usb_alloc_noncoherent() to urb->sgt.
> > > >
> > > > Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
> > > > ---
> > > > drivers/usb/core/hcd.c | 30 ++++++++++++++++
> > > > drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
> > > > include/linux/usb.h | 9 +++++
> > > > 3 files changed, 119 insertions(+)
> > > >
> > > > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > > > index c22de97432a0..5fa00d32afb8 100644
> > > > --- a/drivers/usb/core/hcd.c
> > > > +++ b/drivers/usb/core/hcd.c
> > > > @@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> > > > }
> > > > EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
> > > >
> > > > +static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
> > > > + struct urb *urb)
> > > > +{
> > > > + enum dma_data_direction dir;
> > > > +
> > > > + if (!urb->sgt)
> > > > + return;
> > > > +
> > > > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > >
> > > Are the following operations really necessary if the direction is OUT?
> > > There are no bidirectional URBs, and an OUT transfer never modifies the
> > > contents of the transfer buffer so the buffer contents will be the same
> > > after the URB completes as they were when the URB was submitted.
> >
> > The arch part of dma_sync_sgtable_for_cpu(DMA_TO_DEVICE) is a no-op on
> > all architectures but microblaze, mips, parisc and powerpc (at least in
> > some configurations of those architectures).
> >
> > The IOMMU DMA mapping backend calls into the arch-specific code, and
> > also handles swiotlb, which is a no-op for DMA_TO_DEVICE. There's also
> > some IOMMU-related arch-specific handling for sparc.
> >
> > I think dma_sync_sgtable_for_cpu() should be called for the
> > DMA_TO_DEVICE direction, to ensure proper operation in those uncommon
> > but real cases where platforms need to perform some operation. It has a
> > non-zero cost on other platforms, as the CPU will need to go through a
> > few function calls to end up in no-ops and then go back up the call
> > stack.
> >
> > invalidate_kernel_vmap_range() may not be needed. I don't recall why it
> > was added. The call was introduced in
> >
> > commit 20e1dbf2bbe2431072571000ed31dfef09359c08
> > Author: Ricardo Ribalda <ribalda@chromium.org>
> > Date: Sat Mar 13 00:55:20 2021 +0100
> >
> > media: uvcvideo: Use dma_alloc_noncontiguous API
> >
> > Ricardo, do we need to invalidate the vmap range in the DMA_TO_DEVICE
> > case ?
>
> That change came from Christoph
> https://lore.kernel.org/linux-media/20210128150955.GA30563@lst.de/
>
> """
>
> Given that we vmap the addresses this also needs
> flush_kernel_vmap_range / invalidate_kernel_vmap_range calls for
> VIVT architectures.
>
> """
Thank you, I looked for such a discussion in the list archive yesterday
but somehow missed it.
Christoph, you mentioned
Right now we don't have a proper state machine for the
*_kernel_vmap_range, but we should probably add one once usage of this
grows.
Has there been any progress on that front ?
> > > > + invalidate_kernel_vmap_range(urb->transfer_buffer,
> > > > + urb->transfer_buffer_length);
> > > > + dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
> >
> > In the DMA_FROM_DEVICE case, shouldn't the vmap range should be
> > invalidated after calling dma_sync_sgtable_for_cpu() ? Otherwise I think
> > speculative reads coming between invalidation and dma sync could result
> > in data corruption.
> >
> > > > +}
> > >
> > > This entire routine should be inserted at the appropriate place in
> > > usb_hcd_unmap_urb_for_dma() instead of being standalone.
> > >
> > > > +static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
> > > > + struct urb *urb)
> > > > +{
> > > > + enum dma_data_direction dir;
> > > > +
> > > > + if (!urb->sgt)
> > > > + return;
> > > > +
> > > > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > > > + flush_kernel_vmap_range(urb->transfer_buffer,
> > > > + urb->transfer_buffer_length);
> > > > + dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
> > > > +}
> > >
> > > Likewise, this code belongs inside usb_hcd_map_urb_for_dma().
> > >
> > > Also, the material that this routine replaces in the uvc and stk1160
> > > drivers do not call flush_kernel_vmap_range(). Why did you add that
> > > here? Was this omission a bug in those drivers?
> > >
> > > Alan Stern
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-29 23:39 ` Laurent Pinchart
2025-06-30 6:48 ` Ricardo Ribalda
@ 2025-06-30 8:45 ` Xu Yang
1 sibling, 0 replies; 18+ messages in thread
From: Xu Yang @ 2025-06-30 8:45 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Alan Stern, ezequiel, mchehab, hdegoede, gregkh, mingo, tglx,
andriy.shevchenko, viro, thomas.weissschuh, linux-media,
linux-kernel, linux-usb, imx, jun.li, Ricardo Ribalda
Hi Laurent,
On Mon, Jun 30, 2025 at 02:39:24AM +0300, Laurent Pinchart wrote:
> On Fri, Jun 27, 2025 at 10:23:36AM -0400, Alan Stern wrote:
> > On Fri, Jun 27, 2025 at 06:19:37PM +0800, Xu Yang wrote:
> > > This will add usb_alloc_noncoherent() and usb_free_noncoherent()
> > > functions to support alloc and free buffer in a dma-noncoherent way.
> > >
> > > To explicit manage the memory ownership for the kernel and device,
> > > this will also add usb_dma_noncoherent_sync_for_cpu/device() functions
> > > and call it at proper time. The management requires the user save
> > > sg_table returned by usb_alloc_noncoherent() to urb->sgt.
> > >
> > > Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
> > > ---
> > > drivers/usb/core/hcd.c | 30 ++++++++++++++++
> > > drivers/usb/core/usb.c | 80 ++++++++++++++++++++++++++++++++++++++++++
> > > include/linux/usb.h | 9 +++++
> > > 3 files changed, 119 insertions(+)
> > >
> > > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > > index c22de97432a0..5fa00d32afb8 100644
> > > --- a/drivers/usb/core/hcd.c
> > > +++ b/drivers/usb/core/hcd.c
> > > @@ -1496,6 +1496,34 @@ int usb_hcd_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> > > }
> > > EXPORT_SYMBOL_GPL(usb_hcd_map_urb_for_dma);
> > >
> > > +static void usb_dma_noncoherent_sync_for_cpu(struct usb_hcd *hcd,
> > > + struct urb *urb)
> > > +{
> > > + enum dma_data_direction dir;
> > > +
> > > + if (!urb->sgt)
> > > + return;
> > > +
> > > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> >
> > Are the following operations really necessary if the direction is OUT?
> > There are no bidirectional URBs, and an OUT transfer never modifies the
> > contents of the transfer buffer so the buffer contents will be the same
> > after the URB completes as they were when the URB was submitted.
>
> The arch part of dma_sync_sgtable_for_cpu(DMA_TO_DEVICE) is a no-op on
> all architectures but microblaze, mips, parisc and powerpc (at least in
> some configurations of those architectures).
>
> The IOMMU DMA mapping backend calls into the arch-specific code, and
> also handles swiotlb, which is a no-op for DMA_TO_DEVICE. There's also
> some IOMMU-related arch-specific handling for sparc.
>
> I think dma_sync_sgtable_for_cpu() should be called for the
> DMA_TO_DEVICE direction, to ensure proper operation in those uncommon
> but real cases where platforms need to perform some operation. It has a
> non-zero cost on other platforms, as the CPU will need to go through a
> few function calls to end up in no-ops and then go back up the call
> stack.
>
> invalidate_kernel_vmap_range() may not be needed. I don't recall why it
> was added. The call was introduced in
>
> commit 20e1dbf2bbe2431072571000ed31dfef09359c08
> Author: Ricardo Ribalda <ribalda@chromium.org>
> Date: Sat Mar 13 00:55:20 2021 +0100
>
> media: uvcvideo: Use dma_alloc_noncontiguous API
>
> Ricardo, do we need to invalidate the vmap range in the DMA_TO_DEVICE
> case ?
>
> > > + invalidate_kernel_vmap_range(urb->transfer_buffer,
> > > + urb->transfer_buffer_length);
> > > + dma_sync_sgtable_for_cpu(hcd->self.sysdev, urb->sgt, dir);
>
> In the DMA_FROM_DEVICE case, shouldn't the vmap range should be
> invalidated after calling dma_sync_sgtable_for_cpu() ? Otherwise I think
> speculative reads coming between invalidation and dma sync could result
> in data corruption.
Your concern sounds resonable. But I see some drivers also call
invalidate_kernel_vmap_range() before dma_sync_sgtable_for_cpu(). I'm not
sure what's the correct order :(
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/sound/core/memalloc.c?h=linux-6.15.y#n600
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/media/common/videobuf2/videobuf2-dma-contig.c?h=linux-6.15.y#n157
Thanks,
Xu Yang
>
> > > +}
> >
> > This entire routine should be inserted at the appropriate place in
> > usb_hcd_unmap_urb_for_dma() instead of being standalone.
> >
> > > +static void usb_dma_noncoherent_sync_for_device(struct usb_hcd *hcd,
> > > + struct urb *urb)
> > > +{
> > > + enum dma_data_direction dir;
> > > +
> > > + if (!urb->sgt)
> > > + return;
> > > +
> > > + dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > > + flush_kernel_vmap_range(urb->transfer_buffer,
> > > + urb->transfer_buffer_length);
> > > + dma_sync_sgtable_for_device(hcd->self.sysdev, urb->sgt, dir);
> > > +}
> >
> > Likewise, this code belongs inside usb_hcd_map_urb_for_dma().
> >
> > Also, the material that this routine replaces in the uvc and stk1160
> > drivers do not call flush_kernel_vmap_range(). Why did you add that
> > here? Was this omission a bug in those drivers?
> >
> > Alan Stern
>
> --
> Regards,
>
> Laurent Pinchart
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-30 8:23 ` Laurent Pinchart
@ 2025-06-30 13:37 ` Christoph Hellwig
0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2025-06-30 13:37 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Ricardo Ribalda, Christoph Hellwig, Alan Stern, Xu Yang, ezequiel,
mchehab, hdegoede, gregkh, mingo, tglx, andriy.shevchenko, viro,
thomas.weissschuh, linux-media, linux-kernel, linux-usb, imx,
jun.li
On Mon, Jun 30, 2025 at 11:23:13AM +0300, Laurent Pinchart wrote:
> Christoph, you mentioned
>
> Right now we don't have a proper state machine for the
> *_kernel_vmap_range, but we should probably add one once usage of this
> grows.
>
> Has there been any progress on that front ?
None that I'm aware off anyway.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-30 8:18 ` Xu Yang
@ 2025-06-30 14:16 ` Alan Stern
2025-06-30 17:38 ` Laurent Pinchart
0 siblings, 1 reply; 18+ messages in thread
From: Alan Stern @ 2025-06-30 14:16 UTC (permalink / raw)
To: Xu Yang, Christoph Hellwig
Cc: ezequiel, mchehab, laurent.pinchart, hdegoede, gregkh, mingo,
tglx, andriy.shevchenko, viro, thomas.weissschuh, linux-media,
linux-kernel, linux-usb, imx, jun.li
On Mon, Jun 30, 2025 at 04:18:51PM +0800, Xu Yang wrote:
> > Also, the material that this routine replaces in the uvc and stk1160
> > drivers do not call flush_kernel_vmap_range(). Why did you add that
> > here? Was this omission a bug in those drivers?
>
> According to dma-api.rst:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/tree/Documentation/core-api/dma-api.rst?h=linux-6.15.y#n664
>
> "Once a non-contiguous allocation is mapped using this function, the
> flush_kernel_vmap_range() and invalidate_kernel_vmap_range() APIs must
> be used to manage the coherency between the kernel mapping, the device
> and user space mappings (if any)."
>
> Possibly the uvc and stk1160 missed calling it, but since they won't be
> the only user of the USB core, so we'd better call these APIs.
Documentation/core-api/cachetbl.rst says:
``void flush_kernel_vmap_range(void *vaddr, int size)``
flushes the kernel cache for a given virtual address range in
the vmap area. This is to make sure that any data the kernel
modified in the vmap range is made visible to the physical
page. The design is to make this area safe to perform I/O on.
Note that this API does *not* also flush the offset map alias
of the area.
``void invalidate_kernel_vmap_range(void *vaddr, int size) invalidates``
the cache for a given virtual address range in the vmap area
which prevents the processor from making the cache stale by
speculatively reading data while the I/O was occurring to the
physical pages. This is only necessary for data reads into the
vmap area.
So invalidate_kernel_vmap_range() is not needed for data writes, that
is, for OUT transfers. And ironically, flush_kernel_vmap_range() _is_
needed (but only for OUT transfers).
On the other hand, Christoph may think these call should be included
regardless. Let's see what he recommends. Christoph?
(Actually, I would expect dma_sync_sgtable_for_cpu() and
dma_sync_sgtable_for_device() to handle all of this for us
automatically, but never mind...)
Alan Stern
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-30 14:16 ` Alan Stern
@ 2025-06-30 17:38 ` Laurent Pinchart
2025-07-01 9:32 ` Xu Yang
0 siblings, 1 reply; 18+ messages in thread
From: Laurent Pinchart @ 2025-06-30 17:38 UTC (permalink / raw)
To: Alan Stern
Cc: Xu Yang, Christoph Hellwig, ezequiel, mchehab, hdegoede, gregkh,
mingo, tglx, andriy.shevchenko, viro, thomas.weissschuh,
linux-media, linux-kernel, linux-usb, imx, jun.li
Hi Alan,
On Mon, Jun 30, 2025 at 10:16:30AM -0400, Alan Stern wrote:
> On Mon, Jun 30, 2025 at 04:18:51PM +0800, Xu Yang wrote:
> > > Also, the material that this routine replaces in the uvc and stk1160
> > > drivers do not call flush_kernel_vmap_range(). Why did you add that
> > > here? Was this omission a bug in those drivers?
> >
> > According to dma-api.rst:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/tree/Documentation/core-api/dma-api.rst?h=linux-6.15.y#n664
> >
> > "Once a non-contiguous allocation is mapped using this function, the
> > flush_kernel_vmap_range() and invalidate_kernel_vmap_range() APIs must
> > be used to manage the coherency between the kernel mapping, the device
> > and user space mappings (if any)."
> >
> > Possibly the uvc and stk1160 missed calling it, but since they won't be
> > the only user of the USB core, so we'd better call these APIs.
>
> Documentation/core-api/cachetbl.rst says:
>
> ``void flush_kernel_vmap_range(void *vaddr, int size)``
>
> flushes the kernel cache for a given virtual address range in
> the vmap area. This is to make sure that any data the kernel
> modified in the vmap range is made visible to the physical
> page. The design is to make this area safe to perform I/O on.
> Note that this API does *not* also flush the offset map alias
> of the area.
>
> ``void invalidate_kernel_vmap_range(void *vaddr, int size) invalidates``
>
> the cache for a given virtual address range in the vmap area
> which prevents the processor from making the cache stale by
> speculatively reading data while the I/O was occurring to the
> physical pages. This is only necessary for data reads into the
> vmap area.
>
> So invalidate_kernel_vmap_range() is not needed for data writes, that
> is, for OUT transfers. And ironically, flush_kernel_vmap_range() _is_
> needed (but only for OUT transfers).
flush_kernel_vmap_range() for OUT transfers and
invalidate_kernel_vmap_range() for IN transfers make sense to me.
> On the other hand, Christoph may think these call should be included
> regardless. Let's see what he recommends. Christoph?
>
> (Actually, I would expect dma_sync_sgtable_for_cpu() and
> dma_sync_sgtable_for_device() to handle all of this for us
> automatically, but never mind...)
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API
2025-06-30 17:38 ` Laurent Pinchart
@ 2025-07-01 9:32 ` Xu Yang
0 siblings, 0 replies; 18+ messages in thread
From: Xu Yang @ 2025-07-01 9:32 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Alan Stern, Christoph Hellwig, ezequiel, mchehab, hdegoede,
gregkh, mingo, tglx, andriy.shevchenko, viro, thomas.weissschuh,
linux-media, linux-kernel, linux-usb, imx, jun.li
Hi Alan and Laurent,
On Mon, Jun 30, 2025 at 08:38:51PM +0300, Laurent Pinchart wrote:
> Hi Alan,
>
> On Mon, Jun 30, 2025 at 10:16:30AM -0400, Alan Stern wrote:
> > On Mon, Jun 30, 2025 at 04:18:51PM +0800, Xu Yang wrote:
> > > > Also, the material that this routine replaces in the uvc and stk1160
> > > > drivers do not call flush_kernel_vmap_range(). Why did you add that
> > > > here? Was this omission a bug in those drivers?
> > >
> > > According to dma-api.rst:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/tree/Documentation/core-api/dma-api.rst?h=linux-6.15.y#n664
> > >
> > > "Once a non-contiguous allocation is mapped using this function, the
> > > flush_kernel_vmap_range() and invalidate_kernel_vmap_range() APIs must
> > > be used to manage the coherency between the kernel mapping, the device
> > > and user space mappings (if any)."
> > >
> > > Possibly the uvc and stk1160 missed calling it, but since they won't be
> > > the only user of the USB core, so we'd better call these APIs.
> >
> > Documentation/core-api/cachetbl.rst says:
> >
> > ``void flush_kernel_vmap_range(void *vaddr, int size)``
> >
> > flushes the kernel cache for a given virtual address range in
> > the vmap area. This is to make sure that any data the kernel
> > modified in the vmap range is made visible to the physical
> > page. The design is to make this area safe to perform I/O on.
> > Note that this API does *not* also flush the offset map alias
> > of the area.
> >
> > ``void invalidate_kernel_vmap_range(void *vaddr, int size) invalidates``
> >
> > the cache for a given virtual address range in the vmap area
> > which prevents the processor from making the cache stale by
> > speculatively reading data while the I/O was occurring to the
> > physical pages. This is only necessary for data reads into the
> > vmap area.
> >
> > So invalidate_kernel_vmap_range() is not needed for data writes, that
> > is, for OUT transfers. And ironically, flush_kernel_vmap_range() _is_
> > needed (but only for OUT transfers).
>
> flush_kernel_vmap_range() for OUT transfers and
> invalidate_kernel_vmap_range() for IN transfers make sense to me.
Many thanks for your suggestions!
I'll do it in next version.
Thanks,
Xu Yang
>
> > On the other hand, Christoph may think these call should be included
> > regardless. Let's see what he recommends. Christoph?
> >
> > (Actually, I would expect dma_sync_sgtable_for_cpu() and
> > dma_sync_sgtable_for_device() to handle all of this for us
> > automatically, but never mind...)
>
> --
> Regards,
>
> Laurent Pinchart
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-07-01 9:36 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-27 10:19 [PATCH v2 0/3] add dma noncoherent API Xu Yang
2025-06-27 10:19 ` [PATCH v2 1/3] usb: core: add dma-noncoherent buffer alloc and free API Xu Yang
2025-06-27 11:22 ` Andy Shevchenko
2025-06-30 7:26 ` Xu Yang
2025-06-27 14:23 ` Alan Stern
2025-06-29 23:39 ` Laurent Pinchart
2025-06-30 6:48 ` Ricardo Ribalda
2025-06-30 8:23 ` Laurent Pinchart
2025-06-30 13:37 ` Christoph Hellwig
2025-06-30 8:45 ` Xu Yang
2025-06-30 8:18 ` Xu Yang
2025-06-30 14:16 ` Alan Stern
2025-06-30 17:38 ` Laurent Pinchart
2025-07-01 9:32 ` Xu Yang
2025-06-27 10:19 ` [PATCH v2 2/3] media: uvcvideo: use usb_alloc_noncoherent/usb_free_noncoherent() Xu Yang
2025-06-27 10:19 ` [PATCH v2 3/3] media: stk1160: " Xu Yang
2025-06-27 11:25 ` [PATCH v2 0/3] add dma noncoherent API Andy Shevchenko
2025-06-30 7:28 ` Xu Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox