* [PATCH v2 0/4] dmaengine: dma-axi-dmac: Some memory related fixes
@ 2026-03-27 16:58 Nuno Sá via B4 Relay
2026-03-27 16:58 ` [PATCH v2 1/4] dmaengine: Fix possuible use after free Nuno Sá via B4 Relay
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Nuno Sá via B4 Relay @ 2026-03-27 16:58 UTC (permalink / raw)
To: dmaengine, linux-kernel
Cc: Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas
Ok, I rushed into v2 because I saw (based on AI review) that I already
had some fundamental issues. Some fairly straightforward (a bit
embarrassing tbh) but others not so much. Another thing to note is that
I changed the order between "fix use-after-free on unbind" and "Defer
freeing DMA descriptors" because it just makes more sense given that
using the worker only works 100% if our DMA object is not bound to the
platform driver.
Anyways, more details in the changelog.
Also note the addition of two new patches. The dmaengine one seems
legit, but I want to note it was found by code inspection only.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
---
Changes in v2:
- Patch 1:
* New patch.
- Patch 2:
* New patch.
- Patch 3:
* Use __free() to allocate the object so we don't leak on early
errors. Note that after dmaenginem_async_device_register(), the
object lifetime is handled by dmaengine;
* Move get_device() to after registering the device;
* Still allow freeing DMA descriptors in axi_dmac_terminate_all();
* Use spin_lock_irqsave() to avoid possible deadlocks.
* Include spinlock.h
- Patch 4:
* Include workqueue.h;
* Save struct device directly in struct axi_dmac_desc and take a
reference when allocating. Release the reference when freeing the
descriptor.
- Link to v1: https://patch.msgid.link/20260326-dma-dmac-handle-vunmap-v1-0-be3e46ffaf69@analog.com
---
Eliza Balas (1):
dmaengine: dma-axi-dmac: Defer freeing DMA descriptors
Nuno Sá (3):
dmaengine: Fix possuible use after free
dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc
dmaengine: dma-axi-dmac: fix use-after-free on unbind
drivers/dma/dma-axi-dmac.c | 122 ++++++++++++++++++++++++++++++++++++---------
drivers/dma/dmaengine.c | 3 +-
2 files changed, 100 insertions(+), 25 deletions(-)
---
base-commit: b7560798466a07d9c3fb011698e92c335ab28baf
change-id: 20260325-dma-dmac-handle-vunmap-84a06df7d133
--
Thanks!
- Nuno Sá
^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/4] dmaengine: Fix possuible use after free
  2026-03-27 16:58 [PATCH v2 0/4] dmaengine: dma-axi-dmac: Some memory related fixes Nuno Sá via B4 Relay
@ 2026-03-27 16:58 ` Nuno Sá via B4 Relay
  2026-03-30 15:15   ` Frank Li
  2026-03-27 16:58 ` [PATCH v2 2/4] dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc Nuno Sá via B4 Relay
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Nuno Sá via B4 Relay @ 2026-03-27 16:58 UTC (permalink / raw)
  To: dmaengine, linux-kernel; +Cc: Lars-Peter Clausen, Vinod Koul, Frank Li

From: Nuno Sá <nuno.sa@analog.com>

In dma_release_channel(), we first called dma_chan_put() and then
checked chan->device->privatecnt for possibly clearing DMA_PRIVATE.
However, dma_chan_put() will call dma_device_put() which could,
potentially (if the DMA provider is already gone for example),
release the last reference of the device and hence freeing
the it.

Fix it, by doing the check before calling dma_chan_put().

Fixes: 0f571515c332 ("dmaengine: Add privatecnt to revert DMA_PRIVATE property")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
---
 drivers/dma/dmaengine.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 405bd2fbb4a3..9049171df857 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -905,11 +905,12 @@ void dma_release_channel(struct dma_chan *chan)
 	mutex_lock(&dma_list_mutex);
 	WARN_ONCE(chan->client_count != 1,
 		  "chan reference count %d != 1\n", chan->client_count);
-	dma_chan_put(chan);
 	/* drop PRIVATE cap enabled by __dma_request_channel() */
 	if (--chan->device->privatecnt == 0)
 		dma_cap_clear(DMA_PRIVATE, chan->device->cap_mask);

+	dma_chan_put(chan);
+
 	if (chan->slave) {
 		sysfs_remove_link(&chan->dev->device.kobj, DMA_SLAVE_NAME);
 		sysfs_remove_link(&chan->slave->kobj, chan->name);

--
2.53.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
* Re: [PATCH v2 1/4] dmaengine: Fix possuible use after free
  2026-03-27 16:58 ` [PATCH v2 1/4] dmaengine: Fix possuible use after free Nuno Sá via B4 Relay
@ 2026-03-30 15:15   ` Frank Li
  0 siblings, 0 replies; 17+ messages in thread
From: Frank Li @ 2026-03-30 15:15 UTC (permalink / raw)
  To: Nuno Sá
  Cc: dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li

On Fri, Mar 27, 2026 at 04:58:38PM +0000, Nuno Sá wrote:

typo at subject possuible

dmaengine: move dma_chan_put() after check dmaengine device's privatecnt

> In dma_release_channel(), we first called dma_chan_put() and then
> checked chan->device->privatecnt for possibly clearing DMA_PRIVATE.
> However, dma_chan_put() will call dma_device_put() which could,
> potentially (if the DMA provider is already gone for example),
> release the last reference of the device and hence freeing
> the it.

Avoid use "we"..

In dma_release_channel(), call dma_chan_put() before check
chan->device->privatecnt, which cause DMA engine device potentially is
gone when the last reference of the device is released.

Fixed it by moving dma_chan_put() after check chan->device->privatecnt.

Frank

>
> Fix it, by doing the check before calling dma_chan_put().
>
> Fixes: 0f571515c332 ("dmaengine: Add privatecnt to revert DMA_PRIVATE property")
> Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> ---
[...]
> --
> 2.53.0
>

^ permalink raw reply	[flat|nested] 17+ messages in thread
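[Editor's note] The ordering issue patch 1 fixes can be illustrated outside the kernel. The sketch below is a hypothetical, simplified userspace model in plain C — `provider`, `provider_put()` and `release_channel_fixed()` are made-up stand-ins, not dmaengine APIs. The point it demonstrates is the one from the commit message: `privatecnt` must be decremented while a reference is still held, because the put that follows may be the final one and free the object.

```c
#include <assert.h>

/*
 * Hypothetical model of the dma_release_channel() ordering fix.
 * None of these names exist in the kernel.
 */
struct provider {
	int refcount;
	int privatecnt;
	int freed;	/* stands in for "memory returned to the allocator" */
};

static void provider_put(struct provider *p)
{
	if (--p->refcount == 0)
		p->freed = 1;	/* real code would free the object here */
}

/*
 * Fixed ordering, mirroring the patch: drop privatecnt while the
 * provider is guaranteed alive, and only then drop the reference
 * that may free it. Doing it the other way around reads freed memory.
 */
static int release_channel_fixed(struct provider *p)
{
	int cleared = (--p->privatecnt == 0);	/* safe: ref still held */

	provider_put(p);			/* may be the final put */
	return cleared;
}
```

With the buggy ordering, the `--p->privatecnt` would execute after `provider_put()` possibly freed `p`, which is exactly the use-after-free window the patch closes.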
* [PATCH v2 2/4] dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc
  2026-03-27 16:58 [PATCH v2 0/4] dmaengine: dma-axi-dmac: Some memory related fixes Nuno Sá via B4 Relay
  2026-03-27 16:58 ` [PATCH v2 1/4] dmaengine: Fix possuible use after free Nuno Sá via B4 Relay
@ 2026-03-27 16:58 ` Nuno Sá via B4 Relay
  2026-03-30 15:17   ` Frank Li
  2026-03-27 16:58 ` [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind Nuno Sá via B4 Relay
  2026-03-27 16:58 ` [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors Nuno Sá via B4 Relay
  3 siblings, 1 reply; 17+ messages in thread
From: Nuno Sá via B4 Relay @ 2026-03-27 16:58 UTC (permalink / raw)
  To: dmaengine, linux-kernel; +Cc: Lars-Peter Clausen, Vinod Koul, Frank Li

From: Nuno Sá <nuno.sa@analog.com>

In axi_dmac_prep_peripheral_dma_vec() if we fail after calling
axi_dmac_alloc_desc(), we need to use axi_dmac_free_desc() to fully free
the descriptor.

Fixes: 74609e568670 ("dmaengine: dma-axi-dmac: Implement device_prep_peripheral_dma_vec")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
---
 drivers/dma/dma-axi-dmac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c
index 45c2c8e4bc45..127c3cf80a0e 100644
--- a/drivers/dma/dma-axi-dmac.c
+++ b/drivers/dma/dma-axi-dmac.c
@@ -769,7 +769,7 @@ axi_dmac_prep_peripheral_dma_vec(struct dma_chan *c, const struct dma_vec *vecs,
 	for (i = 0; i < nb; i++) {
 		if (!axi_dmac_check_addr(chan, vecs[i].addr) ||
 		    !axi_dmac_check_len(chan, vecs[i].len)) {
-			kfree(desc);
+			axi_dmac_free_desc(desc);
 			return NULL;
 		}

--
2.53.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
* Re: [PATCH v2 2/4] dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc
  2026-03-27 16:58 ` [PATCH v2 2/4] dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc Nuno Sá via B4 Relay
@ 2026-03-30 15:17   ` Frank Li
  0 siblings, 0 replies; 17+ messages in thread
From: Frank Li @ 2026-03-30 15:17 UTC (permalink / raw)
  To: Nuno Sá
  Cc: dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li

On Fri, Mar 27, 2026 at 04:58:39PM +0000, Nuno Sá wrote:
> In axi_dmac_prep_peripheral_dma_vec() if we fail after calling
> axi_dmac_alloc_desc(), we need to use axi_dmac_free_desc() to fully free
> the descriptor.

Call axi_dmac_free_desc() instead of kfree() to do fully cleanup at err
handle path.

Frank

>
> Fixes: 74609e568670 ("dmaengine: dma-axi-dmac: Implement device_prep_peripheral_dma_vec")
> Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> ---
[...]
> --
> 2.53.0
>

^ permalink raw reply	[flat|nested] 17+ messages in thread
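[Editor's note] The leak class behind patch 2 reduces to a small userspace sketch. The names below (`desc`, `desc_alloc`, `desc_free`) are illustrative stand-ins for the driver's descriptor, not its actual layout: when the outer object owns further allocations, a bare `free()`/`kfree()` of the outer pointer leaks the inner ones, hence the need for a dedicated free helper like `axi_dmac_free_desc()`.

```c
#include <stdlib.h>

/*
 * Hypothetical sketch: a descriptor that owns a separately allocated
 * array, analogous to the SG entries owned by struct axi_dmac_desc.
 */
struct desc {
	int *sg;	/* stands in for the hardware SG entries */
	size_t num_sgs;
};

static struct desc *desc_alloc(size_t num_sgs)
{
	struct desc *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->sg = calloc(num_sgs, sizeof(*d->sg));
	if (!d->sg) {
		free(d);
		return NULL;
	}
	d->num_sgs = num_sgs;
	return d;
}

/* The dedicated helper: a plain free(d) alone would leak d->sg. */
static void desc_free(struct desc *d)
{
	free(d->sg);
	free(d);
}
```

Error paths that got the object from `desc_alloc()` must unwind through `desc_free()`, which is exactly the substitution the one-line patch makes.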
* [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind
  2026-03-27 16:58 [PATCH v2 0/4] dmaengine: dma-axi-dmac: Some memory related fixes Nuno Sá via B4 Relay
  2026-03-27 16:58 ` [PATCH v2 1/4] dmaengine: Fix possuible use after free Nuno Sá via B4 Relay
  2026-03-27 16:58 ` [PATCH v2 2/4] dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc Nuno Sá via B4 Relay
@ 2026-03-27 16:58 ` Nuno Sá via B4 Relay
  2026-03-30 15:22   ` Frank Li
  2026-03-27 16:58 ` [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors Nuno Sá via B4 Relay
  3 siblings, 1 reply; 17+ messages in thread
From: Nuno Sá via B4 Relay @ 2026-03-27 16:58 UTC (permalink / raw)
  To: dmaengine, linux-kernel; +Cc: Lars-Peter Clausen, Vinod Koul, Frank Li

From: Nuno Sá <nuno.sa@analog.com>

The DMA device lifetime can extend beyond the platform driver unbind if
DMA channels are still referenced by client drivers. This leads to
use-after-free when the devm-managed memory is freed on unbind but the
DMA device callbacks still access it.

Fix this by:
- Allocating axi_dmac with kzalloc_obj() instead of devm_kzalloc() so
  its lifetime is not tied to the platform device.
- Implementing the device_release callback that so that we can free
  the object when reference count gets to 0 (no users).
- Adding an 'unbound' flag protected by the vchan lock that is set
  during driver removal, preventing MMIO accesses after the device has
  been unbound.

While at it, explicitly include spinlock.h given it was missing.

Signed-off-by: Nuno Sá <nuno.sa@analog.com>
---
 drivers/dma/dma-axi-dmac.c | 70 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 60 insertions(+), 10 deletions(-)

diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c
index 127c3cf80a0e..70d3ad7e7d37 100644
--- a/drivers/dma/dma-axi-dmac.c
+++ b/drivers/dma/dma-axi-dmac.c
@@ -24,6 +24,7 @@
 #include <linux/platform_device.h>
 #include <linux/regmap.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>

 #include <dt-bindings/dma/axi-dmac.h>

@@ -174,6 +175,8 @@ struct axi_dmac {

 	struct dma_device dma_dev;
 	struct axi_dmac_chan chan;
+
+	bool unbound;
 };

 static struct axi_dmac *chan_to_axi_dmac(struct axi_dmac_chan *chan)
@@ -182,6 +185,11 @@ static struct axi_dmac *chan_to_axi_dmac(struct axi_dmac_chan *chan)
 			    dma_dev);
 }

+static struct axi_dmac *dev_to_axi_dmac(struct dma_device *dev)
+{
+	return container_of(dev, struct axi_dmac, dma_dev);
+}
+
 static struct axi_dmac_chan *to_axi_dmac_chan(struct dma_chan *c)
 {
 	return container_of(c, struct axi_dmac_chan, vchan.chan);
@@ -614,7 +622,12 @@ static int axi_dmac_terminate_all(struct dma_chan *c)
 	LIST_HEAD(head);

 	spin_lock_irqsave(&chan->vchan.lock, flags);
-	axi_dmac_write(dmac, AXI_DMAC_REG_CTRL, 0);
+	/*
+	 * Only allow the MMIO access if the device is live. Otherwise still
+	 * go for freeing the descriptors.
+	 */
+	if (!dmac->unbound)
+		axi_dmac_write(dmac, AXI_DMAC_REG_CTRL, 0);
 	chan->next_desc = NULL;
 	vchan_get_all_descriptors(&chan->vchan, &head);
 	list_splice_tail_init(&chan->active_descs, &head);
@@ -642,9 +655,12 @@ static void axi_dmac_issue_pending(struct dma_chan *c)
 	if (chan->hw_sg)
 		ctrl |= AXI_DMAC_CTRL_ENABLE_SG;

-	axi_dmac_write(dmac, AXI_DMAC_REG_CTRL, ctrl);
-
 	spin_lock_irqsave(&chan->vchan.lock, flags);
+	if (dmac->unbound) {
+		spin_unlock_irqrestore(&chan->vchan.lock, flags);
+		return;
+	}
+	axi_dmac_write(dmac, AXI_DMAC_REG_CTRL, ctrl);
 	if (vchan_issue_pending(&chan->vchan))
 		axi_dmac_start_transfer(chan);
 	spin_unlock_irqrestore(&chan->vchan.lock, flags);
@@ -1184,6 +1200,14 @@ static int axi_dmac_detect_caps(struct axi_dmac *dmac, unsigned int version)
 	return 0;
 }

+static void axi_dmac_release(struct dma_device *dma_dev)
+{
+	struct axi_dmac *dmac = dev_to_axi_dmac(dma_dev);
+
+	put_device(dma_dev->dev);
+	kfree(dmac);
+}
+
 static void axi_dmac_tasklet_kill(void *task)
 {
 	tasklet_kill(task);
@@ -1194,16 +1218,27 @@ static void axi_dmac_free_dma_controller(void *of_node)
 	of_dma_controller_free(of_node);
 }

+static void axi_dmac_disable(void *__dmac)
+{
+	struct axi_dmac *dmac = __dmac;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dmac->chan.vchan.lock, flags);
+	dmac->unbound = true;
+	spin_unlock_irqrestore(&dmac->chan.vchan.lock, flags);
+	axi_dmac_write(dmac, AXI_DMAC_REG_CTRL, 0);
+}
+
 static int axi_dmac_probe(struct platform_device *pdev)
 {
 	struct dma_device *dma_dev;
-	struct axi_dmac *dmac;
+	struct axi_dmac *__dmac;
 	struct regmap *regmap;
 	unsigned int version;
 	u32 irq_mask = 0;
 	int ret;

-	dmac = devm_kzalloc(&pdev->dev, sizeof(*dmac), GFP_KERNEL);
+	struct axi_dmac *dmac __free(kfree) = kzalloc_obj(struct axi_dmac);
 	if (!dmac)
 		return -ENOMEM;

@@ -1251,6 +1286,7 @@ static int axi_dmac_probe(struct platform_device *pdev)
 	dma_dev->dev = &pdev->dev;
 	dma_dev->src_addr_widths = BIT(dmac->chan.src_width);
 	dma_dev->dst_addr_widths = BIT(dmac->chan.dest_width);
+	dma_dev->device_release = axi_dmac_release;
 	dma_dev->directions = BIT(dmac->chan.direction);
 	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
 	dma_dev->max_sg_burst = 31; /* 31 SGs maximum in one burst */
@@ -1285,12 +1321,21 @@ static int axi_dmac_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;

+	/*
+	 * From this point on, our dmac object has it's lifetime bounded with
+	 * dma_dev. Will be freed when dma_dev refcount goes to 0. That means,
+	 * no more automatic kfree(). Also note that dmac is now NULL so we
+	 * need __dmac.
+	 */
+	__dmac = no_free_ptr(dmac);
+	get_device(&pdev->dev);
+
 	/*
 	 * Put the action in here so it get's done before unregistering the DMA
 	 * device.
 	 */
 	ret = devm_add_action_or_reset(&pdev->dev, axi_dmac_tasklet_kill,
-				       &dmac->chan.vchan.task);
+				       &__dmac->chan.vchan.task);
 	if (ret)
 		return ret;

@@ -1304,13 +1349,18 @@ static int axi_dmac_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;

-	ret = devm_request_irq(&pdev->dev, dmac->irq, axi_dmac_interrupt_handler,
-			       IRQF_SHARED, dev_name(&pdev->dev), dmac);
+	/* So that we can mark the device as unbound and disable it */
+	ret = devm_add_action_or_reset(&pdev->dev, axi_dmac_disable, __dmac);
 	if (ret)
 		return ret;

-	regmap = devm_regmap_init_mmio(&pdev->dev, dmac->base,
-				       &axi_dmac_regmap_config);
+	ret = devm_request_irq(&pdev->dev, __dmac->irq, axi_dmac_interrupt_handler,
+			       IRQF_SHARED, dev_name(&pdev->dev), __dmac);
+	if (ret)
+		return ret;
+
+	regmap = devm_regmap_init_mmio(&pdev->dev, __dmac->base,
+				       &axi_dmac_regmap_config);

 	return PTR_ERR_OR_ZERO(regmap);
 }

--
2.53.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
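[Editor's note] The `__free(kfree)`/`no_free_ptr()` dance in the probe path above can be mimicked in userspace with the compiler `cleanup` attribute the kernel helpers are built on. This is a hedged sketch (GCC/Clang only; `probe()`, `take_ownership()` and `__auto_kfree` are made-up stand-ins for the kernel's `cleanup.h` machinery): the pointer is freed automatically on early error returns, and is "stolen" from the scoped variable once ownership moves to the DMA device.

```c
#include <stdlib.h>

/* Cleanup handler: frees whatever the annotated pointer still owns. */
static void auto_free(void *pp)
{
	free(*(void **)pp);
}
#define __auto_kfree __attribute__((cleanup(auto_free)))

/* Models no_free_ptr(): hand ownership out and disarm the cleanup. */
static void *take_ownership(void **pp)
{
	void *p = *pp;

	*pp = NULL;	/* cleanup handler now sees NULL, frees nothing */
	return p;
}

/*
 * Models the probe flow: any early return frees the object without an
 * explicit error path; on success the object outlives the function.
 */
static int probe(int fail_early, void **out)
{
	__auto_kfree void *dmac = calloc(1, 64);

	if (!dmac)
		return -1;
	if (fail_early)
		return -1;	/* dmac freed automatically here */

	*out = take_ownership(&dmac);	/* lifetime now with the caller */
	return 0;
}
```

After the hand-off point, the scoped variable is NULL — which is why the patch switches to a second pointer (`__dmac`) for everything that runs after registration.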
* Re: [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind
  2026-03-27 16:58 ` [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind Nuno Sá via B4 Relay
@ 2026-03-30 15:22   ` Frank Li
  2026-03-31  8:46     ` Nuno Sá
  0 siblings, 1 reply; 17+ messages in thread
From: Frank Li @ 2026-03-30 15:22 UTC (permalink / raw)
  To: Nuno Sá
  Cc: dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li

On Fri, Mar 27, 2026 at 04:58:40PM +0000, Nuno Sá wrote:
> The DMA device lifetime can extend beyond the platform driver unbind if
> DMA channels are still referenced by client drivers. This leads to
> use-after-free when the devm-managed memory is freed on unbind but the
> DMA device callbacks still access it.
>
> Fix this by:
> - Allocating axi_dmac with kzalloc_obj() instead of devm_kzalloc() so
>   its lifetime is not tied to the platform device.
> - Implementing the device_release callback that so that we can free
>   the object when reference count gets to 0 (no users).
> - Adding an 'unbound' flag protected by the vchan lock that is set
>   during driver removal, preventing MMIO accesses after the device has
>   been unbound.
>
> While at it, explicitly include spinlock.h given it was missing.
>
> Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> ---

Not sure if it similar with
https://lore.kernel.org/dmaengine/20250903-v6-16-topic-sdma-v1-9-ac7bab629e8b@pengutronix.de/

It looks like miss device link between comsumer and provider.

Frank

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind
  2026-03-30 15:22   ` Frank Li
@ 2026-03-31  8:46     ` Nuno Sá
  2026-03-31 14:20       ` Frank Li
  0 siblings, 1 reply; 17+ messages in thread
From: Nuno Sá @ 2026-03-31 8:46 UTC (permalink / raw)
  To: Frank Li
  Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen,
	Vinod Koul, Frank Li

On Mon, Mar 30, 2026 at 11:22:10AM -0400, Frank Li wrote:
> On Fri, Mar 27, 2026 at 04:58:40PM +0000, Nuno Sá wrote:
> > The DMA device lifetime can extend beyond the platform driver unbind if
> > DMA channels are still referenced by client drivers. This leads to
> > use-after-free when the devm-managed memory is freed on unbind but the
> > DMA device callbacks still access it.
[...]
>
> Not sure if it similar with
> https://lore.kernel.org/dmaengine/20250903-v6-16-topic-sdma-v1-9-ac7bab629e8b@pengutronix.de/
>
> It looks like miss device link between comsumer and provider.

Well, it surely it's related. I mean, if we ensure the consumers are
gone through devlinks and nothing is left behind, then this patch is
basically unneeded.

But, FWIW, my 2cents would also go into questioning if AUTOREMOVE is
really want we want in every situation? Might be to harsh to assume that
a DMA channel consumer is useless even if DMA is gone. Anyways, is there
a v2 already? I would be interested in following this one...

- Nuno Sá

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind
  2026-03-31  8:46     ` Nuno Sá
@ 2026-03-31 14:20       ` Frank Li
  0 siblings, 0 replies; 17+ messages in thread
From: Frank Li @ 2026-03-31 14:20 UTC (permalink / raw)
  To: Nuno Sá
  Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen,
	Vinod Koul, Frank Li

On Tue, Mar 31, 2026 at 09:46:35AM +0100, Nuno Sá wrote:
> On Mon, Mar 30, 2026 at 11:22:10AM -0400, Frank Li wrote:
> > On Fri, Mar 27, 2026 at 04:58:40PM +0000, Nuno Sá wrote:
> > > The DMA device lifetime can extend beyond the platform driver unbind if
> > > DMA channels are still referenced by client drivers. This leads to
> > > use-after-free when the devm-managed memory is freed on unbind but the
> > > DMA device callbacks still access it.
[...]
> >
> > Not sure if it similar with
> > https://lore.kernel.org/dmaengine/20250903-v6-16-topic-sdma-v1-9-ac7bab629e8b@pengutronix.de/
> >
> > It looks like miss device link between comsumer and provider.
>
> Well, it surely it's related. I mean, if we ensure the consumers are
> gone through devlinks and nothing is left behind, then this patch is
> basically unneeded.
>
> But, FWIW, my 2cents would also go into questioning if AUTOREMOVE is
> really want we want in every situation? Might be to harsh to assume that
> a DMA channel consumer is useless even if DMA is gone. Anyways, is there
> a v2 already? I would be interested in following this one...

This patch may be missed by vnod, I have asked vnod to check again.

The open question link to channel device or dma enginee device. I prefer
link to channel devices, so it support more advance's runtime pm
management.

Frank

[...]
> > > - dmac = devm_kzalloc(&pdev->dev, sizeof(*dmac), GFP_KERNEL); > > > + struct axi_dmac *dmac __free(kfree) = kzalloc_obj(struct axi_dmac); > > > if (!dmac) > > > return -ENOMEM; > > > > > > @@ -1251,6 +1286,7 @@ static int axi_dmac_probe(struct platform_device *pdev) > > > dma_dev->dev = &pdev->dev; > > > dma_dev->src_addr_widths = BIT(dmac->chan.src_width); > > > dma_dev->dst_addr_widths = BIT(dmac->chan.dest_width); > > > + dma_dev->device_release = axi_dmac_release; > > > dma_dev->directions = BIT(dmac->chan.direction); > > > dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR; > > > dma_dev->max_sg_burst = 31; /* 31 SGs maximum in one burst */ > > > @@ -1285,12 +1321,21 @@ static int axi_dmac_probe(struct platform_device *pdev) > > > if (ret) > > > return ret; > > > > > > + /* > > > + * From this point on, our dmac object has it's lifetime bounded with > > > + * dma_dev. Will be freed when dma_dev refcount goes to 0. That means, > > > + * no more automatic kfree(). Also note that dmac is now NULL so we > > > + * need __dmac. > > > + */ > > > + __dmac = no_free_ptr(dmac); > > > + get_device(&pdev->dev); > > > + > > > /* > > > * Put the action in here so it get's done before unregistering the DMA > > > * device. 
> > > */ > > > ret = devm_add_action_or_reset(&pdev->dev, axi_dmac_tasklet_kill, > > > - &dmac->chan.vchan.task); > > > + &__dmac->chan.vchan.task); > > > if (ret) > > > return ret; > > > > > > @@ -1304,13 +1349,18 @@ static int axi_dmac_probe(struct platform_device *pdev) > > > if (ret) > > > return ret; > > > > > > - ret = devm_request_irq(&pdev->dev, dmac->irq, axi_dmac_interrupt_handler, > > > - IRQF_SHARED, dev_name(&pdev->dev), dmac); > > > + /* So that we can mark the device as unbound and disable it */ > > > + ret = devm_add_action_or_reset(&pdev->dev, axi_dmac_disable, __dmac); > > > if (ret) > > > return ret; > > > > > > - regmap = devm_regmap_init_mmio(&pdev->dev, dmac->base, > > > - &axi_dmac_regmap_config); > > > + ret = devm_request_irq(&pdev->dev, __dmac->irq, axi_dmac_interrupt_handler, > > > + IRQF_SHARED, dev_name(&pdev->dev), __dmac); > > > + if (ret) > > > + return ret; > > > + > > > + regmap = devm_regmap_init_mmio(&pdev->dev, __dmac->base, > > > + &axi_dmac_regmap_config); > > > > > > return PTR_ERR_OR_ZERO(regmap); > > > } > > > > > > -- > > > 2.53.0 > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-03-27 16:58 [PATCH v2 0/4] dmaengine: dma-axi-dmac: Some memory related fixes Nuno Sá via B4 Relay ` (2 preceding siblings ...) 2026-03-27 16:58 ` [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind Nuno Sá via B4 Relay @ 2026-03-27 16:58 ` Nuno Sá via B4 Relay 2026-03-30 15:24 ` Frank Li 3 siblings, 1 reply; 17+ messages in thread From: Nuno Sá via B4 Relay @ 2026-03-27 16:58 UTC (permalink / raw) To: dmaengine, linux-kernel Cc: Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas From: Eliza Balas <eliza.balas@analog.com> This IP core can be used in architectures (like Microblaze) where DMA descriptors are allocated with vmalloc(). Hence, given that freeing the descriptors happens in softirq context, vunmap() will BUG(). To solve the above, we set up a work item during allocation of the descriptors and schedule it from softirq context. Hence, the actual freeing happens in threaded context. Also note that to account for the possible race where the struct axi_dmac object is gone between scheduling the work and actually running it, we now save and take a reference to the struct device when allocating the descriptor (given that's all we need in axi_dmac_free_desc()) and release it in axi_dmac_free_desc(). 

Signed-off-by: Eliza Balas <eliza.balas@analog.com> Co-developed-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Nuno Sá <nuno.sa@analog.com> --- drivers/dma/dma-axi-dmac.c | 50 ++++++++++++++++++++++++++++++++++------------ 1 file changed, 37 insertions(+), 13 deletions(-) diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c index 70d3ad7e7d37..46f1ead0c7d7 100644 --- a/drivers/dma/dma-axi-dmac.c +++ b/drivers/dma/dma-axi-dmac.c @@ -25,6 +25,7 @@ #include <linux/regmap.h> #include <linux/slab.h> #include <linux/spinlock.h> +#include <linux/workqueue.h> #include <dt-bindings/dma/axi-dmac.h> @@ -133,6 +134,9 @@ struct axi_dmac_sg { struct axi_dmac_desc { struct virt_dma_desc vdesc; struct axi_dmac_chan *chan; + struct device *dev; + + struct work_struct sched_work; bool cyclic; bool cyclic_eot; @@ -666,6 +670,25 @@ static void axi_dmac_issue_pending(struct dma_chan *c) spin_unlock_irqrestore(&chan->vchan.lock, flags); } +static void axi_dmac_free_desc(struct axi_dmac_desc *desc) +{ + struct axi_dmac_hw_desc *hw = desc->sg[0].hw; + dma_addr_t hw_phys = desc->sg[0].hw_phys; + + dma_free_coherent(desc->dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), + hw, hw_phys); + put_device(desc->dev); + kfree(desc); +} + +static void axi_dmac_free_desc_schedule_work(struct work_struct *work) +{ + struct axi_dmac_desc *desc = container_of(work, struct axi_dmac_desc, + sched_work); + + axi_dmac_free_desc(desc); +} + static struct axi_dmac_desc * axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) { @@ -681,6 +704,7 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) return NULL; desc->num_sgs = num_sgs; desc->chan = chan; + desc->dev = get_device(dmac->dma_dev.dev); hws = dma_alloc_coherent(dev, PAGE_ALIGN(num_sgs * sizeof(*hws)), &hw_phys, GFP_ATOMIC); @@ -703,21 +727,18 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) /* The last hardware descriptor will trigger an interrupt */ desc->sg[num_sgs - 
1].hw->flags = AXI_DMAC_HW_FLAG_LAST | AXI_DMAC_HW_FLAG_IRQ; + /* + * We need to setup a work item because this IP can be used on archs + * that rely on vmalloced memory for descriptors. And given that freeing + * the descriptors happens in softirq context, vunmpap() will BUG(). + * Hence, setup the worker so that we can queue it and free the + * descriptor in threaded context. + */ + INIT_WORK(&desc->sched_work, axi_dmac_free_desc_schedule_work); + return desc; } -static void axi_dmac_free_desc(struct axi_dmac_desc *desc) -{ - struct axi_dmac *dmac = chan_to_axi_dmac(desc->chan); - struct device *dev = dmac->dma_dev.dev; - struct axi_dmac_hw_desc *hw = desc->sg[0].hw; - dma_addr_t hw_phys = desc->sg[0].hw_phys; - - dma_free_coherent(dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), - hw, hw_phys); - kfree(desc); -} - static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, enum dma_transfer_direction direction, dma_addr_t addr, unsigned int num_periods, unsigned int period_len, @@ -958,7 +979,10 @@ static void axi_dmac_free_chan_resources(struct dma_chan *c) static void axi_dmac_desc_free(struct virt_dma_desc *vdesc) { - axi_dmac_free_desc(to_axi_dmac_desc(vdesc)); + struct axi_dmac_desc *desc = to_axi_dmac_desc(vdesc); + + /* See the comment in axi_dmac_alloc_desc() for the why! */ + schedule_work(&desc->sched_work); } static bool axi_dmac_regmap_rdwr(struct device *dev, unsigned int reg) -- 2.53.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-03-27 16:58 ` [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors Nuno Sá via B4 Relay @ 2026-03-30 15:24 ` Frank Li 2026-03-31 8:53 ` Nuno Sá 0 siblings, 1 reply; 17+ messages in thread From: Frank Li @ 2026-03-30 15:24 UTC (permalink / raw) To: Nuno Sá Cc: dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Fri, Mar 27, 2026 at 04:58:41PM +0000, Nuno Sá wrote: > From: Eliza Balas <eliza.balas@analog.com> > > This IP core can be used in architectures (like Microblaze) where DMA > descriptors are allocated with vmalloc(). strage, why use vmalloc()? Frank > Hence, given that freeing the > descriptors happen in softirq context, vunmpap() will BUG(). > > To solve the above, we setup a work item during allocation of the > descriptors and schedule in softirq context. Hence, the actual freeing > happens in threaded context. > > Also note that to account for the possible race where the struct axi_dmac > object is gone between scheduling the work and actually running it, we > now save and get a reference of struct device when allocating the > descriptor (given that's all we need in axi_dmac_free_desc()) and > release it in axi_dmac_free_desc(). 
> > Signed-off-by: Eliza Balas <eliza.balas@analog.com> > Co-developed-by: Nuno Sá <nuno.sa@analog.com> > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > --- > drivers/dma/dma-axi-dmac.c | 50 ++++++++++++++++++++++++++++++++++------------ > 1 file changed, 37 insertions(+), 13 deletions(-) > > diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c > index 70d3ad7e7d37..46f1ead0c7d7 100644 > --- a/drivers/dma/dma-axi-dmac.c > +++ b/drivers/dma/dma-axi-dmac.c > @@ -25,6 +25,7 @@ > #include <linux/regmap.h> > #include <linux/slab.h> > #include <linux/spinlock.h> > +#include <linux/workqueue.h> > > #include <dt-bindings/dma/axi-dmac.h> > > @@ -133,6 +134,9 @@ struct axi_dmac_sg { > struct axi_dmac_desc { > struct virt_dma_desc vdesc; > struct axi_dmac_chan *chan; > + struct device *dev; > + > + struct work_struct sched_work; > > bool cyclic; > bool cyclic_eot; > @@ -666,6 +670,25 @@ static void axi_dmac_issue_pending(struct dma_chan *c) > spin_unlock_irqrestore(&chan->vchan.lock, flags); > } > > +static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > +{ > + struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > + dma_addr_t hw_phys = desc->sg[0].hw_phys; > + > + dma_free_coherent(desc->dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > + hw, hw_phys); > + put_device(desc->dev); > + kfree(desc); > +} > + > +static void axi_dmac_free_desc_schedule_work(struct work_struct *work) > +{ > + struct axi_dmac_desc *desc = container_of(work, struct axi_dmac_desc, > + sched_work); > + > + axi_dmac_free_desc(desc); > +} > + > static struct axi_dmac_desc * > axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > { > @@ -681,6 +704,7 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > return NULL; > desc->num_sgs = num_sgs; > desc->chan = chan; > + desc->dev = get_device(dmac->dma_dev.dev); > > hws = dma_alloc_coherent(dev, PAGE_ALIGN(num_sgs * sizeof(*hws)), > &hw_phys, GFP_ATOMIC); > @@ -703,21 +727,18 @@ axi_dmac_alloc_desc(struct 
axi_dmac_chan *chan, unsigned int num_sgs) > /* The last hardware descriptor will trigger an interrupt */ > desc->sg[num_sgs - 1].hw->flags = AXI_DMAC_HW_FLAG_LAST | AXI_DMAC_HW_FLAG_IRQ; > > + /* > + * We need to setup a work item because this IP can be used on archs > + * that rely on vmalloced memory for descriptors. And given that freeing > + * the descriptors happens in softirq context, vunmpap() will BUG(). > + * Hence, setup the worker so that we can queue it and free the > + * descriptor in threaded context. > + */ > + INIT_WORK(&desc->sched_work, axi_dmac_free_desc_schedule_work); > + > return desc; > } > > -static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > -{ > - struct axi_dmac *dmac = chan_to_axi_dmac(desc->chan); > - struct device *dev = dmac->dma_dev.dev; > - struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > - dma_addr_t hw_phys = desc->sg[0].hw_phys; > - > - dma_free_coherent(dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > - hw, hw_phys); > - kfree(desc); > -} > - > static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, > enum dma_transfer_direction direction, dma_addr_t addr, > unsigned int num_periods, unsigned int period_len, > @@ -958,7 +979,10 @@ static void axi_dmac_free_chan_resources(struct dma_chan *c) > > static void axi_dmac_desc_free(struct virt_dma_desc *vdesc) > { > - axi_dmac_free_desc(to_axi_dmac_desc(vdesc)); > + struct axi_dmac_desc *desc = to_axi_dmac_desc(vdesc); > + > + /* See the comment in axi_dmac_alloc_desc() for the why! */ > + schedule_work(&desc->sched_work); > } > > static bool axi_dmac_regmap_rdwr(struct device *dev, unsigned int reg) > > -- > 2.53.0 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-03-30 15:24 ` Frank Li @ 2026-03-31 8:53 ` Nuno Sá 2026-03-31 14:16 ` Frank Li 0 siblings, 1 reply; 17+ messages in thread From: Nuno Sá @ 2026-03-31 8:53 UTC (permalink / raw) To: Frank Li Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Mon, Mar 30, 2026 at 11:24:34AM -0400, Frank Li wrote: > On Fri, Mar 27, 2026 at 04:58:41PM +0000, Nuno Sá wrote: > > From: Eliza Balas <eliza.balas@analog.com> > > > > This IP core can be used in architectures (like Microblaze) where DMA > > descriptors are allocated with vmalloc(). > > strage, why use vmalloc()? It's just one of the paths in dma_alloc_coherent(). It should be architecture dependent. - Nuno Sá > > Frank > > > Hence, given that freeing the > > descriptors happen in softirq context, vunmpap() will BUG(). > > > > To solve the above, we setup a work item during allocation of the > > descriptors and schedule in softirq context. Hence, the actual freeing > > happens in threaded context. > > > > Also note that to account for the possible race where the struct axi_dmac > > object is gone between scheduling the work and actually running it, we > > now save and get a reference of struct device when allocating the > > descriptor (given that's all we need in axi_dmac_free_desc()) and > > release it in axi_dmac_free_desc(). 
> > > > Signed-off-by: Eliza Balas <eliza.balas@analog.com> > > Co-developed-by: Nuno Sá <nuno.sa@analog.com> > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > --- > > drivers/dma/dma-axi-dmac.c | 50 ++++++++++++++++++++++++++++++++++------------ > > 1 file changed, 37 insertions(+), 13 deletions(-) > > > > diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c > > index 70d3ad7e7d37..46f1ead0c7d7 100644 > > --- a/drivers/dma/dma-axi-dmac.c > > +++ b/drivers/dma/dma-axi-dmac.c > > @@ -25,6 +25,7 @@ > > #include <linux/regmap.h> > > #include <linux/slab.h> > > #include <linux/spinlock.h> > > +#include <linux/workqueue.h> > > > > #include <dt-bindings/dma/axi-dmac.h> > > > > @@ -133,6 +134,9 @@ struct axi_dmac_sg { > > struct axi_dmac_desc { > > struct virt_dma_desc vdesc; > > struct axi_dmac_chan *chan; > > + struct device *dev; > > + > > + struct work_struct sched_work; > > > > bool cyclic; > > bool cyclic_eot; > > @@ -666,6 +670,25 @@ static void axi_dmac_issue_pending(struct dma_chan *c) > > spin_unlock_irqrestore(&chan->vchan.lock, flags); > > } > > > > +static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > +{ > > + struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > + dma_addr_t hw_phys = desc->sg[0].hw_phys; > > + > > + dma_free_coherent(desc->dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > + hw, hw_phys); > > + put_device(desc->dev); > > + kfree(desc); > > +} > > + > > +static void axi_dmac_free_desc_schedule_work(struct work_struct *work) > > +{ > > + struct axi_dmac_desc *desc = container_of(work, struct axi_dmac_desc, > > + sched_work); > > + > > + axi_dmac_free_desc(desc); > > +} > > + > > static struct axi_dmac_desc * > > axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > { > > @@ -681,6 +704,7 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > return NULL; > > desc->num_sgs = num_sgs; > > desc->chan = chan; > > + desc->dev = get_device(dmac->dma_dev.dev); > > > > hws = 
dma_alloc_coherent(dev, PAGE_ALIGN(num_sgs * sizeof(*hws)), > > &hw_phys, GFP_ATOMIC); > > @@ -703,21 +727,18 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > /* The last hardware descriptor will trigger an interrupt */ > > desc->sg[num_sgs - 1].hw->flags = AXI_DMAC_HW_FLAG_LAST | AXI_DMAC_HW_FLAG_IRQ; > > > > + /* > > + * We need to setup a work item because this IP can be used on archs > > + * that rely on vmalloced memory for descriptors. And given that freeing > > + * the descriptors happens in softirq context, vunmpap() will BUG(). > > + * Hence, setup the worker so that we can queue it and free the > > + * descriptor in threaded context. > > + */ > > + INIT_WORK(&desc->sched_work, axi_dmac_free_desc_schedule_work); > > + > > return desc; > > } > > > > -static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > -{ > > - struct axi_dmac *dmac = chan_to_axi_dmac(desc->chan); > > - struct device *dev = dmac->dma_dev.dev; > > - struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > - dma_addr_t hw_phys = desc->sg[0].hw_phys; > > - > > - dma_free_coherent(dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > - hw, hw_phys); > > - kfree(desc); > > -} > > - > > static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, > > enum dma_transfer_direction direction, dma_addr_t addr, > > unsigned int num_periods, unsigned int period_len, > > @@ -958,7 +979,10 @@ static void axi_dmac_free_chan_resources(struct dma_chan *c) > > > > static void axi_dmac_desc_free(struct virt_dma_desc *vdesc) > > { > > - axi_dmac_free_desc(to_axi_dmac_desc(vdesc)); > > + struct axi_dmac_desc *desc = to_axi_dmac_desc(vdesc); > > + > > + /* See the comment in axi_dmac_alloc_desc() for the why! */ > > + schedule_work(&desc->sched_work); > > } > > > > static bool axi_dmac_regmap_rdwr(struct device *dev, unsigned int reg) > > > > -- > > 2.53.0 > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-03-31 8:53 ` Nuno Sá @ 2026-03-31 14:16 ` Frank Li 2026-03-31 15:21 ` Nuno Sá 0 siblings, 1 reply; 17+ messages in thread From: Frank Li @ 2026-03-31 14:16 UTC (permalink / raw) To: Nuno Sá Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Tue, Mar 31, 2026 at 09:53:45AM +0100, Nuno Sá wrote: > On Mon, Mar 30, 2026 at 11:24:34AM -0400, Frank Li wrote: > > On Fri, Mar 27, 2026 at 04:58:41PM +0000, Nuno Sá wrote: > > > From: Eliza Balas <eliza.balas@analog.com> > > > > > > This IP core can be used in architectures (like Microblaze) where DMA > > > descriptors are allocated with vmalloc(). > > > > strage, why use vmalloc()? > > It's just one of the paths in dma_alloc_coherent(). It should be > architecture dependant. Which architectures, this may common problem, dma_alloc/free_coherent() is quite common at other dma-engine driver. Frank > > - Nuno Sá > > > > > Frank > > > > > Hence, given that freeing the > > > descriptors happen in softirq context, vunmpap() will BUG(). > > > > > > To solve the above, we setup a work item during allocation of the > > > descriptors and schedule in softirq context. Hence, the actual freeing > > > happens in threaded context. > > > > > > Also note that to account for the possible race where the struct axi_dmac > > > object is gone between scheduling the work and actually running it, we > > > now save and get a reference of struct device when allocating the > > > descriptor (given that's all we need in axi_dmac_free_desc()) and > > > release it in axi_dmac_free_desc(). 
> > > > > > Signed-off-by: Eliza Balas <eliza.balas@analog.com> > > > Co-developed-by: Nuno Sá <nuno.sa@analog.com> > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > > --- > > > drivers/dma/dma-axi-dmac.c | 50 ++++++++++++++++++++++++++++++++++------------ > > > 1 file changed, 37 insertions(+), 13 deletions(-) > > > > > > diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c > > > index 70d3ad7e7d37..46f1ead0c7d7 100644 > > > --- a/drivers/dma/dma-axi-dmac.c > > > +++ b/drivers/dma/dma-axi-dmac.c > > > @@ -25,6 +25,7 @@ > > > #include <linux/regmap.h> > > > #include <linux/slab.h> > > > #include <linux/spinlock.h> > > > +#include <linux/workqueue.h> > > > > > > #include <dt-bindings/dma/axi-dmac.h> > > > > > > @@ -133,6 +134,9 @@ struct axi_dmac_sg { > > > struct axi_dmac_desc { > > > struct virt_dma_desc vdesc; > > > struct axi_dmac_chan *chan; > > > + struct device *dev; > > > + > > > + struct work_struct sched_work; > > > > > > bool cyclic; > > > bool cyclic_eot; > > > @@ -666,6 +670,25 @@ static void axi_dmac_issue_pending(struct dma_chan *c) > > > spin_unlock_irqrestore(&chan->vchan.lock, flags); > > > } > > > > > > +static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > > +{ > > > + struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > > + dma_addr_t hw_phys = desc->sg[0].hw_phys; > > > + > > > + dma_free_coherent(desc->dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > > + hw, hw_phys); > > > + put_device(desc->dev); > > > + kfree(desc); > > > +} > > > + > > > +static void axi_dmac_free_desc_schedule_work(struct work_struct *work) > > > +{ > > > + struct axi_dmac_desc *desc = container_of(work, struct axi_dmac_desc, > > > + sched_work); > > > + > > > + axi_dmac_free_desc(desc); > > > +} > > > + > > > static struct axi_dmac_desc * > > > axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > { > > > @@ -681,6 +704,7 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > return NULL; > > > 
desc->num_sgs = num_sgs; > > > desc->chan = chan; > > > + desc->dev = get_device(dmac->dma_dev.dev); > > > > > > hws = dma_alloc_coherent(dev, PAGE_ALIGN(num_sgs * sizeof(*hws)), > > > &hw_phys, GFP_ATOMIC); > > > @@ -703,21 +727,18 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > /* The last hardware descriptor will trigger an interrupt */ > > > desc->sg[num_sgs - 1].hw->flags = AXI_DMAC_HW_FLAG_LAST | AXI_DMAC_HW_FLAG_IRQ; > > > > > > + /* > > > + * We need to setup a work item because this IP can be used on archs > > > + * that rely on vmalloced memory for descriptors. And given that freeing > > > + * the descriptors happens in softirq context, vunmpap() will BUG(). > > > + * Hence, setup the worker so that we can queue it and free the > > > + * descriptor in threaded context. > > > + */ > > > + INIT_WORK(&desc->sched_work, axi_dmac_free_desc_schedule_work); > > > + > > > return desc; > > > } > > > > > > -static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > > -{ > > > - struct axi_dmac *dmac = chan_to_axi_dmac(desc->chan); > > > - struct device *dev = dmac->dma_dev.dev; > > > - struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > > - dma_addr_t hw_phys = desc->sg[0].hw_phys; > > > - > > > - dma_free_coherent(dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > > - hw, hw_phys); > > > - kfree(desc); > > > -} > > > - > > > static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, > > > enum dma_transfer_direction direction, dma_addr_t addr, > > > unsigned int num_periods, unsigned int period_len, > > > @@ -958,7 +979,10 @@ static void axi_dmac_free_chan_resources(struct dma_chan *c) > > > > > > static void axi_dmac_desc_free(struct virt_dma_desc *vdesc) > > > { > > > - axi_dmac_free_desc(to_axi_dmac_desc(vdesc)); > > > + struct axi_dmac_desc *desc = to_axi_dmac_desc(vdesc); > > > + > > > + /* See the comment in axi_dmac_alloc_desc() for the why! 
*/ > > > + schedule_work(&desc->sched_work); > > > } > > > > > > static bool axi_dmac_regmap_rdwr(struct device *dev, unsigned int reg) > > > > > > -- > > > 2.53.0 > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-03-31 14:16 ` Frank Li @ 2026-03-31 15:21 ` Nuno Sá 2026-04-01 16:14 ` Nuno Sá 0 siblings, 1 reply; 17+ messages in thread From: Nuno Sá @ 2026-03-31 15:21 UTC (permalink / raw) To: Frank Li Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Tue, Mar 31, 2026 at 10:16:09AM -0400, Frank Li wrote: > On Tue, Mar 31, 2026 at 09:53:45AM +0100, Nuno Sá wrote: > > On Mon, Mar 30, 2026 at 11:24:34AM -0400, Frank Li wrote: > > > On Fri, Mar 27, 2026 at 04:58:41PM +0000, Nuno Sá wrote: > > > > From: Eliza Balas <eliza.balas@analog.com> > > > > > > > > This IP core can be used in architectures (like Microblaze) where DMA > > > > descriptors are allocated with vmalloc(). > > > > > > strage, why use vmalloc()? > > > > It's just one of the paths in dma_alloc_coherent(). It should be > > architecture dependant. > > Which architectures, this may common problem, dma_alloc/free_coherent() is > quite common at other dma-engine driver. I'll double check this but I believe this was triggered on microblaze where we also use this IP. Will come back with confirmation! - Nuno Sá > > Frank > > > > > - Nuno Sá > > > > > > > > Frank > > > > > > > Hence, given that freeing the > > > > descriptors happen in softirq context, vunmpap() will BUG(). > > > > > > > > To solve the above, we setup a work item during allocation of the > > > > descriptors and schedule in softirq context. Hence, the actual freeing > > > > happens in threaded context. > > > > > > > > Also note that to account for the possible race where the struct axi_dmac > > > > object is gone between scheduling the work and actually running it, we > > > > now save and get a reference of struct device when allocating the > > > > descriptor (given that's all we need in axi_dmac_free_desc()) and > > > > release it in axi_dmac_free_desc(). 
> > > > > > > > Signed-off-by: Eliza Balas <eliza.balas@analog.com> > > > > Co-developed-by: Nuno Sá <nuno.sa@analog.com> > > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > > > --- > > > > drivers/dma/dma-axi-dmac.c | 50 ++++++++++++++++++++++++++++++++++------------ > > > > 1 file changed, 37 insertions(+), 13 deletions(-) > > > > > > > > diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c > > > > index 70d3ad7e7d37..46f1ead0c7d7 100644 > > > > --- a/drivers/dma/dma-axi-dmac.c > > > > +++ b/drivers/dma/dma-axi-dmac.c > > > > @@ -25,6 +25,7 @@ > > > > #include <linux/regmap.h> > > > > #include <linux/slab.h> > > > > #include <linux/spinlock.h> > > > > +#include <linux/workqueue.h> > > > > > > > > #include <dt-bindings/dma/axi-dmac.h> > > > > > > > > @@ -133,6 +134,9 @@ struct axi_dmac_sg { > > > > struct axi_dmac_desc { > > > > struct virt_dma_desc vdesc; > > > > struct axi_dmac_chan *chan; > > > > + struct device *dev; > > > > + > > > > + struct work_struct sched_work; > > > > > > > > bool cyclic; > > > > bool cyclic_eot; > > > > @@ -666,6 +670,25 @@ static void axi_dmac_issue_pending(struct dma_chan *c) > > > > spin_unlock_irqrestore(&chan->vchan.lock, flags); > > > > } > > > > > > > > +static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > > > +{ > > > > + struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > > > + dma_addr_t hw_phys = desc->sg[0].hw_phys; > > > > + > > > > + dma_free_coherent(desc->dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > > > + hw, hw_phys); > > > > + put_device(desc->dev); > > > > + kfree(desc); > > > > +} > > > > + > > > > +static void axi_dmac_free_desc_schedule_work(struct work_struct *work) > > > > +{ > > > > + struct axi_dmac_desc *desc = container_of(work, struct axi_dmac_desc, > > > > + sched_work); > > > > + > > > > + axi_dmac_free_desc(desc); > > > > +} > > > > + > > > > static struct axi_dmac_desc * > > > > axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > > { > > > > @@ 
-681,6 +704,7 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > > return NULL; > > > > desc->num_sgs = num_sgs; > > > > desc->chan = chan; > > > > + desc->dev = get_device(dmac->dma_dev.dev); > > > > > > > > hws = dma_alloc_coherent(dev, PAGE_ALIGN(num_sgs * sizeof(*hws)), > > > > &hw_phys, GFP_ATOMIC); > > > > @@ -703,21 +727,18 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > > /* The last hardware descriptor will trigger an interrupt */ > > > > desc->sg[num_sgs - 1].hw->flags = AXI_DMAC_HW_FLAG_LAST | AXI_DMAC_HW_FLAG_IRQ; > > > > > > > > + /* > > > > + * We need to setup a work item because this IP can be used on archs > > > > + * that rely on vmalloced memory for descriptors. And given that freeing > > > > + * the descriptors happens in softirq context, vunmpap() will BUG(). > > > > + * Hence, setup the worker so that we can queue it and free the > > > > + * descriptor in threaded context. > > > > + */ > > > > + INIT_WORK(&desc->sched_work, axi_dmac_free_desc_schedule_work); > > > > + > > > > return desc; > > > > } > > > > > > > > -static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > > > -{ > > > > - struct axi_dmac *dmac = chan_to_axi_dmac(desc->chan); > > > > - struct device *dev = dmac->dma_dev.dev; > > > > - struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > > > - dma_addr_t hw_phys = desc->sg[0].hw_phys; > > > > - > > > > - dma_free_coherent(dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > > > - hw, hw_phys); > > > > - kfree(desc); > > > > -} > > > > - > > > > static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, > > > > enum dma_transfer_direction direction, dma_addr_t addr, > > > > unsigned int num_periods, unsigned int period_len, > > > > @@ -958,7 +979,10 @@ static void axi_dmac_free_chan_resources(struct dma_chan *c) > > > > > > > > static void axi_dmac_desc_free(struct virt_dma_desc *vdesc) > > > > { > > > > - axi_dmac_free_desc(to_axi_dmac_desc(vdesc)); > 
> > > + struct axi_dmac_desc *desc = to_axi_dmac_desc(vdesc); > > > > + > > > > + /* See the comment in axi_dmac_alloc_desc() for the why! */ > > > > + schedule_work(&desc->sched_work); > > > > } > > > > > > > > static bool axi_dmac_regmap_rdwr(struct device *dev, unsigned int reg) > > > > > > > > -- > > > > 2.53.0 > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-03-31 15:21 ` Nuno Sá @ 2026-04-01 16:14 ` Nuno Sá 2026-04-01 22:27 ` Frank Li 0 siblings, 1 reply; 17+ messages in thread From: Nuno Sá @ 2026-04-01 16:14 UTC (permalink / raw) To: Frank Li Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Tue, Mar 31, 2026 at 04:21:06PM +0100, Nuno Sá wrote: > On Tue, Mar 31, 2026 at 10:16:09AM -0400, Frank Li wrote: > > On Tue, Mar 31, 2026 at 09:53:45AM +0100, Nuno Sá wrote: > > > On Mon, Mar 30, 2026 at 11:24:34AM -0400, Frank Li wrote: > > > > On Fri, Mar 27, 2026 at 04:58:41PM +0000, Nuno Sá wrote: > > > > > From: Eliza Balas <eliza.balas@analog.com> > > > > > > > > > > This IP core can be used in architectures (like Microblaze) where DMA > > > > > descriptors are allocated with vmalloc(). > > > > > > > > strange, why use vmalloc()? > > > > > > It's just one of the paths in dma_alloc_coherent(). It should be > > > architecture dependent. > > > > Which architectures? This may be a common problem; dma_alloc/free_coherent() is > > quite common in other dma-engine drivers. > > I'll double check this but I believe this was triggered on microblaze > where we also use this IP. Will come back with confirmation! > Hi Frank, I have now gotten to the bottom of the issue! The problem is that for archs like microblaze and arm64 we have DMA_DIRECT_REMAP, which means that when calling dma_alloc_coherent() in [1] we will get into the code path in [2]. Now, I did some research and we might have another solution for this that does not involve this refcount craziness plus async work. But I need to test it. FYI, what I have in mind is similar to what loongson2-apb-dma.c does, i.e. using the dma pool API. IIUC, with the pool we only actually free the memory (dma_free_coherent()) in the .terminate_all() callback (when destroying the pool), which should not happen in interrupt context, right?
[1]: https://elixir.bootlin.com/linux/v7.0-rc6/source/drivers/dma/dma-axi-dmac.c#L549 [2]: https://elixir.bootlin.com/linux/v7.0-rc6/source/kernel/dma/direct.c#L278 - Nuno Sá > - Nuno Sá > > > > Frank > > > > > > > > - Nuno Sá > > > > > > > > > > > Frank > > > > > > > > > Hence, given that freeing the > > > > > descriptors happen in softirq context, vunmpap() will BUG(). > > > > > > > > > > To solve the above, we setup a work item during allocation of the > > > > > descriptors and schedule in softirq context. Hence, the actual freeing > > > > > happens in threaded context. > > > > > > > > > > Also note that to account for the possible race where the struct axi_dmac > > > > > object is gone between scheduling the work and actually running it, we > > > > > now save and get a reference of struct device when allocating the > > > > > descriptor (given that's all we need in axi_dmac_free_desc()) and > > > > > release it in axi_dmac_free_desc(). > > > > > > > > > > Signed-off-by: Eliza Balas <eliza.balas@analog.com> > > > > > Co-developed-by: Nuno Sá <nuno.sa@analog.com> > > > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com> > > > > > --- > > > > > drivers/dma/dma-axi-dmac.c | 50 ++++++++++++++++++++++++++++++++++------------ > > > > > 1 file changed, 37 insertions(+), 13 deletions(-) > > > > > > > > > > diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c > > > > > index 70d3ad7e7d37..46f1ead0c7d7 100644 > > > > > --- a/drivers/dma/dma-axi-dmac.c > > > > > +++ b/drivers/dma/dma-axi-dmac.c > > > > > @@ -25,6 +25,7 @@ > > > > > #include <linux/regmap.h> > > > > > #include <linux/slab.h> > > > > > #include <linux/spinlock.h> > > > > > +#include <linux/workqueue.h> > > > > > > > > > > #include <dt-bindings/dma/axi-dmac.h> > > > > > > > > > > @@ -133,6 +134,9 @@ struct axi_dmac_sg { > > > > > struct axi_dmac_desc { > > > > > struct virt_dma_desc vdesc; > > > > > struct axi_dmac_chan *chan; > > > > > + struct device *dev; > > > > > + > > > > > + struct 
work_struct sched_work; > > > > > > > > > > bool cyclic; > > > > > bool cyclic_eot; > > > > > @@ -666,6 +670,25 @@ static void axi_dmac_issue_pending(struct dma_chan *c) > > > > > spin_unlock_irqrestore(&chan->vchan.lock, flags); > > > > > } > > > > > > > > > > +static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > > > > +{ > > > > > + struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > > > > + dma_addr_t hw_phys = desc->sg[0].hw_phys; > > > > > + > > > > > + dma_free_coherent(desc->dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > > > > + hw, hw_phys); > > > > > + put_device(desc->dev); > > > > > + kfree(desc); > > > > > +} > > > > > + > > > > > +static void axi_dmac_free_desc_schedule_work(struct work_struct *work) > > > > > +{ > > > > > + struct axi_dmac_desc *desc = container_of(work, struct axi_dmac_desc, > > > > > + sched_work); > > > > > + > > > > > + axi_dmac_free_desc(desc); > > > > > +} > > > > > + > > > > > static struct axi_dmac_desc * > > > > > axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > > > { > > > > > @@ -681,6 +704,7 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > > > return NULL; > > > > > desc->num_sgs = num_sgs; > > > > > desc->chan = chan; > > > > > + desc->dev = get_device(dmac->dma_dev.dev); > > > > > > > > > > hws = dma_alloc_coherent(dev, PAGE_ALIGN(num_sgs * sizeof(*hws)), > > > > > &hw_phys, GFP_ATOMIC); > > > > > @@ -703,21 +727,18 @@ axi_dmac_alloc_desc(struct axi_dmac_chan *chan, unsigned int num_sgs) > > > > > /* The last hardware descriptor will trigger an interrupt */ > > > > > desc->sg[num_sgs - 1].hw->flags = AXI_DMAC_HW_FLAG_LAST | AXI_DMAC_HW_FLAG_IRQ; > > > > > > > > > > + /* > > > > > + * We need to setup a work item because this IP can be used on archs > > > > > + * that rely on vmalloced memory for descriptors. And given that freeing > > > > > + * the descriptors happens in softirq context, vunmpap() will BUG(). 
> > > > > + * Hence, setup the worker so that we can queue it and free the > > > > > + * descriptor in threaded context. > > > > > + */ > > > > > + INIT_WORK(&desc->sched_work, axi_dmac_free_desc_schedule_work); > > > > > + > > > > > return desc; > > > > > } > > > > > > > > > > -static void axi_dmac_free_desc(struct axi_dmac_desc *desc) > > > > > -{ > > > > > - struct axi_dmac *dmac = chan_to_axi_dmac(desc->chan); > > > > > - struct device *dev = dmac->dma_dev.dev; > > > > > - struct axi_dmac_hw_desc *hw = desc->sg[0].hw; > > > > > - dma_addr_t hw_phys = desc->sg[0].hw_phys; > > > > > - > > > > > - dma_free_coherent(dev, PAGE_ALIGN(desc->num_sgs * sizeof(*hw)), > > > > > - hw, hw_phys); > > > > > - kfree(desc); > > > > > -} > > > > > - > > > > > static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, > > > > > enum dma_transfer_direction direction, dma_addr_t addr, > > > > > unsigned int num_periods, unsigned int period_len, > > > > > @@ -958,7 +979,10 @@ static void axi_dmac_free_chan_resources(struct dma_chan *c) > > > > > > > > > > static void axi_dmac_desc_free(struct virt_dma_desc *vdesc) > > > > > { > > > > > - axi_dmac_free_desc(to_axi_dmac_desc(vdesc)); > > > > > + struct axi_dmac_desc *desc = to_axi_dmac_desc(vdesc); > > > > > + > > > > > + /* See the comment in axi_dmac_alloc_desc() for the why! */ > > > > > + schedule_work(&desc->sched_work); > > > > > } > > > > > > > > > > static bool axi_dmac_regmap_rdwr(struct device *dev, unsigned int reg) > > > > > > > > > > -- > > > > > 2.53.0 > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-04-01 16:14 ` Nuno Sá @ 2026-04-01 22:27 ` Frank Li 2026-04-02 17:06 ` Nuno Sá 0 siblings, 1 reply; 17+ messages in thread From: Frank Li @ 2026-04-01 22:27 UTC (permalink / raw) To: Nuno Sá Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Wed, Apr 01, 2026 at 05:14:16PM +0100, Nuno Sá wrote: [...] > Hi Frank, > > I have now gotten to the bottom of the issue! The problem is that for archs > like microblaze and arm64 we have DMA_DIRECT_REMAP, which means that when > calling dma_alloc_coherent() in [1] we will get into the code path in > [2]. Now, I did some research and we might have another solution for this > that does not involve this refcount craziness plus async work. But I > need to test it. FYI, what I have in mind is similar to what > loongson2-apb-dma.c does, i.e. using the dma pool API.
IIUC, with > the pool we only actually free the memory (dma_free_coherent()) in the > .terminate_all() callback (when destroying the pool), which should not > happen in interrupt context, right? I think so. If your dma-engine descriptor is a linked list, I suggest using a dma pool. If it is a cyclic buffer, I suggest pre-allocating enough descriptors when requesting the channel. Frank [...]
* Re: [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors 2026-04-01 22:27 ` Frank Li @ 2026-04-02 17:06 ` Nuno Sá 0 siblings, 0 replies; 17+ messages in thread From: Nuno Sá @ 2026-04-02 17:06 UTC (permalink / raw) To: Frank Li Cc: Nuno Sá, dmaengine, linux-kernel, Lars-Peter Clausen, Vinod Koul, Frank Li, Eliza Balas On Wed, Apr 01, 2026 at 06:27:26PM -0400, Frank Li wrote: [...]
IIUC, with > > the pool we only actually free the memory (dma_free_coherent()) in the > > .terminate_all() callback (when destroying the pool), which should not > > happen in interrupt context, right? > > I think so. If your dma-engine descriptor is a linked list, I suggest using a dma > pool. If it is a cyclic buffer, I suggest pre-allocating enough descriptors when > requesting the channel. Yeah, I ran some tests and the dma pool seems to work just fine. I'll still ask a colleague to test my patch on the platform that triggered the BUG(). - Nuno Sá [...]
end of thread, other threads:[~2026-04-02 17:05 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-27 16:58 [PATCH v2 0/4] dmaengine: dma-axi-dmac: Some memory related fixes Nuno Sá via B4 Relay 2026-03-27 16:58 ` [PATCH v2 1/4] dmaengine: Fix possuible use after free Nuno Sá via B4 Relay 2026-03-30 15:15 ` Frank Li 2026-03-27 16:58 ` [PATCH v2 2/4] dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc Nuno Sá via B4 Relay 2026-03-30 15:17 ` Frank Li 2026-03-27 16:58 ` [PATCH v2 3/4] dmaengine: dma-axi-dmac: fix use-after-free on unbind Nuno Sá via B4 Relay 2026-03-30 15:22 ` Frank Li 2026-03-31 8:46 ` Nuno Sá 2026-03-31 14:20 ` Frank Li 2026-03-27 16:58 ` [PATCH v2 4/4] dmaengine: dma-axi-dmac: Defer freeing DMA descriptors Nuno Sá via B4 Relay 2026-03-30 15:24 ` Frank Li 2026-03-31 8:53 ` Nuno Sá 2026-03-31 14:16 ` Frank Li 2026-03-31 15:21 ` Nuno Sá 2026-04-01 16:14 ` Nuno Sá 2026-04-01 22:27 ` Frank Li 2026-04-02 17:06 ` Nuno Sá
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox