linux-serial.vger.kernel.org archive mirror
* [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs
@ 2024-09-11 18:46 Serge Semin
  2024-09-11 18:46 ` [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback Serge Semin
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Serge Semin @ 2024-09-11 18:46 UTC (permalink / raw)
  To: Viresh Kumar, Andy Shevchenko, Andy Shevchenko, Vinod Koul
  Cc: Serge Semin, Ilpo Järvinen, Greg Kroah-Hartman, Jiri Slaby,
	dmaengine, linux-serial, linux-kernel

The main goal of this series is to fix the DW DMAC driver so that it works
better with the serial 8250 device driver. In particular, a random system
freeze (caused by a deadlock) and an occasional "BUG: XFER bit set, but
channel not idle" error printed to the log were discovered when the DW APB
UART interface is used in conjunction with the DW DMA controller. The
problem can likely be reproduced with any 8250 device using the DW DMAC
for Tx/Rx-transfer execution. This short series contains two patches
fixing these bugs. Please see the respective patch logs for details.

Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
Changes since RFC:
- Add a new patch:
  [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
  fixing the "XFER bit set, but channel not idle" error.
- Instead of just dropping the dwc_scan_descriptors() method invocation,
  calculate the residue in the Tx-status getter.

base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Cc: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: dmaengine@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (2):
  dmaengine: dw: Prevent tx-status calling DMA-desc callback
  dmaengine: dw: Fix XFER bit set, but channel not idle error

 drivers/dma/dw/core.c | 144 ++++++++++++++++++++++--------------------
 1 file changed, 75 insertions(+), 69 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback
  2024-09-11 18:46 [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Serge Semin
@ 2024-09-11 18:46 ` Serge Semin
  2024-09-12  5:27   ` Greg Kroah-Hartman
  2024-09-11 18:46 ` [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error Serge Semin
  2024-09-16 13:01 ` [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Andy Shevchenko
  2 siblings, 1 reply; 11+ messages in thread
From: Serge Semin @ 2024-09-11 18:46 UTC (permalink / raw)
  To: Viresh Kumar, Andy Shevchenko, Andy Shevchenko, Vinod Koul,
	Maciej Sosnowski, Haavard Skinnemoen, Dan Williams
  Cc: Serge Semin, Ilpo Järvinen, Greg Kroah-Hartman, Jiri Slaby,
	dmaengine, linux-serial, linux-kernel

The dmaengine_tx_status() method implemented in the DW DMAC driver is
responsible not just for getting the DMA-transfer status, but may also
finalize the transfer by invoking the Tx-descriptor callbacks. This makes
the seemingly simple status retrieval much more complex than it appears,
leaving wider room for possible bugs.

In particular a deadlock has been discovered in the DW 8250 UART device
driver interacting with the DW DMA controller channels. Here is the
call-trace causing the deadlock:

serial8250_handle_irq()
  uart_port_lock_irqsave(port); ----------------------+
  handle_rx_dma()                                     |
    serial8250_rx_dma_flush()                         |
      __dma_rx_complete()                             |
        dmaengine_tx_status()                         |
          dwc_scan_descriptors()                      |
            dwc_complete_all()                        |
              dwc_descriptor_complete()               |
                dmaengine_desc_callback_invoke()      |
                  cb->callback(cb->callback_param);   |
                  ||                                  |
                  dma_rx_complete();                  |
                    uart_port_lock_irqsave(port); ----+ <- Deadlock!

So if the DMA engine finished working at some point before the
serial8250_rx_dma_flush() invocation and the respective tasklet hasn't
been executed yet to finalize the DMA transfer, then calling
dmaengine_tx_status() will cause the DMA-descriptor status update and the
Tx-descriptor callback invocation.

Generalizing the case: if the dmaengine_tx_status() caller and the
Tx-descriptor callback share a critical section, then calling
dmaengine_tx_status() from the Tx-descriptor callback will inevitably
cause a deadlock on the guarding lock, as happens in the serial 8250 DMA
implementation above. (Note the deadlock doesn't happen very often, but it
can eventually be triggered if the received data size is greater than the
Rx DMA-buffer size defined in the 8250_dma.c driver. In my case, reducing
the Rx DMA-buffer size increased the deadlock probability.)

Alas, there is no obvious way to prevent the deadlock by fixing the
8250-port drivers, because the UART-port lock must be held for the entire
port IRQ-handling procedure. Thus the best way to fix the discovered
problem (and prevent similar ones in drivers using the DW DMAC device
channels) is to simplify the DMA-transfer status getter: remove the
Tx-descriptor state update from it so that the function serves just one
purpose - calculating the DMA-transfer residue and returning the transfer
status. The DMA-transfer state update will be performed in the bottom-half
procedure only.

Fixes: 3bfb1d20b547 ("dmaengine: Driver for the Synopsys DesignWare DMA controller")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

---

Changes since RFC:
- Instead of just dropping the dwc_scan_descriptors() method invocation,
  calculate the residue in the Tx-status getter.
---
 drivers/dma/dw/core.c | 90 ++++++++++++++++++++++++-------------------
 1 file changed, 50 insertions(+), 40 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index dd75f97a33b3..af1871646eb9 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -39,6 +39,8 @@
 	BIT(DMA_SLAVE_BUSWIDTH_2_BYTES)		| \
 	BIT(DMA_SLAVE_BUSWIDTH_4_BYTES)
 
+static u32 dwc_get_hard_llp_desc_residue(struct dw_dma_chan *dwc, struct dw_desc *desc);
+
 /*----------------------------------------------------------------------*/
 
 static struct device *chan2dev(struct dma_chan *chan)
@@ -297,14 +299,12 @@ static inline u32 dwc_get_sent(struct dw_dma_chan *dwc)
 
 static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc)
 {
-	dma_addr_t llp;
 	struct dw_desc *desc, *_desc;
 	struct dw_desc *child;
 	u32 status_xfer;
 	unsigned long flags;
 
 	spin_lock_irqsave(&dwc->lock, flags);
-	llp = channel_readl(dwc, LLP);
 	status_xfer = dma_readl(dw, RAW.XFER);
 
 	if (status_xfer & dwc->mask) {
@@ -358,41 +358,16 @@ static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc)
 		return;
 	}
 
-	dev_vdbg(chan2dev(&dwc->chan), "%s: llp=%pad\n", __func__, &llp);
+	dev_vdbg(chan2dev(&dwc->chan), "%s: hard LLP mode\n", __func__);
 
 	list_for_each_entry_safe(desc, _desc, &dwc->active_list, desc_node) {
-		/* Initial residue value */
-		desc->residue = desc->total_len;
-
-		/* Check first descriptors addr */
-		if (desc->txd.phys == DWC_LLP_LOC(llp)) {
-			spin_unlock_irqrestore(&dwc->lock, flags);
-			return;
-		}
-
-		/* Check first descriptors llp */
-		if (lli_read(desc, llp) == llp) {
-			/* This one is currently in progress */
-			desc->residue -= dwc_get_sent(dwc);
+		desc->residue = dwc_get_hard_llp_desc_residue(dwc, desc);
+		if (desc->residue) {
 			spin_unlock_irqrestore(&dwc->lock, flags);
 			return;
 		}
 
-		desc->residue -= desc->len;
-		list_for_each_entry(child, &desc->tx_list, desc_node) {
-			if (lli_read(child, llp) == llp) {
-				/* Currently in progress */
-				desc->residue -= dwc_get_sent(dwc);
-				spin_unlock_irqrestore(&dwc->lock, flags);
-				return;
-			}
-			desc->residue -= child->len;
-		}
-
-		/*
-		 * No descriptors so far seem to be in progress, i.e.
-		 * this one must be done.
-		 */
+		/* No data left to be sent. Finalize the transfer then */
 		spin_unlock_irqrestore(&dwc->lock, flags);
 		dwc_descriptor_complete(dwc, desc, true);
 		spin_lock_irqsave(&dwc->lock, flags);
@@ -976,6 +951,45 @@ static struct dw_desc *dwc_find_desc(struct dw_dma_chan *dwc, dma_cookie_t c)
 	return NULL;
 }
 
+static u32 dwc_get_soft_llp_desc_residue(struct dw_dma_chan *dwc, struct dw_desc *desc)
+{
+	u32 residue = desc->residue;
+
+	if (residue)
+		residue -= dwc_get_sent(dwc);
+
+	return residue;
+}
+
+static u32 dwc_get_hard_llp_desc_residue(struct dw_dma_chan *dwc, struct dw_desc *desc)
+{
+	u32 residue = desc->total_len;
+	struct dw_desc *child;
+	dma_addr_t llp;
+
+	llp = channel_readl(dwc, LLP);
+
+	/* Check if the first descriptor is pending to be fetched by the DMAC */
+	if (desc->txd.phys == DWC_LLP_LOC(llp))
+		return residue;
+
+	/* Check first descriptor LLP to see if it's currently in-progress */
+	if (lli_read(desc, llp) == llp)
+		return residue - dwc_get_sent(dwc);
+
+	/* Check subordinate LLPs to find the currently in-progress desc */
+	residue -= desc->len;
+	list_for_each_entry(child, &desc->tx_list, desc_node) {
+		if (lli_read(child, llp) == llp)
+			return residue - dwc_get_sent(dwc);
+
+		residue -= child->len;
+	}
+
+	/* Returns zero if no in-progress descriptor was found */
+	return residue;
+}
+
 static u32 dwc_get_residue_and_status(struct dw_dma_chan *dwc, dma_cookie_t cookie,
 				      enum dma_status *status)
 {
@@ -988,9 +1002,11 @@ static u32 dwc_get_residue_and_status(struct dw_dma_chan *dwc, dma_cookie_t cook
 	desc = dwc_find_desc(dwc, cookie);
 	if (desc) {
 		if (desc == dwc_first_active(dwc)) {
-			residue = desc->residue;
-			if (test_bit(DW_DMA_IS_SOFT_LLP, &dwc->flags) && residue)
-				residue -= dwc_get_sent(dwc);
+			if (test_bit(DW_DMA_IS_SOFT_LLP, &dwc->flags))
+				residue = dwc_get_soft_llp_desc_residue(dwc, desc);
+			else
+				residue = dwc_get_hard_llp_desc_residue(dwc, desc);
+
 			if (test_bit(DW_DMA_IS_PAUSED, &dwc->flags))
 				*status = DMA_PAUSED;
 		} else {
@@ -1012,12 +1028,6 @@ dwc_tx_status(struct dma_chan *chan,
 	struct dw_dma_chan	*dwc = to_dw_dma_chan(chan);
 	enum dma_status		ret;
 
-	ret = dma_cookie_status(chan, cookie, txstate);
-	if (ret == DMA_COMPLETE)
-		return ret;
-
-	dwc_scan_descriptors(to_dw_dma(chan->device), dwc);
-
 	ret = dma_cookie_status(chan, cookie, txstate);
 	if (ret == DMA_COMPLETE)
 		return ret;
-- 
2.43.0



* [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
  2024-09-11 18:46 [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Serge Semin
  2024-09-11 18:46 ` [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback Serge Semin
@ 2024-09-11 18:46 ` Serge Semin
  2024-09-12  5:27   ` Greg Kroah-Hartman
  2024-09-16 13:01 ` [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Andy Shevchenko
  2 siblings, 1 reply; 11+ messages in thread
From: Serge Semin @ 2024-09-11 18:46 UTC (permalink / raw)
  To: Viresh Kumar, Andy Shevchenko, Andy Shevchenko, Vinod Koul,
	Maciej Sosnowski, Haavard Skinnemoen, Dan Williams
  Cc: Serge Semin, Ilpo Järvinen, Greg Kroah-Hartman, Jiri Slaby,
	dmaengine, linux-serial, linux-kernel

If a client driver exercises the DW DMAC engine device harder than usual,
with occasional DMA-transfer termination and restart, then the following
error can be randomly spotted in the system log:

> dma dma0chan0: BUG: XFER bit set, but channel not idle!

For instance, that happens when the 8250 UART port driver handles
looped-back high-speed traffic (in my case > 1.5 Mbaud) by means of the
DMA-engine interface.

The error happens due to the two-staged nature of the DW DMAC IRQ-handling
procedure and the critical-section break in between. In particular, if the
DMA transfer is terminated and restarted either:
1. after the IRQ handler submitted the tasklet, but before the tasklet
   started handling the DMA descriptors in dwc_scan_descriptors(); or
2. after the XFER completion flag was detected in the
   dwc_scan_descriptors() method, but before the dwc_complete_all() method
   is called,
then the error denoted above is printed due to the overlap of the last
transfer's completion and the new transfer's execution stages.

Two places need to be altered in order to fix the problem:
1. Clear the IRQs in the dwc_chan_disable() method. That prevents the
   dwc_scan_descriptors() method call in case the DMA transfer is
   restarted in the middle of the two-staged IRQ-handling procedure.
2. Move the dwc_complete_all() code so it is executed inseparably (in the
   same atomic section) from the DMA-descriptor scanning procedure. That
   prevents DMA-transfer restarts after the transfer completion was
   spotted but before the actual completion is executed.

Fixes: 69cea5a00d31 ("dmaengine/dw_dmac: Replace spin_lock* with irqsave variants and enable submission from callback")
Fixes: 3bfb1d20b547 ("dmaengine: Driver for the Synopsys DesignWare DMA controller")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 drivers/dma/dw/core.c | 54 ++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index af1871646eb9..fbc46cbfe259 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -143,6 +143,12 @@ static inline void dwc_chan_disable(struct dw_dma *dw, struct dw_dma_chan *dwc)
 	channel_clear_bit(dw, CH_EN, dwc->mask);
 	while (dma_readl(dw, CH_EN) & dwc->mask)
 		cpu_relax();
+
+	dma_writel(dw, CLEAR.XFER, dwc->mask);
+	dma_writel(dw, CLEAR.BLOCK, dwc->mask);
+	dma_writel(dw, CLEAR.SRC_TRAN, dwc->mask);
+	dma_writel(dw, CLEAR.DST_TRAN, dwc->mask);
+	dma_writel(dw, CLEAR.ERROR, dwc->mask);
 }
 
 /*----------------------------------------------------------------------*/
@@ -259,34 +265,6 @@ dwc_descriptor_complete(struct dw_dma_chan *dwc, struct dw_desc *desc,
 	dmaengine_desc_callback_invoke(&cb, NULL);
 }
 
-static void dwc_complete_all(struct dw_dma *dw, struct dw_dma_chan *dwc)
-{
-	struct dw_desc *desc, *_desc;
-	LIST_HEAD(list);
-	unsigned long flags;
-
-	spin_lock_irqsave(&dwc->lock, flags);
-	if (dma_readl(dw, CH_EN) & dwc->mask) {
-		dev_err(chan2dev(&dwc->chan),
-			"BUG: XFER bit set, but channel not idle!\n");
-
-		/* Try to continue after resetting the channel... */
-		dwc_chan_disable(dw, dwc);
-	}
-
-	/*
-	 * Submit queued descriptors ASAP, i.e. before we go through
-	 * the completed ones.
-	 */
-	list_splice_init(&dwc->active_list, &list);
-	dwc_dostart_first_queued(dwc);
-
-	spin_unlock_irqrestore(&dwc->lock, flags);
-
-	list_for_each_entry_safe(desc, _desc, &list, desc_node)
-		dwc_descriptor_complete(dwc, desc, true);
-}
-
 /* Returns how many bytes were already received from source */
 static inline u32 dwc_get_sent(struct dw_dma_chan *dwc)
 {
@@ -303,6 +281,7 @@ static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc)
 	struct dw_desc *child;
 	u32 status_xfer;
 	unsigned long flags;
+	LIST_HEAD(list);
 
 	spin_lock_irqsave(&dwc->lock, flags);
 	status_xfer = dma_readl(dw, RAW.XFER);
@@ -341,9 +320,26 @@ static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc)
 			clear_bit(DW_DMA_IS_SOFT_LLP, &dwc->flags);
 		}
 
+		/*
+		 * No more active descriptors left to handle. So submit the
+		 * queued descriptors and finish up the already handled ones.
+		 */
+		if (dma_readl(dw, CH_EN) & dwc->mask) {
+			dev_err(chan2dev(&dwc->chan),
+				"BUG: XFER bit set, but channel not idle!\n");
+
+			/* Try to continue after resetting the channel... */
+			dwc_chan_disable(dw, dwc);
+		}
+
+		list_splice_init(&dwc->active_list, &list);
+		dwc_dostart_first_queued(dwc);
+
 		spin_unlock_irqrestore(&dwc->lock, flags);
 
-		dwc_complete_all(dw, dwc);
+		list_for_each_entry_safe(desc, _desc, &list, desc_node)
+			dwc_descriptor_complete(dwc, desc, true);
+
 		return;
 	}
 
-- 
2.43.0



* Re: [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
  2024-09-11 18:46 ` [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error Serge Semin
@ 2024-09-12  5:27   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 11+ messages in thread
From: Greg Kroah-Hartman @ 2024-09-12  5:27 UTC (permalink / raw)
  To: Serge Semin
  Cc: Viresh Kumar, Andy Shevchenko, Andy Shevchenko, Vinod Koul,
	Maciej Sosnowski, Haavard Skinnemoen, Dan Williams,
	Ilpo Järvinen, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

On Wed, Sep 11, 2024 at 09:46:10PM +0300, Serge Semin wrote:

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases.  To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot


* Re: [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback
  2024-09-11 18:46 ` [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback Serge Semin
@ 2024-09-12  5:27   ` Greg Kroah-Hartman
  2024-09-13  9:25     ` Serge Semin
  0 siblings, 1 reply; 11+ messages in thread
From: Greg Kroah-Hartman @ 2024-09-12  5:27 UTC (permalink / raw)
  To: Serge Semin
  Cc: Viresh Kumar, Andy Shevchenko, Andy Shevchenko, Vinod Koul,
	Maciej Sosnowski, Haavard Skinnemoen, Dan Williams,
	Ilpo Järvinen, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

On Wed, Sep 11, 2024 at 09:46:09PM +0300, Serge Semin wrote:

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases.  To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback
  2024-09-12  5:27   ` Greg Kroah-Hartman
@ 2024-09-13  9:25     ` Serge Semin
  0 siblings, 0 replies; 11+ messages in thread
From: Serge Semin @ 2024-09-13  9:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Viresh Kumar, Andy Shevchenko, Andy Shevchenko, Vinod Koul,
	Maciej Sosnowski, Haavard Skinnemoen, Dan Williams,
	Ilpo Järvinen, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

Hi Greg

On Thu, Sep 12, 2024 at 07:27:22AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Sep 11, 2024 at 09:46:09PM +0300, Serge Semin wrote:
> > The dmaengine_tx_status() method implemented in the DW DMAC driver is
> > responsible not just for getting the DMA-transfer status, but may also
> > cause the transfer finalization with the Tx-descriptor callback
> > invocation. This makes simple DMA-transfer status retrieval much more
> > complex than it seems, with wider room for possible bugs.
> > 
> > In particular a deadlock has been discovered in the DW 8250 UART device
> > driver interacting with the DW DMA controller channels. Here is the
> > call-trace causing the deadlock:
> > 
> > serial8250_handle_irq()
> >   uart_port_lock_irqsave(port); ----------------------+
> >   handle_rx_dma()                                     |
> >     serial8250_rx_dma_flush()                         |
> >       __dma_rx_complete()                             |
> >         dmaengine_tx_status()                         |
> >           dwc_scan_descriptors()                      |
> >             dwc_complete_all()                        |
> >               dwc_descriptor_complete()               |
> >                 dmaengine_desc_callback_invoke()      |
> >                   cb->callback(cb->callback_param);   |
> >                   ||                                  |
> >                   dma_rx_complete();                  |
> >                     uart_port_lock_irqsave(port); ----+ <- Deadlock!
> > 
> > So if the DMA-engine finished working at some point before the
> > serial8250_rx_dma_flush() invocation and the respective tasklet hasn't
> > been executed yet to finalize the DMA transfer, then calling
> > dmaengine_tx_status() will cause the DMA-descriptors status update and the
> > Tx-descriptor callback invocation.
> > 
> > Generalizing the case: if the dmaengine_tx_status() caller and the
> > Tx-descriptor callback share the same critical section, then calling
> > dmaengine_tx_status() from the Tx-descriptor callback will inevitably
> > cause a deadlock around the guarding lock, as happens in the Serial
> > 8250 DMA implementation above. (Note the deadlock doesn't happen very
> > often, but can eventually be triggered if the received data size is
> > greater than the Rx DMA-buffer size defined in the 8250_dma.c driver.
> > In my case reducing the Rx DMA-buffer size increased the deadlock
> > probability.)
> > 
> > Alas there is no obvious way to prevent the deadlock by fixing the
> > 8250-port drivers because the UART-port lock must be held for the entire
> > port IRQ handling procedure. Thus the best way to fix the discovered
> > problem (and prevent similar ones in the drivers using the DW DMAC device
> > channels) is to simplify the DMA-transfer status getter by removing the
> > Tx-descriptors state update from there and making the function serve
> > just one purpose - calculate the DMA-transfer residue and return the
> > transfer status. The DMA-transfer status update will be performed in the
> > bottom-half procedure only.
> > 
> > Fixes: 3bfb1d20b547 ("dmaengine: Driver for the Synopsys DesignWare DMA controller")
> > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > 
> > ---
> > 
> > Changelog RFC:
> > - Instead of just dropping the dwc_scan_descriptors() method invocation
> >   calculate the residue in the Tx-status getter.
> > ---
> >  drivers/dma/dw/core.c | 90 ++++++++++++++++++++++++-------------------
> >  1 file changed, 50 insertions(+), 40 deletions(-)
> > 
> > diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> > index dd75f97a33b3..af1871646eb9 100644
> > --- a/drivers/dma/dw/core.c
> > +++ b/drivers/dma/dw/core.c
> > @@ -39,6 +39,8 @@
> >  	BIT(DMA_SLAVE_BUSWIDTH_2_BYTES)		| \
> >  	BIT(DMA_SLAVE_BUSWIDTH_4_BYTES)
> >  
> > +static u32 dwc_get_hard_llp_desc_residue(struct dw_dma_chan *dwc, struct dw_desc *desc);
> > +
> >  /*----------------------------------------------------------------------*/
> >  
> >  static struct device *chan2dev(struct dma_chan *chan)
> > @@ -297,14 +299,12 @@ static inline u32 dwc_get_sent(struct dw_dma_chan *dwc)
> >  
> >  static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc)
> >  {
> > -	dma_addr_t llp;
> >  	struct dw_desc *desc, *_desc;
> >  	struct dw_desc *child;
> >  	u32 status_xfer;
> >  	unsigned long flags;
> >  
> >  	spin_lock_irqsave(&dwc->lock, flags);
> > -	llp = channel_readl(dwc, LLP);
> >  	status_xfer = dma_readl(dw, RAW.XFER);
> >  
> >  	if (status_xfer & dwc->mask) {
> > @@ -358,41 +358,16 @@ static void dwc_scan_descriptors(struct dw_dma *dw, struct dw_dma_chan *dwc)
> >  		return;
> >  	}
> >  
> > -	dev_vdbg(chan2dev(&dwc->chan), "%s: llp=%pad\n", __func__, &llp);
> > +	dev_vdbg(chan2dev(&dwc->chan), "%s: hard LLP mode\n", __func__);
> >  
> >  	list_for_each_entry_safe(desc, _desc, &dwc->active_list, desc_node) {
> > -		/* Initial residue value */
> > -		desc->residue = desc->total_len;
> > -
> > -		/* Check first descriptors addr */
> > -		if (desc->txd.phys == DWC_LLP_LOC(llp)) {
> > -			spin_unlock_irqrestore(&dwc->lock, flags);
> > -			return;
> > -		}
> > -
> > -		/* Check first descriptors llp */
> > -		if (lli_read(desc, llp) == llp) {
> > -			/* This one is currently in progress */
> > -			desc->residue -= dwc_get_sent(dwc);
> > +		desc->residue = dwc_get_hard_llp_desc_residue(dwc, desc);
> > +		if (desc->residue) {
> >  			spin_unlock_irqrestore(&dwc->lock, flags);
> >  			return;
> >  		}
> >  
> > -		desc->residue -= desc->len;
> > -		list_for_each_entry(child, &desc->tx_list, desc_node) {
> > -			if (lli_read(child, llp) == llp) {
> > -				/* Currently in progress */
> > -				desc->residue -= dwc_get_sent(dwc);
> > -				spin_unlock_irqrestore(&dwc->lock, flags);
> > -				return;
> > -			}
> > -			desc->residue -= child->len;
> > -		}
> > -
> > -		/*
> > -		 * No descriptors so far seem to be in progress, i.e.
> > -		 * this one must be done.
> > -		 */
> > +		/* No data left to be sent. Finalize the transfer then */
> >  		spin_unlock_irqrestore(&dwc->lock, flags);
> >  		dwc_descriptor_complete(dwc, desc, true);
> >  		spin_lock_irqsave(&dwc->lock, flags);
> > @@ -976,6 +951,45 @@ static struct dw_desc *dwc_find_desc(struct dw_dma_chan *dwc, dma_cookie_t c)
> >  	return NULL;
> >  }
> >  
> > +static u32 dwc_get_soft_llp_desc_residue(struct dw_dma_chan *dwc, struct dw_desc *desc)
> > +{
> > +	u32 residue = desc->residue;
> > +
> > +	if (residue)
> > +		residue -= dwc_get_sent(dwc);
> > +
> > +	return residue;
> > +}
> > +
> > +static u32 dwc_get_hard_llp_desc_residue(struct dw_dma_chan *dwc, struct dw_desc *desc)
> > +{
> > +	u32 residue = desc->total_len;
> > +	struct dw_desc *child;
> > +	dma_addr_t llp;
> > +
> > +	llp = channel_readl(dwc, LLP);
> > +
> > +	/* Check whether the first descriptor is pending to be fetched by the DMAC */
> > +	if (desc->txd.phys == DWC_LLP_LOC(llp))
> > +		return residue;
> > +
> > +	/* Check first descriptor LLP to see if it's currently in-progress */
> > +	if (lli_read(desc, llp) == llp)
> > +		return residue - dwc_get_sent(dwc);
> > +
> > +	/* Check subordinate LLPs to find the currently in-progress desc */
> > +	residue -= desc->len;
> > +	list_for_each_entry(child, &desc->tx_list, desc_node) {
> > +		if (lli_read(child, llp) == llp)
> > +			return residue - dwc_get_sent(dwc);
> > +
> > +		residue -= child->len;
> > +	}
> > +
> > +	/* Return zero if no in-progress descriptor was found */
> > +	return residue;
> > +}
> > +
> >  static u32 dwc_get_residue_and_status(struct dw_dma_chan *dwc, dma_cookie_t cookie,
> >  				      enum dma_status *status)
> >  {
> > @@ -988,9 +1002,11 @@ static u32 dwc_get_residue_and_status(struct dw_dma_chan *dwc, dma_cookie_t cook
> >  	desc = dwc_find_desc(dwc, cookie);
> >  	if (desc) {
> >  		if (desc == dwc_first_active(dwc)) {
> > -			residue = desc->residue;
> > -			if (test_bit(DW_DMA_IS_SOFT_LLP, &dwc->flags) && residue)
> > -				residue -= dwc_get_sent(dwc);
> > +			if (test_bit(DW_DMA_IS_SOFT_LLP, &dwc->flags))
> > +				residue = dwc_get_soft_llp_desc_residue(dwc, desc);
> > +			else
> > +				residue = dwc_get_hard_llp_desc_residue(dwc, desc);
> > +
> >  			if (test_bit(DW_DMA_IS_PAUSED, &dwc->flags))
> >  				*status = DMA_PAUSED;
> >  		} else {
> > @@ -1012,12 +1028,6 @@ dwc_tx_status(struct dma_chan *chan,
> >  	struct dw_dma_chan	*dwc = to_dw_dma_chan(chan);
> >  	enum dma_status		ret;
> >  
> > -	ret = dma_cookie_status(chan, cookie, txstate);
> > -	if (ret == DMA_COMPLETE)
> > -		return ret;
> > -
> > -	dwc_scan_descriptors(to_dw_dma(chan->device), dwc);
> > -
> >  	ret = dma_cookie_status(chan, cookie, txstate);
> >  	if (ret == DMA_COMPLETE)
> >  		return ret;
> > -- 
> > 2.43.0
> > 
> > 
> 
> Hi,
> 
> This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
> a patch that has triggered this response.  He used to manually respond
> to these common problems, but in order to save his sanity (he kept
> writing the same thing over and over, yet to different people), I was
> created.  Hopefully you will not take offence and will fix the problem
> in your patch and resubmit it so that it can be accepted into the Linux
> kernel tree.
> 
> You are receiving this message because of the following common error(s)
> as indicated below:
> 
> - You have marked a patch with a "Fixes:" tag for a commit that is in an
>   older released kernel, yet you do not have a cc: stable line in the
>   signed-off-by area at all, which means that the patch will not be
>   applied to any older kernel releases.  To properly fix this, please
>   follow the documented rules in the
>   Documentation/process/stable-kernel-rules.rst file for how to resolve
>   this.
> 
> If you wish to discuss this problem further, or you have questions about
> how to resolve this issue, please feel free to respond to this email and
> Greg will reply once he has dug out from the pending patches received
> from other developers.

Got it. I'll wait for the maintainers to react and discuss the
problems the series fixes. Then, if required, I'll re-submit the patch
set with the stable list Cc'ed.

-Serge(y)

> 
> thanks,
> 
> greg k-h's patch email bot


* Re: [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs
  2024-09-11 18:46 [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Serge Semin
  2024-09-11 18:46 ` [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback Serge Semin
  2024-09-11 18:46 ` [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error Serge Semin
@ 2024-09-16 13:01 ` Andy Shevchenko
  2024-09-20  9:33   ` Serge Semin
  2 siblings, 1 reply; 11+ messages in thread
From: Andy Shevchenko @ 2024-09-16 13:01 UTC (permalink / raw)
  To: Serge Semin, Hans de Goede
  Cc: Viresh Kumar, Vinod Koul, Ilpo Järvinen, Greg Kroah-Hartman,
	Jiri Slaby, dmaengine, linux-serial, linux-kernel

On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> The main goal of the series is to fix the DW DMAC driver to be working
> better with the serial 8250 device driver implementation. In particular it
> was discovered that there is a random system freeze (caused by a
> deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> error printed to the log when the DW APB UART interface is used in
> conjunction with the DW DMA controller. Although I guess the problem can
> be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> execution. Anyway this short series contains two patches fixing these
> bugs. Please see the respective patches log for details.
> 
> Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> Changelog RFC:
> - Add a new patch:
>   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
>   fixing the "XFER bit set, but channel not idle" error.
> - Instead of just dropping the dwc_scan_descriptors() method invocation
>   calculate the residue in the Tx-status getter.

FWIW, this series does not regress on Intel Merrifield (SPI case),
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

P.S.
However, it might need additional tests on the DW UART based platforms.
Cc'ed Hans just in case (it might be that he can add this to his repo
for testing on Bay Trail and Cherry Trail, which may use the DW UART
for BT operations).

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs
  2024-09-16 13:01 ` [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Andy Shevchenko
@ 2024-09-20  9:33   ` Serge Semin
  2024-09-20 14:24     ` Andy Shevchenko
  0 siblings, 1 reply; 11+ messages in thread
From: Serge Semin @ 2024-09-20  9:33 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Hans de Goede, Viresh Kumar, Vinod Koul, Ilpo Järvinen,
	Greg Kroah-Hartman, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

Hi Andy

On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > The main goal of the series is to fix the DW DMAC driver to be working
> > better with the serial 8250 device driver implementation. In particular it
> > was discovered that there is a random system freeze (caused by a
> > deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> > error printed to the log when the DW APB UART interface is used in
> > conjunction with the DW DMA controller. Although I guess the problem can
> > be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> > execution. Anyway this short series contains two patches fixing these
> > bugs. Please see the respective patches log for details.
> > 
> > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > Changelog RFC:
> > - Add a new patch:
> >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> >   fixing the "XFER bit set, but channel not idle" error.
> > - Instead of just dropping the dwc_scan_descriptors() method invocation
> >   calculate the residue in the Tx-status getter.
> 

> FWIW, this series does not regress on Intel Merrifield (SPI case),
> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 

Great! Thanks.

> P.S.
> However it might need an additional tests for the DW UART based platforms.
> Cc'ed to Hans just in case (it might that he can add this to his repo for
> testing on Bay Trail and Cherry Trail that may have use of DW UART for BT
> operations).

It's not enough though. The DW UART controller must be connected to
the DW DMAC handshaking interface on the platform, and the kernel must
be properly set up for that too. Only then would the test run on a
proper target. Do the Bay Trail and Cherry Trail chips support such a
HW setup? If so, an additional test would be very welcome.

Sometime ago you said that you seemed to meet a similar issue on older
machines:
https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
If it's still possible could you please perform at least some smoke
test on those devices?

In case of my device this series and a previous one
https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
fixed all the critical issues for the DW UART + DW DMAC buddies:
1. Sudden data disappearing at the tail of the transfers (previous
patch set).
2. Random system freeze (this patch set).

There is another problem caused by very slow coherent-memory IO on
my device. Because of that, the data is copied too slowly in the
__dma_rx_complete()->tty_insert_flip_string() call. As a result, fast
incoming traffic overflows the DW UART inbound FIFO. That can be
worked around by decreasing the Rx DMA-buffer size. (Some more generic
fixes are possible, but they haven't proven to be as effective as the
buffer size reduction.)

-Serge(y)

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 
> 


* Re: [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs
  2024-09-20  9:33   ` Serge Semin
@ 2024-09-20 14:24     ` Andy Shevchenko
  2024-09-20 14:56       ` Serge Semin
  0 siblings, 1 reply; 11+ messages in thread
From: Andy Shevchenko @ 2024-09-20 14:24 UTC (permalink / raw)
  To: Serge Semin
  Cc: Hans de Goede, Viresh Kumar, Vinod Koul, Ilpo Järvinen,
	Greg Kroah-Hartman, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

On Fri, Sep 20, 2024 at 12:33:51PM +0300, Serge Semin wrote:
> On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> > On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > > The main goal of the series is to fix the DW DMAC driver to be working
> > > better with the serial 8250 device driver implementation. In particular it
> > > was discovered that there is a random system freeze (caused by a
> > > deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> > > error printed to the log when the DW APB UART interface is used in
> > > conjunction with the DW DMA controller. Although I guess the problem can
> > > be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> > > execution. Anyway this short series contains two patches fixing these
> > > bugs. Please see the respective patches log for details.
> > > 
> > > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > > Changelog RFC:
> > > - Add a new patch:
> > >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> > >   fixing the "XFER bit set, but channel not idle" error.
> > > - Instead of just dropping the dwc_scan_descriptors() method invocation
> > >   calculate the residue in the Tx-status getter.
> 
> > FWIW, this series does not regress on Intel Merrifield (SPI case),
> > Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 
> Great! Thanks.
> 
> > P.S.
> > However it might need an additional tests for the DW UART based platforms.
> > Cc'ed to Hans just in case (it might that he can add this to his repo for
> > testing on Bay Trail and Cherry Trail that may have use of DW UART for BT
> > operations).
> 
> It's not enough though. The DW UART controller must be connected to
> the DW DMAC handshaking interface on the platform. The kernel must be
> properly setup for that too. In that case the test would be done on
> a proper target. Do the Bay Trail and Cherry Trail chips support such
> HW-setup? If so the additional test would be very welcome.

I'm not sure I understand what HW setup you mean.

Bay Trail and Cherry Trail use a shared DW DMA controller with a number
of peripheral devices; the HS UART (also DW) is one of them.

> Sometime ago you said that you seemed to meet a similar issue on older
> machines:
> https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
> If it's still possible could you please perform at least some smoke
> test on those devices?

That was mainly exactly about the Bay Trail and Cherry Trail machines
(and maybe Broadwell and Haswell, but the latter two are not so
 widespread nowadays).

> In case of my device this series and a previous one
> https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
> fixed all the critical issues for the DW UART + DW DMAC buddies:
> 1. Sudden data disappearing at the tail of the transfers (previous
> patch set).
> 2. Random system freeze (this patch set).
> 
> There is another problem caused by the too slow coherent memory IO on
> my device. Due to that the data gets to be copied too slow in the
> __dma_rx_complete()->tty_insert_flip_string() call. As a result a fast
> incoming traffic overflows the DW UART inbound FIFO. But that can be
> worked around by decreasing the Rx DMA-buffer size. (There are some
> more generic fixes possible, but they haven't shown to be as effective
> as the buffer size reduction.)

This sounds like a specific quirk for a specific platform. In case you
are going to address that, make sure it does not become generic.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs
  2024-09-20 14:24     ` Andy Shevchenko
@ 2024-09-20 14:56       ` Serge Semin
  2024-09-20 15:04         ` Andy Shevchenko
  0 siblings, 1 reply; 11+ messages in thread
From: Serge Semin @ 2024-09-20 14:56 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Hans de Goede, Viresh Kumar, Vinod Koul, Ilpo Järvinen,
	Greg Kroah-Hartman, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

On Fri, Sep 20, 2024 at 05:24:37PM +0300, Andy Shevchenko wrote:
> On Fri, Sep 20, 2024 at 12:33:51PM +0300, Serge Semin wrote:
> > On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> > > On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > > > The main goal of the series is to fix the DW DMAC driver to be working
> > > > better with the serial 8250 device driver implementation. In particular it
> > > > was discovered that there is a random system freeze (caused by a
> > > > deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> > > > error printed to the log when the DW APB UART interface is used in
> > > > conjunction with the DW DMA controller. Although I guess the problem can
> > > > be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> > > > execution. Anyway this short series contains two patches fixing these
> > > > bugs. Please see the respective patches log for details.
> > > > 
> > > > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > > > Changelog RFC:
> > > > - Add a new patch:
> > > >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> > > >   fixing the "XFER bit set, but channel not idle" error.
> > > > - Instead of just dropping the dwc_scan_descriptors() method invocation
> > > >   calculate the residue in the Tx-status getter.
> > 
> > > FWIW, this series does not regress on Intel Merrifield (SPI case),
> > > Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > 
> > Great! Thanks.
> > 
> > > P.S.
> > > However it might need an additional tests for the DW UART based platforms.
> > > Cc'ed to Hans just in case (it might that he can add this to his repo for
> > > testing on Bay Trail and Cherry Trail that may have use of DW UART for BT
> > > operations).
> > 
> > It's not enough though. The DW UART controller must be connected to
> > the DW DMAC handshaking interface on the platform. The kernel must be
> > properly setup for that too. In that case the test would be done on
> > a proper target. Do the Bay Trail and Cherry Trail chips support such
> > HW-setup? If so the additional test would be very welcome.
> 

> I'm not sure I understand what HW setup you mean.

I meant exactly what you explained in the next sentence - whether the
Bay Trail and Cherry Trail have a DW UART capable of working with the
DW DMAC.

> 
> Bay Trail and Cherry Trail uses a shared DW DMA controller with number of
> peripheral devices, HS UART (also DW) is one of them.

Ok. Thanks. Testing the patch set on these platforms makes sense then,
but of course with the kernel configured so that the DW UART device
handles the in-/outbound traffic via the DW DMA controller.
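
For reference, on DT-based platforms that wiring is usually described
with "dmas"/"dma-names" properties in the UART node. A made-up sketch
(address, channel and request-line numbers are placeholders, not taken
from any real platform):

```
uart1: serial@1e100000 {
	compatible = "snps,dw-apb-uart";
	reg = <0x1e100000 0x100>;
	interrupts = <5>;
	reg-shift = <2>;
	reg-io-width = <4>;
	/* Tx and Rx handshake channels of the DW DMAC (placeholder IDs) */
	dmas = <&dmac 1>, <&dmac 2>;
	dma-names = "tx", "rx";
};
```

Without the "dmas" properties (or the equivalent ACPI/platform data)
the 8250 driver falls back to PIO and the DMA paths under test are
never exercised.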

> 
> > Sometime ago you said that you seemed to meet a similar issue on older
> > machines:
> > https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
> > If it's still possible could you please perform at least some smoke
> > test on those devices?
> 
> That mainly was exactly about Bay Trail and Cherry Trail machines
> (and may be Broadwell and Haswell, but the latter two is not so
>  distributed nowadays).
> 
> > In case of my device this series and a previous one
> > https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
> > fixed all the critical issues for the DW UART + DW DMAC buddies:
> > 1. Sudden data disappearing at the tail of the transfers (previous
> > patch set).
> > 2. Random system freeze (this patch set).
> > 
> > There is another problem caused by the too slow coherent memory IO on
> > my device. Due to that the data gets to be copied too slow in the
> > __dma_rx_complete()->tty_insert_flip_string() call. As a result a fast
> > incoming traffic overflows the DW UART inbound FIFO. But that can be
> > worked around by decreasing the Rx DMA-buffer size. (There are some
> > more generic fixes possible, but they haven't shown to be as effective
> > as the buffer size reduction.)
> 

> This sounds like a specific quirk for a specific platform. In case you
> are going to address that make sure it does not come to be generic.

Of course reducing the buffer size is a platform-specific quirk.

A more generic fix could be to allocate the DMA buffer from
DMA-noncoherent memory _if_ the DMA performed by the DW DMAC device is
non-coherent anyway. The DMA-coherent buffer is normally allocated
from the non-cacheable memory pool, access to which is very slow even
on Intel/AMD devices. So using a cacheable buffer for DMA, then
manually invalidating the cache for it before the DMA IOs and
prefetching the data afterwards, seemed like a more universal
solution. But my tests showed that this approach doesn't fully solve
the problem on my device: it permitted data-safe UART transfers at up
to 460 kbit/s, while simply reducing the buffer from 16K to 512 bytes
allowed up to 2.0 Mbaud. That's still not enough, since the device is
capable of working at 3 Mbit/s, but it's better than 460 kbaud.
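
A rough sketch of that cacheable-buffer approach (kernel-style
pseudocode, not from an actual patch; error handling is omitted and
whether 8250_dma.c can be reworked exactly this way is an assumption):

```
/* Allocate a cacheable, non-coherent Rx buffer instead of using
 * dma_alloc_coherent().
 */
rx_buf = dma_alloc_noncoherent(dev, rx_size, &rx_addr,
			       DMA_FROM_DEVICE, GFP_KERNEL);

/* Before starting the Rx DMA: hand the buffer over to the device */
dma_sync_single_for_device(dev, rx_addr, rx_size, DMA_FROM_DEVICE);
/* ...submit and issue the Rx DMA descriptor here... */

/* On completion: invalidate the cache so the CPU sees the DMA'd data,
 * then copy it out through the (now cacheable, hence faster) buffer.
 */
dma_sync_single_for_cpu(dev, rx_addr, rx_size, DMA_FROM_DEVICE);
tty_insert_flip_string(tty_port, rx_buf, count);
```

The win comes from the tty_insert_flip_string() copy reading cached
memory; the cost is the explicit sync calls around every transfer.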

-Serge(y)

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs
  2024-09-20 14:56       ` Serge Semin
@ 2024-09-20 15:04         ` Andy Shevchenko
  0 siblings, 0 replies; 11+ messages in thread
From: Andy Shevchenko @ 2024-09-20 15:04 UTC (permalink / raw)
  To: Serge Semin
  Cc: Hans de Goede, Viresh Kumar, Vinod Koul, Ilpo Järvinen,
	Greg Kroah-Hartman, Jiri Slaby, dmaengine, linux-serial,
	linux-kernel

On Fri, Sep 20, 2024 at 05:56:23PM +0300, Serge Semin wrote:
> On Fri, Sep 20, 2024 at 05:24:37PM +0300, Andy Shevchenko wrote:
> > On Fri, Sep 20, 2024 at 12:33:51PM +0300, Serge Semin wrote:
> > > On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:

...

> > > There is another problem caused by the too slow coherent memory IO on
> > > my device. Due to that the data gets to be copied too slow in the
> > > __dma_rx_complete()->tty_insert_flip_string() call. As a result a fast
> > > incoming traffic overflows the DW UART inbound FIFO. But that can be
> > > worked around by decreasing the Rx DMA-buffer size. (There are some
> > > more generic fixes possible, but they haven't shown to be as effective
> > > as the buffer size reduction.)
> 
> > This sounds like a specific quirk for a specific platform. In case you
> > are going to address that make sure it does not come to be generic.
> 
> Of course reducing the buffer size is the platform-specific quirk.
> 
> A more generic fix could be to convert the DMA-buffer to being
> allocated from the DMA-noncoherent memory _if_ the DMA performed by
> the DW DMA-device is non-coherent anyway. In that case the
> DMA-coherent memory buffer is normally allocated from the
> non-cacheable memory pool, access to which is very-very slow even on
> the Intel/AMD devices.  So using the cacheable buffer for DMA, then
> manually invalidating the cache for it before DMA IOs and prefetching
> the data afterwards seemed as a more universal solution. But my tests
> showed that such approach doesn't fully solve the problem on my
> device. That said that approach permitted to execute data-safe UART
> transfers for up to 460Kbit/s, meanwhile just reducing the buffer from
> 16K to 512b - for up to 2.0Mbaud/s. It's still not enough since the
> device is capable to work on the speed 3Mbit/s, but it's better than
> 460Kbaud/s.

Ah, interesting issue.  Good luck with solving it the best way you can.
And yes, you're right that 2M support is better than 0.5M.

-- 
With Best Regards,
Andy Shevchenko




end of thread, other threads:[~2024-09-20 15:04 UTC | newest]

Thread overview: 11+ messages
2024-09-11 18:46 [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Serge Semin
2024-09-11 18:46 ` [PATCH 1/2] dmaengine: dw: Prevent tx-status calling DMA-desc callback Serge Semin
2024-09-12  5:27   ` Greg Kroah-Hartman
2024-09-13  9:25     ` Serge Semin
2024-09-11 18:46 ` [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error Serge Semin
2024-09-12  5:27   ` Greg Kroah-Hartman
2024-09-16 13:01 ` [PATCH 0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs Andy Shevchenko
2024-09-20  9:33   ` Serge Semin
2024-09-20 14:24     ` Andy Shevchenko
2024-09-20 14:56       ` Serge Semin
2024-09-20 15:04         ` Andy Shevchenko
