linux-serial.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 8250_dw DMA issue with BYT ?
@ 2014-04-14 15:54 Poulain, Loic
  2014-04-15  6:56 ` Heikki Krogerus
  0 siblings, 1 reply; 6+ messages in thread
From: Poulain, Loic @ 2014-04-14 15:54 UTC (permalink / raw)
  To: heikki.krogerus@linux.intel.com; +Cc: linux-serial@vger.kernel.org

Hello Heikki,

I contact you regarding the 8250_dw driver.
My setup is an Asus T100TA (baytrail) + kernel 3.14.
The UART controller is ACPI enumerated (ID is 80860F0A).
(UART_CAP_AFE capable)

This uart is used to communicate with a BCM4324 bluetooth chip
(HCI H4 over uart). However we have many bluetooth instabilities.
These instabilities are due to the 'hardware error' HCI code sent
by the BCM chip. This error means that the BCM received an invalid
H4 packet header or an invalid sized packet => serial sync issue.
Moreover, Increasing the baudrate increases the hardware error
frequency.

This problem seems due to the DMA usage. Indeed, when I disable
the DMA manually in 8250_dw.c (up->dma = NULL), problem does not
occur anymore, all baudrates are stable.

Do you think a specific 8250 DMA configuration could apply for the byt
case ?
How/Where can I debug to know if the 8250 DMA works as expected ?

Thanks & Regards,
Loic Poulain
---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris, 
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 8250_dw DMA issue with BYT ?
  2014-04-14 15:54 8250_dw DMA issue with BYT ? Poulain, Loic
@ 2014-04-15  6:56 ` Heikki Krogerus
  2014-04-15  7:25   ` Andy Shevchenko
  2014-04-16  1:33   ` Jin, Yao
  0 siblings, 2 replies; 6+ messages in thread
From: Heikki Krogerus @ 2014-04-15  6:56 UTC (permalink / raw)
  To: Poulain, Loic; +Cc: linux-serial@vger.kernel.org, Andy Shevchenko, Jin, Yao

On Mon, Apr 14, 2014 at 03:54:31PM +0000, Poulain, Loic wrote:
> Hello Heikki,
> 
> I contact you regarding the 8250_dw driver.
> My setup is an Asus T100TA (baytrail) + kernel 3.14.
> The UART controller is ACPI enumerated (ID is 80860F0A).
> (UART_CAP_AFE capable)
> 
> This uart is used to communicate with a BCM4324 bluetooth chip
> (HCI H4 over uart). However we have many bluetooth instabilities.
> These instabilities are due to the 'hardware error' HCI code sent
> by the BCM chip. This error means that the BCM received an invalid
> H4 packet header or an invalid sized packet => serial sync issue.
> Moreover, Increasing the baudrate increases the hardware error
> frequency.
> 
> This problem seems due to the DMA usage. Indeed, when I disable
> the DMA manually in 8250_dw.c (up->dma = NULL), problem does not
> occur anymore, all baudrates are stable.

I think this is the same issue Jin Yao (CC'd) already spotted. He made
a bug for it:
https://bugzilla.kernel.org/show_bug.cgi?id=73071

> Do you think a specific 8250 DMA configuration could apply for the byt
> case ?
> How/Where can I debug to know if the 8250 DMA works as expected ?

Adding Andy. He maintains the Designware DMA Engine driver that is
used on BYT. He should be the best person to tell how to debug the
DMA.

We are debugging an other issue where the DMA mask for the device is
somehow set wrong. This does not exactly look like the same problem,
but you newer know. Try getting the TX buffer page from the first 16M
of memory..

--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -143,7 +143,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
         */
        if (!state->xmit.buf) {
                /* This is protected by the per port mutex */
-               page = get_zeroed_page(GFP_KERNEL);
+               page = get_zeroed_page(GFP_KERNEL | GFP_DMA);
                if (!page)
                        return -ENOMEM;


If the issue disappears, we know it's the same problem.

Thanks,

-- 
heikki

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 8250_dw DMA issue with BYT ?
  2014-04-15  6:56 ` Heikki Krogerus
@ 2014-04-15  7:25   ` Andy Shevchenko
  2014-04-16  1:33   ` Jin, Yao
  1 sibling, 0 replies; 6+ messages in thread
From: Andy Shevchenko @ 2014-04-15  7:25 UTC (permalink / raw)
  To: Heikki Krogerus; +Cc: Poulain, Loic, linux-serial@vger.kernel.org, Jin, Yao

On Tue, 2014-04-15 at 09:56 +0300, Heikki Krogerus wrote:
> On Mon, Apr 14, 2014 at 03:54:31PM +0000, Poulain, Loic wrote:

...

> > This problem seems due to the DMA usage. Indeed, when I disable
> > the DMA manually in 8250_dw.c (up->dma = NULL), problem does not
> > occur anymore, all baudrates are stable.
> 
> I think this is the same issue Jin Yao (CC'd) already spotted. He made
> a bug for it:
> https://bugzilla.kernel.org/show_bug.cgi?id=73071
> 
> > Do you think a specific 8250 DMA configuration could apply for the byt
> > case ?
> > How/Where can I debug to know if the 8250 DMA works as expected ?
> 
> Adding Andy. He maintains the Designware DMA Engine driver that is
> used on BYT. He should be the best person to tell how to debug the
> DMA.

In case you have DYNAMIC_DEBUG enabled in the kernel configuration you
may apply following addon to the kernel command line (considering that
you have DW_DMAC_CORE=y and DW_DMAC=y):
dw_dmac_core.dyndbg dw_dmac.dyndbg

It enables the debug messages from the dw_dmac driver.

P.S. in case you need deep debugging I have a custom patch which enables
it for all IO communication done with the hardware (this apparently
doesn't include dumps of data transfered).

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: 8250_dw DMA issue with BYT ?
  2014-04-15  6:56 ` Heikki Krogerus
  2014-04-15  7:25   ` Andy Shevchenko
@ 2014-04-16  1:33   ` Jin, Yao
  2014-04-16 16:34     ` Poulain, Loic
  1 sibling, 1 reply; 6+ messages in thread
From: Jin, Yao @ 2014-04-16  1:33 UTC (permalink / raw)
  To: Heikki Krogerus, Poulain, Loic
  Cc: linux-serial@vger.kernel.org, Andy Shevchenko

Does this fix work? I don't have a T100 on hand. 

--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -143,7 +143,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
         */
        if (!state->xmit.buf) {
                /* This is protected by the per port mutex */
-               page = get_zeroed_page(GFP_KERNEL);
+               page = get_zeroed_page(GFP_KERNEL | GFP_DMA);
                if (!page)
                        return -ENOMEM;

-----Original Message-----
From: Heikki Krogerus [mailto:heikki.krogerus@linux.intel.com] 
Sent: Tuesday, April 15, 2014 2:56 PM
To: Poulain, Loic
Cc: linux-serial@vger.kernel.org; Andy Shevchenko; Jin, Yao
Subject: Re: 8250_dw DMA issue with BYT ?

On Mon, Apr 14, 2014 at 03:54:31PM +0000, Poulain, Loic wrote:
> Hello Heikki,
> 
> I contact you regarding the 8250_dw driver.
> My setup is an Asus T100TA (baytrail) + kernel 3.14.
> The UART controller is ACPI enumerated (ID is 80860F0A).
> (UART_CAP_AFE capable)
> 
> This uart is used to communicate with a BCM4324 bluetooth chip (HCI H4 
> over uart). However we have many bluetooth instabilities.
> These instabilities are due to the 'hardware error' HCI code sent by 
> the BCM chip. This error means that the BCM received an invalid
> H4 packet header or an invalid sized packet => serial sync issue.
> Moreover, Increasing the baudrate increases the hardware error 
> frequency.
> 
> This problem seems due to the DMA usage. Indeed, when I disable the 
> DMA manually in 8250_dw.c (up->dma = NULL), problem does not occur 
> anymore, all baudrates are stable.

I think this is the same issue Jin Yao (CC'd) already spotted. He made a bug for it:
https://bugzilla.kernel.org/show_bug.cgi?id=73071

> Do you think a specific 8250 DMA configuration could apply for the byt 
> case ?
> How/Where can I debug to know if the 8250 DMA works as expected ?

Adding Andy. He maintains the Designware DMA Engine driver that is used on BYT. He should be the best person to tell how to debug the DMA.

We are debugging an other issue where the DMA mask for the device is somehow set wrong. This does not exactly look like the same problem, but you newer know. Try getting the TX buffer page from the first 16M of memory..

--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -143,7 +143,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
         */
        if (!state->xmit.buf) {
                /* This is protected by the per port mutex */
-               page = get_zeroed_page(GFP_KERNEL);
+               page = get_zeroed_page(GFP_KERNEL | GFP_DMA);
                if (!page)
                        return -ENOMEM;


If the issue disappears, we know it's the same problem.

Thanks,

--
heikki

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: 8250_dw DMA issue with BYT ?
  2014-04-16  1:33   ` Jin, Yao
@ 2014-04-16 16:34     ` Poulain, Loic
  2014-04-17 10:21       ` Heikki Krogerus
  0 siblings, 1 reply; 6+ messages in thread
From: Poulain, Loic @ 2014-04-16 16:34 UTC (permalink / raw)
  To: Jin, Yao, Heikki Krogerus; +Cc: linux-serial@vger.kernel.org, Andy Shevchenko

Hello Heikki, Jin

> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -143,7 +143,7 @@ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state,
>          */
>         if (!state->xmit.buf) {
>                 /* This is protected by the per port mutex */
> -               page = get_zeroed_page(GFP_KERNEL);
> +               page = get_zeroed_page(GFP_KERNEL | GFP_DMA);
>                 if (!page)
>                        return -ENOMEM;

This patch doesn't fix my problem.


After analysis, here is my status:

The Bluetooth chip generates hardware errors due to unexpected
data. These unexpected data are caused by 8250 concurrent TX.

- Concurrent DMA TX (8250_dma.c):

After each DMA transfer, __dma_tx_complete() is called.
This function updates the circular buffer tail and calls
serial8250_tx_dma() if data remains.

However these operations are not protected against concurrent
call of serial8250_tx_dma() (via uart start_tx). So, this concurrent 
call may use non-updated tail index and may be called in parallel 
of an other serial8250_tx_dma().


- Concurrent Chars TX (8250_corec.c):

In the serial8250_handle_irq(), serial8250_tx_chars is called if
UART_LSR_THRE. However we should not call this function
if we are using the DMA. Moreover, for the same reason as above,
serial8250_tx_chars could be called with bad tail index or in parallel
of a serial8250_tx_dma.


I think two fixes are necessary:

to avoid tx chars usage:

diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 69932b7..e124983 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -1520,7 +1520,7 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
                        status = serial8250_rx_chars(up, status);
        }
        serial8250_modem_status(up);
-       if (status & UART_LSR_THRE)
+       if (!up->dma && (status & UART_LSR_THRE))
                serial8250_tx_chars(up);
 
        spin_unlock_irqrestore(&port->lock, flags);


to fix concurrent operations:

diff --git a/drivers/tty/serial/8250/8250_dma.c b/drivers/tty/serial/8250/8250_dma.c
index 7046769..4469190 100644
--- a/drivers/tty/serial/8250/8250_dma.c
+++ b/drivers/tty/serial/8250/8250_dma.c
@@ -20,6 +20,9 @@ static void __dma_tx_complete(void *param)
        struct uart_8250_port   *p = param;
        struct uart_8250_dma    *dma = p->dma;
        struct circ_buf         *xmit = &p->port.state->xmit;
+       unsigned long flags;
+
+       spin_lock_irqsave(&p->port.lock, flags);
 
        dma->tx_running = 0;
 
@@ -35,6 +38,8 @@ static void __dma_tx_complete(void *param)
 
        if (!uart_circ_empty(xmit) && !uart_tx_stopped(&p->port))
                serial8250_tx_dma(p);
+
+       spin_unlock_irqrestore(&p->port.lock, flags);
 }


The last one fixes TX issues but causes random freeze of my
platform (when uploading file via bluetooth). It seems to be a deadlock.
I have no kernel trace in my console, system is stuck.

To temporally workaround this, I simply moved "tx_running=0" after updating
tail index (in __dma_tx_complete) so that no dma_tx can happen before 
this update.

	xmit->tail += dma->tx_size;
	xmit->tail &= UART_XMIT_SIZE - 1;
	p->port.icount.tx += dma->tx_size;

	dma->tx_running = 0;


Regards,
Loic
---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris, 
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: 8250_dw DMA issue with BYT ?
  2014-04-16 16:34     ` Poulain, Loic
@ 2014-04-17 10:21       ` Heikki Krogerus
  0 siblings, 0 replies; 6+ messages in thread
From: Heikki Krogerus @ 2014-04-17 10:21 UTC (permalink / raw)
  To: Poulain, Loic; +Cc: Jin, Yao, linux-serial@vger.kernel.org, Andy Shevchenko

Hi Loic,

On Wed, Apr 16, 2014 at 04:34:58PM +0000, Poulain, Loic wrote:
> After analysis, here is my status:
> 
> The Bluetooth chip generates hardware errors due to unexpected
> data. These unexpected data are caused by 8250 concurrent TX.
> 
> - Concurrent DMA TX (8250_dma.c):
> 
> After each DMA transfer, __dma_tx_complete() is called.
> This function updates the circular buffer tail and calls
> serial8250_tx_dma() if data remains.
> 
> However these operations are not protected against concurrent
> call of serial8250_tx_dma() (via uart start_tx). So, this concurrent 
> call may use non-updated tail index and may be called in parallel 
> of an other serial8250_tx_dma().
> 
> 
> - Concurrent Chars TX (8250_corec.c):
> 
> In the serial8250_handle_irq(), serial8250_tx_chars is called if
> UART_LSR_THRE. However we should not call this function
> if we are using the DMA. Moreover, for the same reason as above,
> serial8250_tx_chars could be called with bad tail index or in parallel
> of a serial8250_tx_dma.

Nice job! Thanks a lot for the analysis!

> I think two fixes are necessary:
> 
> to avoid tx chars usage:

<snip>

> The last one fixes TX issues but causes random freeze of my
> platform (when uploading file via bluetooth). It seems to be a deadlock.
> I have no kernel trace in my console, system is stuck.
> 
> To temporally workaround this, I simply moved "tx_running=0" after updating
> tail index (in __dma_tx_complete) so that no dma_tx can happen before 
> this update.
> 
> 	xmit->tail += dma->tx_size;
> 	xmit->tail &= UART_XMIT_SIZE - 1;
> 	p->port.icount.tx += dma->tx_size;
> 
> 	dma->tx_running = 0;

As a side note. Since you are in any case playing with the tx_running
flag, could you check if it's possible to drop it and just rely on the
status from the dma engine (with dmaengine_tx_status())?


Thanks,

-- 
heikki

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-04-17 10:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-14 15:54 8250_dw DMA issue with BYT ? Poulain, Loic
2014-04-15  6:56 ` Heikki Krogerus
2014-04-15  7:25   ` Andy Shevchenko
2014-04-16  1:33   ` Jin, Yao
2014-04-16 16:34     ` Poulain, Loic
2014-04-17 10:21       ` Heikki Krogerus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).