[BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel

Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
@ 2017-03-08  6:57 Petr Cvek
  2017-03-08 16:43 ` Robert Jarzmik
  0 siblings, 1 reply; 6+ messages in thread
From: Petr Cvek @ 2017-03-08  6:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

PXA27x DMA changes between:

	v4.7
	d52bd54db8be8999df6df5a776f38c4f8b5e9cea
and
	v4.10-rc5
	a4685d2f58e2230d4e27fb2ee581d7ea35e5d046

seems to expose a race condition while using PXA MMC driver on a PXA27x (magician.c machine).

The failure causes one line in the kernel log, after which the filesystem on SD card is inaccessible (and machine too).

	mmc0: DMA error on tx channel

I wasn't able to track the problem to a single patch as the problem occurs at random time (from the boot to like a half an hour) and it's maybe dependent on a level of a battery charge (maybe because of kernel log writes of charging messages).

It seems that most occurrency is during writes on an SD card. Using an SDHC card decreases the time to fail. After failure the OS is unavailable (rootfs in on the card).

>From my poking in the kernel source code it seems there is a probability that pxamci_irq() takes longer to call and its subsequent call pxamci_data_done() isn't fast enough to set [1]
	host->data = NULL;
>From the DMA side, the DMA done interrupt is generated:
	pxad_chan_handler() -> vchan_cookie_complete()
...where a tasklet for vchan_complete() is scheduled, where finally with interrupts enabled (can pxamci_irq() be called here?) the callback pxamci_dma_irq() is called.

>From my tests it seems at this point [2] the host->data is always NULL and rest of the callback is never called. It is called once with a nonempty host->data only just before the failure.

During the testing I put udelay(100) at the start of pxamci_dma_irq() and fail occurred after like 2 hours (when I for the first time tapped the touchscreen - higher CPU usage and interrupts).

[1] http://elixir.free-electrons.com/source/drivers/mmc/host/pxamci.c?v=4.10#L385
[2] http://elixir.free-electrons.com/source/drivers/mmc/host/pxamci.c?v=4.10#L561

Best regards,
Petr

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
  2017-03-08  6:57 [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel Petr Cvek
@ 2017-03-08 16:43 ` Robert Jarzmik
  2017-03-10  0:49   ` Petr Cvek
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Jarzmik @ 2017-03-08 16:43 UTC (permalink / raw)
  To: linux-arm-kernel

Petr Cvek <petr.cvek@tul.cz> writes:

Hi Petr,
> I wasn't able to track the problem to a single patch as the problem occurs at
> random time (from the boot to like a half an hour) and it's maybe dependent on a
> level of a battery charge (maybe because of kernel log writes of charging
> messages).
Mmmh, long reproduction time, that will be bad.

> It seems that most occurrency is during writes on an SD card. Using an SDHC
> card decreases the time to fail. After failure the OS is unavailable (rootfs
> in on the card).
Okay, let me try to make write loop on my SD card to see if I manage to
reproduce this.

> From my poking in the kernel source code it seems there is a probability that pxamci_irq() takes longer to call and its subsequent call pxamci_data_done() isn't fast enough to set [1]
> 	host->data = NULL;

> From the DMA side, the DMA done interrupt is generated:
> 	pxad_chan_handler() -> vchan_cookie_complete()
> ...where a tasklet for vchan_complete() is scheduled
At least that seems to hint the DMA part is sound so for.
The bothering part is the log error "mmc0: DMA error on tx channel". I would
need a bit of guidance here, with the same log with [1] applied.

> , where finally with interrupts enabled (can pxamci_irq() be called here?) the
> callback pxamci_dma_irq() is called.
When DMA completes, there is a tiny window, before pxamci_dma_irq() is called,
when pxamci_irq() can be called, yes. As soon as the spinlock is taken in
pxamci_dma_irq() is taken, no more races.

> From my tests it seems at this point [2] the host->data is always NULL and rest
> of the callback is never called. It is called once with a nonempty host->data
> only just before the failure.
>
> During the testing I put udelay(100) at the start of pxamci_dma_irq() and fail
> occurred after like 2 hours (when I for the first time tapped the touchscreen -
> higher CPU usage and interrupts).
Mmm I would need more data here.

The biggest help I could get would be the pxa dma traces here :
echo -n 'file pxa_dma.c +p' > /sys/kernel/debug/dynamic_debug/control
echo -n 'file virt-dma.c +p' > /sys/kernel/debug/dynamic_debug/control

And then capture the last traces and send them to me.

Cheers.

-- 
Robert

[1] Small debug patch
---8>---
diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c
index c763b404510f..ed3812b2a34d 100644
--- a/drivers/mmc/host/pxamci.c
+++ b/drivers/mmc/host/pxamci.c
@@ -571,8 +571,9 @@ static void pxamci_dma_irq(void *param)
 	if (likely(status == DMA_COMPLETE)) {
 		writel(BUF_PART_FULL, host->base + MMC_PRTBUF);
 	} else {
-		pr_err("%s: DMA error on %s channel\n", mmc_hostname(host->mmc),
-			host->data->flags & MMC_DATA_READ ? "rx" : "tx");
+		pr_err("%s: DMA error on %s channel: %d\n",
+		       mmc_hostname(host->mmc),
+		       host->data->flags & MMC_DATA_READ ? "rx" : "tx", status);
 		host->data->error = -EIO;
 		pxamci_data_done(host, 0);
 	}

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
  2017-03-08 16:43 ` Robert Jarzmik
@ 2017-03-10  0:49   ` Petr Cvek
  2017-03-14 21:11     ` Robert Jarzmik
  0 siblings, 1 reply; 6+ messages in thread
From: Petr Cvek @ 2017-03-10  0:49 UTC (permalink / raw)
  To: linux-arm-kernel

Dne 8.3.2017 v 17:43 Robert Jarzmik napsal(a):
> Petr Cvek <petr.cvek@tul.cz> writes:
> 
> Hi Petr,
>> I wasn't able to track the problem to a single patch as the problem occurs at
>> random time (from the boot to like a half an hour) and it's maybe dependent on a
>> level of a battery charge (maybe because of kernel log writes of charging
>> messages).
> Mmmh, long reproduction time, that will be bad.

Sent log is for a machine which finally died when I tried to start an X server. These debug printks slow it down.

>> It seems that most occurrency is during writes on an SD card. Using an SDHC
>> card decreases the time to fail. After failure the OS is unavailable (rootfs
>> in on the card).
> Okay, let me try to make write loop on my SD card to see if I manage to
> reproduce this.
> 

I tried to enable and disable soundcard (other part which is using DMA), but it does not have any effect. Concurrent interrupts may have.

>> From my poking in the kernel source code it seems there is a probability that pxamci_irq() takes longer to call and its subsequent call pxamci_data_done() isn't fast enough to set [1]
>> 	host->data = NULL;
> 
>> From the DMA side, the DMA done interrupt is generated:
>> 	pxad_chan_handler() -> vchan_cookie_complete()
>> ...where a tasklet for vchan_complete() is scheduled
> At least that seems to hint the DMA part is sound so for.
> The bothering part is the log error "mmc0: DMA error on tx channel". I would
> need a bit of guidance here, with the same log with [1] applied.
> 

Probably some cooperation of the DMA engine and the PXA MMC (like that other mail I've sent to you with UDC and g_webcam). I made another version of debug kernel with my own asserts:

In pxamci_dma_irq() before dmaengine_tx_status() call I've put:

pr_info("!!!cookie=%x complete=%x used=%x\n", 
	host->dma_cookie,chan->completed_cookie,chan->cookie);

It gets called only ONCE, just before failure, which values like this:
	cookie=372 complete=371 used=372

vchan_cyclic_callback() is called only with the soundcard in my machine IMO. I first thought there is regression with vd_completed in pxad_chan_handler() being changed during vchan_cyclic_callback() but it seems not.

pxad_tx_status() returns DMA_IN_PROGRESS before the failure.

>> , where finally with interrupts enabled (can pxamci_irq() be called here?) the
>> callback pxamci_dma_irq() is called.
> When DMA completes, there is a tiny window, before pxamci_dma_irq() is called,
> when pxamci_irq() can be called, yes. As soon as the spinlock is taken in
> pxamci_dma_irq() is taken, no more races.
> 

Does pxamci_dma_irq() depend on pxamci_irq() -> pxamci_data_done() NULLing host->data?

>> From my tests it seems at this point [2] the host->data is always NULL and rest
>> of the callback is never called. It is called once with a nonempty host->data
>> only just before the failure.
>>
>> During the testing I put udelay(100) at the start of pxamci_dma_irq() and fail
>> occurred after like 2 hours (when I for the first time tapped the touchscreen -
>> higher CPU usage and interrupts).
> Mmm I would need more data here.
> 
> The biggest help I could get would be the pxa dma traces here :
> echo -n 'file pxa_dma.c +p' > /sys/kernel/debug/dynamic_debug/control
> echo -n 'file virt-dma.c +p' > /sys/kernel/debug/dynamic_debug/control
> 
> And then capture the last traces and send them to me.

Last traces before the failure are in the attachment. It was copied (by a continual dmesg dump) from a ssh terminal over infraport (/var/log/messages gets corrupted after MMC DMA error).

> 
> Cheers.
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg_mmc.log
Type: text/x-log
Size: 28818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20170310/9ee74ab3/attachment-0001.bin>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
  2017-03-10  0:49   ` Petr Cvek
@ 2017-03-14 21:11     ` Robert Jarzmik
  2017-03-26  2:43       ` Petr Cvek
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Jarzmik @ 2017-03-14 21:11 UTC (permalink / raw)
  To: linux-arm-kernel

Petr Cvek <petr.cvek@tul.cz> writes:

Ok Petr, I've been trying for days to reproduce without any luck.

I had a look at your traces, and I'd like something else when it happens :
 1) The patch I provided earlier applied
 2) This done (the 'cat' after the bug) :
	mount -t debugfs none /sys/kernel/debug/
	cat /sys/kernel/debug/pxa-dma.0/channels/4/[sd]*

Cheers.

--
Robert

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
  2017-03-14 21:11     ` Robert Jarzmik
@ 2017-03-26  2:43       ` Petr Cvek
  2017-04-06  6:42         ` Robert Jarzmik
  0 siblings, 1 reply; 6+ messages in thread
From: Petr Cvek @ 2017-03-26  2:43 UTC (permalink / raw)
  To: linux-arm-kernel

Dne 14.3.2017 v 22:11 Robert Jarzmik napsal(a):
> Petr Cvek <petr.cvek@tul.cz> writes:
> 
> Ok Petr, I've been trying for days to reproduce without any luck.

Hi,

I think I was able to finally find the problem with the PXA MCI and DMA. It seems to be a problem with a race condition with vchan_complete() tasklet.

The pxa_dma driver handles IRQ with pxad_chan_handler(), which calls vchan_cookie_complete(), which schedules the vchan_complete() tasklet. 

Starting the tasklet may take a long time. The race condition appeared during the heavy IRQ load. During that time, the pxamci driver can start another data write transmit. This another transmit with again schedule the tasklet. But as tasklet schedules are not cumulative, it will (probably) add item to a list. 

After some time the tasklet is finally scheduled and for every item in the list the callback pxamci_dma_irq() is called. And there is the main problem, the pxamci_dma_irq() is using __actual__ transmit variables (e.g. host->data). My debug printk shows the code tests a cookie of the actual transmit, which may be in a DMA_IN_PROGRESS state (every failure is during this state).

Solution:

I commented the error handling parts of the callback function and the driver works, but it is only for the testing purposes, there can be a partially filled FIFO (BUF_PART_FULL) which will not be handled. Problem is the driver won't wait on the completion before starting a new transmission. But this waiting on completion will probably slow down the MMC communication :-/.

The best thing would be to handle the partial buffer somewhere else and get rid of the callback completely. If it is possible, probably not as I assume the partially filled buffer will not create pxamci interrupt? In the other case then maybe in pxamci_data_done()?

Log:

	[ 2669.917946] dma dma0chan1: pxad_chan_handler(): checking txd c18c2f20[135f]: completed=1 dcsr=0x2000000c
		^^^ schedules the tasklet
	[ 2669.924255] dma dma0chan1: pxad_chan_handler(): checking txd c1a6b740[1360]: completed=1 dcsr=0x2000000c
		^^^ reschedule the tasklet
	[ 2669.934441] dma dma0chan1: pxad_chan_handler(): checking txd c1a6b880[1361]: completed=1 dcsr=0x2000000c
		^^^ reschedule the tasklet
	[ 2669.944893] dma dma0chan1: pxad_chan_handler(): checking txd c1a78ba0[1362]: completed=1 dcsr=0x2000000c
		^^^ reschedule the tasklet
	[ 2670.081114] ###pre
		^^^ tasklet has finally started
	[ 2670.081187] ###post
		^^^ first item of the list, callback
	[ 2670.081369] !!!cookie=1364 complete=1363 used=1364 ... status=1
		^^^ There it would fail with "DMA error on tx channel"
	[ 2670.081608] ###post
		^^^ The second item of the list
	[ 2670.081678] !!!cookie=1364 complete=1363 used=1364 ... status=1
		^^^ Again called with the same host->data, notice same cookie, status=1 == DMA_IN_PROGRESS (always)

The full log and used debug patch are attached. The machine is PXA272 @ 416MHz, during logging (over irda or usb ssh console) I created multiple interrupt sources by touching a touchscreen and pressing GPIO buttons. A higher bug occurrence was observed with sync-written files of unusual lengths:

	while : ; do
	dd if=/dev/urandom bs=7777 count=11 of=/tmp/file conv=fsync
	done

> 
> I had a look at your traces, and I'd like something else when it happens :
>  1) The patch I provided earlier applied

Yeah it is DMA_IN_PROGRESS in the "bug" case and a few DMA_COMPLETE in the "normal" case, but I don't know if it is a valid value due to the race condition "aliasing". Anyway the status value is written in the "!!!cookie=" printk.

>  2) This done (the 'cat' after the bug) :
> 	mount -t debugfs none /sys/kernel/debug/
> 	cat /sys/kernel/debug/pxa-dma.0/channels/4/[sd]*
> 

This wasn't possible as everything freezes on the vanilla kernel and with my debug patch the DMA transactions just continue.

Cheers,
Petr

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-DMA-MMC-debugs.patch
Type: text/x-patch
Size: 3144 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20170326/e1a9b1c9/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg_mmc4_final.log
Type: text/x-log
Size: 25014 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20170326/e1a9b1c9/attachment-0003.bin>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
  2017-03-26  2:43       ` Petr Cvek
@ 2017-04-06  6:42         ` Robert Jarzmik
  0 siblings, 0 replies; 6+ messages in thread
From: Robert Jarzmik @ 2017-04-06  6:42 UTC (permalink / raw)
  To: linux-arm-kernel

Petr Cvek <petr.cvek@tul.cz> writes:

Hi Petr,

Sorry I was on holidays away from computers.

> I think I was able to finally find the problem with the PXA MCI and DMA. It
> seems to be a problem with a race condition with vchan_complete() tasklet.
I completely agree with this analysis.

... zip the analysis, good catch ...

> Solution:
>
> I commented the error handling parts of the callback function and the driver
> works, but it is only for the testing purposes, there can be a partially
> filled FIFO (BUF_PART_FULL) which will not be handled. Problem is the driver
> won't wait on the completion before starting a new transmission. But this
> waiting on completion will probably slow down the MMC communication :-/.
I has to, the PXA can handle only one request at a time.

> The best thing would be to handle the partial buffer somewhere else and get
> rid of the callback completely. If it is possible, probably not as I assume
> the partially filled buffer will not create pxamci interrupt? In the other
> case then maybe in pxamci_data_done()?
That looks a bit over complicated, but you can try to send a patch for this.
I would try a simpler road if I were you :
 - in pxamci_irq()
 - call pxamci_data_done() _only_ in the error case or if host->data = NULL,
 ie. if "stat" states that the transfer was in error or the transfer is without
 data.

Either way you choose, pxamci_data_done() should only be called once per
transfer, and this is the fix you're aiming at.

Actually the title might be amended, as the bug is not as much in the dmaengine
driver as in the pxamci driver. I was thinking that 6464b7140951 ("mmc: pxamci:
switch over to dmaengine use") had broken it, but it looks to me the same
pattern was there, but without a tasklet, and the bug didn't occurr because of
the fast dma interrupt handling.

Cheers.

-- 
Robert

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-04-06  6:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-08  6:57 [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel Petr Cvek
2017-03-08 16:43 ` Robert Jarzmik
2017-03-10  0:49   ` Petr Cvek
2017-03-14 21:11     ` Robert Jarzmik
2017-03-26  2:43       ` Petr Cvek
2017-04-06  6:42         ` Robert Jarzmik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox