public inbox for linux-mmc@vger.kernel.org
* [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
@ 2017-03-08  6:57 Petr Cvek
  2017-03-08 16:43 ` Robert Jarzmik
From: Petr Cvek @ 2017-03-08  6:57 UTC (permalink / raw)
  To: Robert Jarzmik, vinod.koul, Ulf Hansson, Daniel Mack,
	Haojian Zhuang
  Cc: dmaengine, linux-mmc, linux-arm-kernel

Hello,

PXA27x DMA changes between:

	v4.7
	d52bd54db8be8999df6df5a776f38c4f8b5e9cea
and
	v4.10-rc5
	a4685d2f58e2230d4e27fb2ee581d7ea35e5d046

seem to expose a race condition when using the PXA MMC driver on a PXA27x (magician.c machine).

The failure produces a single line in the kernel log, after which the filesystem on the SD card becomes inaccessible (and so does the machine):

	mmc0: DMA error on tx channel

I wasn't able to track the problem down to a single patch, because it occurs at random times (anywhere from boot to about half an hour later). It may also depend on the battery charge level (perhaps because charging messages are written to the kernel log).

Most occurrences seem to happen during writes to the SD card. Using an SDHC card shortens the time to failure. After the failure the OS is unavailable (the rootfs is on the card).

From my poking around in the kernel source, it seems possible that pxamci_irq() is called late, so that its subsequent call to pxamci_data_done() does not run soon enough to set [1]
	host->data = NULL;
On the DMA side, the DMA done interrupt is handled by:
	pxad_chan_handler() -> vchan_cookie_complete()
...where a tasklet for vchan_complete() is scheduled, and where finally, with interrupts enabled (can pxamci_irq() be called here?), the callback pxamci_dma_irq() is called.

From my tests it seems that at this point [2] host->data is always NULL and the rest of the callback never runs. The callback ran with a non-NULL host->data only once, just before the failure.
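
The two orderings described above can be modelled in plain user-space C. This is only a sketch, not the driver code: every model_* name, the status argument, and the error counter are assumptions made for illustration, and which DMA status the real callback sees in the failing case has not been confirmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these types are the kernel's. */
enum model_dma_status { MODEL_DMA_COMPLETE, MODEL_DMA_IN_PROGRESS };

struct model_data {
	int error;                 /* stands in for the data error field */
};

struct model_host {
	struct model_data *data;   /* stands in for host->data */
	int tx_errors;             /* counts "DMA error on tx channel" reports */
};

/* Models the end of the MMC IRQ path: the transfer is finished and cleared. */
void model_data_done(struct model_host *host)
{
	assert(host != NULL);
	host->data = NULL;
}

/*
 * Models the DMA completion callback that runs later from the tasklet.
 * Returns true if it got past the NULL check, false if it returned early.
 */
bool model_dma_callback(struct model_host *host, enum model_dma_status status)
{
	assert(host != NULL);

	if (host->data == NULL)
		return false;          /* usual case: data_done already ran */

	if (status != MODEL_DMA_COMPLETE) {
		host->data->error = -5; /* -EIO in the kernel */
		host->tx_errors++;      /* where the log line would be printed */
		model_data_done(host);
	}
	return true;
}
```

If model_data_done() runs before model_dma_callback(), the callback returns early and nothing is reported. If the callback runs while data is still set and the status passed in is not MODEL_DMA_COMPLETE, the error is recorded and the transfer is torn down, which corresponds to the single log line seen before the failure.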

During testing I put udelay(100) at the start of pxamci_dma_irq(), and the failure then occurred after about 2 hours (when I tapped the touchscreen for the first time, which raises CPU usage and interrupt load).

[1] http://elixir.free-electrons.com/source/drivers/mmc/host/pxamci.c?v=4.10#L385
[2] http://elixir.free-electrons.com/source/drivers/mmc/host/pxamci.c?v=4.10#L561

Best regards,
Petr


Thread overview: 6+ messages
2017-03-08  6:57 [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel Petr Cvek
2017-03-08 16:43 ` Robert Jarzmik
2017-03-10  0:49   ` Petr Cvek
2017-03-14 21:11     ` Robert Jarzmik
2017-03-26  2:43       ` Petr Cvek
2017-04-06  6:42         ` Robert Jarzmik
