* Re: [PATCH] sh-sci: fix a race of DMA submit_tx on transfer
[not found] ` <20120315060938.GB10046@linux-sh.org>
@ 2012-03-15 10:50 ` Guennadi Liakhovetski
2012-03-28 6:11 ` Paul Mundt
0 siblings, 1 reply; 2+ messages in thread
From: Guennadi Liakhovetski @ 2012-03-15 10:50 UTC (permalink / raw)
To: Paul Mundt; +Cc: Takashi Yoshii, SH-Linux, linux-serial
Hi Paul
On Thu, 15 Mar 2012, Paul Mundt wrote:
> On Wed, Mar 14, 2012 at 04:14:43PM +0900, Takashi Yoshii wrote:
> > serial: sh-sci: fix a race of DMA submit_tx on transfer
> >
> > When DMA is enabled, sh-sci transfer begins with
> > uart_start()
> > sci_start_tx()
> > if (cookie_tx < 0) schedule_work()
> > Then, starts DMA when wq scheduled, -- (A)
> > process_one_work()
> > work_fn_rx()
> > cookie_tx = desc->submit_tx()
> > And finishes when DMA transfer ends, -- (B)
> > sci_dma_tx_complete()
> > async_tx_ack()
> > cookie_tx = -EINVAL
> > (possible another schedule_work())
> >
> > This A to B sequence is not reentrant, since controlling variables
> > (for example, cookie_tx above) are not queues nor lists. So, they
> > must be invoked as A B A B..., otherwise results in kernel crash.
> >
> > To ensure the sequence, sci_start_tx() seems to test if cookie_tx < 0
> > (represents "not used") to call schedule_work().
> > But cookie_tx will not be set (to a cookie, also means "used") until
> > in the middle of work queue scheduled function work_fn_tx().
> >
> > This gap between the test and set allows the breakage of the sequence
> > under the very frequently call of uart_start().
> > Another gap between async_tx_ack() and another schedule_work() results
> > in the same issue, too.
> >
> > This patch introduces a new condition "cookie_tx == 0" just to mark
> > it is "busy" and assign it within spin-locked region to fill the gaps.
> >
> > Signed-off-by: Takashi Yoshii <takashi.yoshii.zj@renesas.com>
> > Reviewed-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>
> Looks good to me. I'll apply it unless Guennadi has any other concerns.
No, I don't - my ack holds:) However, I do have some concerns regarding a
couple of other possible issues with SCI DMA:
1. As I mentioned earlier, I think, sci_start_tx() should always be called
under the port spinlock to get consistent ->cookie_tx and ->chan_tx
values. This is the case, when called from serial core as
->start_tx() from most locations, but not from uart_handle_cts_change(),
which is an exported function. It is also called internally in the SCI
driver itself several times - with no locking. This might need fixing,
especially in sci_tx_dma_release().
2. The DMA Tx work might need to be cancelled from time to time... E.g.,
when DMA is disabled to switch to PIO, or when shutting down the port.
Both of these might require a bit more thinking and testing.
Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [PATCH] sh-sci: fix a race of DMA submit_tx on transfer
2012-03-15 10:50 ` [PATCH] sh-sci: fix a race of DMA submit_tx on transfer Guennadi Liakhovetski
@ 2012-03-28 6:11 ` Paul Mundt
0 siblings, 0 replies; 2+ messages in thread
From: Paul Mundt @ 2012-03-28 6:11 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: Takashi Yoshii, SH-Linux, linux-serial
On Thu, Mar 15, 2012 at 11:50:28AM +0100, Guennadi Liakhovetski wrote:
> On Thu, 15 Mar 2012, Paul Mundt wrote:
> > Looks good to me. I'll apply it unless Guennadi has any other concerns.
>
> No, I don't - my ack holds:) However, I do have some concerns regarding a
> couple of other possible issues with SCI DMA:
>
Ok, I've queued it up now (it was too late for -final, so we'll have to
rely on the stable backport later).
> 1. As I mentioned earlier, I think, sci_start_tx() should always be called
> under the port spinlock to get consistent ->cookie_tx and ->chan_tx
> values. This is the case, when called from serial core as
> ->start_tx() from most locations, but not from uart_handle_cts_change(),
> which is an exported function. It is also called internally in the SCI
> driver itself several times - with no locking. This might need fixing,
> especially in sci_tx_dma_release().
>
The bulk of the uart_handle_cts_change() callers do so with the lock
held, the only exception seems to be a few drivers that call it directly
from their interrupt handlers.
At first glance, the sci_tx/rx_dma_release() cases will probably need a
bit of reordering given that dma_release_channel() is taking a list
mutex, but I don't see too many issues otherwise. Having said that,
the whole DMA enable/disable path could probably be split up a bit with
individual toggle logic pushed down in to ->start/stop_tx as well as
->stop_rx for some finer-grained control. This would at least help make
the locking a bit more obvious.
> 2. The DMA Tx work might need to be cancelled from time to time... E.g.,
> when DMA is disabled to switch to PIO, or when shutting down the port.
>
Yes, this is something that needs to be done. I've tried to use the
driver as a module before in the past, and it does blow up rather
spectacularly in the remove case, this is hardly limited to the DMA
case though (and is also not a configuration most people are going to
ever really use, which is why we've largely ignored it thus far). The
PIO<->DMA transition on the other hand we're far more likely to hit,
especially if we end up exposing something like a userspace knob for
enabling/disabling arbitrarily for testing.
Most of the DMA cancelling looks it should be pretty easy to do via
->flush_buffer, unless I'm missing something. The amba-pl011.c driver
does just this for the dmaengine case.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2012-03-28 6:11 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <4F5DC485.8020607@renesas.com>
[not found] ` <Pine.LNX.4.64.1203121109360.3744@axis700.grange>
[not found] ` <4F5F2AC1.8060305@renesas.com>
[not found] ` <Pine.LNX.4.64.1203131227470.14461@axis700.grange>
[not found] ` <4F604563.90000@renesas.com>
[not found] ` <20120315060938.GB10046@linux-sh.org>
2012-03-15 10:50 ` [PATCH] sh-sci: fix a race of DMA submit_tx on transfer Guennadi Liakhovetski
2012-03-28 6:11 ` Paul Mundt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).