From: javi.merino@arm.com (Javi Merino)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] ARM: pl330: Fix a race condition
Date: Tue, 20 Sep 2011 14:36:24 +0100 [thread overview]
Message-ID: <4E7896D8.5090606@arm.com> (raw)
In-Reply-To: <CABb+yY0BGHSMA08uDMNkdSehFkQ_9afzzbkdpPkD0cz3owVWBQ@mail.gmail.com>
On 19/09/11 19:07, Jassi Brar wrote:
> On Mon, Sep 19, 2011 at 10:41 PM, Javi Merino <javi.merino@arm.com> wrote:
>> If two requests have been submitted and one of them is running, if you
>> call pl330_chan_ctrl(ch_id, PL330_OP_START), there's a window of time
>> between the spin_lock_irqsave() and the _state() check in which the
>> running transaction may finish. In that case, we don't receive the
>> interrupt (because they are disabled), but _start() sees that the DMA
>> is stopped, so it starts it. The problem is that it sends the
>> transaction that has just finished again, because pl330_update()
>> hasn't mark it as done yet.
>>
>> This patch moves the _state() check out of the critical section, which
>> removes the race condition. It also treats PL330_STATE_COMPLETING as
>> still executing, because that introduces another race condition now
>> that we call _state() with interrupts enabled. Namely, if we read the
>> state as "completing" and the DMA sends the interrupt before we
>> disable interrupts, pl330_update() starts the next transaction and
>> returns. Then the _start() in pl330_chan_ctrl() will patiently wait
>> until the just issued transaction finishes (because the state we read
>> was PL330_STATE_COMPLETING) and when it does, it _trigger()s the same
>> transaction again.
>>
>> Signed-off-by: Javi Merino <javi.merino@arm.com>
>> Cc: Jassi Brar <jassi.brar@samsung.com>
>> ---
>> arch/arm/common/pl330.c | 12 +++++++-----
>> 1 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/common/pl330.c b/arch/arm/common/pl330.c
>> index 97912fa..26b5615 100644
>> --- a/arch/arm/common/pl330.c
>> +++ b/arch/arm/common/pl330.c
>> @@ -936,9 +936,9 @@ static bool _trigger(struct pl330_thread *thrd)
>> return true;
>> }
>>
>> -static bool _start(struct pl330_thread *thrd)
>> +static bool _start(struct pl330_thread *thrd, u32 state)
>> {
>> - switch (_state(thrd)) {
>> + switch (state) {
>> case PL330_STATE_FAULT_COMPLETING:
>> UNTIL(thrd, PL330_STATE_FAULTING | PL330_STATE_KILLING);
>>
>> @@ -949,7 +949,6 @@ static bool _start(struct pl330_thread *thrd)
>> _stop(thrd);
>>
>> case PL330_STATE_KILLING:
>> - case PL330_STATE_COMPLETING:
>> UNTIL(thrd, PL330_STATE_STOPPED)
>>
>> case PL330_STATE_STOPPED:
>> @@ -961,6 +960,7 @@ static bool _start(struct pl330_thread *thrd)
>> case PL330_STATE_UPDTPC:
>> case PL330_STATE_CACHEMISS:
>> case PL330_STATE_EXECUTING:
>> + case PL330_STATE_COMPLETING:
>> return true;
>>
>> case PL330_STATE_WFE: /* For RESUME, nothing yet */
>> @@ -1471,7 +1471,7 @@ int pl330_update(const struct pl330_info *pi)
>> MARK_FREE(rqdone);
>>
>> /* Get going again ASAP */
>> - _start(thrd);
>> + _start(thrd, _state(thrd));
>>
>> /* For now, just make a list of callbacks to be done */
>> list_add_tail(&rqdone->rqd, &pl330->req_done);
>> @@ -1510,12 +1510,14 @@ int pl330_chan_ctrl(void *ch_id, enum pl330_chan_op op)
>> struct pl330_dmac *pl330;
>> unsigned long flags;
>> int ret = 0, active;
>> + u32 dma_state;
>>
>> if (!thrd || thrd->free || thrd->dmac->state == DYING)
>> return -EINVAL;
>>
>> pl330 = thrd->dmac;
>>
>> + dma_state = _state(thrd);
>> spin_lock_irqsave(&pl330->lock, flags);
>>
>> switch (op) {
>> @@ -1546,7 +1548,7 @@ int pl330_chan_ctrl(void *ch_id, enum pl330_chan_op op)
>>
>> /* Start the next */
>> case PL330_OP_START:
>> - if (!_start(thrd))
>> + if (!_start(thrd, dma_state))
>> ret = -EIO;
>> break;
>>
>> --
>
> IIUIC your race scenario should be taken care of simply by doing...
>
> diff --git a/arch/arm/common/pl330.c b/arch/arm/common/pl330.c
> index 97912fa..7129cfb 100644
> --- a/arch/arm/common/pl330.c
> +++ b/arch/arm/common/pl330.c
> @@ -1546,7 +1546,7 @@ int pl330_chan_ctrl(void *ch_id, enum pl330_chan_op op)
>
> /* Start the next */
> case PL330_OP_START:
> - if (!_start(thrd))
> + if (!_thrd_active(thrd) && !_start(thrd))
> ret = -EIO;
> break;
>
My first reaction was this was just moving the race condition, but
thinking about it carefully, it looks like a better (simpler) solution.
> Could you please test if it fixes the issue?
Yes, I'll kick off runs later today and see how it goes. I don't have a
way to reproduce the bug consistently, I just do lots of DMA
transactions and eventually, I end up hitting it. I'll run it overnight
and see if it fixes it or not.
Cheers,
Javi
next prev parent reply other threads:[~2011-09-20 13:36 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-19 17:11 [PATCH] ARM: pl330: Fix a race condition Javi Merino
2011-09-19 18:07 ` Jassi Brar
2011-09-20 13:36 ` Javi Merino [this message]
2011-10-05 12:57 ` Javi Merino
2011-10-06 9:10 ` [PATCH v2] " Javi Merino
2011-11-05 19:05 ` Thomas Abraham
2011-11-05 19:05 ` Thomas Abraham
2011-11-07 10:48 ` Javi Merino
2011-11-07 10:48 ` Javi Merino
2011-11-07 11:03 ` Thomas Abraham
2011-11-07 11:03 ` Thomas Abraham
2011-11-28 8:23 ` Boojin Kim
2011-11-28 8:23 ` Boojin Kim
2011-11-28 16:36 ` Javi Merino
2011-11-28 16:36 ` Javi Merino
2011-11-29 3:41 ` Boojin Kim
2011-11-29 3:41 ` Boojin Kim
2011-11-29 9:53 ` Javi Merino
2011-11-29 9:53 ` Javi Merino
2011-11-29 10:37 ` Jassi Brar
2011-11-29 10:37 ` Jassi Brar
2011-12-07 7:52 ` Kukjin Kim
2011-12-07 7:52 ` Kukjin Kim
2011-12-07 10:01 ` Javi Merino
2011-12-07 10:01 ` Javi Merino
2011-12-07 20:54 ` Javi Merino
2011-12-07 20:54 ` Javi Merino
2011-12-09 11:58 ` Javi Merino
2011-12-09 11:58 ` Javi Merino
2011-12-09 13:04 ` Jassi Brar
2011-12-09 13:04 ` Jassi Brar
2011-12-09 13:41 ` Javi Merino
2011-12-09 13:41 ` Javi Merino
2011-12-09 14:15 ` Jassi Brar
2011-12-09 14:15 ` Jassi Brar
2011-12-09 14:52 ` Javi Merino
2011-12-09 14:52 ` Javi Merino
2011-12-09 16:50 ` Jassi Brar
2011-12-09 16:50 ` Jassi Brar
2011-12-09 19:50 ` Javi Merino
2011-12-09 19:50 ` Javi Merino
2011-12-11 10:51 ` Jassi Brar
2011-12-11 10:51 ` Jassi Brar
2011-12-11 15:09 ` Javi Merino
2011-12-11 15:09 ` Javi Merino
2011-12-11 17:10 ` Jassi Brar
2011-12-11 17:10 ` Jassi Brar
2011-12-11 17:42 ` Javi Merino
2011-12-11 17:42 ` Javi Merino
2011-12-11 19:27 ` [PATCH] ARM: PL330: Fix driver freeze Javi Merino
2011-12-11 19:27 ` Javi Merino
2011-12-15 17:48 ` Javi Merino
2011-12-15 17:48 ` Javi Merino
2011-12-16 9:01 ` Tushar Behera
2011-12-16 9:01 ` Tushar Behera
2011-12-16 6:27 ` Jassi Brar
2011-12-16 6:27 ` Jassi Brar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E7896D8.5090606@arm.com \
--to=javi.merino@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.