Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>,
	yusuke.goda.sx@renesas.com,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Paul Mundt <lethal@linux-sh.org>, Magnus <magnus.damm@gmail.com>,
	linux-sh@vger.kernel.org,
	Kuninori Morimoto <kuninori.morimoto.gx@gmail.com>,
	linux-mmc@vger.kernel.org, koba@kmckk.co.jp
Subject: Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
Date: Tue, 18 Sep 2012 15:13:59 +0900	[thread overview]
Message-ID: <50581127.2000003@kmckk.co.jp> (raw)
In-Reply-To: <50402A0D.7070106@kmckk.co.jp>

Hello Guennadi

(2012/08/31 12:05), Tetsuyuki Kobayashi wrote:

> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>> interrupts without any active request. To prevent the Oops, that results
>> in such cases, don't dereference the mmc request pointer until we make
>> sure, that we are indeed processing such a request.
>>
>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>> ---
>>
>> Hello Kobayashi-san
>>
>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>>
>> ...
>>
>>> After applying this patch on kzm9g board, I got this error regarding
>>> eMMC.
>>> I think this is another problem.
>>>
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 00000008
>>> pgd = c0004000
>>> [00000008] *pgd=00000000
>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>>> Modules linked in:
>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>>> PC is at sh_mmcif_irqt+0x20/0xb30
>>> LR is at irq_thread+0x94/0x16c
>>
>> [snip]
>>
>>> My quick fix is below.
>>>
>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>> index 5d81427..e587fbc 100644
>>> --- a/drivers/mmc/host/sh_mmcif.c
>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>> *dev_id)
>>>   {
>>>          struct sh_mmcif_host *host = dev_id;
>>>          struct mmc_request *mrq = host->mrq;
>>> -       struct mmc_data *data = mrq->data;
>>> +       /*struct mmc_data *data = mrq->data; -- this cause null
>>> pointer access*/
>>> +       struct mmc_data *data;
>>> +
>>> +       /* quick fix by koba */
>>> +       if (mrq == NULL) {
>>> +               printk("sh_mmcif_irqt: mrq == NULL:
>>> host->wait_for=%d\n", host->wait_for);
>>> +       } else {
>>> +               data = mrq->data;
>>> +       }
>>>
>>>          cancel_delayed_work_sync(&host->timeout_work);
>>>
>>>
>>> With this patch, there is no null pointer accesses and got this log.
>>>
>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>    ...
>>>
>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>>> There is code such like:
>>>
>>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>>         host->mrq = NULL;
>>>
>>> So, at the top of sh_mmcif_irqt, if host->wait_for ==
>>> MMCIF_WAIT_FOR_REQUEST,
>>> host->mrq = NULL.
>>> It is too earlier to access mrq->data before checking host->mrq. it may
>>> cause null pointer access.
>>>
>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>>
>> Thanks for your report and a fix. Could you please double-check, whether
>> the below patch also fixes your problem? Since such spurious interrupts
>> are possible I would commit a check like this one, but in the longer run
>> we want to identify and eliminate them, if possible. But since so far
>> these interrupts only happen on 1 board model and also not on all units
>> and not upon each boot, this could be a bit tricky.
>>
>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>>
>> Thanks
>> Guennadi
>>
>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..82bf921 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>> *dev_id)
>>   {
>>       struct sh_mmcif_host *host = dev_id;
>>       struct mmc_request *mrq = host->mrq;
>> -    struct mmc_data *data = mrq->data;
>>
>>       cancel_delayed_work_sync(&host->timeout_work);
>>
>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>> *dev_id)
>>       case MMCIF_WAIT_FOR_READ_END:
>>       case MMCIF_WAIT_FOR_WRITE_END:
>>           if (host->sd_error)
>> -            data->error = sh_mmcif_error_manage(host);
>> +            mrq->data->error = sh_mmcif_error_manage(host);
>>           break;
>>       default:
>>           BUG();
>>       }
>>
>>       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>> +        struct mmc_data *data = mrq->data;
>>           if (!mrq->cmd->error && data && !data->error)
>>               data->bytes_xfered =
>>                   data->blocks * data->blksz;
>>
>
> I tried this patch. It seems better.
> But I think this still have potential race condition.
> I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
> host->wait_for for new request at the same time.
> How about add this code at the top of sh_mmcif_irqt or before returning
> IRQ_WAKE_THREAD in sh_mmcif_intr ?
>
>      if (host->state == STATE_IDLE)
>          return IRQ_HANDLED;
>
> I will rebase my test environment to v3.6-rc3 or later. Then I will
> send you my .config.
>
How is this?
I hope this fixed in v3.6.

WARNING: multiple messages have this Message-ID (diff)

From: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>,
	yusuke.goda.sx@renesas.com,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Paul Mundt <lethal@linux-sh.org>, Magnus <magnus.damm@gmail.com>,
	linux-sh@vger.kernel.org,
	Kuninori Morimoto <kuninori.morimoto.gx@gmail.com>,
	linux-mmc@vger.kernel.org, koba@kmckk.co.jp
Subject: Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
Date: Tue, 18 Sep 2012 06:13:59 +0000	[thread overview]
Message-ID: <50581127.2000003@kmckk.co.jp> (raw)
In-Reply-To: <50402A0D.7070106@kmckk.co.jp>

Hello Guennadi

(2012/08/31 12:05), Tetsuyuki Kobayashi wrote:

> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>> interrupts without any active request. To prevent the Oops, that results
>> in such cases, don't dereference the mmc request pointer until we make
>> sure, that we are indeed processing such a request.
>>
>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>> ---
>>
>> Hello Kobayashi-san
>>
>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>>
>> ...
>>
>>> After applying this patch on kzm9g board, I got this error regarding
>>> eMMC.
>>> I think this is another problem.
>>>
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 00000008
>>> pgd = c0004000
>>> [00000008] *pgd\0000000
>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>>> Modules linked in:
>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>>> PC is at sh_mmcif_irqt+0x20/0xb30
>>> LR is at irq_thread+0x94/0x16c
>>
>> [snip]
>>
>>> My quick fix is below.
>>>
>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>> index 5d81427..e587fbc 100644
>>> --- a/drivers/mmc/host/sh_mmcif.c
>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>> *dev_id)
>>>   {
>>>          struct sh_mmcif_host *host = dev_id;
>>>          struct mmc_request *mrq = host->mrq;
>>> -       struct mmc_data *data = mrq->data;
>>> +       /*struct mmc_data *data = mrq->data; -- this cause null
>>> pointer access*/
>>> +       struct mmc_data *data;
>>> +
>>> +       /* quick fix by koba */
>>> +       if (mrq = NULL) {
>>> +               printk("sh_mmcif_irqt: mrq = NULL:
>>> host->wait_for=%d\n", host->wait_for);
>>> +       } else {
>>> +               data = mrq->data;
>>> +       }
>>>
>>>          cancel_delayed_work_sync(&host->timeout_work);
>>>
>>>
>>> With this patch, there is no null pointer accesses and got this log.
>>>
>>> sh_mmcif_irqt: mrq = NULL: host->wait_for=0
>>> sh_mmcif_irqt: mrq = NULL: host->wait_for=0
>>>    ...
>>>
>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>>> There is code such like:
>>>
>>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>>         host->mrq = NULL;
>>>
>>> So, at the top of sh_mmcif_irqt, if host->wait_for =
>>> MMCIF_WAIT_FOR_REQUEST,
>>> host->mrq = NULL.
>>> It is too earlier to access mrq->data before checking host->mrq. it may
>>> cause null pointer access.
>>>
>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>>
>> Thanks for your report and a fix. Could you please double-check, whether
>> the below patch also fixes your problem? Since such spurious interrupts
>> are possible I would commit a check like this one, but in the longer run
>> we want to identify and eliminate them, if possible. But since so far
>> these interrupts only happen on 1 board model and also not on all units
>> and not upon each boot, this could be a bit tricky.
>>
>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>>
>> Thanks
>> Guennadi
>>
>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..82bf921 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>> *dev_id)
>>   {
>>       struct sh_mmcif_host *host = dev_id;
>>       struct mmc_request *mrq = host->mrq;
>> -    struct mmc_data *data = mrq->data;
>>
>>       cancel_delayed_work_sync(&host->timeout_work);
>>
>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>> *dev_id)
>>       case MMCIF_WAIT_FOR_READ_END:
>>       case MMCIF_WAIT_FOR_WRITE_END:
>>           if (host->sd_error)
>> -            data->error = sh_mmcif_error_manage(host);
>> +            mrq->data->error = sh_mmcif_error_manage(host);
>>           break;
>>       default:
>>           BUG();
>>       }
>>
>>       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>> +        struct mmc_data *data = mrq->data;
>>           if (!mrq->cmd->error && data && !data->error)
>>               data->bytes_xfered >>                   data->blocks * data->blksz;
>>
>
> I tried this patch. It seems better.
> But I think this still have potential race condition.
> I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
> host->wait_for for new request at the same time.
> How about add this code at the top of sh_mmcif_irqt or before returning
> IRQ_WAKE_THREAD in sh_mmcif_intr ?
>
>      if (host->state = STATE_IDLE)
>          return IRQ_HANDLED;
>
> I will rebase my test environment to v3.6-rc3 or later. Then I will
> send you my .config.
>
How is this?
I hope this fixed in v3.6.

next prev parent reply	other threads:[~2012-09-18  6:14 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-02  9:50 irqdomain breaks ap4 boot kuninori.morimoto.gx
2012-08-03  5:00 ` Paul Mundt
2012-08-09  4:28 ` Paul Mundt
2012-08-09  4:53 ` Kuninori Morimoto
2012-08-10  6:10 ` Kuninori Morimoto
2012-08-10 12:38 ` Paul Mundt
2012-08-17  5:54   ` kzm9g boot fail (was Re: irqdomain breaks ap4 boot) Tetsuyuki Kobayashi
2012-08-20  1:14     ` Kuninori Morimoto
2012-08-20  3:13     ` Paul Mundt
2012-08-20  4:19     ` Kuninori Morimoto
2012-08-20  4:19     ` Tetsuyuki Kobayashi
2012-08-20  4:38     ` Paul Mundt
2012-08-20  4:45     ` Kuninori Morimoto
2012-08-20  5:24     ` Paul Mundt
2012-08-20  5:33     ` Tetsuyuki Kobayashi
2012-08-20  6:13     ` Kuninori Morimoto
2012-08-20  6:24     ` Kuninori Morimoto
2012-08-20  6:30     ` Paul Mundt
2012-08-20  6:32     ` Tetsuyuki Kobayashi
2012-08-22  6:49       ` [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts Guennadi Liakhovetski
2012-08-22  6:49         ` Guennadi Liakhovetski
2012-08-22 12:16         ` Tetsuyuki Kobayashi
2012-08-22 12:16           ` Tetsuyuki Kobayashi
2012-08-23  7:11           ` Guennadi Liakhovetski
2012-08-23  7:11             ` Guennadi Liakhovetski
2012-09-04  7:40             ` Tetsuyuki Kobayashi
2012-09-04  7:40               ` Tetsuyuki Kobayashi
2012-08-31  3:05         ` Tetsuyuki Kobayashi
2012-08-31  3:05           ` Tetsuyuki Kobayashi
2012-09-18  6:13           ` Tetsuyuki Kobayashi [this message]
2012-09-18  6:13             ` Tetsuyuki Kobayashi
2012-09-18  6:42             ` Guennadi Liakhovetski
2012-09-18  6:42               ` Guennadi Liakhovetski
2012-09-18  8:02               ` Tetsuyuki Kobayashi
2012-09-18  8:02                 ` Tetsuyuki Kobayashi
2012-09-18  8:44                 ` Tetsuyuki Kobayashi
2012-09-18  8:44                   ` Tetsuyuki Kobayashi
2012-09-18  8:56                   ` Guennadi Liakhovetski
2012-09-18  8:56                     ` Guennadi Liakhovetski
2012-09-19  2:50         ` Tetsuyuki Kobayashi
2012-09-19  2:50           ` Tetsuyuki Kobayashi
2012-09-26  1:47           ` Tetsuyuki Kobayashi
2012-09-26  1:47             ` Tetsuyuki Kobayashi
2012-09-26 10:04             ` Chris Ball
2012-09-26 10:04               ` Chris Ball
2012-09-19  6:24         ` Chris Ball
2012-09-19  6:24           ` Chris Ball
2012-09-21  2:35           ` Tetsuyuki Kobayashi
2012-09-21  2:35             ` Tetsuyuki Kobayashi
2012-08-20  7:18     ` kzm9g boot fail (was Re: irqdomain breaks ap4 boot) Magnus Damm
2012-08-20  7:40     ` Paul Mundt
2012-08-20  7:41     ` Kuninori Morimoto
2012-08-20  7:54     ` Paul Mundt
2012-08-20  8:12     ` Kuninori Morimoto
2012-08-20  8:35     ` Kuninori Morimoto
2012-08-21  2:31     ` Kuninori Morimoto
2012-08-21  4:22     ` Tetsuyuki Kobayashi
2012-08-31  6:55 ` irqdomain breaks ap4 boot Tetsuyuki Kobayashi
2012-08-31  7:17 ` Simon Horman
2012-08-31 10:36 ` Paul Mundt
2012-09-18  2:15 ` Tetsuyuki Kobayashi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50581127.2000003@kmckk.co.jp \
    --to=koba@kmckk.co.jp \
    --cc=g.liakhovetski@gmx.de \
    --cc=kuninori.morimoto.gx@gmail.com \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=lethal@linux-sh.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=magnus.damm@gmail.com \
    --cc=yusuke.goda.sx@renesas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.