[PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts

linux-mmc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
       [not found]                       ` <5031D9FF.8060801@kmckk.co.jp>
@ 2012-08-22  6:49                         ` Guennadi Liakhovetski
  2012-08-22 12:16                           ` Tetsuyuki Kobayashi
                                             ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Guennadi Liakhovetski @ 2012-08-22  6:49 UTC (permalink / raw)
  To: Tetsuyuki Kobayashi
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc

On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious 
interrupts without any active request. To prevent the Oops, that results 
in such cases, don't dereference the mmc request pointer until we make 
sure, that we are indeed processing such a request.

Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
---

Hello Kobayashi-san

On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:

...

> After applying this patch on kzm9g board, I got this error regarding eMMC.
> I think this is another problem.
> 
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000008
> pgd = c0004000
> [00000008] *pgd=00000000
> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
> Modules linked in:
> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
> PC is at sh_mmcif_irqt+0x20/0xb30
> LR is at irq_thread+0x94/0x16c

[snip]

> My quick fix is below.
> 
> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> index 5d81427..e587fbc 100644
> --- a/drivers/mmc/host/sh_mmcif.c
> +++ b/drivers/mmc/host/sh_mmcif.c
> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>  {
>         struct sh_mmcif_host *host = dev_id;
>         struct mmc_request *mrq = host->mrq;
> -       struct mmc_data *data = mrq->data;
> +       /*struct mmc_data *data = mrq->data; -- this cause null pointer access*/
> +       struct mmc_data *data;
> +
> +       /* quick fix by koba */
> +       if (mrq == NULL) {
> +               printk("sh_mmcif_irqt: mrq == NULL: host->wait_for=%d\n", host->wait_for);
> +       } else {
> +               data = mrq->data;
> +       }
> 
>         cancel_delayed_work_sync(&host->timeout_work);
> 
> 
> With this patch, there is no null pointer accesses and got this log.
> 
> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>   ...
> 
> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
> There is code such like:
> 
>        host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>        host->mrq = NULL;
> 
> So, at the top of sh_mmcif_irqt, if host->wait_for == MMCIF_WAIT_FOR_REQUEST,
> host->mrq = NULL. 
> It is too earlier to access mrq->data before checking host->mrq. it may
> cause null pointer access.
> 
> Goda-san, could you check this and refine the code of sh_mmcif_irqt?

Thanks for your report and a fix. Could you please double-check, whether 
the below patch also fixes your problem? Since such spurious interrupts 
are possible I would commit a check like this one, but in the longer run 
we want to identify and eliminate them, if possible. But since so far 
these interrupts only happen on 1 board model and also not on all units 
and not upon each boot, this could be a bit tricky.

One more question - is this only needed for 3.7 or also for 3.6 / stable?

Thanks
Guennadi

 drivers/mmc/host/sh_mmcif.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
index 5d81427..82bf921 100644
--- a/drivers/mmc/host/sh_mmcif.c
+++ b/drivers/mmc/host/sh_mmcif.c
@@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
 {
 	struct sh_mmcif_host *host = dev_id;
 	struct mmc_request *mrq = host->mrq;
-	struct mmc_data *data = mrq->data;
 
 	cancel_delayed_work_sync(&host->timeout_work);
 
@@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
 	case MMCIF_WAIT_FOR_READ_END:
 	case MMCIF_WAIT_FOR_WRITE_END:
 		if (host->sd_error)
-			data->error = sh_mmcif_error_manage(host);
+			mrq->data->error = sh_mmcif_error_manage(host);
 		break;
 	default:
 		BUG();
 	}
 
 	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
+		struct mmc_data *data = mrq->data;
 		if (!mrq->cmd->error && data && !data->error)
 			data->bytes_xfered =
 				data->blocks * data->blksz;
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-22  6:49                         ` [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts Guennadi Liakhovetski
@ 2012-08-22 12:16                           ` Tetsuyuki Kobayashi
  2012-08-23  7:11                             ` Guennadi Liakhovetski
  2012-08-31  3:05                           ` Tetsuyuki Kobayashi
                                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-08-22 12:16 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc, koba

Hello Guennadi,

Thank you for your patch. I will test this next week.

 > One more question - is this only needed for 3.7 or also for 3.6 / stable?

I hope this also for 3.6 / stable because it is more robust.
The other hand, we need investigate why this strange interrupt happens.

(2012/08/22 15:49), Guennadi Liakhovetski wrote:
> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
> interrupts without any active request. To prevent the Oops, that results
> in such cases, don't dereference the mmc request pointer until we make
> sure, that we are indeed processing such a request.
>
> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> ---
>
> Hello Kobayashi-san
>
> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>
> ...
>
>> After applying this patch on kzm9g board, I got this error regarding eMMC.
>> I think this is another problem.
>>
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000008
>> pgd = c0004000
>> [00000008] *pgd=00000000
>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>> Modules linked in:
>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>> PC is at sh_mmcif_irqt+0x20/0xb30
>> LR is at irq_thread+0x94/0x16c
>
> [snip]
>
>> My quick fix is below.
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..e587fbc 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>>   {
>>          struct sh_mmcif_host *host = dev_id;
>>          struct mmc_request *mrq = host->mrq;
>> -       struct mmc_data *data = mrq->data;
>> +       /*struct mmc_data *data = mrq->data; -- this cause null pointer access*/
>> +       struct mmc_data *data;
>> +
>> +       /* quick fix by koba */
>> +       if (mrq == NULL) {
>> +               printk("sh_mmcif_irqt: mrq == NULL: host->wait_for=%d\n", host->wait_for);
>> +       } else {
>> +               data = mrq->data;
>> +       }
>>
>>          cancel_delayed_work_sync(&host->timeout_work);
>>
>>
>> With this patch, there is no null pointer accesses and got this log.
>>
>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>    ...
>>
>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>> There is code such like:
>>
>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>         host->mrq = NULL;
>>
>> So, at the top of sh_mmcif_irqt, if host->wait_for == MMCIF_WAIT_FOR_REQUEST,
>> host->mrq = NULL.
>> It is too earlier to access mrq->data before checking host->mrq. it may
>> cause null pointer access.
>>
>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>
> Thanks for your report and a fix. Could you please double-check, whether
> the below patch also fixes your problem? Since such spurious interrupts
> are possible I would commit a check like this one, but in the longer run
> we want to identify and eliminate them, if possible. But since so far
> these interrupts only happen on 1 board model and also not on all units
> and not upon each boot, this could be a bit tricky.
>
> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>
> Thanks
> Guennadi
>
>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> index 5d81427..82bf921 100644
> --- a/drivers/mmc/host/sh_mmcif.c
> +++ b/drivers/mmc/host/sh_mmcif.c
> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>   {
>   	struct sh_mmcif_host *host = dev_id;
>   	struct mmc_request *mrq = host->mrq;
> -	struct mmc_data *data = mrq->data;
>
>   	cancel_delayed_work_sync(&host->timeout_work);
>
> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>   	case MMCIF_WAIT_FOR_READ_END:
>   	case MMCIF_WAIT_FOR_WRITE_END:
>   		if (host->sd_error)
> -			data->error = sh_mmcif_error_manage(host);
> +			mrq->data->error = sh_mmcif_error_manage(host);
>   		break;
>   	default:
>   		BUG();
>   	}
>
>   	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
> +		struct mmc_data *data = mrq->data;
>   		if (!mrq->cmd->error && data && !data->error)
>   			data->bytes_xfered =
>   				data->blocks * data->blksz;
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-22 12:16                           ` Tetsuyuki Kobayashi
@ 2012-08-23  7:11                             ` Guennadi Liakhovetski
  2012-09-04  7:40                               ` Tetsuyuki Kobayashi
  0 siblings, 1 reply; 15+ messages in thread
From: Guennadi Liakhovetski @ 2012-08-23  7:11 UTC (permalink / raw)
  To: Tetsuyuki Kobayashi
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc

Hello Kobayashi-san

On Wed, 22 Aug 2012, Tetsuyuki Kobayashi wrote:

> Hello Guennadi,
> 
> Thank you for your patch. I will test this next week.

Great, thanks very much! I'l also try to fund some time early in September 
to test on my board. Could you please send me your .config kernel 
configuration (off-list)?

Thanks
Guennadi

> > One more question - is this only needed for 3.7 or also for 3.6 / stable?
> 
> I hope this also for 3.6 / stable because it is more robust.
> The other hand, we need investigate why this strange interrupt happens.
> 
> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
> > On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
> > interrupts without any active request. To prevent the Oops, that results
> > in such cases, don't dereference the mmc request pointer until we make
> > sure, that we are indeed processing such a request.
> > 
> > Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> > Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> > ---
> > 
> > Hello Kobayashi-san
> > 
> > On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
> > 
> > ...
> > 
> > > After applying this patch on kzm9g board, I got this error regarding eMMC.
> > > I think this is another problem.
> > > 
> > > 
> > > Unable to handle kernel NULL pointer dereference at virtual address
> > > 00000008
> > > pgd = c0004000
> > > [00000008] *pgd=00000000
> > > Internal error: Oops: 17 [#1] PREEMPT SMP ARM
> > > Modules linked in:
> > > CPU: 1    Not tainted  (3.6.0-rc2+ #103)
> > > PC is at sh_mmcif_irqt+0x20/0xb30
> > > LR is at irq_thread+0x94/0x16c
> > 
> > [snip]
> > 
> > > My quick fix is below.
> > > 
> > > diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> > > index 5d81427..e587fbc 100644
> > > --- a/drivers/mmc/host/sh_mmcif.c
> > > +++ b/drivers/mmc/host/sh_mmcif.c
> > > @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> > > *dev_id)
> > >   {
> > >          struct sh_mmcif_host *host = dev_id;
> > >          struct mmc_request *mrq = host->mrq;
> > > -       struct mmc_data *data = mrq->data;
> > > +       /*struct mmc_data *data = mrq->data; -- this cause null pointer
> > > access*/
> > > +       struct mmc_data *data;
> > > +
> > > +       /* quick fix by koba */
> > > +       if (mrq == NULL) {
> > > +               printk("sh_mmcif_irqt: mrq == NULL: host->wait_for=%d\n",
> > > host->wait_for);
> > > +       } else {
> > > +               data = mrq->data;
> > > +       }
> > > 
> > >          cancel_delayed_work_sync(&host->timeout_work);
> > > 
> > > 
> > > With this patch, there is no null pointer accesses and got this log.
> > > 
> > > sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> > > sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> > >    ...
> > > 
> > > host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
> > > There is code such like:
> > > 
> > >         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
> > >         host->mrq = NULL;
> > > 
> > > So, at the top of sh_mmcif_irqt, if host->wait_for ==
> > > MMCIF_WAIT_FOR_REQUEST,
> > > host->mrq = NULL.
> > > It is too earlier to access mrq->data before checking host->mrq. it may
> > > cause null pointer access.
> > > 
> > > Goda-san, could you check this and refine the code of sh_mmcif_irqt?
> > 
> > Thanks for your report and a fix. Could you please double-check, whether
> > the below patch also fixes your problem? Since such spurious interrupts
> > are possible I would commit a check like this one, but in the longer run
> > we want to identify and eliminate them, if possible. But since so far
> > these interrupts only happen on 1 board model and also not on all units
> > and not upon each boot, this could be a bit tricky.
> > 
> > One more question - is this only needed for 3.7 or also for 3.6 / stable?
> > 
> > Thanks
> > Guennadi
> > 
> >   drivers/mmc/host/sh_mmcif.c |    4 ++--
> >   1 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> > index 5d81427..82bf921 100644
> > --- a/drivers/mmc/host/sh_mmcif.c
> > +++ b/drivers/mmc/host/sh_mmcif.c
> > @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> > *dev_id)
> >   {
> >   	struct sh_mmcif_host *host = dev_id;
> >   	struct mmc_request *mrq = host->mrq;
> > -	struct mmc_data *data = mrq->data;
> > 
> >   	cancel_delayed_work_sync(&host->timeout_work);
> > 
> > @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> > *dev_id)
> >   	case MMCIF_WAIT_FOR_READ_END:
> >   	case MMCIF_WAIT_FOR_WRITE_END:
> >   		if (host->sd_error)
> > -			data->error = sh_mmcif_error_manage(host);
> > +			mrq->data->error = sh_mmcif_error_manage(host);
> >   		break;
> >   	default:
> >   		BUG();
> >   	}
> > 
> >   	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
> > +		struct mmc_data *data = mrq->data;
> >   		if (!mrq->cmd->error && data && !data->error)
> >   			data->bytes_xfered =
> >   				data->blocks * data->blksz;
> > 
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-22  6:49                         ` [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts Guennadi Liakhovetski
  2012-08-22 12:16                           ` Tetsuyuki Kobayashi
@ 2012-08-31  3:05                           ` Tetsuyuki Kobayashi
  2012-09-18  6:13                             ` Tetsuyuki Kobayashi
  2012-09-19  2:50                           ` Tetsuyuki Kobayashi
  2012-09-19  6:24                           ` Chris Ball
  3 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-08-31  3:05 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc

Hello Guennadi

(2012/08/22 15:49), Guennadi Liakhovetski wrote:
> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
> interrupts without any active request. To prevent the Oops, that results
> in such cases, don't dereference the mmc request pointer until we make
> sure, that we are indeed processing such a request.
>
> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> ---
>
> Hello Kobayashi-san
>
> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>
> ...
>
>> After applying this patch on kzm9g board, I got this error regarding eMMC.
>> I think this is another problem.
>>
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000008
>> pgd = c0004000
>> [00000008] *pgd=00000000
>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>> Modules linked in:
>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>> PC is at sh_mmcif_irqt+0x20/0xb30
>> LR is at irq_thread+0x94/0x16c
>
> [snip]
>
>> My quick fix is below.
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..e587fbc 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>>   {
>>          struct sh_mmcif_host *host = dev_id;
>>          struct mmc_request *mrq = host->mrq;
>> -       struct mmc_data *data = mrq->data;
>> +       /*struct mmc_data *data = mrq->data; -- this cause null pointer access*/
>> +       struct mmc_data *data;
>> +
>> +       /* quick fix by koba */
>> +       if (mrq == NULL) {
>> +               printk("sh_mmcif_irqt: mrq == NULL: host->wait_for=%d\n", host->wait_for);
>> +       } else {
>> +               data = mrq->data;
>> +       }
>>
>>          cancel_delayed_work_sync(&host->timeout_work);
>>
>>
>> With this patch, there is no null pointer accesses and got this log.
>>
>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>    ...
>>
>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>> There is code such like:
>>
>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>         host->mrq = NULL;
>>
>> So, at the top of sh_mmcif_irqt, if host->wait_for == MMCIF_WAIT_FOR_REQUEST,
>> host->mrq = NULL.
>> It is too earlier to access mrq->data before checking host->mrq. it may
>> cause null pointer access.
>>
>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>
> Thanks for your report and a fix. Could you please double-check, whether
> the below patch also fixes your problem? Since such spurious interrupts
> are possible I would commit a check like this one, but in the longer run
> we want to identify and eliminate them, if possible. But since so far
> these interrupts only happen on 1 board model and also not on all units
> and not upon each boot, this could be a bit tricky.
>
> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>
> Thanks
> Guennadi
>
>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> index 5d81427..82bf921 100644
> --- a/drivers/mmc/host/sh_mmcif.c
> +++ b/drivers/mmc/host/sh_mmcif.c
> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>   {
>   	struct sh_mmcif_host *host = dev_id;
>   	struct mmc_request *mrq = host->mrq;
> -	struct mmc_data *data = mrq->data;
>
>   	cancel_delayed_work_sync(&host->timeout_work);
>
> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>   	case MMCIF_WAIT_FOR_READ_END:
>   	case MMCIF_WAIT_FOR_WRITE_END:
>   		if (host->sd_error)
> -			data->error = sh_mmcif_error_manage(host);
> +			mrq->data->error = sh_mmcif_error_manage(host);
>   		break;
>   	default:
>   		BUG();
>   	}
>
>   	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
> +		struct mmc_data *data = mrq->data;
>   		if (!mrq->cmd->error && data && !data->error)
>   			data->bytes_xfered =
>   				data->blocks * data->blksz;
>

I tried this patch. It seems better.
But I think this still have potential race condition.
I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to 
host->wait_for for new request at the same time.
How about add this code at the top of sh_mmcif_irqt or before returning 
IRQ_WAKE_THREAD in sh_mmcif_intr ?

	if (host->state == STATE_IDLE)
		return IRQ_HANDLED;

I will rebase my test environment to v3.6-rc3 or later. Then I will
send you my .config.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-23  7:11                             ` Guennadi Liakhovetski
@ 2012-09-04  7:40                               ` Tetsuyuki Kobayashi
  0 siblings, 0 replies; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-04  7:40 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: linux-sh, linux-mmc, koba

[-- Attachment #1: Type: text/plain, Size: 5636 bytes --]

Hello Guennadi,

(2012/08/23 16:11), Guennadi Liakhovetski wrote:> Hello Kobayashi-san
 >
 > On Wed, 22 Aug 2012, Tetsuyuki Kobayashi wrote:
 >
 >> Hello Guennadi,
 >>
 >> Thank you for your patch. I will test this next week.
 >
 > Great, thanks very much! I'l also try to fund some time early in 
September
 > to test on my board. Could you please send me your .config kernel
 > configuration (off-list)?

I attached my .config file for kzm9g board.

My working source tree is:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas.git
checkout kzm9g branch

apply these 3 patches:

[PATCH] ARM: mach-shmobile: Add compilation support for dtbs using 'make 
dtbs'
http://www.spinics.net/lists/linux-sh/msg12970.html

Paul's irqdomain patch
http://www.spinics.net/lists/linux-sh/msg12760.html

[PATCH] ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
http://www.spinics.net/lists/linux-sh/msg12876.html

and then, apply your patch for sh_mmcif.c

Simon's Booting DT kernel on a non-DT bootloader is helpful, too.
http://www.spinics.net/lists/linux-sh/msg13051.html

I made 1 partition of eMMC by fdisk command and made ext4 file system on it.
The spurious interrupts occurs at boot time and when accesses file sytem
on eMMC.


>> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>>> interrupts without any active request. To prevent the Oops, that results
>>> in such cases, don't dereference the mmc request pointer until we make
>>> sure, that we are indeed processing such a request.
>>>
>>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>>> ---
>>>
>>> Hello Kobayashi-san
>>>
>>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>>>
>>> ...
>>>
>>>> After applying this patch on kzm9g board, I got this error regarding eMMC.
>>>> I think this is another problem.
>>>>
>>>>
>>>> Unable to handle kernel NULL pointer dereference at virtual address
>>>> 00000008
>>>> pgd = c0004000
>>>> [00000008] *pgd=00000000
>>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>>>> Modules linked in:
>>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>>>> PC is at sh_mmcif_irqt+0x20/0xb30
>>>> LR is at irq_thread+0x94/0x16c
>>>
>>> [snip]
>>>
>>>> My quick fix is below.
>>>>
>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>>> index 5d81427..e587fbc 100644
>>>> --- a/drivers/mmc/host/sh_mmcif.c
>>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>> *dev_id)
>>>>    {
>>>>           struct sh_mmcif_host *host = dev_id;
>>>>           struct mmc_request *mrq = host->mrq;
>>>> -       struct mmc_data *data = mrq->data;
>>>> +       /*struct mmc_data *data = mrq->data; -- this cause null pointer
>>>> access*/
>>>> +       struct mmc_data *data;
>>>> +
>>>> +       /* quick fix by koba */
>>>> +       if (mrq == NULL) {
>>>> +               printk("sh_mmcif_irqt: mrq == NULL: host->wait_for=%d\n",
>>>> host->wait_for);
>>>> +       } else {
>>>> +               data = mrq->data;
>>>> +       }
>>>>
>>>>           cancel_delayed_work_sync(&host->timeout_work);
>>>>
>>>>
>>>> With this patch, there is no null pointer accesses and got this log.
>>>>
>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>>     ...
>>>>
>>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>>>> There is code such like:
>>>>
>>>>          host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>>>          host->mrq = NULL;
>>>>
>>>> So, at the top of sh_mmcif_irqt, if host->wait_for ==
>>>> MMCIF_WAIT_FOR_REQUEST,
>>>> host->mrq = NULL.
>>>> It is too earlier to access mrq->data before checking host->mrq. it may
>>>> cause null pointer access.
>>>>
>>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>>>
>>> Thanks for your report and a fix. Could you please double-check, whether
>>> the below patch also fixes your problem? Since such spurious interrupts
>>> are possible I would commit a check like this one, but in the longer run
>>> we want to identify and eliminate them, if possible. But since so far
>>> these interrupts only happen on 1 board model and also not on all units
>>> and not upon each boot, this could be a bit tricky.
>>>
>>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>>>
>>> Thanks
>>> Guennadi
>>>
>>>    drivers/mmc/host/sh_mmcif.c |    4 ++--
>>>    1 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>> index 5d81427..82bf921 100644
>>> --- a/drivers/mmc/host/sh_mmcif.c
>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>> *dev_id)
>>>    {
>>>    	struct sh_mmcif_host *host = dev_id;
>>>    	struct mmc_request *mrq = host->mrq;
>>> -	struct mmc_data *data = mrq->data;
>>>
>>>    	cancel_delayed_work_sync(&host->timeout_work);
>>>
>>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>> *dev_id)
>>>    	case MMCIF_WAIT_FOR_READ_END:
>>>    	case MMCIF_WAIT_FOR_WRITE_END:
>>>    		if (host->sd_error)
>>> -			data->error = sh_mmcif_error_manage(host);
>>> +			mrq->data->error = sh_mmcif_error_manage(host);
>>>    		break;
>>>    	default:
>>>    		BUG();
>>>    	}
>>>
>>>    	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>>> +		struct mmc_data *data = mrq->data;
>>>    		if (!mrq->cmd->error && data && !data->error)
>>>    			data->bytes_xfered =
>>>    				data->blocks * data->blksz;
>>>



[-- Attachment #2: kzm9g_koba_defconfig --]
[-- Type: text/plain, Size: 3783 bytes --]

# CONFIG_ARM_PATCH_PHYS_VIRT is not set
CONFIG_EXPERIMENTAL=y
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SYSVIPC=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=16
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_EMBEDDED=y
CONFIG_SLAB=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARCH_SHMOBILE=y
CONFIG_KEYBOARD_GPIO_POLLED=y
CONFIG_ARCH_SH73A0=y
CONFIG_MACH_KZM9G=y
CONFIG_MEMORY_START=0x41000000
CONFIG_MEMORY_SIZE=0x1f000000
CONFIG_PL310_ERRATA_588369=y
CONFIG_PL310_ERRATA_727915=y
CONFIG_ARM_ERRATA_743622=y
CONFIG_ARM_ERRATA_751472=y
CONFIG_ARM_ERRATA_754322=y
CONFIG_ARM_ERRATA_754327=y
CONFIG_ARM_ERRATA_764369=y
CONFIG_PL310_ERRATA_769419=y
CONFIG_SMP=y
CONFIG_SCHED_MC=y
CONFIG_PREEMPT=y
CONFIG_AEABI=y
# CONFIG_OABI_COMPAT is not set
CONFIG_HIGHMEM=y
CONFIG_ZBOOT_ROM_TEXT=0x0
CONFIG_ZBOOT_ROM_BSS=0x0
CONFIG_ARM_APPENDED_DTB=y
CONFIG_CMDLINE="console=tty0 console=ttySC4,115200 root=/dev/mmcblk0p2 rootfstype=ext4 rw ip=dhcp ignore_loglevel earlyprintk=sh-sci.4,115200"
CONFIG_CMDLINE_FORCE=y
CONFIG_KEXEC=y
CONFIG_VFP=y
CONFIG_NEON=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_PM_RUNTIME=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
# CONFIG_INET_DIAG is not set
# CONFIG_IPV6 is not set
CONFIG_IRDA=y
CONFIG_SH_IRDA=y
# CONFIG_WIRELESS is not set
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_NETDEVICES=y
CONFIG_SMSC911X=y
# CONFIG_WLAN is not set
CONFIG_INPUT_SPARSEKMAP=y
# CONFIG_INPUT_MOUSEDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_KEYBOARD_ATKBD is not set
# CONFIG_INPUT_MOUSE is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_ST1232=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_SH_SCI=y
CONFIG_SERIAL_SH_SCI_NR_UARTS=9
CONFIG_SERIAL_SH_SCI_CONSOLE=y
# CONFIG_HW_RANDOM is not set
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_SH_MOBILE=y
CONFIG_GPIO_PCF857X=y
# CONFIG_HWMON is not set
CONFIG_FB=y
CONFIG_FB_SH_MOBILE_LCDC=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_LOGO=y
CONFIG_FB_SH_MOBILE_MERAM=y
CONFIG_SOUND=y
CONFIG_SND=y
# CONFIG_SND_SUPPORT_OLD_API is not set
# CONFIG_SND_VERBOSE_PROCFS is not set
# CONFIG_SND_DRIVERS is not set
# CONFIG_SND_ARM is not set
# CONFIG_SND_USB is not set
CONFIG_SND_SOC=y
CONFIG_SND_SOC_SH4_FSI=y
CONFIG_USB=y
CONFIG_USB_R8A66597_HCD=y
CONFIG_USB_RENESAS_USBHS=y
CONFIG_USB_STORAGE=y
CONFIG_USB_GADGET=y
CONFIG_USB_RENESAS_USBHS_UDC=y
CONFIG_USB_ETH=m
CONFIG_USB_MASS_STORAGE=m
CONFIG_MMC=y
# CONFIG_MMC_BLOCK_BOUNCE is not set
CONFIG_MMC_SDHI=y
CONFIG_MMC_SH_MMCIF=y
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_DRV_RS5C372=y
CONFIG_DMADEVICES=y
CONFIG_SH_DMAE=y
CONFIG_ASYNC_TX_DMA=y
CONFIG_STAGING=y
CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=y
CONFIG_EXT4_FS=y
CONFIG_VFAT_FS=y
CONFIG_TMPFS=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_PRINTK_TIME=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
# CONFIG_SCHED_DEBUG is not set
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_DEBUG_INFO=y
# CONFIG_FTRACE is not set
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_DES=y

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-31  3:05                           ` Tetsuyuki Kobayashi
@ 2012-09-18  6:13                             ` Tetsuyuki Kobayashi
  2012-09-18  6:42                               ` Guennadi Liakhovetski
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-18  6:13 UTC (permalink / raw)
  Cc: Guennadi Liakhovetski, yusuke.goda.sx, Kuninori Morimoto,
	Paul Mundt, Magnus, linux-sh, Kuninori Morimoto, linux-mmc, koba

Hello Guennadi

(2012/08/31 12:05), Tetsuyuki Kobayashi wrote:

> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>> interrupts without any active request. To prevent the Oops, that results
>> in such cases, don't dereference the mmc request pointer until we make
>> sure, that we are indeed processing such a request.
>>
>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>> ---
>>
>> Hello Kobayashi-san
>>
>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>>
>> ...
>>
>>> After applying this patch on kzm9g board, I got this error regarding
>>> eMMC.
>>> I think this is another problem.
>>>
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 00000008
>>> pgd = c0004000
>>> [00000008] *pgd=00000000
>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>>> Modules linked in:
>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>>> PC is at sh_mmcif_irqt+0x20/0xb30
>>> LR is at irq_thread+0x94/0x16c
>>
>> [snip]
>>
>>> My quick fix is below.
>>>
>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>> index 5d81427..e587fbc 100644
>>> --- a/drivers/mmc/host/sh_mmcif.c
>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>> *dev_id)
>>>   {
>>>          struct sh_mmcif_host *host = dev_id;
>>>          struct mmc_request *mrq = host->mrq;
>>> -       struct mmc_data *data = mrq->data;
>>> +       /*struct mmc_data *data = mrq->data; -- this cause null
>>> pointer access*/
>>> +       struct mmc_data *data;
>>> +
>>> +       /* quick fix by koba */
>>> +       if (mrq == NULL) {
>>> +               printk("sh_mmcif_irqt: mrq == NULL:
>>> host->wait_for=%d\n", host->wait_for);
>>> +       } else {
>>> +               data = mrq->data;
>>> +       }
>>>
>>>          cancel_delayed_work_sync(&host->timeout_work);
>>>
>>>
>>> With this patch, there is no null pointer accesses and got this log.
>>>
>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>    ...
>>>
>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>>> There is code such like:
>>>
>>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>>         host->mrq = NULL;
>>>
>>> So, at the top of sh_mmcif_irqt, if host->wait_for ==
>>> MMCIF_WAIT_FOR_REQUEST,
>>> host->mrq = NULL.
>>> It is too earlier to access mrq->data before checking host->mrq. it may
>>> cause null pointer access.
>>>
>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>>
>> Thanks for your report and a fix. Could you please double-check, whether
>> the below patch also fixes your problem? Since such spurious interrupts
>> are possible I would commit a check like this one, but in the longer run
>> we want to identify and eliminate them, if possible. But since so far
>> these interrupts only happen on 1 board model and also not on all units
>> and not upon each boot, this could be a bit tricky.
>>
>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>>
>> Thanks
>> Guennadi
>>
>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..82bf921 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>> *dev_id)
>>   {
>>       struct sh_mmcif_host *host = dev_id;
>>       struct mmc_request *mrq = host->mrq;
>> -    struct mmc_data *data = mrq->data;
>>
>>       cancel_delayed_work_sync(&host->timeout_work);
>>
>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>> *dev_id)
>>       case MMCIF_WAIT_FOR_READ_END:
>>       case MMCIF_WAIT_FOR_WRITE_END:
>>           if (host->sd_error)
>> -            data->error = sh_mmcif_error_manage(host);
>> +            mrq->data->error = sh_mmcif_error_manage(host);
>>           break;
>>       default:
>>           BUG();
>>       }
>>
>>       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>> +        struct mmc_data *data = mrq->data;
>>           if (!mrq->cmd->error && data && !data->error)
>>               data->bytes_xfered =
>>                   data->blocks * data->blksz;
>>
>
> I tried this patch. It seems better.
> But I think this still have potential race condition.
> I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
> host->wait_for for new request at the same time.
> How about add this code at the top of sh_mmcif_irqt or before returning
> IRQ_WAKE_THREAD in sh_mmcif_intr ?
>
>      if (host->state == STATE_IDLE)
>          return IRQ_HANDLED;
>
> I will rebase my test environment to v3.6-rc3 or later. Then I will
> send you my .config.
>
How is this?
I hope this fixed in v3.6.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-18  6:13                             ` Tetsuyuki Kobayashi
@ 2012-09-18  6:42                               ` Guennadi Liakhovetski
  2012-09-18  8:02                                 ` Tetsuyuki Kobayashi
  0 siblings, 1 reply; 15+ messages in thread
From: Guennadi Liakhovetski @ 2012-09-18  6:42 UTC (permalink / raw)
  To: Tetsuyuki Kobayashi
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc

Hello Kobayashi-san

On Tue, 18 Sep 2012, Tetsuyuki Kobayashi wrote:

> Hello Guennadi
> 
> (2012/08/31 12:05), Tetsuyuki Kobayashi wrote:
> 
> > (2012/08/22 15:49), Guennadi Liakhovetski wrote:
> > > On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
> > > interrupts without any active request. To prevent the Oops, that results
> > > in such cases, don't dereference the mmc request pointer until we make
> > > sure, that we are indeed processing such a request.
> > > 
> > > Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> > > Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> > > ---
> > > 
> > > Hello Kobayashi-san
> > > 
> > > On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
> > > 
> > > ...
> > > 
> > > > After applying this patch on kzm9g board, I got this error regarding
> > > > eMMC.
> > > > I think this is another problem.
> > > > 
> > > > 
> > > > Unable to handle kernel NULL pointer dereference at virtual address
> > > > 00000008
> > > > pgd = c0004000
> > > > [00000008] *pgd=00000000
> > > > Internal error: Oops: 17 [#1] PREEMPT SMP ARM
> > > > Modules linked in:
> > > > CPU: 1    Not tainted  (3.6.0-rc2+ #103)
> > > > PC is at sh_mmcif_irqt+0x20/0xb30
> > > > LR is at irq_thread+0x94/0x16c
> > > 
> > > [snip]
> > > 
> > > > My quick fix is below.
> > > > 
> > > > diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> > > > index 5d81427..e587fbc 100644
> > > > --- a/drivers/mmc/host/sh_mmcif.c
> > > > +++ b/drivers/mmc/host/sh_mmcif.c
> > > > @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> > > > *dev_id)
> > > >   {
> > > >          struct sh_mmcif_host *host = dev_id;
> > > >          struct mmc_request *mrq = host->mrq;
> > > > -       struct mmc_data *data = mrq->data;
> > > > +       /*struct mmc_data *data = mrq->data; -- this cause null
> > > > pointer access*/
> > > > +       struct mmc_data *data;
> > > > +
> > > > +       /* quick fix by koba */
> > > > +       if (mrq == NULL) {
> > > > +               printk("sh_mmcif_irqt: mrq == NULL:
> > > > host->wait_for=%d\n", host->wait_for);
> > > > +       } else {
> > > > +               data = mrq->data;
> > > > +       }
> > > > 
> > > >          cancel_delayed_work_sync(&host->timeout_work);
> > > > 
> > > > 
> > > > With this patch, there is no null pointer accesses and got this log.
> > > > 
> > > > sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> > > > sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> > > >    ...
> > > > 
> > > > host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
> > > > There is code such like:
> > > > 
> > > >         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
> > > >         host->mrq = NULL;
> > > > 
> > > > So, at the top of sh_mmcif_irqt, if host->wait_for ==
> > > > MMCIF_WAIT_FOR_REQUEST,
> > > > host->mrq = NULL.
> > > > It is too earlier to access mrq->data before checking host->mrq. it may
> > > > cause null pointer access.
> > > > 
> > > > Goda-san, could you check this and refine the code of sh_mmcif_irqt?
> > > 
> > > Thanks for your report and a fix. Could you please double-check, whether
> > > the below patch also fixes your problem? Since such spurious interrupts
> > > are possible I would commit a check like this one, but in the longer run
> > > we want to identify and eliminate them, if possible. But since so far
> > > these interrupts only happen on 1 board model and also not on all units
> > > and not upon each boot, this could be a bit tricky.
> > > 
> > > One more question - is this only needed for 3.7 or also for 3.6 / stable?
> > > 
> > > Thanks
> > > Guennadi
> > > 
> > >   drivers/mmc/host/sh_mmcif.c |    4 ++--
> > >   1 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> > > index 5d81427..82bf921 100644
> > > --- a/drivers/mmc/host/sh_mmcif.c
> > > +++ b/drivers/mmc/host/sh_mmcif.c
> > > @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> > > *dev_id)
> > >   {
> > >       struct sh_mmcif_host *host = dev_id;
> > >       struct mmc_request *mrq = host->mrq;
> > > -    struct mmc_data *data = mrq->data;
> > > 
> > >       cancel_delayed_work_sync(&host->timeout_work);
> > > 
> > > @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> > > *dev_id)
> > >       case MMCIF_WAIT_FOR_READ_END:
> > >       case MMCIF_WAIT_FOR_WRITE_END:
> > >           if (host->sd_error)
> > > -            data->error = sh_mmcif_error_manage(host);
> > > +            mrq->data->error = sh_mmcif_error_manage(host);
> > >           break;
> > >       default:
> > >           BUG();
> > >       }
> > > 
> > >       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
> > > +        struct mmc_data *data = mrq->data;
> > >           if (!mrq->cmd->error && data && !data->error)
> > >               data->bytes_xfered =
> > >                   data->blocks * data->blksz;
> > > 
> > 
> > I tried this patch. It seems better.
> > But I think this still have potential race condition.
> > I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
> > host->wait_for for new request at the same time.
> > How about add this code at the top of sh_mmcif_irqt or before returning
> > IRQ_WAKE_THREAD in sh_mmcif_intr ?
> > 
> >      if (host->state == STATE_IDLE)
> >          return IRQ_HANDLED;
> > 
> > I will rebase my test environment to v3.6-rc3 or later. Then I will
> > send you my .config.
> > 
> How is this?
> I hope this fixed in v3.6.

Sorry, I still haven't come round to looking at this. I think, one thing 
could halp, if you could try to find out what exactly those spurious 
interrupts look like, i.e., what's their interrupt status? You could try 
to print like

diff -u a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
--- a/drivers/mmc/host/sh_mmcif.c
+++ b/drivers/mmc/host/sh_mmcif.c
@@ -1229,6 +1229,10 @@
 		host->sd_error = true;
 		dev_dbg(&host->pd->dev, "int err state = %08x\n", state);
 	}
+	if (host->state == STATE_IDLE) {
+		dev_info(&host->pd->dev, "Spurious IRQ status 0x%x", state);
+		return IRQ_HANDLED;
+	}
 	if (state & ~(INT_CMD12RBE | INT_CMD12CRE)) {
 		if (!host->dma_active)
 			return IRQ_WAKE_THREAD;

Please, let me know, if it's not very easy for you ATM to perform the 
test, I'll try to do it myself then.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-18  6:42                               ` Guennadi Liakhovetski
@ 2012-09-18  8:02                                 ` Tetsuyuki Kobayashi
  2012-09-18  8:44                                   ` Tetsuyuki Kobayashi
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-18  8:02 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc, Tetsuyuki Kobayashi

Hello Guennadi

(09/18/2012 03:42 PM), Guennadi Liakhovetski wrote:
> Hello Kobayashi-san
> 
> On Tue, 18 Sep 2012, Tetsuyuki Kobayashi wrote:
> 
>> Hello Guennadi
>>
>> (2012/08/31 12:05), Tetsuyuki Kobayashi wrote:
>>
>>> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>>>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>>>> interrupts without any active request. To prevent the Oops, that results
>>>> in such cases, don't dereference the mmc request pointer until we make
>>>> sure, that we are indeed processing such a request.
>>>>
>>>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>>>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>>>> ---
>>>>
>>>> Hello Kobayashi-san
>>>>
>>>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>>>>
>>>> ...
>>>>
>>>>> After applying this patch on kzm9g board, I got this error regarding
>>>>> eMMC.
>>>>> I think this is another problem.
>>>>>
>>>>>
>>>>> Unable to handle kernel NULL pointer dereference at virtual address
>>>>> 00000008
>>>>> pgd = c0004000
>>>>> [00000008] *pgd=00000000
>>>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>>>>> Modules linked in:
>>>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>>>>> PC is at sh_mmcif_irqt+0x20/0xb30
>>>>> LR is at irq_thread+0x94/0x16c
>>>>
>>>> [snip]
>>>>
>>>>> My quick fix is below.
>>>>>
>>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>>>> index 5d81427..e587fbc 100644
>>>>> --- a/drivers/mmc/host/sh_mmcif.c
>>>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>>> *dev_id)
>>>>>   {
>>>>>          struct sh_mmcif_host *host = dev_id;
>>>>>          struct mmc_request *mrq = host->mrq;
>>>>> -       struct mmc_data *data = mrq->data;
>>>>> +       /*struct mmc_data *data = mrq->data; -- this cause null
>>>>> pointer access*/
>>>>> +       struct mmc_data *data;
>>>>> +
>>>>> +       /* quick fix by koba */
>>>>> +       if (mrq == NULL) {
>>>>> +               printk("sh_mmcif_irqt: mrq == NULL:
>>>>> host->wait_for=%d\n", host->wait_for);
>>>>> +       } else {
>>>>> +               data = mrq->data;
>>>>> +       }
>>>>>
>>>>>          cancel_delayed_work_sync(&host->timeout_work);
>>>>>
>>>>>
>>>>> With this patch, there is no null pointer accesses and got this log.
>>>>>
>>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>>>    ...
>>>>>
>>>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>>>>> There is code such like:
>>>>>
>>>>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>>>>         host->mrq = NULL;
>>>>>
>>>>> So, at the top of sh_mmcif_irqt, if host->wait_for ==
>>>>> MMCIF_WAIT_FOR_REQUEST,
>>>>> host->mrq = NULL.
>>>>> It is too earlier to access mrq->data before checking host->mrq. it may
>>>>> cause null pointer access.
>>>>>
>>>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>>>>
>>>> Thanks for your report and a fix. Could you please double-check, whether
>>>> the below patch also fixes your problem? Since such spurious interrupts
>>>> are possible I would commit a check like this one, but in the longer run
>>>> we want to identify and eliminate them, if possible. But since so far
>>>> these interrupts only happen on 1 board model and also not on all units
>>>> and not upon each boot, this could be a bit tricky.
>>>>
>>>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>>>>
>>>> Thanks
>>>> Guennadi
>>>>
>>>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>>>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>>> index 5d81427..82bf921 100644
>>>> --- a/drivers/mmc/host/sh_mmcif.c
>>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>> *dev_id)
>>>>   {
>>>>       struct sh_mmcif_host *host = dev_id;
>>>>       struct mmc_request *mrq = host->mrq;
>>>> -    struct mmc_data *data = mrq->data;
>>>>
>>>>       cancel_delayed_work_sync(&host->timeout_work);
>>>>
>>>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>> *dev_id)
>>>>       case MMCIF_WAIT_FOR_READ_END:
>>>>       case MMCIF_WAIT_FOR_WRITE_END:
>>>>           if (host->sd_error)
>>>> -            data->error = sh_mmcif_error_manage(host);
>>>> +            mrq->data->error = sh_mmcif_error_manage(host);
>>>>           break;
>>>>       default:
>>>>           BUG();
>>>>       }
>>>>
>>>>       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>>>> +        struct mmc_data *data = mrq->data;
>>>>           if (!mrq->cmd->error && data && !data->error)
>>>>               data->bytes_xfered =
>>>>                   data->blocks * data->blksz;
>>>>
>>>
>>> I tried this patch. It seems better.
>>> But I think this still have potential race condition.
>>> I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
>>> host->wait_for for new request at the same time.
>>> How about add this code at the top of sh_mmcif_irqt or before returning
>>> IRQ_WAKE_THREAD in sh_mmcif_intr ?
>>>
>>>      if (host->state == STATE_IDLE)
>>>          return IRQ_HANDLED;
>>>
>>> I will rebase my test environment to v3.6-rc3 or later. Then I will
>>> send you my .config.
>>>
>> How is this?
>> I hope this fixed in v3.6.
> 
> Sorry, I still haven't come round to looking at this. I think, one thing 
> could halp, if you could try to find out what exactly those spurious 
> interrupts look like, i.e., what's their interrupt status? You could try 
> to print like
> 
> diff -u a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> --- a/drivers/mmc/host/sh_mmcif.c
> +++ b/drivers/mmc/host/sh_mmcif.c
> @@ -1229,6 +1229,10 @@
>  		host->sd_error = true;
>  		dev_dbg(&host->pd->dev, "int err state = %08x\n", state);
>  	}
> +	if (host->state == STATE_IDLE) {
> +		dev_info(&host->pd->dev, "Spurious IRQ status 0x%x", state);
> +		return IRQ_HANDLED;
> +	}
>  	if (state & ~(INT_CMD12RBE | INT_CMD12CRE)) {
>  		if (!host->dma_active)
>  			return IRQ_WAKE_THREAD;
> 
> Please, let me know, if it's not very easy for you ATM to perform the 
> test, I'll try to do it myself then.

OK. It is easy for me.
I got this log after mounting /dev/mmcblk2p1 (on eMMC) and executing
tar command to make file accesses.
This is based on v3.6-rc6.

[  149.968750] EXT4-fs (mmcblk2p1): warning: maximal mount count reached, running e2fsck is recommended
[  150.296875] EXT4-fs (mmcblk2p1): mounted filesystem with ordered data mode. Opts: (null)
[  221.539062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  221.585937] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  222.039062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  222.226562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  222.382812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.109375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.406250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.734375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.750000] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  224.398437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  230.289062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-18  8:02                                 ` Tetsuyuki Kobayashi
@ 2012-09-18  8:44                                   ` Tetsuyuki Kobayashi
  2012-09-18  8:56                                     ` Guennadi Liakhovetski
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-18  8:44 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Tetsuyuki Kobayashi, yusuke.goda.sx, Kuninori Morimoto,
	Paul Mundt, Magnus, linux-sh, Kuninori Morimoto, linux-mmc

Hello Guennadi

(09/18/2012 05:02 PM), Tetsuyuki Kobayashi wrote:

>>>> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>>>>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>>>>> interrupts without any active request. To prevent the Oops, that results
>>>>> in such cases, don't dereference the mmc request pointer until we make
>>>>> sure, that we are indeed processing such a request.
>>>>>
>>>>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>>>>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>>>>> ---
>>>>>
>>>>> Hello Kobayashi-san
>>>>>
>>>>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>> After applying this patch on kzm9g board, I got this error regarding
>>>>>> eMMC.
>>>>>> I think this is another problem.
>>>>>>
>>>>>>
>>>>>> Unable to handle kernel NULL pointer dereference at virtual address
>>>>>> 00000008
>>>>>> pgd = c0004000
>>>>>> [00000008] *pgd=00000000
>>>>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>>>>>> Modules linked in:
>>>>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>>>>>> PC is at sh_mmcif_irqt+0x20/0xb30
>>>>>> LR is at irq_thread+0x94/0x16c
>>>>>
>>>>> [snip]
>>>>>
>>>>>> My quick fix is below.
>>>>>>
>>>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>>>>> index 5d81427..e587fbc 100644
>>>>>> --- a/drivers/mmc/host/sh_mmcif.c
>>>>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>>>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>>>> *dev_id)
>>>>>>   {
>>>>>>          struct sh_mmcif_host *host = dev_id;
>>>>>>          struct mmc_request *mrq = host->mrq;
>>>>>> -       struct mmc_data *data = mrq->data;
>>>>>> +       /*struct mmc_data *data = mrq->data; -- this cause null
>>>>>> pointer access*/
>>>>>> +       struct mmc_data *data;
>>>>>> +
>>>>>> +       /* quick fix by koba */
>>>>>> +       if (mrq == NULL) {
>>>>>> +               printk("sh_mmcif_irqt: mrq == NULL:
>>>>>> host->wait_for=%d\n", host->wait_for);
>>>>>> +       } else {
>>>>>> +               data = mrq->data;
>>>>>> +       }
>>>>>>
>>>>>>          cancel_delayed_work_sync(&host->timeout_work);
>>>>>>
>>>>>>
>>>>>> With this patch, there is no null pointer accesses and got this log.
>>>>>>
>>>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>>>>>    ...
>>>>>>
>>>>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>>>>>> There is code such like:
>>>>>>
>>>>>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>>>>>         host->mrq = NULL;
>>>>>>
>>>>>> So, at the top of sh_mmcif_irqt, if host->wait_for ==
>>>>>> MMCIF_WAIT_FOR_REQUEST,
>>>>>> host->mrq = NULL.
>>>>>> It is too earlier to access mrq->data before checking host->mrq. it may
>>>>>> cause null pointer access.
>>>>>>
>>>>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>>>>>
>>>>> Thanks for your report and a fix. Could you please double-check, whether
>>>>> the below patch also fixes your problem? Since such spurious interrupts
>>>>> are possible I would commit a check like this one, but in the longer run
>>>>> we want to identify and eliminate them, if possible. But since so far
>>>>> these interrupts only happen on 1 board model and also not on all units
>>>>> and not upon each boot, this could be a bit tricky.
>>>>>
>>>>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>>>>>
>>>>> Thanks
>>>>> Guennadi
>>>>>
>>>>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>>>>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>>>>> index 5d81427..82bf921 100644
>>>>> --- a/drivers/mmc/host/sh_mmcif.c
>>>>> +++ b/drivers/mmc/host/sh_mmcif.c
>>>>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>>> *dev_id)
>>>>>   {
>>>>>       struct sh_mmcif_host *host = dev_id;
>>>>>       struct mmc_request *mrq = host->mrq;
>>>>> -    struct mmc_data *data = mrq->data;
>>>>>
>>>>>       cancel_delayed_work_sync(&host->timeout_work);
>>>>>
>>>>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
>>>>> *dev_id)
>>>>>       case MMCIF_WAIT_FOR_READ_END:
>>>>>       case MMCIF_WAIT_FOR_WRITE_END:
>>>>>           if (host->sd_error)
>>>>> -            data->error = sh_mmcif_error_manage(host);
>>>>> +            mrq->data->error = sh_mmcif_error_manage(host);
>>>>>           break;
>>>>>       default:
>>>>>           BUG();
>>>>>       }
>>>>>
>>>>>       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>>>>> +        struct mmc_data *data = mrq->data;
>>>>>           if (!mrq->cmd->error && data && !data->error)
>>>>>               data->bytes_xfered =
>>>>>                   data->blocks * data->blksz;
>>>>>
>>>>
>>>> I tried this patch. It seems better.
>>>> But I think this still have potential race condition.
>>>> I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
>>>> host->wait_for for new request at the same time.
>>>> How about add this code at the top of sh_mmcif_irqt or before returning
>>>> IRQ_WAKE_THREAD in sh_mmcif_intr ?
>>>>
>>>>      if (host->state == STATE_IDLE)
>>>>          return IRQ_HANDLED;
>>>>
>>>> I will rebase my test environment to v3.6-rc3 or later. Then I will
>>>> send you my .config.
>>>>
>>> How is this?
>>> I hope this fixed in v3.6.
>>
>> Sorry, I still haven't come round to looking at this. I think, one thing 
>> could halp, if you could try to find out what exactly those spurious 
>> interrupts look like, i.e., what's their interrupt status? You could try 
>> to print like
>>
>> diff -u a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1229,6 +1229,10 @@
>>  		host->sd_error = true;
>>  		dev_dbg(&host->pd->dev, "int err state = %08x\n", state);
>>  	}
>> +	if (host->state == STATE_IDLE) {
>> +		dev_info(&host->pd->dev, "Spurious IRQ status 0x%x", state);
>> +		return IRQ_HANDLED;
>> +	}
>>  	if (state & ~(INT_CMD12RBE | INT_CMD12CRE)) {
>>  		if (!host->dma_active)
>>  			return IRQ_WAKE_THREAD;
>>
>> Please, let me know, if it's not very easy for you ATM to perform the 
>> test, I'll try to do it myself then.
> 
> OK. It is easy for me.
> I got this log after mounting /dev/mmcblk2p1 (on eMMC) and executing
> tar command to make file accesses.
> This is based on v3.6-rc6.
> 
> [  149.968750] EXT4-fs (mmcblk2p1): warning: maximal mount count reached, running e2fsck is recommended
> [  150.296875] EXT4-fs (mmcblk2p1): mounted filesystem with ordered data mode. Opts: (null)
> [  221.539062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  221.585937] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  222.039062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  222.226562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  222.382812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  223.109375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  223.406250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  223.734375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  223.750000] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  224.398437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> [  230.289062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> 
It might too earlier to report. I got this log.

[  149.968750] EXT4-fs (mmcblk2p1): warning: maximal mount count reached, running e2fsck is recommended
[  150.296875] EXT4-fs (mmcblk2p1): mounted filesystem with ordered data mode. Opts: (null)
[  221.539062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  221.585937] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  222.039062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  222.226562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  222.382812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.109375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.406250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.734375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  223.750000] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  224.398437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  230.289062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[  998.554687] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1063.695312] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1663.203125] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1675.007812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1683.898437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1684.187500] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1684.343750] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1686.390625] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1687.843750] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1687.921875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1687.976562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1692.101562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1692.281250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1692.476562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1692.960937] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1693.062500] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1696.195312] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1702.140625] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3800000
[ 1702.148437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1719.453125] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1719.476562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.171875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3800000
[ 1721.171875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.484375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.507812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.781250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.804687] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.828125] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1721.851562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1722.171875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1723.437500] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1723.453125] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1723.484375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1723.507812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1723.531250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
[ 1934.296875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[ 1935.125000] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
[ 1968.226562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000


After that, I tried the same thing without DMA by comment out like this.
In this case spurious IRQ never occurred. 

diff --git a/arch/arm/mach-shmobile/board-kzm9g.c b/arch/arm/mach-shmobile/board
index 765f60a..d5e6609 100644
--- a/arch/arm/mach-shmobile/board-kzm9g.c
+++ b/arch/arm/mach-shmobile/board-kzm9g.c
@@ -389,8 +389,8 @@ static struct resource sh_mmcif_resources[] = {
 static struct sh_mmcif_plat_data sh_mmcif_platdata = {
        .ocr            = MMC_VDD_165_195,
        .caps           = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE,
-       .slave_id_tx    = SHDMA_SLAVE_MMCIF_TX,
-       .slave_id_rx    = SHDMA_SLAVE_MMCIF_RX,
+       /*.slave_id_tx  = SHDMA_SLAVE_MMCIF_TX,*/
+       /*.slave_id_rx  = SHDMA_SLAVE_MMCIF_RX,*/
 };




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-18  8:44                                   ` Tetsuyuki Kobayashi
@ 2012-09-18  8:56                                     ` Guennadi Liakhovetski
  0 siblings, 0 replies; 15+ messages in thread
From: Guennadi Liakhovetski @ 2012-09-18  8:56 UTC (permalink / raw)
  To: Tetsuyuki Kobayashi
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc

On Tue, 18 Sep 2012, Tetsuyuki Kobayashi wrote:

> Hello Guennadi
> 
> (09/18/2012 05:02 PM), Tetsuyuki Kobayashi wrote:
> 
> >>>> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
> >>>>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
> >>>>> interrupts without any active request. To prevent the Oops, that results
> >>>>> in such cases, don't dereference the mmc request pointer until we make
> >>>>> sure, that we are indeed processing such a request.
> >>>>>
> >>>>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> >>>>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> >>>>> ---
> >>>>>
> >>>>> Hello Kobayashi-san
> >>>>>
> >>>>> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
> >>>>>
> >>>>> ...
> >>>>>
> >>>>>> After applying this patch on kzm9g board, I got this error regarding
> >>>>>> eMMC.
> >>>>>> I think this is another problem.
> >>>>>>
> >>>>>>
> >>>>>> Unable to handle kernel NULL pointer dereference at virtual address
> >>>>>> 00000008
> >>>>>> pgd = c0004000
> >>>>>> [00000008] *pgd=00000000
> >>>>>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
> >>>>>> Modules linked in:
> >>>>>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
> >>>>>> PC is at sh_mmcif_irqt+0x20/0xb30
> >>>>>> LR is at irq_thread+0x94/0x16c
> >>>>>
> >>>>> [snip]
> >>>>>
> >>>>>> My quick fix is below.
> >>>>>>
> >>>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> >>>>>> index 5d81427..e587fbc 100644
> >>>>>> --- a/drivers/mmc/host/sh_mmcif.c
> >>>>>> +++ b/drivers/mmc/host/sh_mmcif.c
> >>>>>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> >>>>>> *dev_id)
> >>>>>>   {
> >>>>>>          struct sh_mmcif_host *host = dev_id;
> >>>>>>          struct mmc_request *mrq = host->mrq;
> >>>>>> -       struct mmc_data *data = mrq->data;
> >>>>>> +       /*struct mmc_data *data = mrq->data; -- this cause null
> >>>>>> pointer access*/
> >>>>>> +       struct mmc_data *data;
> >>>>>> +
> >>>>>> +       /* quick fix by koba */
> >>>>>> +       if (mrq == NULL) {
> >>>>>> +               printk("sh_mmcif_irqt: mrq == NULL:
> >>>>>> host->wait_for=%d\n", host->wait_for);
> >>>>>> +       } else {
> >>>>>> +               data = mrq->data;
> >>>>>> +       }
> >>>>>>
> >>>>>>          cancel_delayed_work_sync(&host->timeout_work);
> >>>>>>
> >>>>>>
> >>>>>> With this patch, there is no null pointer accesses and got this log.
> >>>>>>
> >>>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> >>>>>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
> >>>>>>    ...
> >>>>>>
> >>>>>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
> >>>>>> There is code such like:
> >>>>>>
> >>>>>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
> >>>>>>         host->mrq = NULL;
> >>>>>>
> >>>>>> So, at the top of sh_mmcif_irqt, if host->wait_for ==
> >>>>>> MMCIF_WAIT_FOR_REQUEST,
> >>>>>> host->mrq = NULL.
> >>>>>> It is too earlier to access mrq->data before checking host->mrq. it may
> >>>>>> cause null pointer access.
> >>>>>>
> >>>>>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
> >>>>>
> >>>>> Thanks for your report and a fix. Could you please double-check, whether
> >>>>> the below patch also fixes your problem? Since such spurious interrupts
> >>>>> are possible I would commit a check like this one, but in the longer run
> >>>>> we want to identify and eliminate them, if possible. But since so far
> >>>>> these interrupts only happen on 1 board model and also not on all units
> >>>>> and not upon each boot, this could be a bit tricky.
> >>>>>
> >>>>> One more question - is this only needed for 3.7 or also for 3.6 / stable?
> >>>>>
> >>>>> Thanks
> >>>>> Guennadi
> >>>>>
> >>>>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
> >>>>>   1 files changed, 2 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> >>>>> index 5d81427..82bf921 100644
> >>>>> --- a/drivers/mmc/host/sh_mmcif.c
> >>>>> +++ b/drivers/mmc/host/sh_mmcif.c
> >>>>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> >>>>> *dev_id)
> >>>>>   {
> >>>>>       struct sh_mmcif_host *host = dev_id;
> >>>>>       struct mmc_request *mrq = host->mrq;
> >>>>> -    struct mmc_data *data = mrq->data;
> >>>>>
> >>>>>       cancel_delayed_work_sync(&host->timeout_work);
> >>>>>
> >>>>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void
> >>>>> *dev_id)
> >>>>>       case MMCIF_WAIT_FOR_READ_END:
> >>>>>       case MMCIF_WAIT_FOR_WRITE_END:
> >>>>>           if (host->sd_error)
> >>>>> -            data->error = sh_mmcif_error_manage(host);
> >>>>> +            mrq->data->error = sh_mmcif_error_manage(host);
> >>>>>           break;
> >>>>>       default:
> >>>>>           BUG();
> >>>>>       }
> >>>>>
> >>>>>       if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
> >>>>> +        struct mmc_data *data = mrq->data;
> >>>>>           if (!mrq->cmd->error && data && !data->error)
> >>>>>               data->bytes_xfered =
> >>>>>                   data->blocks * data->blksz;
> >>>>>
> >>>>
> >>>> I tried this patch. It seems better.
> >>>> But I think this still have potential race condition.
> >>>> I am afraid that one cpu enter sh_mmcif_irqt and other cpu write to
> >>>> host->wait_for for new request at the same time.
> >>>> How about add this code at the top of sh_mmcif_irqt or before returning
> >>>> IRQ_WAKE_THREAD in sh_mmcif_intr ?
> >>>>
> >>>>      if (host->state == STATE_IDLE)
> >>>>          return IRQ_HANDLED;
> >>>>
> >>>> I will rebase my test environment to v3.6-rc3 or later. Then I will
> >>>> send you my .config.
> >>>>
> >>> How is this?
> >>> I hope this fixed in v3.6.
> >>
> >> Sorry, I still haven't come round to looking at this. I think, one thing 
> >> could halp, if you could try to find out what exactly those spurious 
> >> interrupts look like, i.e., what's their interrupt status? You could try 
> >> to print like
> >>
> >> diff -u a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> >> --- a/drivers/mmc/host/sh_mmcif.c
> >> +++ b/drivers/mmc/host/sh_mmcif.c
> >> @@ -1229,6 +1229,10 @@
> >>  		host->sd_error = true;
> >>  		dev_dbg(&host->pd->dev, "int err state = %08x\n", state);
> >>  	}
> >> +	if (host->state == STATE_IDLE) {
> >> +		dev_info(&host->pd->dev, "Spurious IRQ status 0x%x", state);
> >> +		return IRQ_HANDLED;
> >> +	}
> >>  	if (state & ~(INT_CMD12RBE | INT_CMD12CRE)) {
> >>  		if (!host->dma_active)
> >>  			return IRQ_WAKE_THREAD;
> >>
> >> Please, let me know, if it's not very easy for you ATM to perform the 
> >> test, I'll try to do it myself then.
> > 
> > OK. It is easy for me.
> > I got this log after mounting /dev/mmcblk2p1 (on eMMC) and executing
> > tar command to make file accesses.
> > This is based on v3.6-rc6.
> > 
> > [  149.968750] EXT4-fs (mmcblk2p1): warning: maximal mount count reached, running e2fsck is recommended
> > [  150.296875] EXT4-fs (mmcblk2p1): mounted filesystem with ordered data mode. Opts: (null)
> > [  221.539062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  221.585937] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  222.039062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  222.226562] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  222.382812] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  223.109375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  223.406250] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  223.734375] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  223.750000] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  224.398437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > [  230.289062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
> > 
> It might too earlier to report. I got this log.
> 
> [  149.968750] EXT4-fs (mmcblk2p1): warning: maximal mount count reached, running e2fsck is recommended
> [  150.296875] EXT4-fs (mmcblk2p1): mounted filesystem with ordered data mode. Opts: (null)
> [  221.539062] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
...
> [  998.554687] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
...
> [ 1702.140625] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3800000
> [ 1702.148437] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
...
> [ 1721.171875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3800000
> [ 1721.171875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x3000000
...
> [ 1934.296875] sh_mmcif sh_mmcif.0: Spurious IRQ status 0x7400000
...

So, I see 3 IRQ types there: auto CMD12 only (0x3000000), auto CMD12 with 
read completed (0x7400000), auto CMD12 with "Data Transmission Complete."

So, all of them are related to automatic CMD12 execution. I think, we 
should try to finc out now, how and when this auto CMD12 gets enabled, 
when it should be disabled again, and whether we have a problem with the 
logic there. Possibly, eMMC processing of CMD12 is different from normal 
MMC cards.

> After that, I tried the same thing without DMA by comment out like this.
> In this case spurious IRQ never occurred. 
> 
> diff --git a/arch/arm/mach-shmobile/board-kzm9g.c b/arch/arm/mach-shmobile/board
> index 765f60a..d5e6609 100644
> --- a/arch/arm/mach-shmobile/board-kzm9g.c
> +++ b/arch/arm/mach-shmobile/board-kzm9g.c
> @@ -389,8 +389,8 @@ static struct resource sh_mmcif_resources[] = {
>  static struct sh_mmcif_plat_data sh_mmcif_platdata = {
>         .ocr            = MMC_VDD_165_195,
>         .caps           = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE,
> -       .slave_id_tx    = SHDMA_SLAVE_MMCIF_TX,
> -       .slave_id_rx    = SHDMA_SLAVE_MMCIF_RX,
> +       /*.slave_id_tx  = SHDMA_SLAVE_MMCIF_TX,*/
> +       /*.slave_id_rx  = SHDMA_SLAVE_MMCIF_RX,*/
>  };
> 
> 

Interesting. I don't so far see how this is related, but it's interesting 
to know.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-22  6:49                         ` [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts Guennadi Liakhovetski
  2012-08-22 12:16                           ` Tetsuyuki Kobayashi
  2012-08-31  3:05                           ` Tetsuyuki Kobayashi
@ 2012-09-19  2:50                           ` Tetsuyuki Kobayashi
  2012-09-26  1:47                             ` Tetsuyuki Kobayashi
  2012-09-19  6:24                           ` Chris Ball
  3 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-19  2:50 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: yusuke.goda.sx, Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto, linux-mmc

(2012/08/22 15:49), Guennadi Liakhovetski wrote:
> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
> interrupts without any active request. To prevent the Oops, that results
> in such cases, don't dereference the mmc request pointer until we make
> sure, that we are indeed processing such a request.
>
> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>

I verified on kzm9g.
This works with
[PATCH] mmc: sh-mmcif: properly handle MMC_WRITE_MULTIPLE_BLOCK 
completion IRQ

Tested-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>

> ---
>
> Hello Kobayashi-san
>
> On Mon, 20 Aug 2012, Tetsuyuki Kobayashi wrote:
>
> ...
>
>> After applying this patch on kzm9g board, I got this error regarding eMMC.
>> I think this is another problem.
>>
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000008
>> pgd = c0004000
>> [00000008] *pgd=00000000
>> Internal error: Oops: 17 [#1] PREEMPT SMP ARM
>> Modules linked in:
>> CPU: 1    Not tainted  (3.6.0-rc2+ #103)
>> PC is at sh_mmcif_irqt+0x20/0xb30
>> LR is at irq_thread+0x94/0x16c
>
> [snip]
>
>> My quick fix is below.
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..e587fbc 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,15 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>>   {
>>          struct sh_mmcif_host *host = dev_id;
>>          struct mmc_request *mrq = host->mrq;
>> -       struct mmc_data *data = mrq->data;
>> +       /*struct mmc_data *data = mrq->data; -- this cause null pointer access*/
>> +       struct mmc_data *data;
>> +
>> +       /* quick fix by koba */
>> +       if (mrq == NULL) {
>> +               printk("sh_mmcif_irqt: mrq == NULL: host->wait_for=%d\n", host->wait_for);
>> +       } else {
>> +               data = mrq->data;
>> +       }
>>
>>          cancel_delayed_work_sync(&host->timeout_work);
>>
>>
>> With this patch, there is no null pointer accesses and got this log.
>>
>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>> sh_mmcif_irqt: mrq == NULL: host->wait_for=0
>>    ...
>>
>> host->wait_for is 0. it is MMCIF_WAIT_FOR_REQUEST.
>> There is code such like:
>>
>>         host->wait_for = MMCIF_WAIT_FOR_REQUEST;
>>         host->mrq = NULL;
>>
>> So, at the top of sh_mmcif_irqt, if host->wait_for == MMCIF_WAIT_FOR_REQUEST,
>> host->mrq = NULL.
>> It is too earlier to access mrq->data before checking host->mrq. it may
>> cause null pointer access.
>>
>> Goda-san, could you check this and refine the code of sh_mmcif_irqt?
>
> Thanks for your report and a fix. Could you please double-check, whether
> the below patch also fixes your problem? Since such spurious interrupts
> are possible I would commit a check like this one, but in the longer run
> we want to identify and eliminate them, if possible. But since so far
> these interrupts only happen on 1 board model and also not on all units
> and not upon each boot, this could be a bit tricky.
>
> One more question - is this only needed for 3.7 or also for 3.6 / stable?
>
> Thanks
> Guennadi
>
>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
> index 5d81427..82bf921 100644
> --- a/drivers/mmc/host/sh_mmcif.c
> +++ b/drivers/mmc/host/sh_mmcif.c
> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>   {
>   	struct sh_mmcif_host *host = dev_id;
>   	struct mmc_request *mrq = host->mrq;
> -	struct mmc_data *data = mrq->data;
>
>   	cancel_delayed_work_sync(&host->timeout_work);
>
> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>   	case MMCIF_WAIT_FOR_READ_END:
>   	case MMCIF_WAIT_FOR_WRITE_END:
>   		if (host->sd_error)
> -			data->error = sh_mmcif_error_manage(host);
> +			mrq->data->error = sh_mmcif_error_manage(host);
>   		break;
>   	default:
>   		BUG();
>   	}
>
>   	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
> +		struct mmc_data *data = mrq->data;
>   		if (!mrq->cmd->error && data && !data->error)
>   			data->bytes_xfered =
>   				data->blocks * data->blksz;
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-08-22  6:49                         ` [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts Guennadi Liakhovetski
                                             ` (2 preceding siblings ...)
  2012-09-19  2:50                           ` Tetsuyuki Kobayashi
@ 2012-09-19  6:24                           ` Chris Ball
  2012-09-21  2:35                             ` Tetsuyuki Kobayashi
  3 siblings, 1 reply; 15+ messages in thread
From: Chris Ball @ 2012-09-19  6:24 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Tetsuyuki Kobayashi, yusuke.goda.sx, Kuninori Morimoto,
	Paul Mundt, Magnus, linux-sh, Kuninori Morimoto, linux-mmc

Hi,

On Wed, Aug 22 2012, Guennadi Liakhovetski wrote:
> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious 
> interrupts without any active request. To prevent the Oops, that results 
> in such cases, don't dereference the mmc request pointer until we make 
> sure, that we are indeed processing such a request.
>
> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>

Thanks, pushed to mmc-next for 3.7.

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-19  6:24                           ` Chris Ball
@ 2012-09-21  2:35                             ` Tetsuyuki Kobayashi
  0 siblings, 0 replies; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-21  2:35 UTC (permalink / raw)
  To: Chris Ball
  Cc: Guennadi Liakhovetski, yusuke.goda.sx, Kuninori Morimoto,
	Paul Mundt, Magnus, linux-sh, Kuninori Morimoto, linux-mmc, koba

Hello, Chris

(2012/09/19 15:24), Chris Ball wrote:
> Hi,
> 
> On Wed, Aug 22 2012, Guennadi Liakhovetski wrote:
>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>> interrupts without any active request. To prevent the Oops, that results
>> in such cases, don't dereference the mmc request pointer until we make
>> sure, that we are indeed processing such a request.
>>
>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> 
> Thanks, pushed to mmc-next for 3.7.

It needs this patch for kzm9g board to boot kernel 3.6-rc6 successfully.
Can I ask you to queue this patch for 3.6-rc7 ?



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-19  2:50                           ` Tetsuyuki Kobayashi
@ 2012-09-26  1:47                             ` Tetsuyuki Kobayashi
  2012-09-26 10:04                               ` Chris Ball
  0 siblings, 1 reply; 15+ messages in thread
From: Tetsuyuki Kobayashi @ 2012-09-26  1:47 UTC (permalink / raw)
  To: linux-mmc
  Cc: Guennadi Liakhovetski, yusuke.goda.sx, Kuninori Morimoto,
	Paul Mundt, Magnus, linux-sh, Kuninori Morimoto, cjb

Dear linux-mmc maintainer,

(09/19/2012 11:50 AM), Tetsuyuki Kobayashi wrote:
> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>> interrupts without any active request. To prevent the Oops, that results
>> in such cases, don't dereference the mmc request pointer until we make
>> sure, that we are indeed processing such a request.
>>
>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> 
> I verified on kzm9g.
> This works with
> [PATCH] mmc: sh-mmcif: properly handle MMC_WRITE_MULTIPLE_BLOCK 
> completion IRQ
> 
> Tested-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
> 
>> ---
[snip]
>>
>>   drivers/mmc/host/sh_mmcif.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
>> index 5d81427..82bf921 100644
>> --- a/drivers/mmc/host/sh_mmcif.c
>> +++ b/drivers/mmc/host/sh_mmcif.c
>> @@ -1104,7 +1104,6 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>>   {
>>   	struct sh_mmcif_host *host = dev_id;
>>   	struct mmc_request *mrq = host->mrq;
>> -	struct mmc_data *data = mrq->data;
>>
>>   	cancel_delayed_work_sync(&host->timeout_work);
>>
>> @@ -1152,13 +1151,14 @@ static irqreturn_t sh_mmcif_irqt(int irq, void *dev_id)
>>   	case MMCIF_WAIT_FOR_READ_END:
>>   	case MMCIF_WAIT_FOR_WRITE_END:
>>   		if (host->sd_error)
>> -			data->error = sh_mmcif_error_manage(host);
>> +			mrq->data->error = sh_mmcif_error_manage(host);
>>   		break;
>>   	default:
>>   		BUG();
>>   	}
>>
>>   	if (host->wait_for != MMCIF_WAIT_FOR_STOP) {
>> +		struct mmc_data *data = mrq->data;
>>   		if (!mrq->cmd->error && data && !data->error)
>>   			data->bytes_xfered =
>>   				data->blocks * data->blksz;
>>
> 

Without this patch, the following Oops occurs. (kzm9g on v3.6-rc7)
Please push this to v3.6, not only 3.7-next.


[   20.273437] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[   20.281250] pgd = c0004000
[   20.281250] [00000008] *pgd=00000000
[   20.281250] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[   20.281250] Modules linked in:
[   20.281250] CPU: 1    Not tainted  (3.6.0-rc7 #28)
[   20.281250] PC is at sh_mmcif_irqt+0x18/0xb1c
[   20.281250] LR is at irq_thread+0x90/0x15c
[   20.281250] pc : [<c0250250>]    lr : [<c005f180>]    psr: 60000113
[   20.281250] sp : de23df58  ip : 00000000  fp : 00000000
[   20.281250] r10: 00000000  r9 : de1dcab4  r8 : dd9f6360
[   20.281250] r7 : de23c000  r6 : de23c000  r5 : 00000000  r4 : de1dca80
[   20.281250] r3 : c0250238  r2 : 00000000  r1 : de1dca80  r0 : de1dcab4
[   20.281250] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[   20.281250] Control: 10c5387d  Table: 5eb4804a  DAC: 00000015
[   20.281250] Process irq/173-sh_mmc: (pid: 406, stack limit = 0xde23c2f0)
[   20.281250] Stack: (0xde23df58 to 0xde23e000)
[   20.281250] df40:                                                       00000003 00000000
[   20.281250] df60: c0331f6c de1dca80 c046634c c0468190 dd9ca800 00000001 00000000 dd9f6340
[   20.281250] df80: de00bc40 de23c000 de23c000 dd9f6360 00000000 00000000 00000000 c005f180
[   20.281250] dfa0: 00000000 de23dfa4 c005f02c de043e3c dd9f6340 de043e3c dd9f6340 c005f0f0
[   20.281250] dfc0: 00000013 00000000 00000000 c00387ac 00000000 dd9f6340 00000000 00000000
[   20.281250] dfe0: de23dfe0 de23dfe0 de043e3c c0038728 c000f178 c000f178 00000000 00000000
[   20.281250] [<c0250250>] (sh_mmcif_irqt+0x18/0xb1c) from [<c005f180>] (irq_thread+0x90/0x15c)
[   20.281250] [<c005f180>] (irq_thread+0x90/0x15c) from [<c00387ac>] (kthread+0x84/0x90)
[   20.281250] [<c00387ac>] (kthread+0x84/0x90) from [<c000f178>] (kernel_thread_exit+0x0/0x8)
[   20.281250] Code: e5915004 e1a04001 e24dd024 e1a00009 (e595a008)
[   20.281250] ---[ end trace 6efe730b0884a251 ]---





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts
  2012-09-26  1:47                             ` Tetsuyuki Kobayashi
@ 2012-09-26 10:04                               ` Chris Ball
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Ball @ 2012-09-26 10:04 UTC (permalink / raw)
  To: Tetsuyuki Kobayashi
  Cc: linux-mmc, Guennadi Liakhovetski, yusuke.goda.sx,
	Kuninori Morimoto, Paul Mundt, Magnus, linux-sh,
	Kuninori Morimoto

Hi Kobayashi,

On Tue, Sep 25 2012, Tetsuyuki Kobayashi wrote:
> (09/19/2012 11:50 AM), Tetsuyuki Kobayashi wrote:
>> (2012/08/22 15:49), Guennadi Liakhovetski wrote:
>>> On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
>>> interrupts without any active request. To prevent the Oops, that results
>>> in such cases, don't dereference the mmc request pointer until we make
>>> sure, that we are indeed processing such a request.
>>>
>>> Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>>> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>> 
>> I verified on kzm9g.
>> This works with
>> [PATCH] mmc: sh-mmcif: properly handle MMC_WRITE_MULTIPLE_BLOCK 
>> completion IRQ
>> 
>> Tested-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
>
> Without this patch, the following Oops occurs. (kzm9g on v3.6-rc7)
> Please push this to v3.6, not only 3.7-next.

I'm traveling from Shanghai to Boston (home) at the moment, so I can't
push this immediately.  I'll either get it into 3.6, or 3.7 with a tag
for 3.6-stable.

Thanks,

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-09-26 10:04 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <878vdxd3mq.wl%kuninori.morimoto.gx@renesas.com>
     [not found] ` <20120803050039.GA1614@linux-sh.org>
     [not found]   ` <20120809042844.GF1614@linux-sh.org>
     [not found]     ` <87hasc3bv5.wl%kuninori.morimoto.gx@renesas.com>
     [not found]       ` <874nobqntv.wl%kuninori.morimoto.gx@renesas.com>
     [not found]         ` <20120810123804.GK1614@linux-sh.org>
     [not found]           ` <502DDC97.5080501@kmckk.co.jp>
     [not found]             ` <87wr0us6tg.wl%kuninori.morimoto.gx@renesas.com>
     [not found]               ` <20120820031352.GC25767@linux-sh.org>
     [not found]                 ` <87obm6ry98.wl%kuninori.morimoto.gx@renesas.com>
     [not found]                   ` <20120820043853.GD25767@linux-sh.org>
     [not found]                     ` <87mx1qrx1x.wl%kuninori.morimoto.gx@renesas.com>
     [not found]                       ` <5031D9FF.8060801@kmckk.co.jp>
2012-08-22  6:49                         ` [PATCH] mmc: sh-mmcif: avoid Oops on spurious interrupts Guennadi Liakhovetski
2012-08-22 12:16                           ` Tetsuyuki Kobayashi
2012-08-23  7:11                             ` Guennadi Liakhovetski
2012-09-04  7:40                               ` Tetsuyuki Kobayashi
2012-08-31  3:05                           ` Tetsuyuki Kobayashi
2012-09-18  6:13                             ` Tetsuyuki Kobayashi
2012-09-18  6:42                               ` Guennadi Liakhovetski
2012-09-18  8:02                                 ` Tetsuyuki Kobayashi
2012-09-18  8:44                                   ` Tetsuyuki Kobayashi
2012-09-18  8:56                                     ` Guennadi Liakhovetski
2012-09-19  2:50                           ` Tetsuyuki Kobayashi
2012-09-26  1:47                             ` Tetsuyuki Kobayashi
2012-09-26 10:04                               ` Chris Ball
2012-09-19  6:24                           ` Chris Ball
2012-09-21  2:35                             ` Tetsuyuki Kobayashi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).