linux-mtd.lists.infradead.org archive mirror
From: Richard Weinberger <richard@nod.at>
To: dedekind1@gmail.com
Cc: "Wiedemer, Thorsten \(Lawo AG\)" <Thorsten.Wiedemer@lawo.com>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: UBI leb_write_unlock NULL pointer Oops (continuation)
Date: Tue, 04 Feb 2014 08:46:17 +0100
Message-ID: <52F09AC9.6090604@nod.at>
In-Reply-To: <1391498545.1795.29.camel@sauron.fi.intel.com>

Am 04.02.2014 08:22, schrieb Artem Bityutskiy:
> On Mon, 2014-02-03 at 14:56 +0100, Richard Weinberger wrote:
>> Am 03.02.2014 13:51, schrieb Wiedemer, Thorsten (Lawo AG):
>>> Hi,
>>>
>>> I can reproduce it fairly regularly, but not really "quickly". At the moment, I can use a setup of about 70 identical devices.
>>> A test over the last weekend resulted in 6 devices showing the bug.
>>> What we have are multiple processes that write some data to the device at different intervals and sync it, because this data should be available after a power cut.
>>> Perhaps I can force the error more often by writing test processes with shorter write/sync intervals.
>>>
>>> If I have further access to the "big" setup for some days, I will try to make a test without preemption.
>>
>> Hmm, ok.
>> Please also apply this patch, just in case...
>>
>> diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
>> index 0e11671d..48fd2aa 100644
>> --- a/drivers/mtd/ubi/eba.c
>> +++ b/drivers/mtd/ubi/eba.c
>> @@ -301,6 +301,7 @@ static void leb_write_unlock(struct ubi_device *ubi, int vol_id, int lnum)
>>
>>  	spin_lock(&ubi->ltree_lock);
>>  	le = ltree_lookup(ubi, vol_id, lnum);
>> +	ubi_assert(le);
>>  	le->users -= 1;
>>  	ubi_assert(le->users >= 0);
>>  	up_write(&le->mutex);
> 
> The UBI LEB locking is a bit over-designed. It could be simplified;
> maybe that would help in looking for the problem.
> 
> This report does really sound like there is something specific to
> Thorsten's system which corrupts memory.

Thorsten sees:
Dec 25 03:59:22 kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000000c
(leb_write_unlock+0x74/0xf0) from [<c02d0d10>] (ubi_eba_write_leb+0x94/0x820)

In July 2013 we got this report from a user:
[  300.554525] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
(leb_write_unlock+0xa0/0xf4) from [<802850e0>] (ubi_eba_write_leb+0x568/0x80c)

In both cases we fault at address 0000000c and leb_write_unlock() was called by ubi_eba_write_leb().

The same user also saw the issue in the read path:

[   38.471134] Unable to handle kernel NULL pointer dereference at virtual address 00000000
(leb_read_unlock+0xa0/0xf4) from [<80285cdc>] (ubi_eba_read_leb+0x404/0x480)

In that case the fault happened at 00000000 directly.

That is a bit too deterministic for memory corruption, IMHO.

> And it is difficult to debug this via the mailing list. Thorsten should
> start adding various checks like this and try to come closer to the
> root-cause.

Yeah.
We also need more oopses; maybe we can spot a pattern.

Thanks,
//richard
