From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mga14.intel.com ([143.182.124.37])
	by merlin.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux))
	id 1WAaLK-00063x-4r
	for linux-mtd@lists.infradead.org; Tue, 04 Feb 2014 07:22:50 +0000
Message-ID: <1391498545.1795.29.camel@sauron.fi.intel.com>
Subject: Re: UBI leb_write_unlock NULL pointer Oops (continuation)
From: Artem Bityutskiy
To: Richard Weinberger
Date: Tue, 04 Feb 2014 09:22:25 +0200
In-Reply-To: <52EF9FFE.4020405@nod.at>
References: <52EF772D.8080207@nod.at> <52EF9FFE.4020405@nod.at>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Cc: "Wiedemer, Thorsten \(Lawo AG\)", "linux-mtd@lists.infradead.org"
Reply-To: dedekind1@gmail.com
List-Id: Linux MTD discussion mailing list

On Mon, 2014-02-03 at 14:56 +0100, Richard Weinberger wrote:
> On 03.02.2014 13:51, Wiedemer, Thorsten (Lawo AG) wrote:
> > Hi,
> >
> > I can reproduce it fairly regularly, but not really "quickly". At the moment, I can use a setup of about 70 identical devices.
> > A test over the last weekend resulted in 6 devices showing the bug.
> > What we have are multiple processes which write some data to the device at different intervals and sync it, because this data should be available after a power cut.
> > Perhaps I can force the error more often by writing test processes with shorter write/sync intervals.
> >
> > If I have further access to the "big" setup for some days, I will try to make a test without preemption.
>
> Hmm, ok.
> Please also apply this patch, just in case...
>
> diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
> index 0e11671d..48fd2aa 100644
> --- a/drivers/mtd/ubi/eba.c
> +++ b/drivers/mtd/ubi/eba.c
> @@ -301,6 +301,7 @@ static void leb_write_unlock(struct ubi_device *ubi, int vol_id, int lnum)
>
>  	spin_lock(&ubi->ltree_lock);
>  	le = ltree_lookup(ubi, vol_id, lnum);
> +	ubi_assert(le);
>  	le->users -= 1;
>  	ubi_assert(le->users >= 0);
>  	up_write(&le->mutex);

The UBI LEB locking is a bit over-designed. It could be simplified, and
maybe that would help in looking for the problem.

This report really does sound like there is something specific to
Thorsten's system which corrupts memory, and it is difficult to debug
this via the mailing list. Thorsten should start adding various checks
like this one and try to get closer to the root cause.

-- 
Best Regards,
Artem Bityutskiy