Re: ubifs_decompress: cannot decompress ...

From: Artem Bityutskiy <dedekind1@gmail.com>
To: "Matthew L. Creech" <mlcreech@gmail.com>
Cc: linux-mtd@lists.infradead.org
Subject: Re: ubifs_decompress: cannot decompress ...
Date: Thu, 09 Jun 2011 15:10:34 +0300
Message-ID: <1307621434.7374.78.camel@localhost>
In-Reply-To: <BANLkTimXGmANdV663BP2CLKZTJyzD+QLhQ@mail.gmail.com>

On Wed, 2011-06-08 at 13:50 -0400, Matthew L. Creech wrote:
> On Wed, Jun 8, 2011 at 10:11 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> >
> > Yes, it does look like this LEB might be garbage-collected. But it does
> > not have to be.
> >
> > Anyway, I can suggest several things.
> >
> > 1. If you have many occurrences of this error, try to gather some
> >   information about how the device was used, and whether it was
> >   uncleanly power-cut. Remember, I have often seen embedded devices
> >   that reboot incorrectly: when users reboot "normally", the device
> >   does not try to unmount the FS-es cleanly and just jumps to some HW
> >   reset function.
> >
> >   You can verify this by rebooting normally and checking if UBIFS says
> >   "recovery needed" or not. If it does - the reboot was not normal.
> >
> 
> Yes, it currently reboots uncleanly (though it does do a "sync"
> first).  I noticed this a while back, and the next release firmware
> will have it fixed.  However, it doesn't make a huge difference to us,
> because these devices are probably more likely to experience power
> loss than a software reboot, in the field at least.
> 
> > 2. This error may be due to memory corruption in some driver (e.g.,
> >   wireless or video), due to issues in the mtd driver, etc. Try to
> >   stress your system with slub/slab full checks enabled, and other
> >   debugging features which you can find in the "hacking" section of
> >   make menuconfig.
> >
> 
> Will do.
> 
> > 3. If my theory is true, then what may help is adding a check in the
> >   UBIFS recovery function. The recovery ends with an ubifs_leb_change()
> >   call. You need to check the last node there - is it full and correct?
> >   If not, you should print a loud warning and information like a LEB
> >   dump _before_ the change, and a dump of the buffer which we are going
> >   to write with ubifs_leb_change().
> >
> >   You'd probably need to deploy this check to the field if this issue
> >   is not easy to reproduce. Once you have this info, you may be able to
> >   fix the bug.
> >
> 
> Great, I'll add this check and see if we get any hits.  Even if it
> takes a while to hit it in the field, this would at least give us a
> way to make some progress in finding the issue.

With my latest code-base, I am able to inject a hack into
ubifs_leb_change() - but this function does not exist in your code-base.
Anyway, I'm currently running power cut emulation testing with the
following hack:


From df163f4dd8a1604fe3085c4d11281c530837bc53 Mon Sep 17 00:00:00 2001
From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Date: Thu, 9 Jun 2011 15:08:59 +0300
Subject: [PATCH] UBIFS: temporary: hack to check recovery

We suspect that recovery sometimes cuts nodes. This is a hack which should
catch such things. We hack ubifs_leb_change() and scan the LEB right after
changing it - if we wrote corrupted data there, the scan should fail.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/io.c |   24 ++++++++++++++++++++++++
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c
index 9228950..9f7dbbf 100644
--- a/fs/ubifs/io.c
+++ b/fs/ubifs/io.c
@@ -153,6 +153,30 @@ int ubifs_leb_change(struct ubifs_info *c, int lnum, const void *buf, int len,
 		ubifs_ro_mode(c, err);
 		dbg_dump_stack();
 	}
+
+	/* Temporary hack to catch incorrect recovery, if we have such */
+	if (!err && (lnum < c->lpt_first || lnum > c->lpt_last)) {
+		void *buf = vmalloc(c->leb_size);
+		struct ubifs_scan_leb *sleb;
+
+		if (!buf)
+			return 0;
+
+		sleb = ubifs_scan(c, lnum, 0, buf, 0);
+		if (!IS_ERR(sleb)) {
+			/* Scan succeeded */
+			vfree(buf);
+			return 0;
+		}
+
+		ubifs_err("scanning after LEB %d change failed, error %ld!", lnum, PTR_ERR(sleb));
+		print_hex_dump(KERN_ERR, "", DUMP_PREFIX_OFFSET, 32, 1,
+			       buf, c->leb_size, 1);
+		dump_stack();
+		vfree(buf);
+		return -EINVAL;
+	}
+
 	return err;
 }
 
-- 
1.7.2.3
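
If you want to play with the "is the last node full and correct?" idea
outside the kernel first, here is a rough user-space sketch. Everything in
it is made up for illustration: struct node_hdr is a cut-down stand-in for
the real struct ubifs_ch, and put_node(), check_nodes() and crc32_simple()
are hypothetical helpers, not UBIFS functions. It also assumes host byte
order and ignores the 8-byte node alignment UBIFS uses. The real check is
what ubifs_scan() does in the hack above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NODE_MAGIC 0x06101831u /* UBIFS node magic; the layout below is simplified */

/* Hypothetical, simplified node header (not the real struct ubifs_ch). */
struct node_hdr {
	uint32_t magic;
	uint32_t crc;	/* covers every byte of the node after this field */
	uint32_t len;	/* whole node length, header included */
};

/* Plain bitwise CRC-32, used here only as a stand-in checksum. */
static uint32_t crc32_simple(const uint8_t *p, size_t n)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < n; i++) {
		crc ^= p[i];
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Append one node at 'offs' (capacity 'cap'); returns the offset just past it. */
static size_t put_node(uint8_t *buf, size_t cap, size_t offs,
		       const void *payload, uint32_t plen)
{
	struct node_hdr h = { NODE_MAGIC, 0, (uint32_t)sizeof(h) + plen };

	assert(offs + h.len <= cap);
	memcpy(buf + offs + sizeof(h), payload, plen);
	memcpy(buf + offs, &h, sizeof(h));
	h.crc = crc32_simple(buf + offs + 8, h.len - 8);
	memcpy(buf + offs, &h, sizeof(h));
	return offs + h.len;
}

/*
 * Walk the nodes in buf[0..len). Returns 0 if every node is complete and its
 * checksum matches, or -1 with *bad_offs set to the offset of the first
 * corrupted or cut node (*bad_offs is meaningful only on -1). Trailing 0xFF
 * bytes are treated as empty space.
 */
static int check_nodes(const uint8_t *buf, size_t len, size_t *bad_offs)
{
	size_t offs = 0;

	while (offs < len) {
		struct node_hdr h;
		size_t left = len - offs, i;

		for (i = 0; i < left && buf[offs + i] == 0xFF; i++)
			;
		if (i == left)
			return 0;	/* only empty space remains */

		*bad_offs = offs;
		if (left < sizeof(h))
			return -1;	/* the header itself is cut */
		memcpy(&h, buf + offs, sizeof(h));
		if (h.magic != NODE_MAGIC)
			return -1;
		if (h.len < sizeof(h) || h.len > left)
			return -1;	/* node runs past the end of the data: cut */
		if (h.crc != crc32_simple(buf + offs + 8, h.len - 8))
			return -1;
		offs += h.len;
	}
	return 0;
}
```

To emulate a cut node, build a buffer with a couple of put_node() calls
and pass check_nodes() a length a few bytes short of the end of the last
node: it should return -1 and report the offset of that last node.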



-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
