linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Charles Bertsch <cbertsch@cox.net>
Cc: linux-raid@vger.kernel.org, "BertschC@acm.org" <BertschC@acm.org>
Subject: Re: PROBLEM: write to jbod with 3TB and 160GB drives hits BUG/oops
Date: Mon, 27 Apr 2015 11:11:59 +1000
Message-ID: <20150427111159.46b9781a@notabene.brown>
In-Reply-To: <553AA35A.5050300@cox.net>

On Fri, 24 Apr 2015 13:11:06 -0700 Charles Bertsch <cbertsch@cox.net> wrote:

> On 04/23/2015 06:55 PM, NeilBrown wrote:
> >
> > By "jbod" I assume you mean "linear array".
> >
> > You say this happens without any filesystem on the array, yet the stack
> > traces clearly show ext2 in use.
> > Maybe some weird interaction is happening between the filesystem and the
> > linear array.
> > But please confirm that the stack trace happened when there was no filesystem
> > on the array you were testing, and report what filesystems you do have which
> > use ext2.
> >
> Neil --
> Yes, I do mean linear array.
> 
> At the point of the stack trace, there was no file-system on the linear 
> 2-drive array.  The test-jbod-2 script would create the array and then 
> write directly to /dev/md0.  Any evidence of previous existence of a 
> file-system would have been obliterated by earlier runs copying 
> /dev/zero everywhere.
> 
> The file-systems in use --
> -- The rootfs is an initrd file, squashfs, and mounted read-only.
> -- An ext3 for configuration and logs is mounted RW on /flash
> -- An ext2 using 8MB of RAM is mounted RW on /var
> -- The file-server is derived from a much earlier design that required 
> some RW directories within the root.  These entries appear in the mount 
> command as ext2, but are part of /var (and not separate file systems) --
> -- mount --bind /var/hd /hd
> -- mount --bind /var/home /home
> 
> -- A devtmpfs mounted on /dev, tmpfs on /dev/shm, proc on /proc, sysfs 
> on /sys, and another mount --bind from within /flash for nfs.
> 
> # mount
> /dev/root on / type squashfs (ro,relatime)
> devtmpfs on /dev type devtmpfs (rw,relatime,size=1002600k,nr_inodes=250650,mode=755)
> proc on /proc type proc (rw,relatime)
> sysfs on /sys type sysfs (rw,relatime)
> /dev/ram1 on /var type ext2 (rw,relatime,errors=continue)
> /dev/ram1 on /hd type ext2 (rw,relatime,errors=continue)
> /dev/ram1 on /home type ext2 (rw,relatime,errors=continue)
> tmpfs on /dev/shm type tmpfs (rw,relatime)
> /dev/sdb1 on /flash type ext3 (rw,noatime,errors=continue,commit=60,barrier=1,data=ordered)
> /dev/sdb1 on /var/lib/nfs type ext3 (rw,noatime,errors=continue,commit=60,barrier=1,data=ordered)
> nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
> #

Thanks for the details.
On the whole, I don't think your problem is directly related to md; more
likely it is a coincidence that it happened while you were using md.
But one never knows until the actual cause is found.

> 
> > Is there any chance you could use "git bisect" to find out exactly which
> > commit introduced the problem?  That is the most likely path to a
> > solution.
> >
> 
> 
> I am not familiar with "git bisect".  Would this be similar to 
> downloading a series of kernel releases from linux-3.3.5 up to 3.18.5 
> using a binary search to find which release (rather than which commit) 
> has the problem ?

Similar, but (some of) the boring work is done for you: git picks the
midpoint each time, and the search ends at a single commit rather than a
release.

It would be best to stick to mainline kernels for testing, i.e. just '3.x',
not '3.x.y' (the stable point releases are not in Linus's tree).

So presumably 3.3 works, and 3.18 fails.
In that case:

   git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
   cd linux
   git bisect start
   git bisect good v3.3
   git bisect bad v3.18

That should get you started, except that it seems to take an incredibly long
time.  So it is probably best to do the first few steps by hand,
e.g.

   git checkout v3.10

and test that.  If v3.10 is bad, try v3.7 next; if it is good, try v3.14.

Once you know which of those are good or bad, run e.g.
   git bisect start
   git bisect good v3.7
   git bisect bad v3.10

and that will check out a kernel somewhere in the middle and tell you there
are 14 (or so) steps to go.
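
The step count is roughly the base-2 logarithm of the number of commits
between the two endpoints.  A minimal sketch of that arithmetic, assuming a
linear history (kernel history is full of merges, so git's own estimate can
differ by a step or two); "bisect_steps" is a hypothetical helper name, not a
git command, and 16384 is a made-up round number rather than a measured count:

```shell
# bisect_steps N: smallest k such that 2^k >= N, i.e. the number of
# halvings needed to narrow N candidate commits down to one.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Illustrative input only (not the real size of any kernel range).
bisect_steps 16384
```

In a kernel clone the real commit count for a range can be obtained with
"git rev-list --count v3.7..v3.10" and passed to the function.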

Then build and test the kernel.  If it is good, run "git bisect good".
If bad, "git bisect bad".  Either way git then checks out the next kernel
to try.
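
If the good/bad loop is unfamiliar, it can be tried on a throwaway repository
before spending time on kernel builds.  The following is only a sketch under
stated assumptions: the repository, the 20 commits, the "BUG" marker file and
the "known-good" tag are all invented for the demonstration.  It also uses
"git bisect run", which lets a command answer good/bad automatically; that is
a substitute for the manual answers described above and is unlikely to suit a
test that needs a reboot into each kernel.

```shell
# Toy bisect in a temporary repository; nothing here touches a kernel tree.
bisect_demo() (
    set -e
    demo=$(mktemp -d)
    cd "$demo"
    git init -q
    git config user.name "Bisect Demo"
    git config user.email "demo@example.invalid"

    # 20 commits; the pretend bug (a file named BUG) first appears in commit 13.
    i=1
    while [ "$i" -le 20 ]; do
        echo "$i" > version
        if [ "$i" -eq 13 ]; then touch BUG; fi
        git add -A
        git commit -q -m "commit $i"
        if [ "$i" -eq 1 ]; then git tag known-good; fi
        i=$((i + 1))
    done

    git bisect start >/dev/null
    git bisect bad HEAD >/dev/null
    git bisect good known-good >/dev/null
    # Exit status 0 means "good", 1 means "bad" for the revision under test.
    git bisect run sh -c '! test -e BUG' >/dev/null

    # When bisection finishes, refs/bisect/bad names the first bad commit.
    first_bad=$(git log -1 --format=%s refs/bisect/bad)
    git bisect reset >/dev/null 2>&1
    cd /
    rm -rf "$demo"
    echo "$first_bad"
)

result=$(bisect_demo)
echo "first bad commit: $result"
```

With the layout above this should report "commit 13".  In a real bisection,
"git bisect log" shows the answers given so far and "git bisect reset" returns
to the original checkout.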

If you can persist through testing over a dozen kernels (takes some
patience!!) it should lead you to the commit that introduced the problem.
It is always best to be cautious before declaring a kernel 'good' - run the
test a few times.
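
One way to make "run the test a few times" routine is a small wrapper that
only reports success if every run succeeds.  This is a sketch, not a tested
procedure for this bug: "all_pass" is a hypothetical helper, and it assumes
the test command signals failure through its exit status, which will not be
the case if the kernel oopses badly enough to take the machine down.

```shell
# all_pass COUNT CMD [ARGS...]: run CMD up to COUNT times and succeed
# only if every single run succeeds; stop at the first failure.
all_pass() {
    count=$1
    shift
    run=1
    while [ "$run" -le "$count" ]; do
        if ! "$@"; then
            echo "run $run of $count failed" >&2
            return 1
        fi
        run=$((run + 1))
    done
    return 0
}

# Hypothetical use on a freshly booted test kernel (script name taken from
# the report above; its exact invocation is an assumption):
#   all_pass 5 ./test-jbod-2 && git bisect good
```

A single failure is enough to call the kernel bad; only a clean sweep should
be answered with "git bisect good".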

> 
> Thanks
> 
> Charles Bertsch


NeilBrown
