From: Dave Chinner <david@fromorbit.com>
To: Yann Dupont <Yann.Dupont@univ-nantes.fr>
Cc: xfs@oss.sgi.com
Subject: Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
Date: Fri, 26 Oct 2012 08:10:47 +1100
Message-ID: <20121025211047.GD29378@dastard>
In-Reply-To: <508958FF.4000007@univ-nantes.fr>

On Thu, Oct 25, 2012 at 05:21:35PM +0200, Yann Dupont wrote:
> On 23/10/2012 10:24, Yann Dupont wrote:
> >On 22/10/2012 16:14, Yann Dupont wrote:
> >
> >Hello. This mail is a follow up of a message on XFS mailing list.
> >I had hang with 3.6.1, and then , damage on XFS filesystem.
> >
> >3.6.1 is not alone. Tried 3.6.2, and had another hang with quite a
> >different trace this time , so not really sure the 2 problems are
> >related .
> >Anyway the problem is maybe not XFS, but is just a consequence of
> >what seems more like kernel problems.
> >
> >cc: to linux-kernel
> Hello.
> There is definitively something wrong in 3.6.xx with XFS, in
> particular after an abrupt stop of the machine :
> 
> I now have corruption on a 3rd machine (not involved with ceph).
> The machine was just rebooting from 3.6.2 kernel to 3.6.3 kernel.
> 
> This machine isn't under heavy load, but it's a machine we use for
> tests & compilations. We often crash it. For 2 years, we didn't have
> problems. XFS always was reliable, even in hard conditions (hard
> reset, loss of power, etc)
> 
> This time, after 3.6.3 boot, one of my xfs volume refuse to mount :
> 
> mount: /dev/mapper/LocalDisk-debug--git: can't read superblock
> 
> [276596.189363] XFS (dm-1): Mounting Filesystem
> [276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
> [276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
> [276596.711329] XFS (dm-1): log mount/recovery failed: error 5
> [276596.711516] XFS (dm-1): log mount failed

That's an indication that zeros are being read from the journal
rather than valid transaction data. It may well be caused by an XFS
bug, but from experience it is equally likely to be a lower layer
storage problem. More information is needed.

Firstly:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Secondly, is the system still in this state? If so, dump the log to
a file using xfs_logprint, zip it up and send it to me so I can have
a look at whether the log is intact (i.e. likely xfs bug) or contains
zeros (likely storage bug).
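
For a rough first look at such a dump before sending it, the check
described above can be sketched as follows. This is only an
illustration, not an XFS tool: it assumes the dump is a raw copy of
the log region, that 512-byte sectors are the right granularity, and
that log record headers start with the on-disk magic 0xFEEDbabe. The
function name summarize_log_dump is made up for this example.

```python
SECTOR = 512  # assumption: scan the dump in 512-byte basic blocks
XLOG_HEADER_MAGIC = bytes.fromhex("feedbabe")  # log record header magic (big-endian on disk)


def summarize_log_dump(data, sector=SECTOR):
    """Count all-zero sectors and sectors that begin with a log record header.

    `data` is the raw bytes of a log dump. Trailing bytes that do not
    fill a whole sector are ignored.
    """
    blank = bytes(sector)
    total = zeroed = headers = 0
    for off in range(0, len(data) - sector + 1, sector):
        block = data[off:off + sector]
        total += 1
        if block == blank:
            zeroed += 1
        elif block.startswith(XLOG_HEADER_MAGIC):
            headers += 1
    return {"sectors": total, "zeroed": zeroed, "record_headers": headers}
```

Calling summarize_log_dump(open("log.img", "rb").read()) on a dump
(log.img is a placeholder name) gives counts that are only a hint:
unused parts of a log may be zeroed legitimately, so a proper
xfs_logprint analysis is still needed to tell the two cases apart.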

If the system is not still in this state, then I'm afraid there's
nothing that can be done to understand the problem.

> I'm not even sure the reboot was after a crash or just a clean
> reboot. (I'm not the only one to use this machine). I have nothing
> suspect on my remote syslog.
> 
> Anyway, it's the 3rd XFS crashed volume in a row with 3.6 kernel.
> Different machines, different contexts. Looks suspicious.

You've had two machines crash with problems in the mm subsystem, and
one filesystem problem that might be hardware related. Bit early to
be blaming XFS for all your problems, I think....

> xfs_repair -n seems to show volume is quite broken :

Sure, if the log hasn't been replayed then it will be - the
filesystem will only be consistent after log recovery has been run.

> I won't try to repair this volume right now.
> 
> This time, volume is small enough to make an image (it's a 100 GB
> lvm volume). I'll try to image it before making anything else.
> 
> 1st question : I saw there is ext4 corruption reported too with 3.6
> kernel, but as far as I can see, problem seems to be jbd related, so
> it shouldn't affect xfs ?

No relationship at all.

> 2nd question : Am I the only one to see this ?? I saw problems
> reported with 2.6.37, but here, the kernel is 3.6.xx

Yes, you're the only one to report such problems on 3.6. Anything
reported on 2.6.37 is likely to be completely unrelated.

> 3rd question : If you suspect the problem may be lying in XFS , what
> should I supply to help debugging the problem ?

See above.

> Not CC:ing linux kernel list right now, as I'm really not sure where
> the problem is right now.

You should report the mm problems to linux-mm@kvack.org to make sure
the right people see them and they don't get lost in the noise of
lkml....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

