From: Yann Dupont <Yann.Dupont@univ-nantes.fr>
To: xfs@oss.sgi.com
Subject: Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
Date: Thu, 25 Oct 2012 17:21:35 +0200
Message-ID: <508958FF.4000007@univ-nantes.fr>
In-Reply-To: <50865453.5080708@univ-nantes.fr>
On 23/10/2012 10:24, Yann Dupont wrote:
> On 22/10/2012 16:14, Yann Dupont wrote:
>
> Hello. This mail is a follow-up to a message on the XFS mailing list. I
> had a hang with 3.6.1, and then damage on an XFS filesystem.
>
> 3.6.1 is not alone. I tried 3.6.2 and had another hang, with quite a
> different trace this time, so I'm not really sure the two problems are
> related.
> Anyway, the problem is maybe not XFS, but just a consequence of what
> seems more like kernel problems.
>
> cc: to linux-kernel
Hello.
There is definitely something wrong in 3.6.x with XFS, in particular
after an abrupt stop of the machine:
I now have corruption on a third machine (not involved with Ceph).
The machine was just rebooting from the 3.6.2 kernel to the 3.6.3 kernel.
This machine isn't under heavy load, but it's a machine we use for tests
and compilations. We often crash it. For 2 years, we had no problems.
XFS was always reliable, even in hard conditions (hard reset, loss of
power, etc.).
This time, after booting 3.6.3, one of my XFS volumes refuses to mount:
mount: /dev/mapper/LocalDisk-debug--git: can't read superblock
[276596.189363] XFS (dm-1): Mounting Filesystem
[276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
[276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
[276596.711329] XFS (dm-1): log mount/recovery failed: error 5
[276596.711516] XFS (dm-1): log mount failed
I'm not even sure whether the reboot followed a crash or was just a clean
reboot (I'm not the only one to use this machine). I see nothing
suspicious in my remote syslog.
Anyway, it's the third corrupted XFS volume in a row with a 3.6 kernel.
Different machines, different contexts. Looks suspicious.
This time the corrupted volume was handled by a PERC (mptsas) card. The
two other volumes previously reported were handled by an Emulex
LightPulse Fibre Channel card (lpfc). This time, the filestreams option
wasn't used.
xfs_repair -n seems to show that the volume is quite broken:
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
block (1,6197-6197) multiply claimed by bno space tree, state - 2
bad magic # 0x7f454c46 in btbno block 3/2320
expected level 0 got 513 in btbno block 3/2320
bad btree nrecs (256, min=255, max=510) in btbno block 3/2320
invalid start block 16793088 in record 0 of bno btree block 3/2320
invalid start block 0 in record 1 of bno btree block 3/2320
invalid start block 0 in record 2 of bno btree block 3/2320
invalid start block 2282029056 in record 3 of bno btree block 3/2320
invalid start block 0 in record 4 of bno btree block 3/2320
invalid length 218106368 in record 5 of bno btree block 3/2320
invalid start block 1684369509 in record 6 of bno btree block 3/2320
invalid start block 6909556 in record 7 of bno btree block 3/2320
invalid start block 1493202533 in record 8 of bno btree block 3/2320
invalid start block 1768111411 in record 9 of bno btree block 3/2320
invalid start block 761557865 in record 10 of bno btree block 3/2320
invalid start block 842084400 in record 11 of bno btree block 3/2320
...
bad magic # 0x41425442 in btcnt block 2/14832
bad btree nrecs (436, min=255, max=510) in btcnt block 2/14832
out-of-order cnt btree record 2 (188545 1) block 2/14832
out-of-order cnt btree record 3 (188650 1) block 2/14832
out-of-order cnt btree record 4 (188658 1) block 2/14832
out-of-order cnt btree record 8 (189021 1) block 2/14832
out-of-order cnt btree record 9 (189104 1) block 2/14832
out-of-order cnt btree record 10 (189127 2) block 2/14832
out-of-order cnt btree record 11 (189193 2) block 2/14832
out-of-order cnt btree record 12 (189259 2) block 2/14832
out-of-order cnt btree record 13 (189268 1) block 2/14832
out-of-order cnt btree record 14 (189307 1) block 2/14832
out-of-order cnt btree record 15 (189330 1) block 2/14832
out-of-order cnt btree record 16 (189379 1) block 2/14832
out-of-order cnt btree record 18 (189477 1) block 2/14832
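Editorial note: the two "bad magic" values in the output above can be read as ASCII, which gives a hint about what landed in those blocks. The following is a minimal sketch, not part of the original report; the helper name `decode_magic` is hypothetical. It shows that 0x7f454c46 is the four bytes that begin an ELF binary, and that 0x41425442 spells "ABTB", the magic XFS uses for by-block free-space btree blocks, whereas a by-count (btcnt) block would be expected to carry "ABTC".

```python
import struct


def decode_magic(value: int) -> str:
    """Render a 32-bit big-endian magic number as text.

    Printable ASCII bytes are shown as characters; anything else is
    shown as a \\xNN escape. (Hypothetical helper for illustration.)
    """
    raw = struct.pack(">I", value)
    return "".join(
        chr(b) if 32 <= b < 127 else "\\x%02x" % b
        for b in raw
    )


# The two magic numbers reported by xfs_repair -n in this message.
for magic in (0x7F454C46, 0x41425442):
    print(hex(magic), "->", decode_magic(magic))
```

The same helper can be applied to the "invalid start block" values, which may show whether the block holds ordinary file data rather than btree records.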
I won't try to repair this volume right now.
This time, the volume is small enough to make an image (it's a 100 GB LVM
volume). I'll try to image it before doing anything else.
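Editorial note: imaging the volume before any repair attempt amounts to a raw, read-only, block-by-block copy of the device into a file. The following is a minimal sketch under stated assumptions, not the method actually used here: the function name `image_device` and the destination path are hypothetical, and in practice a tool such as dd would normally do this job. The sketch stops at the first read error and does not try to skip bad sectors.

```python
import os


def image_device(src_path: str, dst_path: str,
                 chunk_size: int = 4 * 1024 * 1024) -> int:
    """Copy src_path to dst_path in fixed-size chunks.

    The source is opened read-only, so the damaged filesystem is not
    modified. Returns the number of bytes copied.
    (Hypothetical helper for illustration.)
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of device or file
                break
            dst.write(chunk)
            copied += len(chunk)
        # Make sure the image is on stable storage before returning.
        dst.flush()
        os.fsync(dst.fileno())
    return copied


# Example call (needs root and enough free space; the destination
# path is an assumption, not taken from the original message):
# image_device("/dev/mapper/LocalDisk-debug--git", "/path/to/debug-git.img")
```

Working on a copy means xfs_repair can later be tried on the image while the original volume stays untouched.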
1st question: I saw that ext4 corruption has been reported with the 3.6
kernel too, but as far as I can see that problem seems to be jbd related,
so it shouldn't affect XFS?
2nd question: Am I the only one seeing this? I saw problems reported
with 2.6.37, but here the kernel is 3.6.x.
3rd question: If you suspect the problem may lie in XFS, what should I
supply to help debug it?
Not CC:ing the linux-kernel list for now, as I'm really not sure where
the problem lies.
Cheers,
--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr