* XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Harry Mangalam @ 2009-02-23 20:43 UTC
To: xfs
Here's an unusual (long) tale of woe.
We had a USRobotics 8700 NAS appliance with 4 SATA disks in RAID5:
<http://www.usr.com/support/product-template.asp?prod=8700>
which was a fine (if crude) ARM-based Linux NAS until it stroked out
at some point, leaving us with a degraded RAID5 and comatose NAS
device.
We'd like to get the files back, of course, so I moved the disks to
a Linux PC, hooked them up to a cheap Silicon Image 4x SATA
controller, and brought up the whole frankenmess with mdadm. It
reported a clean but degraded array.
(much mdadm stuff deleted)
Shortening this up considerably, I was able to get the RAID5
reconstituted with a new disk, but was not so fortunate with the
filesystem.
The docs and files on the USR web site imply that the native
filesystem was originally XFS, but when I try to mount it as such, I
can't:
mount -vvv -t xfs /dev/md1 /mnt
mount: fstab path: "/etc/fstab"
mount: lock path: "/etc/mtab~"
mount: temp path: "/etc/mtab.tmp"
mount: no LABEL=, no UUID=, going to mount /dev/md1 by path
mount: spec: "/dev/md1"
mount: node: "/mnt"
mount: types: "xfs"
mount: opts: "(null)"
mount: mount(2) syscall: source: "/dev/md1", target: "/mnt",
filesystemtype: "xfs", mountflags: -1058209792, data: (null)
mount: wrong fs type, bad option, bad superblock on /dev/md1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
and when I check dmesg:
[ 245.008000] SGI XFS with ACLs, security attributes, realtime, large
block numbers, no debug enabled
[ 245.020000] SGI XFS Quota Management subsystem
[ 245.020000] XFS: SB read failed
[ 327.696000] md: md0 stopped.
[ 327.696000] md: unbind<sdc1>
[ 327.696000] md: export_rdev(sdc1)
[ 327.696000] md: unbind<sde1>
[ 327.696000] md: export_rdev(sde1)
[ 327.696000] md: unbind<sdd1>
[ 327.696000] md: export_rdev(sdd1)
[ 439.660000] XFS: bad magic number
[ 439.660000] XFS: SB validate failed
Repeated attempts repeat the last two lines above. This implies that
the superblock is bad, and xfs_repair reports the same:
xfs_repair /dev/md1
- creating 2 worker thread(s)
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
...... <lots of ...> ...
..found candidate secondary superblock...
unable to verify superblock, continuing...
<lots of ...> ...
...found candidate secondary superblock...
unable to verify superblock, continuing...
<lots of ...> ...
So my question is what should I do now? Neil Brown (mdadm author)
suggested that Dave Chinner had investigated this in some other cases
and had found that the vendors of a number of such ARM-based NAS
appliances had mucked up the implementation of XFS such that this
situation is just a no-win.
I did find a few reports of people who had similar ARM-based XFS
filesystems with similar problems but could not find any successful
resolution. The vendor has not responded to emails about this.
Any suggestions (other than restoring from (nonexistent) backups)?
Harry
--
Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway,
UC Irvine 92697 949 824-0084(o), 949 285-4487(c)
---
Good judgment comes from experience;
Experience comes from bad judgment. [F. Brooks.]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Eric Sandeen @ 2009-02-23 20:53 UTC
To: Harry Mangalam; +Cc: xfs
Harry Mangalam wrote:
> Here's an unusual (long) tale of woe.
>
> We had a USRobotics 8700 NAS appliance with 4 SATA disks in RAID5:
> <http://www.usr.com/support/product-template.asp?prod=8700>
> which was a fine (if crude) ARM-based Linux NAS until it stroked out
> at some point, leaving us with a degraded RAID5 and comatose NAS
> device.
>
> We'd like to get the files back of course and I've moved the disks to
> a Linux PC, hooked them up to a cheap Silicon Image 4x SATA
> controller and brought up the whole frankenmess with mdadm. It
> reported a clean but degraded array.
>
> (much mdadm stuff deleted)
>
> Shortening this up considerably, I was able to get the RAID5
> reconstituted with a new disk, but was not so fortunate with the
> filesystem.
>
> The docs and files on the USR web site imply that the native
> filesystem was originally XFS, but when I try to mount it as such, I
> can't:
...snip...
> and when I check dmesg:
> [ 245.008000] SGI XFS with ACLs, security attributes, realtime, large
> block numbers, no debug enabled
> [ 245.020000] SGI XFS Quota Management subsystem
> [ 245.020000] XFS: SB read failed
> [ 327.696000] md: md0 stopped.
> [ 327.696000] md: unbind<sdc1>
> [ 327.696000] md: export_rdev(sdc1)
> [ 327.696000] md: unbind<sde1>
> [ 327.696000] md: export_rdev(sde1)
> [ 327.696000] md: unbind<sdd1>
> [ 327.696000] md: export_rdev(sdd1)
> [ 439.660000] XFS: bad magic number
> [ 439.660000] XFS: SB validate failed
>
> repeated attempts repeat the last 2 lines above. This implies that
> the superblock is bad and xfs_repair also reports that:
> xfs_repair /dev/md1
> - creating 2 worker thread(s)
> Phase 1 - find and verify superblock...
> bad primary superblock - bad magic number !!!
The main badness that I know of on some of the ARM NAS implementations
is a change which was made due to differing alignment on the old ARM
ABI; the change that went into some vendor trees actually modified the
on-disk format rather than fixing it up properly. (This should be fixed
upstream now.) There are other odd problems with cache aliasing too.
However, this wouldn't cause a superblock mis-read like this. If you get
it mounted, you *may* run into what looks like directory corruption on
the PC, though, due to the alignment issue.
Anyway, first, I'd look around for "XFSB" in the first few blocks of
your RAID and see if the RAID might possibly have been rebuilt out of order.
# dd if=/dev/md0 bs=4k count=32 | hexdump -C | grep XFSB
or so...
-Eric
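The search Eric suggests can also be done so that it reports where each
match sits, which matters once an offset is needed for mounting. A
minimal sketch, assuming GNU grep (for -a, -b and -o); find_xfs_magic is
a hypothetical helper name, and /dev/md0 in the usage comment is simply
the device named in this thread.

```shell
#!/bin/sh
# find_xfs_magic FILE [BLOCKS]
# Hypothetical helper (not from the thread): print the byte offset of every
# "XFSB" string within the first BLOCKS 4 KiB blocks of FILE (default 32).
find_xfs_magic() {
    src=$1
    blocks=${2:-32}
    # LC_ALL=C makes grep treat the input as raw bytes;
    # -a = treat binary as text, -b = print byte offset, -o = only the match.
    dd if="$src" bs=4k count="$blocks" 2>/dev/null \
        | LC_ALL=C grep -abo 'XFSB' \
        | cut -d: -f1
}

# Usage on the array from this thread (read-only, needs root):
#   find_xfs_magic /dev/md0 4096      # scan the first 16 MiB
```

An offset that is a multiple of 512 is a plausible start of a
filesystem; matches at odd offsets are more likely stray data.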
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Harry Mangalam @ 2009-02-23 21:20 UTC
To: Eric Sandeen; +Cc: xfs
Thanks for the quick response Eric!
On Monday 23 February 2009, Eric Sandeen wrote:
> Harry Mangalam wrote:
> > Here's an unusual (long) tale of woe.
> >
> > We had a USRobotics 8700 NAS appliance with 4 SATA disks in
> > RAID5:
<snip>
> However this wouldn't cause a superblock mis-read like this. If
> you get it mounted, you *may* run into what looks like directory
> corruption on the PC, though, due to the alignment issue.
>
> Anyway, first, I'd look around for "XFSB" in the early few blocks
> of your raid and see if the raid might possibly have been rebuilt
> out of order.
>
> # dd if=/dev/md0 bs=4k count=32 | hexdump -C | grep XFSB
>
> or so...
>
> -Eric
No, I didn't find this - I did find some disk ID header stuff which
confirms that the filesystem is XFS and some other info that might be
useful, but no XFSB strings, even grepping 8MB into the device.
|<IPStorPartition|
| version="3.0" s|
|ize="595" owner=|
|"NACS-SW-DIST" c|
|hecksum="" signa|
|ture="IpStOrDyNa|
|MiCdIsK" dataSta|
|rtAtSectorNo="16|
|128" logvol="0" |
|category="Virtua|
|l Device"/>.<Phy|
|sicalDev guid="5|
|95e9fbb-1951-09c|
|3-a30c-000045d3a|
|a3b" Comment="" |
|WorldWideID="FAL|
|CON LVMDISK-M09|
|N01 v1.0-0-0-00|
|"/>.............|
|................|
|<DynamicDiskSegm|
|ent guid="126de9|
|2f-98a1-3c06-01e|
|5-000045d3aa4d" |
|firstSector="161|
|28" lastSector="|
|22271" owner="NA|
|CS-SW-DIST" data|
|set="1171499597"|
| seqNo="0" isLas|
|tSegment="true" |
|sectorSize="512"|
| type="Umap" lun|
|Type="0" timesta|
|mp="1171499597" |
|umapTimestamp="0|
|" deviceName="NA|
|SDisk-00002" fil|
|eSystem="XFS"/>.|
|<DynamicDiskSegm|
|ent guid="126de9|
|2f-98a1-3c06-01e|
|5-000045d3aa4d" |
|firstSector="222|
|72" lastSector="|
|2928740095" owne|
|r="NACS-SW-DIST"|
| dataset="117149|
|9597" seqNo="1" |
|isLastSegment="t|
|rue" sectorSize=|
|"512" type="NAS"|
| lunType="0" tim|
|estamp="11714995|
|97" umapTimestam|
|p="0" deviceName|
|="NASDisk-00002"|
|/>..............|
|................|
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Eric Sandeen @ 2009-02-23 21:31 UTC
To: Harry Mangalam; +Cc: xfs
Harry Mangalam wrote:
> Thanks for the quick response Eric!
>
> On Monday 23 February 2009, Eric Sandeen wrote:
>> Harry Mangalam wrote:
>>> Here's an unusual (long) tale of woe.
>>>
>>> We had a USRobotics 8700 NAS appliance with 4 SATA disks in
>>> RAID5:
>
> <snip>
>
>> However this wouldn't cause a superblock mis-read like this. If
>> you get it mounted, you *may* run into what looks like directory
>> corruption on the PC, though, due to the alignment issue.
>>
>> Anyway, first, I'd look around for "XFSB" in the early few blocks
>> of your raid and see if the raid might possibly have been rebuilt
>> out of order.
>>
>> # dd if=/dev/md0 bs=4k count=32 | hexdump -C | grep XFSB
>>
>> or so...
>>
>> -Eric
>
> No, I didn't find this - I did find some disk ID header stuff which
> confirms that the filesystem is XFS and some other info that might be
> useful, but no XFSB strings, even grepping 8MB into the device.
>
> |<IPStorPartition|
> | version="3.0" s|
> |ize="595" owner=|
> |"NACS-SW-DIST" c|
> |hecksum="" signa|
> |ture="IpStOrDyNa|
> |MiCdIsK" dataSta|
> |rtAtSectorNo="16|
... Knowing the offsets of these might be helpful. But perhaps you are
simply not trying to mount the thing which actually has XFS on it.
I don't know what IPStor is. Perhaps your volume is encrypted?
Dunno. At any rate, it doesn't seem at first glance like an XFS
problem, I'm afraid.
> |firstSector="222|
> |72" lastSector="|
> |2928740095" owne|
I might look at sector 22272 (about 10MB in) and see if that looks like
xfs :)
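The "about 10MB" figure follows from the sector number and the
sectorSize="512" attribute in the header. A minimal sketch of the
arithmetic; sector_to_bytes is a hypothetical helper name, and the
512-byte sector size is taken from the header rather than verified on
the disks.

```shell
#!/bin/sh
# sector_to_bytes SECTOR [SECTOR_SIZE]
# Hypothetical helper: convert a sector number to a byte offset.
# The default of 512 bytes comes from sectorSize="512" in the IPStor header.
sector_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

sector_to_bytes 22272    # prints 11403264 (about 10.9 MiB)

# To eyeball that sector on the real array (read-only, needs root):
#   dd if=/dev/md0 bs=512 skip=22272 count=1 | hexdump -C | head
```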
Or maybe just put it back in the NAS box, now, assembled the same way.
-Eric
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Harry Mangalam @ 2009-02-23 21:45 UTC
To: Eric Sandeen; +Cc: xfs
On Monday 23 February 2009, Eric Sandeen wrote:
> > |<IPStorPartition|
> > | version="3.0" s|
> > |ize="595" owner=|
> > |"NACS-SW-DIST" c|
> > |hecksum="" signa|
> > |ture="IpStOrDyNa|
> > |MiCdIsK" dataSta|
> > |rtAtSectorNo="16|
>
> .... knowing the offsets of these might be helpful. But perhaps
> you are simply not trying to mount the thing which actually has xfs
> on it.
>
> I don't know what IPStore is. Perhaps your volume is encrypted?
> Dunno... at any rate, doesn't seem at first glance like it's an xfs
> problem, I'm afraid.
>
> > |firstSector="222|
> > |72" lastSector="|
> > |2928740095" owne|
>
> I might look at sector 22272 (about 10MB in) and see if that looks
> like xfs :)
Good guess!
At 00AE0000 (11.4MB in) there's "XFSB", although I can't make out much
of the surrounding bytes.
Is there a way to do anything with this info? Can it be dd'ed out and
used to get the thing mounted?
>
> Or maybe just put it back in the NAS box, now, assembled the same
> way.
This is just about my last gasp. I'll recommend that they try this if
nothing else comes to mind.
>
> -Eric
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Eric Sandeen @ 2009-02-23 21:54 UTC
To: Harry Mangalam; +Cc: xfs
Harry Mangalam wrote:
> On Monday 23 February 2009, Eric Sandeen wrote:
>>> |<IPStorPartition|
>>> | version="3.0" s|
>>> |ize="595" owner=|
>>> |"NACS-SW-DIST" c|
>>> |hecksum="" signa|
>>> |ture="IpStOrDyNa|
>>> |MiCdIsK" dataSta|
>>> |rtAtSectorNo="16|
>> .... knowing the offsets of these might be helpful. But perhaps
>> you are simply not trying to mount the thing which actually has xfs
>> on it.
>>
>> I don't know what IPStore is. Perhaps your volume is encrypted?
>> Dunno... at any rate, doesn't seem at first glance like it's an xfs
>> problem, I'm afraid.
>>
>>> |firstSector="222|
>>> |72" lastSector="|
>>> |2928740095" owne|
>> I might look at sector 22272 (about 10MB in) and see if that looks
>> like xfs :)
>
> Good guess! :
> At 00AE0000 (11.4MB in) there's "XFSB" altho I can't make out much of
> the surrounding bytes...
>
> Is there a way to do anything with this info? Can it be dd'ed out to
> use to get the thing mounted?
Try:
mount -o loop,offset=11403264 /dev/whatever /mnt/whatever
I'd probably also add ro,norecovery to the options so you don't
actually write anything to it at this point.
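The offset in that command is the hex address Harry reported,
0x00AE0000, written in decimal; it is also sector 22272 times 512
bytes. A small sketch that checks both, plus a commented alternative
that sets up the loop device explicitly. The losetup options are from
util-linux and are assumed to be available on the recovery machine;
hex_to_dec is a hypothetical helper, and /dev/md0 and /mnt are
placeholders.

```shell
#!/bin/sh
# hex_to_dec VALUE
# Hypothetical helper: print a 0x-prefixed hex value in decimal.
hex_to_dec() {
    printf '%d\n' "$1"
}

hex_to_dec 0x00AE0000      # prints 11403264
echo $(( 22272 * 512 ))    # prints 11403264 as well

# Alternative to "mount -o loop,offset=..." (needs root; not run here):
#   loopdev=$(losetup -r -f --show -o 11403264 /dev/md0)
#   mount -t xfs -o ro,norecovery "$loopdev" /mnt
#   ...
#   umount /mnt && losetup -d "$loopdev"
```

Setting up the loop device read-only (-r) is one more guard against
writes, on top of the ro,norecovery mount options.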
-Eric
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Harry Mangalam @ 2009-02-23 22:24 UTC
To: Eric Sandeen; +Cc: xfs
mount -t xfs -o loop,offset=11403264,ro,norecovery /dev/md0 /lost
gives me a mount(!!):
df -> /lost/public, but it's not a standard entry:
?--------- ? ? ? ? ? /lost/public
and dmesg coughs up many lines of XFS errors:
loop: AES key scrubbing enabled
loop: loaded (max 8 devices)
Filesystem "loop0": Disabling barriers, not supported by the
underlying device
Mounting filesystem "loop0" in no-recovery mode. Filesystem will be
inconsistent.
XFS resetting qflags for filesystem loop0
xfs_force_shutdown(loop0,0x1) called from line 424 of file
fs/xfs/xfs_rw.c. Return address = 0xfb2fc728
Filesystem "loop0": I/O Error Detected. Shutting down filesystem:
loop0
Please umount the filesystem, and rectify the problem(s)
xfs_force_shutdown(loop0,0x1) called from line 424 of file
fs/xfs/xfs_rw.c. Return address = 0xfb2fc728
Filesystem "loop0": Disabling barriers, not supported by the
underlying device
Mounting filesystem "loop0" in no-recovery mode. Filesystem will be
inconsistent.
XFS resetting qflags for filesystem loop0
* Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Eric Sandeen @ 2009-02-23 22:31 UTC
Harry Mangalam wrote:
> mount -t xfs -o loop,offset=11403264,ro,norecovery /dev/md0 /lost
>
> gives me a mount(!!):
>
> df -> /lost/public, but it's not a standard entry:
> ?--------- ? ? ? ? ? /lost/public
>
> and dmesg coughs up many lines of XFS errors:
Not sure at this point. I'd put it back in the NAS box with the
recreated RAID (mounting with -o ro,norecovery again for safety, if you
like) and hope for the best.
Otherwise, maybe run xfs_repair, on a dd image if you prefer, again for
safety.
-Eric
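Working on a copy keeps the array untouched. A minimal sketch of that
approach, assuming the filesystem starts at sector 22272 as found
earlier in the thread, and that there is enough scratch space for the
image (the segment in the header spans roughly 1.5 TB).
extract_from_sector and the /scratch path are hypothetical.

```shell
#!/bin/sh
# extract_from_sector SRC START_SECTOR OUT
# Hypothetical helper: copy SRC from START_SECTOR (512-byte sectors) to the
# end into OUT. conv=noerror,sync keeps going past read errors and pads the
# unreadable sectors with zeros so later offsets stay aligned.
extract_from_sector() {
    dd if="$1" of="$3" bs=512 skip="$2" conv=noerror,sync 2>/dev/null
}

# On the array from this thread (needs root; slow with bs=512):
#   extract_from_sector /dev/md0 22272 /scratch/xfs.img
#   xfs_repair -n -f /scratch/xfs.img    # -n: report only, -f: target is a file
```

Running xfs_repair with -n first shows what it would change before
anything is modified, even on the copy.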