* XFS filesystem recovery from secondary superblocks
@ 2012-10-31 5:02 Aaron Goulding
2012-11-01 9:18 ` Emmanuel Florac
2012-11-01 22:59 ` Dave Chinner
0 siblings, 2 replies; 12+ messages in thread
From: Aaron Goulding @ 2012-10-31 5:02 UTC (permalink / raw)
To: xfs
Hello! So I have an XFS filesystem that isn't mounting, and quite a long
story as to why and what I've tried.
And before you start, yes backups are the preferred method of restoration
at this point. Never trust your files to a single FS, etc.
So I have a 9-disk MD array (0.9 superblock format, 14TB total usable
space) configured as an LVM PV: one VG, then one LV with not quite all the
space allocated. That LV is formatted XFS and mounted as /mnt/storage. This
was started on Ubuntu 10.04, which has since been release-upgraded to 12.04.
The LV has been grown three times over the last two years. The system's boot,
root and swap partitions are on a separate drive.
So what happened? Well, one drive died spectacularly. It had a full bearing
failure, which caused a power drain, and the system instantly kicked out two
more drives. This put the array into an offline state, as expected. I
replaced the failed drive with a new one and carefully checked the disk
order before attempting to re-assemble the array. At the time, I didn't
know about mdadm --re-add. (Likely my first mistake.)
mdadm --create --assume-clean --level=6 --raid-devices=9 /dev/md0 /dev/sdg1
missing /dev/sdh1 /dev/sdj1 /dev/sdd1 /dev/sdb1 /dev/sde1 /dev/sdf1
/dev/sdc1
The first problem with this is that the Ubuntu upgrade meant mdadm created
the superblocks as 1.2 instead of 0.9. Not catching this, I then added the
replacement /dev/sdi1, which started the array rebuilding incorrectly. I
quickly realized my mistake, stopped the array, and recreated it again,
this time using the 0.9 superblock format, but the damage had already been
done to roughly the first 100GB of the array, possibly more.
I attempted to restore the LVM superblock from the backup stored in
/etc/lvm/backup/:
pvcreate -f -vv --uuid "hJrAn2-wTd8-vY11-steD-23Jh-AwKK-4VvnkH"
--restorefile /etc/lvm/backup/vg1 /dev/md0
When that failed, I decided to attach a second array so I could examine the
problem more safely. I built a second MD array with 7 3TB disks in RAID6,
giving me a 15TB /mnt/restore volume to work with. I made a dd copy of
/dev/md0 to a test file I could manipulate safely.
Once I had the file created, I tried xfs_repair -f /mnt/restore/md0.dat with
no luck. I used a hex editor to write XFSB at the beginning, hoping the
recovery would just clean around the LVM data, with similar results. The
result looks like the following:
Phase 1 - find and verify superblock...
bad primary superblock - bad or unsupported version !!!
attempting to find secondary superblock...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
unable to verify superblock, continuing...
....................................................................................................
Exiting now.
Running xfs_db /mnt/restore/md0.dat appeared to run out of memory.
I realized I needed to pull the data out of LVM and re-assemble it
properly if I was going to make any progress, so I checked the backup
config again:
# Generated by LVM2 version 2.02.66(2) (2010-05-20): Sun Jul 29 13:40:58 2012

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing 'vgcfgbackup'"

creation_host = "jarvis"	# Linux jarvis 3.0.0-23-server #39-Ubuntu SMP Thu Jul 19 19:37:41 UTC 2012 x86_64
creation_time = 1343594458	# Sun Jul 29 13:40:58 2012

vg1 {
	id = "hJrAn2-wTd8-vY11-steD-23Jh-AwKK-4VvnkH"
	seqno = 19
	status = ["RESIZEABLE", "READ", "WRITE"]
	flags = []
	extent_size = 8192		# 4 Megabytes
	max_lv = 0
	max_pv = 0

	physical_volumes {

		pv0 {
			id = "VRHqH4-oIje-iQWV-iLUL-dLXX-eEf9-mLd9Z7"
			device = "/dev/md0"	# Hint only

			status = ["ALLOCATABLE"]
			flags = []
			dev_size = 27349166336	# 12.7354 Terabytes
			pe_start = 768
			pe_count = 3338521	# 12.7354 Terabytes
		}
	}

	logical_volumes {

		storage {
			id = "H47IMn-ohEG-3W6l-NfCu-ePjJ-U255-FcIjdp"
			status = ["READ", "WRITE", "VISIBLE"]
			flags = []
			segment_count = 4

			segment1 {
				start_extent = 0
				extent_count = 2145769	# 8.18546 Terabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"pv0", 25794
				]
			}
			segment2 {
				start_extent = 2145769
				extent_count = 626688	# 2.39062 Terabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"pv0", 2174063
				]
			}
			segment3 {
				start_extent = 2772457
				extent_count = 384170	# 1.46549 Terabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"pv0", 2954351
				]
			}
			segment4 {
				start_extent = 3156627
				extent_count = 140118	# 547.336 Gigabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"pv0", 2800751
				]
			}
		}
	}
}
So I noticed that segment4's data physically precedes segment3's on the PV
(based on stripes = [ "pv0", 2800751 ]), and the extent size was the
standard 4MB, so I wrote the following:
echo "writing seg 1 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=0 skip=25794 count=2145769
echo "writing seg 2 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=2145769 skip=2174063 count=626688
echo "writing seg 3 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=2772457 skip=2954351 count=384170
echo "writing seg 4 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=3156627 skip=2800751 count=140118
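For reference, the dd arguments follow mechanically from the backup above:
bs is the extent size in bytes (8192 sectors * 512 = 4194304), seek is each
segment's start_extent, skip is its PV offset from stripes, and count is its
extent_count. A minimal sketch that regenerates the four commands (emit_dd is
a hypothetical helper; note that, like the commands above, it does not
account for pe_start = 768 sectors):

```shell
# Sketch: regenerate the dd commands above from the LVM backup values.
# bs = extent_size (8192 sectors * 512 bytes), seek = start_extent,
# skip = the segment's PV extent offset from "stripes", count = extent_count.
# Like the commands above, this does not account for pe_start (768 sectors).
EXTENT_BYTES=$((8192 * 512))    # 4194304

emit_dd() {   # emit_dd <start_extent> <pv_offset_extents> <extent_count>
    echo "dd if=/dev/md0 of=/dev/md1 bs=$EXTENT_BYTES seek=$1 skip=$2 count=$3"
}

emit_dd 0       25794   2145769    # segment1
emit_dd 2145769 2174063 626688     # segment2
emit_dd 2772457 2954351 384170     # segment3
emit_dd 3156627 2800751 140118     # segment4
```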
Then, just to make sure things were clean, I zeroed out the remainder of
/dev/md1.
I used the hex editor (shed) again to make sure the first four bytes on the
drive are XFSB.
Once done, I tried xfs_repair again, this time on /dev/md1 with the same
results as above.
Next I tried xfs_db /dev/md1 to see if anything would load. I get the
following:
root@jarvis:/mnt# xfs_db /dev/md1
Floating point exception
With the following in dmesg:
[1568395.691767] xfs_db[30966] trap divide error ip:41e4b5 sp:7fff5db8ab90
error:0 in xfs_db[400000+6a000]
So at this point I'm stumped. I'm hoping one of you clever folks out there
might have some next steps I can take. I'm okay with a partial recovery,
and I'm okay if the directory tree gets horked and I have to dig through
lost+found, but I'd really like to at least be able to recover something
from this. I'm happy to post any info needed on this.
Thanks!
-Aaron
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS filesystem recovery from secondary superblocks
2012-10-31 5:02 XFS filesystem recovery from secondary superblocks Aaron Goulding
@ 2012-11-01 9:18 ` Emmanuel Florac
2012-11-01 22:59 ` Dave Chinner
1 sibling, 0 replies; 12+ messages in thread
From: Emmanuel Florac @ 2012-11-01 9:18 UTC (permalink / raw)
To: Aaron Goulding; +Cc: xfs
On Tue, 30 Oct 2012 22:02:28 -0700, you wrote:
> So at this point I'm stumped. I'm hoping one of you clever folks out
> there might have some next steps I can take. I'm okay with a partial
> recovery, and I'm okay if the directory tree gets horked and I have
> to dig through lost+found, but I'd really like to at least be able to
> recover something from this. I'm happy to post any info needed on
> this.
You could give UFS Explorer with the RAID plugin a try, just in case
it can make some sense of your disks...
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <eflorac@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
* Re: XFS filesystem recovery from secondary superblocks
2012-10-31 5:02 XFS filesystem recovery from secondary superblocks Aaron Goulding
2012-11-01 9:18 ` Emmanuel Florac
@ 2012-11-01 22:59 ` Dave Chinner
2012-11-07 15:24 ` Aaron Goulding
1 sibling, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-11-01 22:59 UTC (permalink / raw)
To: Aaron Goulding; +Cc: xfs
On Tue, Oct 30, 2012 at 10:02:28PM -0700, Aaron Goulding wrote:
> Hello! So I have an XFS filesystem that isn't mounting, and quite a long
> story as to why and what I've tried.
>
> And before you start, yes backups are the preferred method of restoration
> at this point. Never trust your files to a single FS, etc.
It's kind of assumed knowledge around here, or that there are good
reasons for not backing up the filesystem (e.g. it's hard to back up
a 500TB filesystem).
[snip raid horror story]
> Once I had the file created, I tried xfs_repair -f /mnt/restore/md0.dat with
> no luck. I used a hex editor to write XFSB at the beginning, hoping the
That won't work - the magic number is just one of many checks on the
superblock before it can be considered valid.
> recovery would just clean around the LVM data with similar results. The
> result looks like the following:
>
> Phase 1 - find and verify superblock...
> bad primary superblock - bad or unsupported version !!!
And that's the second check :/
> attempting to find secondary superblock...
> [snip: ten rounds of "unable to verify superblock, continuing..."]
> Exiting now.
So, that means it found 10 potential secondary superblocks, but
couldn't validate any of them. Can you find those superblocks and
hexdump them? Something like:
hexdump <dev or file> | grep -A 30 XFSB
Also of interest would be a hexdump of the subsequent sector; it
should have a magic number of XAGF, and it will also have a sequence
number that should help tell us whether you've got everything in
order.
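A small refinement of that idea (a sketch, not a command from the thread):
GNU grep can report the byte offset of each hit directly, which makes the
alignment check easier later. Demonstrated here on a synthetic scratch file
(/tmp/sbscan.dat, a made-up name) rather than the real device:

```shell
# grep -a treats the input as text, -b prints the byte offset of each
# match, -o prints only the matching part. Build a tiny test file with
# XFSB at offsets 4 and 512, then list the offsets.
printf 'XXXXXFSB' > /tmp/sbscan.dat                  # "XFSB" begins at offset 4
dd if=/dev/zero bs=504 count=1 >> /tmp/sbscan.dat 2>/dev/null
printf 'XFSB' >> /tmp/sbscan.dat                     # and again at offset 512
grep -abo XFSB /tmp/sbscan.dat | cut -d: -f1         # prints: 4, then 512
```

On the real device the same pattern would be
`grep -abo XFSB /dev/md1 | cut -d: -f1`.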
> running xfs_db /mnt/restore/md0.dat would appear to run out of memory.
No surprise there if the superblocks are toast.
> Next I tried xfs_db /dev/md1 to see if anything would load. I get the
> following:
>
> root@jarvis:/mnt# xfs_db /dev/md1
> Floating point exception
>
> With the following in dmesg:
>
> [1568395.691767] xfs_db[30966] trap divide error ip:41e4b5 sp:7fff5db8ab90
> error:0 in xfs_db[400000+6a000]
What version are you running?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS filesystem recovery from secondary superblocks
2012-11-01 22:59 ` Dave Chinner
@ 2012-11-07 15:24 ` Aaron Goulding
[not found] ` <CABJyUz+r7yQSswqyBng_W=fAXxTz9heb88NeiKSeFU7j4ZD=Hw@mail.gmail.com>
0 siblings, 1 reply; 12+ messages in thread
From: Aaron Goulding @ 2012-11-07 15:24 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
root@jarvis:~# xfs_db -V
xfs_db version 3.1.7
I'm sorry for the delay in responding; I've had to work on several other
projects as well. Thank you for the assistance. :)
So I ran a check using bgrep to find all instances of XFSB on /dev/md1 and
came up with just over 3000 results. It's easy enough to write something up
to grab 100-200 bytes at each point, but I wonder if there's some way to
improve the bgrep search to narrow that result list down a bit?
Thanks Again!
-Aaron
On Thu, Nov 1, 2012 at 3:59 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, Oct 30, 2012 at 10:02:28PM -0700, Aaron Goulding wrote:
> > Hello! So I have an XFS filesystem that isn't mounting, and quite a long
> > story as to why and what I've tried.
> >
> > And before you start, yes backups are the preferred method of restoration
> > at this point. Never trust your files to a single FS, etc.
>
> It's kind of assumed knowledge round here or that there are good
> reasons for not backing up the filesystem (e.g. it's hard to back up
> a 500TB filesystem).
>
> [snip raid horror story]
>
> > Once I had the file created, I tried xfs_repair -f /mnt/restore/md0.dat with
> > no luck. I used a hex editor to write XFSB at the beginning, hoping the
>
> That won't work - the magic number is just one of many checks on the
> superblock before it can be considered valid.
>
> > recovery would just clean around the LVM data with similar results. The
> > result looks like the following:
> >
> > Phase 1 - find and verify superblock...
> > bad primary superblock - bad or unsupported version !!!
>
> And that's the second check :/
>
> > attempting to find secondary superblock...
> > [snip: ten rounds of "unable to verify superblock, continuing..."]
> > Exiting now.
>
> So, that means it found 10 potential secondary superblocks, but
> couldn't validate any of them. Can you find those superblocks and
> hexdump them? Something like:
>
> hexdump <dev or file> | grep -A 30 XFSB
>
> Also of interest would be a hexdump of the subsequent sector; it
> should have a magic number of XAGF, and it will also have a sequence
> number that should help tell us whether you've got everything in
> order.
>
> > running xfs_db /mnt/restore/md0.dat would appear to run out of memory.
>
> No surprise there if the superblocks are toast.
>
> > Next I tried xfs_db /dev/md1 to see if anything would load. I get the
> > following:
> >
> > root@jarvis:/mnt# xfs_db /dev/md1
> > Floating point exception
> >
> > With the following in dmesg:
> >
> > [1568395.691767] xfs_db[30966] trap divide error ip:41e4b5
> sp:7fff5db8ab90
> > error:0 in xfs_db[400000+6a000]
>
> what version are you running?
>
> Cheers,
>
> Dave.
>
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: XFS filesystem recovery from secondary superblocks
[not found] ` <CABJyUz+r7yQSswqyBng_W=fAXxTz9heb88NeiKSeFU7j4ZD=Hw@mail.gmail.com>
@ 2012-11-08 21:02 ` Dave Chinner
[not found] ` <CABJyUzKy4YXVZAj=awVNfPD69eWo2mKhM7a-xWF1Vy-PD989sg@mail.gmail.com>
0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-11-08 21:02 UTC (permalink / raw)
To: Aaron Goulding; +Cc: xfs
On Thu, Nov 08, 2012 at 07:16:06AM -0800, Aaron Goulding wrote:
> Well I found an error in my bgrep. I was searching for XFSB but in the
> wrong order. Once fixed, I get almost 19,000 results, but I note most of
That seems like way too many to be filesystem metadata hits. I would
have expected the same number of hits as there are AGs in the
filesystem. What you want is the XFSB string in the first 4
bytes of a sector, with the next sector having XAGF as the first
four bytes....
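That filter can be sketched in shell (an illustration under stated
assumptions, not a command from the thread; DEV points at a synthetic
image here, and would be the dump file or /dev/md1 in practice):

```shell
# Keep only XFSB hits that start a 512-byte sector AND whose following
# sector begins with XAGF -- the superblock/AGF layout described above.
DEV=/tmp/agscan.dat
{ printf 'XFSB'; dd if=/dev/zero bs=508 count=1 2>/dev/null; printf 'XAGF'; } > "$DEV"
printf 'XFSB-in-file-data' >> "$DEV"   # decoy hit at offset 516, not sector-aligned

grep -abo XFSB "$DEV" | cut -d: -f1 | while read -r off; do
    [ $((off % 512)) -eq 0 ] || continue
    next=$(dd if="$DEV" bs=1 skip=$((off + 512)) count=4 2>/dev/null)
    [ "$next" = "XAGF" ] && echo "candidate superblock at byte $off"
done
# prints: candidate superblock at byte 0
```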
> those results happen within one section on the drive. I've attached the
> first 1MB from that point (4398047346688 bytes in) and I will grab the
> other results as 1K blocks.
I'm not going to look at 19000 potential hits. Narrow it down to
likely candidates first with the sector-location filter...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS filesystem recovery from secondary superblocks
[not found] ` <CABJyUzKy4YXVZAj=awVNfPD69eWo2mKhM7a-xWF1Vy-PD989sg@mail.gmail.com>
@ 2012-11-09 10:48 ` Dave Chinner
2012-11-09 15:22 ` Aaron Goulding
0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-11-09 10:48 UTC (permalink / raw)
To: Aaron Goulding; +Cc: xfs
On Thu, Nov 08, 2012 at 11:28:20PM -0800, Aaron Goulding wrote:
> On Thu, Nov 8, 2012 at 1:02 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> > That seems like way too many to be filesystem metadata hits. I would
> > have expected the same number of hits as there are AGs in the
> > filesystem. What you want is the XFSB string in the first 4
> > bytes of a sector, with the next sector having XAGF as the first
> > four bytes....
> >
>
> Okay this proved helpful. I ran a check of the first 1024 bytes at each of
> those 19000 results for XAGF, then limited that to results that occurred at
> the 512 byte relative mark. This dropped the results down to 11. I then
> went and grabbed 8192 bytes starting at each of those 11 points, and
> attached them.
bad:
4398367673432.dmp2 - has a header followed by zeros, then parts of
transaction headers, etc.; it just looks like random bits of filesystem
metadata. It's probably the log.
12094629761024.dmp2 - has the first part of a superblock, but the
empty part of the sb is not zeroed - looks like a utf8 string?
Perhaps directory structure? components.ini.new is the repeated
name... The AGI and AGF look intact. Has this fs been grown in the
past?
13194139877376.dmp2 - Looks like an empty, pristine AG just freshly
allocated by mkfs. (The AGFL contains only blocks 4, 5, 6 and 7, and the
AGF and ABTB blocks show a single large freespace extent covering the
entire AG, 0x937a4 blocks in size.)
So, judging by what I've got here, the file names are the block
offset of the data and that the offsets start at 0. That gives me
the headers for AGs 1, 3, 4, (the log in AG 4), 5, 6, 7, 9, 10, 11
and 12. Missing are 0, 2 and 8.
/me hugs xfs_db
$ xfs_db -c "sb 0" -c p -f 1099514834944.dmp2
<snip warnings>
magicnum = 0x58465342
blocksize = 4096
dblocks = 3375866880
rblocks = 0
rextents = 0
uuid = e36d4151-2bf0-4f0e-87e2-c21963022640
logstart = 1073741828
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 268435455
agcount = 13
rbmblocks = 0
logblocks = 521728
versionnum = 0xb4b4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 28
rextslog = 0
inprogress = 0
imax_pct = 5
icount = 3970688
ifree = 11
fdblocks = 260163907
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0xa
bad_features2 = 0xa
So I now know that there are only 13 AGs (0-12), that you are only
missing 0, 2 and 8, and that the log starts at 4398046527490, which
matches the chunk that I thought was a log record. The AGFs
indicate the sequence numbers are in the correct order and match the
offsets, so AFAICT you've assembled the array in the right order.
I think the best thing you could do is copy the first 512 bytes from
the 1099514834944.dmp2 to the first sector of the device you dumped
this from (i.e. offset zero), and then run xfs_repair -n on it
to see whether it validates as correct. (Save the sector contents
first, just in case.)
I'd say there's probably only a small chance of recovering
much; there's a very good chance that you'll have to resort to tools
like xfs_irecover to find lost inodes on the disk to be able to
recover data. Still, one step at a time...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS filesystem recovery from secondary superblocks
2012-11-09 10:48 ` Dave Chinner
@ 2012-11-09 15:22 ` Aaron Goulding
2012-11-11 7:08 ` Aaron Goulding
2012-11-11 22:36 ` Dave Chinner
0 siblings, 2 replies; 12+ messages in thread
From: Aaron Goulding @ 2012-11-09 15:22 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Fri, Nov 9, 2012 at 2:48 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Nov 08, 2012 at 11:28:20PM -0800, Aaron Goulding wrote:
> > On Thu, Nov 8, 2012 at 1:02 PM, Dave Chinner <david@fromorbit.com>
> wrote:
> >
> > > That seems like way too many to be filesystem metadata hits. I woul
> > > dhave expected the same number of hits as there are AGs in the
> > > filesytem. What you want is the XFSB string in the first 4
> > > bytes of a sector, with the next sector having XAGF as the first
> > > four bytes....
> > >
> >
> > Okay this proved helpful. I ran a check of the first 1024 bytes at each
> of
> > those 19000 results for XAGF, then limited that to results that occurred
> at
> > the 512 byte relative mark. This dropped the results down to 11. I then
> > went and grabbed 8192 bytes starting at each of those 11 points, and
> > attached them.
>
> bad:
>
> 4398367673432.dmp2 - has a header followed by zeros, then parts of
> transaction headers, etc.; it just looks like random bits of filesystem
> metadata. It's probably the log.
>
> 12094629761024.dmp2 - has the first part of a superblock, but the
> empty part of the sb is not zeroed - looks like a utf8 string?
> Perhaps directory structure? components.ini.new is the repeated
> name... The AGI and AGF look intact. Has this fs been grown in the
> past?
>
> 13194139877376.dmp2 - Looks like an empty, pristine AG just freshly
> allocated by mkfs. (The AGFL contains only blocks 4, 5, 6 and 7, and the
> AGF and ABTB blocks show a single large freespace extent covering the
> entire AG, 0x937a4 blocks in size.)
>
> So, judging by what I've got here, the file names are the block
> offset of the data and that the offsets start at 0. That gives me
> the headers for AGs 1, 3, 4, (the log in AG 4), 5, 6, 7, 9, 10, 11
> and 12. Missing are 0, 2 and 8.
>
> /me hugs xfs_db
>
> $ xfs_db -c "sb 0" -c p -f 1099514834944.dmp2
> <snip warnings>
> <snip sb 0 field dump>
>
> So I now know that there are only 13 AGs (0-12), that you are only
> missing 0, 2 and 8, and that the log starts at 4398046527490, which
> matches the chunk that I thought was a log record. The AGFs
> indicate the sequence numbers are in the correct order and match the
> offsets, so AFAICT you've assembled the array in the right order.
>
> I think the best thing you could do is copy the first 512 bytes from
> the 1099514834944.dmp2 to the first sector of the device you dumped
> this from (i.e. offset zero), and then run xfs_repair -n on it
> to see whether it validates as correct. (Save the sector contents
> first, just in case.)

root@jarvis:~/dump1# dd if=1099514834944.dmp2 of=/dev/md1 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0649365 s, 7.9 kB/s
root@jarvis:~/dump1# xfs_repair -n /dev/md1
Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with
matching geometry !!!
attempting to find secondary superblock...
......................................................................................................................
Hrm, I'm guessing that's not a good response, but I'll let it run. If it
doesn't return good results, I'll try xfs_irecover.
-Aaron
* Re: XFS filesystem recovery from secondary superblocks
2012-11-09 15:22 ` Aaron Goulding
@ 2012-11-11 7:08 ` Aaron Goulding
2012-11-11 22:39 ` Dave Chinner
2012-11-11 22:36 ` Dave Chinner
1 sibling, 1 reply; 12+ messages in thread
From: Aaron Goulding @ 2012-11-11 7:08 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
As I guessed, xfs_repair didn't work. xfs_db does now load with warnings,
but I fear I don't know enough to use that tool properly. I've
done a search for xfs_irecover but I'm finding very little on it. Where
is that tool located? I understand a full restore is very unlikely at
this point, but if I can get anything back, I'll consider this project a
success and a learning experience. :)
-Aaron
* Re: XFS filesystem recovery from secondary superblocks
2012-11-09 15:22 ` Aaron Goulding
2012-11-11 7:08 ` Aaron Goulding
@ 2012-11-11 22:36 ` Dave Chinner
1 sibling, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-11 22:36 UTC (permalink / raw)
To: Aaron Goulding; +Cc: xfs
On Fri, Nov 09, 2012 at 07:22:01AM -0800, Aaron Goulding wrote:
> On Fri, Nov 9, 2012 at 2:48 AM, Dave Chinner <david@fromorbit.com> wrote:
>
> > On Thu, Nov 08, 2012 at 11:28:20PM -0800, Aaron Goulding wrote:
> > > On Thu, Nov 8, 2012 at 1:02 PM, Dave Chinner <david@fromorbit.com>
> > wrote:
> > >
> > > > That seems like way too many to be filesystem metadata hits. I would
> > > > have expected the same number of hits as there are AGs in the
> > > > filesystem. What you want is the XFSB string in the first 4
> > > > bytes of a sector, with the next sector having XAGF as the first
> > > > four bytes....
> > > >
> > >
> > > Okay this proved helpful. I ran a check of the first 1024 bytes at each
> > of
> > > those 19000 results for XAGF, then limited that to results that occurred
> > at
> > > the 512 byte relative mark. This dropped the results down to 11. I then
> > > went and grabbed 8192 bytes starting at each of those 11 points, and
> > > attached them.
......
> > $ xfs_db -c "sb 0" -c p -f 1099514834944.dmp2
> > <snip warnings>
> > magicnum = 0x58465342
> > blocksize = 4096
> > dblocks = 3375866880
> > rblocks = 0
> > rextents = 0
> > uuid = e36d4151-2bf0-4f0e-87e2-c21963022640
> > logstart = 1073741828
> > rootino = 128
> > rbmino = 129
> > rsumino = 130
> > rextsize = 1
> > agblocks = 268435455
....
> > blocklog = 12
....
> > So I now know that there are only 13 AGs (0-12), that you are only
> > missing 0, 2 and 8, and that the log starts at 4398046527490, which
> > matches the chunk that I thought was a log record. The AGFs
> > indicate the sequence numbers are in the correct order and match the
> > offsets, so AFAICT you've assembled the array in the right order.
> >
> > I think the best thing you could do is copy the first 512 bytes from
> > the 1099514834944.dmp2 to the first sector of the device you dumped
> > this from (i.e. offset zero), and then run xfs_repair -n on it
> > to see whether it validates as correct. (Save the sector contents
> > first, just in case.)
> >
> > root@jarvis:~/dump1# dd if=1099514834944.dmp2 of=/dev/md1 bs=512 count=1
> 1+0 records in
> 1+0 records out
> 512 bytes (512 B) copied, 0.0649365 s, 7.9 kB/s
> root@jarvis:~/dump1# xfs_i
> xfs_info xfs_io
> root@jarvis:~/dump1# xfs_repair -n /dev/md1
> Phase 1 - find and verify superblock...
> couldn't verify primary superblock - not enough secondary superblocks with
> matching geometry !!!
Which means it found less than 13/2 = 6 valid secondary superblocks.
I should have looked at this previously: the addresses of the
secondary superblocks should be:
$ for i in `seq 0 1 12`; do echo $((i * 268435455 << 12)) ; done
0
1099511623680
2199023247360
3298534871040
4398046494720
5497558118400
6597069742080
7696581365760
8796092989440
9895604613120
10995116236800
12094627860480
13194139484160
and the offsets with superblocks in them are:
1099514834944.dmp2
3298535133184.dmp2
4398047346688.dmp2
5497559560192.dmp2
6597069676544.dmp2
7696579268608.dmp2
9895606972416.dmp2
10995117088768.dmp2
12094629761024.dmp2
13194139877376.dmp2
All wrong. The difference for AG 1 is 3,211,264 bytes, for AG 3
it is 262,144 bytes, for AG 4 it is 851,968 bytes, and so on. There is no
consistency in the difference between where you found the headers
and where they should be. There's no way I can put this back together
into a repairable filesystem...
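[Editor's note: for concreteness, the deltas quoted above fall straight out of the geometry from the recovered superblock (agblocks = 268435455, 4096-byte blocks); two of the found offsets, checked in shell.]

```shell
# Check two of the found offsets against where sb N should sit:
# N * agblocks * blocksize (geometry from the recovered superblock).
agblocks=268435455
blocksize=4096
delta1=$(( 1099514834944 - 1 * agblocks * blocksize ))  # AG 1, found offset
delta3=$(( 3298535133184 - 3 * agblocks * blocksize ))  # AG 3, found offset
echo "AG1 off by $delta1 bytes, AG3 off by $delta3 bytes"
# AG1 off by 3211264 bytes, AG3 off by 262144 bytes
```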
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: XFS filesystem recovery from secondary superblocks
2012-11-11 7:08 ` Aaron Goulding
@ 2012-11-11 22:39 ` Dave Chinner
2012-11-14 3:26 ` Aaron Goulding
0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-11-11 22:39 UTC (permalink / raw)
To: Aaron Goulding; +Cc: xfs
On Sat, Nov 10, 2012 at 11:08:23PM -0800, Aaron Goulding wrote:
> As I guessed, xfs_repair didn't work. xfs_db does now load with warnings,
> but I fear I don't know enough about that to properly use that tool. I've
> done a search for xfs_irepair but I'm finding very little from that. Where
> is that tool located? I'm understanding a full restore is very unlikely at
> this point, but if I can get anything, I'll consider this project a success
> and a learning experience. :)
xfs_irecover:
http://oss.sgi.com/archives/xfs/2008-12/msg01782.html
current location:
http://inai.de/projects/hxtools/
You might need to hack it to recover full files (ISTR it ignores
files larger than a certain size), but tools like this are your best
bet now.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS filesystem recovery from secondary superblocks
2012-11-11 22:39 ` Dave Chinner
@ 2012-11-14 3:26 ` Aaron Goulding
2012-11-25 0:20 ` Aaron Goulding
0 siblings, 1 reply; 12+ messages in thread
From: Aaron Goulding @ 2012-11-14 3:26 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
[-- Attachment #1.1: Type: text/plain, Size: 1306 bytes --]
Hmm... Okay, new plan. Running XFSB and XAGF scans on /dev/md0 instead of
/dev/md1, and I'll use that to find the alignment the superblocks are
expecting, and see if I have an offset wrong somewhere when I ran the
multi-part dd onto /dev/md1. Barring this, I'll resort to xfs_irecover. I
think you've given me a lot of information to go on though, so thank you
greatly.
-Aaron
On Sun, Nov 11, 2012 at 2:39 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Sat, Nov 10, 2012 at 11:08:23PM -0800, Aaron Goulding wrote:
> > As I guessed, xfs_repair didn't work. xfs_db does now load with
> > warnings, but I fear I don't know enough about that to properly use
> > that tool. I've done a search for xfs_irepair but I'm finding very
> > little from that. Where is that tool located? I'm understanding a full
> > restore is very unlikely at this point, but if I can get anything, I'll
> > consider this project a success and a learning experience. :)
>
> xfs_irecover:
>
> http://oss.sgi.com/archives/xfs/2008-12/msg01782.html
>
> current location:
>
> http://inai.de/projects/hxtools/
>
> You might need to hack it to recover full files (ISTR it ignores
> files larger than a certain size), but tools like this are your best
> bet now.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: XFS filesystem recovery from secondary superblocks
2012-11-14 3:26 ` Aaron Goulding
@ 2012-11-25 0:20 ` Aaron Goulding
0 siblings, 0 replies; 12+ messages in thread
From: Aaron Goulding @ 2012-11-25 0:20 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
[-- Attachment #1.1: Type: text/plain, Size: 2601 bytes --]
So an update! I did another search of /dev/md0 for superblock locations
(find all instances of XFSB, find all instances of XAGF that occur 512
bytes after XFSB) and used those as start points to copy the data to
/dev/md1. I took the block count from the superblock (right at 1TB per AG)
and copied that block of data to the array, properly aligned.
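[Editor's note: the realignment copy can be sketched with dd's skip/seek. This is a hypothetical reconstruction on small scratch files; the real source and destination would be /dev/md0 and /dev/md1, with an AG size of 268435455 * 4096 bytes.]

```shell
# Sketch: copy one AG's worth of data from where it was found on the
# source to its properly aligned offset on the destination. Demo values;
# the real agsize is 268435455 * 4096, and a misaligned found offset would
# need byte-granular skip (e.g. GNU dd's iflag=skip_bytes).
agsize=4096        # demo AG size
agno=1             # AG being restored
found=8192         # demo offset where the AG's superblock was found
src=/tmp/src.img; dst=/tmp/dst.img
dd if=/dev/zero of=$src bs=4096 count=4 2>/dev/null
printf 'XFSB' | dd of=$src bs=1 seek=$found conv=notrunc 2>/dev/null
dd if=/dev/zero of=$dst bs=4096 count=4 2>/dev/null
dd if=$src of=$dst bs=$agsize skip=$((found / agsize)) \
   seek=$agno count=1 conv=notrunc 2>/dev/null
dd if=$dst bs=1 skip=$agsize count=4 2>/dev/null   # prints XFSB
```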
This seems to have worked rather well! xfs_db now has far fewer errors,
and can read SBs and AGFs, and xfs_repair can do something with it.
From here I did two things. I ran xfs_repair -L to purge the log, as I
wasn't able to replay it (and some data loss is acceptable), and then I
mounted the filesystem. This worked! Though it was now showing 22GB used,
all in lost+found. Oops.
I noticed AG 0 reports a lot more free space than any of the others, so I
tried copying sb 1 over sb 0 to get things to use the more intact blocks.
No such luck yet, but I'm definitely getting some data. When I tried using
sb 12, df shows 11T used (about right) but the file list is still very
sparse.
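[Editor's note: the "sb 1 over sb 0" step reduces to a pair of dd commands, sketched here on a scratch file. On the real /dev/md1, sb 1 sits at agblocks * blocksize = 1099511623680 bytes, and saving the original sector first is essential.]

```shell
# Sketch: back up sector 0, then copy the secondary superblock over the
# primary. Demo offsets on a scratch file; on the real device sb1_off
# would be agblocks * blocksize.
dev=/tmp/fs.img
sb1_off=4096       # demo; really 268435455 * 4096
dd if=/dev/zero of=$dev bs=4096 count=2 2>/dev/null
printf 'XFSB' | dd of=$dev bs=1 seek=$sb1_off conv=notrunc 2>/dev/null
dd if=$dev of=/tmp/sb0.backup bs=512 count=1 2>/dev/null        # save sb 0
dd if=$dev of=$dev bs=512 count=1 skip=$((sb1_off / 512)) \
   conv=notrunc 2>/dev/null                                      # sb 1 -> sb 0
dd if=$dev bs=1 count=4 2>/dev/null   # now prints XFSB
```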
So moving onward! Any recommendations from this point? I take it this is
where I'd run xfs_irecover?
Aaron
On Tue, Nov 13, 2012 at 7:26 PM, Aaron Goulding <aarongldng@gmail.com> wrote:
> Hmm.. Okay new plan. Running XFSB and XAGF scans on /dev/md0 instead of
> /dev/md1, and I'll use that to find the alignment the superblocks are
> expecting, and see if I have an offset wrong somewhere when I ran the
> multi-part DD onto /dev/md1. Barring this, I'll resort to xfs_irecover. I
> think you've given me a lot of information to go on though, so thank you
> greatly.
>
> -Aaron
>
>
>
> On Sun, Nov 11, 2012 at 2:39 PM, Dave Chinner <david@fromorbit.com> wrote:
>
>> On Sat, Nov 10, 2012 at 11:08:23PM -0800, Aaron Goulding wrote:
>> > As I guessed, xfs_repair didn't work. xfs_db does now load with
>> > warnings, but I fear I don't know enough about that to properly use
>> > that tool. I've done a search for xfs_irepair but I'm finding very
>> > little from that. Where is that tool located? I'm understanding a full
>> > restore is very unlikely at this point, but if I can get anything, I'll
>> > consider this project a success and a learning experience. :)
>>
>> xfs_irecover:
>>
>> http://oss.sgi.com/archives/xfs/2008-12/msg01782.html
>>
>> current location:
>>
>> http://inai.de/projects/hxtools/
>>
>> You might need to hack it to recover full files (ISTR it ignores
>> files larger than a certain size), but tools like this are your best
>> bet now.
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david@fromorbit.com
>>
>
>
end of thread, other threads:[~2012-11-25 0:18 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-31 5:02 XFS filesystem recovery from secondary superblocks Aaron Goulding
2012-11-01 9:18 ` Emmanuel Florac
2012-11-01 22:59 ` Dave Chinner
2012-11-07 15:24 ` Aaron Goulding
[not found] ` <CABJyUz+r7yQSswqyBng_W=fAXxTz9heb88NeiKSeFU7j4ZD=Hw@mail.gmail.com>
2012-11-08 21:02 ` Dave Chinner
[not found] ` <CABJyUzKy4YXVZAj=awVNfPD69eWo2mKhM7a-xWF1Vy-PD989sg@mail.gmail.com>
2012-11-09 10:48 ` Dave Chinner
2012-11-09 15:22 ` Aaron Goulding
2012-11-11 7:08 ` Aaron Goulding
2012-11-11 22:39 ` Dave Chinner
2012-11-14 3:26 ` Aaron Goulding
2012-11-25 0:20 ` Aaron Goulding
2012-11-11 22:36 ` Dave Chinner