* xfs_repair deletes files after power cut
@ 2013-08-14 13:06 Semion Zak (sezak)
2013-08-15 0:02 ` Dave Chinner
0 siblings, 1 reply; 7+ messages in thread
From: Semion Zak (sezak) @ 2013-08-14 13:06 UTC (permalink / raw)
To: xfs@oss.sgi.com; +Cc: xtv-fs-group-nds-dg(mailer list)
[-- Attachment #1.1: Type: text/plain, Size: 9016 bytes --]
Hello,
There is a problem in XFS: xfs_repair deletes files after a power cut, reporting "data fork in rt inode x claims used rt block y"
Scenario:
An empty XFS partition and a realtime partition with an extent size of 3008 sectors.
1. In a loop, simultaneously:
a. 2 threads simultaneously write 1 stream file in the realtime partition
b. 1 thread writes 3 files into the data partition.
c. 1 thread punches holes in the stream file
d. In the middle of the loop, switch off the disk power.
2. Drop caches ("echo 3 > /proc/sys/vm/drop_caches")
3. Unmount XFS
4. Switch the disk power on
5. Mount XFS (to replay log)
6. Unmount XFS
7. Repair XFS
8. Mount XFS
After the first mount (step 5) the stream file exists in the realtime partition.
After the second mount (step 8) the stream file is absent, because it was deleted by repair (see the attached repair log):
data fork in rt inode 197 claims used rt block 202731
bad data fork in inode 197
cleared inode 197
...
entry "0.STR" in shortform directory 195 references free inode 197
junking entry "0.STR" in directory inode 195
This seems to be an xfs_repair bug, because the stream file's bmap after the first mount looks OK (see attached bitmap1), with no overlapping extents.
The only file in the RT partition is 0.STR:
/rt/000000R0.DIR/0.STR:
0: [0..144383]: hole
1: [144384..147391]: 607625024..607628031
2: [147392..291775]: hole
3: [291776..294783]: 607772416..607775423
4: [294784..436159]: hole
5: [436160..439167]: 607916800..607919807
6: [439168..583551]: hole
7: [583552..586559]: 608064192..608067199
8: [586560..727935]: hole
9: [727936..730943]: 608208576..608211583
10: [730944..875327]: hole
11: [875328..878335]: 608355968..608358975
12: [878336..1019711]: hole
13: [1019712..1022719]: 608500352..608503359
14: [1022720..1167103]: hole
15: [1167104..1170111]: 608647744..608650751
16: [1170112..1311487]: hole
17: [1311488..1314495]: 608792128..608795135
18: [1314496..1458879]: hole
19: [1458880..1461887]: 608939520..608942527
20: [1461888..1603263]: hole
21: [1603264..1606271]: 609083904..609086911
22: [1606272..1750655]: hole
23: [1750656..1753663]: 609231296..609234303
24: [1753664..1895039]: hole
25: [1895040..1898047]: 609375680..609378687
26: [1898048..2042431]: hole
27: [2042432..2045439]: 609523072..609526079
28: [2045440..2186815]: hole
29: [2186816..2189823]: 609667456..609670463
30: [2189824..2334207]: hole
31: [2334208..2334719]: 609814848..609815359
32: [2334720..3853247]: 609815360..611333887
The only strange thing is that the last two extents are contiguous and could have been merged into one extent.
That is strange, but it is not a good reason to erase the file:
data fork in rt inode 197 claims used rt block 202731
bad data fork in inode 197
cleared inode 197
...
entry "0.STR" in shortform directory 195 references free inode 197
junking entry "0.STR" in directory inode 195
Block 202731 corresponds to sector 202731 * 3008 = 609814848, the beginning of extent 31.
This block is not used by any other extent, so it is not an error.
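The sector arithmetic can be checked directly; this is just a sketch of the conversion, using the 3008-sector rt extent size from the scenario above:

```shell
# rt extent size in 512-byte sectors, as configured in the scenario
rt_extsize=3008

# rt block 202731 (from the xfs_repair message) -> starting sector on the rt device
echo $(( 202731 * rt_extsize ))   # 609814848, the start of extent 31 in the bmap
```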
* kernel version (uname -a):
Linux SZUbuntu 3.10.4-031004-generic #201307282043 SMP Mon Jul 29 00:52:09 UTC 2013 i686 i686 i686 GNU/Linux
* xfsprogs version (xfs_repair -V)
xfs_repair version 3.1.9
* number of CPUs: 1 Core 2 Duo
* contents of /proc/meminfo
MemTotal: 1989360 kB
MemFree: 134036 kB
Buffers: 36512 kB
Cached: 1208408 kB
SwapCached: 0 kB
Active: 790032 kB
Inactive: 990384 kB
Active(anon): 621884 kB
Inactive(anon): 582020 kB
Active(file): 168148 kB
Inactive(file): 408364 kB
Unevictable: 32 kB
Mlocked: 32 kB
HighTotal: 1174216 kB
HighFree: 55364 kB
LowTotal: 815144 kB
LowFree: 78672 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 12 kB
Writeback: 0 kB
AnonPages: 535536 kB
Mapped: 171040 kB
Shmem: 668400 kB
Slab: 42020 kB
SReclaimable: 23712 kB
SUnreclaim: 18308 kB
KernelStack: 4000 kB
PageTables: 11120 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 994680 kB
Committed_AS: 4103576 kB
VmallocTotal: 122880 kB
VmallocUsed: 20204 kB
VmallocChunk: 61320 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 14328 kB
DirectMap2M: 899072 kB
* contents of /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=986132k,nr_inodes=199513,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=198936k,mode=755 0 0
/dev/disk/by-uuid/2bd39f09-5ca0-41a5-a4af-1f5985fb1f69 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
gvfsd-fuse /run/user/szak/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0
10.63.7.58:/ms/stb_storage/7401_fs/build/szak /mnt/nfs nfs rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.63.7.58,mountvers=3,mountport=1234,mountproto=udp,local_lock=all,addr=10.63.7.58 0 0
/dev/sdc2 /rt xfs rw,relatime,attr2,inode64,rtdev=/dev/sdc3,noquota 0 0
* contents of /proc/partitions
major minor #blocks name
8 0 976762584 sda
8 1 477894656 sda1
8 2 488381144 sda2
8 3 10485760 sda3
11 0 1048575 sr0
8 16 78156288 sdb
8 17 248832 sdb1
8 18 1 sdb2
8 21 77904896 sdb5
252 0 75812864 dm-0
252 1 2088960 dm-1
8 32 2930266584 sdc
8 33 976561152 sdc1
8 34 9765888 sdc2
8 35 1943938503 sdc3
* RAID layout (hardware and/or software):
No raid, /dev/sdc:
Number Start (sector) End (sector) Size Code Name
1 2048 1953124351 931.3 GiB 8300 Linux filesystem
2 1953124352 1972656127 9.3 GiB 8300 Linux filesystem
3 1972656128 5860533134 1.8 TiB 8300 Linux filesystem
* LVM configuration: no LVM
* type of disks you are using: WD30EURS
* write cache status of drives: write-caching = 1 (on)
* size of BBWC and mode it is running in: No
* xfs_info output on the filesystem in question:
meta-data=/dev/sdc2 isize=256 agcount=16, agsize=152592 blks
= sectsz=4096 attr=2
data = bsize=4096 blocks=2441472, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=2560, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =external extsz=1540096 blocks=485984625, rtextents=1292512
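As a cross-check, the extsz value reported here is in bytes and matches the 3008-sector rt extent size described in the scenario:

```shell
# 3008 sectors of 512 bytes each = the extsz reported by xfs_info
echo $(( 3008 * 512 ))   # 1540096
```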
* dmesg output showing all error messages and stack traces:
[166770.764560] XFS (sdc2): Mounting Filesystem
[166770.865189] XFS (sdc2): Starting recovery (logdev: internal)
[166771.042071] XFS (sdc2): Ending recovery (logdev: internal)
[166918.066626] XFS (sdc2): Mounting Filesystem
[166918.159045] XFS (sdc2): Ending clean mount
Thank you.
Semion
[-- Attachment #2: bitmap1 --]
[-- Type: application/octet-stream, Size: 1246 bytes --]
/rt/000000R0.DIR/0.STR:
0: [0..144383]: hole
1: [144384..147391]: 607625024..607628031
2: [147392..291775]: hole
3: [291776..294783]: 607772416..607775423
4: [294784..436159]: hole
5: [436160..439167]: 607916800..607919807
6: [439168..583551]: hole
7: [583552..586559]: 608064192..608067199
8: [586560..727935]: hole
9: [727936..730943]: 608208576..608211583
10: [730944..875327]: hole
11: [875328..878335]: 608355968..608358975
12: [878336..1019711]: hole
13: [1019712..1022719]: 608500352..608503359
14: [1022720..1167103]: hole
15: [1167104..1170111]: 608647744..608650751
16: [1170112..1311487]: hole
17: [1311488..1314495]: 608792128..608795135
18: [1314496..1458879]: hole
19: [1458880..1461887]: 608939520..608942527
20: [1461888..1603263]: hole
21: [1603264..1606271]: 609083904..609086911
22: [1606272..1750655]: hole
23: [1750656..1753663]: 609231296..609234303
24: [1753664..1895039]: hole
25: [1895040..1898047]: 609375680..609378687
26: [1898048..2042431]: hole
27: [2042432..2045439]: 609523072..609526079
28: [2045440..2186815]: hole
29: [2186816..2189823]: 609667456..609670463
30: [2189824..2334207]: hole
31: [2334208..2334719]: 609814848..609815359
32: [2334720..3853247]: 609815360..611333887
[-- Attachment #3: repair_log1 --]
[-- Type: application/octet-stream, Size: 1735 bytes --]
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
data fork in rt inode 197 claims used rt block 202731
bad data fork in inode 197
cleared inode 197
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
entry "0.STR" in shortform directory 195 references free inode 197
junking entry "0.STR" in directory inode 195
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
Phase 5 - rebuild AG headers and trees...
- generate realtime summary info and bitmap...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
[-- Attachment #4: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xfs_repair deletes files after power cut
2013-08-14 13:06 xfs_repair deletes files after power cut Semion Zak (sezak)
@ 2013-08-15 0:02 ` Dave Chinner
2013-08-19 11:00 ` Semion Zak (sezak)
0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2013-08-15 0:02 UTC (permalink / raw)
To: Semion Zak (sezak); +Cc: xtv-fs-group-nds-dg(mailer list), xfs@oss.sgi.com
On Wed, Aug 14, 2013 at 01:06:08PM +0000, Semion Zak (sezak) wrote:
> Hello,
>
>
>
> There is a problem in XFS: xfs_repair deletes files after power
> cut because of "data fork in rt inode x claims used rt block y"
What's it supposed to do with it if it is corrupt?
> Scenario:
>
> Empty XFS partition and real-time partition with extent size 3008
> sectors.
Umm, 3008 sectors for the rt extent size? That's extremely weird,
even for a RT device....
>
> 1. In a loop simultaneously:
>
> a. 2 threads simultaneously write 1 stream file in real time
> partition
>
> b. 1 thread writes 3 files into data partition.
>
> c. 1 thread makes holes in the stream files
>
> d. In the middle of the loop switch off the disk power.
So you're power failing a drive which has write caches turned on...
>
> 2. Drop caches ("echo 3>/proc/sys/vm/drop_caches")
>
> 3. Unmount XFS
>
> 4. Switch the disk power on
>
> 5. Mount XFS (to replay log)
>
> 6. Unmount XFS
>
> 7. Repair XFS
>
> 8. Mount XFS
>
>
>
> After the first mount (step 5) stream file exist in real time
> partition.
No, the inode and its metadata exist in the data partition. Only
the file data is in the realtime partition. The corruption is in the
metadata, not the realtime device.
> The only file in RT partition 0.STR:
>
> /rt/000000R0.DIR/0.STR:
>
> [extents 0-30 snipped]
> 31: [2334208..2334719]: 609814848..609815359
> 32: [2334720..3853247]: 609815360..611333887
>
> The only strange thing is that 2 the last extents are contiguous
> and could be united into 1 extent.
And that will, most likely, be what xfs_repair is barfing on. The
end of extent 31 is not aligned to the rt extent size, and so the
block starting extent 32 overlaps a rt extent already claimed by
extent 31.
So, there is an inconsistency in the extent map, and so xfs_repair
is correct in saying it's broken and trashing the file.
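The overlap described here can be seen with a little arithmetic (a sketch only; sector numbers are taken from the bmap above, rt extent size 3008 sectors):

```shell
rt_extsize=3008

# Which rt extent does each sector fall in?
# File extent 31 ends at sector 609815359; file extent 32 starts at 609815360.
echo $(( 609815359 / rt_extsize ))   # 202731 - extent 31 ends inside rt extent 202731
echo $(( 609815360 / rt_extsize ))   # 202731 - extent 32 starts in the same rt extent
```

Because both file extents map into rt extent 202731, repair sees the second extent claiming an rt block already accounted to the first, which is exactly the "claims used rt block 202731" complaint.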
This all sounds very familiar. I'm pretty sure this has been hit
before, and I thought we fixed it. Oh:
http://oss.sgi.com/archives/xfs/2012-09/msg00287.html
Can you see if this patch:
http://oss.sgi.com/archives/xfs/2012-09/msg00481.html
stops repair from removing the file?
It would appear that followup patches that fixed the kernel code
were never posted, and so the problem still exists in the kernel
code.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* RE: xfs_repair deletes files after power cut
2013-08-15 0:02 ` Dave Chinner
@ 2013-08-19 11:00 ` Semion Zak (sezak)
2013-10-09 9:55 ` Semion Zak (sezak)
0 siblings, 1 reply; 7+ messages in thread
From: Semion Zak (sezak) @ 2013-08-19 11:00 UTC (permalink / raw)
To: Dave Chinner; +Cc: xtv-fs-group-nds-dg(mailer list), xfs@oss.sgi.com
Hello Dave,
Thank you for fast and helpful answer.
I applied the patch and it really helped.
The only remaining problem was that a properly aligned 512-byte read-and-append to the file failed.
A 4K append succeeded, which is OK for my purposes.
Once more, thank you very much.
Semion.
* RE: xfs_repair deletes files after power cut
2013-08-19 11:00 ` Semion Zak (sezak)
@ 2013-10-09 9:55 ` Semion Zak (sezak)
2013-10-09 20:06 ` Dave Chinner
0 siblings, 1 reply; 7+ messages in thread
From: Semion Zak (sezak) @ 2013-10-09 9:55 UTC (permalink / raw)
To: Dave Chinner; +Cc: xtv-fs-group-nds-dg(mailer list), xfs@oss.sgi.com
Hello Dave,
Is the patch going to be merged into the mainline Linux code?
Thanks,
Semion
* Re: xfs_repair deletes files after power cut
2013-10-09 9:55 ` Semion Zak (sezak)
@ 2013-10-09 20:06 ` Dave Chinner
2013-10-14 13:10 ` Semion Zak (sezak)
0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2013-10-09 20:06 UTC (permalink / raw)
To: Semion Zak (sezak); +Cc: xtv-fs-group-nds-dg(mailer list), xfs@oss.sgi.com
On Wed, Oct 09, 2013 at 09:55:39AM +0000, Semion Zak (sezak) wrote:
> Hello Dave,
>
> Is the patch going to be implemented in the formal Linux code?
It needs to be. I've been busy with other stuff, so haven't done it
myself. I'll try to get to it soon, but if someone else wants to
pick it up sooner, then by all means....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* RE: xfs_repair deletes files after power cut
2013-10-09 20:06 ` Dave Chinner
@ 2013-10-14 13:10 ` Semion Zak (sezak)
2013-10-14 20:08 ` Dave Chinner
0 siblings, 1 reply; 7+ messages in thread
From: Semion Zak (sezak) @ 2013-10-14 13:10 UTC (permalink / raw)
To: Dave Chinner
Cc: Roee Friedman (rfriedma), Danny Shavit (dashavit),
xfs@oss.sgi.com
Hello Dave,
What should be done to deliver the patch?
Thanks,
Semion
* Re: xfs_repair deletes files after power cut
2013-10-14 13:10 ` Semion Zak (sezak)
@ 2013-10-14 20:08 ` Dave Chinner
0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2013-10-14 20:08 UTC (permalink / raw)
To: Semion Zak (sezak)
Cc: Roee Friedman (rfriedma), Danny Shavit (dashavit),
xfs@oss.sgi.com
On Mon, Oct 14, 2013 at 01:10:39PM +0000, Semion Zak (sezak) wrote:
> Hello Dave,
>
> What should be done to deliver the patch?
The repair patch really just needs review and testing, but fixing
the kernel side of things is more complex and I'm not sure what is
needed there yet...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com