xfs_repair deletes files after power cut

* xfs_repair deletes files after power cut
@ 2013-08-14 13:06 Semion Zak (sezak)
  2013-08-15  0:02 ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Semion Zak (sezak) @ 2013-08-14 13:06 UTC (permalink / raw)
  To: xfs@oss.sgi.com; +Cc: xtv-fs-group-nds-dg(mailer list)

[-- Attachment #1.1: Type: text/plain, Size: 9016 bytes --]

Hello,

There is a problem in XFS: xfs_repair deletes files after power cut because of "data fork in rt inode x claims used rt block y"

Scenario:

Empty XFS partition and real-time partition with extent size 3008 sectors.

1. In a loop simultaneously:

a. 2 threads simultaneously write 1 stream file in real time partition

b. 1 thread writes 3 files into data partition.

c. 1 thread makes holes in the stream files

d. In the middle of the loop switch off the disk power.

2. Drop caches ("echo 3>/proc/sys/vm/drop_caches")

3. Unmount XFS

4. Switch the disk power on

5. Mount XFS (to replay log)

6. Unmount XFS

7. Repair XFS

8. Mount XFS

After the first mount (step 5) stream file exist in real time partition.

After the second mount (step 8) stream file is absent because is deleted by repair (see attached repair_log):

data fork in rt inode 197 claims used rt block 202731

bad data fork in inode 197

cleared inode 197

...

entry "0.STR" in shortform directory 195 references free inode 197

junking entry "0.STR" in directory inode 195

Seems to be the xfs_repair bug because the streams files bitmap after the first mount seems to be OK (see attached bitmap1), without overlapped extents.

The only file in RT partition 0.STR:

/rt/000000R0.DIR/0.STR:

               0: [0..144383]: hole

               1: [144384..147391]: 607625024..607628031

               2: [147392..291775]: hole

               3: [291776..294783]: 607772416..607775423

               4: [294784..436159]: hole

               5: [436160..439167]: 607916800..607919807

               6: [439168..583551]: hole

               7: [583552..586559]: 608064192..608067199

               8: [586560..727935]: hole

               9: [727936..730943]: 608208576..608211583

               10: [730944..875327]: hole

               11: [875328..878335]: 608355968..608358975

               12: [878336..1019711]: hole

               13: [1019712..1022719]: 608500352..608503359

               14: [1022720..1167103]: hole

               15: [1167104..1170111]: 608647744..608650751

               16: [1170112..1311487]: hole

               17: [1311488..1314495]: 608792128..608795135

               18: [1314496..1458879]: hole

               19: [1458880..1461887]: 608939520..608942527

               20: [1461888..1603263]: hole

               21: [1603264..1606271]: 609083904..609086911

               22: [1606272..1750655]: hole

               23: [1750656..1753663]: 609231296..609234303

               24: [1753664..1895039]: hole

               25: [1895040..1898047]: 609375680..609378687

               26: [1898048..2042431]: hole

               27: [2042432..2045439]: 609523072..609526079

               28: [2045440..2186815]: hole

               29: [2186816..2189823]: 609667456..609670463

               30: [2189824..2334207]: hole

               31: [2334208..2334719]: 609814848..609815359

               32: [2334720..3853247]: 609815360..611333887

The only strange thing is that 2 the last extents are contiguous and could be united into 1 extent.

It is strange but is not a good reason to erase the file:

data fork in rt inode 197 claims used rt block 202731

bad data fork in inode 197

cleared inode 197

...

entry "0.STR" in shortform directory 195 references free inode 197

junking entry "0.STR" in directory inode 195

Block 202731corresponds to sector 202731* 3008 = 609814848, beginning of 31th extent.

This block is not used in other extents, so it is not an error.

* kernel version (uname -a):

Linux SZUbuntu 3.10.4-031004-generic #201307282043 SMP Mon Jul 29 00:52:09 UTC 2013 i686 i686 i686 GNU/Linux

* xfsprogs version (xfs_repair -V)

xfs_repair version 3.1.9

* number of CPUs: 1 Core 2 Duo

* contents of /proc/meminfo

MemTotal:        1989360 kB

MemFree:          134036 kB

Buffers:           36512 kB

Cached:          1208408 kB

SwapCached:            0 kB

Active:           790032 kB

Inactive:         990384 kB

Active(anon):     621884 kB

Inactive(anon):   582020 kB

Active(file):     168148 kB

Inactive(file):   408364 kB

Unevictable:          32 kB

Mlocked:              32 kB

HighTotal:       1174216 kB

HighFree:          55364 kB

LowTotal:         815144 kB

LowFree:           78672 kB

SwapTotal:             0 kB

SwapFree:              0 kB

Dirty:                12 kB

Writeback:             0 kB

AnonPages:        535536 kB

Mapped:           171040 kB

Shmem:            668400 kB

Slab:              42020 kB

SReclaimable:      23712 kB

SUnreclaim:        18308 kB

KernelStack:        4000 kB

PageTables:        11120 kB

NFS_Unstable:          0 kB

Bounce:                0 kB

WritebackTmp:          0 kB

CommitLimit:      994680 kB

Committed_AS:    4103576 kB

VmallocTotal:     122880 kB

VmallocUsed:       20204 kB

VmallocChunk:      61320 kB

HardwareCorrupted:     0 kB

AnonHugePages:         0 kB

HugePages_Total:       0

HugePages_Free:        0

HugePages_Rsvd:        0

HugePages_Surp:        0

Hugepagesize:       2048 kB

DirectMap4k:       14328 kB

DirectMap2M:      899072 kB

* contents of /proc/mounts

rootfs / rootfs rw 0 0

sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0

proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0

udev /dev devtmpfs rw,relatime,size=986132k,nr_inodes=199513,mode=755 0 0

devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0

tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=198936k,mode=755 0 0

/dev/disk/by-uuid/2bd39f09-5ca0-41a5-a4af-1f5985fb1f69 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0

none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0

none /sys/fs/fuse/connections fusectl rw,relatime 0 0

none /sys/kernel/debug debugfs rw,relatime 0 0

none /sys/kernel/security securityfs rw,relatime 0 0

none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0

none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0

none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0

binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0

rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0

gvfsd-fuse /run/user/szak/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0

10.63.7.58:/ms/stb_storage/7401_fs/build/szak /mnt/nfs nfs rw,relatime,vers=3,rsize=131072,wsize=131072,namle55,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.63.7.58,mountvers=3,mountport=1234,mountproudp,local_lock=all,addr=10.63.7.58 0 0

/dev/sdc2 /rt xfs rw,relatime,attr2,inode64,rtdev=/dev/sdc3,noquota 0 0

* contents of /proc/partitions

major minor  #blocks  name

   8        0  976762584 sda

   8        1  477894656 sda1

   8        2  488381144 sda2

   8        3   10485760 sda3

  11        0    1048575 sr0

   8       16   78156288 sdb

   8       17     248832 sdb1

   8       18          1 sdb2

   8       21   77904896 sdb5

252        0   75812864 dm-0

252        1    2088960 dm-1

   8       32 2930266584 sdc

   8       33  976561152 sdc1

   8       34    9765888 sdc2

   8       35 1943938503 sdc3

* RAID layout (hardware and/or software):

No raid, /dev/sdc:

Number  Start (sector)    End (sector)  Size       Code  Name

   1            2048      1953124351   931.3 GiB   8300  Linux filesystem

   2      1953124352      1972656127   9.3 GiB     8300  Linux filesystem

   3      1972656128      5860533134   1.8 TiB     8300  Linux filesystem

* LVM configuration: no LVM

* type of disks you are using: WD30EURS

* write cache status of drives: write-caching =  1 (on)

* size of BBWC and mode it is running in: No

* xfs_info output on the filesystem in question:

meta-data=/dev/sdc2              isize=256    agcount=16, agsize=152592 blks

         =                       sectsz=4096  attr=2

data     =                       bsize=4096   blocks=2441472, imaxpct=25

         =                       sunit=0      swidth=0 blks

naming   =version 2              bsize=4096   ascii-ci=0

log      =internal               bsize=4096   blocks=2560, version=2

         =                       sectsz=4096  sunit=1 blks, lazy-count=1

realtime =external               extsz=1540096 blocks=485984625, rtextents=1292512

* dmesg output showing all error messages and stack traces:

[166770.764560] XFS (sdc2): Mounting Filesystem

[166770.865189] XFS (sdc2): Starting recovery (logdev: internal)

[166771.042071] XFS (sdc2): Ending recovery (logdev: internal)

[166918.066626] XFS (sdc2): Mounting Filesystem

[166918.159045] XFS (sdc2): Ending clean mount

Thank you.

Semion

[-- Attachment #1.2: Type: text/html, Size: 33368 bytes --]

[-- Attachment #2: bitmap1 --]
[-- Type: application/octet-stream, Size: 1246 bytes --]

/rt/000000R0.DIR/0.STR:
	0: [0..144383]: hole
	1: [144384..147391]: 607625024..607628031
	2: [147392..291775]: hole
	3: [291776..294783]: 607772416..607775423
	4: [294784..436159]: hole
	5: [436160..439167]: 607916800..607919807
	6: [439168..583551]: hole
	7: [583552..586559]: 608064192..608067199
	8: [586560..727935]: hole
	9: [727936..730943]: 608208576..608211583
	10: [730944..875327]: hole
	11: [875328..878335]: 608355968..608358975
	12: [878336..1019711]: hole
	13: [1019712..1022719]: 608500352..608503359
	14: [1022720..1167103]: hole
	15: [1167104..1170111]: 608647744..608650751
	16: [1170112..1311487]: hole
	17: [1311488..1314495]: 608792128..608795135
	18: [1314496..1458879]: hole
	19: [1458880..1461887]: 608939520..608942527
	20: [1461888..1603263]: hole
	21: [1603264..1606271]: 609083904..609086911
	22: [1606272..1750655]: hole
	23: [1750656..1753663]: 609231296..609234303
	24: [1753664..1895039]: hole
	25: [1895040..1898047]: 609375680..609378687
	26: [1898048..2042431]: hole
	27: [2042432..2045439]: 609523072..609526079
	28: [2045440..2186815]: hole
	29: [2186816..2189823]: 609667456..609670463
	30: [2189824..2334207]: hole
	31: [2334208..2334719]: 609814848..609815359
	32: [2334720..3853247]: 609815360..611333887

[-- Attachment #3: repair_log1 --]
[-- Type: application/octet-stream, Size: 1735 bytes --]

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in rt inode 197 claims used rt block 202731
bad data fork in inode 197
cleared inode 197
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
entry "0.STR" in shortform directory 195 references free inode 197
junking entry "0.STR" in directory inode 195
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
Phase 5 - rebuild AG headers and trees...
        - generate realtime summary info and bitmap...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

[-- Attachment #4: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread