* xfsdump-3.0.4 problems
From: Mario Bachmann @ 2010-08-16 16:22 UTC
To: xfs

Hello,

my kernel is
Linux x2 2.6.35.2 #1 SMP Sun Aug 15 00:32:14 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ AuthenticAMD GNU/Linux

I get a lot of warnings with xfsdump-3.0.4 (both the Gentoo package 3.0.4-r1 and the git version):

x2 ~/source/d/xfsdump/dump # ./xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump_text.gz
./xfsdump: using file dump (drive_simple) strategy
./xfsdump: version 3.0.4 (dump format 3.0) - Running single-threaded
./xfsdump: level 0 dump of x2:/home
./xfsdump: dump date: Mon Aug 16 17:34:30 2010
./xfsdump: session id: 288ed27c-2b26-4c8b-a5d4-bfd1a32f4b6f
./xfsdump: session label: "Test"
./xfsdump: ino map phase 1: constructing initial dump list
./xfsdump: ino map phase 2: skipping (no pruning necessary)
./xfsdump: ino map phase 3: skipping (only one dump stream)
./xfsdump: ino map construction complete
./xfsdump: estimated dump size: 3236335808 bytes
./xfsdump: creating dump session media file 0 (media 0, file 0)
./xfsdump: dumping ino map
./xfsdump: dumping directories
./xfsdump: WARNING: could not stat dirent .crack-attack ino 100663674: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: WARNING: could not stat dirent .kvirc4.rc ino 239: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: WARNING: could not stat dirent .recently-used ino 240: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: WARNING: could not stat dirent .DownloadManager ino 100663836: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: WARNING: could not stat dirent .distcc ino 33554725: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: WARNING: could not stat dirent .javafx_eula_accepted ino 1327: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: WARNING: could not stat dirent solar_kassel_thomas.txt ino 1671: Das Argument ist ungültig: using null generation count in directory entry
[thousands of these messages...]
./xfsdump: WARNING: could not stat dirent -07a9540007aa7500-0000000000 ino 38797391: Das Argument ist ungültig: using null generation count in directory entry
./xfsdump: dumping non-directory files
./xfsdump: ending media file
./xfsdump: media file size 3319179552 bytes
./xfsdump: dump size (non-dir files) : 3313078008 bytes
./xfsdump: dump complete: 174 seconds elapsed
./xfsdump: Dump Status: SUCCESS

At the end there is a file of 2,8 GB. When restored, I have 3,2 GB.
A lot of files are simply not there!

With the "old" version xfsdump-3.0.1, I get no warnings!
xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump301_test.gz
xfsdump: using file dump (drive_simple) strategy
xfsdump: version 3.0.1 (dump format 3.0) - Running single-threaded
xfsdump: level 0 dump of x2:/home
xfsdump: dump date: Mon Aug 16 18:14:47 2010
xfsdump: session id: b75a088f-6481-4583-8020-7c67fbe92bca
xfsdump: session label: "Test"
xfsdump: ino map phase 1: constructing initial dump list
xfsdump: ino map phase 2: skipping (no pruning necessary)
xfsdump: ino map phase 3: skipping (only one dump stream)
xfsdump: ino map construction complete
xfsdump: estimated dump size: 6000020480 bytes
xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: ending media file
xfsdump: media file size 5701306656 bytes
xfsdump: dump size (non-dir files) : 5691937016 bytes
xfsdump: dump complete: 308 seconds elapsed
xfsdump: Dump Status: SUCCESS

And the file is 4,8 GB. All seems to be correct!

Mario

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xfsdump-3.0.4 problems
From: Dave Chinner @ 2010-08-16 22:30 UTC
To: Mario Bachmann; +Cc: xfs

On Mon, Aug 16, 2010 at 06:22:36PM +0200, Mario Bachmann wrote:
> Hello,
>
> my kernel is
> Linux x2 2.6.35.2 #1 SMP Sun Aug 15 00:32:14 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ AuthenticAMD GNU/Linux
>
> I get a lot of warnings with xfsdump-3.0.4 (both the Gentoo package 3.0.4-r1 and the git version):
>
> x2 ~/source/d/xfsdump/dump # ./xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump_text.gz
.....
> ./xfsdump: WARNING: could not stat dirent .crack-attack ino 100663674: Das Argument ist ungültig: using null generation count in directory entry

bulkstat is failing, indicating an invalid option was passed.

> With the "old" version xfsdump-3.0.1, I get no warnings!
> xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump301_test.gz
> xfsdump: using file dump (drive_simple) strategy
> xfsdump: version 3.0.1 (dump format 3.0) - Running single-threaded
....
> xfsdump: Dump Status: SUCCESS
>
> And the file is 4,8 GB. All seems to be correct!

I can't see any changes between 3.0.1 and 3.0.4 that would explain
this. Did you run them on the same machine and kernel? Did you build
them with the same compiler? If everything is the same, then perhaps
you could bisect (only a few changes so should be quick) to point
out the offending change.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: xfsdump-3.0.4 problems
From: Mario Bachmann @ 2010-08-17 6:32 UTC
To: Dave Chinner; +Cc: xfs

On Tue, 17 Aug 2010 08:30:21 +1000, Dave Chinner <david@fromorbit.com> wrote:

> On Mon, Aug 16, 2010 at 06:22:36PM +0200, Mario Bachmann wrote:
> > Hello,
> >
> > my kernel is
> > Linux x2 2.6.35.2 #1 SMP Sun Aug 15 00:32:14 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ AuthenticAMD GNU/Linux
> >
> > I get a lot of warnings with xfsdump-3.0.4 (both the Gentoo package 3.0.4-r1 and the git version):
> >
> > x2 ~/source/d/xfsdump/dump # ./xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump_text.gz
> .....
> > ./xfsdump: WARNING: could not stat dirent .crack-attack ino 100663674: Das Argument ist ungültig: using null generation count in directory entry
>
> bulkstat is failing, indicating an invalid option was passed.
>
> > With the "old" version xfsdump-3.0.1, I get no warnings!
> > xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump301_test.gz
> > xfsdump: using file dump (drive_simple) strategy
> > xfsdump: version 3.0.1 (dump format 3.0) - Running single-threaded
> ....
> > xfsdump: Dump Status: SUCCESS
> >
> > And the file is 4,8 GB. All seems to be correct!
>
> I can't see any changes between 3.0.1 and 3.0.4 that would explain
> this. Did you run them on the same machine and kernel? Did you build
> them with the same compiler? If everything is the same, then perhaps
> you could bisect (only a few changes so should be quick) to point
> out the offending change.
>
> Cheers,
>
> Dave.

After some testing, I think it is NOT a problem with xfsdump, but
with the new kernel 2.6.35.2. First I must correct my last
posting: xfsdump-3.0.1 DOES have the same problem as xfsdump-3.0.4
on kernel 2.6.35.2. It was just a coincidence that it worked one
time without problems...

Machines: I have two x86_64 and one x86. All machines have the
same problems after I upgraded all three kernels from 2.6.34.x to
2.6.35.2. So I believe it is a problem with 2.6.35.2 or the
combination of [2.6.35.2 & xfsdump].

Compiler: I use "gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4".

Testing List (on one machine only):
works: x86_64, 2.6.34.4, xfsdump-3.0.1
works: x86_64, 2.6.34.4, xfsdump-3.0.4
failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
failure: x86_64, 2.6.35.2, xfsdump-3.0.4

Mario
* Re: xfsdump-3.0.4 problems
From: Dave Chinner @ 2010-08-17 7:13 UTC
To: Mario Bachmann; +Cc: xfs

On Tue, Aug 17, 2010 at 08:32:27AM +0200, Mario Bachmann wrote:
> On Tue, 17 Aug 2010 08:30:21 +1000, Dave Chinner <david@fromorbit.com> wrote:
>
> > On Mon, Aug 16, 2010 at 06:22:36PM +0200, Mario Bachmann wrote:
> > > Hello,
> > >
> > > my kernel is
> > > Linux x2 2.6.35.2 #1 SMP Sun Aug 15 00:32:14 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ AuthenticAMD GNU/Linux
> > >
> > > I get a lot of warnings with xfsdump-3.0.4 (both the Gentoo package 3.0.4-r1 and the git version):
> > >
> > > x2 ~/source/d/xfsdump/dump # ./xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump_text.gz
> > .....
> > > ./xfsdump: WARNING: could not stat dirent .crack-attack ino 100663674: Das Argument ist ungültig: using null generation count in directory entry
> >
> > bulkstat is failing, indicating an invalid option was passed.
> >
> > > With the "old" version xfsdump-3.0.1, I get no warnings!
> > > xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump301_test.gz
> > > xfsdump: using file dump (drive_simple) strategy
> > > xfsdump: version 3.0.1 (dump format 3.0) - Running single-threaded
> > ....
> > > xfsdump: Dump Status: SUCCESS
> > >
> > > And the file is 4,8 GB. All seems to be correct!
> >
> > I can't see any changes between 3.0.1 and 3.0.4 that would explain
> > this. Did you run them on the same machine and kernel? Did you build
> > them with the same compiler? If everything is the same, then perhaps
> > you could bisect (only a few changes so should be quick) to point
> > out the offending change.
>
> After some testing, I think it is NOT a problem with xfsdump, but
> with the new kernel 2.6.35.2. First I must correct my last
> posting: xfsdump-3.0.1 DOES have the same problem as xfsdump-3.0.4
> on kernel 2.6.35.2. It was just a coincidence that it worked one
> time without problems...
>
> Machines: I have two x86_64 and one x86. All machines have the
> same problems after I upgraded all three kernels from 2.6.34.x to
> 2.6.35.2. So I believe it is a problem with 2.6.35.2 or the
> combination of [2.6.35.2 & xfsdump].
>
> Compiler: I use "gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4".
>
> Testing List (on one machine only):
> works: x86_64, 2.6.34.4, xfsdump-3.0.1
> works: x86_64, 2.6.34.4, xfsdump-3.0.4
> failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
> failure: x86_64, 2.6.35.2, xfsdump-3.0.4

Ok, that makes more sense - we changed the way bulkstat works
from 2.6.34 to 2.6.35 to correctly validate inode numbers being
passed in via bulkstat, and hence files unlinked during the dump run
could return EINVAL when validating the directory structure (as they
no longer exist). Is your system completely idle while the dump
is running, or are files being removed while the dump is running?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: xfsdump-3.0.4 problems
From: Mario Bachmann @ 2010-08-17 7:53 UTC
To: Dave Chinner; +Cc: xfs

On Tue, 17 Aug 2010 17:13:37 +1000, Dave Chinner <david@fromorbit.com> wrote:

> On Tue, Aug 17, 2010 at 08:32:27AM +0200, Mario Bachmann wrote:
> > On Tue, 17 Aug 2010 08:30:21 +1000, Dave Chinner <david@fromorbit.com> wrote:
> >
> > > On Mon, Aug 16, 2010 at 06:22:36PM +0200, Mario Bachmann wrote:
> > > > Hello,
> > > >
> > > > my kernel is
> > > > Linux x2 2.6.35.2 #1 SMP Sun Aug 15 00:32:14 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ AuthenticAMD GNU/Linux
> > > >
> > > > I get a lot of warnings with xfsdump-3.0.4 (both the Gentoo package 3.0.4-r1 and the git version):
> > > >
> > > > x2 ~/source/d/xfsdump/dump # ./xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump_text.gz
> > > .....
> > > > ./xfsdump: WARNING: could not stat dirent .crack-attack ino 100663674: Das Argument ist ungültig: using null generation count in directory entry
> > >
> > > bulkstat is failing, indicating an invalid option was passed.
> > >
> > > > With the "old" version xfsdump-3.0.1, I get no warnings!
> > > > xfsdump -l0 -L "Test" - /dev/sda2 |gzip - > /mnt/data2/dump301_test.gz
> > > > xfsdump: using file dump (drive_simple) strategy
> > > > xfsdump: version 3.0.1 (dump format 3.0) - Running single-threaded
> > > ....
> > > > xfsdump: Dump Status: SUCCESS
> > > >
> > > > And the file is 4,8 GB. All seems to be correct!
> > >
> > > I can't see any changes between 3.0.1 and 3.0.4 that would explain
> > > this. Did you run them on the same machine and kernel? Did you build
> > > them with the same compiler? If everything is the same, then perhaps
> > > you could bisect (only a few changes so should be quick) to point
> > > out the offending change.
> >
> > After some testing, I think it is NOT a problem with xfsdump, but
> > with the new kernel 2.6.35.2. First I must correct my last
> > posting: xfsdump-3.0.1 DOES have the same problem as xfsdump-3.0.4
> > on kernel 2.6.35.2. It was just a coincidence that it worked one
> > time without problems...
> >
> > Machines: I have two x86_64 and one x86. All machines have the
> > same problems after I upgraded all three kernels from 2.6.34.x to
> > 2.6.35.2. So I believe it is a problem with 2.6.35.2 or the
> > combination of [2.6.35.2 & xfsdump].
> >
> > Compiler: I use "gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4".
> >
> > Testing List (on one machine only):
> > works: x86_64, 2.6.34.4, xfsdump-3.0.1
> > works: x86_64, 2.6.34.4, xfsdump-3.0.4
> > failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
> > failure: x86_64, 2.6.35.2, xfsdump-3.0.4
>
> Ok, that makes more sense - we changed the way bulkstat works
> from 2.6.34 to 2.6.35 to correctly validate inode numbers being
> passed in via bulkstat, and hence files unlinked during the dump run
> could return EINVAL when validating the directory structure (as they
> no longer exist). Is your system completely idle while the dump
> is running, or are files being removed while the dump is running?
>
> Cheers,
>
> Dave.

I would call my system idle when I use xfsdump. No rm or mv operations
are running during the dump. The first machine has a dual core 2.9 GHz and
8 GB of RAM and the filesystems are not really big (~10GB used). The second
machine has a dual core 2 GHz and 2 GB of RAM.

It does not matter whether I dump the running root partition or the
extra home partition (even when no user is logged in, so there should
be absolutely no changes to the files; I also did a sync before the
dump).

What I tested now: after downgrading to 2.6.34.4 on both x86_64
machines, xfsdump worked again, but using an old kernel is no solution!

To describe the result on 2.6.35.2 clearly once more: xfsdump produces
a dump from which several gigabytes of data are simply missing. I think
about 30% of all files are missing.

Mario
* Re: xfsdump-3.0.4 problems
From: Dave Chinner @ 2010-08-17 9:05 UTC
To: Mario Bachmann; +Cc: xfs

On Tue, Aug 17, 2010 at 09:53:40AM +0200, Mario Bachmann wrote:
> On Tue, 17 Aug 2010 17:13:37 +1000, Dave Chinner <david@fromorbit.com> wrote:
> > > Compiler: I use "gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4".
> > >
> > > Testing List (on one machine only):
> > > works: x86_64, 2.6.34.4, xfsdump-3.0.1
> > > works: x86_64, 2.6.34.4, xfsdump-3.0.4
> > > failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
> > > failure: x86_64, 2.6.35.2, xfsdump-3.0.4
> >
> > Ok, that makes more sense - we changed the way bulkstat works
> > from 2.6.34 to 2.6.35 to correctly validate inode numbers being
> > passed in via bulkstat, and hence files unlinked during the dump run
> > could return EINVAL when validating the directory structure (as they
> > no longer exist). Is your system completely idle while the dump
> > is running, or are files being removed while the dump is running?
>
> I would call my system idle when I use xfsdump. No rm or mv operations
> are running during the dump. The first machine has a dual core 2.9 GHz and
> 8 GB of RAM and the filesystems are not really big (~10GB used). The second
> machine has a dual core 2 GHz and 2 GB of RAM.

Yup, I have reproduced it here. What is strange is that xfs_fsr uses
XFS_IOC_BULKSTAT_SINGLE, and that works fine on 2.6.35.2. The same
ioctl calls from xfsdump are failing, though, so something funny is
going on there.

I'll look into it further.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* [PATCH] Re: xfsdump-3.0.4 problems
From: Dave Chinner @ 2010-08-17 11:45 UTC
To: Mario Bachmann; +Cc: xfs

On Tue, Aug 17, 2010 at 07:05:34PM +1000, Dave Chinner wrote:
> On Tue, Aug 17, 2010 at 09:53:40AM +0200, Mario Bachmann wrote:
> > On Tue, 17 Aug 2010 17:13:37 +1000, Dave Chinner <david@fromorbit.com> wrote:
> > > > Compiler: I use "gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4".
> > > >
> > > > Testing List (on one machine only):
> > > > works: x86_64, 2.6.34.4, xfsdump-3.0.1
> > > > works: x86_64, 2.6.34.4, xfsdump-3.0.4
> > > > failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
> > > > failure: x86_64, 2.6.35.2, xfsdump-3.0.4
> > >
> > > Ok, that makes more sense - we changed the way bulkstat works
> > > from 2.6.34 to 2.6.35 to correctly validate inode numbers being
> > > passed in via bulkstat, and hence files unlinked during the dump run
> > > could return EINVAL when validating the directory structure (as they
> > > no longer exist). Is your system completely idle while the dump
> > > is running, or are files being removed while the dump is running?
> >
> > I would call my system idle when I use xfsdump. No rm or mv operations
> > are running during the dump. The first machine has a dual core 2.9 GHz and
> > 8 GB of RAM and the filesystems are not really big (~10GB used). The second
> > machine has a dual core 2 GHz and 2 GB of RAM.
>
> Yup, I have reproduced it here. What is strange is that xfs_fsr uses
> XFS_IOC_BULKSTAT_SINGLE, and that works fine on 2.6.35.2. The same
> ioctl calls from xfsdump are failing, though, so something funny is
> going on there.
>
> I'll look into it further.

Ok, there is nothing wrong with the changes to the bulkstat code;
when all the inodes in the filesystem are hot in the inode cache
xfsdump succeeds.

When I run xfs_fsr per file to exercise the XFS_IOC_BULKSTAT_SINGLE
path like so:

$ sudo find /mnt/test -type f -exec xfs_fsr -d -v {} \;

It succeeds without any bulkstat failures. A subsequent xfsdump
invocation then also succeeds without failure. Clearly the find
is populating the inode cache for the subsequent bulkstat calls.

Ok, so the reason this wasn't picked up is that xfs_fsr silently
ignores inodes for which it gets an error from bulkstat.

Dropping caches then running xfsdump:

$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ sudo xfsdump -l0 -L "Test" - /dev/vda 2> t.t |gzip - > ~/dump_test.gz

results in failures.

/me sighs

My fault. I screwed up the btree lookup for the inode validation.
Can you test the patch below?

Cheers,

Dave
--
Dave Chinner
david@fromorbit.com

xfs: fix untrusted inode number lookup

From: Dave Chinner <dchinner@redhat.com>

Commit 7124fe0a5b619d65b739477b3b55a20bf805b06d ("xfs: validate untrusted inode
numbers during lookup") changes the inode lookup code to do btree lookups for
untrusted inode numbers. This change made an invalid assumption about the
alignment of inodes and hence incorrectly calculated the first inode in the
cluster. As a result, some inode numbers were being incorrectly considered
invalid when they were actually valid.

The issue was not picked up by the xfstests suite because it always runs fsr
and dump (the two utilities that utilise the bulkstat interface) on cache hot
inodes and hence the lookup code in the cold cache path was not sufficiently
exercised to uncover this intermittent problem.

Fix the issue by relaxing the btree lookup criteria and then checking if the
record returned contains the inode number we are looking up. If we get an
incorrect record, then the inode number is invalid.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_ialloc.c |   16 ++++++++++------
 1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index abf80ae..5371d2d 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -1213,7 +1213,6 @@ xfs_imap_lookup(
 	struct xfs_inobt_rec_incore rec;
 	struct xfs_btree_cur	*cur;
 	struct xfs_buf		*agbp;
-	xfs_agino_t		startino;
 	int			error;
 	int			i;
 
@@ -1227,13 +1226,13 @@ xfs_imap_lookup(
 	}
 
 	/*
-	 * derive and lookup the exact inode record for the given agino. If the
-	 * record cannot be found, then it's an invalid inode number and we
-	 * should abort.
+	 * Lookup the inode record for the given agino. If the record cannot be
+	 * found, then it's an invalid inode number and we should abort. Once
+	 * we have a record, we need to ensure it contains the inode number
+	 * we are looking up.
 	 */
 	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno);
-	startino = agino & ~(XFS_IALLOC_INODES(mp) - 1);
-	error = xfs_inobt_lookup(cur, startino, XFS_LOOKUP_EQ, &i);
+	error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &i);
 	if (!error) {
 		if (i)
 			error = xfs_inobt_get_rec(cur, &rec, &i);
@@ -1246,6 +1245,11 @@ xfs_imap_lookup(
 	if (error)
 		return error;
 
+	/* check that the returned record contains the required inode */
+	if (rec.ir_startino > agino ||
+	    rec.ir_startino + XFS_IALLOC_INODES(mp) <= agino)
+		return EINVAL;
+
 	/* for untrusted inodes check it is allocated first */
 	if ((flags & XFS_IGET_UNTRUSTED) &&
 	    (rec.ir_free & XFS_INOBT_MASK(agino - rec.ir_startino)))
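[Editor's illustration, not part of the original mail.] The difference between the removed and the added lookup in the patch above can be sketched in plain user-space C. This is only a toy model under stated assumptions: the inode btree is replaced by a small sorted array, INODES_PER_CHUNK is assumed to be 64 and stands in for XFS_IALLOC_INODES(mp), the chunk start numbers are made up, and lookup_old/lookup_new are hypothetical names that mirror the masked XFS_LOOKUP_EQ logic and the XFS_LOOKUP_LE plus range-check logic respectively.

```c
#include <stddef.h>

/* Assumption: stands in for XFS_IALLOC_INODES(mp) in the patch. */
#define INODES_PER_CHUNK 64u

/*
 * Hypothetical inode chunk records, sorted by start inode number.
 * The chunks at 96 and 224 do not start on a multiple of 64, which
 * is the situation the old code did not expect.
 */
static const unsigned chunk_start[] = { 0, 96, 224 };
static const size_t nchunks = sizeof(chunk_start) / sizeof(chunk_start[0]);

/* Old logic: mask agino down to a 64-aligned value, then demand an
 * exact match on a record start (the XFS_LOOKUP_EQ style lookup). */
static int lookup_old(unsigned agino)
{
	unsigned startino = agino & ~(INODES_PER_CHUNK - 1);

	for (size_t i = 0; i < nchunks; i++)
		if (chunk_start[i] == startino)
			return 1;
	return 0;
}

/* New logic: take the last record whose start is <= agino (the
 * XFS_LOOKUP_LE style lookup), then check that the record's range
 * really contains agino. */
static int lookup_new(unsigned agino)
{
	int found = 0;
	unsigned start = 0;

	for (size_t i = 0; i < nchunks && chunk_start[i] <= agino; i++) {
		start = chunk_start[i];
		found = 1;
	}
	if (!found)
		return 0;
	return agino < start + INODES_PER_CHUNK;
}
```

With these made-up records, inode 100 lives in the chunk that starts at 96, but masking yields 64, so lookup_old(100) returns 0 while lookup_new(100) returns 1. Inode 170 falls into a gap between chunks and is rejected by both, which matches the explicit range check the patch adds after the relaxed lookup.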
* Re: [PATCH] Re: xfsdump-3.0.4 problems
From: Mario Bachmann @ 2010-08-17 15:47 UTC
To: Dave Chinner; +Cc: xfs

On Tue, 17 Aug 2010 21:45:50 +1000, Dave Chinner <david@fromorbit.com> wrote:

> On Tue, Aug 17, 2010 at 07:05:34PM +1000, Dave Chinner wrote:
> > On Tue, Aug 17, 2010 at 09:53:40AM +0200, Mario Bachmann wrote:
> > > On Tue, 17 Aug 2010 17:13:37 +1000, Dave Chinner <david@fromorbit.com> wrote:
> > > > > Compiler: I use "gcc (Gentoo 4.4.4-r1 p1.0, pie-0.4.5) 4.4.4".
> > > > >
> > > > > Testing List (on one machine only):
> > > > > works: x86_64, 2.6.34.4, xfsdump-3.0.1
> > > > > works: x86_64, 2.6.34.4, xfsdump-3.0.4
> > > > > failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
> > > > > failure: x86_64, 2.6.35.2, xfsdump-3.0.4
> > > >
> > > > Ok, that makes more sense - we changed the way bulkstat works
> > > > from 2.6.34 to 2.6.35 to correctly validate inode numbers being
> > > > passed in via bulkstat, and hence files unlinked during the dump run
> > > > could return EINVAL when validating the directory structure (as they
> > > > no longer exist). Is your system completely idle while the dump
> > > > is running, or are files being removed while the dump is running?
> > >
> > > I would call my system idle when I use xfsdump. No rm or mv operations
> > > are running during the dump. The first machine has a dual core 2.9 GHz and
> > > 8 GB of RAM and the filesystems are not really big (~10GB used). The second
> > > machine has a dual core 2 GHz and 2 GB of RAM.
> >
> > Yup, I have reproduced it here. What is strange is that xfs_fsr uses
> > XFS_IOC_BULKSTAT_SINGLE, and that works fine on 2.6.35.2. The same
> > ioctl calls from xfsdump are failing, though, so something funny is
> > going on there.
> >
> > I'll look into it further.
>
> Ok, there is nothing wrong with the changes to the bulkstat code;
> when all the inodes in the filesystem are hot in the inode cache
> xfsdump succeeds.
>
> When I run xfs_fsr per file to exercise the XFS_IOC_BULKSTAT_SINGLE
> path like so:
>
> $ sudo find /mnt/test -type f -exec xfs_fsr -d -v {} \;
>
> It succeeds without any bulkstat failures. A subsequent xfsdump
> invocation then also succeeds without failure. Clearly the find
> is populating the inode cache for the subsequent bulkstat calls.
>
> Ok, so the reason this wasn't picked up is that xfs_fsr silently
> ignores inodes for which it gets an error from bulkstat.
>
> Dropping caches then running xfsdump:
>
> $ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
> $ sudo xfsdump -l0 -L "Test" - /dev/vda 2> t.t |gzip - > ~/dump_test.gz
>
> results in failures.
>
> /me sighs
>
> My fault. I screwed up the btree lookup for the inode validation.
> Can you test the patch below?
>
> Cheers,
>
> Dave

Your patch works here with 2.6.35.2. :-)

Mario
* Re: [PATCH] Re: xfsdump-3.0.4 problems
From: Christoph Hellwig @ 2010-08-18 10:10 UTC
To: Dave Chinner; +Cc: Mario Bachmann, xfs

> xfs: fix untrusted inode number lookup
>
> From: Dave Chinner <dchinner@redhat.com>
>
> Commit 7124fe0a5b619d65b739477b3b55a20bf805b06d ("xfs: validate untrusted inode
> numbers during lookup") changes the inode lookup code to do btree lookups for
> untrusted inode numbers. This change made an invalid assumption about the
> alignment of inodes and hence incorrectly calculated the first inode in the
> cluster. As a result, some inode numbers were being incorrectly considered
> invalid when they were actually valid.
>
> The issue was not picked up by the xfstests suite because it always runs fsr
> and dump (the two utilities that utilise the bulkstat interface) on cache hot
> inodes and hence the lookup code in the cold cache path was not sufficiently
> exercised to uncover this intermittent problem.
>
> Fix the issue by relaxing the btree lookup criteria and then checking if the
> record returned contains the inode number we are looking up. If we get an
> incorrect record, then the inode number is invalid.

Looks good and fixes the dump issues I've seen in xfstests.

Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] Re: xfsdump-3.0.4 problems
From: Iustin Pop @ 2010-08-27 11:18 UTC
To: Dave Chinner; +Cc: Mario Bachmann, xfs

On Tue, Aug 17, 2010 at 09:45:50PM +1000, Dave Chinner wrote:
> My fault. I screwed up the btree lookup for the inode validation.
> Can you test the patch below?

I just saw that 2.6.35.4 has been released, but it doesn't include this
fix (as far as I can see). Could it be sent for inclusion in the next
stable release, please? (Yes, it fixes the issue for me too.)

thanks,
iustin
* Re: [PATCH] Re: xfsdump-3.0.4 problems
From: Dave Chinner @ 2010-08-27 11:40 UTC
To: Mario Bachmann, xfs

On Fri, Aug 27, 2010 at 01:18:20PM +0200, Iustin Pop wrote:
> On Tue, Aug 17, 2010 at 09:45:50PM +1000, Dave Chinner wrote:
> > My fault. I screwed up the btree lookup for the inode validation.
> > Can you test the patch below?
>
> I just saw that 2.6.35.4 has been released, but it doesn't include this
> fix (as far as I can see). Could it be sent for inclusion in the next
> stable release, please? (Yes, it fixes the issue for me too.)

The commit is now upstream, it had a "cc: stable@kernel.org" in it,
so it should get automatically queued for inclusion in the next
stable kernel release. If I don't see it appear in Greg's stable
queue once he starts processing the commits for the next stable
release, I'll chase it up....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH] Re: xfsdump-3.0.4 problems
From: Mario Bachmann @ 2010-09-24 9:53 UTC
To: Dave Chinner; +Cc: xfs

Hi there,

is the patch now in 2.6.35.5?

The more exact question is: does XFS (and xfsprogs) work perfectly again now?
My backup relies on xfsdump, so it would be a disaster if this does not work.

Thanks for your information!

Greetings
Mario

On Fri, 27 Aug 2010 21:40:55 +1000, Dave Chinner <david@fromorbit.com> wrote:

> On Fri, Aug 27, 2010 at 01:18:20PM +0200, Iustin Pop wrote:
> > On Tue, Aug 17, 2010 at 09:45:50PM +1000, Dave Chinner wrote:
> > > My fault. I screwed up the btree lookup for the inode validation.
> > > Can you test the patch below?
> >
> > I just saw that 2.6.35.4 has been released, but it doesn't include this
> > fix (as far as I can see). Could it be sent for inclusion in the next
> > stable release, please? (Yes, it fixes the issue for me too.)
>
> The commit is now upstream, it had a "cc: stable@kernel.org" in it,
> so it should get automatically queued for inclusion in the next
> stable kernel release. If I don't see it appear in Greg's stable
> queue once he starts processing the commits for the next stable
> release, I'll chase it up....
>
> Cheers,
>
> Dave.
* Re: [PATCH] Re: xfsdump-3.0.4 problems
From: Christoph Hellwig @ 2010-10-03 6:20 UTC
To: Mario Bachmann; +Cc: xfs

On Fri, Sep 24, 2010 at 11:53:17AM +0200, Mario Bachmann wrote:
> Hi there,
>
> is the patch now in 2.6.35.5?
>
> The more exact question is: does XFS (and xfsprogs) work perfectly again now?
> My backup relies on xfsdump, so it would be a disaster if this does not work.
>
> Thanks for your information!

I've not seen the mail that normally gets sent for -stable inclusion,
so I suspect it's not been included yet.
* Re: xfsdump-3.0.4 problems
From: Christoph Hellwig @ 2010-08-17 9:03 UTC
To: Dave Chinner; +Cc: Mario Bachmann, xfs

On Tue, Aug 17, 2010 at 05:13:37PM +1000, Dave Chinner wrote:
> > Testing List (on one machine only):
> > works: x86_64, 2.6.34.4, xfsdump-3.0.1
> > works: x86_64, 2.6.34.4, xfsdump-3.0.4
> > failure: x86_64, 2.6.35.2, xfsdump-3.0.1 (worked only one time)
> > failure: x86_64, 2.6.35.2, xfsdump-3.0.4
>
> Ok, that makes more sense - we changed the way bulkstat works
> from 2.6.34 to 2.6.35 to correctly validate inode numbers being
> passed in via bulkstat, and hence files unlinked during the dump run
> could return EINVAL when validating the directory structure (as they
> no longer exist). Is your system completely idle while the dump
> is running, or are files being removed while the dump is running?

I've also seen related issues in xfstests recently, but even going all
the way back to kernel 2.6.34 does not solve all of them for me.