* status of userspace release
From: Ben Myers @ 2012-10-25 15:15 UTC
To: xfs

Hi Folks,

We're working toward a userspace release this month. There are several patches
that need to go in first, including backing out the xfsdump format version bump
from Eric, fixes for the makefiles from Mike, and the Polish language update
for xfsdump from Jakub. If anyone knows of something else we need, now is the
time to flame about it. I will take a look around for other important patches
too.

This time I'm going to tag an -rc1 (probably later today or tomorrow). We'll
give everyone a few working days to do a final test and/or pipe up if we have
missed something important. Then if all goes well we'll cut the release next
Tuesday.

Regards,
Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: status of userspace release
From: Ben Myers @ 2012-10-26 21:57 UTC
To: xfs

Hi,

On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> We're working toward a userspace release this month. There are several
> patches that need to go in first, including backing out the xfsdump format
> version bump from Eric, fixes for the makefiles from Mike, and the Polish
> language update for xfsdump from Jakub. If anyone knows of something else
> we need, now is the time to flame about it. I will take a look around for
> other important patches too.
>
> This time I'm going to tag an -rc1 (probably later today or tomorrow).
> We'll give everyone a few working days to do a final test and/or pipe up
> if we have missed something important. Then if all goes well we'll cut
> the release next Tuesday.

I've tagged -rc1 for the upcoming releases of dmapi, xfsprogs, and xfsdump.
If we missed something important, now is the time to speak up.

Currently there are two items of which I'm aware:

1) The Polish (and German) translations of xfsdump are not working. I'll
   start a separate thread for that issue.

2) Christoph has pointed out this patch series:

   http://oss.sgi.com/archives/xfs/2012-05/msg00323.html
   http://oss.sgi.com/archives/xfs/2012-10/msg00541.html

Please take a look.

Thanks,
Ben
* Re: status of userspace release
From: Dave Chinner @ 2012-10-28 21:27 UTC
To: Ben Myers; +Cc: xfs

On Fri, Oct 26, 2012 at 04:57:41PM -0500, Ben Myers wrote:
> I've tagged -rc1 for the upcoming releases of dmapi, xfsprogs, and
> xfsdump. If we missed something important, now is the time to speak up.
>
> Currently there are two items of which I'm aware:
>
> 1) The Polish (and German) translations of xfsdump are not working. I'll
>    start a separate thread for that issue.
>
> 2) Christoph has pointed out this patch series:
>
>    http://oss.sgi.com/archives/xfs/2012-05/msg00323.html
>    http://oss.sgi.com/archives/xfs/2012-10/msg00541.html

I agree with Christoph - the kernel side intptr/uintptr stuff that Jan did
needs to go into the kernel first, and then be brought to libxfs via a
kernel code sync. A kernel code resync needs to be done immediately after
the xfsprogs release so that userspace is using the same base code for the
CRC work, so splitting them to before/after the releases makes sense IMO....

Cheers,
Dave.

--
Dave Chinner
david@fromorbit.com
* Re: status of userspace release
From: Ben Myers @ 2012-10-29 16:17 UTC
To: Dave Chinner; +Cc: Jan Engelhardt, Christoph Hellwig, xfs

Hi Dave,

On Mon, Oct 29, 2012 at 08:27:08AM +1100, Dave Chinner wrote:
> I agree with Christoph - the kernel side intptr/uintptr stuff that Jan
> did needs to go into the kernel first, and then be brought to libxfs via
> a kernel code sync. A kernel code resync needs to be done immediately
> after the xfsprogs release so that userspace is using the same base code
> for the CRC work, so splitting them to before/after the releases makes
> sense IMO....

Sounds good. I'll pull in 2, 3, and 6 for now.

-Ben
* Re: status of userspace release
From: Dave Chinner @ 2012-11-02 5:51 UTC
To: Ben Myers; +Cc: xfs

On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> We're working toward a userspace release this month. [...]
>
> This time I'm going to tag an -rc1 (probably later today or tomorrow).
> We'll give everyone a few working days to do a final test and/or pipe up
> if we have missed something important. Then if all goes well we'll cut
> the release next Tuesday.

I think that dump/restore need more work/testing.

I've just been running with whatever xfsdump I have had installed on my
test machines for some time. I think it was the 3.0.6 - whatever is in the
current debian unstable repository - or some version of 3.1.0 that I built
a while back.

I've already pointed Eric to the header checksum failures (forkoff patch
being needed), and that fixes the failures I've been seeing on normal
xfstests runs.

Running some large filesystem testing, however, I see more problems. I'm
using a 17TB filesystem and the --largefs patch series. This results in a
futex hang in 059 like so:

[ 4770.007858] xfsrestore      S ffff88021fc52d40  5504  3926   3487 0x00000000
[ 4770.007858]  ffff880212ea9c68 0000000000000082 ffff880207830140 ffff880212ea9fd8
[ 4770.007858]  ffff880212ea9fd8 ffff880212ea9fd8 ffff880216cec2c0 ffff880207830140
[ 4770.007858]  ffff880212ea9d08 ffff880212ea9d58 ffff880207830140 0000000000000000
[ 4770.007858] Call Trace:
[ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
[ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
[ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
[ 4770.007858]  [<ffffffff8113acf7>] ? __free_pages+0x47/0x70
[ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
[ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
[ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
[ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
[ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
[ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
[ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
[ 4770.007858] xfsrestore      S ffff88021fc52d40  5656  3927   3487 0x00000000
[ 4770.007858]  ffff880208f29c68 0000000000000082 ffff880208f84180 ffff880208f29fd8
[ 4770.007858]  ffff880208f29fd8 ffff880208f29fd8 ffff880216cec2c0 ffff880208f84180
[ 4770.007858]  ffff880208f29d08 ffff880208f29d58 ffff880208f84180 0000000000000000
[ 4770.007858] Call Trace:
[ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
[ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
[ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
[ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
[ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
[ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
[ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
[ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
[ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
[ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
[ 4770.007858] xfsrestore      S ffff88021fc92d40  5848  3928   3487 0x00000000
[ 4770.007858]  ffff880212d0dc68 0000000000000082 ffff880208e76240 ffff880212d0dfd8
[ 4770.007858]  ffff880212d0dfd8 ffff880212d0dfd8 ffff880216cf2300 ffff880208e76240
[ 4770.007858]  ffff880212d0dd08 ffff880212d0dd58 ffff880208e76240 0000000000000000
[ 4770.007858] Call Trace:
[ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
[ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
[ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
[ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
[ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
[ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
[ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
[ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
[ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
[ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b

I can't reliably reproduce it at this point, but there does appear to be
some kind of locking problem in the multistream support.

Speaking of which, most large filesystem dump/restore tests are failing
because of this output:

026 20s ... - output mismatch (see 026.out.bad)
--- 026.out     2012-10-05 11:37:51.000000000 +1000
+++ 026.out.bad 2012-11-02 16:20:17.000000000 +1100
@@ -20,6 +20,7 @@
 xfsdump: media file size NUM bytes
 xfsdump: dump size (non-dir files) : NUM bytes
 xfsdump: dump complete: SECS seconds elapsed
+xfsdump: stream 0 DUMP_FILE OK (success)
 xfsdump: Dump Status: SUCCESS
 Restoring from file...
 xfsrestore -f DUMP_FILE -L stress_026 RESTORE_DIR
@@ -32,6 +33,7 @@
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed
+xfsrestore: stream 0 DUMP_FILE OK (success)
 xfsrestore: Restore Status: SUCCESS
 Comparing dump directory with restore directory
 Files DUMP_DIR/big and RESTORE_DIR/DUMP_SUBDIR/big are identical

Which looks like output from the multistream code. Why it is emitting this
for large filesystem testing and not for small filesystems, I'm not sure
yet.

In fact, with --largefs, I see this for the dump group:

    Failures: 026 028 046 047 056 059 060 061 063 064 065 066 266 281 282 283
    Failed 16 of 19 tests

And this for the normal sized (10GB) scratch device:

    Passed all 18 tests

So there's something funky going on here....

Cheers,
Dave.

--
Dave Chinner
david@fromorbit.com
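The NUM and SECS placeholders in the diff above come from xfstests filtering variable fields out of the command output before comparing it against the golden .out file; the new "stream 0 ... OK (success)" lines fail the comparison because no filter rule (and no golden line) accounts for them. A sketch of that style of filter follows; _filter_dump is an illustrative name, not the exact filter xfstests uses.

```shell
# Normalize fields that vary between runs so the output can be
# diffed against a golden file, xfstests-style.
_filter_dump()
{
	sed -e 's/[0-9][0-9]* bytes/NUM bytes/g' \
	    -e 's/[0-9][0-9]* seconds/SECS seconds/g'
}

echo "xfsdump: media file size 1474560 bytes" | _filter_dump
echo "xfsdump: dump complete: 12 seconds elapsed" | _filter_dump
```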
* Re: status of userspace release
From: Ben Myers @ 2012-11-02 18:59 UTC
To: Dave Chinner; +Cc: xfs

Hi Dave,

On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
> I think that dump/restore need more work/testing.

Sounds good. AFAIK there is no blazing hurry to release immediately.

> I've already pointed Eric to the header checksum failures (forkoff patch
> being needed), and that fixes the failures I've been seeing on normal
> xfstests runs.

I've pulled that patch in. Interesting that it doesn't reproduce on i586
but is so reliable on x86_64. It's a good excuse to do some testing on a
wider set of arches before the release.

> Running some large filesystem testing, however, I see more problems. I'm
> using a 17TB filesystem and the --largefs patch series. This results in
> a futex hang in 059 like so:
>
> [ 4770.007858] xfsrestore      S ffff88021fc52d40  5504  3926   3487 0x00000000
> [...]
>
> I can't reliably reproduce it at this point, but there does appear to be
> some kind of locking problem in the multistream support.

One of my machines hit this overnight without --largefs. I wasn't able to
get a dump though. Just another data point.

> Speaking of which, most large filesystem dump/restore tests are failing
> because of this output:
>
> 026 20s ... - output mismatch (see 026.out.bad)
> [...]
>
> Which looks like output from the multistream code. Why it is emitting
> this for large filesystem testing and not for small filesystems, I'm not
> sure yet.
>
> In fact, with --largefs, I see this for the dump group:
>
>     Failures: 026 028 046 047 056 059 060 061 063 064 065 066 266 281 282 283
>     Failed 16 of 19 tests
>
> And this for the normal sized (10GB) scratch device:
>
>     Passed all 18 tests
>
> So there's something funky going on here....

Rich also reported some golden output related changes with --largefs
awhile back. I don't think he saw this one though.

The TODO list for the userspace release currently stands at:

1) fix the header checksum failures... which is resolved
2) fix a futex hang in 059
3) fix the golden output changes related to multistream support in xfsdump
   and --largefs
4) test on more platforms

Regards,
Ben
* Re: status of userspace release
From: Dave Chinner @ 2012-11-02 23:03 UTC
To: Ben Myers; +Cc: xfs

On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> > I think that dump/restore need more work/testing.
>
> Sounds good. AFAIK there is no blazing hurry to release immediately.

Agreed. Better to get it right ;)

> > Running some large filesystem testing, however, I see more problems.
> > I'm using a 17TB filesystem and the --largefs patch series. This
> > results in a futex hang in 059 like so:
> [...]
> > I can't reliably reproduce it at this point, but there does appear to
> > be some kind of locking problem in the multistream support.
>
> One of my machines hit this overnight without --largefs. I wasn't able
> to get a dump though. Just another data point.

Ok, that's good to know - it isn't directly related to the largefs testing
I'm doing.

> Rich also reported some golden output related changes with --largefs
> awhile back. I don't think he saw this one though.

No, this one is new, caused by upgrading xfsdump. As it turns out, the
previous version of xfsdump on this particular VM was from before the
multistream dump was implemented - it was a distro package rather than one
I'd custom built.

And, as it is, I just removed the --large-fs config (so my scratch device
is just an empty 17TB device) and I still get this extra output. So it's
not related to the --large-fs behaviour at all.

> The TODO list for the userspace release currently stands at:
>
> 1) fix the header checksum failures... which is resolved
> 2) fix a futex hang in 059
> 3) fix the golden output changes related to multistream support in
>    xfsdump and --largefs

Well, understand them first, then fix ;)

> 4) test on more platforms

I suspect that the futex hang is only going to be solvable if it can be
reliably reproduced. I haven't seen it again since the hang I reported.

Otherwise, sounds good.

Cheers,
Dave.

--
Dave Chinner
david@fromorbit.com
* Re: status of userspace release
From: Dave Chinner @ 2012-11-03 0:16 UTC
To: Ben Myers; +Cc: xfs

On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
> > The TODO list for the userspace release currently stands at:
> >
> > 1) fix the header checksum failures... which is resolved
> > 2) fix a futex hang in 059
> > 3) fix the golden output changes related to multistream support in
> >    xfsdump and --largefs
>
> Well, understand them first, then fix ;)
>
> > 4) test on more platforms

Another:

$ sudo xfs_info /mnt/scratch/
meta-data=/dev/vdc               isize=256    agcount=4, agsize=12800 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=51200, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=1200, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
$ sudo xfs_db -r -c "sb 0" -c "version" /dev/vdc
versionnum [0xb4a4+0x8a] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT
$

xfs_info is not reporting the 32 bit project ID status. Yes, I know this
requires the XFS_IOC_FSGEOM support for it, but I'd like this release to at
least say "off or unknown" here.

I say this, because this is the first thing I noticed when having a look at
a test 287 failure:

287 10s ... - output mismatch (see 287.out.bad)
--- 287.out     2012-10-05 11:38:08.000000000 +1000
+++ 287.out.bad 2012-11-03 10:55:15.000000000 +1100
@@ -2,22 +2,24 @@
 No 32bit project quotas:
 projid = 1234
 projid = 0
+xfs_quota: cannot set project on /mnt/scratch/pquota/32bit: Invalid argument
 With 32bit project quota support:
 projid = 1234
-projid = 2123456789
+projid = 0
+xfs_quota: cannot set project on /mnt/scratch/restore/pquota/32bitv2: Invalid argument
 The restored file system + one additional file:
 projid = 1234
-projid = 2123456789
-projid = 2123456789
+projid = 0
+projid = 0
 These two values of 16bit project quota ids shall be the same
-core.projid_lo = 1234
+core.projid_lo = 0
 core.projid_hi = 0
 core.projid_lo = 1234
 core.projid_hi = 0
 These three values of 32bit project quota ids shall be the same
-core.projid_lo = 24853
-core.projid_hi = 32401
-core.projid_lo = 24853
-core.projid_hi = 32401
-core.projid_lo = 24853
-core.projid_hi = 32401
+core.projid_lo = 0
+core.projid_hi = 0
+core.projid_lo = 0
+core.projid_hi = 0
+core.projid_lo = 0
+core.projid_hi = 0

Here's what's curious - this is failing on the 17TB filesystem, but is not
failing on 10-20GB filesystems. There seems to be a pattern here....

Note that I only recently updated xfstests on the VM with the 17TB
filesystem (i.e. on Wednesday), so this is probably the first time I have
run test 287 on a large filesystem like this. Same goes for much of the
other problems I'm reporting - xfstests on this machine has been running
out of a dev branch I hadn't updated for a while, so these problems might
have been around for a while on large filesystems...

Cheers,
Dave.

--
Dave Chinner
david@fromorbit.com
* Re: status of userspace release
From: Eric Sandeen @ 2012-11-03 1:35 UTC
To: Dave Chinner; +Cc: Ben Myers, xfs

On 11/2/12 7:16 PM, Dave Chinner wrote:
> $ sudo xfs_db -r -c "sb 0" -c "version" /dev/vdc
> versionnum [0xb4a4+0x8a] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT
> $
>
> xfs_info is not reporting the 32 bit project ID status.

Weird, I didn't realize that

[PATCH 2/2] xfsprogs: report projid32 status in growfs output

hadn't been pulled in.

> Yes, I know this requires the XFS_IOC_FSGEOM support for it, but I'd
> like this release to at least say "off or unknown" here.

Heh, ok, when you reviewed you said it was no big deal ;) but I guess we
can add the "or unknown" if you like.

> I say this, because this is the first thing I noticed when having a look
> at a test 287 failure:

Hm, that's pretty odd.

-Eric
* Re: status of userspace release
  2012-11-03  1:35 ` Eric Sandeen
@ 2012-11-03  1:55 ` Dave Chinner
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03 1:55 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Ben Myers, xfs

On Fri, Nov 02, 2012 at 08:35:32PM -0500, Eric Sandeen wrote:
> On 11/2/12 7:16 PM, Dave Chinner wrote:
> > On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
> >> On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> >>> Hi Dave,
> >>>
> >>> On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
> >>>> On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> >>>>> Hi Folks,
> >>>>>
> >>>>> We're working toward a userspace release this month. There are several patches
> >>>>> that need to go in first, including backing out the xfsdump format version bump
> >>>>> from Eric, fixes for the makefiles from Mike, and the Polish language update
> >>>>> for xfsdump from Jakub. If anyone knows of something else we need, now is the
> >>>>> time to flame about it. I will take a look around for other important patches
> >>>>> too.
> ....
> >>> The TODO list for userspace release currently stands at:
> >>>
> >>> 1) fix the header checksum failures... which is resolved
> >>> 2) fix a futex hang in 059
> >>> 3) fix the golden output changes related to multistream support in xfsdump
> >>>    and --largefs
> >>
> >> Well, understand them first, then fix ;)
> >>
> >>> 4) test on more platforms
> >
> > Another:
> >
> > $ sudo xfs_info /mnt/scratch/
> > meta-data=/dev/vdc              isize=256    agcount=4, agsize=12800 blks
> >          =                      sectsz=512   attr=2
> > data     =                      bsize=4096   blocks=51200, imaxpct=25
> >          =                      sunit=0      swidth=0 blks
> > naming   =version 2             bsize=4096   ascii-ci=0
> > log      =internal              bsize=4096   blocks=1200, version=2
> >          =                      sectsz=512   sunit=0 blks, lazy-count=1
> > realtime =none                  extsz=4096   blocks=0, rtextents=0
> > $ sudo xfs_db -r -c "sb 0" -c "version" /dev/vdc
> > versionnum [0xb4a4+0x8a] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT
> > $
> >
> > xfs_info is not reporting the 32 bit project ID status.
>
> Weird, I didn't realize that
> [PATCH 2/2] xfsprogs: report projid32 status in growfs output
> hadn't been pulled in.
>
> > Yes, I know this requires the XFS_IOC_FSGEOM support for it, but I'd
> > like it this release to at least say "off or unknown" here.
>
> Heh, ok, when you reviewed you said it was no big deal ;) but I guess
> we can add the "or unknown" if you like.

It probably doesn't matter that much, because we'll know if it is
supported by the kernel the user is running. Having it there is the
most important thing.

> > I say this, because this is the first thing I noticed when having a
> > look at a test 287 failure:
>
> Hm that's pretty odd.

Yeah, still need to get to the bottom of it.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: status of userspace release
  2012-11-03  0:16 ` Dave Chinner
  2012-11-03  1:35 ` Eric Sandeen
@ 2012-11-03  3:16 ` Dave Chinner
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03 3:16 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Sat, Nov 03, 2012 at 11:16:39AM +1100, Dave Chinner wrote:
> I say this, because this is the first thing I noticed when having a
> look at a test 287 failure:
>
> 7 10s ... - output mismatch (see 287.out.bad)
> --- 287.out	2012-10-05 11:38:08.000000000 +1000
> +++ 287.out.bad	2012-11-03 10:55:15.000000000 +1100
> @@ -2,22 +2,24 @@
>  No 32bit project quotas:
>  projid = 1234
>  projid = 0
> +xfs_quota: cannot set project on /mnt/scratch/pquota/32bit: Invalid argument
>  With 32bit project quota support:
>  projid = 1234
> -projid = 2123456789
> +projid = 0
> +xfs_quota: cannot set project on /mnt/scratch/restore/pquota/32bitv2: Invalid argument
>  The restored file system + one additional file:
>  projid = 1234
> -projid = 2123456789
> -projid = 2123456789
> +projid = 0
> +projid = 0
>  These two values of 16bit project quota ids shall be the same
> -core.projid_lo = 1234
> +core.projid_lo = 0
>  core.projid_hi = 0
>  core.projid_lo = 1234
>  core.projid_hi = 0
>  These three values of 32bit project quota ids shall be the same
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> +core.projid_lo = 0
> +core.projid_hi = 0
> +core.projid_lo = 0
> +core.projid_hi = 0
> +core.projid_lo = 0
> +core.projid_hi = 0
>
> Here's what's curious - this is failing on the 17TB filesystem, but
> is not failing on 10-20GB filesystems. There seems to be a pattern
> here....

This is caused by a longstanding bug in xfs_db. The fix below should
be included in the release, I think...

Cheers,

Dave
-- 
Dave Chinner
david@fromorbit.com


xfs_db: flush devices before exiting

From: Dave Chinner <dchinner@redhat.com>

Test 287 uses xfs_db to change 32-bit project ID support while the
filesystem is unmounted. On a large filesystem the test was failing
due to the mount not seeing the feature bit in the superblock.

xfs_db uses a different address space to the filesystem when it is
mounted by the kernel, so the only way to keep them coherent is to
ensure that all buffered data is written to disk before the other
entity tries to read it. xfs_db uses buffered IO, but does not close
the devices when it exits, thereby leaving changes it has written in
the block device cache rather than on disk. Hence when the kernel
tries to mount the filesystem, it reads what is on disk and does not
see xfs_db's changes.

Fix this by ensuring that xfs_db flushes its changes to disk before
it exits by calling libxfs_device_close(). This fsyncs the data and
flushes the caches to ensure that it is present on disk before
xfs_db exits.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 db/init.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/db/init.c b/db/init.c
index 2a5ef2b..2a31cb8 100644
--- a/db/init.c
+++ b/db/init.c
@@ -170,7 +170,7 @@ main(
 	}
 
 	if (cmdline) {
 		xfree(cmdline);
-		return exitcode;
+		goto close_devices;
 	}
 	while (!done) {
@@ -181,5 +181,13 @@
 		done = command(c, v);
 		doneline(input, v);
 	}
+
+close_devices:
+	if (x.ddev)
+		libxfs_device_close(x.ddev);
+	if (x.logdev && x.logdev != x.ddev)
+		libxfs_device_close(x.logdev);
+	if (x.rtdev)
+		libxfs_device_close(x.rtdev);
 	return exitcode;
 }

^ permalink raw reply related	[flat|nested] 12+ messages in thread
* Re: status of userspace release
  2012-11-02 23:03 ` Dave Chinner
  2012-11-03  0:16 ` Dave Chinner
@ 2012-11-03  1:53 ` Dave Chinner
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03 1:53 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
> On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> > The TODO list for userspace release currently stands at:
> >
> > 1) fix the header checksum failures... which is resolved
> > 2) fix a futex hang in 059
> > 3) fix the golden output changes related to multistream support in xfsdump
> >    and --largefs
>
> Well, understand them first, then fix ;)

Turns out it is a bug in the dump filter that the largefs patchset
introduces. I've fixed this, and the problem goes away. You can take
this off your list ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-11-03  3:14 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-25 15:15 status of userspace release Ben Myers
2012-10-26 21:57 ` Ben Myers
2012-10-28 21:27 ` Dave Chinner
2012-10-29 16:17 ` Ben Myers
2012-11-02  5:51 ` Dave Chinner
2012-11-02 18:59 ` Ben Myers
2012-11-02 23:03 ` Dave Chinner
2012-11-03  0:16 ` Dave Chinner
2012-11-03  1:35 ` Eric Sandeen
2012-11-03  1:55 ` Dave Chinner
2012-11-03  3:16 ` Dave Chinner
2012-11-03  1:53 ` Dave Chinner