public inbox for linux-xfs@vger.kernel.org
* status of userspace release
@ 2012-10-25 15:15 Ben Myers
  2012-10-26 21:57 ` Ben Myers
  2012-11-02  5:51 ` Dave Chinner
  0 siblings, 2 replies; 12+ messages in thread
From: Ben Myers @ 2012-10-25 15:15 UTC (permalink / raw)
  To: xfs

Hi Folks,

We're working toward a userspace release this month.  There are several patches
that need to go in first, including backing out the xfsdump format version bump
from Eric, fixes for the makefiles from Mike, and the Polish language update
for xfsdump from Jakub.  If anyone knows of something else we need, now is the
time to flame about it.  I will take a look around for other important patches
too.

This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
give everyone a few working days to do a final test and/or pipe up if we have
missed something important.  Then if all goes well we'll cut the release next
Tuesday.

Regards,
	Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: status of userspace release
  2012-10-25 15:15 status of userspace release Ben Myers
@ 2012-10-26 21:57 ` Ben Myers
  2012-10-28 21:27   ` Dave Chinner
  2012-11-02  5:51 ` Dave Chinner
  1 sibling, 1 reply; 12+ messages in thread
From: Ben Myers @ 2012-10-26 21:57 UTC (permalink / raw)
  To: xfs

Hi,

On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> We're working toward a userspace release this month.  There are several patches
> that need to go in first, including backing out the xfsdump format version bump
> from Eric, fixes for the makefiles from Mike, and the Polish language update
> for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> time to flame about it.  I will take a look around for other important patches
> too.
> 
> This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
> give everyone a few working days to do a final test and/or pipe up if we have
> missed something important.  Then if all goes well we'll cut the release next
> Tuesday.

I've tagged -rc1 for the upcoming releases of dmapi, xfsprogs, and xfsdump.  If
we missed something important now is the time to speak up.

Currently there are two items of which I'm aware:

1) the Polish (and German) translations of xfsdump are not working.  I'll start
a separate thread for that issue.

2) Christoph has pointed out this patch series:

http://oss.sgi.com/archives/xfs/2012-05/msg00323.html
http://oss.sgi.com/archives/xfs/2012-10/msg00541.html

Please take a look.

Thanks,
	Ben


* Re: status of userspace release
  2012-10-26 21:57 ` Ben Myers
@ 2012-10-28 21:27   ` Dave Chinner
  2012-10-29 16:17     ` Ben Myers
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-10-28 21:27 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Fri, Oct 26, 2012 at 04:57:41PM -0500, Ben Myers wrote:
> Hi,
> 
> On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> > We're working toward a userspace release this month.  There are several patches
> > that need to go in first, including backing out the xfsdump format version bump
> > from Eric, fixes for the makefiles from Mike, and the Polish language update
> > for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> > time to flame about it.  I will take a look around for other important patches
> > too.
> > 
> > This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
> > give everyone a few working days to do a final test and/or pipe up if we have
> > missed something important.  Then if all goes well we'll cut the release next
> > Tuesday.
> 
> I've tagged -rc1 for the upcoming releases of dmapi, xfsprogs, and xfsdump.  If
> we missed something important now is the time to speak up.
> 
> Currently there are two items of which I'm aware:
> 
> 1) the Polish (and German) translations of xfsdump are not working.  I'll start
> a separate thread for that issue.
> 
> 2) Christoph has pointed out this patch series:
> 
> http://oss.sgi.com/archives/xfs/2012-05/msg00323.html
> http://oss.sgi.com/archives/xfs/2012-10/msg00541.html

I agree with Christoph - the kernel-side intptr/uintptr stuff that
Jan did needs to go into the kernel first, and then be brought to
libxfs via a kernel code sync.  A kernel code resync needs to be
done immediately after the xfsprogs release so that userspace is
using the same base code for the CRC work, so splitting them to
before/after the release makes sense IMO....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: status of userspace release
  2012-10-28 21:27   ` Dave Chinner
@ 2012-10-29 16:17     ` Ben Myers
  0 siblings, 0 replies; 12+ messages in thread
From: Ben Myers @ 2012-10-29 16:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Jan Engelhardt, Christoph Hellwig, xfs

Hi Dave, 

On Mon, Oct 29, 2012 at 08:27:08AM +1100, Dave Chinner wrote:
> On Fri, Oct 26, 2012 at 04:57:41PM -0500, Ben Myers wrote:
> > Hi,
> > 
> > On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> > > We're working toward a userspace release this month.  There are several patches
> > > that need to go in first, including backing out the xfsdump format version bump
> > > from Eric, fixes for the makefiles from Mike, and the Polish language update
> > > for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> > > time to flame about it.  I will take a look around for other important patches
> > > too.
> > > 
> > > This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
> > > give everyone a few working days to do a final test and/or pipe up if we have
> > > missed something important.  Then if all goes well we'll cut the release next
> > > Tuesday.
> > 
> > I've tagged -rc1 for the upcoming releases of dmapi, xfsprogs, and xfsdump.  If
> > we missed something important now is the time to speak up.
> > 
> > Currently there are two items of which I'm aware:
> > 
> > 1) the Polish (and German) translations of xfsdump are not working.  I'll start
> > a separate thread for that issue.
> > 
> > 2) Christoph has pointed out this patch series:
> > 
> > http://oss.sgi.com/archives/xfs/2012-05/msg00323.html
> > http://oss.sgi.com/archives/xfs/2012-10/msg00541.html
> 
> I agree with Christoph - the kernel-side intptr/uintptr stuff that
> Jan did needs to go into the kernel first, and then be brought to
> libxfs via a kernel code sync.  A kernel code resync needs to be
> done immediately after the xfsprogs release so that userspace is
> using the same base code for the CRC work, so splitting them to
> before/after the release makes sense IMO....

Sounds good.  I'll pull in 2, 3, and 6 for now.

-Ben


* Re: status of userspace release
  2012-10-25 15:15 status of userspace release Ben Myers
  2012-10-26 21:57 ` Ben Myers
@ 2012-11-02  5:51 ` Dave Chinner
  2012-11-02 18:59   ` Ben Myers
  1 sibling, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-11-02  5:51 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> Hi Folks,
> 
> We're working toward a userspace release this month.  There are several patches
> that need to go in first, including backing out the xfsdump format version bump
> from Eric, fixes for the makefiles from Mike, and the Polish language update
> for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> time to flame about it.  I will take a look around for other important patches
> too.
> 
> This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
> give everyone a few working days to do a final test and/or pipe up if we have
> missed something important.  Then if all goes well we'll cut the release next
> Tuesday.

I think that dump/restore need more work/testing. I've just been
running with whatever xfsdump I have had installed on my test
machines for some time. I think it was 3.0.6 - whatever is in the
current Debian unstable repository - or some version of 3.1.0 that I
built a while back.

I've already pointed Eric to the header checksum failures (forkoff
patch being needed), and that fixes the failures I've been seeing on
normal xfstests runs.

Running some large filesystem testing, however, I see more problems.
I'm using a 17TB filesystem and the --largefs patch series. This
results in a futex hang in 059 like so:

[ 4770.007858] xfsrestore      S ffff88021fc52d40  5504  3926   3487 0x00000000
[ 4770.007858]  ffff880212ea9c68 0000000000000082 ffff880207830140 ffff880212ea9fd8
[ 4770.007858]  ffff880212ea9fd8 ffff880212ea9fd8 ffff880216cec2c0 ffff880207830140
[ 4770.007858]  ffff880212ea9d08 ffff880212ea9d58 ffff880207830140 0000000000000000
[ 4770.007858] Call Trace:
[ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
[ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
[ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
[ 4770.007858]  [<ffffffff8113acf7>] ? __free_pages+0x47/0x70
[ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
[ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
[ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
[ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
[ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
[ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
[ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
[ 4770.007858] xfsrestore      S ffff88021fc52d40  5656  3927   3487 0x00000000
[ 4770.007858]  ffff880208f29c68 0000000000000082 ffff880208f84180 ffff880208f29fd8
[ 4770.007858]  ffff880208f29fd8 ffff880208f29fd8 ffff880216cec2c0 ffff880208f84180
[ 4770.007858]  ffff880208f29d08 ffff880208f29d58 ffff880208f84180 0000000000000000
[ 4770.007858] Call Trace:
[ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
[ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
[ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
[ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
[ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
[ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
[ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
[ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
[ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
[ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
[ 4770.007858] xfsrestore      S ffff88021fc92d40  5848  3928   3487 0x00000000
[ 4770.007858]  ffff880212d0dc68 0000000000000082 ffff880208e76240 ffff880212d0dfd8
[ 4770.007858]  ffff880212d0dfd8 ffff880212d0dfd8 ffff880216cf2300 ffff880208e76240
[ 4770.007858]  ffff880212d0dd08 ffff880212d0dd58 ffff880208e76240 0000000000000000
[ 4770.007858] Call Trace:
[ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
[ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
[ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
[ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
[ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
[ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
[ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
[ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
[ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
[ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b

I can't reliably reproduce it at this point, but there does appear
to be some kind of locking problem in the multistream support.

Speaking of which, most large filesystem dump/restore tests are
failing because of this output:

026 20s ... - output mismatch (see 026.out.bad)
--- 026.out     2012-10-05 11:37:51.000000000 +1000
+++ 026.out.bad 2012-11-02 16:20:17.000000000 +1100
@@ -20,6 +20,7 @@
 xfsdump: media file size NUM bytes
 xfsdump: dump size (non-dir files) : NUM bytes
 xfsdump: dump complete: SECS seconds elapsed
+xfsdump:   stream 0 DUMP_FILE OK (success)
 xfsdump: Dump Status: SUCCESS
 Restoring from file...
 xfsrestore  -f DUMP_FILE  -L stress_026 RESTORE_DIR
@@ -32,6 +33,7 @@
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed
+xfsrestore:   stream 0 DUMP_FILE OK (success)
 xfsrestore: Restore Status: SUCCESS
 Comparing dump directory with restore directory
 Files DUMP_DIR/big and RESTORE_DIR/DUMP_SUBDIR/big are identical

Which looks like output from the multistream code. Why it is
emitting this for large filesystem testing and not for small
filesystems, I'm not sure yet. 

In fact, with --largefs, I see this for the dump group:

Failures: 026 028 046 047 056 059 060 061 063 064 065 066 266 281
282 283
Failed 16 of 19 tests

And this for the normal sized (10GB) scratch device:

Passed all 18 tests

So there's something funky going on here....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: status of userspace release
  2012-11-02  5:51 ` Dave Chinner
@ 2012-11-02 18:59   ` Ben Myers
  2012-11-02 23:03     ` Dave Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Myers @ 2012-11-02 18:59 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Hi Dave,

On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
> On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> > Hi Folks,
> > 
> > We're working toward a userspace release this month.  There are several patches
> > that need to go in first, including backing out the xfsdump format version bump
> > from Eric, fixes for the makefiles from Mike, and the Polish language update
> > for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> > time to flame about it.  I will take a look around for other important patches
> > too.
> > 
> > This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
> > give everyone a few working days to do a final test and/or pipe up if we have
> > missed something important.  Then if all goes well we'll cut the release next
> > Tuesday.
> 
> I think that dump/restore need more work/testing.

Sounds good.  AFAIK there is no blazing hurry to release immediately.

> I've already pointed Eric to the header checksum failures (forkoff
> patch being needed), and that fixes the failures I've been seeing on
> normal xfstests runs.

I've pulled that patch in.  Interesting that it doesn't reproduce on i586 but
is so reliable on x86_64.  It's a good excuse to do some testing on a wider set
of arches before the release.

> Running some large filesystem testing, however, I see more problems.
> I'm using a 17TB filesystem and the --largefs patch series. This
> results in a futex hang in 059 like so:
> 
> [ 4770.007858] xfsrestore      S ffff88021fc52d40  5504  3926   3487 0x00000000
> [ 4770.007858]  ffff880212ea9c68 0000000000000082 ffff880207830140 ffff880212ea9fd8
> [ 4770.007858]  ffff880212ea9fd8 ffff880212ea9fd8 ffff880216cec2c0 ffff880207830140
> [ 4770.007858]  ffff880212ea9d08 ffff880212ea9d58 ffff880207830140 0000000000000000
> [ 4770.007858] Call Trace:
> [ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
> [ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
> [ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
> [ 4770.007858]  [<ffffffff8113acf7>] ? __free_pages+0x47/0x70
> [ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
> [ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
> [ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
> [ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
> [ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
> [ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
> [ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
> [ 4770.007858] xfsrestore      S ffff88021fc52d40  5656  3927   3487 0x00000000
> [ 4770.007858]  ffff880208f29c68 0000000000000082 ffff880208f84180 ffff880208f29fd8
> [ 4770.007858]  ffff880208f29fd8 ffff880208f29fd8 ffff880216cec2c0 ffff880208f84180
> [ 4770.007858]  ffff880208f29d08 ffff880208f29d58 ffff880208f84180 0000000000000000
> [ 4770.007858] Call Trace:
> [ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
> [ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
> [ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
> [ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
> [ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
> [ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
> [ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
> [ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
> [ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
> [ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
> [ 4770.007858] xfsrestore      S ffff88021fc92d40  5848  3928   3487 0x00000000
> [ 4770.007858]  ffff880212d0dc68 0000000000000082 ffff880208e76240 ffff880212d0dfd8
> [ 4770.007858]  ffff880212d0dfd8 ffff880212d0dfd8 ffff880216cf2300 ffff880208e76240
> [ 4770.007858]  ffff880212d0dd08 ffff880212d0dd58 ffff880208e76240 0000000000000000
> [ 4770.007858] Call Trace:
> [ 4770.007858]  [<ffffffff81b8a009>] schedule+0x29/0x70
> [ 4770.007858]  [<ffffffff810db089>] futex_wait_queue_me+0xc9/0x100
> [ 4770.007858]  [<ffffffff810db809>] futex_wait+0x189/0x290
> [ 4770.007858]  [<ffffffff810dd41c>] do_futex+0x11c/0xa80
> [ 4770.007858]  [<ffffffff810abbd5>] ? hrtimer_try_to_cancel+0x55/0x110
> [ 4770.007858]  [<ffffffff810abcb2>] ? hrtimer_cancel+0x22/0x30
> [ 4770.007858]  [<ffffffff81b88f44>] ? do_nanosleep+0xa4/0xd0
> [ 4770.007858]  [<ffffffff810dde0d>] sys_futex+0x8d/0x1b0
> [ 4770.007858]  [<ffffffff810ab6e0>] ? update_rmtp+0x80/0x80
> [ 4770.007858]  [<ffffffff81b93a99>] system_call_fastpath+0x16/0x1b
> 
> I can't reliably reproduce it at this point, but there does appear
> to be some kind of locking problem in the multistream support.

One of my machines hit this overnight without --largefs.  I wasn't able to get
a dump though.  Just another data point.

> Speaking of which, most large filesystem dump/restore tests are
> failing because of this output:
> 
> 026 20s ... - output mismatch (see 026.out.bad)
> --- 026.out     2012-10-05 11:37:51.000000000 +1000
> +++ 026.out.bad 2012-11-02 16:20:17.000000000 +1100
> @@ -20,6 +20,7 @@
>  xfsdump: media file size NUM bytes
>  xfsdump: dump size (non-dir files) : NUM bytes
>  xfsdump: dump complete: SECS seconds elapsed
> +xfsdump:   stream 0 DUMP_FILE OK (success)
>  xfsdump: Dump Status: SUCCESS
>  Restoring from file...
>  xfsrestore  -f DUMP_FILE  -L stress_026 RESTORE_DIR
> @@ -32,6 +33,7 @@
>  xfsrestore: directory post-processing
>  xfsrestore: restoring non-directory files
>  xfsrestore: restore complete: SECS seconds elapsed
> +xfsrestore:   stream 0 DUMP_FILE OK (success)
>  xfsrestore: Restore Status: SUCCESS
>  Comparing dump directory with restore directory
>  Files DUMP_DIR/big and RESTORE_DIR/DUMP_SUBDIR/big are identical
> 
> Which looks like output from the multistream code. Why it is
> emitting this for large filesystem testing and not for small
> filesystems, I'm not sure yet. 
> 
> In fact, with --largefs, I see this for the dump group:
> 
> Failures: 026 028 046 047 056 059 060 061 063 064 065 066 266 281
> 282 283
> Failed 16 of 19 tests
> 
> And this for the normal sized (10GB) scratch device:
> 
> Passed all 18 tests
> 
> So there's something funky going on here....

Rich also reported some golden output related changes with --largefs awhile
back.  I don't think he saw this one though.

The TODO list for userspace release currently stands at:

1) fix the header checksum failures... which is resolved
2) fix a futex hang in 059
3) fix the golden output changes related to multistream support in xfsdump
   and --largefs
4) test on more platforms

Regards,
	Ben


* Re: status of userspace release
  2012-11-02 18:59   ` Ben Myers
@ 2012-11-02 23:03     ` Dave Chinner
  2012-11-03  0:16       ` Dave Chinner
  2012-11-03  1:53       ` Dave Chinner
  0 siblings, 2 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-02 23:03 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> Hi Dave,
> 
> On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
> > On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> > > Hi Folks,
> > > 
> > > We're working toward a userspace release this month.  There are several patches
> > > that need to go in first, including backing out the xfsdump format version bump
> > > from Eric, fixes for the makefiles from Mike, and the Polish language update
> > > for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> > > time to flame about it.  I will take a look around for other important patches
> > > too.
> > > 
> > > This time I'm going to tag an -rc1 (probably later today or tomorrow).  We'll
> > > give everyone a few working days to do a final test and/or pipe up if we have
> > > missed something important.  Then if all goes well we'll cut the release next
> > > Tuesday.
> > 
> > I think that dump/restore need more work/testing.
> 
> Sounds good.  AFAIK there is no blazing hurry to release immediately.

Agreed. better to get it right ;)

> > Running some large filesystem testing, however, I see more problems.
> > I'm using a 17TB filesystem and the --largefs patch series. This
> > results in a futex hang in 059 like so:
....
> > I can't reliably reproduce it at this point, but there does appear
> > to be some kind of locking problem in the multistream support.
> 
> One of my machines hit this overnight without --largefs.  I wasn't able to get
> a dump though.  Just another data point.

Ok, that's good to know - so it isn't directly related to the largefs
testing I'm doing.

> > Speaking of which, most large filesystem dump/restore tests are
> > failing because of this output:
> > 
> > 026 20s ... - output mismatch (see 026.out.bad)
> > --- 026.out     2012-10-05 11:37:51.000000000 +1000
> > +++ 026.out.bad 2012-11-02 16:20:17.000000000 +1100
> > @@ -20,6 +20,7 @@
> >  xfsdump: media file size NUM bytes
> >  xfsdump: dump size (non-dir files) : NUM bytes
> >  xfsdump: dump complete: SECS seconds elapsed
> > +xfsdump:   stream 0 DUMP_FILE OK (success)
> >  xfsdump: Dump Status: SUCCESS
> >  Restoring from file...
> >  xfsrestore  -f DUMP_FILE  -L stress_026 RESTORE_DIR
> > @@ -32,6 +33,7 @@
> >  xfsrestore: directory post-processing
> >  xfsrestore: restoring non-directory files
> >  xfsrestore: restore complete: SECS seconds elapsed
> > +xfsrestore:   stream 0 DUMP_FILE OK (success)
> >  xfsrestore: Restore Status: SUCCESS
> >  Comparing dump directory with restore directory
> >  Files DUMP_DIR/big and RESTORE_DIR/DUMP_SUBDIR/big are identical
> > 
> > Which looks like output from the multistream code. Why it is
> > emitting this for large filesystem testing and not for small
> > filesystems, I'm not sure yet. 
....
> Rich also reported some golden output related changes with --largefs awhile
> back.  I don't think he saw this one though.

No, this one is new, caused by upgrading xfsdump. As it turns out,
the previous version of xfsdump on this particular VM was from
before the multistream dump was implemented - it was a distro
package rather than one I'd custom built.

And, as it is, I just removed the --large-fs config (so my scratch
device is just an empty 17TB device) and I still get this extra
output. So it's not related to the --large-fs behaviour at all.

> The TODO list for userspace release currently stands at:
> 
> 1) fix the header checksum failures... which is resolved
> 2) fix a futex hang in 059
> 3) fix the golden output changes related to multistream support in xfsdump
>    and --largefs

Well, understand them first, then fix ;)

> 4) test on more platforms

I suspect that the futex hang is only going to be solvable if it can
be reliably reproduced. I haven't seen it again since the hang I
reported. Otherwise, sounds good.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: status of userspace release
  2012-11-02 23:03     ` Dave Chinner
@ 2012-11-03  0:16       ` Dave Chinner
  2012-11-03  1:35         ` Eric Sandeen
  2012-11-03  3:16         ` Dave Chinner
  2012-11-03  1:53       ` Dave Chinner
  1 sibling, 2 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03  0:16 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
> On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> > Hi Dave,
> > 
> > On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
> > > On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> > > > Hi Folks,
> > > > 
> > > > We're working toward a userspace release this month.  There are several patches
> > > > that need to go in first, including backing out the xfsdump format version bump
> > > > from Eric, fixes for the makefiles from Mike, and the Polish language update
> > > > for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> > > > time to flame about it.  I will take a look around for other important patches
> > > > too.
....
> > The TODO list for userspace release currently stands at:
> > 
> > 1) fix the header checksum failures... which is resolved
> > 2) fix a futex hang in 059
> > 3) fix the golden output changes related to multistream support in xfsdump
> >    and --largefs
> 
> Well, understand them first, then fix ;)
> 
> > 4) test on more platforms

Another:

$ sudo xfs_info /mnt/scratch/
meta-data=/dev/vdc               isize=256    agcount=4, agsize=12800 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=51200, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=1200, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
$ sudo xfs_db -r -c "sb 0" -c "version" /dev/vdc
versionnum [0xb4a4+0x8a] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT
$

xfs_info is not reporting the 32 bit project ID status.
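
For reference, the feature list that xfs_db prints can be reproduced by
hand from the two hex words. The following is only a sketch: the bit
values are written from memory of the XFS superblock definitions and
should be checked against the headers, and decode_version is a made-up
helper name, not anything that exists in xfsprogs.

```python
# Sketch only: the bit values below are assumed from the XFS superblock
# definitions (xfs_sb.h). Verify them against the headers before use.
VERSION_BITS = {
    0x0010: "ATTR",
    0x0020: "NLINK",
    0x0040: "QUOTA",
    0x0080: "ALIGN",
    0x0100: "DALIGN",
    0x0200: "SHARED",
    0x0400: "LOGV2",
    0x0800: "SECTOR",
    0x1000: "EXTFLG",
    0x2000: "DIRV2",
    0x4000: "BORG",
    0x8000: "MOREBITS",
}

FEATURES2_BITS = {
    0x0002: "LAZYSBCOUNT",
    0x0008: "ATTR2",
    0x0010: "PARENT",
    0x0080: "PROJID32BIT",
}


def decode_version(versionnum, features2):
    """Hypothetical helper: turn the two superblock words into flag names."""
    # The low nibble of versionnum is the on-disk format version number.
    names = ["V%d" % (versionnum & 0x000F)]
    names += [name for bit, name in VERSION_BITS.items() if versionnum & bit]
    # features2 is only meaningful when MOREBITS is set in versionnum.
    if versionnum & 0x8000:
        names += [name for bit, name in FEATURES2_BITS.items() if features2 & bit]
    return names


print(",".join(decode_version(0xB4A4, 0x8A)))
```

The flags come out in bit order here, which need not match the order
xfs_db uses. The point is that PROJID32BIT lives in the second word, so
a tool that only looks at the first word will not see it.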

Yes, I know this requires the XFS_IOC_FSGEOM support for it, but I'd
like this release to at least say "off or unknown" here.

I say this, because this is the first thing I noticed when having a
look at a test 287 failure:

287 10s ... - output mismatch (see 287.out.bad)
--- 287.out     2012-10-05 11:38:08.000000000 +1000
+++ 287.out.bad 2012-11-03 10:55:15.000000000 +1100
@@ -2,22 +2,24 @@
 No 32bit project quotas:
 projid = 1234
 projid = 0
+xfs_quota: cannot set project on /mnt/scratch/pquota/32bit: Invalid argument
 With 32bit project quota support:
 projid = 1234
-projid = 2123456789
+projid = 0
+xfs_quota: cannot set project on /mnt/scratch/restore/pquota/32bitv2: Invalid argument
 The restored file system + one additional file:
 projid = 1234
-projid = 2123456789
-projid = 2123456789
+projid = 0
+projid = 0
 These two values of 16bit project quota ids shall be the same
-core.projid_lo = 1234
+core.projid_lo = 0
 core.projid_hi = 0
 core.projid_lo = 1234
 core.projid_hi = 0
 These three values of 32bit project quota ids shall be the same
-core.projid_lo = 24853
-core.projid_hi = 32401
-core.projid_lo = 24853
-core.projid_hi = 32401
-core.projid_lo = 24853
-core.projid_hi = 32401
+core.projid_lo = 0
+core.projid_hi = 0
+core.projid_lo = 0
+core.projid_hi = 0
+core.projid_lo = 0
+core.projid_hi = 0

Here's what's curious - this is failing on the 17TB filesystem, but
is not failing on 10-20GB filesystems. There seems to be a pattern
here....
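
A side note on the numbers in the golden output: the projid_lo and
projid_hi pairs look like the low and high 16 bits of the 32-bit
project ID, which is consistent with 2123456789 showing up as
24853/32401. A small sketch of that arithmetic, with hypothetical
helper names that do not come from xfsprogs or xfstests:

```python
def split_projid(projid):
    """Split a 32-bit project ID into (lo, hi) 16-bit halves."""
    if not 0 <= projid <= 0xFFFFFFFF:
        raise ValueError("project ID does not fit in 32 bits")
    return projid & 0xFFFF, projid >> 16


def join_projid(lo, hi):
    """Rebuild the 32-bit project ID from its 16-bit halves."""
    return (hi << 16) | lo


# The 16-bit ID from the test fits entirely in the low half.
print(split_projid(1234))
# The 32-bit ID from the test needs both halves.
print(split_projid(2123456789))
```

Seen this way, the failing output (both halves reading 0) says the ID
was never set at all, which matches the "Invalid argument" errors from
xfs_quota rather than a truncation to 16 bits.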

Note that I only recently updated xfstests on the VM with the 17TB
filesystem (i.e. on Wednesday), so this is probably the first time I
have run test 287 on a large filesystem like this. The same goes for
many of the other problems I'm reporting - xfstests on this machine
has been running out of a dev branch I hadn't updated for a while, so
these problems might have been around for a while on large
filesystems...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: status of userspace release
  2012-11-03  0:16       ` Dave Chinner
@ 2012-11-03  1:35         ` Eric Sandeen
  2012-11-03  1:55           ` Dave Chinner
  2012-11-03  3:16         ` Dave Chinner
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Sandeen @ 2012-11-03  1:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Ben Myers, xfs

On 11/2/12 7:16 PM, Dave Chinner wrote:
> On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
>> On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
>>> Hi Dave,
>>>
>>> On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
>>>> On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
>>>>> Hi Folks,
>>>>>
>>>>> We're working toward a userspace release this month.  There are several patches
>>>>> that need to go in first, including backing out the xfsdump format version bump
>>>>> from Eric, fixes for the makefiles from Mike, and the Polish language update
>>>>> for xfsdump from Jakub.  If anyone knows of something else we need, now is the
>>>>> time to flame about it.  I will take a look around for other important patches
>>>>> too.
> ....
>>> The TODO list for userspace release currently stands at:
>>>
>>> 1) fix the header checksum failures... which is resolved
>>> 2) fix a futex hang in 059
>>> 3) fix the golden output changes related to multistream support in xfsdump
>>>    and --largefs
>>
>> Well, understand them first, then fix ;)
>>
>>> 4) test on more platforms
> 
> Another:
> 
> $ sudo xfs_info /mnt/scratch/
> meta-data=/dev/vdc               isize=256    agcount=4, agsize=12800 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=51200, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=1200, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> $ sudo xfs_db -r -c "sb 0" -c "version" /dev/vdc
> versionnum [0xb4a4+0x8a] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT
> $
> 
> xfs_info is not reporting the 32 bit project ID status.

Weird, I didn't realize that
[PATCH 2/2] xfsprogs: report projid32 status in growfs output
hadn't been pulled in.

> Yes, I know this requires the XFS_IOC_FSGEOM support for it, but I'd
> like it this release to at least say "off or unknown" here.

Heh, ok, when you reviewed you said it was no big deal ;)  but I guess
we can add the "or unknown" if you like.

> I say this, because this is the first thing I noticed when having a
> look at a test 287 failure:

Hm that's pretty odd.

-Eric

> 7 10s ... - output mismatch (see 287.out.bad)
> --- 287.out     2012-10-05 11:38:08.000000000 +1000
> +++ 287.out.bad 2012-11-03 10:55:15.000000000 +1100
> @@ -2,22 +2,24 @@
>  No 32bit project quotas:
>  projid = 1234
>  projid = 0
> +xfs_quota: cannot set project on /mnt/scratch/pquota/32bit: Invalid argument
>  With 32bit project quota support:
>  projid = 1234
> -projid = 2123456789
> +projid = 0
> +xfs_quota: cannot set project on /mnt/scratch/restore/pquota/32bitv2: Invalid argument
>  The restored file system + one additional file:
>  projid = 1234
> -projid = 2123456789
> -projid = 2123456789
> +projid = 0
> +projid = 0
>  These two values of 16bit project quota ids shall be the same
> -core.projid_lo = 1234
> +core.projid_lo = 0
>  core.projid_hi = 0
>  core.projid_lo = 1234
>  core.projid_hi = 0
>  These three values of 32bit project quota ids shall be the same
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> +core.projid_lo = 0
> +core.projid_hi = 0
> +core.projid_lo = 0
> +core.projid_hi = 0
> +core.projid_lo = 0
> +core.projid_hi = 0
> 
> Here's what's curious - this is failing on the 17TB filesystem, but
> is not failing on 10-20GB filesystems. There seems to be a pattern
> here....
> 
> Note that I only recently updated xfstests on the VM with the 17TB
> filesystem (i.e. on wednesday), so this is probably the first time I
> have run test 287 on a large filesystem like this. Same goes for
> much of the other problems I'm reporting - xfstests on this machine
> has been running out of dev branch I hadn't updated for a while, so
> these problems might have been around for a while on large
> filesystems...
> 
> Cheers,
> 
> Dave.
> 


* Re: status of userspace release
  2012-11-02 23:03     ` Dave Chinner
  2012-11-03  0:16       ` Dave Chinner
@ 2012-11-03  1:53       ` Dave Chinner
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03  1:53 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
> On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> > The TODO list for userspace release currently stands at:
> > 
> > 1) fix the header checksum failures... which is resolved
> > 2) fix a futex hang in 059
> > 3) fix the golden output changes related to multistream support in xfsdump
> >    and --largefs
> 
> Well, understand them first, then fix ;)

It turns out to be a bug in the dump filter that the largefs patchset
introduces. I've fixed this, and the problem goes away. You can take
this off your list ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: status of userspace release
  2012-11-03  1:35         ` Eric Sandeen
@ 2012-11-03  1:55           ` Dave Chinner
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03  1:55 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Ben Myers, xfs

On Fri, Nov 02, 2012 at 08:35:32PM -0500, Eric Sandeen wrote:
> On 11/2/12 7:16 PM, Dave Chinner wrote:
> > On Sat, Nov 03, 2012 at 10:03:34AM +1100, Dave Chinner wrote:
> >> On Fri, Nov 02, 2012 at 01:59:23PM -0500, Ben Myers wrote:
> >>> Hi Dave,
> >>>
> >>> On Fri, Nov 02, 2012 at 04:51:02PM +1100, Dave Chinner wrote:
> >>>> On Thu, Oct 25, 2012 at 10:15:01AM -0500, Ben Myers wrote:
> >>>>> Hi Folks,
> >>>>>
> >>>>> We're working toward a userspace release this month.  There are several patches
> >>>>> that need to go in first, including backing out the xfsdump format version bump
> >>>>> from Eric, fixes for the makefiles from Mike, and the Polish language update
> >>>>> for xfsdump from Jakub.  If anyone knows of something else we need, now is the
> >>>>> time to flame about it.  I will take a look around for other important patches
> >>>>> too.
> > ....
> >>> The TODO list for userspace release currently stands at:
> >>>
> >>> 1) fix the header checksum failures... which is resolved
> >>> 2) fix a futex hang in 059
> >>> 3) fix the golden output changes related to multistream support in xfsdump
> >>>    and --largefs
> >>
> >> Well, understand them first, then fix ;)
> >>
> >>> 4) test on more platforms
> > 
> > Another:
> > 
> > $ sudo xfs_info /mnt/scratch/
> > meta-data=/dev/vdc               isize=256    agcount=4, agsize=12800 blks
> >          =                       sectsz=512   attr=2
> > data     =                       bsize=4096   blocks=51200, imaxpct=25
> >          =                       sunit=0      swidth=0 blks
> > naming   =version 2              bsize=4096   ascii-ci=0
> > log      =internal               bsize=4096   blocks=1200, version=2
> >          =                       sectsz=512   sunit=0 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > $ sudo xfs_db -r -c "sb 0" -c "version" /dev/vdc
> > versionnum [0xb4a4+0x8a] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT
> > $
> > 
> > xfs_info is not reporting the 32 bit project ID status.
> 
> Weird, I didn't realize that
> [PATCH 2/2] xfsprogs: report projid32 status in growfs output
> hadn't been pulled in.
> 
> > Yes, I know this requires the XFS_IOC_FSGEOM support for it, but I'd
> > like it this release to at least say "off or unknown" here.
> 
> Heh, ok, when you reviewed you said it was no big deal ;)  but I guess
> we can add the "or unknown" if you like.

It probably doesn't matter that much, because we'll know if it is
supported by the kernel the user is running. Having it there is the
most important thing.

> > I say this, because this is the first thing I noticed when having a
> > look at a test 287 failure:
> 
> Hm that's pretty odd.

Yeah, still need to get to the bottom of it.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: status of userspace release
  2012-11-03  0:16       ` Dave Chinner
  2012-11-03  1:35         ` Eric Sandeen
@ 2012-11-03  3:16         ` Dave Chinner
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2012-11-03  3:16 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On Sat, Nov 03, 2012 at 11:16:39AM +1100, Dave Chinner wrote:
> I say this, because this is the first thing I noticed when having a
> look at a test 287 failure:
> 
> 7 10s ... - output mismatch (see 287.out.bad)
> --- 287.out     2012-10-05 11:38:08.000000000 +1000
> +++ 287.out.bad 2012-11-03 10:55:15.000000000 +1100
> @@ -2,22 +2,24 @@
>  No 32bit project quotas:
>  projid = 1234
>  projid = 0
> +xfs_quota: cannot set project on /mnt/scratch/pquota/32bit: Invalid argument
>  With 32bit project quota support:
>  projid = 1234
> -projid = 2123456789
> +projid = 0
> +xfs_quota: cannot set project on /mnt/scratch/restore/pquota/32bitv2: Invalid argument
>  The restored file system + one additional file:
>  projid = 1234
> -projid = 2123456789
> -projid = 2123456789
> +projid = 0
> +projid = 0
>  These two values of 16bit project quota ids shall be the same
> -core.projid_lo = 1234
> +core.projid_lo = 0
>  core.projid_hi = 0
>  core.projid_lo = 1234
>  core.projid_hi = 0
>  These three values of 32bit project quota ids shall be the same
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> -core.projid_lo = 24853
> -core.projid_hi = 32401
> +core.projid_lo = 0
> +core.projid_hi = 0
> +core.projid_lo = 0
> +core.projid_hi = 0
> +core.projid_lo = 0
> +core.projid_hi = 0
> 
> Here's what's curious - this is failing on the 17TB filesystem, but
> is not failing on 10-20GB filesystems. There seems to be a pattern
> here....

This is caused by a longstanding bug in xfs_db. The fix below should
be included in the release, I think...

Cheers,

Dave
-- 
Dave Chinner
david@fromorbit.com

xfs_db: flush devices before exiting

From: Dave Chinner <dchinner@redhat.com>

Test 287 uses xfs_db to change 32-bit project ID support while the
filesystem is unmounted. On a large filesystem the test was failing
due to the mount not seeing the feature bit in the superblock.

xfs_db uses a different address space to the filesystem when it is
mounted by the kernel, so the only way to keep them coherent is to
ensure that all buffered data is written to disk before the other
entity tries to read it. xfs_db uses buffered IO, but does not close
the devices when it exits, thereby leaving changes it has written in
the block device cache rather than on disk. Hence when the kernel
tries to mount the filesystem, it reads what is on disk and does not
see xfs_db's changes.

Fix this by ensuring that xfs_db flushes its changes to disk before
it exits by calling libxfs_device_close(). This fsyncs the data and
flushes the caches to ensure that it is present on disk before
xfs_db exits.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 db/init.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/db/init.c b/db/init.c
index 2a5ef2b..2a31cb8 100644
--- a/db/init.c
+++ b/db/init.c
@@ -170,7 +170,7 @@ main(
 	}
 	if (cmdline) {
 		xfree(cmdline);
-		return exitcode;
+		goto close_devices;
 	}
 
 	while (!done) {
@@ -181,5 +181,13 @@ main(
 			done = command(c, v);
 		doneline(input, v);
 	}
+
+close_devices:
+	if (x.ddev)
+		libxfs_device_close(x.ddev);
+	if (x.logdev && x.logdev != x.ddev)
+		libxfs_device_close(x.logdev);
+	if (x.rtdev)
+		libxfs_device_close(x.rtdev);
 	return exitcode;
 }

