* mkfs.xfs with a 9TB realtime volume hangs
From: Jan Wagner @ 2008-11-14 10:41 UTC
To: xfs
Hi,
I have a RAID0 with 11x750GB+1x1TB components in the following
partitionable-md test setup
root@abidal:~# cat /proc/partitions | grep md
254 0 9035047936 md_d0
254 1 124983 md_d0p1
254 2 1828125 md_d0p2
254 3 1953125 md_d0p3
254 4 9031141669 md_d0p4
Essentially, four partitions: 128MB, ~1.9GB, 2GB, 9TB. I'd like to use the
1.9GB partition for xfs and put its realtime subvolume on the 9TB partition
of the same raid0. The partition table is GPT instead of MBR so that
partitions of >=2TB are possible.
When I create xfs with realtime subvolume on the 2GB partition all is
fine:
root@abidal:~# mkfs.xfs -f -d su=1024k,sw=12 -r rtdev=/dev/md_d0p3 /dev/md_d0p2
log stripe unit (1048576 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/md_d0p2           isize=256    agcount=4, agsize=114432 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=457031, imaxpct=25
         =                       sunit=256    swidth=3072 blks
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=0
realtime =/dev/md_d0p3           extsz=4096   blocks=488281, rtextents=488281
When I try the same but place the realtime subvolume on the 9TB partition
the mkfs.xfs hangs indefinitely with 100% CPU:
root@abidal:~# mkfs.xfs -f -d su=1024k,sw=12 -r rtdev=/dev/md_d0p4 /dev/md_d0p2
log stripe unit (1048576 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/md_d0p2           isize=256    agcount=4, agsize=114432 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=457031, imaxpct=25
         =                       sunit=256    swidth=3072 blks
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=0
realtime =/dev/md_d0p4           extsz=4096   blocks=2257785417, rtextents=2257785417
(hangs...)
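For scale, the rtextents figures in the two runs above follow directly from the partition sizes in /proc/partitions (which are in 1KiB units). The sketch below recomputes them and also estimates the size of the realtime bitmap, assuming one bit per realtime extent and that a partial trailing extent is dropped. The helper name rt_geometry is hypothetical, and the bitmap figure is an estimate rather than something mkfs.xfs printed.

```shell
# rt_geometry SIZE_KIB EXTSIZE_BYTES [BLOCKSIZE_BYTES]
# Estimate the realtime extent count and bitmap size for a device.
# Assumption: the bitmap holds one bit per realtime extent.
rt_geometry() {
    size_kib=$1
    extsize=$2
    blocksize=${3:-4096}
    # Number of whole realtime extents that fit on the device.
    rtextents=$(( size_kib * 1024 / extsize ))
    # Bitmap blocks needed, rounded up to a whole filesystem block.
    bits_per_block=$(( blocksize * 8 ))
    bitmap_blocks=$(( (rtextents + bits_per_block - 1) / bits_per_block ))
    echo "rtextents=$rtextents bitmap_blocks=$bitmap_blocks"
}

rt_geometry 1953125 4096      # md_d0p3, the 2GB partition
rt_geometry 9031141669 4096   # md_d0p4, the 9TB partition
```

The rtextents values this prints agree with the ones in the mkfs.xfs output (488281 and 2257785417), so the 9TB case has to track several thousand times as many extents as the 2GB case.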
When I run strace on the first, it completes with
...
pwrite(4, "IABT\0\0\0\0\377\377\377\377\377\377\377\377\0\0\0\0\0"..., 4096, 468725760) = 4096
pwrite(4, "XAGI\0\0\0\1\0\0\0\1\0\1\277\0\0\0\0\0\0\0\0\3\0\0\0\1"..., 512, 468714496) = 512
pread(4, "XFSB\0\0\20\0\0\0\0\0\0\6\371G\0\0\0\0\0\7sY\0\0\0\0\0"..., 512, 0) = 512
pwrite(4, "XFSB\0\0\20\0\0\0\0\0\0\6\371G\0\0\0\0\0\7sY\0\0\0\0\0"..., 512, 0) = 512
fsync(5) = 0
ioctl(5, BLKFLSBUF, 0) = 0
close(5) = 0
fsync(4) = 0
ioctl(4, BLKFLSBUF, 0) = 0
close(4) = 0
exit_group(0) = ?
When I run strace on the latter mkfs.xfs it is reading for hours
pread(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 7802880) = 4096
brk(0x1667000) = 0x1667000
pread(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 7806976) = 4096
pread(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 7811072) = 4096
....
Any ideas?
- Jan
--
****************************************************
Helsinki University of Technology
Dept. of Metsähovi Radio Observatory
http://www.metsahovi.fi/~jwagner/
Work +358-9-428320-36
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: mkfs.xfs with a 9TB realtime volume hangs
From: Eric Sandeen @ 2009-01-10 21:13 UTC
To: Jan Wagner; +Cc: xfs
Jan Wagner wrote:
> Hi,
>
> I have a RAID0 with 11x750GB+1x1TB components in the following
> partitionable-md test setup
>
> root@abidal:~# cat /proc/partitions | grep md
> 254 0 9035047936 md_d0
> 254 1 124983 md_d0p1
> 254 2 1828125 md_d0p2
> 254 3 1953125 md_d0p3
> 254 4 9031141669 md_d0p4
>
> Essentially, four partitions: 128MB, ~1.9GB, 2GB, 9TB. I'd like to use the
> 1.9GB partition for xfs and put a realtime subvolume onto the same raid0
> onto the 9TB partition. The partition tables are GPT instead of MBR to be
> able to have >=2TB partitions.
Sorry for the slow/no reply. It seems to be doing many calculations in
rtinit; I haven't sorted out what yet, but it's likely not hung, it's
working hard. :)
If you give it a larger extsize it should go faster (if the larger
extsize is acceptable for your use...)
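Applied to the setup from the original report, that suggestion might look like the sketch below. The device names and stripe options are copied from Jan's command; the 64k value is only an illustrative choice, and rt_mkfs_with_extsize is a hypothetical helper that prints the command line instead of running it, since mkfs.xfs -f is destructive.

```shell
# Print (do not run) an mkfs.xfs command line that sets the realtime
# extent size via the -r extsize= suboption.
# Devices and -d options are taken from the original report.
rt_mkfs_with_extsize() {
    extsize=$1
    echo "mkfs.xfs -f -d su=1024k,sw=12 -r rtdev=/dev/md_d0p4,extsize=$extsize /dev/md_d0p2"
}

rt_mkfs_with_extsize 64k
```

Whether a larger extent size is acceptable depends on the allocation granularity the workload on the realtime subvolume can tolerate.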
I tried a 4t realtime volume:
mkfs.xfs -dfile,name=fsfile,size=1g -rfile,name=rtfile,size=4t,extsize=$SIZE
for a few different extent sizes, and got
extsize    time
-------    ----
512k       0.3s
256k       0.7s
128k       1.9s
64k        8.4s
32k        25.4s
16k        129.4s
With the default 4k extent size this takes forever (the man page claims
default is 64k, maybe this got broken at some point).
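A loop along the following lines would reproduce the timings above. It is a sketch, not the exact script that was used: the file names fsfile and rtfile and the sizes come from the command shown earlier, while rt_mkfs_cmd, time_rt_mkfs and the DRY_RUN switch are hypothetical names introduced here. It assumes bash, for the `time` keyword.

```shell
# Build the mkfs.xfs command line for one realtime extent size.
rt_mkfs_cmd() {
    printf 'mkfs.xfs -dfile,name=fsfile,size=1g -rfile,name=rtfile,size=4t,extsize=%s\n' "$1"
}

# Run (or, with DRY_RUN=1, only print) the command once per extent size
# given on the command line, timing each run.
time_rt_mkfs() {
    for size in "$@"; do
        cmd=$(rt_mkfs_cmd "$size")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$cmd"
        else
            echo "extsize=$size"
            # Word splitting of $cmd is intentional: it holds a full command line.
            time $cmd >/dev/null
        fi
    done
}
```

Calling `time_rt_mkfs 512k 256k 128k 64k 32k 16k` walks the same extent sizes as the table; setting DRY_RUN=1 first shows the commands without touching any files.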
Somebody will need to find time to look into what's going on.
I think it's doing lots of work in rtinit, something like
#0  xfs_rtfind_back (mp=0x7fffa69e2f30, tp=0x3afc8b0, start=115245056,
    limit=0, rtblock=0x7fffa69e28b8) at xfs_rtalloc.c:83
#1  0x000000000041433c in xfs_rtfree_range (mp=0x7fffa69e2f30,
    tp=0x3afc8b0, start=115245056, len=32768, rbpp=0x7fffa69e2908,
    rsb=0x7fffa69e2910) at xfs_rtalloc.c:448
#2  0x00000000004144e4 in libxfs_rtfree_extent (tp=0x3afc8b0,
    bno=115245056, len=32768) at xfs_rtalloc.c:756
#3  0x0000000000403a6b in parseproto (mp=0x7fffa69e2f30,
    pip=<value optimized out>, fsxp=0x7fffa69e3240, pp=0x7fffa69e2ef0,
    name=<value optimized out>) at proto.c:752
...
-Eric
* Re: mkfs.xfs with a 9TB realtime volume hangs
From: Dave Chinner @ 2009-01-11 10:35 UTC
To: Eric Sandeen; +Cc: xfs, Jan Wagner
On Sat, Jan 10, 2009 at 03:13:23PM -0600, Eric Sandeen wrote:
> Jan Wagner wrote:
> > Hi,
> >
> > I have a RAID0 with 11x750GB+1x1TB components in the following
> > partitionable-md test setup
> >
> > root@abidal:~# cat /proc/partitions | grep md
> > 254 0 9035047936 md_d0
> > 254 1 124983 md_d0p1
> > 254 2 1828125 md_d0p2
> > 254 3 1953125 md_d0p3
> > 254 4 9031141669 md_d0p4
> >
> > Essentially, four partitions: 128MB, ~1.9GB, 2GB, 9TB. I'd like to use the
> > 1.9GB partition for xfs and put a realtime subvolume onto the same raid0
> > onto the 9TB partition. The partition tables are GPT instead of MBR to be
> > able to have >=2TB partitions.
>
> Sorry for the slow/no reply. It seems to be doing many calculations in
> rtinit, haven't sorted out what yet, but it's not likely hung, it's
> working hard. :)
>
> If you give it a larger extsize it should go faster (if the larger
> extsize is acceptable for your use...)
>
> I tried a 4t realtime volume:
>
> mkfs.xfs -dfile,name=fsfile,size=1g -rfile,name=rtfile,size=4t,extsize=$SIZE
>
> for a few different extent sizes, and got
>
> extsize time
> ------- ----
> 512k 0.3s
> 256k 0.7s
> 128k 1.9s
> 64k 8.4s
> 32k 25.4s
> 16k 129.4s
>
> With the default 4k extent size this takes forever (the man page claims
> default is 64k, maybe this got broken at some point).
It got changed a few years back by Nathan, IIRC. I bet the time
being taken is a result of the blow-out in bitmap size caused by reducing
the extent size. Given it is non-linear, it may have something to do
with cache sizes as well, e.g. buftarg hashes not being large enough.
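The non-linearity can be read off Eric's numbers: each halving of extsize doubles the number of realtime extents, so purely linear scaling would give a factor of about 2 per step. A small sketch that computes the per-step factor from the quoted table follows; growth_factors is a hypothetical helper, and the input is just the table with the "s" suffix removed.

```shell
# Read "extsize seconds" pairs on stdin and print how much the time grew
# from each row to the next.
growth_factors() {
    awk 'NR > 1 { printf "%s -> %s: x%.2f\n", prev_size, $1, $2 / prev_time }
         { prev_size = $1; prev_time = $2 }'
}

printf '%s\n' '512k 0.3' '256k 0.7' '128k 1.9' '64k 8.4' '32k 25.4' '16k 129.4' |
    growth_factors
```

Every step comes out above 2x, and the later steps are well above it, which fits the suspicion that something beyond the bitmap size itself is contributing.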
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: mkfs.xfs with a 9TB realtime volume hangs
From: Eric Sandeen @ 2009-01-11 13:46 UTC
To: Eric Sandeen, Jan Wagner, xfs
Dave Chinner wrote:
> On Sat, Jan 10, 2009 at 03:13:23PM -0600, Eric Sandeen wrote:
>> With the default 4k extent size this takes forever (the man page claims
>> default is 64k, maybe this got broken at some point).
>
> It got changed a few years back by Nathan, IIRC.
Yep, I found the commit yesterday, it was for buffered IO's benefit on
the rt subvol, to reduce unwritten extent conversion IIRC.
I was going to follow up w/ the commit etc yesterday, but the mailing
list was down (again, sigh - this is indicative of sgi's staunch
commitment to xfs, I'm sure) so had nothing to reply to.
-Eric
> I bet the time
> being taken is a result of the blow-out in bitmap size caused by reducing
> the extent size. Given it is non-linear, it may have something to do
> with cache sizes as well, e.g. buftarg hashes not being large enough.
>
> Cheers,
>
> Dave.