* higher agcount on LVM2 thinp volumes
@ 2013-08-29 6:09 Chris Murphy
2013-08-30 1:44 ` Stan Hoeppner
2013-08-30 3:04 ` Eric Sandeen
0 siblings, 2 replies; 15+ messages in thread
From: Chris Murphy @ 2013-08-29 6:09 UTC (permalink / raw)
To: xfs
Is it expected that, when formatting with defaults, a thinp volume gets a higher agcount than a conventional LV or partition of the same size?
HDD, GPT partitioned, 100GB partition size:
[root@f19s ~]# mkfs.xfs /dev/sda7
meta-data=/dev/sda7 isize=256 agcount=4, agsize=6553600 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
A 400GB partition is made into a PV, the PV added to a VG, and all extents put into a thin pool; a 100GB virtual-sized LV is created from the pool:
[root@f19s ~]# mkfs.xfs /dev/mapper/vg1-data
meta-data=/dev/mapper/vg1-data isize=256 agcount=16, agsize=1638400 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
I get agcount=4 on a conventional LV as well. Why agcount=16 on thinp?
Chris Murphy
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: higher agcount on LVM2 thinp volumes
2013-08-29 6:09 higher agcount on LVM2 thinp volumes Chris Murphy
@ 2013-08-30 1:44 ` Stan Hoeppner
2013-08-30 2:08 ` Chris Murphy
2013-08-30 3:04 ` Eric Sandeen
1 sibling, 1 reply; 15+ messages in thread
From: Stan Hoeppner @ 2013-08-30 1:44 UTC (permalink / raw)
To: Chris Murphy; +Cc: xfs
On 8/29/2013 1:09 AM, Chris Murphy wrote:
>
> Is it expected that, when formatting with defaults, a thinp volume gets a higher agcount than a conventional LV or partition of the same size?
>
>
> HDD, GPT partitioned, 100GB partition size:
>
> [root@f19s ~]# mkfs.xfs /dev/sda7
> meta-data=/dev/sda7 isize=256 agcount=4, agsize=6553600 blks
> = sectsz=512 attr=2, projid32bit=0
> data = bsize=4096 blocks=26214400, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
>
> A 400GB partition is made into a PV, the PV added to a VG, and all extents put into a thin pool; a 100GB virtual-sized LV is created from the pool:
>
> [root@f19s ~]# mkfs.xfs /dev/mapper/vg1-data
> meta-data=/dev/mapper/vg1-data isize=256 agcount=16, agsize=1638400 blks
> = sectsz=512 attr=2, projid32bit=0
> data = bsize=4096 blocks=26214400, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
>
>
> I get agcount=4 on a conventional LV as well. Why agcount=16 on thinp?
More information would be helpful, specifically WRT the device stack
underlying mkfs.xfs. I.e. we need to know more about the LVM configuration.
See:
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
--
Stan
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 1:44 ` Stan Hoeppner
@ 2013-08-30 2:08 ` Chris Murphy
2013-08-30 2:58 ` Dave Chinner
0 siblings, 1 reply; 15+ messages in thread
From: Chris Murphy @ 2013-08-30 2:08 UTC (permalink / raw)
To: stan; +Cc: xfs
On Aug 29, 2013, at 7:44 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>
> More information would be helpful, specifically WRT the device stack
> underlying mkfs.xfs. I.e. we need to know more about the LVM configuration.
>
> See:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
Summary: laptop, one HDD, one 402GB partition is made into a PV, one VG is created with that PV and is the only VG on the system, one 400GB logical volume pool is created, one 100GB virtual sized logical volume is created from the thin pool.
Linux f19s.local 3.10.9-200.fc19.x86_64 #1 SMP Wed Aug 21 19:27:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
xfsprogs-3.1.10-2.fc19.x86_64
lvm2-2.02.98-12.fc19.x86_64
DMI: Apple Inc. MacBookPro4,1/Mac-F42C89C8
WDC WD5000BEVT-22ZAT0
[root@f19s ~]# gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.8.7
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 976773168 sectors, 465.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): CB8CACD3-0BDB-46E0-AD6A-D7F99915EA2D
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 976773134
Partitions will be aligned on 8-sector boundaries
Total free space is 478 sectors (239.0 KiB)
Number Start (sector) End (sector) Size Code Name
1 40 409639 200.0 MiB EF00 EFI System Partition
2 409640 49237767 23.3 GiB AF00 Mac
3 49237768 50507303 619.9 MiB AB00 Recovery HD
4 50507776 50712575 100.0 MiB AF00 fedoraEFI
5 50712576 67096575 7.8 GiB 8200 swap
6 67096576 133308415 31.6 GiB 8300 fedoraroot
7 133308416 976773134 402.2 GiB 8E00 LVM2
[root@f19s ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/sda7 vg1 lvm2 a-- 402.19g 188.00m
[root@f19s ~]# lvdisplay
--- Logical volume ---
LV Name thinp
VG Name vg1
LV UUID mKI9dj-1CsO-Ke7e-JcMM-NUDH-MpOD-yMl8Da
LV Write Access read/write
LV Creation host, time f19s.local, 2013-08-29 00:14:38 -0600
LV Pool transaction ID 1
LV Pool metadata thinp_tmeta
LV Pool data thinp_tdata
LV Pool chunk size 4.00 MiB
LV Zero new blocks yes
LV Status available
# open 0
LV Size 402.00 GiB
Allocated pool data 0.95%
Allocated metadata 1.12%
Current LE 102912
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:3
--- Logical volume ---
LV Path /dev/vg1/data
LV Name data
VG Name vg1
LV UUID yL0g16-qjUq-20MJ-Fu3I-q5AH-H8ed-LOhNwS
LV Write Access read/write
LV Creation host, time f19s.local, 2013-08-29 00:15:49 -0600
LV Pool name thinp
LV Status available
# open 0
LV Size 100.00 GiB
Mapped size 3.83%
Current LE 25600
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:4
Commands to create the thinp volume:
[root@f19s ~]# pvcreate /dev/sda7
Physical volume "/dev/sda7" successfully created
[root@f19s ~]# vgcreate vg1 /dev/sda7
Volume group "vg1" successfully created
[root@f19s ~]# lvcreate -L 400G --type thin-pool --thinpool thinp vg1
device-mapper: remove ioctl on failed: Device or resource busy
Logical volume "thinp" created
[root@f19s ~]# lvcreate -V 100G -T vg1/thinp --name data
Logical volume "data" created
[root@f19s ~]# mkfs.xfs /dev/vg1/data
meta-data=/dev/vg1/data isize=256 agcount=16, agsize=1638400 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Whereas if I mkfs.xfs on /dev/sda7, or if I create a regular LV rather than a thinp volume, agcount is 4. It doesn't matter whether I create the thinp with the chunk option set to default (as above) or 1MB or 4MB.
Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 2:08 ` Chris Murphy
@ 2013-08-30 2:58 ` Dave Chinner
2013-08-30 3:21 ` Chris Murphy
0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2013-08-30 2:58 UTC (permalink / raw)
To: Chris Murphy; +Cc: stan, xfs
On Thu, Aug 29, 2013 at 08:08:25PM -0600, Chris Murphy wrote:
>
> On Aug 29, 2013, at 7:44 PM, Stan Hoeppner
> <stan@hardwarefreak.com> wrote:
> >
> > More information would be helpful, specifically WRT the device
> > stack underlying mkfs.xfs. I.e. we need to know more about the
> > LVM configuration.
> >
> > See:
> >
> > http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> Summary: laptop, one HDD, one 402GB partition is made into a PV,
> one VG is created with that PV and is the only VG on the system,
> one 400GB logical volume pool is created, one 100GB virtual sized
> logical volume is created from the thin pool.
....
> meta-data=/dev/vg1/data isize=256 agcount=16, agsize=1638400 blks
> = sectsz=512 attr=2, projid32bit=0
> data = bsize=4096 blocks=26214400, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
> Whereas if I mkfs.xfs on /dev/sda7, or if I create a regular LV
> rather than a thinp volume, agcount is 4. It doesn't matter
> whether I create the thinp with the chunk option set to default
> (as above) or 1MB or 4MB.
Which means that the thinp device is telling mkfs.xfs something
different about its configuration that makes mkfs.xfs think
it is a RAID volume, not a single disk.
Basically, I think you'll find that the thinp device is emitting
an optimal IO size that is not aligned to the filesystem block size,
so the AG count is being calculated as though it is a ~1TB
"multidisk" device (which gives 16 AGs), and then sunit/swidth are
set to zero because they aren't filesystem block aligned...
Check the contents of
/sys/block/<dev>/queue/{minimum,optimal}_io_size for the single
device, the standard LV and the thinp device. I think that you'll
find only the thinp device has a non-zero value. If the value from
the thinp code is 512 (i.e. single sector) then that's a bug in
the thinp device code as it should be zero...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: higher agcount on LVM2 thinp volumes
2013-08-29 6:09 higher agcount on LVM2 thinp volumes Chris Murphy
2013-08-30 1:44 ` Stan Hoeppner
@ 2013-08-30 3:04 ` Eric Sandeen
2013-08-30 3:18 ` Chris Murphy
1 sibling, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2013-08-30 3:04 UTC (permalink / raw)
To: Chris Murphy; +Cc: xfs
On 8/29/13 1:09 AM, Chris Murphy wrote:
>
> Is it expected that, when formatting with defaults, a thinp volume
> gets a higher agcount than a conventional LV or partition of the
> same size?
woohoo, thinp testing! :)
>
> HDD, GPT partitioned, 100GB partition size:
>
> [root@f19s ~]# mkfs.xfs /dev/sda7
> meta-data=/dev/sda7 isize=256 agcount=4, agsize=6553600 blks
> = sectsz=512 attr=2, projid32bit=0
> data = bsize=4096 blocks=26214400, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
>
> A 400GB partition is made into a PV, the PV added to a VG, and all extents put into a thin pool; a 100GB virtual-sized LV is created from the pool:
>
> [root@f19s ~]# mkfs.xfs /dev/mapper/vg1-data
> meta-data=/dev/mapper/vg1-data isize=256 agcount=16, agsize=1638400 blks
> = sectsz=512 attr=2, projid32bit=0
> data = bsize=4096 blocks=26214400, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
>
>
> I get agcount=4 on a conventional LV as well. Why agcount=16 on thinp?
Hm.
calc_default_ag_geometry() in mkfs does this stuff. There is a "multidisk" mode
which creates more AGs, but you don't have any stripe geometry set, which is what
is supposed to trigger it.
What does
# blockdev --getiomin --getioopt mkfs.xfs /dev/mapper/vg1-data
say? I'm guessing we picked multidisk mode due to what looks like stripe
geometry, but then maybe it was out of bounds, and we turned it back off.
Or something... that does seem likely though, because on the same sized
fs created on a file:
w/o stripe geometry we get 4 ags:
$ mkfs.xfs -dfile,name=testfile,size=107374182400
meta-data=testfile isize=256 agcount=4, agsize=6553600 blks
...
w/ stripe geometry we get 16 ags:
$ mkfs.xfs -dfile,name=testfile,size=107374182400,su=64k,sw=4
meta-data=testfile isize=256 agcount=16, agsize=1638384 blks
...
so at the time we calculated AGs, we thought we had stripe geometry, and then
eventually discarded it.
-Eric
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:04 ` Eric Sandeen
@ 2013-08-30 3:18 ` Chris Murphy
2013-08-30 3:19 ` Eric Sandeen
0 siblings, 1 reply; 15+ messages in thread
From: Chris Murphy @ 2013-08-30 3:18 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
On Aug 29, 2013, at 9:04 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> What does
>
> # blockdev --getiomin --getioopt mkfs.xfs /dev/mapper/vg1-data
>
> say?
[root@f19s ~]# blockdev --getiomin --getioopt mkfs.xfs /dev/mapper/vg1-data
blockdev: cannot open mkfs.xfs: No such file or directory
Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:18 ` Chris Murphy
@ 2013-08-30 3:19 ` Eric Sandeen
2013-08-30 3:24 ` Chris Murphy
0 siblings, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2013-08-30 3:19 UTC (permalink / raw)
To: Chris Murphy; +Cc: xfs
On 8/29/13 10:18 PM, Chris Murphy wrote:
>
> On Aug 29, 2013, at 9:04 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>
>> What does
>>
>> # blockdev --getiomin --getioopt mkfs.xfs /dev/mapper/vg1-data
>>
>> say?
>
>
> [root@f19s ~]# blockdev --getiomin --getioopt mkfs.xfs /dev/mapper/vg1-data
> blockdev: cannot open mkfs.xfs: No such file or directory
Argh sorry, how did I type THAT?
# blockdev --getiomin --getioopt /dev/mapper/vg1-data
-Eric
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 2:58 ` Dave Chinner
@ 2013-08-30 3:21 ` Chris Murphy
2013-08-30 3:38 ` Dave Chinner
0 siblings, 1 reply; 15+ messages in thread
From: Chris Murphy @ 2013-08-30 3:21 UTC (permalink / raw)
To: Dave Chinner; +Cc: stan, xfs
On Aug 29, 2013, at 8:58 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Check the contents of
> /sys/block/<dev>/queue/{minimum,optimal}_io_size for the single
> device, the standard LV and the thinp device.
physical device:
[root@f19s ~]# cat /sys/block/sda/queue/minimum_io_size
512
[root@f19s ~]# cat /sys/block/sda/queue/optimal_io_size
0
conventional LV on that physical device:
[root@f19s ~]# cat /sys/block/dm-0/queue/minimum_io_size
512
[root@f19s ~]# cat /sys/block/dm-0/queue/optimal_io_size
0
thinp pool and LV:
lrwxrwxrwx. 1 root root 7 Aug 29 20:46 vg1-thinp -> ../dm-3
[root@f19s ~]# cat /sys/block/dm-3/queue/minimum_io_size
512
[root@f19s ~]# cat /sys/block/dm-3/queue/optimal_io_size
262144
[root@f19s ~]#
lrwxrwxrwx. 1 root root 7 Aug 29 20:47 vg1-data -> ../dm-4
[root@f19s ~]# cat /sys/block/dm-4/queue/minimum_io_size
512
[root@f19s ~]# cat /sys/block/dm-4/queue/optimal_io_size
262144
Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:19 ` Eric Sandeen
@ 2013-08-30 3:24 ` Chris Murphy
2013-08-30 3:29 ` Chris Murphy
2013-08-30 3:35 ` Eric Sandeen
0 siblings, 2 replies; 15+ messages in thread
From: Chris Murphy @ 2013-08-30 3:24 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
On Aug 29, 2013, at 9:19 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>
> Argh sorry, how did I type THAT?
>
> # blockdev --getiomin --getioopt /dev/mapper/vg1-data
conventional LV:
[root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
512
0
thinp LV:
[root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
512
262144
(Now I see two ways to get the same info.)
Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:24 ` Chris Murphy
@ 2013-08-30 3:29 ` Chris Murphy
2013-08-30 3:35 ` Eric Sandeen
1 sibling, 0 replies; 15+ messages in thread
From: Chris Murphy @ 2013-08-30 3:29 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Stan Hoeppner, xfs
On Aug 29, 2013, at 9:24 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Aug 29, 2013, at 9:19 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>
>> Argh sorry, how did I type THAT?
>>
>> # blockdev --getiomin --getioopt /dev/mapper/vg1-data
>
> conventional LV:
> [root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
> 512
> 0
>
> thinp LV:
>
> [root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
> 512
> 262144
It's tied to the chunk size of the thinp. If I create a 4MB chunk size, ioopt goes up to match it. The default is 256KB, which reflects the values above.
[root@f19s ~]# lvcreate -L 400G --type thin-pool -c 4M --thinpool thinp vg1
device-mapper: remove ioctl on failed: Device or resource busy
Logical volume "thinp" created
[root@f19s ~]# lvcreate -V 100G -T vg1/thinp --name data
Logical volume "data" created
[root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
512
4194304
Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:24 ` Chris Murphy
2013-08-30 3:29 ` Chris Murphy
@ 2013-08-30 3:35 ` Eric Sandeen
1 sibling, 0 replies; 15+ messages in thread
From: Eric Sandeen @ 2013-08-30 3:35 UTC (permalink / raw)
To: Chris Murphy; +Cc: xfs
On 8/29/13 10:24 PM, Chris Murphy wrote:
>
> On Aug 29, 2013, at 9:19 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>
>> Argh sorry, how did I type THAT?
>>
>> # blockdev --getiomin --getioopt /dev/mapper/vg1-data
>
> conventional LV:
> [root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
> 512
> 0
>
> thinp LV:
>
> [root@f19s ~]# blockdev --getiomin --getioopt /dev/mapper/vg1-data
> 512
> 262144
>
> (Now I see two ways to get the same info.)
:)
ok so it says the stripe unit (minimum IO size) is 512...
Around line 2240, it does:
if (dsunit && !(BBTOB(dsunit) % blocksize) &&
dswidth && !(BBTOB(dswidth) % blocksize)) {
...
} else {
if (nodsflag)
dsunit = dswidth = 0;
essentially saying: If we autodetected a stripe unit or stripe width
which is not a multiple of the block size, silently set it to 0.
So we do that.
However, _just_ before this, we did:
calc_default_ag_geometry(blocklog, dblocks,
dsunit | dswidth, &agsize, &agcount);
when dsunit & dswidth were still set (to invalid values).
So we calculated it w/ stripe geom set, got more AGs, then zeroed
out the stripe geom.
I'm ... not sure how many bugs are here. ;) We shouldn't calculate
AG geometry until we've validated sunit/swidth, I think. But I'm not
convinced that dm-thinp's exported values make a lot of sense either.
-Eric
>
> Chris Murphy
>
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:21 ` Chris Murphy
@ 2013-08-30 3:38 ` Dave Chinner
2013-08-30 17:55 ` Chris Murphy
0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2013-08-30 3:38 UTC (permalink / raw)
To: Chris Murphy; +Cc: stan, xfs
On Thu, Aug 29, 2013 at 09:21:15PM -0600, Chris Murphy wrote:
>
> On Aug 29, 2013, at 8:58 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > Check the contents of
> > /sys/block/<dev>/queue/{minimum,optimal}_io_size for the single
> > device, the standard LV and the thinp device.
>
> physical device:
>
> [root@f19s ~]# cat /sys/block/sda/queue/minimum_io_size
> 512
> [root@f19s ~]# cat /sys/block/sda/queue/optimal_io_size
> 0
>
> conventional LV on that physical device:
>
> [root@f19s ~]# cat /sys/block/dm-0/queue/minimum_io_size
> 512
> [root@f19s ~]# cat /sys/block/dm-0/queue/optimal_io_size
> 0
>
>
> thinp pool and LV:
>
> lrwxrwxrwx. 1 root root 7 Aug 29 20:46 vg1-thinp -> ../dm-3
>
> [root@f19s ~]# cat /sys/block/dm-3/queue/minimum_io_size
> 512
> [root@f19s ~]# cat /sys/block/dm-3/queue/optimal_io_size
> 262144
> [root@f19s ~]#
>
> lrwxrwxrwx. 1 root root 7 Aug 29 20:47 vg1-data -> ../dm-4
>
> [root@f19s ~]# cat /sys/block/dm-4/queue/minimum_io_size
> 512
> [root@f19s ~]# cat /sys/block/dm-4/queue/optimal_io_size
> 262144
Yup, there's the problem - minimum_io_size is 512 bytes, which is
too small for a stripe unit to be set to. Hence sunit/swidth get set
to zero.
The problem here is that minimum_io_size is not the minimum IO size
that can be done, but the minimum IO size that is *efficient*. For
example, my workstation has a MD RAID0 device with a 512k chunk size
and two drives:
$ cat /sys/block/md0/queue/minimum_io_size
524288
$ cat /sys/block/md0/queue/optimal_io_size
1048576
Here we see the minimum *efficient* IO size is the stripe chunk size
(i.e. what gets written to a single disk) and the optimal is an IO
that hits all disks at once.
So, what dm-thinp is trying to tell us is that the minimum
*physical* IO size is 512 bytes (i.e. /sys/.../physical_block_size)
but the efficient IO size is 256k. So dm-thinp is exposing the
information incorrectly. What it should be doing is setting both the
minimum_io_size and the optimal_io_size to the same value of 256k...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 3:38 ` Dave Chinner
@ 2013-08-30 17:55 ` Chris Murphy
2013-08-31 1:22 ` Eric Sandeen
0 siblings, 1 reply; 15+ messages in thread
From: Chris Murphy @ 2013-08-30 17:55 UTC (permalink / raw)
To: Dave Chinner; +Cc: stan, xfs
On Aug 29, 2013, at 9:38 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> So, what dm-thinp is trying to tell us is that the minimum
> *physical* IO size is 512 bytes (i.e. /sys/.../physical_block_size)
> but the efficient IO size is 256k. So dm-thinp is exposing the
> information incorrectly. What it should be doing is setting both the
> minimum_io_size and the optimal_io_size to the same value of 256k…
Should I file a bug? Against lvm2?
Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-30 17:55 ` Chris Murphy
@ 2013-08-31 1:22 ` Eric Sandeen
2013-09-01 3:39 ` Chris Murphy
0 siblings, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2013-08-31 1:22 UTC (permalink / raw)
To: Chris Murphy; +Cc: stan@hardwarefreak.com, xfs@oss.sgi.com
On Aug 30, 2013, at 12:55 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Aug 29, 2013, at 9:38 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> So, what dm-thinp is trying to tell us is that the minimum
>> *physical* IO size is 512 bytes (i.e. /sys/.../physical_block_size)
>> but the efficient IO size is 256k. So dm-thinp is exposing the
>> information incorrectly. What it should be doing is setting both the
>> minimum_io_size and the optimal_io_size to the same value of 256k…
>
> Should I file a bug? Against lvm2?
>
>
I think so. They may already be aware of it but better to not lose it...
Eric
>
> Chris Murphy
* Re: higher agcount on LVM2 thinp volumes
2013-08-31 1:22 ` Eric Sandeen
@ 2013-09-01 3:39 ` Chris Murphy
0 siblings, 0 replies; 15+ messages in thread
From: Chris Murphy @ 2013-09-01 3:39 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Stan Hoeppner, xfs
On Aug 30, 2013, at 7:22 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> I think so. They may already be aware of it but better to not lose it…
Done.
https://bugzilla.redhat.com/show_bug.cgi?id=1003227
Chris Murphy
end of thread, other threads:[~2013-09-01 3:39 UTC | newest]
Thread overview: 15+ messages
2013-08-29 6:09 higher agcount on LVM2 thinp volumes Chris Murphy
2013-08-30 1:44 ` Stan Hoeppner
2013-08-30 2:08 ` Chris Murphy
2013-08-30 2:58 ` Dave Chinner
2013-08-30 3:21 ` Chris Murphy
2013-08-30 3:38 ` Dave Chinner
2013-08-30 17:55 ` Chris Murphy
2013-08-31 1:22 ` Eric Sandeen
2013-09-01 3:39 ` Chris Murphy
2013-08-30 3:04 ` Eric Sandeen
2013-08-30 3:18 ` Chris Murphy
2013-08-30 3:19 ` Eric Sandeen
2013-08-30 3:24 ` Chris Murphy
2013-08-30 3:29 ` Chris Murphy
2013-08-30 3:35 ` Eric Sandeen