* [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Uri Lublin @ 2009-01-08 18:49 UTC (permalink / raw)
To: qemu-devel; +Cc: Uri Lublin
From: Uri Lublin <uril@redhat.com>
This patchset lets the user know the highest allocated byte of a qcow2 image.
More precisely, it is the first unallocated byte after the highest byte written,
aligned to the cluster size.
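A minimal sketch of the cluster alignment described above. The helper name and
its parameters are hypothetical and are not taken from the patch; it only
assumes that the cluster size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): given the offset of the highest
 * byte written to the image file, return the first byte of the following
 * cluster, i.e. the first unallocated byte, aligned to the cluster size.
 */
static uint64_t highest_allocated_offset(uint64_t highest_written_byte,
                                         uint64_t cluster_size)
{
    /* The cluster size is assumed to be a non-zero power of two. */
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);

    /* Index of the cluster holding the highest written byte, plus one. */
    return (highest_written_byte / cluster_size + 1) * cluster_size;
}
```

With 64 KiB clusters, a highest written byte at offset 65535 gives 65536, and
one at offset 65536 gives 131072.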
The highest allocated byte gives an upper bound (easy to calculate)
on the number of bytes allocated for that image, and may hint at how many more
allocations can be done before we reach end-of-file (the end of the host block
device).
Although there may be many free blocks below that offset (allocated and then
freed), the file system cannot deallocate those blocks, and they have to be
reused by qemu. Also note that, due to fragmentation, those free blocks may not
be used by the next allocations.
It can be useful for truncating backing file images (ftruncate).
It may also be useful for defragmentation later (although we'll need
the number of free blocks as well).
The first patch calculates the highest byte for qcow2 images (block-qcow2.c).
The second patch exposes it through a BlockDeviceInfo.
The third patch term_prints it upon "info blockstats" (for qcow2 images).
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Anthony Liguori @ 2009-01-08 19:37 UTC (permalink / raw)
To: qemu-devel; +Cc: Uri Lublin
Uri Lublin wrote:
> From: Uri Lublin <uril@redhat.com>
>
> This patchset let the user know the highest allocated byte of qcow2
> images.
> Actually it's the first unallocated byte after the highest byte written,
> cluster-size aligned.
>
> The highest allocated byte gives a maximal limit (easy to calculate)
> to the number of bytes allocated for that image, and may hint how many
> more allocations can be done before we reach end-of-file (end of host
> block device).
> Although there may be many free blocks below that number (allocated
> and freed)
> the file system can not deallocate those blocks, and they have to be
> reused
> by qemu. Also note that due to fragmentation those free blocks may not
> be used on next allocations.
>
> It can be useful for truncation of backing file images (ftruncate).
> Also it may be useful for defragmentation later (although we'll need
> the number of free blocks as well).
I'm having trouble seeing the utility of this as it seems to be not
really reliable. Surely, after a lot of work, you'll have one block far
at the end of the file, no? I don't see how knowing this location helps
practically speaking. Can you explain a little more about what you want
to use this functionality for?
Regards,
Anthony Liguori
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Kevin Wolf @ 2009-01-09 9:09 UTC (permalink / raw)
To: qemu-devel; +Cc: Uri Lublin
Uri Lublin wrote:
> Although there may be many free blocks below that number (allocated and
> freed)
> the file system can not deallocate those blocks, and they have to be reused
> by qemu. Also note that due to fragmentation those free blocks may not
> be used on next allocations.
Any idea what it would mean for performance if we changed the behaviour
so that s->free_cluster_index always points to the lowest free cluster? Then
most of the fragmentation should be gone.
If the impact were too big, we could still change the code to use two
free_cluster_indexes, one for single-cluster allocations and one for
larger blocks. This was suggested earlier and I think there were even
patches for it, but I don't remember who exactly suggested it.
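The two-index idea can be sketched roughly as follows. This is a toy model
under stated assumptions: a fixed-size in-memory refcount array and the
hypothetical names two_index_alloc, find_free_run and
two_index_alloc_clusters. It is not the earlier patch mentioned above and not
code from block-qcow2.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_CLUSTERS 16

/* Toy allocator state: a refcount of 0 means the cluster is free. */
struct two_index_alloc {
    uint16_t refcount[NB_CLUSTERS];
    size_t single_index; /* search start for one-cluster requests */
    size_t multi_index;  /* search start for larger requests */
};

/* Return the first index of a run of n free clusters at or after start,
 * or -1 if no such run exists. */
static long find_free_run(const uint16_t *refcount, size_t start, size_t n)
{
    size_t run = 0;
    size_t i;

    for (i = start; i < NB_CLUSTERS; i++) {
        run = (refcount[i] == 0) ? run + 1 : 0;
        if (run == n) {
            return (long)(i + 1 - n);
        }
    }
    return -1;
}

/* Allocate n contiguous clusters, using a separate search pointer for
 * single-cluster requests so that large requests cannot push it forward. */
static long two_index_alloc_clusters(struct two_index_alloc *s, size_t n)
{
    size_t *index;
    long first;
    size_t j;

    assert(n > 0);
    index = (n == 1) ? &s->single_index : &s->multi_index;
    first = find_free_run(s->refcount, *index, n);
    if (first < 0) {
        return -1;
    }
    for (j = 0; j < n; j++) {
        s->refcount[(size_t)first + j] = 1;
    }
    *index = (size_t)first + n;
    return first;
}
```

In this model a multi-cluster request still skips one-cluster holes, but the
next single-cluster request starts from its own pointer and can fill them.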
Kevin
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Uri Lublin @ 2009-01-11 9:26 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
Anthony Liguori wrote:
> Uri Lublin wrote:
>> From: Uri Lublin <uril@redhat.com>
>>
>> This patchset let the user know the highest allocated byte of qcow2
>> images.
>> Actually it's the first unallocated byte after the highest byte written,
>> cluster-size aligned.
>>
>> The highest allocated byte gives a maximal limit (easy to calculate)
>> to the number of bytes allocated for that image, and may hint how many
>> more allocations can be done before we reach end-of-file (end of host
>> block device).
>> Although there may be many free blocks below that number (allocated
>> and freed)
>> the file system can not deallocate those blocks, and they have to be
>> reused
>> by qemu. Also note that due to fragmentation those free blocks may not
>> be used on next allocations.
>>
>> It can be useful for truncation of backing file images (ftruncate).
>> Also it may be useful for defragmentation later (although we'll need
>> the number of free blocks as well).
>
> I'm having trouble seeing the utility of this as it seems to be not
> really reliable. Surely, after a lot of work, you'll have one block far
> at the end of the file, no? I don't see how knowing this location helps
> practically speaking. Can you explain a little more about what you want
> to use this functionality for?
>
Currently, qcow2 images can only grow, never shrink.
The main use would be to trigger an appropriate operation when a threshold is
reached. The threshold and the operation are defined by a management
application.
Basically, we can do one of the following:
1. Defragment the qcow2 image (the simplest way is to qemu-img convert it; the
best is to do it online, if possible).
2. Allocate more space (especially when using LVM).
I plan on adding another "blockstat" that shows the number of free
bytes/blocks/clusters for a qcow2 image. This would make it easier to choose the
appropriate operation above.
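A rough sketch of how a management application might combine the two
statistics to pick one of the operations above. Everything here is an
assumption made for illustration: the enum, the function name, and in
particular the "at least half free" rule are not part of the patchset.

```c
#include <assert.h>
#include <stdint.h>

enum image_action { ACTION_NONE, ACTION_DEFRAGMENT, ACTION_EXTEND };

/*
 * Hypothetical policy:
 *   highest_alloc  highest allocated offset reported by qemu, in bytes
 *   free_bytes     free bytes below highest_alloc (the planned second stat)
 *   device_size    size of the underlying file or volume, in bytes
 *   threshold_pct  act once highest_alloc reaches this percentage of the device
 */
static enum image_action choose_action(uint64_t highest_alloc,
                                       uint64_t free_bytes,
                                       uint64_t device_size,
                                       unsigned threshold_pct)
{
    assert(threshold_pct <= 100);

    /* Below the threshold there is nothing to do yet. */
    if (highest_alloc < device_size / 100 * threshold_pct) {
        return ACTION_NONE;
    }
    /* Illustrative rule: if at least half of the space below the highest
     * allocated offset is free, defragmenting reclaims a lot of room. */
    if (free_bytes >= highest_alloc / 2) {
        return ACTION_DEFRAGMENT;
    }
    /* Otherwise the image really is full, so grow the volume. */
    return ACTION_EXTEND;
}
```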
Thanks,
Uri.
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Uri Lublin @ 2009-01-11 9:31 UTC (permalink / raw)
To: Kevin Wolf; +Cc: qemu-devel
Kevin Wolf wrote:
> Uri Lublin wrote:
>> Although there may be many free blocks below that number (allocated and
>> freed)
>> the file system can not deallocate those blocks, and they have to be reused
>> by qemu. Also note that due to fragmentation those free blocks may not
>> be used on next allocations.
>
> Any idea what would it mean to performance if we changed the behaviour
> so that s->free_cluster_index always points to lowest free cluster? Then
> most of the fragmentation should be gone.
I don't know, it has to be implemented and measured.
>
> If the impact would be too big we could still change the code to use two
> free_cluster_indexes, one for single cluster allocation and one for
> larger blocks. This was suggested earlier and I think there were even
> patches for it, but I don't seem to remember who exactly suggested this.
That should make qcow2 images less fragmented.
Thanks,
Uri.
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Shahar Frank @ 2009-01-11 14:56 UTC (permalink / raw)
To: qemu-devel; +Cc: Uri Lublin
Kevin Wolf wrote:
> Uri Lublin wrote:
>> Although there may be many free blocks below that number (allocated and
>> freed)
>> the file system can not deallocate those blocks, and they have to be reused
>> by qemu. Also note that due to fragmentation those free blocks may not
>> be used on next allocations.
>
> Any idea what would it mean to performance if we changed the behaviour
> so that s->free_cluster_index always points to lowest free cluster? Then
> most of the fragmentation should be gone.
>
free_cluster_index is already pointing at the lowest known free space. The
problem is that its update logic is very simplistic, so an allocation
of multiple clusters may cause this pointer to skip many single free clusters
(in fact it will skip all free cluster sequences that are shorter than the
requested number), so the next allocation may miss them. This increases
fragmentation. Note that this wasn't so important until Laurent Vivier
implemented his optimizations that allocate cluster sequences.
See block-qcow2.c:alloc_clusters_noref() and
block-qcow2.c:update_cluster_refcount().
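The skipping behaviour described above can be illustrated with a toy model.
This is a sketch written from the description in this thread, not a copy of
alloc_clusters_noref(); the struct, the function name and the fixed-size
refcount array are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_CLUSTERS 16

/* Toy allocator state: a refcount of 0 means the cluster is free. */
struct toy_alloc {
    uint16_t refcount[NB_CLUSTERS];
    size_t free_cluster_index; /* where the next search starts */
};

/*
 * Find n contiguous free clusters at or after free_cluster_index, mark them
 * allocated and return the index of the first one, or -1 if none fit.
 * The search pointer ends up past the allocated run, so every shorter free
 * run that was scanned on the way is skipped by later allocations.
 */
static long toy_alloc_clusters(struct toy_alloc *s, size_t n)
{
    size_t run = 0;
    size_t i, j;

    assert(n > 0);
    for (i = s->free_cluster_index; i < NB_CLUSTERS; i++) {
        if (s->refcount[i] != 0) {
            run = 0; /* the run is broken, start counting again */
            continue;
        }
        if (++run == n) {
            size_t first = i + 1 - n;
            for (j = first; j <= i; j++) {
                s->refcount[j] = 1;
            }
            s->free_cluster_index = i + 1;
            return (long)first;
        }
    }
    return -1;
}
```

Starting with clusters 0, 2 and 4 in use, a request for two clusters lands at
cluster 5 and moves the pointer to 7. A following single-cluster request then
gets cluster 7, while the free clusters 1 and 3 stay unused.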
> If the impact would be too big we could still change the code to use two
> free_cluster_indexes, one for single cluster allocation and one for
> larger blocks. This was suggested earlier and I think there were even
> patches for it, but I don't seem to remember who exactly suggested this.
>
I suggested it as part of the first zero-dedup patch, and that was
because I suspected that the zero dedup may increase fragmentation due
to that simplistic free cluster index. In fact, having two or even
several free pointers is probably a step in the right direction, but we
may need a better allocation mechanism to really solve the problem
(a B+ tree structure, or something else). The target should be a decent
extent-based allocation. This would improve qcow2 performance and handle the
fragmentation problem. The problem is that it will probably change the
qcow2 internals, so we may be better off implementing a simple approach for
qcow2 and starting to design qcow3...
> Kevin
>
>
Shahar
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Shahar Frank @ 2009-01-11 15:16 UTC (permalink / raw)
To: qemu-devel
Uri Lublin wrote:
> Anthony Liguori wrote:
>> Uri Lublin wrote:
>>> From: Uri Lublin <uril@redhat.com>
>>>
>>> This patchset let the user know the highest allocated byte of qcow2
>>> images.
>>> Actually it's the first unallocated byte after the highest byte written,
>>> cluster-size aligned.
>>>
>>> The highest allocated byte gives a maximal limit (easy to calculate)
>>> to the number of bytes allocated for that image, and may hint how
>>> many more allocations can be done before we reach end-of-file (end of
>>> host block device).
>>> Although there may be many free blocks below that number (allocated
>>> and freed)
>>> the file system can not deallocate those blocks, and they have to be
>>> reused
>>> by qemu. Also note that due to fragmentation those free blocks may not
>>> be used on next allocations.
>>>
>>> It can be useful for truncation of backing file images (ftruncate).
>>> Also it may be useful for defragmentation later (although we'll need
>>> the number of free blocks as well).
>>
>> I'm having trouble seeing the utility of this as it seems to be not
>> really reliable. Surely, after a lot of work, you'll have one block
>> far at the end of the file, no? I don't see how knowing this location
>> helps practically speaking. Can you explain a little more about what
>> you want to use this functionality for?
>>
>
> Currently, qcow2 images can only grow, never shrink.
> The main usage would be to trigger an appropriate operation when a
> threshold is reached. The threshold and operation are defined by a
> management application.
> Basically we can do one of the following:
> 1. Defragment the qcow2 image (simplest way is to qemu-img convert it,
> the best is to do it online if possible).
> 2. Allocate more space (especially when using LVM)
>
> I plan on adding another "blockstat" that shows the number of free
> bytes/blocks/clusters for a qcow2 image. This would make it easier to
> choose the appropriate operation above.
>
As Uri wrote, this patch is part of a patch set that will also include
free cluster statistics. Together, these new statistics will enable an
image repository system to quickly estimate how "fragmented" a given
image is, so it can decide whether to perform expensive
cleaning/shrinking/defragmenting processes on that image.
The highest allocated byte is also critical for running qcow2 images
over raw devices: it gives you a good estimate of how much space you still
have for the image to expand. This is a somewhat pessimistic statistic, but
due to the much too simplistic allocation mechanism within qcow2 and the
recent multiple-cluster allocation optimizations, it may also be a
pretty realistic figure.
Note that if you run qcow2 over expandable volumes such as LVM volumes,
you can use this figure in a monitoring process that will expand the
volume once a threshold is passed.
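A minimal sketch of the headroom estimate and the threshold check such a
monitoring process might apply. The function names and the min_free_bytes
parameter are hypothetical; how the device size and the highest allocated
offset are obtained is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Pessimistic estimate of the bytes still available for the image to grow:
 * everything between the highest allocated offset and the end of the device.
 * Free clusters below the highest allocated offset are ignored. */
static uint64_t qcow2_headroom(uint64_t device_size, uint64_t highest_alloc)
{
    assert(device_size > 0);
    return highest_alloc >= device_size ? 0 : device_size - highest_alloc;
}

/* Hypothetical monitor check: returns 1 when the volume should be expanded
 * because the headroom has dropped below min_free_bytes, 0 otherwise. */
static int should_extend_volume(uint64_t device_size, uint64_t highest_alloc,
                                uint64_t min_free_bytes)
{
    return qcow2_headroom(device_size, highest_alloc) < min_free_bytes;
}
```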
Shahar
> Thanks,
> Uri.
>
>
* Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
From: Kevin Wolf @ 2009-01-12 9:50 UTC (permalink / raw)
To: qemu-devel
Shahar Frank wrote:
> Kevin Wolf wrote:
>> Uri Lublin wrote:
>>> Although there may be many free blocks below that number (allocated and
>>> freed)
>>> the file system can not deallocate those blocks, and they have to be
>>> reused
>>> by qemu. Also note that due to fragmentation those free blocks may not
>>> be used on next allocations.
>>
>> Any idea what would it mean to performance if we changed the behaviour
>> so that s->free_cluster_index always points to lowest free cluster? Then
>> most of the fragmentation should be gone.
>>
> free_cluster_index is already pointing at the lowest known free space. The
Right, that's what I thought, too, until I looked at the code again.
alloc_clusters_noref() moves free_cluster_index forward if the needed
number of clusters doesn't fit right there. It's only set back to the
lowest free cluster when that cluster is freed later.
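The "set back on free" half of this behaviour can be sketched as a toy model.
The names toy_state and toy_free_cluster are hypothetical and the code is
written from the description above, not copied from block-qcow2.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_CLUSTERS 16

/* Toy allocator state: a refcount of 0 means the cluster is free. */
struct toy_state {
    uint16_t refcount[NB_CLUSTERS];
    size_t free_cluster_index; /* where the next allocation search starts */
};

/* Drop one reference to cluster k. If the cluster becomes free and lies
 * below the search pointer, move the pointer back to it. Holes that were
 * skipped earlier and are not freed again never pull the pointer back. */
static void toy_free_cluster(struct toy_state *s, size_t k)
{
    assert(k < NB_CLUSTERS);
    if (s->refcount[k] == 0) {
        return; /* already free, nothing to do */
    }
    s->refcount[k]--;
    if (s->refcount[k] == 0 && k < s->free_cluster_index) {
        s->free_cluster_index = k;
    }
}
```

If cluster 1 is an old hole and the pointer sits at 7, freeing cluster 3 moves
the pointer to 3, so cluster 1 is still not found by the next search.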
> problem is that its update logic is very simplistic, so an allocation
> of multiple clusters may cause this pointer to skip many single free
> clusters (in fact it will skip all free cluster sequences that are shorter
> than the requested number), so the next allocation may miss them. This
> increases fragmentation. Note that this wasn't so important until Laurent
> Vivier implemented his optimizations that allocate cluster sequences.
Yes, it wouldn't have been a problem before these patches. But now what
is the right thing to do if we have some one-cluster holes and want
to write a larger block? There are only two options: try to find a place
where the clusters are physically contiguous, for better performance but
at the cost of fragmentation (that's what we do today), or fill up all
the holes first at the cost of performance (we could do that with a few
lines of code).
> In fact, having two or even
> several free pointers is probably a step in the right direction, but we
> may need a better allocation mechanism to really solve the problem
> (a B+ tree structure, or something else). The target should be a decent
> extent-based allocation. This would improve qcow2 performance and handle the
> fragmentation problem. The problem is that it will probably change the
> qcow2 internals, so we may be better off implementing a simple approach for
> qcow2 and starting to design qcow3...
Maybe you're right. But actually I don't feel like starting qcow3 now...
And it would be a long term thing anyway.
Kevin