From: Peter Lieven <pl@kamp.de>
To: Hu Tao <hutao@cn.fujitsu.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH v2 5/6] qcow2: implement bdrv_preallocate
Date: Thu, 28 Nov 2013 11:03:04 +0100
Message-ID: <529714D8.5010304@kamp.de>
In-Reply-To: <20131128084859.GN24296@G08FNSTD100614.fnst.cn.fujitsu.com>
On 28.11.2013 09:48, Hu Tao wrote:
> On Wed, Nov 27, 2013 at 11:13:40AM +0100, Peter Lieven wrote:
>> On 27.11.2013 11:07, Fam Zheng wrote:
>>> On 2013-11-27 18:03, Peter Lieven wrote:
>>>> On 27.11.2013 07:40, Fam Zheng wrote:
>>>>> On 2013-11-27 14:01, Hu Tao wrote:
>>>>>> On Wed, Nov 27, 2013 at 11:01:23AM +0800, Fam Zheng wrote:
>>>>>>> On 2013-11-27 10:15, Hu Tao wrote:
>>>>>>>> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>>>>>>>> ---
>>>>>>>> block/qcow2.c | 7 +++++++
>>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>>>>>>> index b054a01..a23fade 100644
>>>>>>>> --- a/block/qcow2.c
>>>>>>>> +++ b/block/qcow2.c
>>>>>>>> @@ -2180,6 +2180,12 @@ static int qcow2_amend_options(BlockDriverState *bs,
>>>>>>>>      return 0;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +static int qcow2_preallocate(BlockDriverState *bs, int64_t offset,
>>>>>>>> +                             int64_t length)
>>>>>>>> +{
>>>>>>>> +    return bdrv_preallocate(bs->file, offset, length);
>>>>>>>> +}
>>>>>>>> +
>>>>>>> What are the semantics of .bdrv_preallocate? I think you should map
>>>>>>> [offset, offset + length) to clusters in image file, and then
>>>>>>> forward to bs->file, rather than this direct wrapper.
>>>>>>>
>>>>>>> E.g. bdrv_preallocate(qcow2_bs, 0, cluster_size) should call
>>>>>>> bdrv_preallocate(qcow2_bs->file, offset_off_first_cluster,
>>>>>>> cluster_size).
>>>>>> You mean data clusters here, right? Is there a single function to get
>>>>>> the offset of the first data cluster?
>>>>>>
>>>>> There is a function, qcow2_get_cluster_offset.
>>>> This should return no valid offset as long as the cluster is not allocated.
>>>>
>>>> I think you actually have to "write" all clusters of a qcow2 one by one.
>>>> Eventually this write could be an fallocate call instead of a zero write.
>>>>
>>> Yes, I was wrong about qcow2_get_cluster_offset. The logic here is more like cluster allocation in qcow2_alloc_cluster_offset. Maybe we can reuse that.
>> What I don't like about the preallocation is that we would lose the
>> information that a cluster contains no valid data, and would then read
>> it anyway, e.g. during conversion.
> So the information is stored in the table, and you mean we shouldn't
> clear the table when doing preallocation? I'm not sure how the
> information could be useful on a newly created image, but it seems
> ideal to keep the information in the table.
When you later want to e.g. convert this qcow2 image, the performance is
lower than necessary, because you read all those preallocated sectors
although you could know that they are empty.
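
To illustrate the cost, here is a minimal sketch in C. It is not QEMU
code: TOY_ALLOCATED, TOY_READS_ZERO and toy_clusters_to_read() are
hypothetical names, and the table is a flat array of per-cluster flags
rather than real qcow2 L1/L2 tables. A converter only has to read the
clusters that are allocated and not known to be empty, so dropping that
knowledge during preallocation makes it read every cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cluster state; this is not the qcow2 on-disk format. */
#define TOY_ALLOCATED   (1u << 0)  /* space is reserved in the image file */
#define TOY_READS_ZERO  (1u << 1)  /* cluster is known to hold no valid data */

/* Count the clusters a converter would actually have to read. */
static size_t toy_clusters_to_read(const uint8_t *table, size_t n)
{
    size_t count = 0;

    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Skip clusters that are unallocated or flagged as empty. */
        if ((table[i] & TOY_ALLOCATED) && !(table[i] & TOY_READS_ZERO)) {
            count++;
        }
    }
    return count;
}
```

With four preallocated clusters of which one has been written, the
flagged table needs one read, while a table that has lost the flag
needs four.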
>
>> I think what we want is a preallocated image with all clusters
>> sequentially mapped into the qcow2 file: preallocate all the cluster
>> extents, but still keep the information in the table that the cluster
>> in fact has no valid data. So we would need a valid cluster offset
>> while still having the flag that the cluster is unallocated. I think
>> this would require carefully checking all the cluster functions to see
>> whether they can easily cope with this.
>>
>> The question is, Hu, what do you want to achieve? Do you want the
>> space on the filesystem to be preallocated so that you can't
>> overcommit, or do you also want a sequential mapping of all the
>> clusters into the file?
> The goal is to avoid a sparse file, as it can cause performance
> problems. So the first one. I'm not sure about the second but IIUC, one
> fallocate() is enough for all clusters if they are sequentially mapped.
If you do not premap them, they are allocated in the order in which they
are written. So if you are going to preallocate the whole file anyway,
you should sequentially map all clusters into the file AND still keep
the information that they have in fact not yet been written.
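
A rough sketch of what I mean, in C. It is only a toy model under stated
assumptions, not a patch: TOY_FLAG_UNWRITTEN and the toy_*() helpers are
hypothetical names, and data_start stands for wherever the data clusters
would begin in the file. Each table entry carries a sequential host
offset plus a flag saying that the cluster has not been written yet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag: the offset is valid, but nothing was written yet. */
#define TOY_FLAG_UNWRITTEN 1ULL

/* Table entry for guest cluster `index` with sequential premapping. */
static uint64_t toy_premapped_entry(uint64_t data_start, uint64_t cluster_size,
                                    uint64_t index)
{
    /* Offsets must be cluster aligned so that bit 0 is free for the flag. */
    assert(cluster_size >= 512 && (cluster_size & (cluster_size - 1)) == 0);
    assert(data_start % cluster_size == 0);
    return (data_start + index * cluster_size) | TOY_FLAG_UNWRITTEN;
}

/* Host offset stored in an entry, with the flag bit masked out. */
static uint64_t toy_entry_offset(uint64_t entry)
{
    return entry & ~TOY_FLAG_UNWRITTEN;
}

/* Nonzero if the cluster is mapped but still holds no valid data. */
static int toy_entry_is_unwritten(uint64_t entry)
{
    return (entry & TOY_FLAG_UNWRITTEN) != 0;
}
```

A write to such a cluster would reuse the stored offset and clear the
flag, while a read or convert could skip the cluster as long as the flag
is set.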
Peter
Thread overview: 22+ messages
2013-11-27 2:15 [Qemu-devel] [RFC PATCH v2 0/6] qemu-img: add preallocation=full Hu Tao
2013-11-27 2:15 ` [Qemu-devel] [RFC PATCH v2 1/6] block: introduce prealloc_mode Hu Tao
2013-11-27 2:15 ` [Qemu-devel] [RFC PATCH v2 2/6] block: add BlockDriver.bdrv_preallocate Hu Tao
2013-11-27 2:35 ` Fam Zheng
2013-11-27 2:15 ` [Qemu-devel] [RFC PATCH v2 3/6] block/raw-posix: implement bdrv_preallocate Hu Tao
2013-11-27 2:40 ` Fam Zheng
2013-11-27 2:15 ` [Qemu-devel] [RFC PATCH v2 4/6] raw-posix: Add full image preallocation option Hu Tao
2013-11-27 2:15 ` [Qemu-devel] [RFC PATCH v2 5/6] qcow2: implement bdrv_preallocate Hu Tao
2013-11-27 3:01 ` Fam Zheng
2013-11-27 6:01 ` Hu Tao
2013-11-27 6:40 ` Fam Zheng
2013-11-27 10:03 ` Peter Lieven
2013-11-27 10:07 ` Fam Zheng
2013-11-27 10:13 ` Peter Lieven
2013-11-28 8:48 ` Hu Tao
2013-11-28 10:03 ` Peter Lieven [this message]
2013-12-11 7:33 ` Hu Tao
2013-12-16 8:24 ` Hu Tao
2013-12-16 9:21 ` Fam Zheng
2013-12-17 2:03 ` Hu Tao
2013-11-27 2:15 ` [Qemu-devel] [RFC PATCH v2 6/6] qcow2: Add full image preallocation option Hu Tao
2013-11-27 3:22 ` [Qemu-devel] [RFC PATCH v2 0/6] qemu-img: add preallocation=full Fam Zheng