From: "Denis V. Lunev" <den@openvz.org>
To: Peter Lieven <pl@kamp.de>
Cc: Kevin Wolf <kwolf@redhat.com>,
qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/8] block: prepare bdrv_co_do_write_zeroes to deal with large bl.max_write_zeroes
Date: Mon, 5 Jan 2015 14:06:10 +0300 [thread overview]
Message-ID: <54AA7022.2030304@openvz.org> (raw)
In-Reply-To: <54AA3E84.8080003@kamp.de>
On 05/01/15 10:34, Peter Lieven wrote:
> On 30.12.2014 10:20, Denis V. Lunev wrote:
>> bdrv_co_do_write_zeroes split writes using bl.max_write_zeroes or
>> 16 MiB as a chunk size. This is implemented in this way to tolerate
>> buggy block backends which do not accept too big requests.
>>
>> Though if the bdrv_co_write_zeroes callback is not good enough, we
>> fallback to write data explicitely using bdrv_co_writev and we
>> create buffer to accomodate zeroes inside. The size of this buffer
>> is the size of the chunk. Thus if the underlying layer will have
>> bl.max_write_zeroes high enough, f.e. 4 GiB, the allocation can fail.
>>
>> Actually, there is no need to allocate such a big amount of memory.
>> We could simply allocate 1 MiB buffer and create iovec, which will
>> point to the same memory.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Peter Lieven <pl@kamp.de>
>> ---
>> block.c | 35 ++++++++++++++++++++++++-----------
>> 1 file changed, 24 insertions(+), 11 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 4165d42..d69c121 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3173,14 +3173,18 @@ int coroutine_fn
>> bdrv_co_copy_on_readv(BlockDriverState *bs,
>> * of 32768 512-byte sectors (16 MiB) per request.
>> */
>> #define MAX_WRITE_ZEROES_DEFAULT 32768
>> +/* allocate iovec with zeroes using 1 MiB chunks to avoid to big
>> allocations */
>> +#define MAX_ZEROES_CHUNK (1024 * 1024)
>> static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState
>> *bs,
>> int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
>> {
>> BlockDriver *drv = bs->drv;
>> QEMUIOVector qiov;
>> - struct iovec iov = {0};
>> int ret = 0;
>> + void *chunk = NULL;
>> +
>> + qemu_iovec_init(&qiov, 0);
>> int max_write_zeroes = bs->bl.max_write_zeroes ?
>> bs->bl.max_write_zeroes :
>> MAX_WRITE_ZEROES_DEFAULT;
>> @@ -3217,27 +3221,35 @@ static int coroutine_fn
>> bdrv_co_do_write_zeroes(BlockDriverState *bs,
>> }
>> if (ret == -ENOTSUP) {
>> + int64_t num_bytes = (int64_t)num << BDRV_SECTOR_BITS;
>> + int chunk_size = MIN(MAX_ZEROES_CHUNK, num_bytes);
>> +
>> /* Fall back to bounce buffer if write zeroes is
>> unsupported */
>> - iov.iov_len = num * BDRV_SECTOR_SIZE;
>> - if (iov.iov_base == NULL) {
>> - iov.iov_base = qemu_try_blockalign(bs, num *
>> BDRV_SECTOR_SIZE);
>> - if (iov.iov_base == NULL) {
>> + if (chunk == NULL) {
>> + chunk = qemu_try_blockalign(bs, chunk_size);
>> + if (chunk == NULL) {
>> ret = -ENOMEM;
>> goto fail;
>> }
>> - memset(iov.iov_base, 0, num * BDRV_SECTOR_SIZE);
>> + memset(chunk, 0, chunk_size);
>> + }
>> +
>> + while (num_bytes > 0) {
>> + int to_add = MIN(chunk_size, num_bytes);
>> + qemu_iovec_add(&qiov, chunk, to_add);
>
> This can and likely will fail for big num_bytes if you exceed IOV_MAX
> vectors.
>
> I would stick to the old method and limit the num to a reasonable
> value e.g. MAX_WRITE_ZEROES_DEFAULT.
> This becomes necessary as you set INT_MAX for max_write_zeroes. That
> hasn't been considered before in
> the original patch.
>
> Peter
>
hmm. You are right, but I think that it would be better to limit iovec size
to 32 and this will solve the problem. Allocation of 32 Mb could be a
real problem
on loaded system could be a problem.
What do you think on this? May be we could consider 16 as a limit...
next prev parent reply other threads:[~2015-01-05 11:07 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-30 9:20 [Qemu-devel] [PATCH v3 0/8] eliminate data write in bdrv_write_zeroes on Linux in raw-posix.c Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 1/8] block: prepare bdrv_co_do_write_zeroes to deal with large bl.max_write_zeroes Denis V. Lunev
2015-01-05 7:34 ` Peter Lieven
2015-01-05 11:06 ` Denis V. Lunev [this message]
2015-01-05 11:23 ` Peter Lieven
2015-01-05 11:48 ` Denis V. Lunev
2015-01-05 12:26 ` [Qemu-devel] [PATCH v2 1/1] " Denis V. Lunev
2015-01-05 12:32 ` [Qemu-devel] [PATCH v3 " Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 2/8] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 3/8] block/raw-posix: create do_fallocate helper Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 4/8] block/raw-posix: create translate_err helper to merge errno values Denis V. Lunev
2015-01-05 6:46 ` Fam Zheng
2015-01-05 11:17 ` Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 5/8] block/raw-posix: refactor handle_aiocb_write_zeroes a bit Denis V. Lunev
2015-01-05 6:57 ` Fam Zheng
2015-01-05 11:07 ` Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 6/8] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes Denis V. Lunev
2015-01-05 7:02 ` Fam Zheng
2015-01-05 11:14 ` Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 7/8] block/raw-posix: call plain fallocate in handle_aiocb_write_zeroes Denis V. Lunev
2014-12-30 9:20 ` [Qemu-devel] [PATCH 8/8] block/raw-posix: set max_write_zeroes to INT_MAX for regular files Denis V. Lunev
2014-12-30 10:55 ` [Qemu-devel] [PATCH v3 0/8] eliminate data write in bdrv_write_zeroes on Linux in raw-posix.c Peter Lieven
2014-12-30 11:07 ` Denis V. Lunev
2015-01-05 6:55 ` Peter Lieven
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54AA7022.2030304@openvz.org \
--to=den@openvz.org \
--cc=kwolf@redhat.com \
--cc=pl@kamp.de \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.