qemu-devel.nongnu.org archive mirror
From: Max Reitz <mreitz@redhat.com>
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	Dietmar Maurer <dietmar@proxmox.com>,
	qemu-block@nongnu.org,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: backup_calculate_cluster_size does not consider source
Date: Wed, 6 Nov 2019 11:42:16 +0100
Message-ID: <37f72cb7-7085-3c40-7728-e41d59137b3b@redhat.com>
In-Reply-To: <20191106103450.cafwk7m5xd5eulxo@olga.proxmox.com>



On 06.11.19 11:34, Wolfgang Bumiller wrote:
> On Wed, Nov 06, 2019 at 10:37:04AM +0100, Max Reitz wrote:
>> On 06.11.19 09:32, Stefan Hajnoczi wrote:
>>> On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
>>>> Example: Backup from ceph disk (rbd_cache=false) to local disk:
>>>>
>>>> backup_calculate_cluster_size returns 64K (correct for my local .raw image)
>>>>
>>>> Then the backup job starts to read 64K blocks from ceph.
>>>>
>>>> But ceph always reads 4M blocks, so this is incredibly slow and produces
>>>> way too much network traffic.
>>>>
>>>> Why does backup_calculate_cluster_size not consider the block size of
>>>> the source disk?
>>>>
>>>> cluster_size = MAX(block_size_source, block_size_target)
>>
>> So Ceph always transmits 4 MB over the network, no matter what is
>> actually needed?  That sounds, well, interesting.
> 
> Or at least it generates that much I/O - in the end, it can slow down
> the backup by up to a multi-digit factor...

Oh, so I understand ceph internally resolves the 4 MB block and then
transmits the subcluster range.  That makes sense.
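
The slowdown discussed above can be estimated with a little arithmetic. The following is a rough sketch, not a measurement: it assumes the worst case described in the thread, namely that every read touching a source block costs a full block of backend I/O and that nothing is cached in between. The function names are made up for illustration and do not exist in QEMU or ceph.

```python
KIB = 1024
MIB = 1024 * KIB


def backend_io_bytes(total_bytes, request_size, source_block_size):
    """Backend I/O in bytes if each request costs every source block it touches.

    Assumption (worst case, no caching): a read of any part of a source
    block makes the backend process the whole block.
    """
    io = 0
    offset = 0
    while offset < total_bytes:
        length = min(request_size, total_bytes - offset)
        first_block = offset // source_block_size
        last_block = (offset + length - 1) // source_block_size
        io += (last_block - first_block + 1) * source_block_size
        offset += length
    return io


def amplification(total_bytes, request_size, source_block_size):
    """Ratio of backend I/O to the amount of data actually copied."""
    return backend_io_bytes(total_bytes, request_size, source_block_size) / total_bytes


# Copying one 4M source block with 64K requests versus one 4M request.
small_requests = amplification(4 * MIB, 64 * KIB, 4 * MIB)
large_requests = amplification(4 * MIB, 4 * MIB, 4 * MIB)
```

Under these assumptions, 64K requests against 4M source blocks cost 64 times the backend I/O of a single 4M request, which is in line with the "multi-digit factor" mentioned above.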

>> backup_calculate_cluster_size() doesn’t consider the source size because
>> to my knowledge there is no other medium that behaves this way.  So I
>> suppose the assumption was always that the block size of the source
>> doesn’t matter, because a partial read is always possible (without
>> having to read everything).
> 
> Unless you enable qemu-side caching this only works until the
> block/cluster size of the source exceeds the one of the target.
> 
>> What would make sense to me is to increase the buffer size in general.
>> I don’t think we need to copy a single cluster at a time, and
>> 0e2402452f1f2042923a5 has indeed increased the copy size to 1 MB for
>> backup writes that are triggered by guest writes.  We haven’t yet
>> increased the copy size for background writes, though.  We can do that,
>> of course.  (And probably should.)
>>
>> The thing is, it just seems unnecessary to me to take the source cluster
>> size into account in general.  It seems weird that a medium only allows
>> 4 MB reads, because, well, guests aren’t going to take that into account.
> 
> But guests usually have a page cache, which is why in many setups qemu
> (and thereby the backup process) often doesn't.

But this still doesn’t make sense to me.  Linux doesn’t issue 4 MB
requests to pre-fill the page cache, does it?

And if it issues a smaller request, there is no way for a guest device
to tell it “OK, here’s your data, but note we have a whole 4 MB chunk
around it, maybe you’d like to take that as well...?”

I understand wanting to increase the backup buffer size, but I don’t
quite understand why we’d want it to increase to the source cluster size
when the guest also has no idea what the source cluster size is.
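
To compare the two options under discussion, here is a minimal sketch of how the number of copy requests changes when the copy size is decoupled from the cluster size. It is an illustration only and not QEMU's backup code: `copy_requests` and `align_up` are hypothetical helpers, the 16M image size is arbitrary, and the 64K, 1 MB and 4M figures are simply the values mentioned in this thread.

```python
KIB = 1024
MIB = 1024 * KIB


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


def copy_requests(total_bytes, cluster_size, copy_size):
    """Yield (offset, length) copy requests covering [0, total_bytes).

    The copy size is rounded up to a whole number of clusters, so the
    cluster size stays the tracking granularity while each request may
    span several clusters.
    """
    step = align_up(copy_size, cluster_size)
    offset = 0
    while offset < total_bytes:
        length = min(step, total_bytes - offset)
        yield offset, length
        offset += length


target_cluster = 64 * KIB   # cluster size reported for the local .raw target
source_block = 4 * MIB      # block size reported for the ceph source
image = 16 * MIB            # arbitrary example size

# Current behaviour as described in the thread: one cluster per request.
per_cluster = list(copy_requests(image, target_cluster, target_cluster))

# Larger copy buffer with unchanged cluster size (1 MB, as for guest-triggered writes).
one_mib = list(copy_requests(image, target_cluster, 1 * MIB))

# Proposed rule: cluster_size = MAX(block_size_source, block_size_target).
max_cluster = max(source_block, target_cluster)
max_rule = list(copy_requests(image, max_cluster, max_cluster))
```

For this example image, the per-cluster scheme issues 256 requests, a 1 MB copy size issues 16, and the MAX() rule issues 4. The sketch says nothing about which option is preferable; it only shows that the request size can grow without changing the cluster size.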

Max


