Re: [Qemu-devel] [Qemu-block] Request for clarification on qemu-img convert behavior zeroing target host_device

From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: "De Backer, Fred (Nokia - BE/Antwerp)" <fred.de_backer@nokia.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Nir Soffer <nsoffer@redhat.com>,
	qemu block <qemu-block@nongnu.org>,
	"Richard W.M. Jones" <rjones@redhat.com>,
	"Aamir T, Owais (Nokia - IN/Chennai)" <owais.aamir_t@nokia.com>
Subject: Re: [Qemu-devel] [Qemu-block] Request for clarification on qemu-img convert behavior zeroing target host_device
Date: Thu, 13 Dec 2018 15:49:14 +0100
Message-ID: <20181213144914.GH5427@linux.fritz.box>
In-Reply-To: <14b32e39-a9f6-1a83-87e6-6e150954ddb7@redhat.com>

On 13.12.2018 at 15:17, Eric Blake wrote:
> On 12/13/18 7:12 AM, De Backer, Fred (Nokia - BE/Antwerp) wrote:
> > Hi,
> > 
> > We're using OpenStack Ironic to deploy bare-metal servers. During the deployment process an agent (ironic-python-agent) running on Fedora Linux uses qemu-img to write a qcow2 file to a block device.
> > 
> > Recently we saw a change in behavior of qemu-img. Previously we were using Fedora 27 containing a fedora packaged version of qemu-img v2.10.2 (qemu-img-2.10.2-1.fc27.x86_64.rpm); now we use Fedora 29 containing a fedora packaged version of qemu-img v3.0.0 (qemu-img-3.0.0-2.fc29.x86_64.rpm).
> > 
> > The command that is run by the ironic-python-agent (the same in both FC27 and FC29) is: qemu-img convert -t directsync -O host_device /tmp/image.qcow2 /dev/sda
> > 
> > We observe that in Fedora 29, qemu-img fully zeroes the disk before imaging it. Given the disk size, the whole process now takes 35 minutes instead of 50 seconds, which causes the ironic-python-agent operation to time out. The Fedora 27 qemu-img doesn't do that.
> 
> Known issue; Nir and Rich have posted a previous thread on the topic, and
> the conclusion is that we need to make qemu-img smarter about NOT requesting
> pre-zeroing of devices where that is more expensive than just zeroing as we
> go.
> https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg01182.html

Yes, we should be careful to avoid the fallback in this case.

However, how could this ever go from 50 seconds for writing the whole
image to 35 minutes?! Even if you end up writing the whole image twice
because you write zeros first and then overwrite them everywhere with
data, shouldn't the maximum be doubling the time, i.e. 100 seconds?

Why is the write_zeroes fallback _that_ slow? It will also hit guests
that request write_zeroes, so I feel this is worth investigating a bit
more nevertheless.

Can you check with strace which operation actually succeeds writing
zeros to /dev/sda? The first thing we try is fallocate with
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE. This should always be fast,
so I suppose this fails in your case. The next thing is BLKZEROOUT,
which I think can do a fallback in the kernel. Does this return success?
Otherwise we have another fallback mechanism inside of QEMU, which would
use normal pwrite calls with a zeroed buffer.
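
For reference, the fallback order described above could be sketched
roughly as follows. This is an illustration only and not the actual
QEMU code: zero_range() and enum zero_method are hypothetical names,
the 64 KiB buffer size is an arbitrary assumption, and the return value
simply reports which mechanism ended up succeeding.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <linux/fs.h>       /* BLKZEROOUT */

/* Hypothetical names, used only for this sketch. */
enum zero_method {
    ZERO_FAILED = 0,
    ZERO_PUNCH_HOLE,
    ZERO_BLKZEROOUT,
    ZERO_PWRITE
};

/*
 * Zero the byte range [offset, offset + len) on fd, trying the three
 * mechanisms in the order described in the mail. Simplified: the real
 * code has more conditions on when each step may be used.
 */
static enum zero_method zero_range(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    /* 1. Punch a hole without changing the size; expected to be fast. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) == 0) {
        return ZERO_PUNCH_HOLE;
    }

    /* 2. Ask the kernel to zero the range (block devices only; fails
     *    with an error on regular files). */
    uint64_t range[2] = { (uint64_t)offset, (uint64_t)len };
    if (ioctl(fd, BLKZEROOUT, range) == 0) {
        return ZERO_BLKZEROOUT;
    }

    /* 3. Last resort: ordinary pwrite calls with a zeroed buffer. */
    enum { BUF_SIZE = 64 * 1024 };   /* assumed chunk size */
    char *buf = calloc(1, BUF_SIZE);
    if (buf == NULL) {
        return ZERO_FAILED;
    }
    while (len > 0) {
        size_t chunk = len < BUF_SIZE ? (size_t)len : (size_t)BUF_SIZE;
        ssize_t n = pwrite(fd, buf, chunk, offset);
        if (n < 0 && errno == EINTR) {
            continue;                /* interrupted, retry the same chunk */
        }
        if (n <= 0) {
            free(buf);
            return ZERO_FAILED;
        }
        offset += n;
        len -= n;
    }
    free(buf);
    return ZERO_PWRITE;
}
```

Running the real qemu-img under strace should show the same pattern:
a failing fallocate, then either a successful BLKZEROOUT ioctl or a
long series of pwrite calls.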

Once we know which mechanism is used, we can look into why it is so
abysmally slow.

Kevin
