From: Kevin Wolf <kwolf@redhat.com>
To: Nir Soffer <nsoffer@redhat.com>
Cc: nirsof <nirsof@gmail.com>, qemu-block <qemu-block@nongnu.org>,
pl@kamp.de, QEMU Developers <qemu-devel@nongnu.org>,
Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH] block: file-posix: Fail unmap with NO_FALLBACK on block device
Date: Wed, 17 Jun 2020 12:47:09 +0200 [thread overview]
Message-ID: <20200617104709.GA5166@linux.fritz.box> (raw)
In-Reply-To: <CAMRbyyvD6kRdaitm6Oc6LAnF6_e+Y9y+jTPKNVq8ENLEKYyKng@mail.gmail.com>
Am 16.06.2020 um 22:01 hat Nir Soffer geschrieben:
> On Tue, Jun 16, 2020 at 8:39 PM Nir Soffer <nsoffer@redhat.com> wrote:
> >
> > On Tue, Jun 16, 2020 at 6:32 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > >
> > > Am 15.06.2020 um 21:32 hat Nir Soffer geschrieben:
> > > > We can zero 2.3 g/s:
> > > >
> > > > # time blkdiscard -z test-lv
> > > >
> > > > real 0m43.902s
> > > > user 0m0.002s
> > > > sys 0m0.130s
> > >
> > > > We can write 445m/s:
> > > >
> > > > # dd if=/dev/zero bs=2M count=51200 of=test-lv oflag=direct conv=fsync
> > > > 107374182400 bytes (107 GB, 100 GiB) copied, 241.257 s, 445 MB/s
> > >
> > > So using FALLOC_FL_PUNCH_HOLE _is_ faster after all. What might not be
> > > faster is zeroing out the whole device and then overwriting a
> > > considerable part of it again.
> > >
> > > I think this means that we shouldn't fail write_zeroes at the file-posix
> > > level even if BDRV_REQ_NO_FALLBACK is given. Instead, qemu-img convert
> > > is where I see a fix.
> > >
> > > Certainly qemu-img could be cleverer and zero out more selectively. The
> > > idea of doing a blk_make_zero() first seems to have caused some
> > > problems, though of course its introduction was also justified with
> > > performance, so improving one case might hurt another if we're not
> > > careful.
> > >
> > > However, when Peter Lieven introduced this (commit 5a37b60a61c), we
> > > didn't use write_zeroes yet during the regular copy loop (we do since
> > > commit 690c7301600). So chances are that blk_make_zero() doesn't
> > > actually help any more now.
> > >
> > > Can you run another test with the patch below?
> >
> > Sure, I can try this.
>
> Tried it, and it performs the same as expected.
Thanks.
> > > I think it should perform
> > > the same as yours. Eric, Peter, do you think this would have a negative
> > > effect for NBD and/or iscsi?
> > >
> > > The other option would be providing an option and making it Someone
> > > Else's Problem.
> > >
> > > Kevin
> > >
> > >
> > > diff --git a/qemu-img.c b/qemu-img.c
> > > index d7e846e607..bdb9f6aa46 100644
> > > --- a/qemu-img.c
> > > +++ b/qemu-img.c
> > > @@ -2084,15 +2084,6 @@ static int convert_do_copy(ImgConvertState *s)
> > > s->has_zero_init = bdrv_has_zero_init(blk_bs(s->target));
> > > }
> > >
> > > - if (!s->has_zero_init && !s->target_has_backing &&
> > > - bdrv_can_write_zeroes_with_unmap(blk_bs(s->target)))
> > > - {
> > > - ret = blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK);
> > > - if (ret == 0) {
> > > - s->has_zero_init = true;
> > > - }
> > > - }
> >
> > This will work of course, but now we will not do bulk zero for any target
>
> I would like to have a minimal change to increase the chance that we
> can consume this very soon in oVirt.
I think this one would be pretty minimal.
Maybe we can later bring this code back, but with an implementation of
blk_make_zero() that doesn't use the generic write_zeroes operation,
but with a specific callback like Eric suggested.
> > I think we never do this for regular files anyway since we truncate
> > files, and there is nothing to zero, but maybe there is some case
> > when this is useful?
Yes, regular files have s->has_zero_init == true anyway.
> > BTW, do we use BDRV_REQ_NO_FALLBACK elsewhere? maybe we can remove
> > it.
qcow2 uses it when zeroing out parts of a newly allocated cluster on
partial writes. Though that code is questionable, too, and XFS people
suggest that we should stop using it.
Kevin
next prev parent reply other threads:[~2020-06-17 10:48 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-13 17:08 [PATCH] block: file-posix: Fail unmap with NO_FALLBACK on block device Nir Soffer
2020-06-15 19:32 ` Nir Soffer
2020-06-16 15:32 ` Kevin Wolf
2020-06-16 17:39 ` Nir Soffer
2020-06-16 20:01 ` Nir Soffer
2020-06-17 10:47 ` Kevin Wolf [this message]
2020-06-16 20:09 ` Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200617104709.GA5166@linux.fritz.box \
--to=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=nirsof@gmail.com \
--cc=nsoffer@redhat.com \
--cc=pl@kamp.de \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.