From: Jason Dillaman <jdillama@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Josh Durgin <jdurgin@redhat.com>,
qemu-block <qemu-block@nongnu.org>,
qemu-devel <qemu-devel@nongnu.org>, Max Reitz <mreitz@redhat.com>,
John Snow <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v2] block/rbd: implement .bdrv_get_allocated_file_size callback
Date: Tue, 2 Jul 2019 11:09:14 -0400 [thread overview]
Message-ID: <CA+aFP1C828MUouu+3P2HKn3zHc3STKZ1mcEY8S4HWRNcFoiO9w@mail.gmail.com> (raw)
In-Reply-To: <20190628085901.rnzcfu6rgkvrg46n@steredhat>
On Fri, Jun 28, 2019 at 4:59 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Thu, Jun 27, 2019 at 03:43:04PM -0400, Jason Dillaman wrote:
> > On Thu, Jun 27, 2019 at 1:24 PM John Snow <jsnow@redhat.com> wrote:
> > > On 6/27/19 4:48 AM, Stefano Garzarella wrote:
> > > > On Wed, Jun 26, 2019 at 05:04:25PM -0400, John Snow wrote:
> > > >> It looks like this has hit a 30 day expiration without any reviews or
> > > >> being merged; do we still want this? If so, can you please resend?
> > > >
> > > > Yes, I think we still want :)
> > > >
> > > > Is it okay if I send a v3 following your comments?
> > > >
> > >
> > > Yes, but I don't know who is responsible for final approval; I guess
> > > that's Josh Durgin?
> >
> > I'm the new (for the past several years) upstream PTL for RBD, so feel
> > free to tag me.
> >
> > > >>
> > > >> On 5/10/19 11:33 AM, Stefano Garzarella wrote:
> > > >>> This patch allows 'qemu-img info' to show the 'disk size' for
> > > >>> the RBD images that have the fast-diff feature enabled.
> > > >>>
> > > >>> If this feature is enabled, we use the rbd_diff_iterate2() API
> > > >>> to calculate the allocated size for the image.
> > > >>>
> > > >>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > >>> ---
> > > >>> v2:
> > > >>> - calculate the actual usage only if the fast-diff feature is
> > > >>> enabled [Jason]
> > > >>> ---
> > > >>> block/rbd.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >>> 1 file changed, 54 insertions(+)
> > > >>>
> > > >>> diff --git a/block/rbd.c b/block/rbd.c
> > > >>> index 0c549c9935..f1bc76ab80 100644
> > > >>> --- a/block/rbd.c
> > > >>> +++ b/block/rbd.c
> > > >>> @@ -1046,6 +1046,59 @@ static int64_t qemu_rbd_getlength(BlockDriverState *bs)
> > > >>> return info.size;
> > > >>> }
> > > >>>
> > > >>> +static int rbd_allocated_size_cb(uint64_t offset, size_t len, int exists,
> > > >>> + void *arg)
> > > >>> +{
> > > >>> + int64_t *alloc_size = (int64_t *) arg;
> > > >>> +
> > > >>> + if (exists) {
> > > >>> + (*alloc_size) += len;
> > > >>> + }
> > > >>> +
> > > >>> + return 0;
> > > >>> +}
> > > >>> +
> > > >>> +static int64_t qemu_rbd_get_allocated_file_size(BlockDriverState *bs)
> > > >>> +{
> > > >>> + BDRVRBDState *s = bs->opaque;
> > > >>> + uint64_t flags, features;
> > > >>> + int64_t alloc_size = 0;
> > > >>> + int r;
> > > >>> +
> > > >>> + r = rbd_get_flags(s->image, &flags);
> > > >>> + if (r < 0) {
> > > >>> + return r;
> > > >>> + }
> > > >>> +
> > > >>
> > > >> Do you know where rbd_get_flags is documented? I can't seem to quickly
> > > >> find a reference that tells me what to expect from calling it. It
> > > >> returns an int, I guess an error code, but how can I confirm this?
> > > >>
> > > >> *clones the ceph repository*
> > > >>
> > > >> src/librbd/internal.cc get_flags convinces me it probably works like I
> > > >> think, but is there not a reference here?
> > > >>
> > > >
> > > > Good question!
> > > > I didn't find any docs, but looking in the ceph tests test/librbd/fsx.cc,
> > > > they print an error message if the return value is less than 0.
> > > >
> > > > A 'get_flags' implemented in cls/rbd/cls_rbd.cc for example returns 0 at the
> > > > end and -EINVAL in a try/catch. It also uses 'read_key()' that in some cases
> > > > returns -EIO, so I hope that the error returned by rbd_get_flags() is one of
> > > > the errors defined in errno.h
> > > >
> > > >>> + r = rbd_get_features(s->image, &features);
> > > >>> + if (r < 0) {
> > > >>> + return r;
> > > >>> + }
> > > >>> +
> > > >>> + /*
> > > >>> + * We use rbd_diff_iterate2() only if the RBD image have fast-diff
> > > >>> + * feature enabled. If it is disabled, rbd_diff_iterate2() could be
> > > >>> + * very slow on a big image.
> > > >>> + */
> > > >>> + if (!(features & RBD_FEATURE_FAST_DIFF) ||
> > > >>> + (flags & RBD_FLAG_FAST_DIFF_INVALID)) {
> > > >>> + return -1;
> > > >>> + }
> > > >>> +
> > > >>
> > > >> (Is there a reference for the list of flags to make sure there aren't
> > > >> other cases we might want to skip this?)
> > > >
> > > > Unfortunately no :(
> > > > As Jason suggested, I followed what libvirt did in the
> > > > volStorageBackendRBDUseFastDiff() [src/storage/storage_backend_rbd.c]
> >
> > These are the only ones.
>
> Thanks for the clarification!
>
> >
> > > >>
> > > >> It looks reasonable at a glance, but maybe let's return -ENOTSUP instead
> > > >> of -1, based on the idea that bdrv_get_allocated_file_size returns
> > > >> -ENOMEDIUM in a prominent error case -- let's match that error convention.
> > > >
> > > > Sure, -ENOTSUP is absolutely better!
> > > >
> > > >>
> > > >> (Well, I wonder what the librbd calls are returning and if THOSE mean
> > > >> anything.)
> > > >
> > > > I hope they return an errno.h errors, but I'm not sure if the meaning
> > > > make sense for us.
> > > >
> > > > Do you think is better to return -ENOTSUP or -EIO when librbd calls
> > > > fail?
> > > >
> > >
> > > I'll be honest, I have no idea because I don't know what failure of
> > > these calls means _at all_, so I don't know if it should be something
> > > severe like EIO or something more mundane.
> > >
> > > I guess just leave them alone in absence of better information, honestly.
> >
> > It looks like QEMU treats any negative error code like the actual size
> > isn't available. However, for clarity I would vote for -ENOTSUPP since
> > that's what would be returned if the driver doesn't support it.
> >
>
> Do you mean to return -ENOTSUP even when there's a runtime error in
> rbd_get_features() or rbd_get_flags() or rbd_diff_iterate2?
I was advocating for only returning -ENOTSUP in the case where
fast-diff isn't enabled or is flagged as invalid (instead of the
hard-coded -1). If one of the librbd APIs returns an error for some
reason, it seems like you can just pass it along like the current
patch does since any negative result is handled.
> Thanks,
> Stefano
--
Jason
next prev parent reply other threads:[~2019-07-02 16:04 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-10 15:33 [Qemu-devel] [PATCH v2] block/rbd: implement .bdrv_get_allocated_file_size callback Stefano Garzarella
2019-06-26 21:04 ` [Qemu-devel] [Qemu-block] " John Snow
2019-06-27 8:48 ` Stefano Garzarella
2019-06-27 17:24 ` John Snow
2019-06-27 19:43 ` Jason Dillaman
2019-06-27 19:45 ` John Snow
2019-06-28 0:23 ` Jason Dillaman
2019-06-28 8:59 ` Stefano Garzarella
2019-07-02 15:09 ` Jason Dillaman [this message]
2019-07-03 7:31 ` Stefano Garzarella
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CA+aFP1C828MUouu+3P2HKn3zHc3STKZ1mcEY8S4HWRNcFoiO9w@mail.gmail.com \
--to=jdillama@redhat.com \
--cc=dillaman@redhat.com \
--cc=jdurgin@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=sgarzare@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).