Re: [Qemu-devel] [PATCH v9 05/20] dirty-bitmap: Avoid size query failure during truncate

From: John Snow <jsnow@redhat.com>
To: Eric Blake <eblake@redhat.com>, Fam Zheng <famz@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com,
	vsementsov@virtuozzo.com, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v9 05/20] dirty-bitmap: Avoid size query failure during truncate
Date: Wed, 20 Sep 2017 15:50:57 -0400
Message-ID: <adf09781-51c7-adda-bd0b-f2630fedceff@redhat.com>
In-Reply-To: <de133330-5cc3-d877-516c-38fc1928bebf@redhat.com>

On 09/20/2017 09:11 AM, Eric Blake wrote:
> On 09/19/2017 09:10 PM, Fam Zheng wrote:
> 
>>>
>>> Do you suspect that almost certainly if bdrv_truncate() fails overall
>>> that the image format driver will either unmount the image or become
>>> read-only?
> 
> Uggh - it feels like I've bitten off more than I can chew with this
> patch - I'm getting bogged down by trying to fix bad behavior in code
> that is mostly unrelated to the patch at hand, so I don't have a good
> opinion on WHAT is supposed to happen if bdrv_truncate() fails, only
> that I'm trying to avoid compounding that failure even worse.
> 

Yes, I apologize -- I realize I'm holding this series hostage. For now I
am just trying to understand the behavior. I am willing to accept
"It's sorta busted right now, but -EOUTOFSCOPE".

>>> I suppose if *not* that's a bug for callers of bdrv_truncate to allow
>>> that kind of monkey business, but if it CAN happen, hbitmap only guards
>>> against such things with an assert (which, IIRC, is not guaranteed to be
>>> on for all builds)
>>
>> It's guaranteed since a few hours ago:
>>
>> commit 262a69f4282e44426c7a132138581d400053e0a1
> 
> Indeed - but even without my patch, we would have hit the assertion
> failures when trying to resize the dirty bitmap to -1 when
> bdrv_nb_sectors() fails (which was likely if refresh_total_sectors()
> failed).
> 
>>> So the question is: "bdrv_truncate failure is NOT considered recoverable
>>> in ANY case, is it?"
>>>
>>> It may possibly be safer to, if the initial truncate request succeeds,
>>> apply a best-effort to the bitmap before returning the error.
>>
>> Like fallback "offset" (or it aligned up to bs cluster size) if
>> refresh_total_sectors() returns error? I think that is okay.
> 
> Here's my proposal for squashing in a best-effort dirty-bitmap resize no
> matter what happens in refresh_total_sectors() (but really, if you
> successfully truncate the disk but then get a failure while trying to
> read back the actual new size, which may differ from the requested size,
> you're probably doomed down the road anyways).
> 
> diff --git i/block.c w/block.c
> index 3caf6bb093..ef5af81f66 100644
> --- i/block.c
> +++ w/block.c
> @@ -3552,8 +3552,9 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>      } else {
> -        bdrv_dirty_bitmap_truncate(bs, bs->total_sectors * BDRV_SECTOR_SIZE);
> +        offset = bs->total_sectors * BDRV_SECTOR_SIZE;
>      }
> +    bdrv_dirty_bitmap_truncate(bs, offset);
>      bdrv_parent_cb_resize(bs);
>      atomic_inc(&bs->write_gen);
>      return ret;
> 
> 

Don't respin on my account; I'm trying to find out if there is a problem,
and I'm not convinced of one yet. Just thinking out loud.

Two cases:

(1) Attempt to resize larger. Resize succeeds, but refresh fails.
Possibly a temporary protocol failure, but we'll assume the resize
actually worked. The bitmap does not get resized; however, any caller of
truncate *must* assume that the resize did not succeed. Any calls to
write beyond the previous EOF are a bug in the calling module.

(2) Attempt to resize smaller, an actual truncate. Call succeeds but
refresh doesn't. Bitmap is now larger than the drive. The bitmap itself
is perfectly capable of describing reads/writes even to the now-OOB
area, but it's unlikely the BB would submit any. Problems may arise if
the BB does not treat this as a hard failure and a user later attempts
to use this bitmap for a backup operation, as the trailing bits now
reference disk segments that may or may not physically exist. Likely to
hit EIO problems during block jobs.
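
To make the two cases concrete, here is a toy model in C. It is NOT QEMU
code: ToyDisk, toy_truncate() and toy_bitmap_overhang() are hypothetical
names, and the model simply assumes the storage resize always works and
that the bitmap is resized only when the size refresh succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; every name here is hypothetical, not a QEMU API. */
typedef struct ToyDisk {
    int64_t real_size;    /* bytes the storage actually holds */
    int64_t cached_size;  /* bytes the block layer believes it holds */
    int64_t bitmap_size;  /* bytes covered by the dirty bitmap */
} ToyDisk;

/*
 * The storage resize itself always succeeds in this model. The follow-up
 * size refresh may fail, in which case the cached size and the bitmap
 * are left untouched and an error is returned.
 */
static int toy_truncate(ToyDisk *d, int64_t offset, bool refresh_fails)
{
    d->real_size = offset;
    if (refresh_fails) {
        return -5; /* stand-in for -EIO */
    }
    d->cached_size = d->real_size;
    d->bitmap_size = d->cached_size;
    return 0;
}

/* Bytes the bitmap can still describe that no longer exist on the device. */
static int64_t toy_bitmap_overhang(const ToyDisk *d)
{
    return d->bitmap_size > d->real_size ? d->bitmap_size - d->real_size : 0;
}
```

In case (1) the overhang stays at zero (the bitmap is merely too short),
while in case (2) it is the number of trailing bytes a later backup job
could try to read and fail on.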

If we do decide to resize the bitmap even on refresh failure, we
probably still run the risk of the bitmap being slightly bigger or
slightly smaller than the actual size due to alignment.

It sounds like, to really "fix" this, the resize operation itself needs
to return the actual resulting size to the caller instead of forcing the
caller to query it separately in a follow-up call.
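
A rough sketch of what such an interface might look like, in C. This is
purely illustrative: toy_truncate_report(), toy_resize_with_bitmap() and
the 64 KiB TOY_CLUSTER_SIZE are assumptions of mine, not anything that
exists in the tree. The point is only that the size the driver settled on
(after alignment) comes back from the same call that did the resize.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical alignment; a real driver would use its own cluster size. */
#define TOY_CLUSTER_SIZE 65536

/* Round n up to the next multiple of align (align > 0, n >= 0). */
static int64_t toy_align_up(int64_t n, int64_t align)
{
    return (n + align - 1) / align * align;
}

/*
 * Hypothetical resize that reports the size it actually produced.
 * Returns 0 and stores the resulting size in *actual, or returns a
 * negative value and leaves *actual untouched.
 */
static int toy_truncate_report(int64_t requested, int64_t *actual)
{
    if (requested < 0) {
        return -22; /* stand-in for -EINVAL */
    }
    *actual = toy_align_up(requested, TOY_CLUSTER_SIZE);
    return 0;
}

/* The caller sizes the bitmap from the reported size; no second query. */
static int toy_resize_with_bitmap(int64_t requested, int64_t *bitmap_size)
{
    int64_t actual;
    int ret = toy_truncate_report(requested, &actual);

    if (ret < 0) {
        return ret;
    }
    *bitmap_size = actual;
    return 0;
}
```

With this shape there is no window in which the resize has succeeded but
the size the bitmap should use is unknown.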

Considering that either resizing or not resizing the bitmap after a
partial failure probably still leaves us with a possibly dangerous
bitmap, I don't think I'll hold your feet to the fire over this one.

--js
