qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: John Snow <jsnow@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "fam@euphon.net" <fam@euphon.net>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	Denis Lunev <den@virtuozzo.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 3/3] block/qcow2-bitmap: rewrite bitmap reopening logic
Date: Thu, 30 May 2019 14:39:12 +0000	[thread overview]
Message-ID: <a1be0624-dfb3-46bb-eb87-f80e48aa347d@virtuozzo.com> (raw)
In-Reply-To: <8d969ad9-976c-ad6c-f728-4e7d1028975d@redhat.com>

30.05.2019 17:20, John Snow wrote:
> 
> On 5/30/19 4:23 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 29.05.2019 21:08, John Snow wrote:
>>> On 5/29/19 5:10 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> 29.05.2019 2:24, John Snow wrote:
>>>>> On 5/23/19 11:47 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
> [...]
> 
>>>
>>> Right, we've not really used readonly in this way before. It makes sense
>>> to a point, but it's a bit of a semantic overload -- the disk is
>>> actually RW but the bitmap is RO; the problem that I have with this is
>>> that we guard RO bitmaps with assertions and not errors,
>>
>> Oops, you are right. I thought we have errors for guest writes in this case.
>>
> 
> (I'm looking at your second email) -- Ah, we guard this elsewhere.
> 
> Asserts:
> bdrv_set_dirty_bitmap_locked
> bdrv_reset_dirty_bitmap_locked
> bdrv_clear_dirty_bitmap
> bdrv_restore_dirty_bitmap
> bdrv_set_dirty
> 
> Runtime errors (via has_readonly_bitmaps):
> bdrv_aligned_pwritev
> bdrv_co_pdiscard
> 
> I guess it's OK to set readonly in this way then, but I still think it's
> a little confusing and, so long as the in-memory inconsistent bit is
> set, not strictly necessary. (Unless there's some case I am not aware
> of, or it just helps the code to be nicer in some way, etc. I'm not
> aiming to be a purist about the way this flag is used.)
> 
>>> so it seems
>>> feasible to me that the reopen-rw will succeed, a guest will write, and
>>> then we'll abort because of these "readonly corrupt" bitmaps.
>>>
>>> In other words, we don't *really* support the idea of having readonly
>>> bitmaps on readwrite nodes.
>>>
>>> I think given that this is an error state to begin with that simply
>>> marking the bitmap as inconsistent in memory (and trying to write the
>>> IN_USE flag to its header) is sufficient, it will skip any new writes
>>> and prohibit its use for any backup operations.
>>
>> So, you propose not to annoy user too much because of inconsistent bitmaps,
>> for which we can't sync their inconsistancy to the image.. May be it's OK,
>> but let's see what we decide in block-discussion. It would be really cool
>> if we can move this all to .prepare
>>
> 
> Hm, I wouldn't say it's about not annoying the user too much; we should
> still try to write the IN_USE flag to the image and (ideally) report
> that we were unable to if we fail.
> 
> So I think doing these two things is enough:
> 
> 1. Set the in-memory inconsistent flag, which will prohibit the user
> from doing anything with this bitmap AND ignore all future data writes
> 
> 2. Attempt to write the IN_USE flag back out to disk to ensure that it
> will be loaded inconsistent in the future. If we fail to do so, this
> should be considered a fatal error. (Why can't we write to our node?
> It's probably EIO or something fatal.)
> 
> Of course, actually having #2 be fatal depends on reworking the block
> layer a bit.
> 
> [...]
> 
>>>>>> +    if (need_update) {
>>>>>> +        if (!can_write(bs->file->bs)) {
>>>>>
>>>>> I genuinely don't know: is it legitimate to check your child's write
>>>>> permission in this way? will we always have bs->file->bs?
>>>>
>>>> Hmm.. but we are going to write to it very soon, I think it should exist.
>>>>
>>>
>>> Apparently Max is adding a bdrv_storage_bs() helper for this exact
>>> thing, in an upcoming patch. I just get nervous when I see double
>>> indirections >
>>
>> Hmm... But if we have separate data file for qcow2, bdrv_storage_bs will refer to it,
>> so here we need exactly bs->file I think.
>>
> 
> Alright, I don't know the particulars. Whatever works is fine by me.
> 
> [...]
> 
>>>>
>>>> Yes. In short, it was bad, it still bad, but at least one bug is fixed :)
>>>>
>>>
>>> Hahaha! Very good summary. Let's discuss the block implications with
>>> Max, Berto and Kevin. Until then, do you think it's OK to split out the
>>> release_bitmaps boolean as its own patch to "lessen" the severity of the
>>> bug?
>>
>> Bitmap will not be reopend rw anyway, so, we should crash on first guest write, as
>> you noted..
>>
> 
> Why not? If the bitmaps are still in memory (because we didn't close
> them fully on reopen-ro), then reopen-rw ought to be able to see them
> and drop the readonly marker, right?
> 
>> Maybe, I'll refactor it back, to return normal error, and just ignore it in commit, so that,
>> we'll move it to .prepare as soon as we able to do and with less pain?
>>
> 
> I may have misunderstood Max, but at the moment I'm thinking that as
> much as you can move into prepare() and have it still work the better. I
> assume that writing the IN_USE flags from prepare() won't work, though,
> because the nodes aren't technically RW yet, so the write primitives
> won't work...
> 
> At this point, I'll trust your judgment to come up with something that
> seems tidier; I don't think I have suggestions unless I start
> prototyping it too.
> 
> (Though I might try to come up with something minimally workable as a
> bugfix while we try to tackle more systemic issues...)
> 
> --js
> 

I'm now preparing v2 which tries to do everything we need in prepare, which includes
changes in generic block layer reopen functionality. Seems like it's not very hard.


-- 
Best regards,
Vladimir

  reply	other threads:[~2019-05-30 14:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-23 15:47 [Qemu-devel] [PATCH 0/3] qcow2-bitmaps: rewrite reopening logic Vladimir Sementsov-Ogievskiy
2019-05-23 15:47 ` [Qemu-devel] [PATCH 1/3] iotests: add test 255 to check bitmap life after snapshot + commit Vladimir Sementsov-Ogievskiy
2019-05-24 23:15   ` John Snow
2019-05-25  9:16     ` Vladimir Sementsov-Ogievskiy
2019-05-23 15:47 ` [Qemu-devel] [PATCH 2/3] block/qcow2-bitmap: get rid of bdrv_has_changed_persistent_bitmaps Vladimir Sementsov-Ogievskiy
2019-05-24 23:37   ` John Snow
2019-05-23 15:47 ` [Qemu-devel] [PATCH 3/3] block/qcow2-bitmap: rewrite bitmap reopening logic Vladimir Sementsov-Ogievskiy
2019-05-28 23:24   ` John Snow
2019-05-29  9:10     ` Vladimir Sementsov-Ogievskiy
2019-05-29 18:08       ` John Snow
2019-05-30  8:23         ` Vladimir Sementsov-Ogievskiy
2019-05-30 11:54           ` Vladimir Sementsov-Ogievskiy
2019-05-30 14:20           ` John Snow
2019-05-30 14:39             ` Vladimir Sementsov-Ogievskiy [this message]
2019-05-29 15:33   ` Max Reitz
2019-05-29 15:58     ` Vladimir Sementsov-Ogievskiy
2019-05-29 18:18       ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a1be0624-dfb3-46bb-eb87-f80e48aa347d@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@virtuozzo.com \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).