From: Kevin Wolf <kwolf@redhat.com>
To: Max Reitz <mreitz@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Alberto Garcia <berto@igalia.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH for-5.0 v2 10/23] quorum: Implement .bdrv_recurse_can_replace()
Date: Thu, 6 Feb 2020 16:42:01 +0100
Message-ID: <20200206154201.GF4926@linux.fritz.box>
In-Reply-To: <1bb2e344-e66d-de37-0d49-f4a8a5a6eb40@redhat.com>

On 06.02.2020 at 16:19, Max Reitz wrote:
> On 06.02.20 15:42, Kevin Wolf wrote:
> > On 06.02.2020 at 11:21, Max Reitz wrote:
> >> On 05.02.20 16:55, Kevin Wolf wrote:
> >>> On 11.11.2019 at 17:02, Max Reitz wrote:
> >>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >>>> ---
> >>>>  block/quorum.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  1 file changed, 62 insertions(+)
> >>>>
> >>>> diff --git a/block/quorum.c b/block/quorum.c
> >>>> index 3a824e77e3..8ee03e9baf 100644
> >>>> --- a/block/quorum.c
> >>>> +++ b/block/quorum.c
> >>>> @@ -825,6 +825,67 @@ static bool quorum_recurse_is_first_non_filter(BlockDriverState *bs,
> >>>>      return false;
> >>>>  }
> >>>>  
> >>>> +static bool quorum_recurse_can_replace(BlockDriverState *bs,
> >>>> +                                       BlockDriverState *to_replace)
> >>>> +{
> >>>> +    BDRVQuorumState *s = bs->opaque;
> >>>> +    int i;
> >>>> +
> >>>> +    for (i = 0; i < s->num_children; i++) {
> >>>> +        /*
> >>>> +         * We have no idea whether our children show the same data as
> >>>> +         * this node (@bs).  It is actually highly likely that
> >>>> +         * @to_replace does not, because replacing a broken child is
> >>>> +         * one of the main use cases here.
> >>>> +         *
> >>>> +         * We do know that the new BDS will match @bs, so replacing
> >>>> +         * any of our children by it will be safe.  It cannot change
> >>>> +         * the data this quorum node presents to its parents.
> >>>> +         *
> >>>> +         * However, replacing @to_replace by @bs in any of our
> >>>> +         * children's chains may change visible data somewhere in
> >>>> +         * there.  We therefore cannot recurse down those chains with
> >>>> +         * bdrv_recurse_can_replace().
> >>>> +         * (More formally, bdrv_recurse_can_replace() requires that
> >>>> +         * @to_replace will be replaced by something matching the @bs
> >>>> +         * passed to it.  We cannot guarantee that.)
> >>>> +         *
> >>>> +         * Thus, we can only check whether any of our immediate
> >>>> +         * children matches @to_replace.
> >>>> +         *
> >>>> +         * (In the future, we might add a function to recurse down a
> >>>> +         * chain that checks that nothing there cares about a change
> >>>> +         * in data from the respective child in question.  For
> >>>> +         * example, most filters do not care when their child's data
> >>>> +         * suddenly changes, as long as their parents do not care.)
> >>>> +         */
> >>>> +        if (s->children[i].child->bs == to_replace) {
> >>>> +            Error *local_err = NULL;
> >>>> +
> >>>> +            /*
> >>>> +             * We now have to ensure that there is no other parent
> >>>> +             * that cares about replacing this child by a node with
> >>>> +             * potentially different data.
> >>>> +             */
> >>>> +            s->children[i].to_be_replaced = true;
> >>>> +            bdrv_child_refresh_perms(bs, s->children[i].child, &local_err);
> >>>> +
> >>>> +            /* Revert permissions */
> >>>> +            s->children[i].to_be_replaced = false;
> >>>> +            bdrv_child_refresh_perms(bs, s->children[i].child, &error_abort);
> >>>
> >>> Quite a hack. The two obvious problems are:
> >>>
> >>> 1. We can't guarantee that we can actually revert the permissions. I
> >>>    think we now ignore failures to loosen permissions, so that at
> >>>    least the &error_abort doesn't trigger, but bs could still be in the
> >>>    wrong state afterwards.
> >>
> >> I thought we guaranteed that loosening permissions never fails.
> >>
> >> (Well, you know.  It may “leak” permissions, but we’d never get an error
> >> here so there’s nothing to handle anyway.)
> > 
> > This is what I meant. We ignore the failure (i.e. don't return an error),
> > but the result still isn't completely correct ("leaked" permissions).
> > 
> >>>    It would be cleaner to use check+abort instead of actually setting
> >>>    the new permission.
> >>
> >> Oh.  Yes.  Maybe.  It does require more code, though, because I’d rather
> >> not use bdrv_check_update_perm() from here as-is.
> > 
> > I'm not saying you need to do it, just that it would be cleaner. :-)
> 
> It would.  Thanks for the suggestion, I obviously didn’t think of it.
> (Or there’d be a comment on how this is not the best way in theory, but
> in practice it’s good enough.)  I suppose I’ll see what I can do.
> 
> >>> 2. As aborting the permission change makes more obvious, we're checking
> >>>    something that might not be true any more when we actually make the
> >>>    change.
> >>
> >> True.  I tried to do it right by having a post-replace cleanup function,
> >> but after a while that was just going nowhere, really.  So I just went
> >> with what’s patch 13 here.
> >>
> >> But isn’t 13 enough, actually?  It checks can_replace right before
> >> replacing in a drained section.  I can’t imagine the permissions to
> >> change there.
> > 
> > Permissions are tied to file locks, so an external process can just grab
> > the locks in between.
> 
> Ah, right, I didn’t think of that.
> 
> > But if I understand correctly, all we try here is
> > to have an additional safeguard to prevent the user from doing stupid
> > things. So I guess not being 100% is fine as long as it's documented in
> > the code.
> 
> Yes.  I just think it actually would be 100 % in practice, so I wondered
> whether it would need to be documented.
> 
> You’re right, though, it isn’t 100 %, so it should definitely be
> documented.  Maybe something like
> 
> In theory, we would have to keep the permissions tightened until the
> node is replaced.  In practice, that would require post-replacement
> cleanup infrastructure, which we do not have, and which would be
> unreasonably complex to implement.

Sounds good until here.

> Therefore, all we can do is require
> anyone who wants to replace one node by some potentially unrelated other
> node (i.e., the mirror job on completion) to invoke
> bdrv_recurse_can_replace() immediately before and thus minimize the time
> during which some condition may arise that might forbid the swap.
> 
> ?

This second part of your suggested comment could be dropped, as far as
I'm concerned. If anything, it's part of the contract and would belong
in the bdrv_recurse_can_replace() documentation.

However, I think I would mention why not being 100% reliable is okay:
the check is only an "additional safeguard to prevent the user from
doing stupid things", and it makes no difference if the user runs the
correct command.

Kevin
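
[Illustration, not part of the patch or of QEMU: a minimal, self-contained
sketch of the "check only" idea discussed above, i.e. asking whether any
other parent would object to the child's data changing without first
tightening and then reverting permissions. ToyParent, ToyChild,
requires_consistent_read and toy_can_replace() are hypothetical names made
up for this sketch; the real code would go through the block layer's
permission machinery instead.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; these are NOT QEMU's data structures. */
typedef struct ToyParent {
    /* True if this parent relies on the child's data staying unchanged. */
    bool requires_consistent_read;
} ToyParent;

typedef struct ToyChild {
    const ToyParent *parents;   /* all parents attached to this child */
    size_t num_parents;
} ToyChild;

/*
 * Check-only variant: report whether replacing @child by a node with
 * potentially different data would be acceptable to every parent other
 * than @self.  Nothing is modified, so there is nothing to revert and no
 * permission can "leak".
 *
 * As in the thread, this is only a safeguard against user error: the
 * answer may be stale by the time the replacement actually happens.
 */
static bool toy_can_replace(const ToyChild *child, const ToyParent *self)
{
    assert(child != NULL);

    for (size_t i = 0; i < child->num_parents; i++) {
        const ToyParent *p = &child->parents[i];

        if (p != self && p->requires_consistent_read) {
            return false;   /* some other parent cares about the data */
        }
    }
    return true;
}
```

[A caller would pass the quorum-like parent as @self and refuse the
replacement when the function returns false.]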
