qemu-devel.nongnu.org archive mirror
From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Kevin Wolf <kwolf@redhat.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	Denis Lunev <den@virtuozzo.com>,
	"fam@euphon.net" <fam@euphon.net>,
	"eblake@redhat.com" <eblake@redhat.com>,
	"jsnow@redhat.com" <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3] block/file-posix: do not fail on unlock bytes
Date: Wed, 3 Apr 2019 18:41:49 +0200
Message-ID: <45b9861a-8ad4-43d6-3b1a-55495adfc866@redhat.com>
In-Reply-To: <8e2bf8a6-6755-fb4b-4ad6-90ef037dd589@virtuozzo.com>

On 01.04.19 09:21, Vladimir Sementsov-Ogievskiy wrote:
> 29.03.2019 22:32, Kevin Wolf wrote:
>> On 29.03.2019 at 19:00, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.03.2019 20:58, Vladimir Sementsov-Ogievskiy wrote:
>>>> 29.03.2019 20:44, Max Reitz wrote:
>>>>> On 29.03.19 18:40, Kevin Wolf wrote:
>>>>>> On 29.03.2019 at 18:30, Max Reitz wrote:
>>>>>>> On 29.03.19 18:24, Kevin Wolf wrote:
>>>>>>>> On 29.03.2019 at 18:15, Max Reitz wrote:
>>>>>>>>> On 29.03.19 12:04, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>>>> bdrv_replace_child() calls bdrv_check_perm() with error_abort on
>>>>>>>>>> loosening permissions. However, file-locking operations may fail even
>>>>>>>>>> in this case, for example on NFS, and this leads to a QEMU crash.
>>>>>>>>>>
>>>>>>>>>> Let's avoid such errors. Note that we ignore such failures anyway on
>>>>>>>>>> permission update commit and abort.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>>>>>> ---
>>>>>>>>>>    block/file-posix.c | 12 ++++++++++++
>>>>>>>>>>    1 file changed, 12 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>>>>>>>>> index db4cccbe51..1cf4ee49eb 100644
>>>>>>>>>> --- a/block/file-posix.c
>>>>>>>>>> +++ b/block/file-posix.c
>>>>>>>>>> @@ -815,6 +815,18 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
>>>>>>>>>>        switch (op) {
>>>>>>>>>>        case RAW_PL_PREPARE:
>>>>>>>>>> +        if ((s->perm | new_perm) == s->perm &&
>>>>>>>>>> +            (s->shared_perm & new_shared) == s->shared_perm)
>>>>>>>>>> +        {
>>>>>>>>>> +            /*
>>>>>>>>>> +             * We are going to unlock bytes, which should not fail. If it fails
>>>>>>>>>> +             * for fs-dependent, permission-unrelated reasons (which happens
>>>>>>>>>> +             * sometimes on NFS and leads to an abort in bdrv_replace_child), we
>>>>>>>>>> +             * can't prevent such errors by any check here. And we ignore them
>>>>>>>>>> +             * anyway in ABORT and COMMIT.
>>>>>>>>>> +             */
>>>>>>>>>> +            return 0;
>>>>>>>>>> +        }
>>>>>>>>>>            ret = raw_apply_lock_bytes(s, s->fd, s->perm | new_perm,
>>>>>>>>>>                                       ~s->shared_perm | ~new_shared,
>>>>>>>>>>                                       false, errp);
>>>>>>>>>
>>>>>>>>> Help me understand the exact issue, please.  I understand that there are
>>>>>>>>> operations like bdrv_replace_child() that pass &error_abort to
>>>>>>>>> bdrv_check_perm() because they just loosen the permissions, so it should
>>>>>>>>> not fail.
>>>>>>>>>
>>>>>>>>> However, if the whole effect really would be to loosen permissions,
>>>>>>>>> raw_apply_lock_bytes() wouldn't have failed here in PREPARE anyway:
>>>>>>>>> @unlock is passed as false, so no bytes will be unlocked.  And if
>>>>>>>>> permissions are just loosened (as your condition checks), it should not
>>>>>>>>> lock any bytes.
>>>>>>>>>
>>>>>>>>> So why does it attempt to lock any bytes in the first place?  There must be
>>>>>>>>> some discrepancy between s->perm and s->locked_perm, or ~s->shared_perm
>>>>>>>>> and s->locked_shared_perm.  How does that occur?
>>>>>>>>
>>>>>>>> I suppose raw_check_lock_bytes() is what is failing, not
>>>>>>>> raw_apply_lock_bytes().
>>>>>>>
>>>>>>> Hm, maybe in Vladimir's case, but not in e.g.
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1652572 .
>>>>>>
>>>>>> This is reported against 3.0, which didn't avoid re-locking permissions
>>>>>> that we already hold, so there raw_apply_lock_bytes() can still fail.
>>>>>
>>>>> That makes sense.  Which leaves the question why Vladimir still seems to
>>>>> see the error there...?
>>>>>
>>>>
>>>> I'm sorry :(. I'm trying to fix a bug based on 2.10, and now I see that it is already fixed
>>>>    upstream. I don't have a reproducer, only old coredumps.
>>>>
>>>> So, now it looks like we don't need this patch, as on permission loosening file-posix
>>>> doesn't call any FS APIs, yes?
>>>>
>>>
>>>
>>> Ah, you mentioned that raw_check_lock_bytes() is still buggy.
>>
>> I haven't tried it out, but from looking at the code it seems so. Maybe
>> you can reproduce on master just to be sure?
>>
> 
> I don't have a reproducer :(

I have one, but it only breaks before
2996ffad3acabe890fbb4f84a069cdc325a68108:

First, set up an NFS mount on /mnt/nfs.  Second:

$ qemu-img create -f qcow2 /mnt/nfs/foo.qcow2 64M
Formatting '/mnt/nfs/foo.qcow2', fmt=qcow2 size=67108864
cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ (sleep 5; echo "{'execute':'qmp_capabilities'}"; \
   echo "{'execute':'blockdev-del','arguments':{'node-name':'fmt'}}"; \
   echo "{'execute':'quit'}") \
  | x86_64-softmmu/qemu-system-x86_64 -qmp stdio \
    -blockdev node-name=proto,driver=file,filename=/mnt/nfs/foo.qcow2 \
    -blockdev node-name=fmt,driver=qcow2,file=proto
{"QMP": {"version": {"qemu": {"micro": 90, "minor": 0, "major": 3},
"package": "v3.1.0-rc0-71-ga883d6a0bc"}, "capabilities": []}}

Before the sleep is done, stop the service on the NFS host:

$ systemctl stop nfs-service

Once the sleep has run out (you get a {"return": {}} over QMP), start
the service again:

$ systemctl start nfs-service

And then this happens:

Unexpected error in raw_apply_lock_bytes() at block/file-posix.c:705:
Failed to lock byte 100
[1]    30486 done                 ( sleep 5; echo
"{'execute':'qmp_capabilities'}"; echo ; echo ; ) |
       30487 abort (core dumped)  x86_64-softmmu/qemu-system-x86_64 -qmp
stdio -blockdev  -blockdev
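
For readers unfamiliar with the mechanism behind that message: the lock
in question is a non-blocking advisory lock on a single byte of the
image file.  A minimal sketch using plain POSIX fcntl() follows.  The
helper name lock_one_byte and the constant PERM_LOCK_BASE are made up
for illustration (the offset 100 is taken from the error text above),
and QEMU's own helper may use open-file-description locks instead of
F_SETLK where the host supports them.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: one lock byte per permission bit, starting at offset 100
 * (offset taken from the "Failed to lock byte 100" message). */
enum { PERM_LOCK_BASE = 100 };

/*
 * Take a shared (read) lock on one byte of fd, or release it.
 * Never blocks.  Returns 0 on success or a negative errno value.
 */
static int lock_one_byte(int fd, off_t offset, bool unlock)
{
    struct flock fl = {
        .l_type   = unlock ? F_UNLCK : F_RDLCK,
        .l_whence = SEEK_SET,
        .l_start  = offset,
        .l_len    = 1,
    };
    int ret;

    /* Retry if the call was interrupted by a signal. */
    do {
        ret = fcntl(fd, F_SETLK, &fl);
    } while (ret == -1 && errno == EINTR);

    return ret == -1 ? -errno : 0;
}
```

On a local file system, lock_one_byte(fd, PERM_LOCK_BASE, false) is
expected to succeed when nobody holds a conflicting lock.  On NFS the
same call may fail for reasons unrelated to permissions (for example
with ENOLCK while the lock service is unavailable), and a caller that
passes &error_abort then aborts.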

The reproducer no longer crashes after
2996ffad3acabe890fbb4f84a069cdc325a68108.
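
Going by Kevin's remark quoted above (3.0 did not avoid re-locking
permissions that were already held), the difference can be sketched as
a pure bit-mask computation.  This is only an illustration under that
assumption, not the code from the commit: perm_update_is_loosening and
lock_bits_needed are hypothetical names, and permissions are modelled
as plain 64-bit masks.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the update only drops permissions we use and only adds
 * permissions we share.  This is the condition tested in the quoted
 * patch.
 */
static bool perm_update_is_loosening(uint64_t perm, uint64_t shared,
                                     uint64_t new_perm, uint64_t new_shared)
{
    return (perm | new_perm) == perm &&
           (shared & new_shared) == shared;
}

/*
 * Bits that still need a lock call when the bits in `locked` are
 * already held.  Skipping held bits means no file system call at all
 * when nothing new is wanted.
 */
static uint64_t lock_bits_needed(uint64_t wanted, uint64_t locked)
{
    return wanted & ~locked;
}
```

With locked == perm, a loosening update gives
lock_bits_needed(perm | new_perm, locked) == 0, and likewise
lock_bits_needed(~shared | ~new_shared, ~shared) == 0 for the unshared
side.  PREPARE would then have nothing to ask of the file system, so a
flaky NFS lock service would no longer matter at that point.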

Max

