From: Olaf Hering <olaf@aepfle.de>
To: xen-devel@lists.xen.org
Subject: Re: migration regression in xen-4.11 and qemu-2.11 and qcow2
Date: Tue, 8 May 2018 13:31:43 +0200
Message-ID: <20180508133143.77e209f2.olaf@aepfle.de>
In-Reply-To: <20180507151940.GA31926@aepfle.de>


On Mon, 7 May 2018 17:19:46 +0200,
Olaf Hering <olaf@aepfle.de> wrote:

> What I gathered during debugging so far is that somehow qemu on the receiving side locks a region twice:

After further debugging with many wild printfs:
On the receiving side blockdev_init sets BDRV_O_INACTIVE because RUN_STATE_INMIGRATE is true.
BDRV_O_INACTIVE causes bdrv_is_writable to return false.
As a result bdrv_format_default_perms does not set BLK_PERM_WRITE in perms.
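
The chain above can be written down as a toy model. This is not QEMU code: the Python function names merely echo the C functions mentioned above, BDRV_O_RDWR is added as an assumed "opened read-write" flag, and all bit values are made up for the illustration.

```python
# Toy model of the flag-to-permission chain described above.
# All numeric values are hypothetical; only the names come from the text.
BDRV_O_RDWR = 0x1               # assumed flag: image opened read-write
BDRV_O_INACTIVE = 0x2
BLK_PERM_CONSISTENT_READ = 0x1  # assumed baseline permission
BLK_PERM_WRITE = 0x2


def blockdev_init_flags(in_migrate, flags=BDRV_O_RDWR):
    """Add BDRV_O_INACTIVE when the run state is 'incoming migration'."""
    if in_migrate:
        return flags | BDRV_O_INACTIVE
    return flags


def is_writable(flags):
    """An inactive node is treated as not writable, even if opened read-write."""
    return bool(flags & BDRV_O_RDWR) and not (flags & BDRV_O_INACTIVE)


def default_perms(flags):
    """Request BLK_PERM_WRITE only for writable nodes."""
    perms = BLK_PERM_CONSISTENT_READ
    if is_writable(flags):
        perms |= BLK_PERM_WRITE
    return perms


if __name__ == "__main__":
    for in_migrate in (False, True):
        perms = default_perms(blockdev_init_flags(in_migrate))
        print(in_migrate, bool(perms & BLK_PERM_WRITE))
```

In this model the write permission is simply missing while the INACTIVE flag is set, which matches what the printfs showed on the receiving side.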

On the sending side offset 0xc9 is unlocked on the other fd, which allows F_WRLCK to succeed:
2018-05-08T11:20:54.491168Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:20:54.492162Z qemu-system-i386: qemu_lock_fd_test: 28 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:20:54.494752Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.189455Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.190460Z qemu-system-i386: qemu_lock_fd_test: 28 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.192726Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.194298Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.195079Z qemu-system-i386: qemu_lock_fd_test: 28 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.197123Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.199378Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.201108Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.344335Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.345969Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.346836Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.348937Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.359691Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.360632Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.363221Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.364781Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.365607Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.367794Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
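
To sift through larger amounts of this output, a small parser helps. The sketch below assumes the fields of the ad-hoc printf lines are: fd (decimal), offset (hex), length, requested lock type, resulting lock type, return value, strerror text. That layout is an assumption based on the lines quoted here, not a QEMU log format, and the helper names are made up.

```python
import re

# Assumed layout of the debug lines quoted above:
#   <timestamp> <program>: <function>: <fd> <offset-hex> <len> <want>><got> <ret> <strerror>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<prog>[^:]+): (?P<func>\w+): "
    r"(?P<fd>\d+) (?P<off>[0-9a-f]+) (?P<len>\d+) "
    r"(?P<want>F_\w+)>(?P<got>F_\w+) (?P<ret>-?\d+) (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one debug line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["fd"] = int(rec["fd"])
    rec["off"] = int(rec["off"], 16)
    rec["len"] = int(rec["len"])
    rec["ret"] = int(rec["ret"])
    return rec


def conflicts(lines):
    """Yield lock-test records whose result is not F_UNLCK (someone holds a lock)."""
    for line in lines:
        rec = parse_line(line)
        if rec and rec["func"] == "qemu_lock_fd_test" and rec["got"] != "F_UNLCK":
            yield rec


if __name__ == "__main__":
    import sys
    for rec in conflicts(sys.stdin):
        print(rec["ts"], "fd", rec["fd"], "offset", hex(rec["off"]), rec["got"])
```

Fed with the sender log above, this should print nothing, since every qemu_lock_fd_test line there ends in F_UNLCK.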

It seems that on the receiving side some code forgets to unlock offset 0xc9, which causes F_WRLCK to fail:
2018-05-08T11:21:52.108809Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.112193Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.113028Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.115401Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.122037Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.122886Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.125189Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.126969Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.127801Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.130109Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.859199Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.862010Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.862673Z qemu-system-i386: qemu_lock_fd_test: 39 c9 1 F_WRLCK>F_RDLCK 0 Success
2018-05-08T11:21:53.112935Z qemu-system-i386: qemu_lock_fd_test: 39 c9 1 F_WRLCK>F_RDLCK 0 Success
2018-05-08T11:21:53.363246Z qemu-system-i386: qemu_lock_fd_test: 39 c9 1 F_WRLCK>F_RDLCK 0 Success
2018-05-08T11:21:53.615668Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:53.616426Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:53.616816Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 F_UNLCK>F_UNLCK 0 Success
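
As a rough illustration of what the F_WRLCK>F_RDLCK lines mean: while a read lock on byte 0xc9 is held by another owner, a write-lock probe on that byte fails, and it succeeds again once the read lock is dropped. The sketch uses plain POSIX record locks via Python's fcntl.lockf, with a forked child acting as the second lock owner. This is only an analogy: QEMU's qemu_lock_fd_test is C code and may use a different fcntl command. The helper names are invented for this example.

```python
import fcntl
import os
import tempfile

LOCK_OFFSET = 0xC9  # byte offset seen in the logs above
LOCK_LEN = 1


def try_write_lock(path):
    """Return True if an exclusive lock on the byte can be taken right now."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                    LOCK_LEN, LOCK_OFFSET, os.SEEK_SET)
    except OSError:
        return False  # another owner holds a conflicting lock
    else:
        fcntl.lockf(fd, fcntl.LOCK_UN, LOCK_LEN, LOCK_OFFSET, os.SEEK_SET)
        return True
    finally:
        os.close(fd)


def probe_in_child(path):
    """Run the probe in a forked child, because POSIX locks are per process."""
    pid = os.fork()
    if pid == 0:
        code = 2
        try:
            code = 0 if try_write_lock(path) else 1
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def demo():
    """Return (probe result while read-locked, probe result after unlock)."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDWR)
        try:
            fcntl.lockf(fd, fcntl.LOCK_SH, LOCK_LEN, LOCK_OFFSET, os.SEEK_SET)
            while_locked = probe_in_child(tmp.name)
            fcntl.lockf(fd, fcntl.LOCK_UN, LOCK_LEN, LOCK_OFFSET, os.SEEK_SET)
            after_unlock = probe_in_child(tmp.name)
        finally:
            os.close(fd)
        return while_locked, after_unlock


if __name__ == "__main__":
    print(demo())
```

The first probe corresponds to the failing F_WRLCK test on fd 39, the second to the state one would expect if the earlier read lock had been released in time.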


It is unclear why this was never noticed in xen-4.10; qemu-2.9 did not have this bug.
Also, whether a KVM or a Xen guest is migrated should make zero difference for the qcow2 driver...


Olaf

