From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:32950)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1XlvTa-0007Wk-VK
	for qemu-devel@nongnu.org; Wed, 05 Nov 2014 02:58:06 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1XlvTT-0007cm-F0
	for qemu-devel@nongnu.org; Wed, 05 Nov 2014 02:57:58 -0500
Message-ID: <5459D87D.9010103@suse.de>
Date: Wed, 05 Nov 2014 08:57:49 +0100
From: Alexander Graf <agraf@suse.de>
MIME-Version: 1.0
References: <1415168221-2324-1-git-send-email-sam.mj@au1.ibm.com>
	<1415168221-2324-2-git-send-email-sam.mj@au1.ibm.com>
In-Reply-To: <1415168221-2324-2-git-send-email-sam.mj@au1.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/2] spapr: Fix stale HTAB
 during live migration (KVM)
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>, qemu-ppc@nongnu.org, qemu-devel@nongnu.org


On 05.11.14 07:17, Samuel Mendoza-Jonas wrote:
> If a guest reboots during a running migration, changes to the
> hash page table are not necessarily updated on the destination.
> Opening a new file descriptor to the HTAB forces the migration
> handler to resend the entire table.
> 
> Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
> ---
>  hw/ppc/spapr.c         | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr.h |  2 ++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 0a2bfe6..1610c28 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -833,6 +833,13 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
>          /* Kernel handles htab, we don't need to allocate one */
>          spapr->htab_shift = shift;
>          kvmppc_kern_htab = true;
> +
> +        /* Tell readers to update their file descriptor */
> +        pthread_mutex_lock(&spapr->htab_mutex);

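I don't think you can use pthread functions directly in hw/. These files
can also be compiled for Windows, which doesn't have pthreads. Instead,
please use the QEMU wrappers (qemu_mutex_lock() and friends) declared in
include/qemu/thread.h; util/qemu-thread-posix.c is only their POSIX
implementation.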

Or maybe try to find out whether you actually need the lock at all.
Reboots can only happen when triggered via an HCALL, which takes the BQL.
I don't quite know how much of the migration code has become threaded,
but I'd assume that at least device migration happens either under the
BQL or after the VM has been stopped, so in any case at a consistent
point.

So as long as we're guaranteed that the htab_fd_stale variable is checked
in the final "send all device contents" phase, we should automatically
catch any reset that happened in between - even without a lock, no?
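
To make the idea concrete, here is a rough, self-contained sketch of the
lock-free variant. It is not the actual spapr code: FakeSpapr,
fake_reset_htab() and fake_htab_save_complete() are made-up stand-ins, and
the sketch simply assumes that both paths run under the BQL (or with the
VM stopped), which is exactly the assumption that would need checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bits of sPAPREnvironment. */
typedef struct {
    bool htab_fd_stale;  /* set by reset, consumed by the migration side */
    int  htab_fd;        /* stand-in for the kernel HTAB file descriptor */
    int  reopen_count;   /* demonstration only: counts simulated reopens */
} FakeSpapr;

/* Reset path. Assumption: the caller holds the BQL. */
static void fake_reset_htab(FakeSpapr *s)
{
    assert(s != NULL);
    s->htab_fd_stale = true;  /* plain store, no mutex */
}

/*
 * Final "send all device contents" phase. Assumption: this also runs
 * under the BQL or with the VM stopped, so it cannot race with a reset.
 */
static void fake_htab_save_complete(FakeSpapr *s)
{
    assert(s != NULL);
    if (s->htab_fd_stale) {
        /* Stand-in for closing the old fd and opening a fresh one,
         * which makes the kernel resend the entire table. */
        s->htab_fd += 1;
        s->reopen_count++;
        s->htab_fd_stale = false;
    }
    /* ... the whole table would be streamed from s->htab_fd here ... */
}
```

With this shape, any number of resets between two save calls collapses
into a single reopen, and a save without a preceding reset leaves the fd
alone. If the iterative save phase turns out to run outside the BQL, the
flag would need proper synchronization after all.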


Alex