From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.1 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CFE7C47082 for ; Thu, 3 Jun 2021 20:45:49 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8F2BD610CB for ; Thu, 3 Jun 2021 20:45:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8F2BD610CB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:58288 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1louDr-0003dH-KK for qemu-devel@archiver.kernel.org; Thu, 03 Jun 2021 16:45:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45010) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1louD6-0002rT-FP for qemu-devel@nongnu.org; Thu, 03 Jun 2021 16:45:00 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:52352) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1louD3-0003CF-Jk for qemu-devel@nongnu.org; Thu, 03 Jun 2021 16:44:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1622753096; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references; bh=94rcF0LBindzcq9vsfSCicnbMqTAXL7V6eU+BfmEQ30=; b=BQE0JE7Sb6c+ZN4fMRdlUIkVMh/lZ1SdMj49TISczWPafeXSszUyjTDph86c+/hEfrDs+F 4SvY0RfYdF5f4z5ntPbThFCsfpBuSOHKMq/3G29gQTrUI3H3QAx+b+9hINn6RHtucYl2hh bJmUSl2UXt2WsaNlT1c1ZGtnLurk2x0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-394-pnva9GtPNJKmSbDWzaEXew-1; Thu, 03 Jun 2021 16:44:54 -0400 X-MC-Unique: pnva9GtPNJKmSbDWzaEXew-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B53D5101371F; Thu, 3 Jun 2021 20:44:53 +0000 (UTC) Received: from redhat.com (ovpn-112-32.ams2.redhat.com [10.36.112.32]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 82CCE5D6AB; Thu, 3 Jun 2021 20:44:37 +0000 (UTC) Date: Thu, 3 Jun 2021 21:44:34 +0100 From: Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?= To: "Dr. David Alan Gilbert" Subject: Re: [PATCH V3 00/22] Live Update [restart] Message-ID: References: <29356c3a-0b3f-2773-4f9f-18ff4ed4d5bb@oracle.com> <38dc91ab-b96b-1cc3-bf8b-84ff77b334fd@oracle.com> <37713c98-5f2a-6cff-d3c4-9e1c60cc4107@oracle.com> <9a1564dd-bed9-8518-8d5b-3ca9f0243643@oracle.com> MIME-Version: 1.0 In-Reply-To: User-Agent: Mutt/2.0.7 (2021-05-04) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=berrange@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Received-SPF: pass client-ip=170.10.133.124; envelope-from=berrange@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.37, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?= Cc: Jason Zeng , "Michael S. Tsirkin" , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , Juan Quintela , qemu-devel@nongnu.org, Markus Armbruster , Alex Williamson , Steven Sistare , Stefan Hajnoczi , =?utf-8?Q?Marc-Andr=C3=A9?= Lureau , Paolo Bonzini , Alex =?utf-8?Q?Benn=C3=A9e?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" On Thu, Jun 03, 2021 at 08:36:42PM +0100, Dr. David Alan Gilbert wrote: > * Steven Sistare (steven.sistare@oracle.com) wrote: > > On 5/24/2021 6:39 AM, Dr. David Alan Gilbert wrote: > > > * Steven Sistare (steven.sistare@oracle.com) wrote: > > >> On 5/20/2021 9:13 AM, Dr. David Alan Gilbert wrote: > > >>> On the 'restart' branch of questions; can you explain, > > >>> other than the passing of the fd's, why the outgoing side of > > >>> qemu's 'migrate exec:' doesn't work for you? > > >> > > >> I'm not sure what I should describe. Can you be more specific? > > >> Do you mean: can we add the cpr specific bits to the migrate exec code? > > > > > > Yes; if possible I'd prefer to just keep the one exec mechanism. > > > It has an advantage of letting you specify the new command line; that > > > avoids the problems I'd pointed out with needing to change the command > > > line if a hotplug had happened. It also means we only need one chunk of > > > exec code. > > > > How/where would you specify a new command line? Are you picturing the usual migration > > setup where you start a second qemu with its own arguments, plus a migrate_incoming > > option or command? That does not work for live update restart; the old qemu must exec > > the new qemu. Or something else? > > The existing migration path allows an exec - originally intended to exec > something like a compressor or a store to file rather than a real > migration; i.e. you can do: > > migrate "exec:gzip > mig" > > and that will save the migration stream to a compressed file called mig. > Now, I *think* we can already do: > > migrate "exec:path-to-qemu command line parameters -incoming 'hmmmmm'" > (That's probably cleaner via the QMP interface). > > I'm not quite sure what I want in the incoming there, but that is > already the source execing the destination qemu - although I think we'd > probably need to check if that's actually via an intermediary. I don't think you can dirctly exec qemu in that way, because the source QEMU migration code is going to wait for completion of the QEMU you exec'd and that'll never come on success. So you'll end up with both QEMU's running forever. If you pass the daemonize option to the new QEMU then it will immediately detach itself, and the source QEMU will think the migration command has finished or failed. I think you can probably do it if you use a wrapper script though. The wrapper would have to fork QEMU in the backend, and then the wrapper would have to monitor the new QEMU to see when the incoming migration has finished/aborted, at which point the wrapper can exit, so the source QEMU sees a successful cleanup of the exec'd command. > > We could shoehorn cpr restart into the migrate exec path by defining a new migration > > capability that the client would set before issuing the migrate command. However, the > > implementation code would be sprinkled with conditionals to suppress migrate-only bits > > and call cpr-only bits. IMO that would be less maintainable than having a separate > > cprsave function. Currently cprsave does not duplicate any migration functionality. > > cprsave calls qemu_save_device_state() which is used by xen. > > To me it feels like cprsave in particular is replicating more code. > > It's also jumping through hoops in places to avoid changing the > commandline; that's going to cause more pain for a lot of people - not > just because it's hacks all over for that, but because a lot of people > are actually going to need to change the commandline even in a cpr like > case (e.g. due to hotplug or changing something else about the > environment, like auth data or route to storage or networking that > changed). Management apps that already support migration, will almost certainly know how to start up a new QEMU with a different command line that takes account of hotplugged/unplugged devices. IOW avoiding changing the command line only really addresses the simple case, and the hard case is likely already solved for purposes of handling regular live migration. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|