Date: Thu, 18 Sep 2025 15:45:21 +0100
From: Daniel P. Berrangé
To: Juraj Marcin
Cc: Peter Xu, qemu-devel@nongnu.org, Fabiano Rosas, Jiri Denemark
Subject: Re: [PATCH] migration: Apply migration specific keep-alive defaults to inet socket
References: <20250909150127.1494626-1-jmarcin@redhat.com>

On Thu, Sep 18, 2025 at 04:16:56PM +0200, Juraj Marcin wrote:
> If there is no outgoing traffic from the destination side (this can be
> caused for example by a workload with no page faults or paused machine),
> QEMU has no way of knowing if the connection is still working or not.
> The TCP stack doesn't treat no incoming traffic as a sign of a broken
> connection. Therefore, QEMU would stay in postcopy-active waiting for
> pages indefinitely.
>
> Also, libvirt might not be aware of a connection dropout between QEMUs,
> if libvirt's connection is intact, especially if libvirt daemons are
> communicating through some central entity that is managing the migration
> and not directly. And to do postcopy migration recovery, libvirt needs
> both sides to be in postcopy-paused state.

Whether keepalive timeouts are at the QEMU level or global kernel
level, there will always be situations where the timeouts are too
long. Apps/admins can have out of band liveness checks between
hosts that detect a problem before the keepalives will trigger
and shouldn't have to wait to recover migration, once they have
resolved the underlying network issue.

There needs to be a way to initiate post-copy recovery regardless
of whether we've hit a keepalive timeout. Especially if we can
see one QEMU in postcopy-paused, but not the other side, it
doesn't appear to make sense to block the recovery process.

The virDomainJobCancel command can do a migrate-cancel on the
src, but it didn't look like we could do the same on the dst.
Unless I've overlooked something, libvirt needs to gain a way
to explicitly force both sides into the postcopy-paused state,
and thus be able to immediately initiate recovery.

> Alternatively, there also might be an issue with the connection between
> libvirt daemons, but not the migration connection.
> Even if the libvirt
> connection fails, the migration is not paused, rather libvirt lets the
> migration finish normally. Similarly, if the libvirt connection is
> broken up due to, for example, libvirt daemon restart, the ongoing
> migration is not paused, but after the libvirt daemon starts again, it
> sees an ongoing migration and lets it finish.

While this is a reliability issue for libvirt, it doesn't have
any bearing on migration keepalive timeouts, as we're only concerned
about QEMU connections.

> Additionally, libvirt uses its own internal keep-alive packets with much
> more aggressive timeouts, waiting 5 - 10 seconds idle before sending a
> keep-alive packet and then killing the connection if there is no
> response in 30 seconds.

Yep, this keepalive is very aggressive and has frequently caused
problems with libvirt connections being torn down inappropriately.
We get away with that because most libvirt APIs don't need to have
persistent state over the duration of a connection. The migration
APIs are the area where this isn't true, and the keepalives on
libvirt connections have resulted in us breaking otherwise still
functional migrations.

IOW, I wouldn't point to libvirt as an illustration of keepalives
being free of significant downsides.

> I think, if we enable keep-alive in QEMU, but let the default timeouts
> be longer, for example idle time of 5 minutes and 15 retries in 1 minute
> intervals (which would mean, that connection would be considered broken
> after 20 minutes of unsuccessful communication attempts), that would be
> an acceptable solution.

I'm fine with turning on keepalives on the socket, but IMHO the
out of the box behaviour should be to honour the kernel default
tunables unless the admin decides they want different behaviour.
I'm not seeing a rationale for why the kernel defaults should be
forcibly overridden in QEMU out of the box.
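
To illustrate the distinction between merely turning keepalives on and
overriding the kernel tunables, here is a minimal Python sketch. It is
not QEMU code: enable_keepalive() and detection_time() are hypothetical
names invented for this example, and it assumes a Linux-style socket API
where TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are available. An
argument left unset leaves the corresponding kernel default in effect.

```python
import socket
from typing import Optional


def enable_keepalive(sock: socket.socket,
                     idle: Optional[int] = None,
                     interval: Optional[int] = None,
                     count: Optional[int] = None) -> None:
    """Turn on TCP keepalive; override a tunable only when explicitly given."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # An argument left as None is not touched, so the kernel default applies
    # (on Linux: net.ipv4.tcp_keepalive_time / _intvl / _probes).
    overrides = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in overrides:
        # Options missing on this platform are silently skipped.
        if value is not None and hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)


def detection_time(idle: int, interval: int, count: int) -> int:
    """Approximate seconds of silence before the connection is declared dead."""
    return idle + interval * count
```

With the values proposed in the quoted text, detection_time(300, 60, 15)
gives 1200 seconds, i.e. the 20 minutes mentioned above. The stock Linux
defaults (7200 seconds idle, 75 second interval, 9 probes, unless an admin
has changed the sysctls) give 7875 seconds, a little over two hours.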

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|