From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F25F0C5B552 for ; Mon, 9 Jun 2025 13:51:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uOcu8-0005Cc-LP; Mon, 09 Jun 2025 09:51:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uOcu4-0005CF-GG for qemu-devel@nongnu.org; Mon, 09 Jun 2025 09:51:08 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uOcu2-0002y7-BS for qemu-devel@nongnu.org; Mon, 09 Jun 2025 09:51:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1749477063; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q404t1FMSYXSf5BY7h6gmKe0mHDCAvX+l6Tv0jbWDBI=; b=PFQXWKsAXv2mI7/L7ZdKXX19I6+oFxVsEtqVruozEFOAFoo3+STDIa8YgDWRhl5plTM38Y yWrW8zkvkIy7TGh+Y4DwZ8esGEfQkzEf+cUZFEIAIi7wWddRHwuUkfXxpgktZg7msxbCwO HsYjrrHUBKFFlKe34N9LxP5futAhb18= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-619-Ps8N7Kt-Pz2tmPuYR0dsqw-1; Mon, 09 Jun 2025 09:51:00 -0400 X-MC-Unique: Ps8N7Kt-Pz2tmPuYR0dsqw-1 X-Mimecast-MFC-AGG-ID: Ps8N7Kt-Pz2tmPuYR0dsqw_1749477058 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 8D6C71809C9D; Mon, 9 Jun 2025 13:50:58 +0000 (UTC) Received: from redhat.com (unknown [10.42.28.45]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7B9CE18003FC; Mon, 9 Jun 2025 13:50:56 +0000 (UTC) Date: Mon, 9 Jun 2025 14:50:53 +0100 From: Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?= To: Steven Sistare Cc: JAEHOON KIM , qemu-devel@nongnu.org, jjherne@linux.ibm.com, peterx@redhat.com, farosas@suse.de Subject: Re: [PATCH v1] migration: Wait for cpr.sock file to appear before connecting Message-ID: References: <7dc8d42d-47f1-49c1-9bff-ab2d09d0b6f3@oracle.com> <5f211f67-17f7-4b1d-a60a-4ff62645fbfa@linux.ibm.com> <881cb07a-95c7-4f3b-8012-352873e88d64@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <881cb07a-95c7-4f3b-8012-352873e88d64@oracle.com> User-Agent: Mutt/2.2.14 (2025-02-20) X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=berrange@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org On Mon, Jun 09, 2025 at 09:39:48AM -0400, Steven Sistare wrote: > On 6/9/2025 9:20 AM, Daniel P. Berrangé wrote: > > On Mon, Jun 09, 2025 at 09:12:27AM -0400, Steven Sistare wrote: > > > On 6/9/2025 4:06 AM, Daniel P. Berrangé wrote: > > > > On Fri, Jun 06, 2025 at 03:37:56PM -0400, Steven Sistare wrote: > > > > > > > > > > The easiest solution, with no interface changes, is adding wait_for_socket() in qtest, > > > > > with this addition from Daniel: > > > > > > > > > > "With the busy wait you risk looping forever if the child (target) QEMU > > > > > already exited for some reason without ever creating the socket. You > > > > > can mitigate this by using 'kill($PID, 0)' in the loop and looking > > > > > for -ERSCH, but this only works if you know the pid involved." > > > > > > > > > > Daniel also suggested: > > > > > "For the tests, passing a pre-opened UNIX socket FD could work" > > > > > > > > > > Note we can not use any of the standard chardev options to specify such a socket, > > > > > because the cpr socket is created before chardevs are created. > > > > > > > > > > Perhaps we could specify the fd in an extension of the MigrationChannel MigrationAddress. > > > > > { 'union': 'MigrationAddress', > > > > > 'base': { 'transport' : 'MigrationAddressType'}, > > > > > 'discriminator': 'transport', > > > > > 'data': { > > > > > 'socket': 'SocketAddress', > > > > > 'exec': 'MigrationExecCommand', > > > > > 'rdma': 'InetSocketAddress', > > > > > 'file': 'FileMigrationArgs', > > > > > 'fd': 'FdMigrationArgs' } } <-- add this > > > > > > > > > > That would be useful for all clients, but this is asking a lot from you, > > > > > when you are just trying to fix the tests. > > > > > > > > Note, 'SocketAddress' already has an option for declaring a FD that > > > > represents a socket. > > > > > > Yes, but if I understand, you proposed passing an fd that represents a > > > pre-listened socket, which requires target qemu to accept() first. The > > > existing FdSocketAddress is ready to read. We could add a boolean to enable > > > the new behavior. > > > > It can do both actually - it depends on what APIs the QEMU uses the > > SocketAddress with. > > > > If it is used with qio_channel_socket_connect* the FD must be an > > active peer connection. > > > > If it is used with qio_channel_socket_listen*/qio_net_listener* the > > FD must be listener socket. > > Fair enough. cpr currently listens here, and we could add a case for the FD: > > QEMUFile *cpr_transfer_input(MigrationChannel *channel, Error **errp) > { > MigrationAddress *addr = channel->addr; > > if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET && > addr->u.socket.type == SOCKET_ADDRESS_TYPE_UNIX) { > ... > g_autoptr(QIONetListener) listener = qio_net_listener_new(); > > Or to use my socketpair() suggestion, that function would also need changes, > calling qio_channel_socket_connect. > > Which do you think is better for clients -- socketpair or pre-listened? Please just use the existing SocketAddress functionality, as that's used throughout QEMU - a special case with socketpair for migration is not desirable. The SocketAddress stuff is what libvirt's used for many years now to address the race condition with QMP listeners. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|