Date: Fri, 15 Sep 2023 04:41:20 -0300
From: Leonardo Bras
To: Tyler Stachecki
Cc: Dongli Zhang, kvm@vger.kernel.org, seanjc@google.com, pbonzini@redhat.com,
    dgilbert@redhat.com, tglx@linutronix.de, mingo@redhat.com,
    dave.hansen@linux.intel.com, bp@alien8.de, Tyler Stachecki,
    stable@vger.kernel.org
Subject: Re: [PATCH] x86/kvm: Account for fpstate->user_xfeatures changes
References: <20230914010003.358162-1-tstachecki@bloomberg.net>
    <93592292-ab7e-71ac-dd72-74cc76e97c74@oracle.com>
X-Mailing-List: stable@vger.kernel.org

On Thu, Sep 14, 2023 at 08:58:42PM -0400, Tyler Stachecki wrote:
> On Thu, Sep 14, 2023 at 10:05:57AM -0700, Dongli Zhang wrote:
> > That is:
> >
> > 1. Without the commit (src and dst), something bad may happen.
> >
> > 2. With the commit on src, issue is fixed.
> >
> > 3. With the commit only dst, it is expected that issue is not fixed.
> >
> > Therefore, from administrator's perspective, the bugfix should always be applied
> > on the source server, in order to succeed the migration.
>
> I fully agree. Though, I think this boils down to:
> The commit must be on the source or something bad may happen.
>
> It then follows that you cannot live-migrate guests off the source to patch it
> without potentially corrupting the guests currently running on that source...

Well, the bug was a really bad issue, and even the fix does not solve every
problem. As we discussed, there is no way of safely removing any feature from
a guest without potential issues.

One potential solution would be having hosts that implement the missing guest
features needed by the VMs, but this may be far from easy depending on the
missing feature.

Other than that, all I can think of is removing the features from the guest:

As you commented, there may be some features that would not be a problem to
remove, and there may also be features which are not used by the workload and
could be removed. But this would depend on the feature and the workload, so it
would be a custom solution for every case.

For this (removing guest features), from the kernel side, I would suggest using
SystemTap (and eBPF, IIRC). The procedure should be something like:
- Try to migrate the VM from the host with the older kernel: fail
- Look at the qemu error: which features are missing?
- Are those features safely removable from the guest?
- If so, get a SystemTap / eBPF script masking out the undesired bits.
- Try the migration again; it should succeed.
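
To make the "which features are missing" and "masking out the undesired bits"
steps concrete, here is a minimal sketch in Python of the bit arithmetic only.
It is not a SystemTap or eBPF script and it does not touch a guest. The
function names and the example masks are made up for illustration; the bit
positions follow the XSAVE state-component numbering.

```python
# Illustrative only: the names below are hypothetical, not kernel or qemu APIs.
# Bit positions follow the XSAVE state-component numbering.
XFEATURES = {
    0: "x87",
    1: "SSE",
    2: "AVX",
    5: "AVX-512 opmask",
    6: "AVX-512 ZMM_Hi256",
    7: "AVX-512 Hi16_ZMM",
    9: "PKRU",
}


def missing_features(src_mask, dst_mask):
    """Names of the features set in the source mask but absent on the destination."""
    missing = src_mask & ~dst_mask
    return [name for bit, name in sorted(XFEATURES.items()) if missing & (1 << bit)]


def mask_for_migration(src_mask, dst_mask, removable):
    """Return src_mask with the missing bits cleared.

    Refuses to proceed if any missing bit was not explicitly judged
    removable for this guest and workload.
    """
    missing = src_mask & ~dst_mask
    not_removable = missing & ~removable
    if not_removable:
        raise ValueError(f"cannot safely drop feature bits {not_removable:#x}")
    return src_mask & ~missing


if __name__ == "__main__":
    src = 0x2E7  # assumed example: x87, SSE, AVX, AVX-512, PKRU
    dst = 0x0E7  # assumed example: same, but without PKRU
    print(missing_features(src, dst))
    print(hex(mask_for_migration(src, dst, removable=1 << 9)))
```

The `removable` argument is where the manual, per-workload judgment from the
steps above goes: the sketch refuses to clear any bit that was not explicitly
marked as safe to drop.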

IIRC, this could also be done on the qemu side, with a custom qemu:
- Try to migrate the VM from the host with the older kernel: fail
- Look at the qemu error: which features are missing?
- Are those features safely removable from the guest?
- If so, get a custom qemu which masks out those flags before the VM starts
- Live migrate (can be inside the source host) to the custom qemu
- Live migrate from the custom qemu to the target host.
- The custom qemu could be on an auxiliary host, and used only for this

Yes, it's hard, takes time, and may not solve every case, but it gives the VM
a higher chance of surviving in the long run.

But keep in mind this is a hack.
Taking features from a live guest is not supported in any way, and has a high
chance of crashing the VM.

Best regards,
Leo

>
> Regards,
> Tyler
>