From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE614C433FE for ; Mon, 28 Nov 2022 17:58:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234052AbiK1R6s (ORCPT ); Mon, 28 Nov 2022 12:58:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35052 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234099AbiK1R6T (ORCPT ); Mon, 28 Nov 2022 12:58:19 -0500 Received: from mail-pj1-x102a.google.com (mail-pj1-x102a.google.com [IPv6:2607:f8b0:4864:20::102a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9F093D926 for ; Mon, 28 Nov 2022 09:46:28 -0800 (PST) Received: by mail-pj1-x102a.google.com with SMTP id v3-20020a17090ac90300b00218441ac0f6so12956213pjt.0 for ; Mon, 28 Nov 2022 09:46:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Yt+6ZsFhHbg9VXiMxe9kvr7coxEz5653nTiwyPOx528=; b=M602G5EP/uN3K4grhCY6Shfqyo+fxipZ4KWTTuEwvKP/eqEFyLVW8NRu3X89MxlYzY 2YxMA/Xzs8zAJzyf6jPcLQ4GfFKkYV0x4ISr2XKqchK3Jx02Ax36vhA7NAHk+OlvcL9L dhUWN6THrPOaW9jyniNJVqtSlrT1HM/5Rs38XmlzlfhbjFUlDH1WUN9NFhgs4Woi2QUy dSaL+EXTRy9/9LYpnXYEof4+I3HUdFPAkdSlv4P+catGH0hbvukplczpguDKxoYPrmqE tG2tILO4kQFHaOVGFjY3QTXfDjAVfsUp3fXL8P/5duSeu2XyVbr48YvIGFiWBkYFgEMf SPQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Yt+6ZsFhHbg9VXiMxe9kvr7coxEz5653nTiwyPOx528=; b=FofJfLkmLEcVRLyhSuBWHnckodgeRxfWAkOxwAjTmGykQ1PUFcXz3LmiUKmoZvXLXY JBUL6iCIkbTcgxAi7RObVebCEUeMmKRhrqvCgA/5nLNnxFTOfF7Qk+NVICrh53dQqpXh YEKo4ES9gM1FO2Ka16tqtTyC7RzoecGVrW/ZzyuCc934I4MhP3keAaNo1j1ZjT08ELoR 6YaI9Rj3EEXiw7QW7JmE6sDV9TTfH3K31YWs4sgJk7bm2ciXP5kvz5z/6KHAwOnxHWUL zhKsa6XtxhL6kMjkEQp2V0zeYTE0V0ShkGs+UPqdWJIPM/ngYhlCfPTfJ9xM5nL/vjKW etUw== X-Gm-Message-State: ANoB5pkLmBX4OVgbRLbpLiSZInngXyUYXkDFl35zwyQftksQx1tFZUG4 0tH09GuemE7LyxonGb6SMJNUcgzY8utAeA== X-Google-Smtp-Source: AA0mqf6RR+CKCljliuYRRA/Xk2TOOtJ2SX5gEyk55ktJ9yFGoMm9S7KlVOyhhodEO8xdzKB+0DNV8w== X-Received: by 2002:a17:902:7087:b0:189:4de4:e71b with SMTP id z7-20020a170902708700b001894de4e71bmr25875312plk.159.1669657578962; Mon, 28 Nov 2022 09:46:18 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id u10-20020a170902e5ca00b00189371b5971sm9176020plf.220.2022.11.28.09.46.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 09:46:18 -0800 (PST) Date: Mon, 28 Nov 2022 17:46:15 +0000 From: Sean Christopherson To: Mingwei Zhang Cc: Paolo Bonzini , "H. Peter Anvin" , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Nagareddy Reddy , Jim Mattson , David Matlack Subject: Re: [RFC PATCH v3 2/2] KVM: x86/mmu: replace BUG() with KVM_BUG() in shadow mmu Message-ID: References: <20221128002043.1555543-1-mizhang@google.com> <20221128002043.1555543-3-mizhang@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20221128002043.1555543-3-mizhang@google.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 28, 2022, Mingwei Zhang wrote: > Replace BUG() in pte_list_remove() with KVM_BUG() to avoid crashing the > host. MMU bug is difficult to discover due to various racing conditions and > corner cases and thus it extremely hard to debug. The situation gets much > worse when it triggers the shutdown of a host. Host machine crash > eliminates everything including the potential clues for debugging. > > From cloud computing service perspective, BUG() or BUG_ON() is probably no > longer appropriate as the host reliability is top priority. I don't think we need to bring "cloud computing" into this. Linus has made it clear over and over and over that BUG() / BUG_ON() need to be avoided unless the alternative is worse. E.g. the BUG() in __handle_changed_spte() is warranted because the alternative is silent corruption of guest data. > Crashing the physical machine is almost never a good option as it eliminates > innocent VMs and cause service outage in a larger scope. Even worse, if > attacker can reliably triggers this code by diverting the control flow or > corrupting the memory, Or if there's a KVM bug, which is waaaaay more likely. > then this becomes vm-of-death attack. This is true of any BUG(), and really of any unexpected fault while holding a spinlock, e.g. NULL pointer derefs in the MMU are almost always fatal as well. > This is a huge attack vector to cloud providers, as the death of one single > host machine is not the end of the story. Without manual interferences, a > failed cloud job may be dispatched to other hosts and continue host crashes > until all of them are dead. > > Because of the above reasons, shrink the scope of crash to the target VM > only. KVM_BUG() and KVM_BUG_ON() requires a valid struct kvm which requires > extra plumbing. Avoid it in this version by just using > kvm_get_running_vcpu()->kvm instead. Stale comment. > Cc: Nagareddy Reddy > Cc: Jim Mattson > Cc: David Matlack > Signed-off-by: Mingwei Zhang > --- > arch/x86/kvm/mmu/mmu.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index b5a44b8f5f7b..e132d82ab4c0 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -956,12 +956,12 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte, > > if (!rmap_head->val) { > pr_err("%s: %p 0->BUG\n", __func__, spte); These probably need to be ratelimited (or "once"). Bugging the VM will prevent doing anything useful with the VM, but KVM still needs to destroy the VM, which means zapping SPTEs and purging the rmaps. Theoretically, there could be thousands of broken rmaps. > - BUG(); > + KVM_BUG(true, kvm, ""); If you don't want to provide a message, use KVM_BUG_ON(), not an empty message. Though my vote would be to fold the existing pr_err() messages into KVM_BUG(), which would make the WARN much more helpful and would address the pr_err() issue above. The __func__ printing can also go away in that case because the stack track will provide all the necessary info. The only reason not to drop the pr_err() entirely is if a ratelimited message is helpful for debugging failures that occur in production, which I doubt it true. And rather than pass "true", wrap the actual check with the KVM_BUG(). > } else if (!(rmap_head->val & 1)) { > rmap_printk("%p 1->0\n", spte); > if ((u64 *)rmap_head->val != spte) { > pr_err("%s: %p 1->BUG\n", __func__, spte); > - BUG(); > + KVM_BUG(true, kvm, ""); KVM needs to return here, otherwise KVM is knowingly writing a garbage pointer, e.g. will corrupt memory or trigger a fault. > } > rmap_head->val = 0; > } else { Something like this? --- arch/x86/kvm/mmu/mmu.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b5a44b8f5f7b..12790ccb8731 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -954,15 +954,16 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte, struct pte_list_desc *prev_desc; int i; - if (!rmap_head->val) { - pr_err("%s: %p 0->BUG\n", __func__, spte); - BUG(); - } else if (!(rmap_head->val & 1)) { + if (KVM_BUG(!rmap_head->val, kvm, "rmap for %p is empty", spte)) + return; + + if (!(rmap_head->val & 1)) { rmap_printk("%p 1->0\n", spte); - if ((u64 *)rmap_head->val != spte) { - pr_err("%s: %p 1->BUG\n", __func__, spte); - BUG(); - } + + if (KVM_BUG((u64 *)rmap_head->val != spte, kvm, + "single rmap for %p doesn't match", spte)) + return; + rmap_head->val = 0; } else { rmap_printk("%p many->many\n", spte); @@ -979,8 +980,7 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte, prev_desc = desc; desc = desc->more; } - pr_err("%s: %p many->many\n", __func__, spte); - BUG(); + KVM_BUG(true, kvm, "no rmap for %p (many->many)", spte); } } base-commit: d74237e747db7f9f27e821e6683d58185e846378 --