From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA7D9C433E0 for ; Tue, 26 Jan 2021 09:19:21 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 67767206E5 for ; Tue, 26 Jan 2021 09:19:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 67767206E5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Type: Content-Transfer-Encoding:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:Message-ID:References:In-Reply-To:Subject:To:From: Date:MIME-Version:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=AlLlCdi4piJGq/nEs7WsLIZUNVq9qrvZzc4y38U5GSk=; b=eV0BATw5uA8Kc5m8nsELXVPtU NJU8mUAqoTdquR0+lvQ3knM+heQM9Ja5dxHmS/nX5vnP3gKC8uiO1e7Li/l/0x0WYpma8UJ81bCwE shDmptfqISEIdYzN99ca4DGtdDBrhfTm6GVBA651lEbyCb6Vq1OEBH/qMzqJdXXH9826XVVGTbPgE 87Wcpt4tnylLvxuNVx1Gt8MfYFLvAG5H/mSlSnYjrLaTcBiiS3KEIa48fxJfR45KNfA5vYM558QNt 6XQGqblAWRhvrodxMX0uR8T7hguNmdf/1DZJ3s1y0n3Mi8JX882344WpTzfGIY5DsTTSzLqUh/qQi BYg3jCxUg==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1l4KUC-00050Y-NJ; Tue, 26 Jan 2021 09:18:08 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1l4KU7-0004zd-NA for linux-arm-kernel@lists.infradead.org; Tue, 26 Jan 2021 09:18:06 +0000 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B71FC206E5; Tue, 26 Jan 2021 09:18:02 +0000 (UTC) Received: from disco-boy.misterjones.org ([51.254.78.96] helo=www.loen.fr) by disco-boy.misterjones.org with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94) (envelope-from ) id 1l4KU4-00A4WZ-Ji; Tue, 26 Jan 2021 09:18:00 +0000 MIME-Version: 1.0 Date: Tue, 26 Jan 2021 09:18:00 +0000 From: Marc Zyngier To: Jianyong Wu Subject: Re: [PATCH] arm64/kvm: correct the error report in kvm_handle_guest_abort In-Reply-To: References: <20210115093028.6504-1-jianyong.wu@arm.com> <6f5a2ce458e33f879635f45140293517@kernel.org> User-Agent: Roundcube Webmail/1.4.10 Message-ID: <73b606d9c49e05435113a40a534af133@kernel.org> X-Sender: maz@kernel.org X-SA-Exim-Connect-IP: 51.254.78.96 X-SA-Exim-Rcpt-To: Jianyong.Wu@arm.com, James.Morse@arm.com, will@kernel.org, Suzuki.Poulose@arm.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, Steve.Capper@arm.com, Justin.He@arm.com, nd@arm.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210126_041803_928615_8CA7440D X-CRM114-Status: GOOD ( 34.54 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Justin He , Steve Capper , Suzuki Poulose , James Morse , nd , will@kernel.org, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 2021-01-26 08:10, Jianyong Wu wrote: > Hi Marc, > >> /* >> - * Check for a cache maintenance operation. Since we >> - * ended-up here, we know it is outside of any memory >> - * slot. But we can't find out if that is for a device, >> - * or if the guest is just being stupid. The only thing >> - * we know for sure is that this range cannot be cached. >> + * Check for a cache maintenance operation. Two cases: >> * >> - * So let's assume that the guest is just being >> - * cautious, and skip the instruction. >> + * - It is outside of any memory slot. But we can't >> + * find out if that is for a device, or if the guest >> + * is just being stupid. The only thing we know for >> + * sure is that this range cannot be cached. So >> + * let's assume that the guest is just being >> + * cautious, and skip the instruction. >> + * >> + * - Otherwise, clean/invalidate the whole memslot. We >> + * should special-case DC IVAC and inject a >> + * permission fault, but we can't really identify it >> + * in this context. >> */ >> - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) >> { >> + if (kvm_vcpu_dabt_is_cm(vcpu)) { >> + if (!kvm_is_error_hva(hva)) { >> + spin_lock(&vcpu->kvm->mmu_lock); >> + stage2_flush_memslot(vcpu->kvm, >> memslot); > > Maybe we should not flush the whole memslot here as every "dc ivac" > only work on a range of memory with cache line size. So what about > using: > stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + > cache_line_size(), kvm_pgtable_stage2_flush) instead. It will a bit > faster than flush the whole memslot. > Also I test your idea of "unmap after flush" using: > stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + > cache_line_size(), kvm_pgtable_stage2_flush); > stage2_apply_range(vcpu->kvm, fault_ipa, fault_ipa + > cache_line_size(), kvm_pgtable_stage2_unmap, true); > then I do "dc ivac" on the rom of 128M twice and got the double time > of around 11s. it means that there is no optimization when do the > second "dc ivac". > I'm not sure if there is something wrong in my test. > > So what about just using " stage2_apply_range_resched(vcpu->kvm, > fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush);" > instead of > " stage2_flush_memslot(vcpu->kvm, memslot);" and let the guest bears > the disadvantage of low performance. No, both solutions are wrong. I had to write my own test case because the idea of hacking some unknown guest wasn't very appealing. At the end of the day, what we need to implement is as follow: - if a CMO hits normal memory, it's all already handled - if a CMO hits non-memory, we skip it, as we do today - if a CMO hits R/O memory, that's where things become fun: - if it is a DC IVAC, the architecture says this should result in a permission fault - if it is a DC CIVAC, it works as expected So we need to distinguish between IVAC and CIVAC. One way to do it is to treat CMOs generating a translation fault as a *read*, even when they are on a RO memslot. If they come back with a permission fault: - inside a RW memslot: no problem, treat it as a write - inside a RO memslot: only IVAC will fault here, inject an abort in the guest This translates into the following patch, which does the trick for me. M. From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 2001 From: Marc Zyngier Date: Sat, 16 Jan 2021 10:53:21 +0000 Subject: [PATCH] CMO on RO memslot Signed-off-by: Marc Zyngier --- arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++----------- 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7d2257cc5438..3c176b5b0a28 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, struct kvm_pgtable *pgt; fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level); - write_fault = kvm_is_write_fault(vcpu); + /* + * Treat translation faults on CMOs as read faults. Should + * this further generate a permission fault, it will be caught + * in kvm_handle_guest_abort(), with prejudice... + */ + if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu)) + write_fault = false; + else + write_fault = kvm_is_write_fault(vcpu); exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); VM_BUG_ON(write_fault && exec_fault); @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) } /* - * Check for a cache maintenance operation. Since we - * ended-up here, we know it is outside of any memory - * slot. But we can't find out if that is for a device, - * or if the guest is just being stupid. The only thing - * we know for sure is that this range cannot be cached. + * Check for a cache maintenance operation. Two cases: + * + * - It is outside of any memory slot. But we can't find out + * if that is for a device, or if the guest is just being + * stupid. The only thing we know for sure is that this + * range cannot be cached. So let's assume that the guest + * is just being cautious, and skip the instruction. + * + * - Otherwise, check whether this is a permission fault. + * If so, that's a DC IVAC on a R/O memslot, which is a + * pretty bad idea, and we tell the guest so. * - * So let's assume that the guest is just being - * cautious, and skip the instruction. + * - If this wasn't a permission fault, pass it along for + * further handling (including faulting the page in if it + * was a translation fault). */ - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) { - kvm_incr_pc(vcpu); - ret = 1; - goto out_unlock; + if (kvm_vcpu_dabt_is_cm(vcpu)) { + if (kvm_is_error_hva(hva)) { + kvm_incr_pc(vcpu); + ret = 1; + goto out_unlock; + } + + if (fault_status == FSC_PERM) { + /* DC IVAC on a R/O memslot */ + kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu)); + ret = 1; + goto out_unlock; + } + + goto handle_access; } /* @@ -1039,6 +1065,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) goto out_unlock; } +handle_access: /* Userspace should not be able to register out-of-bounds IPAs */ VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm)); -- 2.29.2 -- Jazz is not dead. It just smells funny... _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel