Date: Wed, 08 Jun 2022 10:01:56 +0100
From: Marc Zyngier <maz@kernel.org>
To: Kalesh Singh <kaleshsingh@google.com>
Cc: mark.rutland@arm.com, broonie@kernel.org, will@kernel.org,
	qperret@google.com, tabba@google.com, surenb@google.com,
	tjmercier@google.com, kernel-team@android.com,
	James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Jones <drjones@redhat.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Keir Fraser <keirf@google.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Oliver Upton <oupton@google.com>,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/5] KVM: arm64: Allocate shared stacktrace pages
Message-ID: <8735gfr0jf.wl-maz@kernel.org>
In-Reply-To: <20220607165105.639716-5-kaleshsingh@google.com>
References: <20220607165105.639716-1-kaleshsingh@google.com>
	<20220607165105.639716-5-kaleshsingh@google.com>
On Tue, 07 Jun 2022 17:50:46 +0100,
Kalesh Singh <kaleshsingh@google.com> wrote:
> 
> The nVHE hypervisor can use this shared area to dump its stacktrace
> addresses on hyp_panic(). Symbolization and printing the stacktrace can
> then be handled by the host in EL1 (done in a later patch in this series).
> 
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h |  1 +
>  arch/arm64/kvm/arm.c             | 34 ++++++++++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/nvhe/setup.c  | 11 +++++++++++
>  3 files changed, 46 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 2e277f2ed671..ad31ac68264f 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -174,6 +174,7 @@ struct kvm_nvhe_init_params {
>  	unsigned long hcr_el2;
>  	unsigned long vttbr;
>  	unsigned long vtcr;
> +	unsigned long stacktrace_hyp_va;
>  };
>  
>  /* Translate a kernel address @ptr into its equivalent linear mapping */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 400bb0fe2745..c0a936a7623d 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -50,6 +50,7 @@ DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
>  DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
>  
>  static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
> +DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stacktrace_page);

Why isn't this static, since the address is passed via the
kvm_nvhe_init_params block?
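Something like this is all it should take (untested, on top of this
patch, hunk offsets approximate), given that EL2 only ever sees the
address through params->stacktrace_hyp_va:

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -52,7 +52,7 @@
 	static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
-	DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stacktrace_page);
+	static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stacktrace_page);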
>  unsigned long kvm_arm_hyp_percpu_base[NR_CPUS];
>  DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
>  
> @@ -1554,6 +1555,7 @@ static void cpu_prepare_hyp_mode(int cpu)
>  	tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
>  	params->tcr_el2 = tcr;
>  
> +	params->stacktrace_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stacktrace_page, cpu));
>  	params->pgd_pa = kvm_mmu_get_httbr();
>  	if (is_protected_kvm_enabled())
>  		params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
> @@ -1845,6 +1847,7 @@ static void teardown_hyp_mode(void)
>  	free_hyp_pgds();
>  	for_each_possible_cpu(cpu) {
>  		free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
> +		free_page(per_cpu(kvm_arm_hyp_stacktrace_page, cpu));
>  		free_pages(kvm_arm_hyp_percpu_base[cpu], nvhe_percpu_order());
>  	}
>  }
> @@ -1936,6 +1939,23 @@ static int init_hyp_mode(void)
>  		per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
>  	}
>  
> +	/*
> +	 * Allocate stacktrace pages for Hypervisor-mode.
> +	 * This is used by the hypervisor to share its stacktrace
> +	 * with the host on a hyp_panic().
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		unsigned long stacktrace_page;
> +
> +		stacktrace_page = __get_free_page(GFP_KERNEL);
> +		if (!stacktrace_page) {
> +			err = -ENOMEM;
> +			goto out_err;
> +		}
> +
> +		per_cpu(kvm_arm_hyp_stacktrace_page, cpu) = stacktrace_page;

I have the same feeling as with the overflow stack. This is
potentially a huge amount of memory: on my test box, with 64k pages,
this is a whole 10MB that I give away for something that is only a
debug facility.

Can this somehow be limited? I don't see it being less than a page as
a problem, as the memory is always shared back with EL1 in the case of
pKVM (and normal KVM doesn't have that problem anyway).

Alternatively, this should be restricted to pKVM. With normal nVHE,
the host should be able to parse the EL2 stack directly with some
offsetting. Actually, this is probably the best option.

Thanks,

	M.
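P.S.: in case it helps, the pKVM-only restriction could look roughly
like the below in init_hyp_mode() (completely untested, just to
illustrate the shape of it):

	/*
	 * Only pKVM needs to copy its stack trace to a shared page;
	 * with normal nVHE the host can walk the EL2 stack directly.
	 */
	if (is_protected_kvm_enabled()) {
		for_each_possible_cpu(cpu) {
			unsigned long stacktrace_page;

			stacktrace_page = __get_free_page(GFP_KERNEL);
			if (!stacktrace_page) {
				err = -ENOMEM;
				goto out_err;
			}

			per_cpu(kvm_arm_hyp_stacktrace_page, cpu) = stacktrace_page;
		}
	}

If I remember correctly, free_pages() treats a zero address as a
no-op, so the teardown path shouldn't need a matching check.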
-- 
Without deviation from the norm, progress is not possible.