From: Vitaly Kuznetsov
To: Maya Nakamura, mikelley@microsoft.com, kys@microsoft.com,
        haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org
Cc: x86@kernel.org, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/6] x86: hv: hv_init.c: Replace alloc_page() with kmem_cache_alloc()
Date: Fri, 05 Apr 2019 13:31:02 +0200
Message-ID: <87wok8it8p.fsf@vitty.brq.redhat.com>

Maya Nakamura writes:

> Switch from the function that allocates a single Linux guest page to
> one that allocates a Hyper-V page, because the guest page size and the
> hypervisor page size are distinct concepts, even though they happen to
> have the same value on x86.
>
> Signed-off-by: Maya Nakamura
> ---
>  arch/x86/hyperv/hv_init.c | 38 +++++++++++++++++++++++++++-----------
>  1 file changed, 27 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index e4ba467a9fc6..5f946135aa18 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -31,6 +31,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #ifdef CONFIG_HYPERV_TSCPAGE
>
> @@ -98,18 +99,20 @@ EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
>  u32 hv_max_vp_index;
>  EXPORT_SYMBOL_GPL(hv_max_vp_index);
>
> +struct kmem_cache *cachep;
> +EXPORT_SYMBOL_GPL(cachep);
> +
>  static int hv_cpu_init(unsigned int cpu)
>  {
>         u64 msr_vp_index;
>         struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
>         void **input_arg;
> -       struct page *pg;
>
>         input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> -       pg = alloc_page(GFP_KERNEL);
> -       if (unlikely(!pg))
> +       *input_arg = kmem_cache_alloc(cachep, GFP_KERNEL);

I'm not sure the use of kmem_cache is justified here: what we allocate
are whole pages, not small cache-line-sized objects, and all of these
allocations are supposed to persist for the lifetime of the guest. If
you think that even on x86 we may one day see
PAGE_SIZE != HV_HYP_PAGE_SIZE, you can use alloc_pages() instead (see
the untested sketch at the end of this mail).

Also, if the idea is to generalize this code, what will happen when
PAGE_SIZE > HV_HYP_PAGE_SIZE? Who will guarantee proper alignment?

I think we can leave the hypercall arguments, vp_assist and similar
pages alone for now: this code is not going to be shared among
architectures anyway.

> +
> +       if (unlikely(!*input_arg))
>                 return -ENOMEM;
> -       *input_arg = page_address(pg);
>
>         hv_get_vp_index(msr_vp_index);
>
> @@ -122,14 +125,12 @@ static int hv_cpu_init(unsigned int cpu)
>                 return 0;
>
>         if (!*hvp)
> -               *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL);
> +               *hvp = kmem_cache_alloc(cachep, GFP_KERNEL);
>
>         if (*hvp) {
>                 u64 val;
>
> -               val = vmalloc_to_pfn(*hvp);
> -               val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
> -                       HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
> +               val = virt_to_phys(*hvp) | HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
>
>                 wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
>         }
> @@ -233,17 +234,22 @@ static int hv_cpu_die(unsigned int cpu)
>         unsigned long flags;
>         void **input_arg;
>         void *input_pg = NULL;
> +       struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
>
>         local_irq_save(flags);
>         input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
>         input_pg = *input_arg;
>         *input_arg = NULL;
>         local_irq_restore(flags);
> -       free_page((unsigned long)input_pg);
> +       kmem_cache_free(cachep, input_pg);
> +       input_pg = NULL;
>
>         if (hv_vp_assist_page && hv_vp_assist_page[cpu])
>                 wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
>
> +       kmem_cache_free(cachep, *hvp);
> +       *hvp = NULL;
> +
>         if (hv_reenlightenment_cb == NULL)
>                 return 0;
>
> @@ -325,6 +331,11 @@ void __init hyperv_init(void)
>                 goto free_vp_index;
>         }
>
> +       cachep = kmem_cache_create("hyperv_pages", HV_HYP_PAGE_SIZE,
> +                                  HV_HYP_PAGE_SIZE, 0, NULL);
> +       if (!cachep)
> +               goto free_vp_assist_page;
> +
>         cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
>                                   hv_cpu_init, hv_cpu_die);
>         if (cpuhp < 0)
> @@ -338,7 +349,10 @@
>         guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0);
>         wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
>
> -       hv_hypercall_pg = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RX);
> +       hv_hypercall_pg = kmem_cache_alloc(cachep, GFP_KERNEL);
> +       if (hv_hypercall_pg)
> +               set_memory_x((unsigned long)hv_hypercall_pg, 1);

PAGE_KERNEL_RX is not writeable, right? With kmem_cache_alloc() plus
set_memory_x() the hypercall page stays writeable, so we end up with a
writeable and executable mapping.
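Just to illustrate (untested, and only hypothetical if the hypercall
page keeps coming from the direct map rather than __vmalloc()): the page
would need to be made read-only in addition to executable, something
like

        /*
         * Hypothetical fixup, not part of the patch: a direct-map page is
         * mapped read-write, so drop write permission in addition to
         * adding execute permission.
         */
        set_memory_ro((unsigned long)hv_hypercall_pg, 1);
        set_memory_x((unsigned long)hv_hypercall_pg, 1);

whereas __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RX) gave a
read-only, executable mapping from the start.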
> +
>         if (hv_hypercall_pg == NULL) {
>                 wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
>                 goto remove_cpuhp_state;
>         }
> @@ -346,7 +360,8 @@
>
>         rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>         hypercall_msr.enable = 1;
> -       hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> +       hypercall_msr.guest_physical_address = virt_to_phys(hv_hypercall_pg) >>
> +                                              HV_HYP_PAGE_SHIFT;
>         wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>
>         hv_apic_init();
> @@ -416,6 +431,7 @@ void hyperv_cleanup(void)
>          * let hypercall operations fail safely rather than
>          * panic the kernel for using invalid hypercall page
>          */
> +       kmem_cache_free(cachep, hv_hypercall_pg);

Please don't do that: hyperv_cleanup() is called on kexec/kdump and we
are trying to do the bare minimum there to allow the next kernel to
boot. Doing extra work at this point will likely cause problems of its
own (in the kdump case we have already crashed!).

>         hv_hypercall_pg = NULL;
>
>         /* Reset the hypercall page */
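Coming back to the alloc_pages() suggestion above, a minimal, untested
sketch of what I mean for the per-cpu input page, assuming
PAGE_SIZE >= HV_HYP_PAGE_SIZE (true on x86, where both are 4096) so that
a naturally aligned guest page also satisfies the hypervisor's alignment
requirement:

        /* Sketch only: would replace the kmem_cache_alloc() in hv_cpu_init() */
        struct page *pg;

        input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
        /* get_order(HV_HYP_PAGE_SIZE) is 0 when both page sizes are 4096 */
        pg = alloc_pages(GFP_KERNEL, get_order(HV_HYP_PAGE_SIZE));
        if (unlikely(!pg))
                return -ENOMEM;
        *input_arg = page_address(pg);

hv_cpu_die() would then free with free_pages() instead of
kmem_cache_free().

--
Vitaly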