Date: Mon, 21 Aug 2023 09:11:46 -0700
In-Reply-To: <33f0e9bb-da79-6f32-f1c3-816eb37daea6@linux.alibaba.com>
References: <1692588392-58155-1-git-send-email-hao.xiang@linux.alibaba.com>
 <6d10dcf7-7912-25a2-8d8e-ef7d71a4ce83@linux.alibaba.com>
 <33f0e9bb-da79-6f32-f1c3-816eb37daea6@linux.alibaba.com>
Subject: Re: [PATCH] kvm: x86: emulate MSR_PLATFORM_INFO msr bits
From: Sean Christopherson
To: Hao Xiang
Cc: Chao Gao, kvm@vger.kernel.org, shannon.zhao@linux.alibaba.com,
 pbonzini@redhat.com, linux-kernel@vger.kernel.org, Aaron Lewis

+Aaron

When resending a patch, e.g. to change To: or Cc:, tag it RESEND. I got three
copies of this...

On Mon, Aug 21, 2023, Hao Xiang wrote:
>
>
> On 2023/8/21 18:44, Chao Gao wrote:
> > On Mon, Aug 21, 2023 at 05:11:16PM +0800, Hao Xiang wrote:
> > > For reason that,
> > >
> > > The turbo frequency info depends on specific machine type. And the msr value
> > > of MSR_PLATFORM_INFO may be different on different generation machine.
> > >
> > > Get following msr bits (needed by turbostat on intel platform) by rdmsr
> > > MSR_PLATFORM_INFO directly in KVM is more reasonable. And set these msr bits
> > > as vcpu->arch.msr_platform_info default value.
> > > -bit 15:8, Maximum Non-Turbo Ratio (MAX_NON_TURBO_LIM_RATIO)
> > > -bit 47:40, Maximum Efficiency Ratio (MAX_EFFICIENCY_RATIO)
> >
> > I don't get why QEMU cannot do this with the existing interface, e.g.,
> > KVM_SET_MSRS.
> >
> > will the MSR value be migrated during VM migration?
> >
> > looks we are in a dilemma. on one side, if the value is migrated, the value can
> > become inconsistent with hardware value. On the other side, changing the ratio
> > bits at runtime isn't the architectural behavior.
> >
> > And the MSR is per-socket. In theory, a system can have two sockets with
> > different values of the MSR. what if a vCPU is created on a socket and then
> > later runs on the other socket?
> >
>
> Set these msr bits (needed by turbostat on intel platform) in KVM by
> default.
> Of course, QEMU can also set MSR value by need. It does not conflict.

It doesn't conflict per se, but it's still problematic. By stuffing a default
value, KVM _forces_ userspace to override the MSR to align with the topology and
CPUID defined by userspace. And if userspace uses KVM's "default" CPUID, or lack
thereof, the underlying values from hardware are all but guaranteed to be wrong.

The existing code that sets MSR_PLATFORM_INFO_CPUID_FAULT really should not
exist, i.e. KVM shouldn't assume userspace wants to expose CPUID faulting to the
guest. That particular one probably isn't worth trying to retroactively fix.
Ditto for setting MSR_IA32_ARCH_CAPABILITIES; KVM is overstepping, but doing so
likely doesn't cause problems.

MSR_IA32_PERF_CAPABILITIES is a different story. Setting a non-zero default
value is blatantly wrong, as KVM will advertise vPMU features even if userspace
doesn't advertise.
Aaron is planning on sending a patch for this one (I'm hoping we can
get away with retroactively dropping the code without having to add a quirk).

*If* we need KVM to expose the ratios to userspace, then the correct way to
expose turbo and efficiency ratio information is by implementing support in
kvm_get_msr_feature(), i.e. KVM_GET_MSRS on /dev/kvm. Emphasis on "if", because
I would prefer to do nothing in KVM if that information is already surfaced to
userspace through other mechanisms in the kernel.
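
For illustration only, here is a rough userspace sketch of what consuming such
an interface might look like. It assumes KVM were taught to report
MSR_PLATFORM_INFO as a feature MSR, which per the above would first need to be
implemented in kvm_get_msr_feature(). read_kvm_feature_msr() and the two decode
helpers are hypothetical names, not existing kernel or QEMU code; the bit ranges
are the ones quoted from the patch description.

```c
#include <stdint.h>
#include <stdlib.h>

#if defined(__linux__) && defined(__x86_64__)
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>
#endif

/* Index of MSR_PLATFORM_INFO. */
#define MSR_PLATFORM_INFO 0xceu

/* Bits 15:8, Maximum Non-Turbo Ratio. */
static inline uint8_t platform_info_max_non_turbo_ratio(uint64_t msr)
{
	return (msr >> 8) & 0xff;
}

/* Bits 47:40, Maximum Efficiency Ratio. */
static inline uint8_t platform_info_max_efficiency_ratio(uint64_t msr)
{
	return (msr >> 40) & 0xff;
}

#if defined(__linux__) && defined(__x86_64__)
/*
 * Hypothetical helper: query one feature MSR via the system-scoped
 * KVM_GET_MSRS ioctl on /dev/kvm.  Returns 0 and fills *val on success,
 * -1 if /dev/kvm is unavailable or KVM does not report the MSR.
 */
int read_kvm_feature_msr(uint32_t index, uint64_t *val)
{
	struct kvm_msrs *msrs;
	int fd, ret = -1;

	fd = open("/dev/kvm", O_RDWR);
	if (fd < 0)
		return -1;

	/* One header plus a single entry. */
	msrs = calloc(1, sizeof(*msrs) + sizeof(msrs->entries[0]));
	if (msrs) {
		msrs->nmsrs = 1;
		msrs->entries[0].index = index;
		/* The ioctl returns the number of MSRs successfully read. */
		if (ioctl(fd, KVM_GET_MSRS, msrs) == 1) {
			*val = msrs->entries[0].data;
			ret = 0;
		}
		free(msrs);
	}
	close(fd);
	return ret;
}
#endif
```

With something like this, userspace (not KVM) would decide whether to pass the
value read via read_kvm_feature_msr(MSR_PLATFORM_INFO, &val) on to the guest
with KVM_SET_MSRS, which keeps the policy decision out of the kernel.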