linux-perf-users.vger.kernel.org archive mirror
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Sandipan Das <sandipan.das@amd.com>, dongli.zhang@oracle.com
Cc: linux-perf-users@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org
Subject: Re: Small question about reserved bits in MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR
Date: Tue, 17 Sep 2024 08:54:04 -0400
Message-ID: <36f601823359ed6d694d42c6c79e11a0403b0da3.camel@redhat.com>
In-Reply-To: <04a91009-c160-4920-a5d0-81a8e1e7cf97@amd.com>

On Tue, 2024-09-17 at 11:22 +0530, Sandipan Das wrote:
> On 9/17/2024 2:11 AM, dongli.zhang@oracle.com wrote:
> > On 9/16/24 11:54 AM, Maxim Levitsky wrote:
> > > Hi!
> > > 
> > > We recently saw a failure in one of the AWS VM instances that causes the following error during guest boot:
> > > 
> > > [    0.480051] unchecked MSR access error: WRMSR to 0xc0000302 (tried to write 0x040000000000001f) at rIP: 0xffffffff96c093e2 (amd_pmu_cpu_reset.constprop.0+0x42/0x80)
> > > 
> > > 
> > > I investigated the issue and I see that the hypervisor exposes PerfMonV2, but not LBRv2:
> > > 
> > > #  cpuid -1 -l 0x80000022 
> > > CPU:
> > >    Extended Performance Monitoring and Debugging (0x80000022):
> > >       AMD performance monitoring V2         = true
> > >       AMD LBR V2                            = false
> > >       AMD LBR stack & PMC freezing          = false
> > >       number of core perf ctrs              = 0x5 (5)
> > >       number of LBR stack entries           = 0x0 (0)
> > >       number of avail Northbridge perf ctrs = 0x0 (0)
> > >       number of available UMC PMCs          = 0x0 (0)
> > >       active UMCs bitmask                   = 0x0
> > > 
> 
> That's expected. LBRv2 is currently not available to KVM guests. However, PerfMonV2 should be the
> only feature bit required to indicate the availability of MSRs 0xc0000300..0xc0000303.
> 
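For reference, a minimal sketch of how the CPUID leaf 0x80000022 fields quoted above could be decoded. The bit positions (EAX bit 0 = PerfMonV2, bit 1 = LBRv2, bit 2 = LBR and PMC freeze; EBX[3:0] = core counters, EBX[9:4] = LBR stack entries, EBX[15:10] = Northbridge counters) follow the layout the Linux kernel uses, as far as I know. The raw register values are hypothetical, chosen only to match the decoded cpuid output from the AWS guest.

```python
def decode_leaf_80000022(eax, ebx):
    """Decode the PerfMonV2-related fields of CPUID leaf 0x80000022."""
    return {
        "perfmon_v2": bool(eax & (1 << 0)),       # global control/status MSRs present
        "lbr_v2": bool(eax & (1 << 1)),           # LBRv2 support
        "lbr_pmc_freeze": bool(eax & (1 << 2)),   # LBR stack and PMC freezing
        "num_core_pmc": ebx & 0xF,                # EBX[3:0]
        "lbr_stack_entries": (ebx >> 4) & 0x3F,   # EBX[9:4]
        "num_nb_pmc": (ebx >> 10) & 0x3F,         # EBX[15:10]
    }

# Hypothetical raw values consistent with the AWS guest output (5 core counters).
info = decode_leaf_80000022(eax=0x1, ebx=0x5)
counter_mask = (1 << info["num_core_pmc"]) - 1
print(info, hex(counter_mask))
```

With five core counters the overflow-bit mask comes out as 0x1f, which matches the low bits of the value the guest kernel tried to write.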
> > > I also verified that I can write 0x1f to 0xc0000302 but not 0x040000000000001f:
> > > 
> > > # wrmsr 0xc0000302 0x1f
> > > # wrmsr 0xc0000302 0x040000000000001f
> > > wrmsr: CPU 0 cannot set MSR 0xc0000302 to 0x040000000000001f
> > > #
> > > 
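To make the two test values easier to compare, here is a small sketch that splits a value written to 0xc0000302 into its parts. It assumes that bit 58 is the LBR-freeze status bit (GLOBAL_STATUS_LBRS_FROZEN in the Linux sources) and that the low bits are the per-counter overflow bits; the helper name is made up for illustration.

```python
# Assumption: bit 58 is PerfCntrGlobalStatus.LbrFreeze.
LBRS_FROZEN_BIT = 58

def decode_status_clr(value, num_counters):
    """Split a GLOBAL_STATUS_CLR write value into counter bits, LBR freeze, and the rest."""
    counter_mask = (1 << num_counters) - 1
    lbr_bit = 1 << LBRS_FROZEN_BIT
    return {
        "counter_bits": value & counter_mask,
        "lbrs_frozen": bool(value & lbr_bit),
        "other_bits": value & ~counter_mask & ~lbr_bit,
    }

# The value from the boot-time error, on a guest with 5 core counters.
decoded = decode_status_clr(0x040000000000001F, 5)
print(decoded)
```

Under these assumptions the only difference between the accepted and the rejected write is bit 58, which belongs to a feature (LBRv2) the guest does not advertise.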
> > > AMD's APM is not clear on what should happen when software attempts to clear unsupported bits
> > > using this MSR.
> > > 
> > > Also I noticed that amd_pmu_v2_handle_irq writes 0xffffffffffffffff to this MSR.
> > > It has the following code:
> > > 
> > > 
> > > 	WARN_ON(status > 0);
> > > 
> > > 	/* Clear overflow and freeze bits */
> > > 	amd_pmu_ack_global_status(~status);
> > > 
> > > 
> > > This implies that it is OK to set all bits in this MSR.
> > > 
> 
> It is, but writes to the reserved bits are ignored.
> 
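The two behaviours seen in this thread can be sketched as a toy model of a write-1-to-clear register. This is not hypervisor or kernel code: the class, its names, and the `strict` flag are hypothetical, and the supported mask is just the five counter bits from the AWS guest. The lenient mode mirrors "writes to the reserved bits are ignored"; the strict mode mirrors a hypervisor that rejects any write touching bits it does not support.

```python
MASK64 = (1 << 64) - 1

class StatusClrModel:
    """Toy model of a global status register with a write-1-to-clear MSR."""

    def __init__(self, supported_mask, strict=False):
        self.supported_mask = supported_mask
        self.strict = strict
        self.status = supported_mask  # pretend every supported bit is set

    def write_clr(self, value):
        """Return True if the write is accepted, False if it would fault."""
        value &= MASK64
        if self.strict and value & ~self.supported_mask:
            return False  # unsupported bit set: reject the whole write
        self.status &= ~(value & self.supported_mask)  # reserved bits ignored
        return True

lenient = StatusClrModel(supported_mask=0x1F)
strict = StatusClrModel(supported_mask=0x1F, strict=True)

print(lenient.write_clr(MASK64), hex(lenient.status))
print(strict.write_clr(0x040000000000001F), hex(strict.status))
print(strict.write_clr(0x1F), hex(strict.status))
```

In the lenient model the all-ones write from amd_pmu_v2_handle_irq is harmless, while the strict model refuses the value written by amd_pmu_cpu_reset and still accepts plain 0x1f, matching the wrmsr results above.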
> > To share my data point on QEMU+KVM: I am not able to reproduce with the most
> > recent QEMU (not AWS) + below patch.
> > 
> > [PATCH v2 2/4] i386/cpu: Add PerfMonV2 feature bit
> > https://lore.kernel.org/all/69905b486218f8287b9703d1a9001175d04c2f02.1723068946.git.babu.moger@amd.com/
> > 
> > Both my VM and KVM are 6.10.
> > 
> > vm# cpuid -1 -l 0x80000022
> > CPU:
> >    Extended Performance Monitoring and Debugging (0x80000022):
> >       AMD performance monitoring V2         = true
> >       AMD LBR V2                            = false
> >       AMD LBR stack & PMC freezing          = false
> >       number of core perf ctrs              = 0x6 (6)
> >       number of LBR stack entries           = 0x0 (0)
> >       number of avail Northbridge perf ctrs = 0x0 (0)
> >       number of available UMC PMCs          = 0x0 (0)
> >       active UMCs bitmask                   = 0x0
> > 
> > 
> > Both writes succeed.
> > 
> > vm# wrmsr 0xc0000302 0x1f
> > vm# wrmsr 0xc0000302 0x040000000000001f
> > 
> > Here is the bcc output. Both writes return 0.
> > 
> > kvm# /usr/share/bcc/tools/trace -t -C 'kvm_pmu_set_msr "%x", retval'
> > ... ...
> > 4.748614 19  43545   43550   CPU 0/KVM       kvm_pmu_set_msr  0
> > 10.97396 19  43545   43550   CPU 0/KVM       kvm_pmu_set_msr  0
> > 
> 
> Thanks for testing. I cannot replicate this either with an upstream kernel.


Hi,

I also tested on a bare-metal Zen4 system just now, and there too MSR 0xc0000302 can be set
to any value.

So this is a hypervisor bug. I'll report it to AWS.

Best regards,
	Maxim Levitsky

> 
> - Sandipan
> 



Thread overview: 5+ messages
2024-09-16 18:54 Small question about reserved bits in MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR Maxim Levitsky
2024-09-16 20:41 ` dongli.zhang
2024-09-17  5:52   ` Sandipan Das
2024-09-17 12:54     ` Maxim Levitsky [this message]
2024-11-11 16:57       ` Josh Triplett
