Re: GLibC AMD CPUID cache reporting regression (was Re: qemu-user self emulation broken with default CPU on x86/x64)

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Florian Weimer <fweimer@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Pierrick Bouvier <pierrick.bouvier@linaro.org>,
	 qemu-devel@nongnu.org,
	Richard Henderson <richard.henderson@linaro.org>,
	 laurent@vivier.eu, Sajan Karumanchi <sajan.karumanchi@amd.com>
Subject: Re: GLibC AMD CPUID cache reporting regression (was Re: qemu-user self emulation broken with default CPU on x86/x64)
Date: Tue, 04 Jul 2023 19:37:39 +0200	[thread overview]
Message-ID: <87v8ez4ly4.fsf@oldenburg.str.redhat.com> (raw)
In-Reply-To: <ZKM4LV5UboN7PGni@redhat.com> ("Daniel P. Berrangé"'s message of "Mon, 3 Jul 2023 22:05:49 +0100")

* Daniel P. Berrangé:

> On Mon, Jul 03, 2023 at 06:03:08PM +0200, Pierrick Bouvier wrote:
>> Hi everyone,
>> 
>> Recently (in d135f781 [1], between v7.0.0 and v8.0.0), qemu-user default cpu
>> was updated to "max" instead of qemu32/qemu64.
>> 
>> This change "broke" qemu self emulation if this new default cpu is used.
>> 
>> $ ./qemu-x86_64 ./qemu-x86_64 --version
>> qemu-x86_64: ../util/cacheflush.c:212: init_cache_info: Assertion `(isize &
>> (isize - 1)) == 0' failed.
>> qemu: uncaught target signal 6 (Aborted) - core dumped
>> Aborted
>> 
>> By setting cpu back to qemu64, it works again.
>> $ ./qemu-x86_64 -cpu qemu64 ./qemu-x86_64  --version
>> qemu-x86_64 version 8.0.50 (v8.0.0-2317-ge125b08ed6)
>> Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
>> 
>> Commenting assert does not work, as qemu aligned malloc fail shortly after.
>> 
>> I'm willing to fix it, but I'm not sure what is the issue with "max" cpu
>> exactly. Is it missing CPU cache line, or something else?
>
> I've observed GLibC is issuing CPUID leaf 0x8000_001d
>
> QEMU 'max' CPU model doesn't defnie xlevel, so QEMU makes it default
> to the same as min_xlevel, which is calculated to be 0x8000_000a.
>
> cpu_x86_cpuid() in QEMU sees CPUID leaf 0x8000_001d is above 0x8000_000a,
> and so  considers it an invaild CPUID and thus forces it to report
> 0x0000_000d which is supposedly what an invalid CPUID leaf should do.
>
>
> Net result: glibc is asking for 0x8000_001d, but getting back data
> for 0x0000_000d.
>
> This doesn't end happily for obvious reasons, getting garbage for
> the dcache sizes.
>
>
> The 'qemu64' CPU model also gets CPUID leaf 0x8000_001d capped back
> to 0x0000_000d, but crucially qemu64 lacks the 'xsave' feature bit,
> so QEMU returns all-zeroes for CPUID leaf 0x0000_000d. Still not
> good, but this makes glibc report 0 for DCACHE_*, which in turn
> avoids tripping up the nested qemu which queries DCACHE sysconf.
>
> So the problem is thus more widespread than just 'max' CPU model.
>
> Any QEMU CPU model with vendor=AuthenticAMD and the xsave feature,
> and the xlevel unset, will cause glibc to report garbage for the
> L1D cache info
>
> Any QEMU CPU model with vendor=AuthenticAMD and without the xsave
> feature, and the xlevel unset, will cause glibc to report zeroes
> for L1D cache info
>
> Neither is good, but the latter at least doesn't trip up the
> nested QEMU when it queries L1D cache info.
>
> I'm unsure if QEMU's behaviour is correct with calculating the
> default 'xlevel' values for 'max', but I'm assuming the xlevel
> was correct for Opteron_G4/5 since those are explicitly set
> in the code for along time.

We are tracking this as:

  New AMD cache size computation logic does not work for some CPUs,
  hypervisors
  <https://sourceware.org/bugzilla/show_bug.cgi?id=30428>

I filed it after we resolved the earlier crashes because the data is
clearly not accurate.  I was also able to confirm that impacts more than
just hypervisors.

Sajan posted a first patch:

  [PATCH] x86: Fix for cache computation on AMD legacy cpus.
  <https://sourceware.org/pipermail/libc-alpha/2023-June/148763.html>

However, it changes the reported cache sizes on some older CPUs compared
to what we had before (although the values are no longer zero at least).

Thanks,
Florian

next prev parent reply	other threads:[~2023-07-04 17:38 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-03 16:03 qemu-user self emulation broken with default CPU on x86/x64 Pierrick Bouvier
2023-07-03 18:04 ` Daniel P. Berrangé
2023-07-03 21:05 ` GLibC AMD CPUID cache reporting regression (was Re: qemu-user self emulation broken with default CPU on x86/x64) Daniel P. Berrangé
2023-07-04 17:30   ` Pierrick Bouvier
2023-07-04 17:37   ` Florian Weimer [this message]
2023-07-05 13:08     ` Karumanchi, Sajan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v8ez4ly4.fsf@oldenburg.str.redhat.com \
    --to=fweimer@redhat.com \
    --cc=berrange@redhat.com \
    --cc=laurent@vivier.eu \
    --cc=pierrick.bouvier@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=sajan.karumanchi@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).