All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Zhao Liu <zhao1.liu@intel.com>
Cc: qemu-devel@nongnu.org, "Thomas Huth" <thuth@redhat.com>,
	devel@lists.libvirt.org,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Peter Krempa" <pkrempa@redhat.com>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any machine
Date: Mon, 13 May 2024 15:39:54 +0100	[thread overview]
Message-ID: <ZkImOkl-HtsFMaAz@redhat.com> (raw)
In-Reply-To: <ZkIiHgw9rQActD2i@intel.com>

On Mon, May 13, 2024 at 10:22:22PM +0800, Zhao Liu wrote:
> Cc Paolo for x86 topology part
> 
> Hi Daniel,
> 
> On Mon, May 13, 2024 at 01:33:57PM +0100, Daniel P. Berrangé wrote:
> > Date: Mon, 13 May 2024 13:33:57 +0100
> > From: "Daniel P. Berrangé" <berrange@redhat.com>
> > Subject: [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any
> >  machine
> > 
> > This effectively reverts
> > 
> >   commit 54c4ea8f3ae614054079395842128a856a73dbf9
> >   Author: Zhao Liu <zhao1.liu@intel.com>
> >   Date:   Sat Mar 9 00:01:37 2024 +0800
> > 
> >     hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations
> > 
> > but is not done as a 'git revert' since the part of the changes to the
> > file hw/core/machine-smp.c which add 'has_XXX' checks remain desirable.
> > Furthermore, we have to tweak the subsequently added unit test to
> > account for differing warning message.
> > 
> > The rationale for the original deprecation was:
> > 
> >   "Currently, it was allowed for users to specify the unsupported
> >    topology parameter as "1". For example, x86 PC machine doesn't
> >    support drawer/book/cluster topology levels, but user could specify
> >    "-smp drawers=1,books=1,clusters=1".
> > 
> >    This is meaningless and confusing, so that the support for this kind
> >    of configurations is marked deprecated since 9.0."
> > 
> > There are varying POVs on the topic of 'unsupported' topology levels.
> > 
> > It is common to say that on a system without hyperthreading, that there
> > is always 1 thread. Likewise when new CPUs introduced a concept of
> > multiple "dies', it was reasonable to say that all historical CPUs
> > before that implicitly had 1 'die'. Likewise for the more recently
> > introduced 'modules' and 'clusters' parameter'. From this POV, it is
> > valid to set 'parameter=1' on the -smp command line for any machine,
> > only a value > 1 is strictly an error condition.
> 
> Currently QEMU has become more and more difficult to maintain a general
> topology hierarchy, there are two recent examples:
> 
> 1. as you mentioned "module" v.s. "cluster", one reason for introducing
> "module" is because it is difficult to define what "cluster" is for x86,
> the cluster in the device tree can be nested, then it can correspond to
> an x86 die, or it can correspond to an x86 module. Therefore, specifying
> "clusters=1" for x86 is ambiguous.
> 
> 2. s390 introduces book and drawer, which are above socket/package
> level, but for x86, the level above the package names "cluster" (yeah,
> "cluster" again :-(). So if user sets "books=1" or "drawers=1" for x86,
> then it's meaningless. Similarly, "clusters=1" is also very confusing for
> x86 machine.
> 
> I think that only thread/core/socket are architecturally general, the
> other topology levels are hard to define across architectures, then
> allowing unsupported topology=1 is always confusing...
> 
> Moreover, QEMU currently requires a clear topology containment
> relationship when defining a topology, after which it will become
> increasingly difficult to define a generic topology containment
> relationship when new topology levels are introduced in the future...

I'm failing to see what real world technical problems QEMU faces
with a parameter being set to '1' by a mgmt app, when QEMU itself
treats all omitted values as being '1' anyway.

If we're trying to faithfully model the real world, then restricting
the topology against machine types though still looks inherantly wrong.
The valid topology ought to be constrained based on the named CPU model.
eg it doesn't make sense to allow 'dies=4' with a Skylake CPU model,
only an EPYC CPU model, especially if we want to model cache info in
a way that matches the real world silicon better.

> > It doesn't cause any functional difficulty for QEMU, because internally
> > the QEMU code is itself assuming that all "unsupported" parameters
> > implicitly have a value of '1'.
> > 
> > At the libvirt level, we've allowed applications to set 'parameter=1'
> > when configuring a guest, and pass that through to QEMU.
> > 
> > Deprecating this creates extra difficulty for because there's no info
> > exposed from QEMU about which machine types "support" which parameters.
> > Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
> > a given machine type, or whether it will trigger deprecation messages.
> 
> I understand that libvirt is having trouble because there is no interface
> to expose which topology levels the current machine supports. As a
> workaround to eliminate the difficulties at the libvirt level, it's
> ok for me.
> 
> But I believe deprecating the unsupported topology is necessary, so do
> you think it's acceptable to include an interface to expose the supported
> topology if it's going to be deprecated again later?

As above, I think that restrictions based on machine type, while nice and
simple, are incorrect long term. If we did impose restrictions based on
CPU model, then we could trivially expose this info to mgmt apps via the
existing mechanism for querying supported CPU models. Limiting based on
CPU model, however, has potentially greater back compat issues, though
it would be strictly more faithful to hardware.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



  reply	other threads:[~2024-05-13 14:40 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-13 12:33 [PATCH 0/2] hw/core: revert deprecation of 'parameter=1' for SMP topology Daniel P. Berrangé
2024-05-13 12:33 ` [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any machine Daniel P. Berrangé
2024-05-13 14:22   ` Zhao Liu
2024-05-13 14:39     ` Daniel P. Berrangé [this message]
2024-05-14  3:49       ` Zhao Liu
2024-05-15 17:06         ` Daniel P. Berrangé
2024-05-16  8:47           ` Zhao Liu
2024-05-16  8:54   ` Zhao Liu
2024-05-13 12:33 ` [PATCH 2/2] tests: add testing of parameter=1 for SMP topology Daniel P. Berrangé
2024-05-16  2:57   ` Xiaoyao Li
2024-05-16  8:59   ` Zhao Liu
2024-05-13 13:53 ` [PATCH 0/2] hw/core: revert deprecation of 'parameter=1' " Ján Tomko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZkImOkl-HtsFMaAz@redhat.com \
    --to=berrange@redhat.com \
    --cc=devel@lists.libvirt.org \
    --cc=eduardo@habkost.net \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=pkrempa@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    --cc=wangyanan55@huawei.com \
    --cc=zhao1.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.