From: Feng Tang <feng.tang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Zhang, Rui" <rui.zhang@intel.com>,
	"Chen, Tim C" <tim.c.chen@intel.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"x86@kernel.org" <x86@kernel.org>,
	"paulmck@kernel.org" <paulmck@kernel.org>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [Patch v2 2/2] x86/tsc: use logical_packages as a better estimation of socket numbers
Date: Mon, 19 Jun 2023 18:42:06 +0800
Message-ID: <ZJAw/ipOybjHNfeh@feng-clx>
In-Reply-To: <20230616080231.GZ4253@hirez.programming.kicks-ass.net>

On Fri, Jun 16, 2023 at 10:02:31AM +0200, Peter Zijlstra wrote:
> On Fri, Jun 16, 2023 at 06:53:21AM +0000, Zhang, Rui wrote:
> > On Thu, 2023-06-15 at 11:20 +0200, Peter Zijlstra wrote:
> 
> > > So I have at least two machines where I boot with 'possible_cpus=#'
> > > because the BIOS MADT is reporting a stupid number of CPUs that aren't
> > > actually there.
> > 
> > Does the MADT report those CPUs as disabled but online capable?
> > can you send me a copy of the acpidmp?
> 
> Sent privately, it's a bit big.
> 
> > I had a patch to parse MADT and count the number of physical packages
> > by decoding all the valid APICIDs in MADT.
> > I'm wondering if the patch still works on this machine.
> 
> I can certainly give it a spin; it has IPMI serial-over-ethernet that
> works. Brilliant dev machine.
> 
> > > So I think I'm lucky and side-stepped this nonsense, but if someone were
> > > to use nr_cpus= for this same purpose, they get screwed over and get the
> > > watchdog. Sad day for them I suppose.
> > 
> > what if using package_count_from_MADT?
> 
> So I'm thinking that if you cap possible_mask the actual logical
> packages is the right number.
> 
> Suppose you have a machine with 8 sockets, but limit possible_mask to
> only 1 socket. Then TSC will actually be stable, it doesn't matter you
> have 7 idle sockets that are not synchronized.
> 
> Then again, perhaps if you limit it to 2 sockets you're still in
> trouble, I'm not entirely sure how the TSC sync stuff comes apart on
> these large systems.

I had a similar thought. For this case, the defensive approach is to
keep the watchdog for 'nr_cpus=' and 'possible_cpus=' setups; if a
specific setup has no TSC sync issue, people can add one more parameter,
'tsc=reliable', to skip the watchdog. The aggressive approach is to
ignore the two cmdline parameters, as the above case is really rare.
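
To make the two options concrete, here is a rough user-space sketch of
the decision, not the actual kernel code. The struct, the enum, the
function name and the threshold of 4 packages are all hypothetical and
only serve to illustrate the policy difference; the real check lives in
the x86 TSC code and may look quite different.

```c
#include <stdbool.h>

/* Hypothetical boot-time inputs; none of these names are kernel symbols. */
struct tsc_boot_info {
	bool tsc_features_ok;          /* assumed: all required TSC CPU features present */
	unsigned int logical_packages; /* packages counted among the possible CPUs */
	bool cpus_capped;              /* 'nr_cpus=' or 'possible_cpus=' on the cmdline */
	bool tsc_reliable;             /* 'tsc=reliable' on the cmdline */
};

/* Assumed threshold, for illustration only. */
#define MAX_TRUSTED_PACKAGES 4

enum watchdog_policy { POLICY_DEFENSIVE, POLICY_AGGRESSIVE };

/* Return true if the clocksource watchdog should stay enabled for TSC. */
static bool keep_tsc_watchdog(const struct tsc_boot_info *b,
			      enum watchdog_policy policy)
{
	/* An explicit 'tsc=reliable' always skips the watchdog. */
	if (b->tsc_reliable)
		return false;

	/* Without the required CPU features the TSC is never trusted. */
	if (!b->tsc_features_ok)
		return true;

	/*
	 * Defensive: a capped CPU count may hide packages, so the package
	 * count is not trusted. Aggressive: ignore the two parameters.
	 */
	if (policy == POLICY_DEFENSIVE && b->cpus_capped)
		return true;

	return b->logical_packages > MAX_TRUSTED_PACKAGES;
}
```

With an 8-socket machine capped to one socket, the defensive variant
keeps the watchdog until the user adds 'tsc=reliable', while the
aggressive variant drops it based on the single visible package.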

Again, as you mentioned, I can't find a perfect solution that covers all
kinds of setups and broken firmware. But at least 'logical_packages' is
much better than 'nr_online_nodes' :)

Thanks,
Feng

