public inbox for linux-kernel@vger.kernel.org
From: Feng Tang <feng.tang@intel.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>, <rui.zhang@intel.com>,
	<tim.c.chen@intel.com>,
	Xiongfeng Wang <wangxiongfeng2@huawei.com>,
	Yu Liao <liaoyu15@huawei.com>
Subject: Re: [PATCH] x86/tsc: Extend the watchdog check exemption to 4S/8S machine
Date: Tue, 11 Oct 2022 09:09:12 +0800
Message-ID: <Y0TCOKc7n38341eJ@feng-clx>
In-Reply-To: <aff10f33-b379-6872-f180-b38f6a0a669a@intel.com>

On Mon, Oct 10, 2022 at 07:23:10AM -0700, Dave Hansen wrote:
> On 10/9/22 18:23, Feng Tang wrote:
> >>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> >>> index cafacb2e58cc..b4ea79cb1d1a 100644
> >>> --- a/arch/x86/kernel/tsc.c
> >>> +++ b/arch/x86/kernel/tsc.c
> >>> @@ -1217,7 +1217,7 @@ static void __init check_system_tsc_reliable(void)
> >>>  	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> >>>  	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> >>>  	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
> >>> -	    nr_online_nodes <= 2)
> >>> +	    nr_online_nodes <= 8)
> >> So you're saying all 8 socket systems since Broadwell (?) are TSC
> >> sync'ed ?
> > No, I didn't mean that. I haven't got chance to any 8 sockets
> > machine, and I got a report last month that on one 8S machine,
> > the TSC was judged 'unstable' by HPET as watchdog.
> 
> That's not a great check.  Think about numa=fake=4U, for instance.  Or a
> single-socket system with persistent memory and high bandwidth memory.
> 
> Basically 'nr_online_nodes' is a software construct.  It's going to be
> really hard to infer anything from it about what the _hardware_ is.

You are right! Getting the socket count was indeed a problem when I
worked on commit b50db7095fe0, and it comes down to initialization
order: this TSC check needs to be done in tsc_init(), while
node_states[] is only initialized by the later call to smp_init().

For the cases you mentioned above, I dug out some old logs that show
the init order:

  numa=fake=4 on a SKL desktop
  ================
  [    0.000066] [tsc_early_init()]: nr_online_nodes = 1
  [    0.000068] [tsc_early_init()]: nr_cpu_nodes = 0
  [    0.000070] [tsc_early_init()]: nr_mem_nodes = 0
  [    0.104015] [tsc_init()]: nr_online_nodes = 4
  [    0.104019] [tsc_init()]: nr_cpu_nodes = 0
  [    0.104022] [tsc_init()]: nr_mem_nodes = 4
  [    0.124778] smp: Brought up 4 nodes, 4 CPUs
  [    0.760915] [init_tsc_clocksource()]: nr_online_nodes = 4
  [    0.760919] [init_tsc_clocksource()]: nr_cpu_nodes = 4
  [    0.760922] [init_tsc_clocksource()]: nr_mem_nodes = 4
  
  QEMU with 2 CPU-DRAM nodes + 2 Persistent memory nodes 
  ========================================================
  [    0.066651] [tsc_early_init()]: nr_online_nodes = 1
  [    0.067494] [tsc_early_init()]: nr_cpu_nodes = 0
  [    0.068288] [tsc_early_init()]: nr_mem_nodes = 0
  [    0.677694] [tsc_init()]: nr_online_nodes = 4
  [    0.678862] [tsc_init()]: nr_cpu_nodes = 0
  [    0.679962] [tsc_init()]: nr_mem_nodes = 4
  [    1.139240] [init_tsc_clocksource()]: nr_online_nodes = 4
  [    1.140576] [init_tsc_clocksource()]: nr_cpu_nodes = 2
  [    1.141823] [init_tsc_clocksource()]: nr_mem_nodes = 4
  [    1.660100] [kernel_init()]: nr_online_nodes = 4
  [    1.661234] [kernel_init()]: nr_cpu_nodes = 2
  [    1.662300] [kernel_init()]: nr_mem_nodes = 4

'nr_online_nodes' was chosen in the hope that, in the worst case, the
patch is just a nop and won't wrongly lift the check.
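
To make that reasoning concrete, here is a minimal userspace sketch,
not the kernel code. It models the gate from the quoted diff as a pure
function so the two thresholds can be compared on the node counts seen
in the logs above. 'struct tsc_caps', tsc_watchdog_exempt() and the
'max_nodes' parameter are hypothetical names used only for
illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three boot_cpu_has() checks in the diff. */
struct tsc_caps {
	bool constant_tsc;
	bool nonstop_tsc;
	bool tsc_adjust;
};

/*
 * Model of the gate: the watchdog exemption is granted only when all
 * three features are present and the node count is within max_nodes.
 */
static bool tsc_watchdog_exempt(const struct tsc_caps *caps,
				unsigned int nr_online_nodes,
				unsigned int max_nodes)
{
	return caps->constant_tsc && caps->nonstop_tsc &&
	       caps->tsc_adjust && nr_online_nodes <= max_nodes;
}
```

With the node count of 4 reported in both logs, a threshold of 2 keeps
the watchdog check in place (the nop case), while a threshold of 8
grants the exemption even though neither machine has 4 sockets. That
is the gap between node count and socket count discussed above.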

One possible solution is to leverage the early SRAT table init, which
is called before tsc_init() and can provide CPU node info. I will try
this approach.
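
A rough userspace sketch of the idea, not kernel code and not the
kernel's own SRAT parser: walk the SRAT affinity subtables and count
the distinct proximity domains that contain at least one enabled CPU.
The field offsets follow the ACPI layout for the Processor Local
APIC/SAPIC (type 0) and x2APIC (type 2) affinity structures as I
understand them. srat_count_cpu_nodes(), read_le32() and MAX_DOMAINS
are hypothetical names, and the buffer is assumed to start at the
first subtable, after the SRAT table header.

```c
#include <stddef.h>
#include <stdint.h>

#define SRAT_TYPE_CPU_AFFINITY		0	/* Local APIC/SAPIC affinity */
#define SRAT_TYPE_X2APIC_AFFINITY	2	/* Local x2APIC affinity */
#define SRAT_CPU_ENABLED		0x1u	/* bit 0 of the flags field */
#define MAX_DOMAINS			64	/* arbitrary cap for this sketch */

static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Return the number of distinct proximity domains with at least one
 * enabled CPU entry in buf[0..len), or -1 on a malformed subtable.
 * Domains beyond MAX_DOMAINS are silently ignored.
 */
static int srat_count_cpu_nodes(const uint8_t *buf, size_t len)
{
	uint32_t seen[MAX_DOMAINS];
	int nr_seen = 0;
	size_t off = 0;

	while (off < len) {
		const uint8_t *e = buf + off;
		uint32_t domain, flags;
		uint8_t sublen;
		int i;

		if (len - off < 2)
			return -1;
		sublen = e[1];
		if (sublen < 2 || sublen > len - off)
			return -1;
		off += sublen;

		switch (e[0]) {
		case SRAT_TYPE_CPU_AFFINITY:
			if (sublen < 16)
				return -1;
			/* domain bits [7:0] at byte 2, bits [31:8] at bytes 9..11 */
			domain = (uint32_t)e[2] | (uint32_t)e[9] << 8 |
				 (uint32_t)e[10] << 16 | (uint32_t)e[11] << 24;
			flags = read_le32(e + 4);
			break;
		case SRAT_TYPE_X2APIC_AFFINITY:
			if (sublen < 24)
				return -1;
			domain = read_le32(e + 4);
			flags = read_le32(e + 12);
			break;
		default:
			continue;	/* memory affinity etc.: no CPU info */
		}

		if (!(flags & SRAT_CPU_ENABLED))
			continue;

		/* Record the domain if it has not been seen before. */
		for (i = 0; i < nr_seen; i++)
			if (seen[i] == domain)
				break;
		if (i == nr_seen && nr_seen < MAX_DOMAINS)
			seen[nr_seen++] = domain;
	}

	return nr_seen;
}
```

Memory-only nodes, such as the persistent memory nodes in the QEMU log
above, only have memory affinity entries and so do not add to the
count.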

Thanks,
Feng



