From: Feng Tang <feng.tang@intel.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>, <rui.zhang@intel.com>,
	<tim.c.chen@intel.com>,
	Xiongfeng Wang <wangxiongfeng2@huawei.com>,
	Yu Liao <liaoyu15@huawei.com>
Subject: Re: [PATCH] x86/tsc: Extend the watchdog check exemption to 4S/8S machine
Date: Tue, 11 Oct 2022 09:09:12 +0800
Message-ID: <Y0TCOKc7n38341eJ@feng-clx>
In-Reply-To: <aff10f33-b379-6872-f180-b38f6a0a669a@intel.com>

On Mon, Oct 10, 2022 at 07:23:10AM -0700, Dave Hansen wrote:
> On 10/9/22 18:23, Feng Tang wrote:
> >>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> >>> index cafacb2e58cc..b4ea79cb1d1a 100644
> >>> --- a/arch/x86/kernel/tsc.c
> >>> +++ b/arch/x86/kernel/tsc.c
> >>> @@ -1217,7 +1217,7 @@ static void __init check_system_tsc_reliable(void)
> >>>  	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> >>>  	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> >>>  	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
> >>> -	    nr_online_nodes <= 2)
> >>> +	    nr_online_nodes <= 8)
> >> So you're saying all 8 socket systems since Broadwell (?) are TSC
> >> sync'ed ?
> > No, I didn't mean that. I haven't got chance to any 8 sockets
> > machine, and I got a report last month that on one 8S machine,
> > the TSC was judged 'unstable' by HPET as watchdog.
> 
> That's not a great check.  Think about numa=fake=4U, for instance.  Or a
> single-socket system with persistent memory and high bandwidth memory.
> 
> Basically 'nr_online_nodes' is a software construct.  It's going to be
> really hard to infer anything from it about what the _hardware_ is.

You are right! Getting the socket count was indeed a problem when I
worked on commit b50db7095fe0, and it comes down to initialization
order: this TSC check has to run in tsc_init(), while node_states[]
only gets initialized later, in smp_init().

For the cases you mentioned above, I dug out some old logs that show
the init order:

  numa=fake=4 on a SKL desktop
  ================
  [    0.000066] [tsc_early_init()]: nr_online_nodes = 1
  [    0.000068] [tsc_early_init()]: nr_cpu_nodes = 0
  [    0.000070] [tsc_early_init()]: nr_mem_nodes = 0
  [    0.104015] [tsc_init()]: nr_online_nodes = 4
  [    0.104019] [tsc_init()]: nr_cpu_nodes = 0
  [    0.104022] [tsc_init()]: nr_mem_nodes = 4
  [    0.124778] smp: Brought up 4 nodes, 4 CPUs
  [    0.760915] [init_tsc_clocksource()]: nr_online_nodes = 4
  [    0.760919] [init_tsc_clocksource()]: nr_cpu_nodes = 4
  [    0.760922] [init_tsc_clocksource()]: nr_mem_nodes = 4
  
  QEMU with 2 CPU-DRAM nodes + 2 Persistent memory nodes 
  ========================================================
  [    0.066651] [tsc_early_init()]: nr_online_nodes = 1
  [    0.067494] [tsc_early_init()]: nr_cpu_nodes = 0
  [    0.068288] [tsc_early_init()]: nr_mem_nodes = 0
  [    0.677694] [tsc_init()]: nr_online_nodes = 4
  [    0.678862] [tsc_init()]: nr_cpu_nodes = 0
  [    0.679962] [tsc_init()]: nr_mem_nodes = 4
  [    1.139240] [init_tsc_clocksource()]: nr_online_nodes = 4
  [    1.140576] [init_tsc_clocksource()]: nr_cpu_nodes = 2
  [    1.141823] [init_tsc_clocksource()]: nr_mem_nodes = 4
  [    1.660100] [kernel_init()]: nr_online_nodes = 4
  [    1.661234] [kernel_init()]: nr_cpu_nodes = 2
  [    1.662300] [kernel_init()]: nr_mem_nodes = 4

'nr_online_nodes' was chosen in the hope that, in the worst case, the
patch is just a nop and won't wrongly lift the check.
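
To make that reasoning concrete, here is a minimal sketch of only the
node-count part of the check, fed with the values from the logs above.
The function name is hypothetical, the CPU feature bits are assumed to
be present, and it relies on the assumption that the software node
count is never lower than the real socket count:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the node-count condition from the diff.
 * The CONSTANT_TSC / NONSTOP_TSC / TSC_ADJUST feature checks are
 * assumed to pass and are left out.
 */
static bool exempt_from_watchdog(int nr_nodes, int max_nodes)
{
	assert(nr_nodes >= 0 && max_nodes > 0);
	return nr_nodes <= max_nodes;
}
```

With the existing limit of 2, the single-socket SKL desktop booted
with numa=fake=4 reports 4 nodes in tsc_init(), so
exempt_from_watchdog(4, 2) is false and the watchdog check stays in
place. An inflated node count can only make the condition fail more
often, not less.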

One possible solution is to leverage the early SRAT table init, which
runs before tsc_init() and can provide CPU node info. I will try this
approach.
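
A rough user-space sketch of the idea, not the kernel code: walk
SRAT-like affinity entries and count the proximity domains that contain
at least one enabled CPU. The struct layout, the names and MAX_DOMAINS
are all assumptions made for illustration and do not match the real
ACPI structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DOMAINS 64	/* arbitrary bound for this sketch */

/* Simplified stand-in for SRAT CPU and memory affinity entries. */
enum srat_entry_kind { SRAT_ENTRY_CPU, SRAT_ENTRY_MEMORY };

struct srat_entry {
	enum srat_entry_kind kind;
	unsigned int proximity_domain;
	bool enabled;
};

/* Count the distinct proximity domains that have an enabled CPU. */
static int count_cpu_nodes(const struct srat_entry *entries, size_t n)
{
	bool has_cpu[MAX_DOMAINS] = { false };
	int count = 0;

	assert(entries != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct srat_entry *e = &entries[i];

		/* Memory-only entries and disabled CPUs do not count. */
		if (e->kind != SRAT_ENTRY_CPU || !e->enabled)
			continue;
		/* Ignore domains outside the sketch's fixed bound. */
		if (e->proximity_domain >= MAX_DOMAINS)
			continue;
		if (!has_cpu[e->proximity_domain]) {
			has_cpu[e->proximity_domain] = true;
			count++;
		}
	}
	return count;
}
```

For a layout like the QEMU case above (CPUs in domains 0 and 1, memory
in domains 0 to 3), this counts 2 CPU nodes, matching the nr_cpu_nodes
value that otherwise only becomes available after smp_init().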

Thanks,
Feng
