public inbox for linux-kernel@vger.kernel.org
  • [parent not found: <20240319185148.985729-3-kyle.meyer@hpe.com>]
  • * Re: [PATCH 0/2] sched/topology: Optimize topology_span_sane()
           [not found] <20240319185148.985729-1-kyle.meyer@hpe.com>
           [not found] ` <20240319185148.985729-2-kyle.meyer@hpe.com>
           [not found] ` <20240319185148.985729-3-kyle.meyer@hpe.com>
    @ 2024-04-09  8:31 ` Valentin Schneider
      2 siblings, 0 replies; 10+ messages in thread
    From: Valentin Schneider @ 2024-04-09  8:31 UTC (permalink / raw)
      To: Kyle Meyer, yury.norov, andriy.shevchenko, linux, mingo, peterz,
    	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
    	mgorman, bristot, linux-kernel
      Cc: russ.anderson, dimitri.sivanich, steve.wahl, Kyle Meyer
    
    On 19/03/24 13:51, Kyle Meyer wrote:
    > A soft lockup is being detected in build_sched_domains() on 32 socket
    > Sapphire Rapids systems with 3840 processors.
    >
    > topology_span_sane(), called by build_sched_domains(), checks that each
    > processor's non-NUMA scheduling domains are completely equal or
    > completely disjoint. If a non-NUMA scheduling domain partially overlaps
    > another, scheduling groups can break.
    >
    > This series adds for_each_cpu_from() as a generic cpumask macro to
    > optimize topology_span_sane() by removing duplicate comparisons. The
    > total number of comparisons is reduced from N * (N - 1) to
    > N * (N - 1) / 2 (per non-NUMA scheduling domain level), decreasing the
    > boot time by approximately 20 seconds and preventing the soft lockup on
    > the mentioned systems.
    >
    > Kyle Meyer (2):
    >   cpumask: Add for_each_cpu_from()
    >   sched/topology: Optimize topology_span_sane()
    
    I somehow never got 2/2, and it doesn't show up on lore.kernel.org
    either. I can see it from Yury's reply and it looks OK to me, but you'll
    have to resend it for maintainers to be able to pick it up.
    
    >
    >  include/linux/cpumask.h | 10 ++++++++++
    >  kernel/sched/topology.c |  6 ++----
    >  2 files changed, 12 insertions(+), 4 deletions(-)
    >
    > -- 
    > 2.44.0
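
To make the comparison count in the quoted cover letter concrete, here is a minimal user-space sketch, not the kernel code. It assumes at most 64 CPUs so that a span fits in one `uint64_t`; `span_t`, `spans_sane()` and `span_comparisons` are hypothetical names invented for this illustration. Starting the inner loop at `cpu + 1` plays the role the series assigns to for_each_cpu_from(): each unordered pair is checked once, giving N * (N - 1) / 2 comparisons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a topology mask: bit i set means CPU i is in the span. */
typedef uint64_t span_t;

/* Pair comparisons made by the last spans_sane() call (illustration only). */
static size_t span_comparisons;

/*
 * Return true if every pair of spans is either identical or disjoint.
 * The inner loop starts at cpu + 1, so the pair (cpu, i) is never
 * revisited as (i, cpu).
 */
static bool spans_sane(const span_t *span, size_t nr_cpus)
{
	span_comparisons = 0;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		for (size_t i = cpu + 1; i < nr_cpus; i++) {
			span_comparisons++;
			/* Not equal, yet sharing a CPU: partial overlap. */
			if (span[cpu] != span[i] && (span[cpu] & span[i]))
				return false;
		}
	}
	return true;
}
```

With four CPUs in two spans of two, the check passes after 4 * 3 / 2 = 6 comparisons; a span that shares only one CPU with a neighbour is rejected.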
    
    
  • * [PATCH 0/2] sched/topology: optimize topology_span_sane()
    @ 2024-08-02 17:57 Yury Norov
      2024-08-02 17:57 ` [PATCH 2/2] " Yury Norov
      0 siblings, 1 reply; 10+ messages in thread
    From: Yury Norov @ 2024-08-02 17:57 UTC (permalink / raw)
      To: linux-kernel
      Cc: Yury Norov, Christophe JAILLET, Leonardo Bras, Ingo Molnar,
    	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
    	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
    
    Pre-compute topology_span_sane() loop params and optimize the
    function to avoid calling cpumask_equal() when masks are the same.
    
    This series follows up comments from here:
    
    https://lore.kernel.org/lkml/ZqqV5OxZPHUgjhag@LeoBras/T/#md6b2b6bdd09e63740bbf010530211842a79b5f57
    
    
    Yury Norov (2):
      sched/topology: pre-compute topology_span_sane() loop params
      sched/topology: optimize topology_span_sane()
    
     kernel/sched/topology.c | 8 ++++++--
     1 file changed, 6 insertions(+), 2 deletions(-)
    
    -- 
    2.43.0
    
    
  • * [PATCH v2 0/2] sched/topology: optimize topology_span_sane()
    @ 2024-08-07 19:05 Yury Norov
      2024-08-07 19:05 ` [PATCH 2/2] " Yury Norov
      0 siblings, 1 reply; 10+ messages in thread
    From: Yury Norov @ 2024-08-07 19:05 UTC (permalink / raw)
      To: linux-kernel
      Cc: Yury Norov, Chen Yu, Christophe JAILLET, Leonardo Bras,
    	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    	Valentin Schneider
    
    The function may call cpumask_equal() with tl->mask(cpu) == tl->mask(i),
    even when cpu != i. In such a case, cpumask_equal() would always return
    true, and we can proceed to the next iteration immediately.
    
    Valentin Schneider comments on this:
    
      PKG can potentially hit that condition, and so can any
      sched_domain_mask_f that relies on the node masks...
      
      I'm thinking ideally we should have checks in place to
      ensure all node_to_cpumask_map[] masks are disjoint,
      then we could entirely skip the levels that use these
      masks in topology_span_sane(), but there's unfortunately
      no nice way to flag them... Also there would be cases
      where there's no real difference between PKG and NODE
      other than NODE is still based on a per-cpu cpumask and
      PKG isn't, so I don't see a nicer way to go about this.
    
    v1: https://lore.kernel.org/lkml/ZrJk00cmVaUIAr4G@yury-ThinkPad/T/
    v2:
     - defer initialization of 'mc' in patch #1 @Chen Yu;
     - more comments from Valentin Schneider.
    
    
    Yury Norov (2):
      sched/topology: pre-compute topology_span_sane() loop params
      sched/topology: optimize topology_span_sane()
    
     kernel/sched/topology.c | 20 ++++++++++++++++++--
     1 file changed, 18 insertions(+), 2 deletions(-)
    
    -- 
    2.43.0
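
The shortcut described in this cover letter can be sketched in user space as follows. This is an illustration under stated assumptions, not the patch itself: `struct mask`, `mask_equal()`, `mask_intersects()`, `spans_sane_ptr()` and `full_compares` are hypothetical names, and the `mask_of[]` array stands in for what tl->mask(cpu) returns. When two CPUs are handed the same mask pointer, the masks are trivially equal, so the word-by-word comparison is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_WORDS 4

/* Hypothetical multi-word CPU mask. */
struct mask {
	uint64_t bits[MASK_WORDS];
};

/* Word-by-word comparisons made by the last spans_sane_ptr() call. */
static size_t full_compares;

static bool mask_equal(const struct mask *a, const struct mask *b)
{
	full_compares++;
	for (size_t w = 0; w < MASK_WORDS; w++)
		if (a->bits[w] != b->bits[w])
			return false;
	return true;
}

static bool mask_intersects(const struct mask *a, const struct mask *b)
{
	for (size_t w = 0; w < MASK_WORDS; w++)
		if (a->bits[w] & b->bits[w])
			return true;
	return false;
}

/* mask_of[cpu] models the pointer returned for that CPU at one level. */
static bool spans_sane_ptr(const struct mask *const *mask_of, size_t nr_cpus)
{
	full_compares = 0;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		const struct mask *mc = mask_of[cpu];

		for (size_t i = cpu + 1; i < nr_cpus; i++) {
			const struct mask *mi = mask_of[i];

			/* Same object: equal by definition, nothing to scan. */
			if (mc == mi)
				continue;
			if (!mask_equal(mc, mi) && mask_intersects(mc, mi))
				return false;
		}
	}
	return true;
}
```

With four CPUs sharing two node-style masks, only the four cross-node pairs reach mask_equal(); the two same-node pairs are skipped by the pointer test.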
    
    

    Thread overview: 10+ messages
         [not found] <20240319185148.985729-1-kyle.meyer@hpe.com>
         [not found] ` <20240319185148.985729-2-kyle.meyer@hpe.com>
    2024-03-20 18:31   ` [PATCH 1/2] cpumask: Add for_each_cpu_from() Yury Norov
    2024-03-20 20:22     ` Yury Norov
         [not found] ` <20240319185148.985729-3-kyle.meyer@hpe.com>
    2024-03-20 18:32   ` [PATCH 2/2] sched/topology: Optimize topology_span_sane() Yury Norov
    2024-04-09  8:31 ` [PATCH 0/2] " Valentin Schneider
    2024-08-02 17:57 [PATCH 0/2] sched/topology: optimize topology_span_sane() Yury Norov
    2024-08-02 17:57 ` [PATCH 2/2] " Yury Norov
    2024-08-06 15:50   ` Valentin Schneider
    2024-08-06 18:00     ` Yury Norov
    2024-08-07 13:53       ` Valentin Schneider
    2024-08-07 16:39         ` Yury Norov
      -- strict thread matches above, loose matches on Subject: below --
    2024-08-07 19:05 [PATCH v2 0/2] " Yury Norov
    2024-08-07 19:05 ` [PATCH 2/2] " Yury Norov
    
