public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Waiman Long <longman@redhat.com>
Subject: Re: [PATCH v4 5/5] timers: Exclude isolated cpus from timer migation
Date: Tue, 6 May 2025 18:00:05 +0200	[thread overview]
Message-ID: <aBoyBbDDslnly60w@localhost.localdomain> (raw)
In-Reply-To: <20250506091534.42117-12-gmonaco@redhat.com>

Le Tue, May 06, 2025 at 11:15:40AM +0200, Gabriele Monaco a écrit :
> @@ -1472,15 +1473,24 @@ static int tmigr_cpu_unavailable(unsigned int cpu)
>  
>  static int tmigr_cpu_available(unsigned int cpu)
>  {
> -	struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu);
> +	struct tmigr_cpu *tmc = per_cpu_ptr(&tmigr_cpu, cpu);
>  
>  	/* Check whether CPU data was successfully initialized */
>  	if (WARN_ON_ONCE(!tmc->tmgroup))
>  		return -EINVAL;
>  
> +	/*
> +	 * Isolated CPUs don't participate in timer migration.
> +	 * Checking here guarantees that CPUs isolated at boot (e.g. isolcpus)
> +	 * are not marked as available when they first become online.
> +	 * During runtime, any offline isolated CPU is also not incorrectly
> +	 * marked as available once it gets back online.
> +	 */
> +	if (cpu_is_isolated(cpu))

I would like nohz_full to remain an exception here. It already handles
well (even better than domain isolated CPUs) global timers by behaving like idle
CPUs. Because when the tick is stopped on nohz_full, the global timers are
then handled by housekeeping. We are doing something different with domain
isolated CPUs because those must still handle their own global timers.

So please keep nohz_full CPUs inside the tree (that includes CPUs that are
_both_ nohz_full and domain isolated).


> +		return 0;
>  	raw_spin_lock_irq(&tmc->lock);
>  	trace_tmigr_cpu_available(tmc);
> -	tmc->idle = timer_base_is_idle();
> +	tmc->idle = timer_base_remote_is_idle(cpu);

This is racy:

      CPU 0                                 CPU 1
      -----                                 -----

tick_nohz_stop_tick()
    timer_base_try_to_set_idle()
        __get_next_timer_interrupt()
            tmigr_cpu_deactivate()
                                          tmigr_isolated_exclude_cpumask()
                                              tmigr_cpu_available()
                                                  tmc->idle = timer_base_remote_is_idle(cpu);
                                                  if (!tmc->idle)
                                                      __tmigr_cpu_activate(tmc);
            base_global->is_idle = true;
            

CPU 0 can now become the migrator even when it's idle sleeping forever.

My suggestion is to not rely remotely on is_idle. This can only
be racy. You can trigger tmigr_cpu_available() through smp_call_function_many()
instead.

Thanks.


>  	if (!tmc->idle)
>  		__tmigr_cpu_activate(tmc);
>  	tmc->available = true;
> @@ -1489,6 +1499,21 @@ static int tmigr_cpu_available(unsigned int cpu)
>  	return 0;
>  }
>  
> +void tmigr_isolated_exclude_cpumask(cpumask_var_t exclude_cpumask)
> +{
> +	int cpu;
> +
> +	lockdep_assert_cpus_held();
> +
> +	for_each_cpu_and(cpu, exclude_cpumask, tmigr_available_cpumask)
> +		tmigr_cpu_unavailable(cpu);
> +
> +	for_each_cpu_andnot(cpu, cpu_online_mask, exclude_cpumask) {
> +		if (!cpumask_test_cpu(cpu, tmigr_available_cpumask))
> +			tmigr_cpu_available(cpu);
> +	}
> +}
> +
>  static void tmigr_init_group(struct tmigr_group *group, unsigned int lvl,
>  			     int node)
>  {
> -- 
> 2.49.0
> 

-- 
Frederic Weisbecker
SUSE Labs

      reply	other threads:[~2025-05-06 16:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-06  9:15 [PATCH v4 0/5] timers: Exclude isolated cpus from timer migation Gabriele Monaco
2025-05-06  9:15 ` [PATCH v4 1/5] timers: Rename tmigr 'online' bit to 'available' Gabriele Monaco
2025-05-06  9:15 ` [PATCH v4 2/5] timers: Add the available mask in timer migration Gabriele Monaco
2025-05-06 16:07   ` Frederic Weisbecker
2025-05-07  7:57     ` Gabriele Monaco
2025-05-07 12:25       ` Frederic Weisbecker
2025-05-07 12:46         ` Gabriele Monaco
2025-05-07 13:40           ` Frederic Weisbecker
2025-05-07 13:54             ` Gabriele Monaco
2025-05-07 14:27               ` Frederic Weisbecker
2025-05-06  9:15 ` [PATCH v4 3/5] cgroup/cpuset: Rename update_unbound_workqueue_cpumask to update_exclusion_cpumasks Gabriele Monaco
2025-05-06  9:15 ` [PATCH v4 4/5] timers: Add timer_base_remote_is_idle to query from remote cpus Gabriele Monaco
2025-05-06  9:15 ` [PATCH v4 5/5] timers: Exclude isolated cpus from timer migation Gabriele Monaco
2025-05-06 16:00   ` Frederic Weisbecker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBoyBbDDslnly60w@localhost.localdomain \
    --to=frederic@kernel.org \
    --cc=gmonaco@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox