From: Anna-Maria Behnsen <anna-maria@linutronix.de>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	John Stultz <jstultz@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Eric Dumazet <edumazet@google.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Arjan van de Ven <arjan@infradead.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Rik van Riel <riel@surriel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Giovanni Gherdovich <ggherdovich@suse.cz>,
	Lukasz Luba <lukasz.luba@arm.com>,
	"Gautham R . Shenoy" <gautham.shenoy@amd.com>,
	Srinivas Pandruvada <srinivas.pandruvada@intel.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>
Subject: Re: [PATCH v10 18/20] timers: Implement the hierarchical pull model
Date: Tue, 06 Feb 2024 12:36:55 +0100
Message-ID: <87eddp3ju0.fsf@somnus>
In-Reply-To: <ZcAQZ8-gbEYaQflu@pavilion.home>

Frederic Weisbecker <frederic@kernel.org> writes:

> On Mon, Jan 15, 2024 at 03:37:41PM +0100, Anna-Maria Behnsen wrote:
>> +/*
>> + * Returns true, if there is nothing to be propagated to the next level
>> + *
>> + * @data->firstexp is set to expiry of first global event of the (top level of
>> + * the) hierarchy, but only when hierarchy is completely idle.
>> + *
>> + * This is the only place where the group event expiry value is set.
>> + */
>> +static
>> +bool tmigr_update_events(struct tmigr_group *group, struct tmigr_group *child,
>> +			 struct tmigr_walk *data, union tmigr_state childstate,
>> +			 union tmigr_state groupstate)
>> +{
>> +	struct tmigr_event *evt, *first_childevt;
>> +	bool walk_done, remote = data->remote;
>> +	bool leftmost_change = false;
>> +	u64 nextexp;
>> +
>> +	if (child) {
>> +		raw_spin_lock(&child->lock);
>> +		raw_spin_lock_nested(&group->lock, SINGLE_DEPTH_NESTING);
>> +
>> +		if (childstate.active) {
>> +			walk_done = true;
>> +			goto unlock;
>> +		}
>> +
>> +		first_childevt = tmigr_next_groupevt(child);
>> +		nextexp = child->next_expiry;
>> +		evt = &child->groupevt;
>> +	} else {
>> +		nextexp = data->nextexp;
>> +
>> +		first_childevt = evt = data->evt;
>> +
>> +		/*
>> +		 * Walking the hierarchy is required in any case when a
>> +		 * remote expiry was done before. This ensures to not lose
>> +		 * already queued events in non active groups (see section
>> +		 * "Required event and timerqueue update after a remote
>> +		 * expiry" in the documentation at the top).
>> +		 *
>> +		 * The two call sites which are executed without a remote expiry
>> +		 * before, are not prevented from propagating changes through
>> +		 * the hierarchy by the return:
>> +		 *  - When entering this path by tmigr_new_timer(), @evt->ignore
>> +		 *    is never set.
>> +		 *  - tmigr_inactive_up() takes care of the propagation by
>> +		 *    itself and ignores the return value. But an immediate
>> +		 *    return is required because nothing has to be done in this
>> +		 *    level as the event could be ignored.
>> +		 */
>> +		if (evt->ignore && !remote)
>> +			return true;
>> +
>> +		raw_spin_lock(&group->lock);
>> +	}
>> +
>> +	if (nextexp == KTIME_MAX) {
>> +		evt->ignore = true;
>> +
>> +		/*
>> +		 * When the next child event could be ignored (nextexp is
>> +		 * KTIME_MAX) and there was no remote timer handling before or
>> +		 * the group is already active, there is no need to walk the
>> +		 * hierarchy even if there is a parent group.
>> +		 *
>> +		 * The other way round: even if the event could be ignored, but
>> +		 * if a remote timer handling was executed before and the group
>> +		 * is not active, walking the hierarchy is required to not miss
>> +		 * an enqueued timer in the non active group. The enqueued timer
>> +		 * of the group needs to be propagated to a higher level to
>> +		 * ensure it is handled.
>> +		 */
>> +		if (!remote || groupstate.active) {
>> +			walk_done = true;
>> +			goto unlock;
>
> So if the current tmc going inactive was the migrator for the whole hierarchy
> and it is reaching here the top-level, this assumes that if none of this tmc's
> groups have a timer, then it can just return. But what if the top level has
> timers from other children? Who is going to handle them then?
>
> Should this be "goto check_toplvl" instead?
>

Simply replacing this goto will not work. We could then end up with
'data->firstexp' set even when we do not want it (when remote is
not set).
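
To illustrate, here is a minimal user-space sketch of just that early
exit. It is not the kernel code: struct model_walk, model_early_exit(),
model_firstexp_after() and the MODEL_* names are hypothetical stand-ins,
group_expiry stands in for tmigr_next_groupevt_expires(group), and the
queue handling is left out entirely. It assumes the jump happens with
walk_done already set to true and leftmost_change still false, as in the
quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's KTIME_MAX, used here as "no expiry set". */
#define MODEL_KTIME_MAX ((uint64_t)INT64_MAX)

/* Hypothetical reduction of struct tmigr_walk to the two fields involved. */
struct model_walk {
	uint64_t firstexp;
	bool remote;
};

/* Which label the early exit in the nextexp == KTIME_MAX branch jumps to. */
enum model_target { MODEL_GOTO_UNLOCK, MODEL_GOTO_CHECK_TOPLVL };

/*
 * Models only the early exit of the quoted function. group_expiry stands in
 * for tmigr_next_groupevt_expires(group).
 */
static void model_early_exit(struct model_walk *data, bool group_active,
			     bool migrator_none, uint64_t group_expiry,
			     enum model_target target)
{
	bool walk_done = true;		/* set right before the goto */
	bool leftmost_change = false;	/* queue was not touched on this path */

	/* Precondition of the early exit in the quoted code. */
	assert(!data->remote || group_active);

	if (target == MODEL_GOTO_UNLOCK)
		return;

	/* check_toplvl: */
	if (walk_done && migrator_none) {
		if (data->remote && !leftmost_change)
			return;
		data->firstexp = group_expiry;
	}
}

/* Convenience wrapper: returns firstexp after one non-remote early exit. */
static uint64_t model_firstexp_after(enum model_target target)
{
	struct model_walk data = { .firstexp = MODEL_KTIME_MAX, .remote = false };

	model_early_exit(&data, false, true, 1000, target);
	return data.firstexp;
}
```

In this model, model_firstexp_after(MODEL_GOTO_UNLOCK) leaves firstexp at
MODEL_KTIME_MAX, while model_firstexp_after(MODEL_GOTO_CHECK_TOPLVL)
overwrites it with the group expiry although remote is not set. Because
walk_done is forced to true before the jump, the check would also fire in
a group that is not the top level.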

There is another issue here. When the event could be ignored but is
propagated anyway, e.g. because of remote timer handling, the timerqueue
dance is still done. It's not a big problem (the ignore flag is set and
the event is removed from the queue when the timer queue is revisited),
but it's obviously more work than required.

Thanks

>> +		}
>> +	} else {
>> +		/*
>> +		 * An update of @evt->cpu and @evt->ignore flag is required only
>> +		 * when @child is set (the child is equal or higher than lvl0),
>> +		 * but it doesn't matter if it is written once more to the per
>> +		 * CPU event; make the update unconditional.
>> +		 */
>> +		evt->cpu = first_childevt->cpu;
>> +		evt->ignore = false;
>> +	}
>> +
>> +	walk_done = !group->parent;
>> +
>> +	/*
>> +	 * If the child event is already queued in the group, remove it from the
>> +	 * queue when the expiry time changed only.
>> +	 */
>> +	if (timerqueue_node_queued(&evt->nextevt)) {
>> +		if (evt->nextevt.expires == nextexp)
>> +			goto check_toplvl;
>> +
>> +		leftmost_change = timerqueue_getnext(&group->events) == &evt->nextevt;
>> +		if (!timerqueue_del(&group->events, &evt->nextevt))
>> +			WRITE_ONCE(group->next_expiry, KTIME_MAX);
>> +	}
>> +
>> +	evt->nextevt.expires = nextexp;
>> +
>> +	if (timerqueue_add(&group->events, &evt->nextevt)) {
>> +		leftmost_change = true;
>> +		WRITE_ONCE(group->next_expiry, nextexp);
>> +	}
>> +
>> +check_toplvl:
>> +	if (walk_done && (groupstate.migrator == TMIGR_NONE)) {
>> +		/*
>> +		 * Nothing to do when first event didn't change and update was
>> +		 * done during remote timer handling.
>> +		 */
>> +		if (remote && !leftmost_change)
>> +			goto unlock;
>> +		/*
>> +		 * The top level group is idle and it has to be ensured the
>> +		 * global timers are handled in time. (This could be optimized
>> +		 * by keeping track of the last global scheduled event and only
>> +		 * arming it on the CPU if the new event is earlier. Not sure if
>> +		 * it's worth the complexity.)
>> +		 */
>> +		data->firstexp = tmigr_next_groupevt_expires(group);
>> +	}
>> +
>> +unlock:
>> +	raw_spin_unlock(&group->lock);
>> +
>> +	if (child)
>> +		raw_spin_unlock(&child->lock);
>> +
>> +	return walk_done;
>> +}
