From: Wen Yang <wen.yang@linux.dev>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
Nam Cao <namcao@linutronix.de>,
linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH 6/9] rv: Ensure synchronous cleanup for HA monitors
Date: Wed, 27 May 2026 01:27:33 +0800 [thread overview]
Message-ID: <43073c3d-11d3-4ac6-bd8a-03f1644dd05b@linux.dev> (raw)
In-Reply-To: <1d7268da9c74d99c84a83d7039e67a9f0da14d98.camel@redhat.com>
On 5/20/26 19:22, Gabriele Monaco wrote:
> On Wed, 2026-05-20 at 00:48 +0800, Wen Yang wrote:
>> The goal is right. One thing worth double-checking is the load order
>> in the callback against the "SMP BARRIER PAIRING" section of
>> Documentation/memory-barriers.txt, which states:
>>
>
> Yeah I realised that after I sent my answer.. You might have noticed but I
> proposed a version using acquire/release semantics in [1].
>
> I'm waiting to send it all in a V2 for the fixes series.
>
> Are you going to send your patch with tracepoint_synchronize_unregister() in the
> per-task destruction (can be a patch alone)?
>
> If not I'll do it myself and append that too, I prefer to have everything
> together to avoid conflict resolution issues.
>
Thanks for the update.
I did notice your acquire/release version — much appreciated.
For the tracepoint_synchronize_unregister() patch, I would prefer that
you send it yourself, whether as a standalone patch or within your fix
series. That way we can concentrate our efforts on the tlob monitor
without worrying about cross-patch conflicts.
So please go ahead and submit it.
Thanks a lot for handling this!
--
Best wishes,
Wen
>
> [1] -
> https://lore.kernel.org/lkml/02c522f2a09183c9e1a6ff5b0110d0d5cc5e35bd.camel@redhat.com/
>
>> [!] Note that the stores before the write barrier would normally be
>> expected to match the loads after the read barrier or the
>> address-dependency barrier, and vice versa ...
>>
>> So, we should to swap the read order in the callback so that it matches
>> the standard pattern:
>>
>> void __ha_monitor_timer_callback() {
>> guard(rcu)();
>> curr_state = READ_ONCE(ha_mon->da_mon.curr_state); /* B:
>> before rmb */
>> smp_rmb();
>> if (unlikely(!da_monitoring(&ha_mon->da_mon))) /* A:
>> after rmb */
>> return;
>> /*
>> * Reached here: monitoring = 1 (old_A).
>> * Standard wmb/rmb guarantee: curr_state (read before rmb) is also
>> * old, i.e. not initial_state.
>> */
>> ha_react(curr_state, EVENT_NONE, env_string.buffer);
>> ...
>> }
>>
>> void da_monitor_reset() {
>> da_monitor_reset_hook(da_mon);
>> WRITE_ONCE(da_mon->monitoring, 0); /* A: before wmb */
>> smp_wmb();
>> WRITE_ONCE(da_mon->curr_state, model_get_initial_state()); /*
>> B: after wmb */
>> }
>>
>>
>>
>> --
>> Best wishes,
>> Wen
>>
>>>
>>> [1] -
>>> https://lore.kernel.org/lkml/8af5ba4bd93d2acb8a546e8e47ced974a87c1eb8.1778522945.git.wen.yang@linux.dev
>>>
>>>>
>>>>
>>>> --
>>>> Best wishes,
>>>> Wen
>>>>
>>>>
>>>> On 5/12/26 22:02, Gabriele Monaco wrote:
>>>>> HA monitors may start timers, all cleanup functions currently stop the
>>>>> timers asynchronously to avoid sleeping in the wrong context.
>>>>> Nothing makes sure running callbacks terminate on cleanup.
>>>>>
>>>>> Run the entire HA timer callback in an RCU read-side critical section,
>>>>> this way we can simply synchronize_rcu() with any pending timer and are
>>>>> sure any cleanup using kfree_rcu() runs after callbacks terminated.
>>>>> Additionally make sure any unlikely callback running late won't run any
>>>>> code if the monitor is marked as disabled.
>>>>>
>>>>> Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
>>>>> Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
>>>>> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
>>>>> ---
>>>>> include/rv/da_monitor.h | 23 +++++++++++++++++++----
>>>>> include/rv/ha_monitor.h | 18 ++++++++++++++++--
>>>>> 2 files changed, 35 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
>>>>> index a4a13b62d1a4..402d3b935c08 100644
>>>>> --- a/include/rv/da_monitor.h
>>>>> +++ b/include/rv/da_monitor.h
>>>>> @@ -57,6 +57,15 @@ static struct rv_monitor rv_this;
>>>>> #define da_monitor_reset_hook(da_mon)
>>>>> #endif
>>>>>
>>>>> +/*
>>>>> + * Hook to allow the implementation of hybrid automata: define it with
>>>>> a
>>>>> + * function that waits for the termination of all monitors background
>>>>> + * activities (e.g. all timers). This hook can sleep.
>>>>> + */
>>>>> +#ifndef da_monitor_sync_hook
>>>>> +#define da_monitor_sync_hook()
>>>>> +#endif
>>>>> +
>>>>> /*
>>>>> * Type for the target id, default to int but can be overridden.
>>>>> * A long type can work as hash table key (PER_OBJ) but will be
>>>>> downgraded
>>>>> to
>>>>> @@ -179,6 +188,7 @@ static inline int da_monitor_init(void)
>>>>> static inline void da_monitor_destroy(void)
>>>>> {
>>>>> da_monitor_reset_all();
>>>>> + da_monitor_sync_hook();
>>>>> }
>>>>>
>>>>> #ifndef da_implicit_guard
>>>>> @@ -232,6 +242,7 @@ static inline int da_monitor_init(void)
>>>>> static inline void da_monitor_destroy(void)
>>>>> {
>>>>> da_monitor_reset_all();
>>>>> + da_monitor_sync_hook();
>>>>> }
>>>>>
>>>>> #ifndef da_implicit_guard
>>>>> @@ -319,6 +330,7 @@ static inline void da_monitor_destroy(void)
>>>>> }
>>>>>
>>>>> da_monitor_reset_all();
>>>>> + da_monitor_sync_hook();
>>>>>
>>>>> rv_put_task_monitor_slot(task_mon_slot);
>>>>> task_mon_slot = RV_PER_TASK_MONITOR_INIT;
>>>>> @@ -497,10 +509,9 @@ static void da_monitor_reset_all(void)
>>>>> struct da_monitor_storage *mon_storage;
>>>>> int bkt;
>>>>>
>>>>> - rcu_read_lock();
>>>>> + guard(rcu)();
>>>>> hash_for_each_rcu(da_monitor_ht, bkt, mon_storage, node)
>>>>> da_monitor_reset(&mon_storage->rv.da_mon);
>>>>> - rcu_read_unlock();
>>>>> }
>>>>>
>>>>> static inline int da_monitor_init(void)
>>>>> @@ -516,13 +527,17 @@ static inline void da_monitor_destroy(void)
>>>>> int bkt;
>>>>>
>>>>> tracepoint_synchronize_unregister();
>>>>> + scoped_guard(rcu) {
>>>>> + hash_for_each_rcu(da_monitor_ht, bkt, mon_storage,
>>>>> node) {
>>>>> + da_monitor_reset_hook(&mon_storage->rv.da_mon);
>>>>> + }
>>>>> + }
>>>>> + da_monitor_sync_hook();
>>>>> /*
>>>>> * This function is called after all probes are disabled and no
>>>>> longer
>>>>> * pending, we can safely assume no concurrent user.
>>>>> */
>>>>> - synchronize_rcu();
>>>>> hash_for_each_safe(da_monitor_ht, bkt, tmp, mon_storage, node)
>>>>> {
>>>>> - da_monitor_reset_hook(&mon_storage->rv.da_mon);
>>>>> hash_del_rcu(&mon_storage->node);
>>>>> kfree(mon_storage);
>>>>> }
>>>>> diff --git a/include/rv/ha_monitor.h b/include/rv/ha_monitor.h
>>>>> index d59507e8cb30..47ff1a41febe 100644
>>>>> --- a/include/rv/ha_monitor.h
>>>>> +++ b/include/rv/ha_monitor.h
>>>>> @@ -36,6 +36,7 @@ static bool ha_monitor_handle_constraint(struct
>>>>> da_monitor
>>>>> *da_mon,
>>>>> #define da_monitor_event_hook ha_monitor_handle_constraint
>>>>> #define da_monitor_init_hook ha_monitor_init_env
>>>>> #define da_monitor_reset_hook ha_monitor_reset_env
>>>>> +#define da_monitor_sync_hook() synchronize_rcu()
>>>>>
>>>>> #include <rv/da_monitor.h>
>>>>> #include <linux/seq_buf.h>
>>>>> @@ -237,12 +238,25 @@ static bool ha_monitor_handle_constraint(struct
>>>>> da_monitor *da_mon,
>>>>> return false;
>>>>> }
>>>>>
>>>>> +/*
>>>>> + * __ha_monitor_timer_callback - generic callback representation
>>>>> + *
>>>>> + * This callback runs in an RCU read-side critical section to allow the
>>>>> + * destruction sequence to easily synchronize_rcu() with all pending
>>>>> timer
>>>>> + * after asynchronously disabling them.
>>>>> + */
>>>>> static inline void __ha_monitor_timer_callback(struct ha_monitor
>>>>> *ha_mon)
>>>>> {
>>>>> - enum states curr_state = READ_ONCE(ha_mon->da_mon.curr_state);
>>>>> DECLARE_SEQ_BUF(env_string, ENV_BUFFER_SIZE);
>>>>> - u64 time_ns = ha_get_ns();
>>>>> + enum states curr_state;
>>>>> + u64 time_ns;
>>>>> +
>>>>> + if (unlikely(!da_monitor_handling_event(&ha_mon->da_mon)))
>>>>> + return;
>>>>>
>>>>> + guard(rcu)();
>>>>> + curr_state = READ_ONCE(ha_mon->da_mon.curr_state);
>>>>> + time_ns = ha_get_ns();
>>>>> ha_get_env_string(&env_string, ha_mon, time_ns);
>>>>> ha_react(curr_state, EVENT_NONE, env_string.buffer);
>>>>> ha_trace_error_env(ha_mon, model_get_state_name(curr_state),
>>>
>
next prev parent reply other threads:[~2026-05-26 17:28 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-12 14:02 [PATCH 0/9] rv: Fixes on Deterministic and Hybrid Automata Gabriele Monaco
2026-05-12 14:02 ` [PATCH 1/9] rv: Fix __user specifier usage in extract_params() Gabriele Monaco
2026-05-17 8:48 ` Wen Yang
2026-05-12 14:02 ` [PATCH 2/9] rv: Fix read_lock scope in per-task DA cleanup Gabriele Monaco
2026-05-17 8:51 ` Wen Yang
2026-05-12 14:02 ` [PATCH 3/9] rv: Reset per-task DA monitors before releasing the slot Gabriele Monaco
2026-05-17 8:55 ` Wen Yang
2026-05-12 14:02 ` [PATCH 4/9] rv: Prevent task migration while handling per-CPU events Gabriele Monaco
2026-05-17 8:57 ` Wen Yang
2026-05-12 14:02 ` [PATCH 5/9] rv: Ensure all pending probes terminate on per-obj monitor destroy Gabriele Monaco
2026-05-17 9:01 ` Wen Yang
2026-05-12 14:02 ` [PATCH 6/9] rv: Ensure synchronous cleanup for HA monitors Gabriele Monaco
2026-05-17 9:12 ` Wen Yang
2026-05-18 11:54 ` Gabriele Monaco
2026-05-19 9:31 ` Gabriele Monaco
2026-05-19 16:48 ` Wen Yang
2026-05-20 11:22 ` Gabriele Monaco
2026-05-26 17:27 ` Wen Yang [this message]
2026-05-12 14:02 ` [PATCH 7/9] rv: Do not rely on clean monitor when initialising HA Gabriele Monaco
2026-05-17 9:15 ` Wen Yang
2026-05-12 14:02 ` [PATCH 8/9] rv: Add automatic cleanup handlers for per-task HA monitors Gabriele Monaco
2026-05-17 9:40 ` Wen Yang
2026-05-18 12:18 ` Gabriele Monaco
2026-05-12 14:02 ` [PATCH 9/9] rv: Mandate deallocation for per-obj monitors Gabriele Monaco
2026-05-17 9:52 ` Wen Yang
2026-05-18 6:36 ` Gabriele Monaco
2026-05-18 15:40 ` Wen Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43073c3d-11d3-4ac6-bd8a-03f1644dd05b@linux.dev \
--to=wen.yang@linux.dev \
--cc=gmonaco@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=namcao@linutronix.de \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.