Linux Trace Kernel

* Re: [PATCH v3 1/9] rv/da: introduce DA_MON_ALLOCATION_STRATEGY
From: Gabriele Monaco @ 2026-06-15  9:56 UTC (permalink / raw)
  To: wen.yang; +Cc: Steven Rostedt, linux-trace-kernel, linux-kernel
In-Reply-To: <496394879a590b4d7bafdb2f13618d2e30be982f.1780847473.git.wen.yang@linux.dev>

On Mon, 2026-06-08 at 00:13 +0800, wen.yang@linux.dev wrote:

> +#ifndef DA_MON_ALLOCATION_STRATEGY
> +# define DA_MON_ALLOCATION_STRATEGY DA_ALLOC_AUTO
> +#endif

I'm not sure the space goes there according to kernel coding style, we
don't use it in this file and clang-format removes it. I'd keep
consistency.

...

>  
> +/*
> + * DA_MON_POOL_SIZE must be defined before this header is included

I don't think we need to be this verbose (also, ha_monitor may not even
be included if that's a da_monitor). I would stop at the line above.

> (directly or
> + * transitively via ha_monitor.h) when DA_ALLOC_POOL is selected. 
> In practice
> + * this means defining it after the monitor's model header (which
> supplies the
> + * capacity constant) and before the ha_monitor.h include.
> + */

> +#if DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL &&
> !defined(DA_MON_POOL_SIZE)
> +# error "DA_ALLOC_POOL requires DA_MON_POOL_SIZE to be defined
> before including this header"

Same here I'd keep consistency and remove the space before error.

> +#endif

...

> +/*
> + * Per-object pool state.
> + *
> + * Zero-initialised by default (storage == NULL ⟹ kmalloc mode).  A

Mmh, ⟹  doesn't seem to print that well on my terminal, let's perhaps
use plain old ASCII => . Also I don't find what you put in parentheses
to be adding much value, we could even omit it.

I remember discussing about this so I may have missed your answer, but
why don't we handle this pool as a simple kmem_cache/mempool instead of
implementing a similar logic from scratch?

> monitor
> + * opts into pool mode by defining DA_MON_ALLOCATION_STRATEGY
> DA_ALLOC_POOL
> + * and DA_MON_POOL_SIZE before including this header;
> da_monitor_init() then
> + * pre-allocates the pool internally.
> + *
> + * Because every field is wrapped in this struct and the struct
> itself is a
> + * per-TU static, each monitor that includes this header gets a
> completely
> + * independent pool.  A kmalloc monitor (e.g. nomiss) and a pool
> monitor
> + * (e.g. tlob) therefore coexist without any interference.
> + *
> + * da_pool_return_cb runs from softirq (non-PREEMPT_RT) or rcuc
> kthread
> + * (PREEMPT_RT); spin_lock_irqsave handles both.
> + */
> +struct da_per_obj_pool {
> +	struct da_monitor_storage  *storage;  /* non-NULL ⟹ pool
> mode */
> +	struct da_monitor_storage **free;     /* kmalloc'd pointer
> stack */
> +	unsigned int                free_top;
> +	unsigned int                capacity; /* total number of
> slots */
> +	spinlock_t                  lock;
> +};
> +
> +static struct da_per_obj_pool da_pool = {
> +	.lock = __SPIN_LOCK_UNLOCKED(da_pool.lock),
> +};

...

>  /*
>   * da_destroy_storage - destroy the per-object storage
>   *
> - * The caller is responsible to synchronise writers, either with
> locks or
> - * implicitly. For instance, if da_destroy_storage is called at
> sched_exit and
> - * da_create_storage can never occur after that, it's safe to call
> this without
> - * locks.
> - * This function includes an RCU read-side critical section to
> synchronise
> - * against da_monitor_destroy().
> + * Pool mode: removes from hash and returns the slot via call_rcu().
> + * Kmalloc mode: removes from hash and frees via kfree_rcu().
> + *
> + * Includes an RCU read-side critical section to synchronise against
> + * da_monitor_destroy().
>   */
>  static inline void da_destroy_storage(da_id_type id)
>  {
> @@ -558,7 +670,11 @@ static inline void da_destroy_storage(da_id_type
> id)
>  		return;
>  	da_monitor_reset_hook(&mon_storage->rv.da_mon);
>  	hash_del_rcu(&mon_storage->node);
> +#if DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL
> +	call_rcu(&mon_storage->rcu, da_pool_return_cb);
> +#else

ifdeffery in functions is discouraged as quite unreadable. Since
DA_MON_ALLOCATION_STRATEGY is guaranteed to be defined, simple C ifs are
going to be mostly equivalent (the compiler will cut instead of the
preprocessor, but still).

>  	kfree_rcu(mon_storage, rcu);
> +#endif
>  }

...

> +
> +/*
> + * da_monitor_init - initialise the per-object monitor
> + *
> + * Selects the allocation path at compile time based on
> DA_MON_ALLOCATION_STRATEGY:
> + *   DA_ALLOC_POOL   - pre-allocates DA_MON_POOL_SIZE storage slots.
> + *   DA_ALLOC_AUTO / DA_ALLOC_MANUAL - initialises the hash table
> only.
> + */
>  static inline int da_monitor_init(void)
>  {
>  	hash_init(da_monitor_ht);
> +#if DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL
> +	return __da_monitor_init_pool(DA_MON_POOL_SIZE);
> +#else

Same here, use if()

>  	return 0;
> +#endif
>  }
>  
> -static inline void da_monitor_destroy(void)
> +static inline void da_monitor_destroy_pool(void)
> +{
> +	struct da_monitor_storage *ms;
> +	struct hlist_node *tmp;
> +	int bkt;
> +
> +	/*
> +	 * Ensure all in-flight tracepoint handlers that may hold a
> raw pointer
> +	 * to a pool slot (e.g. tlob_stop_task after its RCU guard
> exits) have
> +	 * completed before we begin tearing down the pool.  Mirrors
> the same
> +	 * call in da_monitor_destroy_kmalloc().
> +	 */
> +	tracepoint_synchronize_unregister();
> +

This is common between the pool and kmalloc flavours, you can leave it
in da_monitor_destroy() before branching.

> +	/*
> +	 * Drain any entries that were not stopped before destroy
> (e.g.
> +	 * uprobe-started sessions whose stop probe never fired). 
> Call
> +	 * da_extra_cleanup() before hash_del_rcu() so the hook may
> safely
> +	 * call ha_cancel_timer_sync() while the monitor is still
> reachable.
> +	 */
> +	hash_for_each_safe(da_monitor_ht, bkt, tmp, ms, node) {
> +		da_extra_cleanup(&ms->rv.da_mon);
> +		hash_del_rcu(&ms->node);
> +		call_rcu(&ms->rcu, da_pool_return_cb);
> +	}

Cannot you make also this common? da_extra_cleanup() should be called in
all flavours and the only difference I see here is the rcu callback.

Also do you really need call_rcu() ? Since we should already have waited
for a grace period, you can probably call the function directly. If not,
I'd still try and make both flavours consistent (sync + free OR call_rcu
+ barrier, not both).

> +
> +	/*
> +	 * rcu_barrier() drains every pending call_rcu() callback,
> including
> +	 * both da_pool_return_cb() and any monitor-specific free
> callbacks
> +	 * (e.g. tlob_free_rcu) enqueued by da_extra_cleanup().
> +	 */
> +	rcu_barrier();
> +	kfree(da_pool.storage);
> +	da_pool.storage = NULL;
> +	kfree(da_pool.free);
> +	da_pool.free = NULL;
> +	da_pool.free_top = 0;
> +	da_pool.capacity = 0;

Only this part is really specific to this allocation flavour, if you
want it in a separate function, go ahead, but the rest should probably
share as much code as possible (especially the cleanup/synchronisation
mess we just worked out).

Thanks,
Gabriele

^ permalink raw reply