public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Waiman Long <longman@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] task_work: Consume only item at a time while invoking the callbacks.
Date: Wed, 26 Feb 2025 15:01:15 +0100	[thread overview]
Message-ID: <20250226140114.GE8995@redhat.com> (raw)
In-Reply-To: <20250226131315.GD8995@redhat.com>

On 02/26, Oleg Nesterov wrote:
>

Hmm. empty email? Let me resend.

On 02/25, Frederic Weisbecker wrote:
>
> Le Tue, Feb 25, 2025 at 05:35:50PM +0100, Oleg Nesterov a écrit :
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5304,12 +5304,12 @@ static void perf_pending_task_sync(struct perf_event *event)
> >  		return;
> >  	}
> >
> > -	/*
> > -	 * All accesses related to the event are within the same RCU section in
> > -	 * perf_pending_task(). The RCU grace period before the event is freed
> > -	 * will make sure all those accesses are complete by then.
> > -	 */
> > -	rcuwait_wait_event(&event->pending_work_wait, !event->pending_work, TASK_UNINTERRUPTIBLE);
> > +	spin_lock(XXX_LOCK);
> > +	if (event->pending_work) {
> > +		local_dec(&event->ctx->nr_no_switch_fast);
> > +		event->pending_work = -1;
> > +	}
> > +	spin_unlock(XXX_LOCK);
> >  }
> >
> >  static void _free_event(struct perf_event *event)
> > @@ -5369,7 +5369,15 @@ static void _free_event(struct perf_event *event)
> >  	exclusive_event_destroy(event);
> >  	module_put(event->pmu->module);
> >
> > -	call_rcu(&event->rcu_head, free_event_rcu);
> > +	bool free = true;
> > +	spin_lock(XXX_LOCK)
> > +	if (event->pending_work == -1) {
> > +		event->pending_work = -2;
> > +		free = false;
> > +	}
> > +	spin_unlock(XXX_LOCK);
> > +	if (free)
> > +		call_rcu(&event->rcu_head, free_event_rcu);
> >  }
> >
> >  /*
> > @@ -6981,7 +6989,14 @@ static void perf_pending_task(struct callback_head *head)
> >  {
> >  	struct perf_event *event = container_of(head, struct perf_event, pending_task);
> >  	int rctx;
> > +	bool free = false;
> >
> > +	spin_lock(XXX_LOCK);
> > +	if ((int)event->pending_work < 0) {
> > +		free = event->pending_work == -2u;
> > +		event->pending_work = 0;
> > +		goto unlock;
> > +	}
> >  	/*
> >  	 * All accesses to the event must belong to the same implicit RCU read-side
> >  	 * critical section as the ->pending_work reset. See comment in
> > @@ -7004,6 +7019,12 @@ static void perf_pending_task(struct callback_head *head)
> >
> >  	if (rctx >= 0)
> >  		perf_swevent_put_recursion_context(rctx);
> > +
> > +unlock:
> > +	spin_unlock(XXX_LOCK);
> > +
> > +	if (free)
> > +		call_rcu(&event->rcu_head, free_event_rcu);
> >  }
> >
> >  #ifdef CONFIG_GUEST_PERF_EVENTS
> >
>
> Heh, I suggested something similar also:
> https://lore.kernel.org/lkml/ZyJUzhzHGDu5CLdi@localhost.localdomain/

;)

I can't comment your patch because I don't understand this code enough.

My patch is more simple, it doesn't play with refcount.

perf_pending_task_sync() sets ->pending_work = -1, after that
perf_pending_task() (which can run in parallel on another CPU) will
only clear ->pending_work and do nothing else.

Then _free_event() rechecks ->pending_work before return, if it is still
nonzero then perf_pending_task() is still pending. In this case _free_event()
sets ->pending_work = -2 to offload call_rcu(free_event_rcu) to the pending
perf_pending_task().

But it is certainly more ugly, and perhaps the very idea is wrong. So I will
be happy if we go with your patch.

Either way, IMO we should try to kill this rcuwait_wait_event() logic. See
another email I sent a minute ago in this thread. Quite possibly I missed
something, but the very idea to wait for another task doesn't look safe
to me.

Thanks!

Oleg.


  reply	other threads:[~2025-02-26 14:01 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-21 17:05 [PATCH] task_work: Consume only item at a time while invoking the callbacks Sebastian Andrzej Siewior
2025-02-23 22:40 ` Oleg Nesterov
2025-02-25 14:28   ` Frederic Weisbecker
2025-02-25 16:35     ` Oleg Nesterov
2025-02-25 22:20       ` Frederic Weisbecker
2025-02-26 13:13         ` Oleg Nesterov
2025-02-26 14:01           ` Oleg Nesterov [this message]
2025-02-26 14:42             ` Frederic Weisbecker
2025-02-26 18:36               ` Oleg Nesterov
2025-02-26 14:16   ` Sebastian Andrzej Siewior
2025-02-26 14:29     ` Oleg Nesterov
2025-02-26 14:32       ` Sebastian Andrzej Siewior
2025-02-26 12:50 ` Oleg Nesterov
2025-02-26 13:08   ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250226140114.GE8995@redhat.com \
    --to=oleg@redhat.com \
    --cc=bigeasy@linutronix.de \
    --cc=edumazet@google.com \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox