From: Peter Zijlstra <peterz@infradead.org>
To: Stephane Eranian <eranian@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
mingo@elte.hu, Arnaldo Carvalho de Melo <acme@redhat.com>,
Jiri Olsa <jolsa@redhat.com>, "Liang, Kan" <kan.liang@intel.com>,
Song Liu <songliubraving@fb.com>, Ian Rogers <irogers@google.com>
Subject: Re: [PATCH] perf/core: fix multiplexing event scheduling issue
Date: Wed, 23 Oct 2019 13:02:34 +0200 [thread overview]
Message-ID: <20191023110234.GS1817@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <CABPqkBQ86=EpsKXMyYgqGdUhSHOi2uCQQfEtupMPavDRiK_30Q@mail.gmail.com>
On Wed, Oct 23, 2019 at 12:30:03AM -0700, Stephane Eranian wrote:
> On Mon, Oct 21, 2019 at 3:21 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Oct 17, 2019 at 05:27:46PM -0700, Stephane Eranian wrote:
> > > This patch complements the following commit:
> > > 7fa343b7fdc4 ("perf/core: Fix corner case in perf_rotate_context()")
> > >
> > > The fix from Song addresses the consequences of the problem but
> > > not the cause. This patch fixes the causes and can sit on top of
> > > Song's patch.
> >
> > I'm tempted to say the other way around.
> >
> > Consider the case where you claim fixed2 with a pinned event and then
> > have another fixed2 in the flexible list. At that point you're _never_
> > going to run any other flexible events (without Song's patch).
> >
> In that case, there is no deactivation or removal of events, so yes, my patch
> will not help that case. I said his patch is still useful. You gave one example,
> even though in this case the rotate will not yield a reschedule of that flexible
> event because fixed2 is used by a pinned event. So checking for it, will not
> really help.
Stick 10 cycle events after the fixed2 flexible event. Without Song's
patch you'll never see those 10 cycle events get scheduled.
> > This patch isn't going to help with that. Similarly, Songs patch helps
> > with your situation where it will allow rotation to resume after you
> > disable/remove all active events (while you still have pending events).
> >
> Yes, it will unblock the case where active events are deactivated or
> removed. But it will delay the unblocking until the next mux timer
> expires. And I am saying this is too far away in many cases. For instance,
> we do not run with the 1ms timer for uncore, this is way too much overhead.
> Imagine this timer is set to 10ms or event 100ms, just with Song's patch, the
> inactive events would have to wait for up to 100ms to be scheduled again.
> This is not acceptable for us.
Then how was it acceptible to mux in the first place? And if
multiplexing wasn't acceptible, then why were you doing it?
> > > However, the cause is not addressed. The kernel should not rely on
> > > the multiplexing hrtimer to unblock inactive events. That timer
> > > can have abitrary duration in the milliseconds. Until the timer
> > > fires, counters are available, but no measurable events are using
> > > them. We do not want to introduce blind spots of arbitrary durations.
> >
> > This I disagree with -- you don't get a guarantee other than
> > timer_period/n when you multiplex, and idling the counters until the
> > next tick doesn't violate that at all.
>
> My take is that if you have free counters and "idling" events, the kernel
> should take every effort to schedule them as soon as they become available.
> In the situation I described in the patch, once I remove the active
> events, there
> is no more reasons for multiplexing, all the counters are free (ignore
> watchdog).
That's fine; all I'm arguing is that the current behaviour doesn't
violate the guarantees given. Now you want to improve counter
utilization (at a cost) and that is fine. Just don't argue that there's
something broken -- there is not.
Your patch also does not fix something more fundamental than Song's
patch did. Quite the reverse. Yours is purely a utilization efficiency
thing, while Song's addressed a correctness issue.
> Now you may be arguing, that it may take more time to ctx_resched() then to
> wait for the timer to expire. But I am not sure I buy that.
I'm not arguing that. All I'm saying is that fairness is not affected.
> Similarly, I am not sure there is code to cancel an active mux hrtimer
> when we clear rotate_necessary. Maybe we just let it lapse and clear
> itself via a ctx_sched_out() in the rotation code.
Yes, we let it lapse and disable itself, I don't see the problem with
that -- also remember that the timer services two contexts.
> > > This patch addresses the cause of the problem, by checking that,
> > > when an event is disabled or removed and the context was multiplexing
> > > events, inactive events gets immediately a chance to be scheduled by
> > > calling ctx_resched(). The rescheduling is done on event of equal
> > > or lower priority types. With that in place, as soon as a counter
> > > is freed, schedulable inactive events may run, thereby eliminating
> > > a blind spot.
> >
> > Disagreed, Song's patch removed the fundamental blind spot of rotation
> > completely failing.
> Sure it removed the infinite blocking of schedulable events. My patch
> addresses the issue of having free counters following a
> deactivation/removal and not scheduling the idling events on them,
> thereby creating a blind spot where no event is monitoring.
If the counters were removed later (like a us before their slice
expired) there would not have been much idle time.
Either way, fairness does not mandate we schedule immediately.
> > This just slightly optimizes counter usage -- at a cost of having to
> > reprogram the counters more often.
> >
> Only on deactivation/removal AND multiplexing so it is not every time but
> only where there is an opportunity to keep the counters busy.
Sure, but it's still non-zero :-)
> > Not saying we shouldn't do this, but this justification is just all
> > sorts of wrong.
>
> I think the patches are not mutually exclusive.
I never said they were. And I'm not opposed to your patch. What I
objected to was the mischaracterization of it in the Changelog.
Present it as an optimization and all should be well.
next prev parent reply other threads:[~2019-10-23 11:02 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-18 0:27 [PATCH] perf/core: fix multiplexing event scheduling issue Stephane Eranian
2019-10-18 6:13 ` Song Liu
2019-10-18 6:19 ` Stephane Eranian
2019-10-18 6:55 ` Song Liu
2019-10-21 10:05 ` Peter Zijlstra
2019-10-23 7:06 ` Stephane Eranian
2019-10-23 9:37 ` Peter Zijlstra
2019-10-23 15:29 ` Peter Zijlstra
2019-10-21 10:20 ` Peter Zijlstra
2019-10-23 7:30 ` Stephane Eranian
2019-10-23 11:02 ` Peter Zijlstra [this message]
2019-10-23 17:44 ` Stephane Eranian
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191023110234.GS1817@hirez.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=acme@redhat.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=jolsa@redhat.com \
--cc=kan.liang@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=songliubraving@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox