public inbox for linux-kernel@vger.kernel.org
From: Song Liu <songliubraving@fb.com>
To: Stephane Eranian <eranian@google.com>
Cc: open list <linux-kernel@vger.kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@elte.hu" <mingo@elte.hu>,
	"acme@redhat.com" <acme@redhat.com>,
	"jolsa@redhat.com" <jolsa@redhat.com>,
	"kan.liang@intel.com" <kan.liang@intel.com>,
	"irogers@google.com" <irogers@google.com>
Subject: Re: [PATCH] perf/core: fix multiplexing event scheduling issue
Date: Fri, 18 Oct 2019 06:55:14 +0000
Message-ID: <32FCDA83-9888-480A-9A77-AEB37FD004CE@fb.com>
In-Reply-To: <CABPqkBT40-wVWq7K93QJc1r_1=R0WQuoa_SHebWApPEstCWeNg@mail.gmail.com>



> On Oct 17, 2019, at 11:19 PM, Stephane Eranian <eranian@google.com> wrote:
> 
> On Thu, Oct 17, 2019 at 11:13 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Oct 17, 2019, at 5:27 PM, Stephane Eranian <eranian@google.com> wrote:
>>> 
>>> This patch complements the following commit:
>>> 7fa343b7fdc4 ("perf/core: Fix corner case in perf_rotate_context()")
>>> 
>>> The fix from Song addresses the consequences of the problem but
>>> not the cause. This patch fixes the cause and can sit on top of
>>> Song's patch.
>>> 
>>> This patch fixes a scheduling problem in the core functions of
>>> perf_events. Under certain conditions, some events would not be
>>> scheduled even though many counters would be available. This
>>> is related to multiplexing and is architecture agnostic and
>>> PMU agnostic (i.e., core or uncore).
>>> 
>>> This problem can easily be reproduced when you have two perf
>>> stat sessions. The first session does not cause multiplexing,
>>> let's say it is measuring 1 event, E1. While it is measuring,
>>> a second session starts and causes multiplexing. Let's say it
>>> adds 6 events, B1-B6. Now, 7 events compete and are multiplexed.
>>> When the second session terminates, all 6 (B1-B6) events are
>>> removed. Normally, you'd expect the E1 event to continue to run
>>> with no multiplexing. However, the problem is that depending on
>>> the state of E1 when B1-B6 are removed, it may never be scheduled
>>> again. If E1 was inactive at the time of removal, despite the
>>> multiplexing hrtimer still firing, it will not find any active
>>> events and will not try to reschedule. This is what Song's patch
>>> fixes. It forces the multiplexing code to consider non-active events.
>> 
>> Good analysis! I kinda knew the example I had (with pinned event)
>> was not the only way to trigger this issue. But I didn't think
>> about the event remove path.
>> 
> I was pursuing this bug without knowledge of your patch. Your patch
> makes it harder to see. Clearly in my test case, it disappears, but it is
> just because of the multiplexing interval. If we get to the rotate code
> and we have no active events yet some inactive ones, there is something
> wrong because we are wasting counters. When I tracked the bug,
> I started from the remove_context code, then realized there was also
> the disable case. I fixed both and then I discovered your patch, which
> is fixing it at the receiving end. Hopefully, there aren't any other code
> paths that can lead to this situation.

Thanks for the explanation. Agreed that the blind spot has a bigger impact
with a longer rotation interval.
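
For readers following along, the scenario quoted above (E1 running alone,
B1-B6 added and later removed while E1 is inactive) can be sketched as a
toy model. This is not kernel code: the ToyContext class, its method names,
the 4-counter limit, and the two boolean switches are all assumptions made
for illustration. One switch loosely stands in for the rotate code looking
beyond active events, the other for rescheduling on the remove path.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    active: bool = False


class ToyContext:
    """Toy stand-in for an event context with a fixed number of counters."""

    def __init__(self, num_counters, rotate_considers_inactive):
        self.num_counters = num_counters
        self.rotate_considers_inactive = rotate_considers_inactive
        self.events = []  # round-robin order

    def schedule(self):
        # Activate events in list order until the counters run out.
        for i, ev in enumerate(self.events):
            ev.active = i < self.num_counters

    def add(self, name):
        self.events.append(Event(name))
        self.schedule()

    def remove(self, names, resched_on_remove):
        self.events = [e for e in self.events if e.name not in names]
        if resched_on_remove:
            self.schedule()  # hypothetical "fix the cause" switch

    def rotate(self):
        # Multiplexing timer tick.
        nothing_active = not any(e.active for e in self.events)
        if nothing_active and not self.rotate_considers_inactive:
            return  # blind spot: inactive events are never looked at
        if len(self.events) > self.num_counters:
            self.events.append(self.events.pop(0))
        self.schedule()

    def is_active(self, name):
        return any(e.active for e in self.events if e.name == name)


def run_scenario(rotate_considers_inactive, resched_on_remove, ticks=3):
    ctx = ToyContext(4, rotate_considers_inactive)  # 4 counters: assumption
    ctx.add("E1")
    b_names = [f"B{i}" for i in range(1, 7)]
    for name in b_names:
        ctx.add(name)
    ctx.rotate()  # after one rotation E1 has been pushed out
    ctx.remove(b_names, resched_on_remove)
    after_remove = ctx.is_active("E1")
    for _ in range(ticks):
        ctx.rotate()
    return after_remove, ctx.is_active("E1")


if __name__ == "__main__":
    for rot, resched in [(False, False), (True, False), (False, True)]:
        print(rot, resched, run_scenario(rot, resched))
```

In this model, with both switches off E1 stays inactive no matter how many
ticks follow. With only the rotate switch on, E1 comes back at the next
tick, so the idle window scales with the rotation interval. With only the
remove switch on, E1 is rescheduled as soon as B1-B6 go away.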

[...]

>>> Signed-off-by: Stephane Eranian <eranian@google.com>
>> 
>> Maybe add:
>> Fixes: 8d5bce0c37fa ("perf/core: Optimize perf_rotate_context() event scheduling")
>> 
> It does not really fix your patch, I think we can keep it as a double
> precaution. It fixes the cause. I think it is useful to check beyond the
> active events in the rotate code as well.

Also agreed, this is not really fixing that specific commit. 

Acked-by: Song Liu <songliubraving@fb.com>

Thanks,
Song




Thread overview: 12+ messages
2019-10-18  0:27 [PATCH] perf/core: fix multiplexing event scheduling issue Stephane Eranian
2019-10-18  6:13 ` Song Liu
2019-10-18  6:19   ` Stephane Eranian
2019-10-18  6:55     ` Song Liu [this message]
2019-10-21 10:05 ` Peter Zijlstra
2019-10-23  7:06   ` Stephane Eranian
2019-10-23  9:37     ` Peter Zijlstra
2019-10-23 15:29       ` Peter Zijlstra
2019-10-21 10:20 ` Peter Zijlstra
2019-10-23  7:30   ` Stephane Eranian
2019-10-23 11:02     ` Peter Zijlstra
2019-10-23 17:44       ` Stephane Eranian
