From: Alexey Budankov <alexey.budankov@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>, Kan Liang <kan.liang@intel.com>,
Dmitri Prokhorov <Dmitry.Prohorov@intel.com>,
Valery Cherepennikov <valery.cherepennikov@intel.com>,
David Carrillo-Cisneros <davidcc@google.com>,
Stephane Eranian <eranian@google.com>,
Mark Rutland <mark.rutland@arm.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2]: perf/core: addressing 4x slowdown during per-process, profiling of STREAM benchmark on Intel Xeon Phi
Date: Fri, 30 Jun 2017 13:22:09 +0300
Message-ID: <e95f57ed-1ec6-e1b6-80b4-7afa79deabb3@linux.intel.com>
In-Reply-To: <2cc7195d-9354-0a2a-d97f-0bc8dc2545b0@linux.intel.com>
Hi Peter,
On 21.06.2017 18:39, Alexey Budankov wrote:
>
> Hi,
>
> On 15.06.2017 20:42, Alexey Budankov wrote:
>> On 29.05.2017 14:45, Alexey Budankov wrote:
>>> On 29.05.2017 14:23, Peter Zijlstra wrote:
>>>> On Mon, May 29, 2017 at 01:56:05PM +0300, Alexey Budankov wrote:
>>>>> On 29.05.2017 13:43, Peter Zijlstra wrote:
>>>>
>>>>>> Why can't the tree do both?
>>>>>>
>>>>>
>>>>> Well, indeed, the tree provides that capability too. However, switching to
>>>>> full tree iteration in the cases where we now go through the _groups lists
>>>>> will enlarge the patch, which is probably not a big deal. Do you think it is
>>>>> worth implementing the switch?
>>>>
>>>> Do it as a series of patches, where patch 1 introduces the tree, patches
>>>> 2 through n convert the list users into tree users, and patch n+1
>>>> removes the list.
>>>
>>> Well ok, let's do that additionally but please expect delay in delivery (I am OOO till Jun 14).
>>
>> addressed in v3.
>>
>>>
>>>>
>>>> I think it's good to not have duplicate data structures if we can avoid
>>>> it.
>>>>
>>>
>>> yeah, makes sense.
>>>
>>>
>>>
>>
>>
>
> After a straightforward switch from struct list_head to struct rb_root for
> flexible_groups, I now get dmesg dumps reporting rb tree corruption. This
> happens when iterating through the tree instead of the list. No additional
> synchronization for tree access was added. It looks like the implementation
> itself makes some assumptions about the list_head type.
>
> Are there any ideas on why this corruption may happen?
>
> I still suggest isolating event groups into a separate object (please see patch v4-1/4):
>
> struct perf_event_groups {
> 	struct rb_root		tree;
> 	struct list_head	list;
> };
>
> struct perf_event_context {
> 	...
> 	struct perf_event_groups	pinned_groups;
> 	struct perf_event_groups	flexible_groups;
> 	...
> };
>
> and implementing new API for the object:
>
> perf_event_groups_empty()
> perf_event_groups_init()
> perf_event_groups_insert()
> perf_event_groups_delete()
> perf_event_groups_rotate(..., int cpu)
> perf_event_groups_iterate_cpu(..., int cpu)
> perf_event_groups_iterate()
>
> so that perf_event_groups_iterate() would go through the list, leaving
> iteration through the tree for a separate patch, because a complete
> transition to rb trees may incur synchronization overhead at runtime.
The list and tree duplication is completely removed in patch v5 4/4.
Please see:
[PATCH v5 4/4] perf/core: addressing 4x slowdown during per-process
profiling of STREAM benchmark on Intel Xeon Phi
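
For readers without the patch at hand, the idea behind the
perf_event_groups_iterate_cpu(..., int cpu) entry point quoted above can be
illustrated with a small userspace sketch. This is not the code from the
patch. It assumes that groups are kept ordered by CPU so that one CPU's
groups can be visited without walking all the others, and it uses a sorted
array as a stand-in for the kernel rb tree. All demo_* names, the fixed
capacity and the "id" payload are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; none of these names come from the patch. */
#define DEMO_MAX_GROUPS 64

struct demo_group {
	int cpu;	/* CPU the group is bound to, -1 meaning "any CPU" */
	int id;		/* illustrative payload only */
};

struct demo_groups {
	struct demo_group slot[DEMO_MAX_GROUPS];	/* kept sorted by cpu */
	size_t nr;
};

typedef void (*demo_visit_fn)(const struct demo_group *grp, void *data);

static void demo_groups_init(struct demo_groups *g)
{
	assert(g != NULL);
	g->nr = 0;
}

static int demo_groups_empty(const struct demo_groups *g)
{
	return g->nr == 0;
}

/* Insert keeping the array ordered by cpu; equal keys keep insertion order. */
static int demo_groups_insert(struct demo_groups *g, int cpu, int id)
{
	size_t pos = g->nr;

	if (g->nr == DEMO_MAX_GROUPS)
		return -1;
	while (pos > 0 && g->slot[pos - 1].cpu > cpu) {
		g->slot[pos] = g->slot[pos - 1];
		pos--;
	}
	g->slot[pos].cpu = cpu;
	g->slot[pos].id = id;
	g->nr++;
	return 0;
}

/* Binary search: index of the first entry whose cpu is >= the key. */
static size_t demo_lower_bound(const struct demo_groups *g, int cpu)
{
	size_t lo = 0, hi = g->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (g->slot[mid].cpu < cpu)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Visit only the groups bound to @cpu; returns how many were visited. */
static size_t demo_groups_iterate_cpu(const struct demo_groups *g, int cpu,
				      demo_visit_fn fn, void *data)
{
	size_t i, n = 0;

	for (i = demo_lower_bound(g, cpu); i < g->nr && g->slot[i].cpu == cpu; i++) {
		if (fn)
			fn(&g->slot[i], data);
		n++;
	}
	return n;
}

/* Visit every group in cpu order; returns how many were visited. */
static size_t demo_groups_iterate(const struct demo_groups *g,
				  demo_visit_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < g->nr; i++)
		if (fn)
			fn(&g->slot[i], data);
	return g->nr;
}
```

With an ordered container, the per-CPU walk costs a search plus the number
of matching groups, while the full walk still visits everything. Locking and
the rotate/delete operations are left out of the sketch on purpose.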
>
> Thanks,
> Alexey
>
Thanks,
Alexey