Date: Tue, 20 Dec 2011 11:09:17 +0100
From: Ingo Molnar
To: Avi Kivity
Cc: Robert Richter, Benjamin Block, Hans Rosenfeld, hpa@zytor.com,
	tglx@linutronix.de, suresh.b.siddha@intel.com, eranian@google.com,
	brgerst@gmail.com, Andreas.Herrmann3@amd.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, Benjamin Block
Subject: Re: [RFC 4/5] x86, perf: implements lwp-perf-integration (rc1)
Message-ID: <20111220100916.GA20788@elte.hu>
In-Reply-To: <4EF05996.8030807@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Avi Kivity wrote:

> On 12/20/2011 11:15 AM, Ingo Molnar wrote:
> >
> > The LWPCB and the LWP ring-buffer are really just an
> > extension of that concept: per task buffers which are ring 3
> > visible.
>
> No, it's worse.  They are ring 3 writeable, and ring 3
> configurable.

Avi, i know that very well.

> > Note that user-space does not actually have to know about
> > any of these LWP addresses (but can access them if it wants
> > to - no strong feelings about that) - in the correctly
> > implemented model it's fully kernel managed.
>
> btw, that means that the intended use case - self-monitoring
> with no kernel support - cannot be done. [...]

Arguably many years ago the hardware was designed for brain-dead
instrumentation abstractions.

Note that as i said user-space *can* access the area if it
thinks it can do it better than the kernel (and we could export
that information in a well defined way - we could do the same
for PEBS as well) - i have no particular strong feelings about
allowing that other than i think it's an obviously inferior
model - *as long* as proper, generic, usable support is added.

From my perspective there's really just one realistic option to
accept this feature: if it's properly fit into existing, modern
instrumentation abstractions. I made that abundantly clear in my
feedback so far.

It can obviously be done, alongside the suggestions i've given.

That was the condition for Intel PEBS/DS/BTS support as well -
which is hardware that has at least as many brain-dead
constraints and roadblocks as LWP.

> > > You could rebuild the LWP block on every context switch I
> > > guess, but you need to prevent access to other cpus' LWP
> > > blocks (since they may be running other processes).  I
> > > think this calls for per-cpu cr3, even for threads in the
> > > same process.
> >
> > Why would we want to rebuild the LWPCB? Just keep one per
> > task and do a lightweight switch to it during switch_to() -
> > like we do it with the PEBS hardware-ring-buffer. It can be
> > in the same single block of memory with the ring-buffer
> > itself. (PEBS has similar characteristics)
>
> If it's in globally visible memory, the user can reprogram the
> LWP from another thread to thrash ordinary VMAs. [...]

User-space can smash it and make it not profile or profile the
wrong thing or into the wrong buffer - but LWP itself runs with
ring3 privileges so it won't do anything the user couldn't do
already.

Lack of protection against self-misconfiguration-damage is a
benign hardware mis-feature - something for LWP v2 to specify i
guess.

But i don't want to reject this feature based on this
mis-feature alone - it's a pretty harmless limitation and the
precise, skid-less profiling that LWP offers is obviously useful.

> [...] It has to be process local (at which point, you can
> just use do_mmap() to allocate it).

get_unmapped_area() + install_special_mapping() is probably
better, but yeah.

Thanks,

	Ingo