linux-kernel.vger.kernel.org archive mirror
From: Joerg Roedel <joro@8bytes.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Hans Rosenfeld" <hans.rosenfeld@amd.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Robert Richter" <robert.richter@amd.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>
Subject: Re: [RFC v3 0/8] x86, xsave: rework of extended state handling, LWP support
Date: Wed, 18 May 2011 10:16:53 +0200
Message-ID: <20110518081653.GA23407@8bytes.org>
In-Reply-To: <20110517113020.GA13475@elte.hu>

Hi Ingo,

thanks for your thoughts on this. I have some comments below.

On Tue, May 17, 2011 at 01:30:20PM +0200, Ingo Molnar wrote:

> - Where is the hardware interrupt that signals the ring-buffer-full condition
>   exposed to user-space and how can user-space wait for ring buffer events?
>   AFAICS this needs to set the LWP_CFG MSR and needs an irq handler, which 
>   needs kernel side support - but that is not included in these patches.
> 
>   The way we solved this with Intel's BTS (and PEBS) feature is that there's
>   a per task hardware buffer that is coupled with the event ring buffer, so
>   both setup and 'waiting' for the ring-buffer happens automatically and
>   transparently because tools can already wait on the ring-buffer.
> 
>   Considerable effort went into that model on the Intel side before we merged
>   it and i see no reason why an AMD hw-tracing feature should not have this 
>   too...
> 
>   [ If that is implemented we can expose LWP to user-space as well (which can
>     choose to utilize it directly and buffer into its own memory area without 
>     irqs and using polling, but i'd generally discourage such crude event 
>     collection methods). ]

If I understand this correctly, you suggest propagating the LWP events
through perf into user-space. This is certainly good because it provides
a unified interface, but it somewhat eliminates the 'lightweight' part
of LWP: the samples need to be read by the kernel from user-space
memory (the LWP ring-buffer needs to be in user-space memory), converted
to perf samples, and copied back to user-space. The benefit is the
unified interface, but the 'lightweight' and low-impact part vanishes to
some degree.

Also, LWP is somewhat different from the old-style PMU. LWP is designed
for self-monitoring of applications that want to optimize themselves at
runtime, like JIT compilers (Java, LLVM, ...) or databases. For those
applications it would be good to keep LWP as lightweight as possible.

The missing support for interrupts is certainly a problem here, which
significantly limits the usefulness of the feature for now. My idea was
to expose the interrupt event through perf to user-space so that the
application can wait on that event to read out the LWP ring-buffer.
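
To make the self-monitoring case a bit more concrete, here is a minimal
sketch of an application draining its own sample ring-buffer. The record
layout, the struct names, and the head/tail bookkeeping are made up for
illustration and are not the actual LWP format. A real consumer would
also need proper memory ordering when reading the producer index, which
this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, for illustration only; not the LWP format. */
struct sample {
	uint64_t ip;      /* instruction pointer of the sampled event */
	uint32_t event;   /* event id */
	uint32_t flags;
};

/* Hypothetical ring-buffer bookkeeping kept by the application. */
struct ring {
	struct sample *buf;  /* storage for 'size' records */
	size_t size;         /* number of slots, must be non-zero */
	size_t head;         /* next slot the producer writes */
	size_t tail;         /* next slot the consumer reads */
};

/*
 * Drain all pending records and hand each one to 'fn'.
 * Returns the number of records consumed. The application reads its
 * own memory directly, so no copy through the kernel is involved.
 */
static size_t ring_drain(struct ring *r,
			 void (*fn)(const struct sample *, void *), void *arg)
{
	size_t n = 0;

	assert(r->size > 0);
	while (r->tail != r->head) {
		fn(&r->buf[r->tail], arg);
		r->tail = (r->tail + 1) % r->size;  /* wrap around at the end */
		n++;
	}
	return n;
}

/* Example callback: count the records that were seen. */
static void count_sample(const struct sample *s, void *arg)
{
	(void)s;
	(*(size_t *)arg)++;
}
```

The application would call ring_drain() whenever it is notified (or when
it polls) and pass a callback that feeds the samples to its optimizer.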

But to come back to your idea, it could probably be done in a way that
enables profiling of other applications using LWP. The kernel needs to
allocate the LWP ring-buffer and set up LWP itself. The problem is that
the buffer needs to be user-accessible, so the question is where to map
it:

	a) On the kernel-part of the address space. Problematic because
	   every process can read the buffer of other tasks. So this is
	   a no-go from a security point-of-view.

	b) Change the address space layout in a compatible way to allow
	   the kernel to map it (e.g. make a small part of the
	   kernel-address space per-process). Somewhat intrusive to
	   current x86 code, also not sure this feature is worth it.

	c) Some way to let userspace set up such a buffer and give the
	   address to the kernel, or we mmap it directly into user
	   address space. But that may cause other problems with
	   applications that have strict requirements for their
	   address-space layout.
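
For the first half of option c), the user-space side could be as simple
as an anonymous mapping. The sketch below only shows that allocation
step; the helper names are made up, and how the address would then be
handed to the kernel is left open because no such interface exists in
these patches.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Allocate a private, zero-filled buffer that could serve as a
 * per-process sample buffer. Returns NULL on failure.
 * Handing the address to the kernel is deliberately not shown.
 */
static void *alloc_sample_buffer(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Release a buffer obtained from alloc_sample_buffer(). */
static int free_sample_buffer(void *p, size_t len)
{
	return munmap(p, len);
}
```

Because the application picks the mapping itself here, it keeps control
over its address-space layout, which is the concern with the kernel
mapping the buffer on its own.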

The bottom line is that we need a good and secure way to set up a
user-accessible per-process buffer in the kernel. If we have that, we can
use LWP to monitor other applications (unless the application decides to
use LWP on its own).

I like the idea, but we should also make sure that we don't prevent the
low-impact self-monitoring use-case for applications that want it.

> - LWP is exposed indiscriminately, without giving user-space a chance to 
>   disable it on a per task basis. Security-conscious apps would want to disable
>   access to the LWP instructions - which are all ring 3 and unprivileged! We
>   already allow this for the TSC for example. Right now sandboxed code like
>   seccomp would get access to LWP as well - not good. Some intelligent
>   (optional) control is needed, probably using cr0's lwp-enabled bit.

That could certainly be done, but it requires an xcr0 write at
context-switch. JFI, how can the TSC be disabled for a task from
userspace?
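
My guess is that this refers to the PR_SET_TSC prctl, which exists on
x86 Linux. Assuming that is what you mean, a task would switch it
roughly like this (the wrapper names are made up for illustration):

```c
#include <assert.h>
#include <sys/prctl.h>

/*
 * Query the TSC mode of the calling task.
 * Returns PR_TSC_ENABLE or PR_TSC_SIGSEGV, or -1 if the prctl is not
 * available (for example on non-x86 kernels).
 */
static int tsc_get_mode(void)
{
	int mode = 0;

	if (prctl(PR_GET_TSC, &mode) != 0)
		return -1;
	return mode;
}

/* Make RDTSC raise SIGSEGV in the calling task. Returns 0 on success. */
static int tsc_disable(void)
{
	return prctl(PR_SET_TSC, PR_TSC_SIGSEGV);
}

/* Allow RDTSC again in the calling task. Returns 0 on success. */
static int tsc_enable(void)
{
	return prctl(PR_SET_TSC, PR_TSC_ENABLE);
}
```

A comparable per-task switch for LWP would be the analogous interface,
with the difference that the state to flip lives in xcr0.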

Regards,

	Joerg


