From: Mathieu Desnoyers via lttng-dev <lttng-dev@lists.lttng.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
lttng-dev <lttng-dev@lists.lttng.org>,
KVM list <kvm@vger.kernel.org>, rostedt <rostedt@goodmis.org>
Subject: Re: [lttng-dev] Unexport of kvm_x86_ops vs tracer modules
Date: Mon, 25 Apr 2022 09:00:22 -0400 (EDT) [thread overview]
Message-ID: <358746537.34457.1650891622938.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <1622857974.11247.1649441213797.JavaMail.zimbra@efficios.com>
----- On Apr 8, 2022, at 2:06 PM, Mathieu Desnoyers via lttng-dev lttng-dev@lists.lttng.org wrote:
> ----- On Apr 8, 2022, at 12:24 PM, Paolo Bonzini pbonzini@redhat.com wrote:
>
>> On 4/8/22 17:36, Mathieu Desnoyers wrote:
>>> LTTng is an out of tree kernel module, which currently relies on the export.
>>> Indeed, arch/x86/kvm/x86.c exports a set of tracepoints to kernel modules, e.g.:
>>>
>>> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry)
>>>
>>> But any probe implementation hooking on that tracepoint would need kvm_x86_ops
>>> to translate the struct kvm_vcpu * into meaningful tracing data.
>>>
>>> I could work-around this on my side in ugly ways, but I would like to discuss
>>> how kernel module tracers are expected to implement kvm events probes without
>>> the kvm_x86_ops symbol ?
>>
>> The conversion is done in the TP_fast_assign snippets, which are part of
>> kvm.ko and therefore do not need the export. As I understand it, the
>> issue is that LTTng cannot use the TP_fast_assign snippets, because they
>> are embedded in the trace_event_raw_event_* symbols?
>
> Indeed, the fact that the TP_fast_assign snippets are embedded in the
> trace_event_raw_event_* symbols is an issue for LTTng. This ties those
> to ftrace.
>
> AFAIK, TP_fast_assign copies directly into ftrace ring buffers, and then
> afterwards things like dynamic filters are applied, which then "uncommits" the
> events if need be (and if possible). Also, TP_fast_assign is tied to the
> ftrace ring buffer event layout. The fact that the TP_STRUCT__entry()
> (description)
> and TP_fast_assign() (open-coded C) are separate fields really focuses on a
> use-case where all data is serialized to a ring buffer.
>
> In LTTng, the event fields are made available to a filter interpreter prior to
> being copied into LTTng's ring buffer. This is made possible by implementing
> our own LTTNG_TRACEPOINT_EVENT code generation headers. In addition, we have
> recently released an event notification mechanism (lttng 2.13) which captures
> specific event fields to send with an immediate notification (thus bypassing the
> tracer buffering). We are also currently working on a LTTng trace hit counters
> mechanism, which performs aggregation through per-cpu counters, which doesn't
> even allocate a ring buffer.
>
> For those reasons, LTTng reimplements its own tracepoint probe callbacks. All
> those sit within LTTng kernel modules, which means we currently need the
> exported
> kvm_x86_ops callbacks.
>
>> We cannot do the extraction before calling trace_kvm_exit, because it's
>> expensive.
>
> I suspect that extracting relevant data prior to calling trace_kvm_exit
> is too expensive because it cannot be skipped when the tracepoint is
> disabled. This is because trace_kvm_exit() is a static inline function,
> and the check to figure out if the event is enabled is within that function.
> Unfortunately, even if the tracepoint is disabled, the side-effects of the
> parameters passed to trace_kvm_exit() must happen.
>
> I've solved this in LTTng-UST by implementing a lttng_ust_tracepoint()
> macro, which basically "lifts" the tracepoint enabled check before the
> evaluation of the arguments.
>
> You could achieve something similar by using trace_kvm_exit_enabled() in the
> kernel like so:
>
> if (trace_kvm_exit_enabled())
> trace_kvm_exit(....);
>
> Which would skip evaluation of the argument side-effects when the tracepoint is
> disabled.
>
> By doing that, when multiple tracers are attached to a kvm tracepoint, the
> translation from pointer-to-internal-structure to meaningful fields would only
> need to be done once when a tracepoint is hit. And this would remove the need
> for using kvm_x86_ops callbacks from tracer probe functions.
>
> Thoughts ?
Hi Paolo,
We are at 5.18-rc4 now. Should I expect this unexport to stay in place
for 5.18 final and go ahead with using kallsyms to find this symbol from
lttng-modules instead ?
Thanks,
Mathieu
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
next prev parent reply other threads:[~2022-04-25 13:00 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-08 15:36 [lttng-dev] Unexport of kvm_x86_ops vs tracer modules Mathieu Desnoyers via lttng-dev
2022-04-08 16:24 ` Paolo Bonzini via lttng-dev
2022-04-08 18:06 ` Mathieu Desnoyers via lttng-dev
2022-04-25 13:00 ` Mathieu Desnoyers via lttng-dev [this message]
2022-04-25 13:28 ` Paolo Bonzini via lttng-dev
2022-04-25 14:04 ` Steven Rostedt via lttng-dev
2022-04-25 16:02 ` Mathieu Desnoyers via lttng-dev
2022-04-25 19:13 ` Steven Rostedt via lttng-dev
2022-04-26 6:36 ` Paolo Bonzini via lttng-dev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=358746537.34457.1650891622938.JavaMail.zimbra@efficios.com \
--to=lttng-dev@lists.lttng.org \
--cc=kvm@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=pbonzini@redhat.com \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).