qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] trace/simple: seperate hot paths of tracing fucntions
@ 2025-05-19 18:51 Tanish Desai
  2025-05-20 19:01 ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Tanish Desai @ 2025-05-19 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, Tanish Desai

Remove hot paths from .c file and added it in .h file to keep it inline.

Signed-off-by: Tanish Desai <tanishdesai37@gmail.com>
---
 scripts/tracetool/backend/simple.py | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py
index a74d61fcd6..2688d4b64b 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -36,8 +36,17 @@ def generate_h_begin(events, group):
 
 
 def generate_h(event, group):
-    out('    _simple_%(api)s(%(args)s);',
+    event_id = 'TRACE_' + event.name.upper()
+    if "vcpu" in event.properties:
+        # already checked on the generic format code
+        cond = "true"
+    else:
+        cond = "trace_event_get_state(%s)" % event_id
+    out('    if (%(cond)s) {',
+        '        _simple_%(api)s(%(args)s);',
+        '    }',
         api=event.api(),
+        cond=cond,
         args=", ".join(event.args.names()))
 
 
@@ -72,22 +81,10 @@ def generate_c(event, group):
     if len(event.args) == 0:
         sizestr = '0'
 
-    event_id = 'TRACE_' + event.name.upper()
-    if "vcpu" in event.properties:
-        # already checked on the generic format code
-        cond = "true"
-    else:
-        cond = "trace_event_get_state(%s)" % event_id
-
     out('',
-        '    if (!%(cond)s) {',
-        '        return;',
-        '    }',
-        '',
         '    if (trace_record_start(&rec, %(event_obj)s.id, %(size_str)s)) {',
         '        return; /* Trace Buffer Full, Event Dropped ! */',
         '    }',
-        cond=cond,
         event_obj=event.api(event.QEMU_EVENT),
         size_str=sizestr)
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread
* [PATCH] trace/simple: seperate hot paths of tracing fucntions
@ 2025-05-28 19:25 Tanish Desai
  2025-06-02 22:25 ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Tanish Desai @ 2025-05-28 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, stefanha, mads, Tanish Desai

This change improves performance by moving the hot path of the trace_vhost_commit()(or any other trace function) logic to the header file.
Previously, even when the trace event was disabled, the function call chain:-
	trace_vhost_commit()(Or any other trace function) →  _nocheck__trace_vhost_commit() →  _simple_trace_vhost_commit()
	incurred a significant function prologue overhead before checking the trace state.

Disassembly of _simple_trace_vhost_commit() (from the .c file) showed that 11 out of the first 14 instructions were prologue-related, including:
0x10	stp x29, x30, [sp, #-64]!	Prologue: allocates 64-byte frame and saves old FP (x29) & LR (x30)
0x14	adrp x3, trace_events_enabled_count	Prologue: computes page-base of the trace-enable counter
0x18	adrp x2, __stack_chk_guard	Important (maybe prolog don't know?)(stack-protector): starts up the stack-canary load
0x1c	mov x29, sp	Prologue: sets new frame pointer
0x20	ldr x3, [x3]	Prologue: loads the actual trace-enabled count
0x24	stp x19, x20, [sp, #16]	Prologue: spills callee-saved regs used by this function (x19, x20)
0x28	and w20, w0, #0xff	Tracepoint setup: extracts the low-8 bits of arg0 as the “event boolean”
0x2c	ldr x2, [x2]	Prologue (cont’d): completes loading of the stack-canary value
0x30	and w19, w1, #0xff	Tracepoint setup: extracts low-8 bits of arg1
0x34	ldr w0, [x3]	Important: loads the current trace-enabled flag from memory
0x38	ldr x1, [x2]	Prologue (cont’d): reads the canary
0x3c	str x1, [sp, #56]	Prologue (cont’d): writes the canary into the new frame
0x40	mov x1, #0	Prologue (cont’d): zeroes out x1 for the upcoming branch test
0x44	cbnz w0, 0x88	Important: if tracing is disabled (w0==0) skip the heavy path entirely

The trace-enabled check happens after the prologue. This is wasteful when tracing is disabled, which is often the case in production.
To optimize this:
_nocheck__trace_vhost_commit() is now fully inlined in the .h file with
the hot path.It checks trace_event_get_state() before calling into _simple_trace_vhost_commit(), which remains in .c.
This avoids calling into the .c function altogether when the tracepoint is disabled, thereby skipping unnecessary prologue instructions.

This results in better performance by removing redundant instructions in the tracing fast path.

Signed-off-by: Tanish Desai <tanishdesai37@gmail.com>
---
 scripts/tracetool/backend/simple.py | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py
index a74d61fcd6..2688d4b64b 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -36,8 +36,17 @@ def generate_h_begin(events, group):
 
 
 def generate_h(event, group):
-    out('    _simple_%(api)s(%(args)s);',
+    event_id = 'TRACE_' + event.name.upper()
+    if "vcpu" in event.properties:
+        # already checked on the generic format code
+        cond = "true"
+    else:
+        cond = "trace_event_get_state(%s)" % event_id
+    out('    if (%(cond)s) {',
+        '        _simple_%(api)s(%(args)s);',
+        '    }',
         api=event.api(),
+        cond=cond,
         args=", ".join(event.args.names()))
 
 
@@ -72,22 +81,10 @@ def generate_c(event, group):
     if len(event.args) == 0:
         sizestr = '0'
 
-    event_id = 'TRACE_' + event.name.upper()
-    if "vcpu" in event.properties:
-        # already checked on the generic format code
-        cond = "true"
-    else:
-        cond = "trace_event_get_state(%s)" % event_id
-
     out('',
-        '    if (!%(cond)s) {',
-        '        return;',
-        '    }',
-        '',
         '    if (trace_record_start(&rec, %(event_obj)s.id, %(size_str)s)) {',
         '        return; /* Trace Buffer Full, Event Dropped ! */',
         '    }',
-        cond=cond,
         event_obj=event.api(event.QEMU_EVENT),
         size_str=sizestr)
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-06-02 22:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-19 18:51 [PATCH] trace/simple: seperate hot paths of tracing fucntions Tanish Desai
2025-05-20 19:01 ` Stefan Hajnoczi
2025-05-20 20:51   ` Paolo Bonzini
2025-05-21 18:16     ` Stefan Hajnoczi
2025-05-27  2:51       ` Tanish Desai
2025-05-28 18:06         ` Stefan Hajnoczi
2025-05-28 19:30           ` Tanish Desai
  -- strict thread matches above, loose matches on Subject: below --
2025-05-28 19:25 Tanish Desai
2025-06-02 22:25 ` Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).