From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: linux-kernel@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>,
Andrew Morton <akpm@osdl.org>, Ingo Molnar <mingo@redhat.com>,
Greg Kroah-Hartman <gregkh@suse.de>,
Christoph Hellwig <hch@infradead.org>,
ltt-dev@shafik.org, systemtap@sources.redhat.com,
Douglas Niehaus <niehaus@eecs.ku.edu>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 05/05] Linux Kernel Markers, non optimised architectures
Date: Fri, 12 Jan 2007 12:15:12 -0500
Message-ID: <20070112171512.GB2888@Krystal>
In-Reply-To: <45A71827.6020300@yahoo.com.au>
* Nick Piggin (nickpiggin@yahoo.com.au) wrote:
> Mathieu Desnoyers wrote:
> >* Nick Piggin (nickpiggin@yahoo.com.au) wrote:
> >
> >>Mathieu Desnoyers wrote:
> >>
> >>
> >>>+#define MARK(name, format, args...) \
> >>>+    do { \
> >>>+        static marker_probe_func *__mark_call_##name = \
> >>>+            __mark_empty_function; \
> >>>+        volatile static char __marker_enable_##name = 0; \
> >>>+        static const struct __mark_marker_c __mark_c_##name \
> >>>+            __attribute__((section(".markers.c"))) = \
> >>>+            { #name, &__mark_call_##name, format } ; \
> >>>+        static const struct __mark_marker __mark_##name \
> >>>+            __attribute__((section(".markers"))) = \
> >>>+            { &__mark_c_##name, &__marker_enable_##name } ; \
> >>>+        asm volatile ( "" : : "i" (&__mark_##name)); \
> >>>+        __mark_check_format(format, ## args); \
> >>>+        if (unlikely(__marker_enable_##name)) { \
> >>>+            preempt_disable(); \
> >>>+            (*__mark_call_##name)(format, ## args); \
> >>>+            preempt_enable_no_resched(); \
> >>
> >>Why not just preempt_enable() here?
> >>
> >
> >
> >Because the preempt_enable() macro contains preempt_check_resched(), which
> >may call preempt_schedule() which leads us to a call to schedule().
> >Therefore, all those very interesting scheduler functions would cause an
> >infinite recursive scheduler call if we marked schedule() and used
> >preempt_enable() in the marker.
>
> The vast majority of schedule() has preempt turned off, so that shouldn't
> be a problem, if you provide a comment.
>
> >The primary goal for the markers (and the probes that attaches to them) is
> >to have the fewest side-effects possible : any kernel method called from an
> >instrumentation site adds this precise kernel method to the "cannot be
> >instrumented" list, which I want to keep as small possible.
>
> OK, well one problem is that it can cause a resched event to be lost, so
> you might say it has more side-effects without checking resched.
>
I agree : this is a side-effect I pointed out in my LTTng presentation last
summer at OLS.

Here is a quick overview of the potentially problematic instrumentation points
(i386 example) :

- With the task_rq_lock held (preemption is therefore disabled, so it's not a
  problem) :
    sched.c wait_task_inactive()
    sched.c try_to_wake_up()
    sched.c wake_up_new_task()
    sched.c sched_migrate_task()
    sched.c schedule(), after the prepare_task_switch() call and before the
      context_switch() call.
      Surrounded by preempt_disable(), preempt_enable_no_resched(), should be
      ok.
- IRQs : the irq_enter()/irq_exit() calls in do_IRQ() make sure that the
  preempt_count is incremented. irq_enter() is called with interrupts still
  disabled.
    kernel/irq/handle.c handle_IRQ_event()
- NMIs : nmi_enter() -> irq_enter() -> add_preempt_count(HARDIRQ_OFFSET) is
  called with interrupts still disabled.
  Therefore, preemption is disabled within trace points in do_nmi().
- Traps : GPF, do_trap, do_page_fault, do_debug, spurious_interrupt,
  math_emulate.
  It is not uncommon for these trap handlers to reenable interrupts very soon.
  They do not increment the preemption count.
  Therefore, preemption must be expected when these handlers run : we cannot
  rely on the fact that hard IRQs would be disabled to prevent the scheduler
  from running, as markers would become a new source of scheduler events.
- local_irq_enable()/local_irq_disable() :
  They can call trace_hardirqs_on()/trace_hardirqs_off(). These macros are
  sprinkled in _every_ possible context cited above, from trap handlers to
  preemptible code.

Other contexts or code locations are not a problem (process context, softirq).
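
The difference between the IRQ/NMI cases and the trap case above can be
sketched with a small user-space model. This is only an illustration under
stated assumptions : every model_* name is hypothetical, the bit position used
for the hardirq count is assumed rather than taken from the patch, and the
"preemptible" test is reduced to "count is zero and interrupts are enabled".

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout, for illustration only: hardirq nesting starts at bit 16. */
#define MODEL_HARDIRQ_OFFSET (1u << 16)

static unsigned int model_preempt_count;  /* stand-in for preempt_count() */
static bool model_irqs_disabled;          /* stand-in for irqs_disabled() */

/* IRQ and NMI entry raise the count while interrupts are still disabled. */
static void model_irq_enter(void)
{
    model_preempt_count += MODEL_HARDIRQ_OFFSET;
}

static void model_irq_exit(void)
{
    assert(model_preempt_count >= MODEL_HARDIRQ_OFFSET);
    model_preempt_count -= MODEL_HARDIRQ_OFFSET;
}

/* Simplified rule: preemption needs a zero count and enabled interrupts. */
static bool model_preemptible(void)
{
    return model_preempt_count == 0 && !model_irqs_disabled;
}

/* Hardirq handler that reenables interrupts: the raised count still
 * keeps the scheduler out. */
static bool model_irq_handler_preemptible(void)
{
    bool result;

    model_irqs_disabled = true;   /* entered with interrupts disabled */
    model_irq_enter();
    model_irqs_disabled = false;  /* handler reenables interrupts */
    result = model_preemptible();
    model_irq_exit();
    return result;
}

/* Trap handler: no count increment, so reenabling interrupts is enough
 * to make the code preemptible. */
static bool model_trap_handler_preemptible(void)
{
    model_irqs_disabled = true;   /* entered with interrupts disabled */
    model_irqs_disabled = false;  /* handler reenables interrupts very soon */
    return model_preemptible();
}
```

In this model, model_irq_handler_preemptible() returns false and
model_trap_handler_preemptible() returns true, which is the reason the trap
handlers are the contexts where preemption has to be expected.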

If we are sure that calls to preempt_schedule() can be expected from each of
these contexts, then it's ok to use preempt_enable(). It is important to note
that a marker would then act as a source of scheduler events in code paths
where disabling interrupts is expected to disable the scheduler.
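
The trade-off discussed in this thread can be sketched in user space as
follows. This is a simplified model, not the kernel implementation : all
model_* names are hypothetical, the resched flag stands in for
TIF_NEED_RESCHED, and the irqs-disabled check is left out. It only shows that
the preempt_enable() variant turns the marker into a scheduling point, while
the preempt_enable_no_resched() variant leaves the resched request pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the model. */
struct model_cpu {
    int preempt_count;   /* > 0 means preemption is disabled */
    bool need_resched;   /* a reschedule has been requested */
    int schedule_calls;  /* number of times the model scheduler ran */
};

static void model_schedule(struct model_cpu *cpu)
{
    cpu->schedule_calls++;
    cpu->need_resched = false;
}

static void model_preempt_disable(struct model_cpu *cpu)
{
    cpu->preempt_count++;
}

/* Drops the count and never looks at the resched flag. */
static void model_preempt_enable_no_resched(struct model_cpu *cpu)
{
    assert(cpu->preempt_count > 0);
    cpu->preempt_count--;
}

/* Drops the count, then reschedules if a request is pending and
 * preemption is fully enabled again. */
static void model_preempt_enable(struct model_cpu *cpu)
{
    model_preempt_enable_no_resched(cpu);
    if (cpu->preempt_count == 0 && cpu->need_resched)
        model_schedule(cpu);
}

/* A marker site during which a resched request arrives, for example
 * from a wakeup raised while the probe runs. */
static void model_marker(struct model_cpu *cpu, bool check_resched)
{
    model_preempt_disable(cpu);
    cpu->need_resched = true;  /* request arrives while the probe runs */
    if (check_resched)
        model_preempt_enable(cpu);
    else
        model_preempt_enable_no_resched(cpu);
}
```

Calling model_marker(&cpu, true) on a zeroed struct model_cpu runs the model
scheduler once from inside the marker. Calling model_marker(&cpu, false)
leaves need_resched set with no scheduler call, which corresponds to the
delayed or lost resched event mentioned above.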
Mathieu
--
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68