From: Oleg Nesterov <oleg@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>, Jason Baron <jbaron@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
Date: Fri, 30 Mar 2012 22:15:50 +0200 [thread overview]
Message-ID: <20120330201550.GA16628@redhat.com> (raw)
In-Reply-To: <1333134131.23924.191.camel@gandalf.stny.rr.com>
On 03/30, Steven Rostedt wrote:
>
> On Fri, 2012-03-30 at 20:31 +0200, Oleg Nesterov wrote:
> > Hello.
> >
> > I've looked at syscall_regfunc/unregfunc by accident, and I am
> > a bit confused...
> >
> > void syscall_regfunc(void)
> > {
> > unsigned long flags;
> > struct task_struct *g, *t;
> >
> > if (!sys_tracepoint_refcount) {
> > read_lock_irqsave(&tasklist_lock, flags);
> >
> > Why _irqsave? write_lock(tasklist) needs to disable irqs, but read_
> > doesn't. Any subtle reason I missed?
>
> As long as an interrupt doesn't take the tasklist lock as write,
No, this is forbidden.
> > do_each_thread(g, t) {
> > /* Skip kernel threads. */
> > if (t->mm)
> >
> > We should check PF_KTHREAD, not ->mm.
>
> A lot of places test ->mm for kernel threads.
And this is wrong, use_mm() can set ->mm != NULL. This is the common
mistake.
> Just search for ->mm in
> kernel/sched/core.c
Probably normalize_rt_tasks() and __sched_setscheduler() should be fixed.
> > Don't we need something like the patch below?
> >
> > Oleg.
> >
> >
> > --- x/kernel/fork.c
> > +++ x/kernel/fork.c
> > @@ -1446,7 +1446,12 @@ static struct task_struct *copy_process(
> >
> > total_forks++;
> > spin_unlock(¤t->sighand->siglock);
> > +#ifdef CONFIG_TRACEPOINTS
> > + if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> > + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
>
> I'm not so worried about the set (although that should be done) but it
> is entirely possible that we need a clear too. Leaving this flag set
> would cause a task to take the overhead of tracing syscalls without ever
> tracing them.
Agreed. OK, I'll send the patch with "else clear".
But I don't really understand why do you think that "clear" is more
important. Sure, the wrong TIF_SYSCALL_TRACEPOINT triggers the slow
path unnecessary, but this is more or less harmless. But if we do
not set the task obviously won't report trace_sys*, this looks like
a bug even if nothing bad can happen.
Oleg.
next prev parent reply other threads:[~2012-03-30 20:23 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-30 18:31 syscall_regfunc() && TIF_SYSCALL_TRACEPOINT Oleg Nesterov
2012-03-30 19:02 ` Steven Rostedt
2012-03-30 20:15 ` Oleg Nesterov [this message]
2012-03-31 0:13 ` Steven Rostedt
2012-03-31 20:45 ` Oleg Nesterov
2012-03-31 21:37 ` Steven Rostedt
2012-04-01 21:37 ` [PATCH 0/2] (Was: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT) Oleg Nesterov
2012-04-01 21:38 ` [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process() Oleg Nesterov
2012-04-01 21:38 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
2012-04-20 21:26 ` [PATCH 0/2] (Was: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT) Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120330201550.GA16628@redhat.com \
--to=oleg@redhat.com \
--cc=jbaron@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.