* [PATCH RESEND 0/2] TIF_SYSCALL_TRACEPOINT fixes
@ 2013-03-17 18:28 Oleg Nesterov
2013-03-17 18:28 ` [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process() Oleg Nesterov
2013-03-17 18:28 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
0 siblings, 2 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-17 18:28 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: Frederic Weisbecker, Steven Rostedt, linux-kernel
Hello.
IIRC Steven agrees with this changes, but nobody picked them.
Resend, perhaps Andrew can help...
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-17 18:28 [PATCH RESEND 0/2] TIF_SYSCALL_TRACEPOINT fixes Oleg Nesterov
@ 2013-03-17 18:28 ` Oleg Nesterov
2013-03-17 18:48 ` Steven Rostedt
2013-03-18 16:34 ` [PATCH v2 " Oleg Nesterov
2013-03-17 18:28 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
1 sibling, 2 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-17 18:28 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: Frederic Weisbecker, Steven Rostedt, linux-kernel
syscall_regfunc() and syscall_unregfunc() should set/clear
TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
with copy_process() and miss the new child which was not added to
init_task.tasks list yet.
Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
under tasklist.
While at it,
- remove _irqsafe from syscall_regfunc/syscall_unregfunc,
read_lock(tasklist) doesn't need to disable irqs.
- change syscall_unregfunc() to check PF_KTHREAD to skip
the kernel threads, ->mm != NULL is the common mistake.
Note: probably this check should be simply removed, needs
another patch.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/fork.c | 7 +++++++
kernel/tracepoint.c | 12 +++++-------
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 1766d32..8184a53 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1472,7 +1472,14 @@ static struct task_struct *copy_process(unsigned long clone_flags,
total_forks++;
spin_unlock(¤t->sighand->siglock);
+#ifdef CONFIG_TRACEPOINTS
+ if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
+ else
+ clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
+#endif
write_unlock_irq(&tasklist_lock);
+
proc_fork_connector(p);
cgroup_post_fork(p);
if (clone_flags & CLONE_THREAD)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 0c05a45..a16754b 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -732,33 +732,31 @@ static int sys_tracepoint_refcount;
void syscall_regfunc(void)
{
- unsigned long flags;
struct task_struct *g, *t;
if (!sys_tracepoint_refcount) {
- read_lock_irqsave(&tasklist_lock, flags);
+ read_lock(&tasklist_lock);
do_each_thread(g, t) {
/* Skip kernel threads. */
- if (t->mm)
+ if (!(t->flags & PF_KTHREAD))
set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
} while_each_thread(g, t);
- read_unlock_irqrestore(&tasklist_lock, flags);
+ read_unlock(&tasklist_lock);
}
sys_tracepoint_refcount++;
}
void syscall_unregfunc(void)
{
- unsigned long flags;
struct task_struct *g, *t;
sys_tracepoint_refcount--;
if (!sys_tracepoint_refcount) {
- read_lock_irqsave(&tasklist_lock, flags);
+ read_lock(&tasklist_lock);
do_each_thread(g, t) {
clear_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
} while_each_thread(g, t);
- read_unlock_irqrestore(&tasklist_lock, flags);
+ read_unlock(&tasklist_lock);
}
}
#endif
--
1.5.5.1
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-17 18:28 ` [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process() Oleg Nesterov
@ 2013-03-17 18:48 ` Steven Rostedt
2013-03-17 19:00 ` Oleg Nesterov
2013-03-18 16:34 ` [PATCH v2 " Oleg Nesterov
1 sibling, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2013-03-17 18:48 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel
On Sun, 2013-03-17 at 19:28 +0100, Oleg Nesterov wrote:
> syscall_regfunc() and syscall_unregfunc() should set/clear
> TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
> with copy_process() and miss the new child which was not added to
> init_task.tasks list yet.
>
> Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
> under tasklist.
Is this because "p = dup_task_struct(current);" is outside the lock?
Probably should state this in the change log.
>
> While at it,
>
> - remove _irqsafe from syscall_regfunc/syscall_unregfunc,
> read_lock(tasklist) doesn't need to disable irqs.
I'm fine with this.
>
> - change syscall_unregfunc() to check PF_KTHREAD to skip
> the kernel threads, ->mm != NULL is the common mistake.
I'm fine with this too.
>
> Note: probably this check should be simply removed, needs
> another patch.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
> kernel/fork.c | 7 +++++++
> kernel/tracepoint.c | 12 +++++-------
> 2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 1766d32..8184a53 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1472,7 +1472,14 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>
> total_forks++;
> spin_unlock(¤t->sighand->siglock);
> +#ifdef CONFIG_TRACEPOINTS
> + if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> + else
> + clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> +#endif
I hate seeing #ifdef code like this in C files. Can you add a function
to set this in include/trace/syscalls.h:
#ifdef CONFIG_TRACEPOINTS
static inline void syscall_tracepoint_update(struct task_struct *p)
{
if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
else
clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
}
#else
static inline void syscall_tracepoint_update(struct task_struct *p) {}
#endif
Then in copy_process() just have:
syscall_tracepoint_update(p);
Thanks,
-- Steve
> write_unlock_irq(&tasklist_lock);
> +
> proc_fork_connector(p);
> cgroup_post_fork(p);
> if (clone_flags & CLONE_THREAD)
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 0c05a45..a16754b 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -732,33 +732,31 @@ static int sys_tracepoint_refcount;
>
> void syscall_regfunc(void)
> {
> - unsigned long flags;
> struct task_struct *g, *t;
>
> if (!sys_tracepoint_refcount) {
> - read_lock_irqsave(&tasklist_lock, flags);
> + read_lock(&tasklist_lock);
> do_each_thread(g, t) {
> /* Skip kernel threads. */
> - if (t->mm)
> + if (!(t->flags & PF_KTHREAD))
> set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
> } while_each_thread(g, t);
> - read_unlock_irqrestore(&tasklist_lock, flags);
> + read_unlock(&tasklist_lock);
> }
> sys_tracepoint_refcount++;
> }
>
> void syscall_unregfunc(void)
> {
> - unsigned long flags;
> struct task_struct *g, *t;
>
> sys_tracepoint_refcount--;
> if (!sys_tracepoint_refcount) {
> - read_lock_irqsave(&tasklist_lock, flags);
> + read_lock(&tasklist_lock);
> do_each_thread(g, t) {
> clear_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
> } while_each_thread(g, t);
> - read_unlock_irqrestore(&tasklist_lock, flags);
> + read_unlock(&tasklist_lock);
> }
> }
> #endif
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-17 18:48 ` Steven Rostedt
@ 2013-03-17 19:00 ` Oleg Nesterov
2013-03-17 19:34 ` Steven Rostedt
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-17 19:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel
On 03/17, Steven Rostedt wrote:
>
> On Sun, 2013-03-17 at 19:28 +0100, Oleg Nesterov wrote:
> > syscall_regfunc() and syscall_unregfunc() should set/clear
> > TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
> > with copy_process() and miss the new child which was not added to
> > init_task.tasks list yet.
> >
> > Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
> > under tasklist.
>
> Is this because "p = dup_task_struct(current);" is outside the lock?
> Probably should state this in the change log.
Not only, syscall_regfunc/syscall_unregfunc can miss the new child.
Just suppose that syscall_regfunc() takes tasklist right before the
forking task tries to take it for writing and and the child to the
list.
> > +#ifdef CONFIG_TRACEPOINTS
> > + if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> > + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> > + else
> > + clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> > +#endif
>
> I hate seeing #ifdef code like this in C files. Can you add a function
> to set this in include/trace/syscalls.h:
It seems that everyone hates them, except me ;)
> #ifdef CONFIG_TRACEPOINTS
> static inline void syscall_tracepoint_update(struct task_struct *p)
> {
> if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> else
> clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> }
> #else
> static inline void syscall_tracepoint_update(struct task_struct *p) {}
> #endif
OK, thanks, will do. But perhaps tracepoint_fork() would be better?
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-17 19:00 ` Oleg Nesterov
@ 2013-03-17 19:34 ` Steven Rostedt
2013-03-18 16:33 ` Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2013-03-17 19:34 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel
On Sun, 2013-03-17 at 20:00 +0100, Oleg Nesterov wrote:
> On 03/17, Steven Rostedt wrote:
> >
> > On Sun, 2013-03-17 at 19:28 +0100, Oleg Nesterov wrote:
> > > syscall_regfunc() and syscall_unregfunc() should set/clear
> > > TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
> > > with copy_process() and miss the new child which was not added to
> > > init_task.tasks list yet.
> > >
> > > Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
> > > under tasklist.
> >
> > Is this because "p = dup_task_struct(current);" is outside the lock?
> > Probably should state this in the change log.
>
> Not only, syscall_regfunc/syscall_unregfunc can miss the new child.
>
> Just suppose that syscall_regfunc() takes tasklist right before the
> forking task tries to take it for writing and and the child to the
> list.
I'm a bit confused by the above. Maybe it's the typo with the "and and"
that's confusing me.
>
> > > +#ifdef CONFIG_TRACEPOINTS
> > > + if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> > > + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> > > + else
> > > + clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> > > +#endif
> >
> > I hate seeing #ifdef code like this in C files. Can you add a function
> > to set this in include/trace/syscalls.h:
>
> It seems that everyone hates them, except me ;)
>
> > #ifdef CONFIG_TRACEPOINTS
> > static inline void syscall_tracepoint_update(struct task_struct *p)
> > {
> > if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> > set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> > else
> > clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> > }
> > #else
> > static inline void syscall_tracepoint_update(struct task_struct *p) {}
> > #endif
>
> OK, thanks, will do. But perhaps tracepoint_fork() would be better?
tracepoint_fork() is similar to being called trace_fork() which would be
considered a tracepoint. Seeing tracepoint_fork() would make me think it
has something to do with the fork tracepoint.
Do we plan on doing anything other than updating the syscall tracepoint
flag here? I find the "syscall_tracepoint_update()" very descriptive to
what is actually happening. While reading the fork code, seeing
'syscall_tracepoint_update()' would tell me that this has something to
do with syscall tracepoints, which it does. But tracepoint_fork() would
have me think something completely different.
-- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-17 19:34 ` Steven Rostedt
@ 2013-03-18 16:33 ` Oleg Nesterov
0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-18 16:33 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel
On 03/17, Steven Rostedt wrote:
>
> On Sun, 2013-03-17 at 20:00 +0100, Oleg Nesterov wrote:
> > On 03/17, Steven Rostedt wrote:
> > >
> > > > Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
> > > > under tasklist.
> > >
> > > Is this because "p = dup_task_struct(current);" is outside the lock?
> > > Probably should state this in the change log.
> >
> > Not only, syscall_regfunc/syscall_unregfunc can miss the new child.
> >
> > Just suppose that syscall_regfunc() takes tasklist right before the
> > forking task tries to take it for writing and and the child to the
> > list.
>
> I'm a bit confused by the above. Maybe it's the typo with the "and and"
> that's confusing me.
Yes, "and and" was supposed to be "and add".
But probably I misunderstood you before... Well yes, this is because
"p = dup_task_struct(current)" copies TIF_SYSCALL_TRACEPOINT outside
of the tasklist-protected section which also makes the new task visible
for do_each_thread().
IOW, the state of TIF_SYSCALL_TRACEPOINT bit can be correct after
dup_task_struct(), but it can't be updated until copy_process() add
the child to the list.
> > OK, thanks, will do. But perhaps tracepoint_fork() would be better?
>
> tracepoint_fork() is similar to being called trace_fork() which would be
> considered a tracepoint. Seeing tracepoint_fork() would make me think it
> has something to do with the fork tracepoint.
>
> Do we plan on doing anything other than updating the syscall tracepoint
> flag here? I find the "syscall_tracepoint_update()" very descriptive to
> what is actually happening. While reading the fork code, seeing
> 'syscall_tracepoint_update()' would tell me that this has something to
> do with syscall tracepoints, which it does. But tracepoint_fork() would
> have me think something completely different.
OK, thanks, I am sending v2 in reply to v1.
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-17 18:28 ` [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process() Oleg Nesterov
2013-03-17 18:48 ` Steven Rostedt
@ 2013-03-18 16:34 ` Oleg Nesterov
2013-03-20 19:16 ` Steven Rostedt
1 sibling, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-18 16:34 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: Frederic Weisbecker, Steven Rostedt, linux-kernel
syscall_regfunc() and syscall_unregfunc() should set/clear
TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
with copy_process() and miss the new child which was not added to
init_task.tasks list yet.
Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
under tasklist.
While at it,
- remove _irqsafe from syscall_regfunc/syscall_unregfunc,
read_lock(tasklist) doesn't need to disable irqs.
- change syscall_unregfunc() to check PF_KTHREAD to skip
the kernel threads, ->mm != NULL is the common mistake.
Note: probably this check should be simply removed, needs
another patch.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/trace/syscall.h | 15 +++++++++++++++
kernel/fork.c | 2 ++
kernel/tracepoint.c | 12 +++++-------
3 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 84bc419..15a954b 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -4,6 +4,7 @@
#include <linux/tracepoint.h>
#include <linux/unistd.h>
#include <linux/ftrace_event.h>
+#include <linux/thread_info.h>
#include <asm/ptrace.h>
@@ -31,4 +32,18 @@ struct syscall_metadata {
struct ftrace_event_call *exit_event;
};
+#ifdef CONFIG_TRACEPOINTS
+static inline void syscall_tracepoint_update(struct task_struct *p)
+{
+ if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
+ else
+ clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
+}
+#else
+static inline void syscall_tracepoint_update(struct task_struct *p)
+{
+}
+#endif
+
#endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 1766d32..e463f99 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1472,7 +1472,9 @@ static struct task_struct *copy_process(unsigned long clone_flags,
total_forks++;
spin_unlock(¤t->sighand->siglock);
+ syscall_tracepoint_update(p);
write_unlock_irq(&tasklist_lock);
+
proc_fork_connector(p);
cgroup_post_fork(p);
if (clone_flags & CLONE_THREAD)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 0c05a45..a16754b 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -732,33 +732,31 @@ static int sys_tracepoint_refcount;
void syscall_regfunc(void)
{
- unsigned long flags;
struct task_struct *g, *t;
if (!sys_tracepoint_refcount) {
- read_lock_irqsave(&tasklist_lock, flags);
+ read_lock(&tasklist_lock);
do_each_thread(g, t) {
/* Skip kernel threads. */
- if (t->mm)
+ if (!(t->flags & PF_KTHREAD))
set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
} while_each_thread(g, t);
- read_unlock_irqrestore(&tasklist_lock, flags);
+ read_unlock(&tasklist_lock);
}
sys_tracepoint_refcount++;
}
void syscall_unregfunc(void)
{
- unsigned long flags;
struct task_struct *g, *t;
sys_tracepoint_refcount--;
if (!sys_tracepoint_refcount) {
- read_lock_irqsave(&tasklist_lock, flags);
+ read_lock(&tasklist_lock);
do_each_thread(g, t) {
clear_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
} while_each_thread(g, t);
- read_unlock_irqrestore(&tasklist_lock, flags);
+ read_unlock(&tasklist_lock);
}
}
#endif
--
1.5.5.1
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH v2 1/2] tracing: syscall_*regfunc() can race with copy_process()
2013-03-18 16:34 ` [PATCH v2 " Oleg Nesterov
@ 2013-03-20 19:16 ` Steven Rostedt
0 siblings, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2013-03-20 19:16 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel
On Mon, 2013-03-18 at 17:34 +0100, Oleg Nesterov wrote:
> syscall_regfunc() and syscall_unregfunc() should set/clear
> TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
> with copy_process() and miss the new child which was not added to
> init_task.tasks list yet.
>
> Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
> under tasklist.
>
> While at it,
>
> - remove _irqsafe from syscall_regfunc/syscall_unregfunc,
> read_lock(tasklist) doesn't need to disable irqs.
>
> - change syscall_unregfunc() to check PF_KTHREAD to skip
> the kernel threads, ->mm != NULL is the common mistake.
>
> Note: probably this check should be simply removed, needs
> another patch.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
-- Steve
> ---
> include/trace/syscall.h | 15 +++++++++++++++
> kernel/fork.c | 2 ++
> kernel/tracepoint.c | 12 +++++-------
> 3 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/include/trace/syscall.h b/include/trace/syscall.h
> index 84bc419..15a954b 100644
> --- a/include/trace/syscall.h
> +++ b/include/trace/syscall.h
> @@ -4,6 +4,7 @@
> #include <linux/tracepoint.h>
> #include <linux/unistd.h>
> #include <linux/ftrace_event.h>
> +#include <linux/thread_info.h>
>
> #include <asm/ptrace.h>
>
> @@ -31,4 +32,18 @@ struct syscall_metadata {
> struct ftrace_event_call *exit_event;
> };
>
> +#ifdef CONFIG_TRACEPOINTS
> +static inline void syscall_tracepoint_update(struct task_struct *p)
> +{
> + if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> + else
> + clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
> +}
> +#else
> +static inline void syscall_tracepoint_update(struct task_struct *p)
> +{
> +}
> +#endif
> +
> #endif /* _TRACE_SYSCALL_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 1766d32..e463f99 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1472,7 +1472,9 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>
> total_forks++;
> spin_unlock(¤t->sighand->siglock);
> + syscall_tracepoint_update(p);
> write_unlock_irq(&tasklist_lock);
> +
> proc_fork_connector(p);
> cgroup_post_fork(p);
> if (clone_flags & CLONE_THREAD)
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 0c05a45..a16754b 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -732,33 +732,31 @@ static int sys_tracepoint_refcount;
>
> void syscall_regfunc(void)
> {
> - unsigned long flags;
> struct task_struct *g, *t;
>
> if (!sys_tracepoint_refcount) {
> - read_lock_irqsave(&tasklist_lock, flags);
> + read_lock(&tasklist_lock);
> do_each_thread(g, t) {
> /* Skip kernel threads. */
> - if (t->mm)
> + if (!(t->flags & PF_KTHREAD))
> set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
> } while_each_thread(g, t);
> - read_unlock_irqrestore(&tasklist_lock, flags);
> + read_unlock(&tasklist_lock);
> }
> sys_tracepoint_refcount++;
> }
>
> void syscall_unregfunc(void)
> {
> - unsigned long flags;
> struct task_struct *g, *t;
>
> sys_tracepoint_refcount--;
> if (!sys_tracepoint_refcount) {
> - read_lock_irqsave(&tasklist_lock, flags);
> + read_lock(&tasklist_lock);
> do_each_thread(g, t) {
> clear_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
> } while_each_thread(g, t);
> - read_unlock_irqrestore(&tasklist_lock, flags);
> + read_unlock(&tasklist_lock);
> }
> }
> #endif
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-17 18:28 [PATCH RESEND 0/2] TIF_SYSCALL_TRACEPOINT fixes Oleg Nesterov
2013-03-17 18:28 ` [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process() Oleg Nesterov
@ 2013-03-17 18:28 ` Oleg Nesterov
2013-03-17 18:54 ` Steven Rostedt
1 sibling, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-17 18:28 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: Frederic Weisbecker, Steven Rostedt, linux-kernel
syscall_regfunc() ignores the kernel thread because "it has
no effect", see cc3b13c1 "Don't trace kernel thread syscalls".
However, this means that a user-space task spawned by
call_usermodehelper() won't report the system calls if
kernel_execve() is called when sys_tracepoint_refcount != 0.
Remove this check. Hopefully the unnecessary report from
ret_from_fork path mentioned by cc3b13c1 is fine. In fact
"this is the only case" is not true. Say, kernel_execve()
itself does "int 80" on X86_32. Hopefully fine too.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/tracepoint.c | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index a16754b..4e1e4ca 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -737,9 +737,7 @@ void syscall_regfunc(void)
if (!sys_tracepoint_refcount) {
read_lock(&tasklist_lock);
do_each_thread(g, t) {
- /* Skip kernel threads. */
- if (!(t->flags & PF_KTHREAD))
- set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
+ set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
} while_each_thread(g, t);
read_unlock(&tasklist_lock);
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-17 18:28 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
@ 2013-03-17 18:54 ` Steven Rostedt
2013-03-17 19:04 ` Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2013-03-17 18:54 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel,
H. Peter Anvin, linux-arch
On Sun, 2013-03-17 at 19:28 +0100, Oleg Nesterov wrote:
> syscall_regfunc() ignores the kernel thread because "it has
> no effect", see cc3b13c1 "Don't trace kernel thread syscalls".
>
> However, this means that a user-space task spawned by
> call_usermodehelper() won't report the system calls if
> kernel_execve() is called when sys_tracepoint_refcount != 0.
>
> Remove this check. Hopefully the unnecessary report from
> ret_from_fork path mentioned by cc3b13c1 is fine. In fact
> "this is the only case" is not true. Say, kernel_execve()
> itself does "int 80" on X86_32. Hopefully fine too.
>
I'm really thinking the TIF_SYSCALL_TRACEPOINT flag is getting a bit
ridiculous. We really should have a "swap syscall table when tracepoints
enabled" that changes the syscall table that does exactly the same thing
as the normal table but wraps the system call with the tracepoints.
Something that we are looking to do with interrupts.
Altough this may not be that trivial, as this seems to be the method to
trace system calls on not only x86, but also PowerPC, ARM, s390, Sparc,
and sh.
-- Steve
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
> kernel/tracepoint.c | 4 +---
> 1 files changed, 1 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index a16754b..4e1e4ca 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -737,9 +737,7 @@ void syscall_regfunc(void)
> if (!sys_tracepoint_refcount) {
> read_lock(&tasklist_lock);
> do_each_thread(g, t) {
> - /* Skip kernel threads. */
> - if (!(t->flags & PF_KTHREAD))
> - set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
> + set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
> } while_each_thread(g, t);
> read_unlock(&tasklist_lock);
> }
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-17 18:54 ` Steven Rostedt
@ 2013-03-17 19:04 ` Oleg Nesterov
2013-03-17 19:36 ` Steven Rostedt
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-17 19:04 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel,
H. Peter Anvin, linux-arch
On 03/17, Steven Rostedt wrote:
>
> On Sun, 2013-03-17 at 19:28 +0100, Oleg Nesterov wrote:
> > syscall_regfunc() ignores the kernel thread because "it has
> > no effect", see cc3b13c1 "Don't trace kernel thread syscalls".
> >
> > However, this means that a user-space task spawned by
> > call_usermodehelper() won't report the system calls if
> > kernel_execve() is called when sys_tracepoint_refcount != 0.
> >
> > Remove this check. Hopefully the unnecessary report from
> > ret_from_fork path mentioned by cc3b13c1 is fine. In fact
> > "this is the only case" is not true. Say, kernel_execve()
> > itself does "int 80" on X86_32. Hopefully fine too.
>
> I'm really thinking the TIF_SYSCALL_TRACEPOINT flag is getting a bit
> ridiculous. We really should have a "swap syscall table when tracepoints
> enabled" that changes the syscall table that does exactly the same thing
> as the normal table but wraps the system call with the tracepoints.
But we also need to force the slow path in system_call...
Anyway, do you agree with this change for now?
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-17 19:04 ` Oleg Nesterov
@ 2013-03-17 19:36 ` Steven Rostedt
2013-03-18 16:26 ` Oleg Nesterov
2013-03-19 15:10 ` David Howells
0 siblings, 2 replies; 17+ messages in thread
From: Steven Rostedt @ 2013-03-17 19:36 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel,
H. Peter Anvin, linux-arch
On Sun, 2013-03-17 at 20:04 +0100, Oleg Nesterov wrote:
> > I'm really thinking the TIF_SYSCALL_TRACEPOINT flag is getting a bit
> > ridiculous. We really should have a "swap syscall table when tracepoints
> > enabled" that changes the syscall table that does exactly the same thing
> > as the normal table but wraps the system call with the tracepoints.
>
> But we also need to force the slow path in system_call...
Why? If we remove the tracepoint from the slowpath and use a table swap,
then we wouldn't need to use the slowpath at all.
>
> Anyway, do you agree with this change for now?
Well, if it's solving a bug today sure. But we should really be looking
at fixing what's there for the future.
-- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-17 19:36 ` Steven Rostedt
@ 2013-03-18 16:26 ` Oleg Nesterov
2013-03-19 15:10 ` David Howells
1 sibling, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-03-18 16:26 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, Ingo Molnar, Frederic Weisbecker, linux-kernel,
H. Peter Anvin, linux-arch
On 03/17, Steven Rostedt wrote:
>
> On Sun, 2013-03-17 at 20:04 +0100, Oleg Nesterov wrote:
>
> > > I'm really thinking the TIF_SYSCALL_TRACEPOINT flag is getting a bit
> > > ridiculous. We really should have a "swap syscall table when tracepoints
> > > enabled" that changes the syscall table that does exactly the same thing
> > > as the normal table but wraps the system call with the tracepoints.
> >
> > But we also need to force the slow path in system_call...
>
> Why? If we remove the tracepoint from the slowpath and use a table swap,
> then we wouldn't need to use the slowpath at all.
Ah, indeed, you are right.
> > Anyway, do you agree with this change for now?
>
> Well, if it's solving a bug today sure. But we should really be looking
> at fixing what's there for the future.
OK, thanks.
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-17 19:36 ` Steven Rostedt
2013-03-18 16:26 ` Oleg Nesterov
@ 2013-03-19 15:10 ` David Howells
2013-03-19 15:36 ` Steven Rostedt
1 sibling, 1 reply; 17+ messages in thread
From: David Howells @ 2013-03-19 15:10 UTC (permalink / raw)
To: Steven Rostedt
Cc: dhowells, Oleg Nesterov, Andrew Morton, Ingo Molnar,
Frederic Weisbecker, linux-kernel, H. Peter Anvin, linux-arch
Steven Rostedt <rostedt@goodmis.org> wrote:
> Why? If we remove the tracepoint from the slowpath and use a table swap,
> then we wouldn't need to use the slowpath at all.
How are you engineering a table swap? Do you patch the system call code to
change the immediate address loaded or do you put in a level of indirection?
David
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-19 15:10 ` David Howells
@ 2013-03-19 15:36 ` Steven Rostedt
2013-03-19 21:27 ` H. Peter Anvin
0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2013-03-19 15:36 UTC (permalink / raw)
To: David Howells
Cc: Oleg Nesterov, Andrew Morton, Ingo Molnar, Frederic Weisbecker,
linux-kernel, H. Peter Anvin, linux-arch
On Tue, 2013-03-19 at 15:10 +0000, David Howells wrote:
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > Why? If we remove the tracepoint from the slowpath and use a table swap,
> > then we wouldn't need to use the slowpath at all.
>
> How are you engineering a table swap? Do you patch the system call code to
> change the immediate address loaded or do you put in a level of indirection?
Patching the call site would probably be the easiest method.
We've gotten pretty good at doing that ;-)
-- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2013-03-19 15:36 ` Steven Rostedt
@ 2013-03-19 21:27 ` H. Peter Anvin
0 siblings, 0 replies; 17+ messages in thread
From: H. Peter Anvin @ 2013-03-19 21:27 UTC (permalink / raw)
To: Steven Rostedt
Cc: David Howells, Oleg Nesterov, Andrew Morton, Ingo Molnar,
Frederic Weisbecker, linux-kernel, linux-arch
On 03/19/2013 08:36 AM, Steven Rostedt wrote:
> On Tue, 2013-03-19 at 15:10 +0000, David Howells wrote:
>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>>> Why? If we remove the tracepoint from the slowpath and use a table swap,
>>> then we wouldn't need to use the slowpath at all.
>>
>> How are you engineering a table swap? Do you patch the system call code to
>> change the immediate address loaded or do you put in a level of indirection?
>
> Patching the call site would probably be the easiest method.
>
> We've gotten pretty good at doing that ;-)
>
Yes, given that the machinery is already there we might as well use it.
-hpa
^ permalink raw reply [flat|nested] 17+ messages in thread
* syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
@ 2012-03-30 18:31 Oleg Nesterov
2012-03-30 19:02 ` Steven Rostedt
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2012-03-30 18:31 UTC (permalink / raw)
To: Ingo Molnar, Jason Baron, Steven Rostedt; +Cc: linux-kernel
Hello.
I've looked at syscall_regfunc/unregfunc by accident, and I am
a bit confused...
void syscall_regfunc(void)
{
unsigned long flags;
struct task_struct *g, *t;
if (!sys_tracepoint_refcount) {
read_lock_irqsave(&tasklist_lock, flags);
Why _irqsave? write_lock(tasklist) needs to disable irqs, but read_
doesn't. Any subtle reason I missed?
do_each_thread(g, t) {
/* Skip kernel threads. */
if (t->mm)
We should check PF_KTHREAD, not ->mm.
set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
But the main question is, can't we race with clone() and miss the
new child? The new task is not "visible" to do_each_thread() until
copy_process()->list_add_tail_rcu(thread_group/init_task.tasks).
Don't we need something like the patch below?
Oleg.
--- x/kernel/fork.c
+++ x/kernel/fork.c
@@ -1446,7 +1446,12 @@ static struct task_struct *copy_process(
total_forks++;
spin_unlock(¤t->sighand->siglock);
+#ifdef CONFIG_TRACEPOINTS
+ if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+ set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
+#endif
write_unlock_irq(&tasklist_lock);
+
proc_fork_connector(p);
cgroup_post_fork(p);
if (clone_flags & CLONE_THREAD)
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
2012-03-30 18:31 syscall_regfunc() && TIF_SYSCALL_TRACEPOINT Oleg Nesterov
@ 2012-03-30 19:02 ` Steven Rostedt
2012-03-30 20:15 ` Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2012-03-30 19:02 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Ingo Molnar, Jason Baron, linux-kernel
On Fri, 2012-03-30 at 20:31 +0200, Oleg Nesterov wrote:
> Hello.
>
> I've looked at syscall_regfunc/unregfunc by accident, and I am
> a bit confused...
>
> void syscall_regfunc(void)
> {
> unsigned long flags;
> struct task_struct *g, *t;
>
> if (!sys_tracepoint_refcount) {
> read_lock_irqsave(&tasklist_lock, flags);
>
> Why _irqsave? write_lock(tasklist) needs to disable irqs, but read_
> doesn't. Any subtle reason I missed?
As long as an interrupt doesn't take the tasklist lock as write, we
don't. If it doesn't then we should be safe not to disable interrupts
here.
>
> do_each_thread(g, t) {
> /* Skip kernel threads. */
> if (t->mm)
>
> We should check PF_KTHREAD, not ->mm.
A lot of places test ->mm for kernel threads. Just search for ->mm in
kernel/sched/core.c
>
> set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
>
> But the main question is, can't we race with clone() and miss the
> new child? The new task is not "visible" to do_each_thread() until
> copy_process()->list_add_tail_rcu(thread_group/init_task.tasks).
>
> Don't we need something like the patch below?
>
> Oleg.
>
>
> --- x/kernel/fork.c
> +++ x/kernel/fork.c
> @@ -1446,7 +1446,12 @@ static struct task_struct *copy_process(
>
> total_forks++;
> spin_unlock(¤t->sighand->siglock);
> +#ifdef CONFIG_TRACEPOINTS
> + if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
I'm not so worried about the set (although that should be done) but it
is entirely possible that we need a clear too. Leaving this flag set
would cause a task to take the overhead of tracing syscalls without ever
tracing them.
-- Steve
> +#endif
> write_unlock_irq(&tasklist_lock);
> +
> proc_fork_connector(p);
> cgroup_post_fork(p);
> if (clone_flags & CLONE_THREAD)
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
2012-03-30 19:02 ` Steven Rostedt
@ 2012-03-30 20:15 ` Oleg Nesterov
2012-03-31 0:13 ` Steven Rostedt
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2012-03-30 20:15 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Ingo Molnar, Jason Baron, linux-kernel
On 03/30, Steven Rostedt wrote:
>
> On Fri, 2012-03-30 at 20:31 +0200, Oleg Nesterov wrote:
> > Hello.
> >
> > I've looked at syscall_regfunc/unregfunc by accident, and I am
> > a bit confused...
> >
> > void syscall_regfunc(void)
> > {
> > unsigned long flags;
> > struct task_struct *g, *t;
> >
> > if (!sys_tracepoint_refcount) {
> > read_lock_irqsave(&tasklist_lock, flags);
> >
> > Why _irqsave? write_lock(tasklist) needs to disable irqs, but read_
> > doesn't. Any subtle reason I missed?
>
> As long as an interrupt doesn't take the tasklist lock as write,
No, this is forbidden.
> > do_each_thread(g, t) {
> > /* Skip kernel threads. */
> > if (t->mm)
> >
> > We should check PF_KTHREAD, not ->mm.
>
> A lot of places test ->mm for kernel threads.
And this is wrong, use_mm() can set ->mm != NULL. This is the common
mistake.
> Just search for ->mm in
> kernel/sched/core.c
Probably normalize_rt_tasks() and __sched_setscheduler() should be fixed.
> > Don't we need something like the patch below?
> >
> > Oleg.
> >
> >
> > --- x/kernel/fork.c
> > +++ x/kernel/fork.c
> > @@ -1446,7 +1446,12 @@ static struct task_struct *copy_process(
> >
> > total_forks++;
> > spin_unlock(¤t->sighand->siglock);
> > +#ifdef CONFIG_TRACEPOINTS
> > + if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> > + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT);
>
> I'm not so worried about the set (although that should be done) but it
> is entirely possible that we need a clear too. Leaving this flag set
> would cause a task to take the overhead of tracing syscalls without ever
> tracing them.
Agreed. OK, I'll send the patch with "else clear".
But I don't really understand why do you think that "clear" is more
important. Sure, the wrong TIF_SYSCALL_TRACEPOINT triggers the slow
path unnecessary, but this is more or less harmless. But if we do
not set the task obviously won't report trace_sys*, this looks like
a bug even if nothing bad can happen.
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
2012-03-30 20:15 ` Oleg Nesterov
@ 2012-03-31 0:13 ` Steven Rostedt
2012-03-31 20:45 ` Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2012-03-31 0:13 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Ingo Molnar, Jason Baron, linux-kernel
On Fri, 2012-03-30 at 22:15 +0200, Oleg Nesterov wrote:
> > A lot of places test ->mm for kernel threads.
>
> And this is wrong, use_mm() can set ->mm != NULL. This is the common
> mistake.
>
> > Just search for ->mm in
> > kernel/sched/core.c
>
> Probably normalize_rt_tasks() and __sched_setscheduler() should be fixed.
Heh, the __sched_setscheduler() usage is from me. But I'm not sure we
had an alternative back in 2005.
> But I don't really understand why do you think that "clear" is more
> important.
They are both important. But as I tend to consider performance when
tracing is off as critical, I'm more concerned about that. But both must
be fixed, because not reporting traces can confuse a developer.
> Sure, the wrong TIF_SYSCALL_TRACEPOINT triggers the slow
> path unnecessary, but this is more or less harmless. But if we do
> not set the task obviously won't report trace_sys*, this looks like
> a bug even if nothing bad can happen.
Agreed, both are a bug and both should be fixed.
Thanks,
-- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
2012-03-31 0:13 ` Steven Rostedt
@ 2012-03-31 20:45 ` Oleg Nesterov
2012-03-31 21:37 ` Steven Rostedt
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2012-03-31 20:45 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Ingo Molnar, Jason Baron, linux-kernel
On 03/30, Steven Rostedt wrote:
>
> On Fri, 2012-03-30 at 22:15 +0200, Oleg Nesterov wrote:
>
> > But I don't really understand why do you think that "clear" is more
> > important.
>
> They are both important. But as I tend to consider performance when
> tracing is off as critical, I'm more concerned about that. But both must
> be fixed, because not reporting traces can confuse a developer.
Ah, got it, thanks.
I was going to send the simple patch we discussed, but suddenly I
realized that I have another question.
Why do we want to filter out the kernel threads in syscall_regfunc?
>From cc3b13c1 "tracing: Don't trace kernel thread syscalls"
then it has no effect to trace the kernel thread calls
to syscalls in that path.
Setting the TIF_SYSCALL_TRACEPOINT flag is then useless for these.
OK, but then it doesn't hurt? Or is there another reason why
TIF_SYSCALL_TRACEPOINT is not desirable on kthread?
The problem is ____call_usermodehelper() which execs the user-space
task. This clears PF_KTHREAD (sets ->mm), but obviously if
sys_tracepoint_refcount != 0 this is too late.
So what do you think we should do,
- keep this check
- remove it
- remove it in a separate patch
- add the "sync with sys_tracepoint_refcount" hook
before kernel_execve()
?
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT
2012-03-31 20:45 ` Oleg Nesterov
@ 2012-03-31 21:37 ` Steven Rostedt
2012-04-01 21:37 ` [PATCH 0/2] (Was: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT) Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2012-03-31 21:37 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Ingo Molnar, Jason Baron, linux-kernel
On Sat, 2012-03-31 at 22:45 +0200, Oleg Nesterov wrote:
> On 03/30, Steven Rostedt wrote:
> >
> > On Fri, 2012-03-30 at 22:15 +0200, Oleg Nesterov wrote:
> >
> > > But I don't really understand why do you think that "clear" is more
> > > important.
> >
> > They are both important. But as I tend to consider performance when
> > tracing is off as critical, I'm more concerned about that. But both must
> > be fixed, because not reporting traces can confuse a developer.
>
> Ah, got it, thanks.
>
> I was going to send the simple patch we discussed, but suddenly I
> realized that I have another question.
>
> Why do we want to filter out the kernel threads in syscall_regfunc?
>
> >From cc3b13c1 "tracing: Don't trace kernel thread syscalls"
>
> then it has no effect to trace the kernel thread calls
> to syscalls in that path.
> Setting the TIF_SYSCALL_TRACEPOINT flag is then useless for these.
>
> OK, but then it doesn't hurt? Or is there another reason why
> TIF_SYSCALL_TRACEPOINT is not desirable on kthread?
Right, it doesn't hurt. I was about to say that in a previous email.
>
> The problem is ____call_usermodehelper() which execs the user-space
> task. This clears PF_KTHREAD (sets ->mm), but obviously if
> sys_tracepoint_refcount != 0 this is too late.
>
> So what do you think we should do,
>
> - keep this check
>
> - remove it
>
> - remove it in a separate patch
I say this one (remove it in a separate patch). That way if something
breaks we know exactly what did it ;-)
>
> - add the "sync with sys_tracepoint_refcount" hook
> before kernel_execve()
>
> ?
-- Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 0/2] (Was: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT)
2012-03-31 21:37 ` Steven Rostedt
@ 2012-04-01 21:37 ` Oleg Nesterov
2012-04-01 21:38 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2012-04-01 21:37 UTC (permalink / raw)
To: Steven Rostedt
Cc: Ingo Molnar, Jason Baron, linux-kernel, Hendrik Brueckner,
Frederic Weisbecker
On 03/31, Steven Rostedt wrote:
>
> On Sat, 2012-03-31 at 22:45 +0200, Oleg Nesterov wrote:
> >
> > So what do you think we should do,
> >
> > - keep this check
> >
> > - remove it
> >
> > - remove it in a separate patch
>
> I say this one (remove it in a separate patch). That way if something
> breaks we know exactly what did it ;-)
OK, agreed.
Don't really know how can I test this... but the kernel didn't
crash after I enabled the syscall tracer ;)
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads
2012-04-01 21:37 ` [PATCH 0/2] (Was: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT) Oleg Nesterov
@ 2012-04-01 21:38 ` Oleg Nesterov
0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2012-04-01 21:38 UTC (permalink / raw)
To: Steven Rostedt
Cc: Ingo Molnar, Jason Baron, linux-kernel, Hendrik Brueckner,
Frederic Weisbecker
syscall_regfunc() ignores the kernel thread because "it has
no effect", see cc3b13c1 "Don't trace kernel thread syscalls".
However, this means that a user-space task spawned by
call_usermodehelper() won't report the system calls if
kernel_execve() is called when sys_tracepoint_refcount != 0.
Remove this check. Hopefully the unnecessary report from
ret_from_fork path mentioned by cc3b13c1 is fine. In fact
"this is the only case" is not true. Say, kernel_execve()
itself does "int 80" on X86_32. Hopefully fine too.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/tracepoint.c | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index e2a4523..2403e60 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -739,9 +739,7 @@ void syscall_regfunc(void)
if (!sys_tracepoint_refcount) {
read_lock(&tasklist_lock);
do_each_thread(g, t) {
- /* Skip kernel threads. */
- if (!(t->flags & PF_KTHREAD))
- set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
+ set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
} while_each_thread(g, t);
read_unlock(&tasklist_lock);
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-03-20 19:16 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-17 18:28 [PATCH RESEND 0/2] TIF_SYSCALL_TRACEPOINT fixes Oleg Nesterov
2013-03-17 18:28 ` [PATCH 1/2] tracing: syscall_*regfunc() can race with copy_process() Oleg Nesterov
2013-03-17 18:48 ` Steven Rostedt
2013-03-17 19:00 ` Oleg Nesterov
2013-03-17 19:34 ` Steven Rostedt
2013-03-18 16:33 ` Oleg Nesterov
2013-03-18 16:34 ` [PATCH v2 " Oleg Nesterov
2013-03-20 19:16 ` Steven Rostedt
2013-03-17 18:28 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
2013-03-17 18:54 ` Steven Rostedt
2013-03-17 19:04 ` Oleg Nesterov
2013-03-17 19:36 ` Steven Rostedt
2013-03-18 16:26 ` Oleg Nesterov
2013-03-19 15:10 ` David Howells
2013-03-19 15:36 ` Steven Rostedt
2013-03-19 21:27 ` H. Peter Anvin
-- strict thread matches above, loose matches on Subject: below --
2012-03-30 18:31 syscall_regfunc() && TIF_SYSCALL_TRACEPOINT Oleg Nesterov
2012-03-30 19:02 ` Steven Rostedt
2012-03-30 20:15 ` Oleg Nesterov
2012-03-31 0:13 ` Steven Rostedt
2012-03-31 20:45 ` Oleg Nesterov
2012-03-31 21:37 ` Steven Rostedt
2012-04-01 21:37 ` [PATCH 0/2] (Was: syscall_regfunc() && TIF_SYSCALL_TRACEPOINT) Oleg Nesterov
2012-04-01 21:38 ` [PATCH 2/2] tracing: syscall_regfunc() should not skip kernel threads Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox