From: oleg@redhat.com (Oleg Nesterov)
To: linux-arm-kernel@lists.infradead.org
Subject: TIF_NOHZ can escape nonhz mask? (Was: [PATCH v3 6/8] x86: Split syscall_trace_enter into two phases)
Date: Wed, 30 Jul 2014 19:46:30 +0200
Message-ID: <20140730174630.GA30862@redhat.com>
In-Reply-To: <20140730163516.GC18158@localhost.localdomain>

On 07/30, Frederic Weisbecker wrote:
>
> On Tue, Jul 29, 2014 at 07:54:14PM +0200, Oleg Nesterov wrote:
>
> >
> > Looks like, we can kill context_tracking_task_switch() and simply change the
> > "__init" callers of context_tracking_cpu_set() to do set_thread_flag(TIF_NOHZ) ?
> > Then this flag will be propagated by copy_process().
>
> Right, that would be much better. Good catch! context tracking is enabled from
> tick_nohz_init(). This is the init 0 task so the flag should be propagated from there.

actually init 1 task, but this doesn't matter.

> I still think we need a for_each_process_thread() set as well though because some
> kernel threads may well have been created at this stage already.

Yes... Or we can add set_thread_flag(TIF_NOHZ) into ____call_usermodehelper().
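
To illustrate the propagation argument, here is a toy user-space model, not
kernel code. It assumes only what is said above: a child created by
copy_process() starts with a copy of the parent's thread flags, so tasks
forked before the flag is set never see it unless something like a
for_each_process_thread() sweep fixes them up. All toy_* names and the
TOY_TIF_NOHZ bit value are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the flag bit; the real TIF_NOHZ value is arch-specific. */
#define TOY_TIF_NOHZ (1u << 0)

struct toy_task {
	unsigned int flags;	/* models the thread flags word */
};

/* Models fork: the child starts with a copy of the parent's flags. */
static struct toy_task toy_fork(const struct toy_task *parent)
{
	struct toy_task child = *parent;

	return child;
}

/* Models a sweep over all existing tasks that sets the flag on each one. */
static void toy_set_nohz_all(struct toy_task *tasks, size_t n)
{
	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		tasks[i].flags |= TOY_TIF_NOHZ;
}

static bool toy_has_nohz(const struct toy_task *t)
{
	return (t->flags & TOY_TIF_NOHZ) != 0;
}
```

In this model a child forked before the parent sets the flag does not have
it, a child forked afterwards does, and the sweep is what catches the early
child.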

> > Or I am totally confused? (quite possible).
> >
> > > So here is a scenario where this is a problem: a task runs on CPU 0, passes the context
> > > tracking call before returning from a syscall to userspace, and gets an interrupt. The
> > > interrupt preempts the task and it moves to CPU 1. So it returns from preempt_schedule_irq()
> > > after which it is going to resume to userspace.
> > >
> > > In this scenario, if context tracking is only enabled on CPU 1, we have no way to know that
> > > the task is resuming to userspace, because we passed through the context tracking probe
> > > already and it was ignored on CPU 0.
> >
> > Thanks. But I still can't understand... So if we only track CPU 1, then in this
> > case context_tracking.state == IN_USER on CPU 0, but it can be IN_USER or IN_KERNEL
> > on CPU 1.
>
> I'm not sure I understand your question.

Probably because it was stupid. Seriously, I still have no idea what this code
actually does.

> Context tracking is either enabled everywhere or
> nowhere.
>
> I need to say though that there is a per CPU context tracking state named context_tracking.active.
> It's confusing because it suggests that context tracking is active per CPU. Actually it's tracked
> everywhere when globally enabled, but active determines if we call the RCU and vtime callbacks or
> not.
>
> So only nohz full CPUs have context_tracking.active set because only these need to call the RCU
> and vtime callbacks. Other CPUs still do the context tracking but they won't call rcu and vtime
> functions.

I meant that in the scenario you described above the "global" TIF_NOHZ doesn't
really make a difference, afaics.

Let's assume that context tracking is only enabled on CPU 1. To simplify,
assume that we have a single usermode task T which sleeps in kernel mode.

So context_tracking[0].state == context_tracking[1].state == IN_KERNEL.

T wakes up on CPU_0, returns to user space, calls user_enter(). This sets
context_tracking[0].state = IN_USER but does nothing else, because this
CPU is not tracked and .active is false.

Right after local_irq_restore() this task can migrate to CPU_1 and finish
its ret-to-usermode path. But since it had already passed user_enter() we
do not change context_tracking[1].state and do not play with rcu/vtime.
(unless this task hits SCHEDULE_USER in asm).

The same for user_exit() of course.
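
The scenario can be replayed in a toy user-space model. This is only a
sketch under the assumptions stated above: two CPUs, .active set only on
CPU 1, and user_enter() updating the per-CPU state while invoking the
rcu/vtime callbacks only when .active is true. The toy_* names and the
callback counter are invented for illustration and are not the kernel's
data structures.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_IN_KERNEL, TOY_IN_USER };

struct toy_ctx {
	bool active;		/* models context_tracking.active */
	enum toy_state state;	/* models context_tracking.state */
	int callbacks;		/* counts modeled rcu/vtime callback calls */
};

#define TOY_NR_CPUS 2
static struct toy_ctx toy_cpu[TOY_NR_CPUS];

/* Both CPUs start IN_KERNEL; only CPU 1 is tracked. */
static void toy_reset(void)
{
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		toy_cpu[i].active = false;
		toy_cpu[i].state = TOY_IN_KERNEL;
		toy_cpu[i].callbacks = 0;
	}
	toy_cpu[1].active = true;
}

/* Models user_enter() running on the given CPU. */
static void toy_user_enter(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	struct toy_ctx *c = &toy_cpu[cpu];

	if (c->state != TOY_IN_USER) {
		if (c->active)
			c->callbacks++;	/* stands in for the rcu/vtime hooks */
		c->state = TOY_IN_USER;
	}
}

/* Replays the scenario; returns true if CPU 1 never saw the transition. */
static bool toy_replay_migration(void)
{
	toy_reset();
	toy_user_enter(0);	/* T passes the probe on CPU 0 */
	/* T migrates to CPU 1 and finishes ret-to-usermode; no second probe runs. */
	return toy_cpu[1].state == TOY_IN_KERNEL && toy_cpu[1].callbacks == 0;
}
```

In this model the replay leaves CPU 0 marked IN_USER with no callbacks
invoked, while CPU 1 still thinks it is IN_KERNEL although T now runs in
user mode there. Calling toy_user_enter(1) directly is the only path that
bumps the callback counter.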

Oleg.
