[test patch] seccomp: add code to disable TSC when enabling seccomp

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [test patch] seccomp: add code to disable TSC when enabling seccomp
@ 2006-07-13 20:11 Chuck Ebbert
  2006-07-14  2:02 ` andrea
  0 siblings, 1 reply; 4+ messages in thread
From: Chuck Ebbert @ 2006-07-13 20:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: andrea@cpushare.com, Arjan van de Ven, Andrew Morton,
	Linus Torvalds

Is the below patch acceptable in generic code, or should some arch
helper function hide it?  It lets i386 / x86_64 add TIF_NOTSC
independently.  

Also, what prevents this flag from being set on a running process?
If that happens the CPU state and flag could get out of sync and
this could cause problems because of the way the current code tests
the flag.

--- 2.6.18-rc1-nb.orig/fs/proc/base.c
+++ 2.6.18-rc1-nb/fs/proc/base.c
@@ -1025,6 +1025,9 @@ static ssize_t seccomp_write(struct file
 	if (seccomp_mode && seccomp_mode <= NR_SECCOMP_MODES) {
 		tsk->seccomp.mode = seccomp_mode;
 		set_tsk_thread_flag(tsk, TIF_SECCOMP);
+#ifdef TIF_NOTSC
+		set_tsk_thread_flag(tsk, TIF_NOTSC);
+#endif
 	} else
 		goto out;
 	result = -EIO;
-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [test patch] seccomp: add code to disable TSC when enabling seccomp
  2006-07-13 20:11 [test patch] seccomp: add code to disable TSC when enabling seccomp Chuck Ebbert
@ 2006-07-14  2:02 ` andrea
  0 siblings, 0 replies; 4+ messages in thread
From: andrea @ 2006-07-14  2:02 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: linux-kernel, Arjan van de Ven, Andrew Morton, Linus Torvalds

On Thu, Jul 13, 2006 at 04:11:12PM -0400, Chuck Ebbert wrote:
> Is the below patch acceptable in generic code, or should some arch
> helper function hide it?  It lets i386 / x86_64 add TIF_NOTSC
> independently.  
>
> Also, what prevents this flag from being set on a running process?
> If that happens the CPU state and flag could get out of sync and
> this could cause problems because of the way the current code tests
> the flag.

Yes, there could be a tiny race where if the controller and seccomp
tasks run on two different CPUs: the seccomp task may write to the
pipe, and then read, but the read may not actually stop anywhere,
because the second CPU may have enabled seccomp and answered faster
than the first cpu. So there's tiny window for the TSC not to be
disabled synchronously at the start of the seccomp computations (and
if there are multiple seccomp tasks running the new ones could let the
old ones run a timeslice with the tsc enabled). The inverse isn't
possible because the SECCOMP/TSC bits cannot be cleared anywhere. In
short the only problem is that it's not a guarantee that the tsc is
always permanently disabled with seccomp in SMP.

To fix the tiny window if it's the current task writing to self, we
should also update the cr4 before returning from base.c. If it was a
different task it's more complicated (we would need to send a forced
sigstop, and wait the task->state to change, but then we go into the
ptrace parallelism I truly don't want to deal with in any way in
seccomp context). The whole point of seccomp is to be simple. So my
suggestion is either we ignore the tiny window, or we do it only from
the current task. If I've to deal with any sigstop then I could use
ptrace or utrace in the first place ;).

I generally preferred to be the controller task that fires up seccomp
(the controller tasks did a number of checks that everything was going
ok before firing it up), but if we forbid other tasks to fire up
seccomp, then perhaps there's no more reason to leave it a /proc
interface. I guess we could move it to a prctl which would probably
waste a few less bytes, so it gets even more friendly.

Thanks Chunk!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [test patch] seccomp: add code to disable TSC when enabling seccomp
@ 2006-07-14  6:02 Chuck Ebbert
  2006-07-14  6:30 ` andrea
  0 siblings, 1 reply; 4+ messages in thread
From: Chuck Ebbert @ 2006-07-14  6:02 UTC (permalink / raw)
  To: andrea@cpushare.com
  Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, linux-kernel

In-Reply-To: <20060714020257.GC18774@opteron.random>

On Fri, 14 Jul 2006 04:02:57 +0200, andrea@cpushare.com wrote:

> > Also, what prevents this flag from being set on a running process?
> > If that happens the CPU state and flag could get out of sync and
> > this could cause problems because of the way the current code tests
> > the flag.
> 
> Yes, there could be a tiny race where if the controller and seccomp
> tasks run on two different CPUs: the seccomp task may write to the
> pipe, and then read, but the read may not actually stop anywhere,
> because the second CPU may have enabled seccomp and answered faster
> than the first cpu. So there's tiny window for the TSC not to be
> disabled synchronously at the start of the seccomp computations (and
> if there are multiple seccomp tasks running the new ones could let the
> old ones run a timeslice with the tsc enabled).

But it looks like the mismatch could persist indefinitely: if a seccomp
task inherits the wrong cr4 flag it could pass it on to another, or back
to the original one and so on.  I think this is the only safe way:

	if (test_tsk_thread_flag(next_p, TIF_NOTSC) ||
	    test_tsk_thread_flag(prev_p, TIF_NOTSC)) {

		/* Flip TSC disable bit if necessary. */
		unsigned int cr4 = read_cr4();

		if (test_tsk_thread_flag(next_p, TIF_NOTSC)) {
			if (!(cr4 & X86_CR4_TSD))
				write_cr4(cr4 | X86_CR4_TSD);
		} else
			write_cr4(cr4 & ~X86_CR4_TSD);
	}

(Testing TSD in the 'else' path is not worth the trouble.)

> To fix the tiny window if it's the current task writing to self, we
> should also update the cr4 before returning from base.c. If it was a
> different task it's more complicated (we would need to send a forced
> sigstop, and wait the task->state to change, but then we go into the
> ptrace parallelism I truly don't want to deal with in any way in
> seccomp context). The whole point of seccomp is to be simple. So my
> suggestion is either we ignore the tiny window, or we do it only from
> the current task. If I've to deal with any sigstop then I could use
> ptrace or utrace in the first place ;).

The tiny window shouldn't be a problem, should it?  Just what is the
risk to begin with, and how much harder is it to exploit in such a
small window?

-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [test patch] seccomp: add code to disable TSC when enabling seccomp
  2006-07-14  6:02 Chuck Ebbert
@ 2006-07-14  6:30 ` andrea
  0 siblings, 0 replies; 4+ messages in thread
From: andrea @ 2006-07-14  6:30 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, linux-kernel

On Fri, Jul 14, 2006 at 02:02:55AM -0400, Chuck Ebbert wrote:
> But it looks like the mismatch could persist indefinitely: if a seccomp
> task inherits the wrong cr4 flag it could pass it on to another, or back
> to the original one and so on.  I think this is the only safe way:
> 
> 	if (test_tsk_thread_flag(next_p, TIF_NOTSC) ||
> 	    test_tsk_thread_flag(prev_p, TIF_NOTSC)) {
> 
> 		/* Flip TSC disable bit if necessary. */
> 		unsigned int cr4 = read_cr4();
> 
> 		if (test_tsk_thread_flag(next_p, TIF_NOTSC)) {
> 			if (!(cr4 & X86_CR4_TSD))
> 				write_cr4(cr4 | X86_CR4_TSD);
> 		} else
> 			write_cr4(cr4 & ~X86_CR4_TSD);
> 	}
> 
> (Testing TSD in the 'else' path is not worth the trouble.)

Yes that solves the problem of the first seccomp task propagating it
to the other seccomp tasks.

> The tiny window shouldn't be a problem, should it?  Just what is the
> risk to begin with, and how much harder is it to exploit in such a
> small window?

Even the original patch of yours was perfectly fine in
practice. Eventually one of the per-cpu daemons would also be
scheduled even if there are more seccomp tasks per cpu.

But in my opinion if we care about this window and we change
something, it worth to close it completely, mostly for code-style
reasons (i.e. strict code). Note that the NOTSC will never be checked
unless there's a second task, so it's not a single timeslice.

As far as I can tell, the only way to close it completely, is to
require that the task that fires up seccomp is the current task. That
is a very fine requirement. That guarantees that no other cpu could
require any update in-core. So it's enough to always flip the NOTSC
bit in the current cpu with test_and_set_bit and to synchronously
update the cr4 accordingly to the result of test_and_set_bit, then we
can go back to the nicer (and faster) xor way in the scheduler slow
path ;).

This is incidentally exactly what I implemented and tested with my
last patch ;) (which is of course a modification and extension of
yours)

Thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2006-07-14  6:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-13 20:11 [test patch] seccomp: add code to disable TSC when enabling seccomp Chuck Ebbert
2006-07-14  2:02 ` andrea
  -- strict thread matches above, loose matches on Subject: below --
2006-07-14  6:02 Chuck Ebbert
2006-07-14  6:30 ` andrea

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox