public inbox for linux-kernel@vger.kernel.org
* [patch] context-switching overhead in X, ioport(), 2.6.8.1
@ 2004-08-21 13:55 Ingo Molnar
  2004-08-22  4:46 ` David S. Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2004-08-21 13:55 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton; +Cc: linux-kernel, Lee Revell



while debugging/improving scheduling latencies i got the following
strange latency report from Lee Revell:

  http://krustophenia.net/testresults.php?dataset=2.6.8.1-P6#/var/www/2.6.8.1-P6

this trace shows a 120 usec latency caused by XFree86, on a 600 MHz x86
system. Looking closer reveals:

  00000002 0.006ms (+0.003ms): __switch_to (schedule)
  00000002 0.088ms (+0.082ms): finish_task_switch (schedule)

it took more than 80 usecs for XFree86 to do a context-switch!

it turns out that the reason for this (massive) context-switching
overhead is the following change in 2.6.8:

      [PATCH] larger IO bitmaps

To demonstrate the effect of this change i've written ioperm-latency.c
(attached), which gives the following on vanilla 2.6.8.1:

  # ./ioperm-latency
  default no ioperm:             scheduling latency: 2528 cycles
  turning on port 80 ioperm:     scheduling latency: 10563 cycles
  turning on port 65535 ioperm:  scheduling latency: 10517 cycles

the ChangeSet says:

        Now, with the lazy bitmap allocation and per-CPU TSS, this
        will really not drain any resources I think.

this is plain wrong. An increase in the IO bitmap size introduces
per-context-switch overhead as well: we now have to copy an 8K bitmap
every time XFree86 context-switches - even though XFree86 never uses
ports higher than 1024! I've straced XFree86 on a number of x86 systems
and in every instance ioperm() was used - so i'd say the majority of x86
Linux systems running 2.6.8.1 are affected by this problem.

This not only causes lots of overhead, it also trashes ~16K out of the
L1 and L2 caches, on every context-switch. It's as if XFree86 did an L1
cache flush on every context-switch ...
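the arithmetic behind that ~16K figure can be sketched in a few lines
(a userspace illustration only, not kernel code; the constant 65536
mirrors the IO_BITMAP_BITS value introduced by the 2.6.8 change):

```c
#include <stddef.h>

/*
 * Bytes memcpy()ed into the TSS on a context switch for a given IO
 * bitmap size. 2.6.8 raised IO_BITMAP_BITS from 1024 to 65536, so the
 * copy grew from 128 bytes to 8192 bytes; source plus destination of
 * that copy together touch roughly 16K of cache lines.
 */
static unsigned int io_bitmap_copy_bytes(unsigned int io_bitmap_bits)
{
	return io_bitmap_bits / 8;
}
```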

the simple solution would be to revert IO_BITMAP_BITS back to 1024 and
release 2.6.8.2?

I've implemented another solution as well, which tracks the
highest-enabled port # for every task and does the copying of the bitmap
intelligently. (patch attached) The patched kernel gives:

  # ./ioperm-latency
  default no ioperm:             scheduling latency: 2423 cycles
  turning on port 80 ioperm:     scheduling latency: 2503 cycles
  turning on port 65535 ioperm:  scheduling latency: 10607 cycles

this is much more acceptable - the full overhead only occurs in the very
unlikely event of a task using the high ioport range. X doesn't suffer
any significant overhead.

(tracking the maximum allowed port # also allows a simplification of
io_bitmap handling: e.g. we don't do the invalid-offset trick anymore -
the IO bitmap in the TSS is always valid and secure.)
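in userspace terms, the max-tracking scan amounts to the following
(a standalone sketch; deny_all()/allow_port() are illustrative helpers,
not kernel functions - as in the TSS bitmap, a set bit means access
denied):

```c
#include <string.h>

#define IO_BITMAP_BITS  65536
#define LONG_BITS       (8 * sizeof(unsigned long))
#define IO_BITMAP_LONGS (IO_BITMAP_BITS / LONG_BITS)

static unsigned long io_bitmap[IO_BITMAP_LONGS];

/* Deny all ports: all-ones means every port is denied. */
static void deny_all(void)
{
	memset(io_bitmap, 0xff, sizeof(io_bitmap));
}

/* Allow one port by clearing its bit: */
static void allow_port(unsigned int port)
{
	io_bitmap[port / LONG_BITS] &= ~(1UL << (port % LONG_BITS));
}

/*
 * The scan: find the last long that differs from ~0UL ("all denied")
 * and only copy up to (and including) it on context-switch.
 * Simple and stupid, to keep it obviously correct.
 */
static unsigned int io_bitmap_max_bytes(void)
{
	unsigned int i, max_long = 0;

	for (i = 0; i < IO_BITMAP_LONGS; i++)
		if (io_bitmap[i] != ~0UL)
			max_long = i;

	return (max_long + 1) * sizeof(unsigned long);
}
```

with only port 0x80 enabled the scan bounds the per-switch copy to a
handful of longs instead of the full 8K - the X server case from the
benchmark above.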

I tested the patch on x86 SMP and UP, it works fine for me. I tested
boundary conditions as well, it all seems secure.

	Ingo

[-- Attachment #2: ioperm-latency.c --]
[-- Type: text/plain, Size: 1491 bytes --]

#include <errno.h>
#include <stdio.h>
#include <sched.h>
#include <signal.h>
#include <sys/io.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/unistd.h>

#define CYCLES(x) asm volatile ("rdtsc" :"=a" (x)::"edx")

#define __NR_sched_set_affinity 241
_syscall3 (int, sched_set_affinity, pid_t, pid, unsigned int, mask_len, unsigned long *, mask)

/*
 * Use a pair of RT processes bound to the same CPU to measure
 * context-switch overhead:
 */
static void measure(void)
{
	unsigned long i, min = ~0UL, pid, mask = 1, t1, t2;

	sched_set_affinity(0, sizeof(mask), &mask);

	pid = fork();
	if (!pid)
		for (;;) {
			asm volatile ("sti; nop; cli");
			sched_yield();
		}

	sched_yield();
	for (i = 0; i < 100; i++) {
		asm volatile ("sti; nop; cli");
		CYCLES(t1);
		sched_yield();
		CYCLES(t2);
		if (i > 10) {
			if (t2 - t1 < min)
				min = t2 - t1;
		}
	}
	asm volatile ("sti");

	kill(pid, 9);
	printf("scheduling latency: %ld cycles\n", min);
	sched_yield();
}

int main(void)
{
	struct sched_param p = { sched_priority: 2 };
	unsigned long mask = 1;

	if (iopl(3)) {
		printf("need to run as root!\n");
		exit(-1);
	}
	sched_setscheduler(0, SCHED_FIFO, &p);
	sched_set_affinity(0, sizeof(mask), &mask);

	printf("default no ioperm:             ");
	measure();

	printf("turning on port 80 ioperm:     ");
	ioperm(0x80,1,1);
	measure();

	printf("turning on port 65535 ioperm:  ");
	if (ioperm(0xffff,1,1))
		printf("FAILED - older kernel.\n");
	else
		measure();

	return 0;
}


[-- Attachment #3: ioport-latency-fix-2.6.8.1.patch --]
[-- Type: text/plain, Size: 6018 bytes --]


it takes X.org/XFree86 more than 80 usecs to do a context-switch!

it turns out that the reason for this (massive) context-switching
overhead is the following change in 2.6.8:

      [PATCH] larger IO bitmaps

the simple solution is to revert the change. I've implemented another
solution as well, which tracks the highest-enabled port # for every task
and does the copying of the bitmap intelligently. (patch attached) The
patched kernel gives:

  # ./ioperm-latency
  default no ioperm:             scheduling latency: 2423 cycles
  turning on port 80 ioperm:     scheduling latency: 2503 cycles
  turning on port 65535 ioperm:  scheduling latency: 10607 cycles

this is much more acceptable - the full overhead only occurs in the very
unlikely event of a task using the high ioport range.

tracking the maximum allowed port # also allows a simplification of
io_bitmap handling: e.g. we don't do the invalid-offset trick anymore -
the IO bitmap in the TSS is always valid and secure.

I tested the patch on x86 SMP and UP, it works fine for me. I tested
boundary conditions as well, it all seems secure.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

--- linux/arch/i386/kernel/ioport.c.orig	
+++ linux/arch/i386/kernel/ioport.c	
@@ -56,6 +56,7 @@ static void set_bitmap(unsigned long *bi
  */
 asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
 {
+	unsigned int i, max_long, bytes, bytes_updated;
 	struct thread_struct * t = &current->thread;
 	struct tss_struct * tss;
 	unsigned long *bitmap;
@@ -81,16 +82,34 @@ asmlinkage long sys_ioperm(unsigned long
 
 	/*
 	 * do it in the per-thread copy and in the TSS ...
+	 *
+	 * Disable preemption via get_cpu() - we must not switch away
+	 * because the ->io_bitmap_max value must match the bitmap
+	 * contents:
 	 */
-	set_bitmap(t->io_bitmap_ptr, from, num, !turn_on);
 	tss = init_tss + get_cpu();
-	if (tss->io_bitmap_base == IO_BITMAP_OFFSET) { /* already active? */
-		set_bitmap(tss->io_bitmap, from, num, !turn_on);
-	} else {
-		memcpy(tss->io_bitmap, t->io_bitmap_ptr, IO_BITMAP_BYTES);
-		tss->io_bitmap_base = IO_BITMAP_OFFSET; /* Activate it in the TSS */
-	}
+
+	set_bitmap(t->io_bitmap_ptr, from, num, !turn_on);
+
+	/*
+	 * Search for a (possibly new) maximum. This is simple and stupid,
+	 * to keep it obviously correct:
+	 */
+	max_long = 0;
+	for (i = 0; i < IO_BITMAP_LONGS; i++)
+		if (t->io_bitmap_ptr[i] != ~0UL)
+			max_long = i;
+
+	bytes = (max_long + 1) * sizeof(long);
+	bytes_updated = max(bytes, t->io_bitmap_max);
+
+	t->io_bitmap_max = bytes;
+
+	/* Update the TSS: */
+	memcpy(tss->io_bitmap, t->io_bitmap_ptr, bytes_updated);
+
 	put_cpu();
+
 	return 0;
 }
 
--- linux/arch/i386/kernel/process.c.orig	
+++ linux/arch/i386/kernel/process.c	
@@ -293,15 +293,20 @@ int kernel_thread(int (*fn)(void *), voi
  */
 void exit_thread(void)
 {
-	struct task_struct *tsk = current;
+	struct thread_struct *t = &current->thread;
 
 	/* The process may have allocated an io port bitmap... nuke it. */
-	if (unlikely(NULL != tsk->thread.io_bitmap_ptr)) {
+	if (unlikely(NULL != t->io_bitmap_ptr)) {
 		int cpu = get_cpu();
 		struct tss_struct *tss = init_tss + cpu;
-		kfree(tsk->thread.io_bitmap_ptr);
-		tsk->thread.io_bitmap_ptr = NULL;
-		tss->io_bitmap_base = INVALID_IO_BITMAP_OFFSET;
+
+		kfree(t->io_bitmap_ptr);
+		t->io_bitmap_ptr = NULL;
+		/*
+		 * Careful, clear this in the TSS too:
+		 */
+		memset(tss->io_bitmap, 0xff, t->io_bitmap_max);
+		t->io_bitmap_max = 0;
 		put_cpu();
 	}
 }
@@ -369,8 +374,10 @@ int copy_thread(int nr, unsigned long cl
 	tsk = current;
 	if (unlikely(NULL != tsk->thread.io_bitmap_ptr)) {
 		p->thread.io_bitmap_ptr = kmalloc(IO_BITMAP_BYTES, GFP_KERNEL);
-		if (!p->thread.io_bitmap_ptr)
+		if (!p->thread.io_bitmap_ptr) {
+			p->thread.io_bitmap_max = 0;
 			return -ENOMEM;
+		}
 		memcpy(p->thread.io_bitmap_ptr, tsk->thread.io_bitmap_ptr,
 			IO_BITMAP_BYTES);
 	}
@@ -401,8 +408,10 @@ int copy_thread(int nr, unsigned long cl
 
 	err = 0;
  out:
-	if (err && p->thread.io_bitmap_ptr)
+	if (err && p->thread.io_bitmap_ptr) {
 		kfree(p->thread.io_bitmap_ptr);
+		p->thread.io_bitmap_max = 0;
+	}
 	return err;
 }
 
@@ -552,26 +561,18 @@ struct task_struct fastcall * __switch_t
 	}
 
 	if (unlikely(prev->io_bitmap_ptr || next->io_bitmap_ptr)) {
-		if (next->io_bitmap_ptr) {
+		if (next->io_bitmap_ptr)
 			/*
-			 * 4 cachelines copy ... not good, but not that
-			 * bad either. Anyone got something better?
-			 * This only affects processes which use ioperm().
-			 * [Putting the TSSs into 4k-tlb mapped regions
-			 * and playing VM tricks to switch the IO bitmap
-			 * is not really acceptable.]
+			 * Copy the relevant range of the IO bitmap.
+			 * Normally this is 128 bytes or less:
 			 */
 			memcpy(tss->io_bitmap, next->io_bitmap_ptr,
-				IO_BITMAP_BYTES);
-			tss->io_bitmap_base = IO_BITMAP_OFFSET;
-		} else
+				max(prev->io_bitmap_max, next->io_bitmap_max));
+		else
 			/*
-			 * a bitmap offset pointing outside of the TSS limit
-			 * causes a nicely controllable SIGSEGV if a process
-			 * tries to use a port IO instruction. The first
-			 * sys_ioperm() call sets up the bitmap properly.
+			 * Clear any possible leftover bits:
 			 */
-			tss->io_bitmap_base = INVALID_IO_BITMAP_OFFSET;
+			memset(tss->io_bitmap, 0xff, prev->io_bitmap_max);
 	}
 	return prev_p;
 }
--- linux/include/asm-i386/processor.h.orig	
+++ linux/include/asm-i386/processor.h	
@@ -422,6 +422,8 @@ struct thread_struct {
 	unsigned int		saved_fs, saved_gs;
 /* IO permissions */
 	unsigned long	*io_bitmap_ptr;
+/* max allowed port in the bitmap, in bytes: */
+	unsigned int	io_bitmap_max;
 };
 
 #define INIT_THREAD  {							\
@@ -442,7 +444,7 @@ struct thread_struct {
 	.esp1		= sizeof(init_tss[0]) + (long)&init_tss[0],	\
 	.ss1		= __KERNEL_CS,					\
 	.ldt		= GDT_ENTRY_LDT,				\
-	.io_bitmap_base	= INVALID_IO_BITMAP_OFFSET,			\
+	.io_bitmap_base	= offsetof(struct tss_struct,io_bitmap),	\
 	.io_bitmap	= { [ 0 ... IO_BITMAP_LONGS] = ~0 },		\
 }
 


* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-21 13:55 Ingo Molnar
@ 2004-08-22  4:46 ` David S. Miller
  2004-08-22  5:42   ` Ryan Cumming
  0 siblings, 1 reply; 10+ messages in thread
From: David S. Miller @ 2004-08-22  4:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: torvalds, akpm, linux-kernel, rlrevell


FWIW, I would recommend a sparse bitmap implementation for the
ioport stuff.  Something simple like:

struct set_ent {
	u32 pos;
	u32 bits;
	struct set_ent *next;
};

static inline int is_set(struct set_ent *head, int pos)
{
	while (head != NULL) {
		if (pos >= head->pos) {
			if (pos < head->pos + 32) {
				if (head->bits &
				    (1 << (pos - head->pos)))
					return 1;
			}
			break;
		}

		head = head->next;
	}
	return 0;
}

You get the idea.
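A self-contained version of that sketch, assuming the list is kept
sorted by descending pos (the early break relies on that ordering):

```c
#include <stdint.h>
#include <stddef.h>

struct set_ent {
	uint32_t pos;		/* first port covered by this entry */
	uint32_t bits;		/* one bit per port in [pos, pos+32) */
	struct set_ent *next;	/* entries sorted by descending pos */
};

static int is_set(struct set_ent *head, int pos)
{
	while (head != NULL) {
		if (pos >= (int)head->pos) {
			if (pos < (int)head->pos + 32 &&
			    (head->bits & (1u << (pos - head->pos))))
				return 1;
			/* descending order: no later entry covers pos */
			break;
		}
		head = head->next;
	}
	return 0;
}
```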


* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22  4:46 ` David S. Miller
@ 2004-08-22  5:42   ` Ryan Cumming
  2004-08-22  6:00     ` Lee Revell
  0 siblings, 1 reply; 10+ messages in thread
From: Ryan Cumming @ 2004-08-22  5:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: Ingo Molnar, torvalds, akpm, linux-kernel, rlrevell


On Saturday 21 August 2004 21:46, David S. Miller wrote:
> FWIW, I would recommend a sparse bitmap implementation for the
> ioport stuff.

The problem is that the sparse bitmap would have to be unpacked to the "dense" 
bitmap that lives in the TSS on context switch. AFAICS, that would involve 
walking the previous task's sparse bitmap, clearing all the ports it had 
access to, and then walking the next task's sparse bitmap and opening access 
to its ports. I doubt this would be a big win over what Ingo says usually 
reduces to a 128 byte or less memcpy(), especially when you consider the 
added complexity.

The only big speedup I can see is in the case of only one task having anything 
set in its IO bitmap at all. I assume that most desktops running a single X 
server fall in to this degenerate case, please correct me if I'm wrong. There 
we could simply set the TSS's io_bitmap_base to IO_BITMAP_OFFSET when 
switching to the IO bitmap'ed task, and set it back to 
INVALID_IO_BITMAP_OFFSET when we context switch away. That way the entire 
thing is accomplished with a single 4-byte store per context switch until a 
second IO bitmap'ed app is started, in which case we'd have to fall back to 
memcpy()ing. Seems like too much complexity for what amounts to a 
microoptimization, though.
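The fast path described above can be modeled with a toy sketch
(hypothetical struct names and illustrative offset values, not the real
kernel types):

```c
#define IO_BITMAP_OFFSET         0x68	/* illustrative values only */
#define INVALID_IO_BITMAP_OFFSET 0x8000

struct toy_task { int has_io_bitmap; };
struct toy_tss  { int io_bitmap_base; };

/*
 * With at most one ioperm() user in the system, flipping the 4-byte
 * io_bitmap_base field is enough - no memcpy() of the bitmap at all.
 * An out-of-limit base makes port IO fault instead of consulting a
 * stale bitmap.
 */
static void toy_switch(struct toy_tss *tss, const struct toy_task *next)
{
	tss->io_bitmap_base = next->has_io_bitmap ?
		IO_BITMAP_OFFSET : INVALID_IO_BITMAP_OFFSET;
}
```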

BTW Ingo, have you looked at changing the almost identical code in 
arch/x86_64? Or did it not get its bitmap expanded?

-Ryan



* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22  5:42   ` Ryan Cumming
@ 2004-08-22  6:00     ` Lee Revell
  2004-08-22  6:06       ` Ryan Cumming
  0 siblings, 1 reply; 10+ messages in thread
From: Lee Revell @ 2004-08-22  6:00 UTC (permalink / raw)
  To: Ryan Cumming
  Cc: David S. Miller, Ingo Molnar, torvalds, Andrew Morton,
	linux-kernel

On Sun, 2004-08-22 at 01:42, Ryan Cumming wrote:
> On Saturday 21 August 2004 21:46, David S. Miller wrote:
> > FWIW, I would recommend a sparse bitmap implementation for the
> > ioport stuff.
> 
> The problem is that the sparse bitmap would have to be unpacked to the "dense" 
> bitmap that lives in the TSS on context switch.

Can someone supply a link to the original LKML post with the ioport
change?  I was not able to find it in my mailbox nor in the archives.

Lee



* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22  6:00     ` Lee Revell
@ 2004-08-22  6:06       ` Ryan Cumming
  0 siblings, 0 replies; 10+ messages in thread
From: Ryan Cumming @ 2004-08-22  6:06 UTC (permalink / raw)
  To: Lee Revell
  Cc: David S. Miller, Ingo Molnar, torvalds, Andrew Morton,
	linux-kernel


On Saturday 21 August 2004 23:00, Lee Revell wrote:
> On Sun, 2004-08-22 at 01:42, Ryan Cumming wrote:
> > On Saturday 21 August 2004 21:46, David S. Miller wrote:
> > > FWIW, I would recommend a sparse bitmap implementation for the
> > > ioport stuff.
> >
> > The problem is that the sparse bitmap would have to be unpacked to the
> > "dense" bitmap that lives in the TSS on context switch.
>
> Can someone supply a link to the original LKML post with the ioport
> change?  I was not able to find it in my mailbox nor in the archives.

Here's what I could dig up:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7/2.6.7-mm2/broken-out/larger-io-bitmap.patch
http://www.uwsg.iu.edu/hypermail/linux/kernel/0211.0/0477.html
http://www.uwsg.iu.edu/hypermail/linux/kernel/9807.1/1079.html

Looks like x86-64 does in fact need a similar change to the x86 one. It's late 
here, but it should be pretty trivial to port over.

-Ryan



* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22 12:16 ` [patch] context-switching overhead in X, ioport(), 2.6.8.1 Andi Kleen
@ 2004-08-22 12:00   ` Alan Cox
  2004-08-22 14:23     ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cox @ 2004-08-22 12:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Linux Kernel Mailing List, eich

On Sul, 2004-08-22 at 13:16, Andi Kleen wrote:
> At least older XFree86 (4.2/3 time frame) used to only use iopl(). I
> know it because at some point ioperm() was completely broken on
> x86-64, but the X server never hit it. I wonder why they changed
> that. Anyways, perhaps it would be better to just change the X server
> back to use iopl(), because it will be always faster than using

Xorg and XFree assume the kernel will have intelligent limits. When the
range went up the EnableIO code in turn switched to ioperm.

The actual code is:

       if (ioperm(0, 1024, 1) || iopl(3))
                FatalError("xf86EnableIOPorts: Failed to set IOPL for I/O\n");

(os-support/linux/lnx_video.c:xf86EnableIO)

Flip those around and rebuild.
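Flipped around, the check would read (a sketch of the suggested
reordering, not the shipped X source):

```c
	/* prefer iopl(3): no large per-switch bitmap copy */
	if (iopl(3) || ioperm(0, 1024, 1))
		FatalError("xf86EnableIOPorts: Failed to set IOPL for I/O\n");
```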




* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
       [not found] <2vEzI-Vw-17@gated-at.bofh.it>
@ 2004-08-22 12:16 ` Andi Kleen
  2004-08-22 12:00   ` Alan Cox
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2004-08-22 12:16 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, alan, eich

Ingo Molnar <mingo@elte.hu> writes:

> while debugging/improving scheduling latencies i got the following
> strange latency report from Lee Revell:
>
>   http://krustophenia.net/testresults.php?dataset=2.6.8.1-P6#/var/www/2.6.8.1-P6
>
> this trace shows a 120 usec latency caused by XFree86, on a 600 MHz x86
> system. Looking closer reveals:
>
>   00000002 0.006ms (+0.003ms): __switch_to (schedule)
>   00000002 0.088ms (+0.082ms): finish_task_switch (schedule)
>
> it took more than 80 usecs for XFree86 to do a context-switch!
>
> it turns out that the reason for this (massive) context-switching
> overhead is the following change in 2.6.8:
>
>       [PATCH] larger IO bitmaps
[...]

At least older XFree86 (4.2/3 time frame) used to only use iopl(). I
know it because at some point ioperm() was completely broken on
x86-64, but the X server never hit it. I wonder why they changed
that. Anyways, perhaps it would be better to just change the X server
back to use iopl(), because it will be always faster than using
ioperm.

-Andi



* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22 12:00   ` Alan Cox
@ 2004-08-22 14:23     ` Andi Kleen
  2004-08-22 14:47       ` Alan Cox
  2004-08-22 17:31       ` Ingo Molnar
  0 siblings, 2 replies; 10+ messages in thread
From: Andi Kleen @ 2004-08-22 14:23 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ingo Molnar, Linux Kernel Mailing List, eich

> Xorg and XFree assume the kernel will have intelligent limits. When the
> range went up the EnableIO code in turn switched to ioperm.

Which was the wrong thing to do since it is slower.

> 
> The actual code is:
> 
>        if (ioperm(0, 1024, 1) || iopl(3))
>                 FatalError("xf86EnableIOPorts: Failed to set IOPL for
> I/O\n")
> 
> (os-support/linux/lnx_video.c:xf86EnableIO)
> 
> Flip those around and rebuild.

It would be better to do that in the official release.

-Andi


* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22 14:23     ` Andi Kleen
@ 2004-08-22 14:47       ` Alan Cox
  2004-08-22 17:31       ` Ingo Molnar
  1 sibling, 0 replies; 10+ messages in thread
From: Alan Cox @ 2004-08-22 14:47 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Linux Kernel Mailing List, eich

On Sul, 2004-08-22 at 15:23, Andi Kleen wrote:
> > The actual code is:
> > 
> >        if (ioperm(0, 1024, 1) || iopl(3))
> >                 FatalError("xf86EnableIOPorts: Failed to set IOPL for
> > I/O\n")
> > 
> > (os-support/linux/lnx_video.c:xf86EnableIO)
> > 
> > Flip those around and rebuild.
> 
> It would be better to do that in the official release.

The current release is in final code freeze so such a change would need
the release wranglers agreement, I can't just go checking it into the
tree.



* Re: [patch] context-switching overhead in X, ioport(), 2.6.8.1
  2004-08-22 14:23     ` Andi Kleen
  2004-08-22 14:47       ` Alan Cox
@ 2004-08-22 17:31       ` Ingo Molnar
  1 sibling, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2004-08-22 17:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Alan Cox, Linux Kernel Mailing List, eich


* Andi Kleen <ak@muc.de> wrote:

> > Xorg and XFree assume the kernel will have intelligent limits. When the
> > range went up the EnableIO code in turn switched to ioperm.
> 
> Which was the wrong thing to do since it is slower.

it is not significantly slower with older kernels or with the patch
applied, and it's safer than an all-ports iopl(3).

	Ingo

