* [PATCH] Read only syscall tables for x86_64 and i386
@ 2005-06-28 18:47 Christoph Lameter
2005-06-28 18:56 ` Arjan van de Ven
[not found] ` <87oe9q70no.fsf@jbms.ath.cx>
0 siblings, 2 replies; 22+ messages in thread
From: Christoph Lameter @ 2005-06-28 18:47 UTC (permalink / raw)
To: linux-kernel; +Cc: akpm, ak
Place x86_64 and i386 syscall table into the read only section.
Remove the syscall tables from the data section and place them into the
readonly section (like IA64).
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Index: linux-2.6.12-mm2/arch/i386/kernel/entry.S
===================================================================
--- linux-2.6.12-mm2.orig/arch/i386/kernel/entry.S 2005-06-28 17:46:31.000000000 +0000
+++ linux-2.6.12-mm2/arch/i386/kernel/entry.S 2005-06-28 17:47:11.000000000 +0000
@@ -680,6 +680,7 @@ ENTRY(spurious_interrupt_bug)
pushl $do_spurious_interrupt_bug
jmp error_code
+.section .rodata,"a"
#include "syscall_table.S"
syscall_table_size=(.-sys_call_table)
Index: linux-2.6.12-mm2/arch/i386/kernel/syscall_table.S
===================================================================
--- linux-2.6.12-mm2.orig/arch/i386/kernel/syscall_table.S 2005-06-28 17:46:31.000000000 +0000
+++ linux-2.6.12-mm2/arch/i386/kernel/syscall_table.S 2005-06-28 17:47:11.000000000 +0000
@@ -1,4 +1,3 @@
-.data
ENTRY(sys_call_table)
.long sys_restart_syscall /* 0 - old "setup()" system call, used for restarting */
.long sys_exit
Index: linux-2.6.12-mm2/arch/x86_64/kernel/syscall.c
===================================================================
--- linux-2.6.12-mm2.orig/arch/x86_64/kernel/syscall.c 2005-06-17 19:48:29.000000000 +0000
+++ linux-2.6.12-mm2/arch/x86_64/kernel/syscall.c 2005-06-28 18:22:41.000000000 +0000
@@ -19,7 +19,7 @@ typedef void (*sys_call_ptr_t)(void);
extern void sys_ni_syscall(void);
-sys_call_ptr_t sys_call_table[__NR_syscall_max+1] __cacheline_aligned = {
+const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
/* Smells like a like a compiler bug -- it doesn't work when the & below is removed. */
[0 ... __NR_syscall_max] = &sys_ni_syscall,
#include <asm-x86_64/unistd.h>
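For readers following along, here is a minimal userspace sketch of the initializer pattern used in syscall.c above. It is not kernel code: it assumes GCC or Clang (the `[a ... b]` range designator is a GNU extension), and every `ro_*` / `RO_*` name is hypothetical. It shows a const function-pointer table whose slots all default to a "not implemented" handler, with one slot overridden afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for sys_call_ptr_t and __NR_syscall_max. */
typedef long (*ro_call_t)(void);
#define RO_NR_MAX 7

/* Default handler, playing the role of sys_ni_syscall. */
static long ro_ni(void) { return -ENOSYS; }

/* One "implemented" call, purely for illustration. */
static long ro_getpid(void) { return 42; }

/*
 * The range designator first fills every slot with the default handler;
 * the later designator overrides a single slot. Because the array is
 * const, the toolchain is free to place it in a read-only section such
 * as .rodata (or .data.rel.ro in position-independent builds) instead
 * of .data.
 */
static const ro_call_t ro_table[RO_NR_MAX + 1] = {
	[0 ... RO_NR_MAX] = ro_ni,
	[3] = ro_getpid,
};

/* Bounds-checked dispatch through the table. */
long ro_dispatch(size_t nr)
{
	if (nr > RO_NR_MAX)
		return -ENOSYS;
	assert(ro_table[nr] != NULL); /* every slot has at least the default */
	return ro_table[nr]();
}
```

With a const table, an assignment such as `ro_table[3] = ro_ni;` is rejected at compile time, which is the source-level half of the "read only" property discussed in this thread; whether the page is also write-protected at run time is a separate question.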
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 18:47 [PATCH] Read only syscall tables for x86_64 and i386 Christoph Lameter
@ 2005-06-28 18:56 ` Arjan van de Ven
2005-06-28 19:26 ` Christoph Lameter
[not found] ` <87oe9q70no.fsf@jbms.ath.cx>
1 sibling, 1 reply; 22+ messages in thread
From: Arjan van de Ven @ 2005-06-28 18:56 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-kernel, akpm, ak
On Tue, 2005-06-28 at 11:47 -0700, Christoph Lameter wrote:
> Place x86_64 and i386 syscall table into the read only section.
>
> Remove the syscall tables from the data section and place them into the
> readonly section (like IA64).
I like it.. however I think the 32 bit compat syscall table on x86-64
deserves the same treatment....
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 18:56 ` Arjan van de Ven
@ 2005-06-28 19:26 ` Christoph Lameter
2005-06-28 19:41 ` Christoph Hellwig
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Lameter @ 2005-06-28 19:26 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: linux-kernel, akpm, ak
On Tue, 28 Jun 2005, Arjan van de Ven wrote:
> I like it.. however I think the 32 bit compat syscall table on x86-64
> deserves the same treatment....
Ok.
---
Place x86_64 and i386 syscall table into the read only section.
Remove the syscall tables from the data section and place them into the
readonly section (like IA64). Includes the ia32 syscall table on x86_64.
Note that AFS seems to be modifying the syscall table. Is that legit?
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Index: linux-2.6.12-mm2/arch/i386/kernel/entry.S
===================================================================
--- linux-2.6.12-mm2.orig/arch/i386/kernel/entry.S 2005-06-28 18:34:11.000000000 +0000
+++ linux-2.6.12-mm2/arch/i386/kernel/entry.S 2005-06-28 19:06:42.000000000 +0000
@@ -680,6 +680,7 @@ ENTRY(spurious_interrupt_bug)
pushl $do_spurious_interrupt_bug
jmp error_code
+.section .rodata,"a"
#include "syscall_table.S"
syscall_table_size=(.-sys_call_table)
Index: linux-2.6.12-mm2/arch/i386/kernel/syscall_table.S
===================================================================
--- linux-2.6.12-mm2.orig/arch/i386/kernel/syscall_table.S 2005-06-28 18:34:11.000000000 +0000
+++ linux-2.6.12-mm2/arch/i386/kernel/syscall_table.S 2005-06-28 19:06:42.000000000 +0000
@@ -1,4 +1,3 @@
-.data
ENTRY(sys_call_table)
.long sys_restart_syscall /* 0 - old "setup()" system call, used for restarting */
.long sys_exit
Index: linux-2.6.12-mm2/arch/x86_64/kernel/syscall.c
===================================================================
--- linux-2.6.12-mm2.orig/arch/x86_64/kernel/syscall.c 2005-06-28 18:34:11.000000000 +0000
+++ linux-2.6.12-mm2/arch/x86_64/kernel/syscall.c 2005-06-28 19:06:42.000000000 +0000
@@ -19,7 +19,7 @@ typedef void (*sys_call_ptr_t)(void);
extern void sys_ni_syscall(void);
-sys_call_ptr_t sys_call_table[__NR_syscall_max+1] __cacheline_aligned = {
+const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
/* Smells like a like a compiler bug -- it doesn't work when the & below is removed. */
[0 ... __NR_syscall_max] = &sys_ni_syscall,
#include <asm-x86_64/unistd.h>
Index: linux-2.6.12-mm2/arch/x86_64/ia32/ia32entry.S
===================================================================
--- linux-2.6.12-mm2.orig/arch/x86_64/ia32/ia32entry.S 2005-06-28 17:46:31.000000000 +0000
+++ linux-2.6.12-mm2/arch/x86_64/ia32/ia32entry.S 2005-06-28 19:13:20.000000000 +0000
@@ -298,7 +298,7 @@ ENTRY(ia32_ptregs_common)
jmp ia32_sysret /* misbalances the return cache */
CFI_ENDPROC
- .data
+ .section .rodata,"a"
.align 8
.globl ia32_sys_call_table
ia32_sys_call_table:
* Re: [PATCH] Read only syscall tables for x86_64 and i386
[not found] ` <Pine.LNX.4.62.0506281218030.1454@graphe.net>
@ 2005-06-28 19:27 ` Jeremy Maitin-Shepard
2005-06-28 19:31 ` Christoph Lameter
2005-06-28 19:47 ` Arjan van de Ven
0 siblings, 2 replies; 22+ messages in thread
From: Jeremy Maitin-Shepard @ 2005-06-28 19:27 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-kernel
Christoph Lameter <christoph@lameter.com> writes:
> On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
>> As I mentioned previously when this patch was first posted to the list,
>> AFS writes to the syscall table. It does this even for Linux 2.6.
>> Apparently, the rodata section is not actually mapped read-only, so this
>> patch will probably not break AFS; nonetheless, it seems it would still
>> be better to keep the syscall table in a section that is supposed to be
>> writable.
> Maybe this needs to be fixed?
It would probably be better implemented with a more generic mechanism,
but I don't believe anyone is working on that now, so it looks like AFS
will continue to use a special syscall.
--
Jeremy Maitin-Shepard
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:27 ` Jeremy Maitin-Shepard
@ 2005-06-28 19:31 ` Christoph Lameter
2005-06-28 19:41 ` Jeremy Maitin-Shepard
2005-06-28 19:42 ` Christoph Hellwig
2005-06-28 19:47 ` Arjan van de Ven
1 sibling, 2 replies; 22+ messages in thread
From: Christoph Lameter @ 2005-06-28 19:31 UTC (permalink / raw)
To: Jeremy Maitin-Shepard; +Cc: linux-kernel
On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
> It would probably be better implemented with a more generic mechanism,
> but I don't believe anyone is working on that now, so it looks like AFS
> will continue to use a special syscall.
We could put an #ifdef CONFIG_AFS into the syscall table definition?
That makes it explicit.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
[not found] <Pine.LNX.4.62.0506281141050.959@graphe.net.suse.lists.linux.kernel>
@ 2005-06-28 19:33 ` Andi Kleen
2005-06-28 19:41 ` Christoph Lameter
0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2005-06-28 19:33 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-kernel
Christoph Lameter <christoph@lameter.com> writes:
> Place x86_64 and i386 syscall table into the read only section.
>
> Remove the syscall tables from the data section and place them into the
> readonly section (like IA64).
Unfortunately it's useless, because the whole kernel is mapped in the
same 2 or 4MB page, which has to be writable because it overlaps with
real direct mapped memory.
On x86-64 there is a separate kernel mapping which could be made
read only. But that would be useless again because the memory
is aliased in the real direct mapping which has the same
overlapping problem.
The only ways to write protect the kernel would be to pad
it to 2MB (or 4MB on i386/non PAE), which would be a big waste
of memory, or to map it with smaller pages, which would use
significantly more TLB entries in normal operation.
Neither is probably worth the modest safety increase you
get from such a change.
-Andi
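To put the padding cost above into numbers, here is a rough arithmetic sketch. The 3 MB image size used in the usage note is an assumption (no figure is given in this thread), and `pad_to_large_page` is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/*
 * Bytes of padding needed so that a region of `size` bytes ends on a
 * boundary of `page` bytes. `page` must be a power of two.
 */
uint64_t pad_to_large_page(uint64_t size, uint64_t page)
{
	assert(page != 0 && (page & (page - 1)) == 0);
	/* Distance from size up to the next multiple of page (0 if aligned). */
	return (page - (size & (page - 1))) & (page - 1);
}
```

For a hypothetical 3 MB read-only region, `pad_to_large_page(3 * MB, 2 * MB)` and `pad_to_large_page(3 * MB, 4 * MB)` both come to 1 MB of padding. The worst case is just under one full large page, which is the waste being weighed against the extra TLB entries.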
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:26 ` Christoph Lameter
@ 2005-06-28 19:41 ` Christoph Hellwig
0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2005-06-28 19:41 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Arjan van de Ven, linux-kernel, akpm, ak
On Tue, Jun 28, 2005 at 12:26:43PM -0700, Christoph Lameter wrote:
> Note that AFS seems to be modifying the syscall table. Is that legit?
No.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:31 ` Christoph Lameter
@ 2005-06-28 19:41 ` Jeremy Maitin-Shepard
2005-06-28 19:42 ` Christoph Hellwig
1 sibling, 0 replies; 22+ messages in thread
From: Jeremy Maitin-Shepard @ 2005-06-28 19:41 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-kernel
Christoph Lameter <christoph@lameter.com> writes:
> On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
>> It would probably be better implemented with a more generic mechanism,
>> but I don't believe anyone is working on that now, so it looks like AFS
>> will continue to use a special syscall.
> We could put an #ifdef CONFIG_AFS into the syscall table definition?
> That makes it explicit.
I haven't looked much at the AFS support in the mainline kernel,
but I believe it is read-only support, and doesn't support
authentication. It may well have no need for a system call.
I was actually referring to the OpenAFS implementation, which is built
separately from the kernel as a module; thus, the #ifdef CONFIG_AFS
would not work. An additional configuration option could be added, but
I'm not sure that is a good idea.
--
Jeremy Maitin-Shepard
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:33 ` Andi Kleen
@ 2005-06-28 19:41 ` Christoph Lameter
2005-06-29 0:06 ` Arnd Bergmann
2005-06-29 2:49 ` Andi Kleen
0 siblings, 2 replies; 22+ messages in thread
From: Christoph Lameter @ 2005-06-28 19:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
On Tue, 28 Jun 2005, Andi Kleen wrote:
> Unfortunately it's useless, because the whole kernel is mapped in the
> same 2 or 4MB page, which has to be writable because it overlaps with
> real direct mapped memory.
The question is: are syscall tables supposed to be
writable? If not, then this patch should go in. If so, then forget about it.
On IA64 they are readonly and so I thought they should also be readonly
on i386 and x86_64.
The ability to protect a readonly section may be another issue.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:31 ` Christoph Lameter
2005-06-28 19:41 ` Jeremy Maitin-Shepard
@ 2005-06-28 19:42 ` Christoph Hellwig
2005-06-28 19:52 ` Jeremy Maitin-Shepard
1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2005-06-28 19:42 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Jeremy Maitin-Shepard, linux-kernel
On Tue, Jun 28, 2005 at 12:31:33PM -0700, Christoph Lameter wrote:
> On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
>
> > It would probably be better implemented with a more generic mechanism,
> > but I don't believe anyone is working on that now, so it looks like AFS
> > will continue to use a special syscall.
>
> We could put an #ifdef CONFIG_AFS into the syscall table definition?
> That makes it explicit.
No. AFS is utterly wrong, and the sooner we make it fail to work the better.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:27 ` Jeremy Maitin-Shepard
2005-06-28 19:31 ` Christoph Lameter
@ 2005-06-28 19:47 ` Arjan van de Ven
2005-06-28 20:00 ` Jeremy Maitin-Shepard
1 sibling, 1 reply; 22+ messages in thread
From: Arjan van de Ven @ 2005-06-28 19:47 UTC (permalink / raw)
To: Jeremy Maitin-Shepard; +Cc: Christoph Lameter, linux-kernel
On Tue, 2005-06-28 at 15:27 -0400, Jeremy Maitin-Shepard wrote:
> Christoph Lameter <christoph@lameter.com> writes:
>
> > On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
> >> As I mentioned previously when this patch was first posted to the list,
> >> AFS writes to the syscall table. It does this even for Linux 2.6.
> >> Apparently, the rodata section is not actually mapped read-only, so this
> >> patch will probably not break AFS; nonetheless, it seems it would still
> >> be better to keep the syscall table in a section that is supposed to be
> >> writable.
>
> > Maybe this needs to be fixed?
>
> It would probably be better implemented with a more generic mechanism,
> but I don't believe anyone is working on that now, so it looks like AFS
> will continue to use a special syscall.
the kernel afs doesn't seem to need a special syscall....
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:42 ` Christoph Hellwig
@ 2005-06-28 19:52 ` Jeremy Maitin-Shepard
2005-06-28 20:11 ` Arjan van de Ven
0 siblings, 1 reply; 22+ messages in thread
From: Jeremy Maitin-Shepard @ 2005-06-28 19:52 UTC (permalink / raw)
To: linux-kernel
Christoph Hellwig <hch@infradead.org> writes:
> On Tue, Jun 28, 2005 at 12:31:33PM -0700, Christoph Lameter wrote:
>> On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
>>
>> > It would probably be better implemented with a more generic mechanism,
>> > but I don't believe anyone is working on that now, so it looks like AFS
>> > will continue to use a special syscall.
>>
>> We could put an #ifdef CONFIG_AFS into the syscall table definition?
>> That makes it explicit.
> No. AFS is utterly wrong, and the sooner we make it fail to work the
> better.
Heh, well that is nice, but breaking it will only mean that I and every
other AFS user will have to revert the patch that breaks it;
furthermore, many distributions that provide binary kernels will
probably also have to revert the patch because many of their users will
want to use AFS.
--
Jeremy Maitin-Shepard
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:47 ` Arjan van de Ven
@ 2005-06-28 20:00 ` Jeremy Maitin-Shepard
0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Maitin-Shepard @ 2005-06-28 20:00 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Christoph Lameter, linux-kernel
Arjan van de Ven <arjan@infradead.org> writes:
> the kernel afs doesn't seem to need a special syscall....
As I mentioned in a previous message, I believe the in-kernel AFS
supports neither writing nor authentication, making it hardly a viable
replacement of the out-of-kernel OpenAFS for most AFS users. I believe
OpenAFS requires the system call in order to support authentication.
--
Jeremy Maitin-Shepard
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:52 ` Jeremy Maitin-Shepard
@ 2005-06-28 20:11 ` Arjan van de Ven
2005-06-28 20:23 ` Jeremy Maitin-Shepard
0 siblings, 1 reply; 22+ messages in thread
From: Arjan van de Ven @ 2005-06-28 20:11 UTC (permalink / raw)
To: Jeremy Maitin-Shepard; +Cc: linux-kernel
On Tue, 2005-06-28 at 15:52 -0400, Jeremy Maitin-Shepard wrote:
> Christoph Hellwig <hch@infradead.org> writes:
>
> > On Tue, Jun 28, 2005 at 12:31:33PM -0700, Christoph Lameter wrote:
> >> On Tue, 28 Jun 2005, Jeremy Maitin-Shepard wrote:
> >>
> >> > It would probably be better implemented with a more generic mechanism,
> >> > but I don't believe anyone is working on that now, so it looks like AFS
> >> > will continue to use a special syscall.
> >>
> >> We could put an #ifdef CONFIG_AFS into the syscall table definition?
> >> That makes it explicit.
>
> > No. AFS is utterly wrong, and the sooner we make it fail to work the
> > better.
>
> Heh, well that is nice, but breaking it will only mean that I and every
> other AFS user will have to revert the patch that breaks it;
> furthermore, many distributions that provide binary kernels will
> probably also have to revert the patch because many of their users will
> want to use AFS.
AFS isn't even using it... after all it's not even exported.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 20:11 ` Arjan van de Ven
@ 2005-06-28 20:23 ` Jeremy Maitin-Shepard
0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Maitin-Shepard @ 2005-06-28 20:23 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: linux-kernel
Arjan van de Ven <arjan@infradead.org> writes:
> AFS isn't even using it... after all it's not even exported.
Even if it is not exported, the OpenAFS kernel module can locate the
system call table using various methods. It most certainly does write
to the system call table, setting entry 137, which is reserved for the
afs system call, to the correct function address.
--
Jeremy Maitin-Shepard
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:41 ` Christoph Lameter
@ 2005-06-29 0:06 ` Arnd Bergmann
2005-06-29 2:49 ` Andi Kleen
1 sibling, 0 replies; 22+ messages in thread
From: Arnd Bergmann @ 2005-06-29 0:06 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andi Kleen, linux-kernel
On Tuesday 28 June 2005 21:41, Christoph Lameter wrote:
> The ability to protect a readonly section may be another issue.
Exactly. Mapping the readonly section readonly adds a nice way to
check that constant data is handled correctly by all of the code.
Otherwise, there might be some surprises if gcc performs
constant folding and we incorrectly rely on one copy to be writable.
A read-only text segment also raises the bar for authors of rootkits
or other evil hacks that patch the running kernel code.
Right now, s390 (and I believe arm, maybe others as well) is already
able to map in the readonly sections of the kernel from ROM, in order
to have more available RAM for other purposes.
Arnd <><
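The point above about a truly read-only mapping catching stray writes can be demonstrated in userspace. This is only an analogy, under the assumption of a POSIX system with `mmap`, `mprotect` and `fork`; the function name is hypothetical and nothing here touches kernel mappings. A child process attempts the write so that the fault does not kill the caller.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a write to a page remapped read-only is stopped by a
 * fatal signal, 0 if the write went through, -1 on setup failure.
 */
int write_to_readonly_page_faults(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	long *table = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (table == MAP_FAILED)
		return -1;

	table[0] = 1234; /* initialise while the page is still writable */
	if (mprotect(table, (size_t)page, PROT_READ) != 0) {
		munmap(table, (size_t)page);
		return -1;
	}

	pid_t pid = fork();
	if (pid < 0) {
		munmap(table, (size_t)page);
		return -1;
	}
	if (pid == 0) {
		/* Child: make sure the fault is fatal, then try to write. */
		signal(SIGSEGV, SIG_DFL);
		signal(SIGBUS, SIG_DFL);
		*(volatile long *)table = 5678;
		_exit(0); /* only reached if the write was allowed */
	}

	int status = 0;
	if (waitpid(pid, &status, 0) != pid) {
		munmap(table, (size_t)page);
		return -1;
	}
	assert(table[0] == 1234); /* parent's copy is untouched */
	munmap(table, (size_t)page);

	/* Some systems report protection faults as SIGBUS, not SIGSEGV. */
	return WIFSIGNALED(status) &&
	       (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

On a system that enforces the protection, the function returns 1: the incorrect write is caught immediately instead of silently corrupting the table.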
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-28 19:41 ` Christoph Lameter
2005-06-29 0:06 ` Arnd Bergmann
@ 2005-06-29 2:49 ` Andi Kleen
2005-07-01 20:10 ` Christoph Lameter
1 sibling, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2005-06-29 2:49 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andi Kleen, linux-kernel
On Tue, Jun 28, 2005 at 12:41:59PM -0700, Christoph Lameter wrote:
> On Tue, 28 Jun 2005, Andi Kleen wrote:
>
> > Unfortunately it's useless, because the whole kernel is mapped in the
> > same 2 or 4MB page, which has to be writable because it overlaps with
> > real direct mapped memory.
>
> The question is: are syscall tables supposed to be
> writable? If not, then this patch should go in. If so, then forget about it.
I think it would make sense in theory to write protect them
together with the kernel code and the modules
(just to make root kit writing slightly harder)
It is just that it is not practical on i386/x86-64 right now
without undue performance impact for the main kernel. TLB pressure is
unfortunately quite performance critical and we cannot goof off on this.
Write protecting the modules would be possible right now because
they're vmalloced, but might be a problem later if we move them to the
direct mapping again (that was a beneficial 2.4 optimization), so I am not sure
it would be a good idea.
BTW the kernel actually needs to write to code once
to apply alternative(), but it wouldn't be a problem to use
a temporary mapping for this.
> On IA64 they are readonly and so I thought they should also be readonly
> on i386 and x86_64.
>
> The ability to protect a readonly section may be another issue.
Well, it's the overriding issue here. Just pretending it's readonly
when it isn't doesn't seem useful.
-Andi
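As a userspace analogue of the one-time patching mentioned above, the sketch below writes to a slot that normally lives in a read-only page. Note the swap in technique: this is not the temporary alias mapping described in the message; it lifts and restores the protection of the existing mapping with `mprotect`, which is the closest simple equivalent outside the kernel. Both function names are hypothetical, and a POSIX system is assumed.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Store `value` into `*slot`, which lives in a page currently mapped
 * read-only: make the page writable, store, then restore read-only.
 * Returns 0 on success, -1 if either mprotect() call fails.
 */
int patch_readonly_slot(long *slot, long value)
{
	long page = sysconf(_SC_PAGESIZE);
	assert(page > 0);
	/* Round the slot address down to the start of its page. */
	uintptr_t base = (uintptr_t)slot & ~((uintptr_t)page - 1);

	if (mprotect((void *)base, (size_t)page, PROT_READ | PROT_WRITE) != 0)
		return -1;
	*slot = value;
	return mprotect((void *)base, (size_t)page, PROT_READ);
}

/* Round trip on a freshly mapped page; returns the value read back, or -1. */
long patch_roundtrip_demo(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	long *table = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (table == MAP_FAILED)
		return -1;

	table[0] = 1;
	long result = -1;
	if (mprotect(table, (size_t)page, PROT_READ) == 0 &&
	    patch_readonly_slot(&table[0], 99) == 0)
		result = table[0]; /* page is read-only again at this point */

	munmap(table, (size_t)page);
	return result;
}
```

The design point carried over from the thread is that the window of writability is short and explicit, so the data can stay protected during normal operation.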
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-06-29 2:49 ` Andi Kleen
@ 2005-07-01 20:10 ` Christoph Lameter
2005-07-01 20:28 ` Andi Kleen
2005-07-01 20:34 ` Richard B. Johnson
0 siblings, 2 replies; 22+ messages in thread
From: Christoph Lameter @ 2005-07-01 20:10 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
On Wed, 29 Jun 2005, Andi Kleen wrote:
> On Tue, Jun 28, 2005 at 12:41:59PM -0700, Christoph Lameter wrote:
> > On Tue, 28 Jun 2005, Andi Kleen wrote:
> >
> > > Unfortunately it's useless, because the whole kernel is mapped in the
> > > same 2 or 4MB page, which has to be writable because it overlaps with
> > > real direct mapped memory.
> >
> > The question is: are syscall tables supposed to be
> > writable? If not, then this patch should go in. If so, then forget about it.
>
> I think it would make sense in theory to write protect them
> together with the kernel code and the modules
> (just to make root kit writing slightly harder)
Seems that you are evading the question that I asked. Are syscall tables
supposed to be writable?
> BTW the kernel actually needs to write to code once
> to apply alternative(), but it wouldn't be a problem to use
> a temporary mapping for this.
What does this have to do with the syscall table???
> > The ability to protect a readonly section may be another issue.
>
> Well, it's the overriding issue here. Just pretending it's readonly
> when it isn't doesn't seem useful.
This is all off-topic, talking about a different issue. And we are
already "pretending" that lots of other stuff in the readonly section is
readonly.
The issue is correct placement of variables. Read only variables are
placed in a different section and the syscall tables are read only and
need to be placed in the correct section.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-07-01 20:10 ` Christoph Lameter
@ 2005-07-01 20:28 ` Andi Kleen
2005-07-01 20:47 ` Richard B. Johnson
2005-07-01 20:34 ` Richard B. Johnson
1 sibling, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2005-07-01 20:28 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andi Kleen, linux-kernel
On Fri, Jul 01, 2005 at 01:10:12PM -0700, Christoph Lameter wrote:
> > I think it would make sense in theory to write protect them
> > together with the kernel code and the modules
> > (just to make root kit writing slightly harder)
>
> Seems that you are evading the question that I asked. Are syscall tables
> supposed to be writable?
I did answer it. But again: yes I think it makes sense in theory
to make them read only.
We just cannot do it right now on i386/x86-64 for the reasons I outlined
in my previous mail.
>
> > BTW the kernel actually needs to write to code once
> > to apply alternative(), but it wouldn't be a problem to use
> > a temporary mapping for this.
>
> What does this have to do with the syscall table???
It is directly related to writable .text.
>
> > > The ability to protect a readonly section may be another issue.
> >
> > Well, it's the overriding issue here. Just pretending it's readonly
> > when it isn't doesn't seem useful.
>
> This is all off-topic, talking about a different issue. And we are
> already "pretending" that lots of other stuff in the readonly section is
> readonly.
Putting it into a "ro section" when it isn't actually read only is completely
useless and does not do anything useful. So unless you figure out
a way to make a true ro section without performance penalty I wouldn't bother.
If you really want it for cosmetic reasons you can still do it,
but it is about at the same usefulness level as shuffling white space in
the source around.
-Andi
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-07-01 20:10 ` Christoph Lameter
2005-07-01 20:28 ` Andi Kleen
@ 2005-07-01 20:34 ` Richard B. Johnson
1 sibling, 0 replies; 22+ messages in thread
From: Richard B. Johnson @ 2005-07-01 20:34 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andi Kleen, Linux kernel
On Fri, 1 Jul 2005, Christoph Lameter wrote:
> On Wed, 29 Jun 2005, Andi Kleen wrote:
>
>> On Tue, Jun 28, 2005 at 12:41:59PM -0700, Christoph Lameter wrote:
>>> On Tue, 28 Jun 2005, Andi Kleen wrote:
>>>
>>>> Unfortunately it's useless, because the whole kernel is mapped in the
>>>> same 2 or 4MB page, which has to be writable because it overlaps with
>>>> real direct mapped memory.
>>>
>>> The question is: are syscall tables supposed to be
>>> writable? If not, then this patch should go in. If so, then forget about it.
>>
>> I think it would make sense in theory to write protect them
>> together with the kernel code and the modules
>> (just to make root kit writing slightly harder)
>
> Seems that you are evading the question that I asked. Are syscall tables
> supposed to be writable?
>
>> BTW the kernel actually needs to write to code once
>> to apply alternative(), but it wouldn't be a problem to use
>> a temporary mapping for this.
>
> What does this have to do with the syscall table???
>
>>> The ability to protect a readonly section may be another issue.
>>
>> Well, it's the overriding issue here. Just pretending it's readonly
>> when it isn't doesn't seem useful.
>
> This is all off-topic, talking about a different issue. And we are
> already "pretending" that lots of other stuff in the readonly section is
> readonly.
>
> The issue is correct placement of variables. Read only variables are
> placed in a different section and the syscall tables are read only and
> need to be placed in the correct section.
I modified my syscall table to put it in ".section .rodata". It appears
as though read-only is not enforced in the kernel. I don't know
why because, at least with ix86, both data and code can be made
read-only.
You just write to it with a segment descriptor that allows R/W, then
use another for R/O. It is true that kernel code can create any
segment descriptor it wants, dynamically, thus a properly-written
program can ultimately write to what was once R/O data. However,
I think it should default to R/O which should cut down on the
number of cheap hacks that can damage it.
Cheers,
Dick Johnson
Penguin : Linux version 2.6.12 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-07-01 20:28 ` Andi Kleen
@ 2005-07-01 20:47 ` Richard B. Johnson
2005-07-01 21:13 ` Alan Cox
0 siblings, 1 reply; 22+ messages in thread
From: Richard B. Johnson @ 2005-07-01 20:47 UTC (permalink / raw)
To: Andi Kleen; +Cc: Christoph Lameter, linux-kernel
On Fri, 1 Jul 2005, Andi Kleen wrote:
> On Fri, Jul 01, 2005 at 01:10:12PM -0700, Christoph Lameter wrote:
>>> I think it would make sense in theory to write protect them
>>> together with the kernel code and the modules
>>> (just to make root kit writing slightly harder)
>>
>> Seems that you are evading the question that I asked. Are syscall tables
>> supposed to be writable?
>
> I did answer it. But again: yes I think it makes sense in theory
> to make them read only.
>
> We just cannot do it right now on i386/x86-64 for the reasons I outlined
> in my previous mail.
>
>
>>
>>> BTW the kernel actually needs to write to code once
>>> to apply alternative(), but it wouldn't be a problem to use
>>> a temporary mapping for this.
>>
>> What does this have to do with the syscall table???
>
>
> It is directly related to writable .text.
>
>>
>>>> The ability to protect a readonly section may be another issue.
>>>
>>> Well, it's the overriding issue here. Just pretending it's readonly
>>> when it isn't doesn't seem useful.
>>
>> This is all off-topic, talking about a different issue. And we are
>> already "pretending" that lots of other stuff in the readonly section is
>> readonly.
>
> Putting it into a "ro section" when it isn't actually read only is completely
> useless and does not do anything useful. So unless you figure out
> a way to make a true ro section without performance penalty I wouldn't bother.
>
> If you really want it for cosmetic reasons you can still do it,
> but it is about at the same usefulness level as shuffling white space in
> the source around.
>
> -Andi
The fact that the syscall table is R/W can be used to an advantage
for security. Yes!
After all modules are loaded, you (at startup) load a module that
makes the module-loader stuff return -ENOSYS. Then, nobody can
load any new modules. The running kernel is (more) secure.
You just need to make sure that all modules that you will need
are loaded before you do this. Also, you don't need to disable
auto module unloading because nothing can be unloaded
either.
Script started on Fri 01 Jul 2005 04:43:17 PM EDT
[root@chaos driver]# insmod LastDev.ko
[root@chaos driver]# insmod LastDev.ko
insmod: error inserting 'LastDev.ko': -1 Function not implemented
[root@chaos driver]# insmod LastDev.ko
insmod: error inserting 'LastDev.ko': -1 Function not implemented
[root@chaos driver]# insmod LastDev.ko
insmod: error inserting 'LastDev.ko': -1 Function not implemented
[root@chaos driver]# exit
Script done on Fri 01 Jul 2005 04:44:29 PM EDT
Cheers,
Dick Johnson
Penguin : Linux version 2.6.12 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.
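The mechanism described in the message above, swapping a table entry for a stub that returns -ENOSYS, can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the LastDev module (whose source is not shown in this thread): the table, the slot number `RW_NR_LOAD_MODULE` and all `rw_*` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long (*rw_call_t)(void);

/* Hypothetical slot numbers; they do not correspond to real syscalls. */
enum { RW_NR_LOAD_MODULE = 2, RW_NR_CALLS = 4 };

static long rw_ok(void)              { return 0; }
static long rw_not_implemented(void) { return -ENOSYS; }

/* Deliberately not const: this models the writable table being discussed. */
static rw_call_t rw_table[RW_NR_CALLS] = {
	rw_ok, rw_ok, rw_ok, rw_ok,
};

/* Bounds-checked dispatch through the table. */
long rw_call(size_t nr)
{
	if (nr >= RW_NR_CALLS)
		return -ENOSYS;
	assert(rw_table[nr] != NULL);
	return rw_table[nr]();
}

/*
 * Replace one slot with the -ENOSYS stub. Returns the previous handler
 * so that a caller could restore it later.
 */
rw_call_t rw_disable(size_t nr)
{
	assert(nr < RW_NR_CALLS);
	rw_call_t old = rw_table[nr];
	rw_table[nr] = rw_not_implemented;
	return old;
}
```

Before `rw_disable(RW_NR_LOAD_MODULE)` the call succeeds; afterwards it returns -ENOSYS, which matches the "Function not implemented" errors in the transcript. The same writability that makes this possible is what a hostile module could use, which is the trade-off the thread is arguing over.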
* Re: [PATCH] Read only syscall tables for x86_64 and i386
2005-07-01 20:47 ` Richard B. Johnson
@ 2005-07-01 21:13 ` Alan Cox
0 siblings, 0 replies; 22+ messages in thread
From: Alan Cox @ 2005-07-01 21:13 UTC (permalink / raw)
To: linux-os; +Cc: Andi Kleen, Christoph Lameter, Linux Kernel Mailing List
On Fri, 2005-07-01 at 21:47, Richard B. Johnson wrote:
> After all modules are loaded, you (at startup) load a module that
> makes the module-loader stuff return -ENOSYS. Then, nobody can
> load any new modules. The running kernel is (more) secure.
Just use an SELinux policy like everyone else 8). You need to block more,
otherwise I can load a module by hand through /dev/mem etc.
Alan
--
" If knowledge does not have owners, then intellectual property
is a trap set by neo-liberalism." -- Hugo Chavez