* using segmentation in the kernel
@ 2005-10-11 20:15 Jonathan M. McCune
  2005-10-11 20:36 ` Brian Gerst
  2005-10-12 13:03 ` linux-os (Dick Johnson)
  0 siblings, 2 replies; 11+ messages in thread
From: Jonathan M. McCune @ 2005-10-11 20:15 UTC
To: linux-kernel; +Cc: Arvind Seshadri, Bryan Parno

Hello,

We're starting work on a project for the 32-bit x86 Linux kernel that involves using segmentation in the kernel. As a first effort, we'd like to adjust the kernel code and data segment descriptors so that the kernel code, and data segment, bss, heap and stack exist in linear address range between 3GB and 4 GB. How could we implement this so that it breaks the memory management subsystem the least (or not at all if we are lucky ;-))?

Our current thinking is to modify only the base address and the limit of the kernel code and data segment descriptors (__KERNEL_CS and __KERNEL_DS). We set the base address to 3GB and the limit to 1GB. We would also change the kernel linker script (vmlinux.lds.S) by removing the relocation caused by PAGE_OFFSET. This would mean that the kernel would be linked to start at address 0 + 1MB in logical address space. Since we would set the base address of the kernel code and data segment descriptors to 3GB, the processor would translate all addresses emitted by the kernel so that the kernel would use addresses of 3GB + 1MB and above in the linear address space. Hopefully, this would mean that all the paging code in the kernel would continue to work correctly.

We do not understand the mm subsystem well enough to figure out if our method would work at all or, if it works, what things in the mm subsystem would be likely to break. Can someone who understands the mm subsystem please help us here?

Thanks!
-Jon
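To make the proposal above concrete, here is a minimal sketch in C of how a segment descriptor with a 3GB base and a 1GB page-granular limit could be packed, and how the processor turns a logical offset into a linear address. The helper names and constants are illustrative assumptions, not code from any kernel tree; the access byte 0x9A (ring 0 code) and the flags nibble 0xC (G=1, D/B=1) are the usual values for a 32-bit kernel code segment.

```c
#include <stdint.h>

/* Illustrative constants (assumptions, not kernel definitions). */
#define SEG_BASE_3GB   0xC0000000u   /* proposed segment base */
#define SEG_LIMIT_1GB  0x3FFFFu      /* (1 GB / 4 KB) - 1, counted in pages (G=1) */

/* Pack base, 20-bit limit, access byte and 4-bit flags into the
 * 8-byte x86 segment descriptor layout. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* G, D/B, L, AVL -> bits 52-55 */
    d |= (uint64_t)(base >> 24) << 56;            /* base 31:24  -> bits 56-63 */
    return d;
}

/* Segmentation step of address translation: linear = base + offset
 * (32-bit arithmetic, so the sum wraps modulo 2^32). */
static uint32_t logical_to_linear(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;
}
```

With these helpers, a kernel linked at logical address 1MB and run through a segment based at 3GB would land at `logical_to_linear(SEG_BASE_3GB, 0x00100000)`, i.e. 3GB + 1MB in linear space, which is the translation the proposal relies on.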
* Re: using segmentation in the kernel
  2005-10-11 20:15 using segmentation in the kernel Jonathan M. McCune
@ 2005-10-11 20:36 ` Brian Gerst
  2005-10-11 20:24   ` Alon Bar-Lev
  2005-10-12 13:03 ` linux-os (Dick Johnson)
  1 sibling, 1 reply; 11+ messages in thread
From: Brian Gerst @ 2005-10-11 20:36 UTC
To: Jonathan M. McCune; +Cc: linux-kernel, Arvind Seshadri, Bryan Parno

Jonathan M. McCune wrote:
> Hello,
>
> We're starting work on a project for the 32-bit x86 Linux kernel that
> involves using segmentation in the kernel. As a first effort, we'd
> like to adjust the kernel code and data segment descriptors so that
> the kernel code, and data segment, bss, heap and stack exist in linear
> address range between 3GB and 4 GB. How could we implement this so that
> it breaks the memory management subsystem the least (or not at all if
> we are lucky ;-))?

Why send the kernel back to the 2.0 days? There is no valid reason for doing this with the way x86 segmentation works, which is why it was done away with in 2.1.

--
Brian Gerst
* Re: using segmentation in the kernel
  2005-10-11 20:36 ` Brian Gerst
@ 2005-10-11 20:24   ` Alon Bar-Lev
  2005-10-11 21:12     ` Al Viro
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Alon Bar-Lev @ 2005-10-11 20:24 UTC
To: Brian Gerst
Cc: Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

Brian Gerst wrote:
> Jonathan M. McCune wrote:
>
>> Hello,
>>
> Why send the kernel back to the 2.0 days? There is no valid reason for
> doing this with the way x86 segmentation works, which is why it was
> done away with in 2.1.
>

But with segmentation you can set code to be read-only, disallow execution from stack, separate modules so that they will not affect kernel and more...

The main problem with segmentation is that it is x86-specific...

Best Regards,
Alon Bar-Lev.
* Re: using segmentation in the kernel
  2005-10-11 20:24 ` Alon Bar-Lev
@ 2005-10-11 21:12   ` Al Viro
  2005-10-11 21:14   ` Brian Gerst
  2005-10-12 9:05   ` Arjan van de Ven
  2 siblings, 0 replies; 11+ messages in thread
From: Al Viro @ 2005-10-11 21:12 UTC
To: Alon Bar-Lev
Cc: Brian Gerst, Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

On Tue, Oct 11, 2005 at 10:24:46PM +0200, Alon Bar-Lev wrote:
> Brian Gerst wrote:
> >Jonathan M. McCune wrote:
> >
> >>Hello,
> >>
> >Why send the kernel back to the 2.0 days? There is no valid reason for
> >doing this with the way x86 segmentation works, which is why it was
> >done away with in 2.1.
> >
>
> But with segmentation you can set code to be read-only,
> disallow execution from stack, separate modules so that they
> will not affect kernel and more...

You do realize that it's BS, don't you?

	* an attacker that would rewrite kernel code can switch a pointer to a method in any of the method tables (or a pointer to the entire method table, while we are at it).
	* overwriting the return address is trivial if you've got stack smashing, and there are plenty of interesting functions in the kernel ready to elevate privileges.
	* modules rely on practically complete access to kernel data structures, so no amount of playing with rings will change anything for them.
* Re: using segmentation in the kernel
  2005-10-11 20:24 ` Alon Bar-Lev
  2005-10-11 21:12   ` Al Viro
@ 2005-10-11 21:14   ` Brian Gerst
  2005-10-12 9:05   ` Arjan van de Ven
  2 siblings, 0 replies; 11+ messages in thread
From: Brian Gerst @ 2005-10-11 21:14 UTC
To: Alon Bar-Lev
Cc: Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

Alon Bar-Lev wrote:
> Brian Gerst wrote:
>
>> Jonathan M. McCune wrote:
>>
>>> Hello,
>>>
>> Why send the kernel back to the 2.0 days? There is no valid reason
>> for doing this with the way x86 segmentation works, which is why it
>> was done away with in 2.1.
>>
>
> But with segmentation you can set code to be read-only, disallow
> execution from stack, separate modules so that they will not affect
> kernel and more...
>
> The main problem with segmentation is that it is x86 specific...

Too much pain for not enough gain. Segments are not fine-grained enough to work well. Look at the PaX and execshield hacks for userspace. You are far better off working at the page-table level (RO and NX pages), which has the advantage of being portable.

--
Brian Gerst
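As a rough illustration of the page-table approach mentioned above, the sketch below manipulates the relevant bits of a 64-bit (PAE-format) x86 page-table entry. The macro and function names are hypothetical and are not the kernel's own helpers. Two hardware caveats apply: the NX bit (bit 63) only exists in PAE or long-mode page tables with NX enabled, and read-only pages only stop ring 0 writes when CR0.WP is set.

```c
#include <stdint.h>

/* x86 page-table entry bits (PAE / long-mode 64-bit entry format). */
#define PTE_PRESENT (1ULL << 0)
#define PTE_RW      (1ULL << 1)    /* 1 = writable, 0 = read-only */
#define PTE_USER    (1ULL << 2)    /* 1 = accessible from ring 3 */
#define PTE_NX      (1ULL << 63)   /* 1 = no instruction fetch allowed */

/* Hypothetical helpers: return a modified copy of the entry. */
static uint64_t pte_make_readonly(uint64_t pte) { return pte & ~PTE_RW; }
static uint64_t pte_make_noexec(uint64_t pte)   { return pte | PTE_NX; }

/* A page is writable only if it is present and has RW set. */
static int pte_is_writable(uint64_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_RW);
}

/* A page is executable only if it is present and NX is clear. */
static int pte_is_executable(uint64_t pte)
{
    return (pte & PTE_PRESENT) && !(pte & PTE_NX);
}
```

Because these bits are set per 4KB page, they give much finer control than a segment limit, which can only describe one contiguous range per descriptor.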
* Re: using segmentation in the kernel
  2005-10-11 20:24 ` Alon Bar-Lev
  2005-10-11 21:12   ` Al Viro
  2005-10-11 21:14   ` Brian Gerst
@ 2005-10-12 9:05   ` Arjan van de Ven
  2005-10-12 16:07     ` Alan Cox
  2 siblings, 1 reply; 11+ messages in thread
From: Arjan van de Ven @ 2005-10-12 9:05 UTC
To: Alon Bar-Lev
Cc: Brian Gerst, Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

On Tue, 2005-10-11 at 22:24 +0200, Alon Bar-Lev wrote:
> Brian Gerst wrote:
> > Jonathan M. McCune wrote:
> >
> >> Hello,
> >>
> > Why send the kernel back to the 2.0 days? There is no valid reason for
> > doing this with the way x86 segmentation works, which is why it was
> > done away with in 2.1.
> >
>
> But with segmentation you can set code to be read-only,

You can do that without segmentation too, absolutely no problem.

> disallow execution from stack,

That is why CPUs have NX nowadays. And it's not like the kernel is full of buffer overflows; due to the 4 KB stack space (total), there are very, very few static buffers on the stack at all, simply because there's no space to do it.

> separate modules so that they
> will not affect kernel and more...

And I don't believe this one iota. The only way to do this is to run modules in ring 1, at which point you are in deep shit anyway.
* Re: using segmentation in the kernel
  2005-10-12 9:05 ` Arjan van de Ven
@ 2005-10-12 16:07   ` Alan Cox
  2005-10-12 15:44     ` Arjan van de Ven
  2005-10-12 23:55     ` Jonathan M. McCune
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Cox @ 2005-10-12 16:07 UTC
To: Arjan van de Ven
Cc: Alon Bar-Lev, Brian Gerst, Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

On Wed, 2005-10-12 at 11:05 +0200, Arjan van de Ven wrote:
> > separate modules so that they
> > will not affect kernel and more...
>
> and I don't believe this one iota. The only way to do this is to run
> modules in ring 1, at which point you are in deep shit anyway.

Not necessarily. It's how Xen works on x86-32 for example. It keeps itself protected from the entire Linux instance by using segmentation on 32bit processors (not 64bit however as x86-64 has no segments in 64bit).

Doing that without major work on the kernel itself would be hard, and you'd need to isolate out things like page table updates and verify them whenever modules wanted to touch such stuff.

Alan
* Re: using segmentation in the kernel
  2005-10-12 16:07 ` Alan Cox
@ 2005-10-12 15:44   ` Arjan van de Ven
  2005-10-12 23:55   ` Jonathan M. McCune
  1 sibling, 0 replies; 11+ messages in thread
From: Arjan van de Ven @ 2005-10-12 15:44 UTC
To: Alan Cox
Cc: Alon Bar-Lev, Brian Gerst, Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

> > and I don't believe this one iota. The only way to do this is to run
> > modules in ring 1, at which point you are in deep shit anyway.
>
> Not necessarily. It's how Xen works on x86-32 for example. It keeps
> itself protected from the entire Linux instance by using segmentation on

It only works if you make a very small syscall-like area which you use to talk to the "real" kernel. Which is entirely not how Linux modules work right now... at which point you're just about a userspace application anyway. Might be an interesting research project of course...

> 32bit processors (not 64bit however as x86-64 has no segments in 64bit)

AFAIK x86-64 grew segments recently for 64-bit mode for an unnamed other virtualization vendor.
* Re: using segmentation in the kernel
  2005-10-12 16:07 ` Alan Cox
  2005-10-12 15:44   ` Arjan van de Ven
@ 2005-10-12 23:55   ` Jonathan M. McCune
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan M. McCune @ 2005-10-12 23:55 UTC
To: Alan Cox, linux-os
Cc: Arjan van de Ven, Alon Bar-Lev, Brian Gerst, linux-kernel, Arvind Seshadri, Bryan Parno, Mark Luk

Alan Cox wrote:

>On Wed, 2005-10-12 at 11:05 +0200, Arjan van de Ven wrote:
>
>>> separate modules so that they
>>>will not affect kernel and more...
>>>
>>and I don't believe this one iota. The only way to do this is to run
>>modules in ring 1, at which point you are in deep shit anyway.
>>
>
>Not necessarily. It's how Xen works on x86-32 for example. It keeps
>itself protected from the entire Linux instance by using segmentation on
>32bit processors (not 64bit however as x86-64 has no segments in 64bit)
>
>Doing that without major work on the kernel itself would be hard, and
>you'd need to isolate out things like page table updates and verify them
>whenever modules wanted to touch such stuff
>
>Alan
>

linux-os (Dick Johnson) wrote:

>On the ix86 you have a problem. Let's say that you write some
>code from scratch, that runs the CPU in 32-bit linear address-mode
>without paging. Then you want to activate paging. To activate
>paging, you MUST have provided some code and some data-space for
>your descriptors, where there is a 1:1 mapping between virtual
>and bus addresses. If you don't do this, at the instant you
>change to paging mode, you crash. The CPU fetches garbage.
>
>This is why the first few megabytes of Linux are unity-mapped.
>You will always need to run the kernel out of an area where
>a portion of that "segment" is unity-mapped. That segment
>is where the descriptors for addressing, paging, and interrupts
>must reside.
>
>If you truly wanted to run the kernel from 3-4 GB as you state,
>you must have RAM there, i.e., some physical stuff so that
>a 1:1 mapping could be implemented. The 3-4 GB region is
>where a lot of PCI addressing occurs on 32-bit machines and,
>in fact, there are some "do-not-touch" addresses in that
>region as well.
>
>Remember that the kernel runs in virtual address mode, but
>the descriptors that specify that mode need to be in physical
>memory, addressed at the same offset. You can experiment
>by making a module that attempts to turn off paging and
>then turn it back on. The kernel will crash instantly.
>However, if you write some code somewhere in low address-space
>where the startup code already exists, that turns
>off paging, then turns it back on; and your module code
>calls this other code, the machine will work fine. You
>need the interrupts off when you play.
>
>So, basically you can't do what you want with any OS that
>uses ix86 type CPUs. The question is: "What was it that
>you really wanted to do?" What you gave us was the
>"implementation details". What I want to know is what
>you intend to accomplish. The ix86 architecture lends itself
>to a lot of interesting things so if I knew your intentions
>I might be able to help.
>
>Cheers,
>Dick Johnson
>

Hello,

Thanks for all the responses. The project we are working on does involve the use of Xen, so we have the advantage of Xen taking care of the bootstrapping hassles with "unity mapping" parts of the kernel. To put it another way, the architecture we are really interested in is xen-i386.

We are curious about the implications of restricting the kernel's code and data segments such that the kernel cannot read/write user space directly. We want to set the base address of the kernel segment descriptors to 3GB and the limit to 1GB-64MB (Xen uses the top 64 MB). We were just wondering if the best way to achieve this would be to change the kernel linker script and the segment base addresses appropriately.

Any insight into whether this would work at all, or what would work and how to debug something like this, would be greatly appreciated.

Thanks,
-Jon
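The arithmetic behind "base 3GB, limit 1GB-64MB" can be checked with a small C sketch. It computes the page-granular limit field for such a segment and tests which linear addresses stay reachable through it. All names here are hypothetical, and the model assumes an ordinary expand-up segment with G=1; it says nothing about how Xen or the kernel would actually install the descriptor.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 4096u
#define MIB (1024u * 1024u)
#define GIB (1024u * 1024u * 1024u)

/* Limit field for a page-granular (G=1) segment of the given size in bytes. */
static uint32_t limit_pages(uint32_t size_bytes)
{
    return size_bytes / PAGE_SIZE_4K - 1u;
}

/* Highest valid byte offset in a G=1 segment: the low 12 bits are
 * treated as all ones by the limit check. */
static uint32_t max_offset(uint32_t limit)
{
    return (limit << 12) | 0xFFFu;
}

/* Returns 1 if some offset within the limit maps to this linear address.
 * The subtraction wraps modulo 2^32, mirroring the 32-bit base + offset add. */
static int linear_reachable(uint32_t base, uint32_t limit, uint32_t linear)
{
    uint32_t offset = linear - base;
    return offset <= max_offset(limit);
}
```

For a base of 0xC0000000 and a size of 1GB - 64MB, the limit field comes out as 0x3BFFF, so offsets up to 0x3BFFFFFF are valid. A typical user-space address such as 0x08048000 would need offset 0x48048000 and is therefore out of range, as is the top 64MB starting at 0xFC000000. Code that must touch user memory would then presumably need a separate segment for that purpose.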
* Re: using segmentation in the kernel
  2005-10-11 20:15 using segmentation in the kernel Jonathan M. McCune
  2005-10-11 20:36 ` Brian Gerst
@ 2005-10-12 13:03 ` linux-os (Dick Johnson)
  2005-10-13 8:51   ` Denis Vlasenko
  1 sibling, 1 reply; 11+ messages in thread
From: linux-os (Dick Johnson) @ 2005-10-12 13:03 UTC
To: Jonathan M. McCune; +Cc: linux-kernel, Arvind Seshadri, Bryan Parno

On Tue, 11 Oct 2005, Jonathan M. McCune wrote:

> Hello,
>
> We're starting work on a project for the 32-bit x86 Linux kernel that
> involves using segmentation in the kernel. As a first effort, we'd
> like to adjust the kernel code and data segment descriptors so that
> the kernel code, and data segment, bss, heap and stack exist in linear
> address range between 3GB and 4 GB. How could we implement this so that
> it breaks the memory management subsystem the least (or not at all if
> we are lucky ;-))?
>
> Our current thinking is to modify only the base address and the limit
> of the kernel code and data segment descriptors (__KERNEL_CS and
> __KERNEL_DS). We set the base address to 3GB and the limit to 1GB. We
> would also change the kernel linker script (vmlinux.lds.S) by removing
> the relocation caused by PAGE_OFFSET. This would mean that the kernel
> would be linked to start at address 0 + 1MB in logical address
> space. Since we would set the base address of the kernel code and data
> segment descriptors to 3GB, the processor would translate all
> addresses emitted by the kernel so that the kernel would use addresses
> of 3GB + 1MB and above in the linear address space. Hopefully, this
> would mean that all the paging code in the kernel would continue
> to work correctly.
>
> We do not understand the mm subsystem well enough to figure out if our
> method would work at all or if it works what things in the mm
> subsystem would be likely to break. Can someone who understands the mm
> subsystem please help us here?
>
>
> Thanks!
> -Jon
>

On the ix86 you have a problem. Let's say that you write some code from scratch, that runs the CPU in 32-bit linear address-mode without paging. Then you want to activate paging. To activate paging, you MUST have provided some code and some data-space for your descriptors, where there is a 1:1 mapping between virtual and bus addresses. If you don't do this, at the instant you change to paging mode, you crash. The CPU fetches garbage.

This is why the first few megabytes of Linux are unity-mapped. You will always need to run the kernel out of an area where a portion of that "segment" is unity-mapped. That segment is where the descriptors for addressing, paging, and interrupts must reside.

If you truly wanted to run the kernel from 3-4 GB as you state, you must have RAM there, i.e., some physical stuff so that a 1:1 mapping could be implemented. The 3-4 GB region is where a lot of PCI addressing occurs on 32-bit machines and, in fact, there are some "do-not-touch" addresses in that region as well.

Remember that the kernel runs in virtual address mode, but the descriptors that specify that mode need to be in physical memory, addressed at the same offset. You can experiment by making a module that attempts to turn off paging and then turn it back on. The kernel will crash instantly. However, if you write some code somewhere in low address-space where the startup code already exists, that turns off paging, then turns it back on; and your module code calls this other code, the machine will work fine. You need the interrupts off when you play.

So, basically you can't do what you want with any OS that uses ix86 type CPUs. The question is: "What was it that you really wanted to do?" What you gave us was the "implementation details". What I want to know is what you intend to accomplish. The ix86 architecture lends itself to a lot of interesting things so if I knew your intentions I might be able to help.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.48 BogoMips).
Warning : 98.36% of all statistics are fiction.
* Re: using segmentation in the kernel
  2005-10-12 13:03 ` linux-os (Dick Johnson)
@ 2005-10-13 8:51   ` Denis Vlasenko
  0 siblings, 0 replies; 11+ messages in thread
From: Denis Vlasenko @ 2005-10-13 8:51 UTC
To: linux-os (Dick Johnson)
Cc: Jonathan M. McCune, linux-kernel, Arvind Seshadri, Bryan Parno

On Wednesday 12 October 2005 16:03, linux-os (Dick Johnson) wrote:
> On the ix86 you have a problem. Let's say that you write some
> code from scratch, that runs the CPU in 32-bit linear address-mode
> without paging. Then you want to activate paging. To activate
> paging, you MUST have provided some code and some data-space for
> your descriptors, where there is a 1:1 mapping between virtual
> and bus addresses. If you don't do this, at the instant you
> change to paging mode, you crash. The CPU fetches garbage.
>
> This is why the first few megabytes of Linux are unity-mapped.
> You will always need to run the kernel out of an area where
> a portion of that "segment" is unity-mapped. That segment
> is where the descriptors for addressing, paging, and interrupts
> must reside.
>
> If you truly wanted to run the kernel from 3-4 GB as you state,
> you must have RAM there, i.e., some physical stuff so that
> a 1:1 mapping could be implemented. The 3-4 GB region is
> where a lot of PCI addressing occurs on 32-bit machines and,
> in fact, there are some "do-not-touch" addresses in that
> region as well.

This is untrue. After paging is enabled, you can jump to a non-unity-mapped location and remove the small unity-mapped region.

> Remember that the kernel runs in virtual address mode, but
> the descriptors that specify that mode need to be in physical
> memory, addressed at the same offset. You can experiment

Some of us are smart enough to add an offset when doing virt<->phys conversion, if needed.
--
vda
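The "add an offset" remark can be sketched in a few lines of C, in the spirit of the kernel's `__pa()` and `__va()` helpers for the directly mapped region. The function names and the 0xC0000000 constant below are assumptions chosen for illustration (a 3GB/1GB split); they are not the kernel's own definitions.

```c
#include <stdint.h>

/* Assumed start of the kernel's direct mapping (3 GB / 1 GB split). */
#define PAGE_OFFSET_3G 0xC0000000u

/* Virtual address in the direct mapping -> physical address. */
static uint32_t virt_to_phys_sketch(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET_3G;
}

/* Physical address -> virtual address in the direct mapping. */
static uint32_t phys_to_virt_sketch(uint32_t paddr)
{
    return paddr + PAGE_OFFSET_3G;
}
```

With this convention, a structure that lives at virtual 0xC0100000 sits at physical 0x00100000, so code that must hand a physical address to the hardware only needs to subtract the offset; the structure does not have to be identity-mapped.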
end of thread, ~2005-10-13 8:52 UTC

Thread overview: 11+ messages
2005-10-11 20:15 using segmentation in the kernel Jonathan M. McCune
2005-10-11 20:36 ` Brian Gerst
2005-10-11 20:24   ` Alon Bar-Lev
2005-10-11 21:12     ` Al Viro
2005-10-11 21:14     ` Brian Gerst
2005-10-12 9:05     ` Arjan van de Ven
2005-10-12 16:07       ` Alan Cox
2005-10-12 15:44         ` Arjan van de Ven
2005-10-12 23:55         ` Jonathan M. McCune
2005-10-12 13:03 ` linux-os (Dick Johnson)
2005-10-13 8:51   ` Denis Vlasenko