* Potential bug in fs/binfmt_elf.c? @ 2004-03-05 17:38 Mike Hearn 2004-03-05 18:28 ` John Reiser 2004-03-06 18:46 ` Ulrich Drepper 0 siblings, 2 replies; 17+ messages in thread From: Mike Hearn @ 2004-03-05 17:38 UTC (permalink / raw) To: linux-kernel Hi, I believe there is a problem in fs/binfmt_elf.c, around line 700 (kernel 2.6.1) When mapping a nobits PT_LOAD segment with a memsize > filesize, the kernel calls set_brk (which in turns calls do_brk) to map and clear the area, but this discards access permissons on the mapping leading to rwx protection. This causes a load failure on systems where the VM cannot reserve swap space for the segment, unless overcommit is active (on many systems it's not on by default). I don't know this code well, but it seems that this discarding of access permissions on the unlikely codepath is incorrect. I filed bug #2255 [1] on it. Could somebody who understands the ELF loading code please check to see if this is a bug, and if so produce a patch? The ability to define a new (large) ELF section which isn't backed by swap space nor disk space and that will be mapped to a specific VMA range is needed by Wine to reserve the PE load area. Currently the fact that the section is always mapped rwx despite being marked read-only in the binary prevents us from using this as a solution to the problems caused by exec-shield/prelink, meaning the only solution is to bootstrap the ELF interpreter ourselves from a statically linked binary. Clearly we'd rather not do that. Thanks to pageexec@freemail.hu for bringing the matter to my attention. Your assistance is appreciated, thanks -mike [1] http://bugzilla.kernel.org/show_bug.cgi?id=2255 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-05 17:38 Potential bug in fs/binfmt_elf.c? Mike Hearn @ 2004-03-05 18:28 ` John Reiser 2004-03-06 18:46 ` Ulrich Drepper 1 sibling, 0 replies; 17+ messages in thread From: John Reiser @ 2004-03-05 18:28 UTC (permalink / raw) To: mike; +Cc: linux-kernel > When mapping a nobits PT_LOAD segment with a memsize > filesize, the > kernel calls set_brk (which in turns calls do_brk) to map and clear the > area, but this discards access permissons on the mapping leading to rwx > protection. This causes a load failure on systems where the VM cannot > reserve swap space for the segment, unless overcommit is active (on many > systems it's not on by default). [snip] I believe that's not the only problem with binfmt_elf. If the total address space described by the PT_LOADs is not exactly one contiguous interval, then 2.6.3 binfmt_elf fills in the gaps with 'prw.' of zero-filled pages, instead of the intended "holes" with no mapping at all between isolated PT_LOADs. One example is https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=115913 -- ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-05 17:38 Potential bug in fs/binfmt_elf.c? Mike Hearn 2004-03-05 18:28 ` John Reiser @ 2004-03-06 18:46 ` Ulrich Drepper 2004-03-06 21:10 ` Mike Hearn 1 sibling, 1 reply; 17+ messages in thread From: Ulrich Drepper @ 2004-03-06 18:46 UTC (permalink / raw) To: mike; +Cc: linux-kernel Mike Hearn wrote: > When mapping a nobits PT_LOAD segment with a memsize > filesize, Show an example of what the file looks like. Just the ELF program header (readelf -l output). -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-06 18:46 ` Ulrich Drepper @ 2004-03-06 21:10 ` Mike Hearn 2004-03-07 6:11 ` Ulrich Drepper 2004-03-07 23:55 ` Eric W. Biederman 0 siblings, 2 replies; 17+ messages in thread From: Mike Hearn @ 2004-03-06 21:10 UTC (permalink / raw) To: Ulrich Drepper; +Cc: linux-kernel On Sat, 2004-03-06 at 18:46, Ulrich Drepper wrote: > Show an example of what the file looks like. Just the ELF program > header (readelf -l output). I can send the linker script and source file on request. They are probably a bit buggy, this isn't an area I know much about. The binutils guys seemed to think it should work however. thanks -mike Elf file type is EXEC (Executable file) Entry point 0x818 There are 6 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000034 0x00000034 0x00000034 0x000c0 0x000c0 R 0x4 INTERP 0x000400 0x00000400 0x00000400 0x00034 0x00034 R 0x4 [Requesting program interpreter: /lib/ld-linux.so.2] LOAD 0x000000 0x00000000 0x00000000 0x00bc4 0x00bc4 R E 0x1000 LOAD 0x000bc4 0x00000bc4 0x00000bc4 0x00150 0x00154 RW 0x1000 DYNAMIC 0x000bd0 0x00000bd0 0x00000bd0 0x00108 0x00108 RW 0x4 LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000 Section to Segment mapping: Segment Sections... 00 01 .interp .note.ABI-tag 02 .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame 03 .data .dynamic .ctors .dtors .jcr .got .bss 04 .dynamic 05 .wine.loadarea ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-06 21:10 ` Mike Hearn @ 2004-03-07 6:11 ` Ulrich Drepper 2004-03-07 9:58 ` Mike Hearn 2004-03-07 23:55 ` Eric W. Biederman 1 sibling, 1 reply; 17+ messages in thread From: Ulrich Drepper @ 2004-03-07 6:11 UTC (permalink / raw) To: mike; +Cc: linux-kernel -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Mike Hearn wrote: > LOAD 0x000000 0x00000000 0x00000000 0x00bc4 0x00bc4 R E 0x1000 > LOAD 0x000bc4 0x00000bc4 0x00000bc4 0x00150 0x00154 RW 0x1000 > DYNAMIC 0x000bd0 0x00000bd0 0x00000bd0 0x00108 0x00108 RW 0x4 > LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000 Not everything which can be expressed in ELF is supported. You don't want to load something, you want to reserve address space. And you want it allocated in a certain way. The ELF loader is no generic ELF interpreter. Now, if the only problem is the overcommit and making the do_brk() call allocate the memory as read-only a change to the do_brk() interface might be acceptable (well, ask somebody doing mm hacking). I wouldn't be entirely sure whether read-only pages alone are enough. This does not open any new holes as far as I can see. I'd say experiment with it and add a flags parameter which is the right combination of VM_READ | VM_WRITE | VM_EXEC. All calls but the one in binfmt_elf.c should pass all read bits, the one in binfmt_elf can respect the binaries flags. You must be sure, though, that the last page of the data area (i.e., writable area) in a regular binary is not mapped read-only. - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQFASr0L2ijCOnn/RHQRAjtZAJ931c+0Czw8jJc0kOv1+lIMyVLEOgCgtj3f aHnlBUWz8qFQitDqVBWyPpc= =f2UN -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-07 6:11 ` Ulrich Drepper @ 2004-03-07 9:58 ` Mike Hearn 2004-03-07 10:46 ` Ulrich Drepper 0 siblings, 1 reply; 17+ messages in thread From: Mike Hearn @ 2004-03-07 9:58 UTC (permalink / raw) To: linux-kernel On Sat, 06 Mar 2004 22:11:18 -0800, Ulrich Drepper wrote: > Not everything which can be expressed in ELF is supported. You don't > want to load something, you want to reserve address space. And you want > it allocated in a certain way. The ELF loader is no generic ELF > interpreter. Ah, OK. I was hoping this would not the answer. > Now, if the only problem is the overcommit and making the do_brk() call > allocate the memory as read-only a change to the do_brk() interface > might be acceptable (well, ask somebody doing mm hacking). I wouldn't > be entirely sure whether read-only pages alone are enough. This does > not open any new holes as far as I can see. This is certainly one long term solution, but we'd like to avoid kernel hacking if at all possible. We have a prototype of a program which is statically linked then turns itself into a dynamically linked app by bootstrapping the ELF interpreter in the same way the kernel does after mapping the range wanted with MAP_NORESERVE. Obviously we'd like the real fix, but something which works nicely on Fedora Core 1 machines today is also necessary. Thanks for your advice. One quick question - you said binfmt_elf is not a generic ELF interpreter, but the one in glibc presumably is yes? Would it be possible to achieve the effect wanted by having a dummy stub binary linked with -nostdlib etc, so it's a dynamically linked ELF program with only one DT_NEEDED entry which is against the real binary. This would short-circuit the kernel loader and pass control as soon as possible to glibc, which would follow the first DT_NEEDED entry and map in the real binary, which in turn contains the PE load area reservation section. IIRC glibc always uses mmap to map ELF sections so this could work better. Does this sound plausible? If so, do you have any tips on where to look for docs on it? Last time I tried compiling something with -nostdlib, I ran into problems with the default linker script not liking it (entry points I think). thanks -mike ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-07 9:58 ` Mike Hearn @ 2004-03-07 10:46 ` Ulrich Drepper 2004-03-07 11:53 ` Mike Hearn 0 siblings, 1 reply; 17+ messages in thread From: Ulrich Drepper @ 2004-03-07 10:46 UTC (permalink / raw) To: mike; +Cc: linux-kernel -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Mike Hearn wrote: > One quick question - you said binfmt_elf is not a > generic ELF interpreter, but the one in glibc presumably is yes? No. It only handles what is necessary. - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQFASv1y2ijCOnn/RHQRAihlAKCa6gGblyruVtYqk5OQFf3IvL4ELQCgyY5/ lkq7yoQelRbOFVuUwAAvcmk= =MFBI -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-07 10:46 ` Ulrich Drepper @ 2004-03-07 11:53 ` Mike Hearn 2004-03-07 21:32 ` Ulrich Drepper 0 siblings, 1 reply; 17+ messages in thread From: Mike Hearn @ 2004-03-07 11:53 UTC (permalink / raw) To: linux-kernel On Sun, 07 Mar 2004 02:46:10 -0800, Ulrich Drepper wrote: > No. It only handles what is necessary. But can it handle this case, or will it also map the load area ELF section wrongly? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-07 11:53 ` Mike Hearn @ 2004-03-07 21:32 ` Ulrich Drepper 0 siblings, 0 replies; 17+ messages in thread From: Ulrich Drepper @ 2004-03-07 21:32 UTC (permalink / raw) To: mike; +Cc: linux-kernel Mike Hearn wrote: > But can it handle this case, or will it also map the load area ELF section > wrongly? It will most probably do something you don't want. But ld.so is no particularly special program. Just write your own very small and specialized dynamic loader which does exactly what you need. -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-06 21:10 ` Mike Hearn 2004-03-07 6:11 ` Ulrich Drepper @ 2004-03-07 23:55 ` Eric W. Biederman 2004-03-08 5:57 ` John Reiser 1 sibling, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2004-03-07 23:55 UTC (permalink / raw) To: mike; +Cc: Ulrich Drepper, linux-kernel Mike Hearn <mike@navi.cx> writes: > On Sat, 2004-03-06 at 18:46, Ulrich Drepper wrote: > > Show an example of what the file looks like. Just the ELF program > > header (readelf -l output). > > I can send the linker script and source file on request. They are > probably a bit buggy, this isn't an area I know much about. The binutils > guys seemed to think it should work however. > > thanks -mike > > > Elf file type is EXEC (Executable file) > Entry point 0x818 > There are 6 program headers, starting at offset 52 > > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > PHDR 0x000034 0x00000034 0x00000034 0x000c0 0x000c0 R 0x4 > INTERP 0x000400 0x00000400 0x00000400 0x00034 0x00034 R 0x4 > [Requesting program interpreter: /lib/ld-linux.so.2] > LOAD 0x000000 0x00000000 0x00000000 0x00bc4 0x00bc4 R E 0x1000 > LOAD 0x000bc4 0x00000bc4 0x00000bc4 0x00150 0x00154 RW 0x1000 > DYNAMIC 0x000bd0 0x00000bd0 0x00000bd0 0x00108 0x00108 RW 0x4 > LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000 That last PT_LOAD segment looks like pure nonsense. What is the purpose of allocating 256MB of read-only zeros? Eric ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-07 23:55 ` Eric W. Biederman @ 2004-03-08 5:57 ` John Reiser 2004-03-08 8:06 ` Jakub Jelinek 0 siblings, 1 reply; 17+ messages in thread From: John Reiser @ 2004-03-08 5:57 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-kernel >> LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000 > > > What is the purpose of allocating 256MB of read-only zeros? To prevent the kernel from placing any shared libraries there [via mmap() from ld-linux.so.2], especially under the influence of exec-shield. This is 'wine', which wants to reserve that address space for mapping executables that were built for some other operating system. For this purpose, the .p_flags of PF_R instead could be 0 [==> PROT_NONE]; but do_brk() still turns either one into 'prw.' which has potential memory [over-]commit charges. The expected 'pr--' [or 'p---'] should have a memory commit cost of zero. -- ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Potential bug in fs/binfmt_elf.c? 2004-03-08 5:57 ` John Reiser @ 2004-03-08 8:06 ` Jakub Jelinek 2004-03-11 6:17 ` [PATCH] binfmt_elf.c allow .bss with no access (p---) John Reiser 0 siblings, 1 reply; 17+ messages in thread From: Jakub Jelinek @ 2004-03-08 8:06 UTC (permalink / raw) To: John Reiser; +Cc: Eric W. Biederman, linux-kernel On Sun, Mar 07, 2004 at 09:57:43PM -0800, John Reiser wrote: > >> LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R > >> 0x1000 > > > > > >What is the purpose of allocating 256MB of read-only zeros? > > To prevent the kernel from placing any shared libraries there [via mmap() > from ld-linux.so.2], especially under the influence of exec-shield. > This is 'wine', which wants to reserve that address space for mapping > executables that were built for some other operating system. For this > purpose, the .p_flags of PF_R instead could be 0 [==> PROT_NONE]; but > do_brk() still turns either one into 'prw.' which has potential memory > [over-]commit charges. The expected 'pr--' [or 'p---'] should have > a memory commit cost of zero. It should really be p_flags 0 and binfmt_elf.c should be fixed if it doesn't handle that properly. glibc ld.so indeed does the right thing with p_flags 0. Jakub ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] binfmt_elf.c allow .bss with no access (p---) 2004-03-08 8:06 ` Jakub Jelinek @ 2004-03-11 6:17 ` John Reiser 2004-03-11 14:23 ` Mike Hearn [not found] ` <20040412185317.79ac7d7d.akpm@osdl.org> 0 siblings, 2 replies; 17+ messages in thread From: John Reiser @ 2004-03-11 6:17 UTC (permalink / raw) To: Jakub Jelinek; +Cc: linux-kernel >>>>LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000 > It should really be p_flags 0 and binfmt_elf.c should be fixed if it doesn't > handle that properly. This ALPHA quality patch against 2.6.3 adds another argument to do_brk() which enables having a user ELF .bss with no-access (or read-only). Such cases must be page-aligned on the low-address end of .bss, else the padzero() runs into trouble. The new code applies .bss immediately to each PT_LOAD whenever .p_filesz < .p_memsz . In 99.9% of cases at most one PT_LOAD has any .bss, so this costs no more time than 2.6.3, which tries to wait as long as it can before applying .bss. This patch runs on x86. --- ./fs/binfmt_elf.c.orig 2004-03-10 19:24:51.243682696 -0800 +++ ./fs/binfmt_elf.c 2004-03-10 19:28:31.435208496 -0800 @@ -83,12 +83,12 @@ #define BAD_ADDR(x) ((unsigned long)(x) > TASK_SIZE) -static int set_brk(unsigned long start, unsigned long end) +static int set_brk(unsigned long start, unsigned long end, unsigned int vm_inhibit) { start = ELF_PAGEALIGN(start); end = ELF_PAGEALIGN(end); if (end > start) { - unsigned long addr = do_brk(start, end - start); + unsigned long addr = do_brk(start, end - start, vm_inhibit); if (BAD_ADDR(addr)) return addr; } @@ -114,6 +114,18 @@ } } +static /*inline*/ unsigned int +calc_bss_inhibit(struct elf_phdr const *phdr, unsigned elf_prot) +{ + unsigned vm_inhib = (VM_READ|VM_WRITE|VM_EXEC) ^ calc_vm_prot_bits(elf_prot); + /* Can readonly .data/.text have associated read+write .bss ? + if (phdr->p_filesz && !(phdr->p_flags & PF_W)) { + vm_inhib &= ~VM_WRITE; + } + */ + return vm_inhib; +} + /* Let's use some macros to make this stack manipulation a litle clearer */ #ifdef CONFIG_STACK_GROWSUP #define STACK_ADD(sp, items) ((elf_addr_t *)(sp) + (items)) @@ -298,7 +310,6 @@ struct elf_phdr *eppnt; unsigned long load_addr = 0; int load_addr_set = 0; - unsigned long last_bss = 0, elf_bss = 0; unsigned long error = ~0UL; int retval, i, size; @@ -340,7 +351,7 @@ int elf_type = MAP_PRIVATE | MAP_DENYWRITE; int elf_prot = 0; unsigned long vaddr = 0; - unsigned long k, map_addr; + unsigned long map_addr; if (eppnt->p_flags & PF_R) elf_prot = PROT_READ; if (eppnt->p_flags & PF_W) elf_prot |= PROT_WRITE; @@ -359,38 +370,20 @@ load_addr_set = 1; } - /* - * Find the end of the file mapping for this phdr, and keep - * track of the largest address we see for this. - */ - k = load_addr + eppnt->p_vaddr + eppnt->p_filesz; - if (k > elf_bss) - elf_bss = k; - - /* - * Do the same thing for the memory mapping - between - * elf_bss and last_bss is the bss section. - */ - k = load_addr + eppnt->p_memsz + eppnt->p_vaddr; - if (k > last_bss) - last_bss = k; - } - } - - /* - * Now fill out the bss section. First pad the last page up - * to the page boundary, and then perform a mmap to make sure - * that there are zero-mapped pages up to and including the - * last bss page. - */ - padzero(elf_bss); - elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1); /* What we have mapped so far */ - - /* Map the last of the bss segment */ - if (last_bss > elf_bss) { - error = do_brk(elf_bss, last_bss - elf_bss); - if (BAD_ADDR(error)) + if (eppnt->p_filesz < eppnt->p_memsz) { /* has local .bss */ + unsigned long elf_bss = load_addr + vaddr + eppnt->p_filesz; + padzero(elf_bss); + elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1); + + vaddr += load_addr + eppnt->p_memsz; + if (elf_bss < vaddr) { + error = do_brk(elf_bss, vaddr - elf_bss, + calc_bss_inhibit(eppnt, elf_prot) ); + if (BAD_ADDR(error)) goto out_close; + } + } + } } *interp_load_addr = load_addr; @@ -428,7 +421,7 @@ goto out; } - do_brk(0, text_data); + do_brk(0, text_data, 0); if (!interpreter->f_op || !interpreter->f_op->read) goto out; if (interpreter->f_op->read(interpreter, addr, text_data, &offset) < 0) @@ -437,7 +430,7 @@ (unsigned long)addr + text_data); do_brk(ELF_PAGESTART(text_data + ELF_MIN_ALIGN - 1), - interp_ex->a_bss); + interp_ex->a_bss, 0); elf_entry = interp_ex->a_entry; out: @@ -464,7 +457,7 @@ unsigned char ibcs2_interpreter = 0; unsigned long error; struct elf_phdr * elf_ppnt, *elf_phdata; - unsigned long elf_bss, elf_brk; + unsigned long elf_brk; int elf_exec_fileno; int retval, i; unsigned int size; @@ -526,7 +519,6 @@ fd_install(elf_exec_fileno = retval, bprm->file); elf_ppnt = elf_phdata; - elf_bss = 0; elf_brk = 0; start_code = ~0UL; @@ -687,44 +679,23 @@ the image should be loaded at fixed address, not at a variable address. */ - for(i = 0, elf_ppnt = elf_phdata; i < elf_ex.e_phnum; i++, elf_ppnt++) { - int elf_prot = 0, elf_flags; - unsigned long k, vaddr; - - if (elf_ppnt->p_type != PT_LOAD) - continue; - - if (unlikely (elf_brk > elf_bss)) { - unsigned long nbyte; - - /* There was a PT_LOAD segment with p_memsz > p_filesz - before this one. Map anonymous pages, if needed, - and clear the area. */ - retval = set_brk (elf_bss + load_bias, - elf_brk + load_bias); - if (retval) { - send_sig(SIGKILL, current, 0); - goto out_free_dentry; - } - nbyte = ELF_PAGEOFFSET(elf_bss); - if (nbyte) { - nbyte = ELF_MIN_ALIGN - nbyte; - if (nbyte > elf_brk - elf_bss) - nbyte = elf_brk - elf_bss; - clear_user((void *) elf_bss + load_bias, nbyte); - } - } - - if (elf_ppnt->p_flags & PF_R) elf_prot |= PROT_READ; - if (elf_ppnt->p_flags & PF_W) elf_prot |= PROT_WRITE; - if (elf_ppnt->p_flags & PF_X) elf_prot |= PROT_EXEC; - - elf_flags = MAP_PRIVATE|MAP_DENYWRITE|MAP_EXECUTABLE; - - vaddr = elf_ppnt->p_vaddr; - if (elf_ex.e_type == ET_EXEC || load_addr_set) { - elf_flags |= MAP_FIXED; - } else if (elf_ex.e_type == ET_DYN) { + for(i = 0, elf_ppnt = elf_phdata; i < elf_ex.e_phnum; i++, elf_ppnt++) { + int elf_prot = 0, elf_flags; + unsigned long const vaddr = elf_ppnt->p_vaddr; + unsigned long k, elf_bss; + + if (elf_ppnt->p_type != PT_LOAD) + continue; + + if (elf_ppnt->p_flags & PF_R) elf_prot |= PROT_READ; + if (elf_ppnt->p_flags & PF_W) elf_prot |= PROT_WRITE; + if (elf_ppnt->p_flags & PF_X) elf_prot |= PROT_EXEC; + + elf_flags = MAP_PRIVATE|MAP_DENYWRITE|MAP_EXECUTABLE; + + if (elf_ex.e_type == ET_EXEC || load_addr_set) + elf_flags |= MAP_FIXED; + else if (elf_ex.e_type == ET_DYN) /* Try and get dynamic programs out of the way of the default mmap base, as well as whatever program they might try to exec. This is because the brk will follow the loader, and is not movable. */ @@ -737,7 +708,7 @@ if (!load_addr_set) { load_addr_set = 1; - load_addr = (elf_ppnt->p_vaddr - elf_ppnt->p_offset); + load_addr = (vaddr - elf_ppnt->p_offset); if (elf_ex.e_type == ET_DYN) { load_bias += error - ELF_PAGESTART(load_bias + vaddr); @@ -745,43 +716,45 @@ reloc_func_desc = load_bias; } } - k = elf_ppnt->p_vaddr; + k = vaddr; if (k < start_code) start_code = k; if (start_data < k) start_data = k; - k = elf_ppnt->p_vaddr + elf_ppnt->p_filesz; - - if (k > elf_bss) - elf_bss = k; + k = vaddr + elf_ppnt->p_filesz; /* start local .bss */ + elf_bss = k; if ((elf_ppnt->p_flags & PF_X) && end_code < k) end_code = k; if (end_data < k) end_data = k; - k = elf_ppnt->p_vaddr + elf_ppnt->p_memsz; + + k = vaddr + elf_ppnt->p_memsz; /* end local .bss */ if (k > elf_brk) elf_brk = k; + + if (elf_bss < k) { + elf_bss += load_bias; + padzero(elf_bss); + elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1); + + k += load_bias; + if (elf_bss < k) { + retval = set_brk(elf_bss, k - elf_bss, + calc_bss_inhibit(elf_ppnt, elf_prot) ); + if (retval) { + send_sig(SIGKILL, current, 0); + goto out_free_dentry; + } + } + } } elf_ex.e_entry += load_bias; - elf_bss += load_bias; elf_brk += load_bias; start_code += load_bias; end_code += load_bias; start_data += load_bias; end_data += load_bias; - /* Calling set_brk effectively mmaps the pages that we need - * for the bss and break sections. We must do this before - * mapping in the interpreter, to make sure it doesn't wind - * up getting placed where the bss needs to go. - */ - retval = set_brk(elf_bss, elf_brk); - if (retval) { - send_sig(SIGKILL, current, 0); - goto out_free_dentry; - } - padzero(elf_bss); - if (elf_interpreter) { if (interpreter_type == INTERPRETER_AOUT) elf_entry = load_aout_interp(&interp_ex, @@ -947,7 +920,7 @@ len = ELF_PAGESTART(elf_phdata->p_filesz + elf_phdata->p_vaddr + ELF_MIN_ALIGN - 1); bss = elf_phdata->p_memsz + elf_phdata->p_vaddr; if (bss > len) - do_brk(len, bss - len); + do_brk(len, bss - len, 0); error = 0; out_free_ph: --- ./include/linux/mm.h.orig 2004-03-07 19:01:46.000000000 -0800 +++ ./include/linux/mm.h 2004-03-08 08:43:02.000000000 -0800 @@ -554,7 +554,7 @@ extern int do_munmap(struct mm_struct *, unsigned long, size_t); -extern unsigned long do_brk(unsigned long, unsigned long); +extern unsigned long do_brk(unsigned long, unsigned long, unsigned int); extern void __vma_unlink(struct mm_struct *mm, struct vm_area_struct *vma, struct vm_area_struct *prev); --- ./fs/binfmt_aout.c.orig 2004-03-07 19:01:46.000000000 -0800 +++ ./fs/binfmt_aout.c 2004-03-08 08:36:01.000000000 -0800 @@ -44,13 +44,13 @@ .min_coredump = PAGE_SIZE }; -static void set_brk(unsigned long start, unsigned long end) +static void set_brk(unsigned long start, unsigned long end, unsigned vm_inhibit) { start = PAGE_ALIGN(start); end = PAGE_ALIGN(end); if (end <= start) return; - do_brk(start, end - start); + do_brk(start, end - start, vm_inhibit); } /* @@ -319,10 +319,10 @@ loff_t pos = fd_offset; /* Fuck me plenty... */ /* <AOL></AOL> */ - error = do_brk(N_TXTADDR(ex), ex.a_text); + error = do_brk(N_TXTADDR(ex), ex.a_text, 0); bprm->file->f_op->read(bprm->file, (char *) N_TXTADDR(ex), ex.a_text, &pos); - error = do_brk(N_DATADDR(ex), ex.a_data); + error = do_brk(N_DATADDR(ex), ex.a_data, 0); bprm->file->f_op->read(bprm->file, (char *) N_DATADDR(ex), ex.a_data, &pos); goto beyond_if; @@ -343,7 +343,7 @@ map_size = ex.a_text+ex.a_data; #endif - error = do_brk(text_addr & PAGE_MASK, map_size); + error = do_brk(text_addr & PAGE_MASK, map_size, 0); if (error != (text_addr & PAGE_MASK)) { send_sig(SIGKILL, current, 0); return error; @@ -377,7 +377,7 @@ if (!bprm->file->f_op->mmap||((fd_offset & ~PAGE_MASK) != 0)) { loff_t pos = fd_offset; - do_brk(N_TXTADDR(ex), ex.a_text+ex.a_data); + do_brk(N_TXTADDR(ex), ex.a_text+ex.a_data, 0); bprm->file->f_op->read(bprm->file,(char *)N_TXTADDR(ex), ex.a_text+ex.a_data, &pos); flush_icache_range((unsigned long) N_TXTADDR(ex), @@ -412,7 +412,7 @@ beyond_if: set_binfmt(&aout_format); - set_brk(current->mm->start_brk, current->mm->brk); + set_brk(current->mm->start_brk, current->mm->brk, 0); retval = setup_arg_pages(bprm, 1); if (retval < 0) { @@ -478,7 +478,7 @@ error_time = jiffies; } - do_brk(start_addr, ex.a_text + ex.a_data + ex.a_bss); + do_brk(start_addr, ex.a_text + ex.a_data + ex.a_bss, 0); file->f_op->read(file, (char *)start_addr, ex.a_text + ex.a_data, &pos); @@ -502,7 +502,7 @@ len = PAGE_ALIGN(ex.a_text + ex.a_data); bss = ex.a_text + ex.a_data + ex.a_bss; if (bss > len) { - error = do_brk(start_addr + len, bss - len); + error = do_brk(start_addr + len, bss - len, 0); retval = error; if (error != start_addr + len) goto out; --- ./mm/mmap.c.orig 2004-03-07 19:01:46.000000000 -0800 +++ ./mm/mmap.c 2004-03-08 08:29:49.000000000 -0800 @@ -127,7 +127,7 @@ goto out; /* Ok, looks good - let it rip. */ - if (do_brk(oldbrk, newbrk-oldbrk) != oldbrk) + if (do_brk(oldbrk, newbrk-oldbrk, 0) != oldbrk) goto out; set_brk: mm->brk = brk; @@ -1354,7 +1354,7 @@ * anonymous maps. eventually we may be able to do some * brk-specific accounting here. */ -unsigned long do_brk(unsigned long addr, unsigned long len) +unsigned long do_brk(unsigned long addr, unsigned long len, unsigned vm_inhibit) { struct mm_struct * mm = current->mm; struct vm_area_struct * vma, * prev; @@ -1400,7 +1400,7 @@ if (security_vm_enough_memory(len >> PAGE_SHIFT)) return -ENOMEM; - flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags; + flags = (VM_DATA_DEFAULT_FLAGS &~ vm_inhibit) | VM_ACCOUNT | mm->def_flags; /* Can we just expand an old anonymous mapping? */ if (rb_parent && vma_merge(mm, prev, rb_parent, addr, addr + len, --- ./mm/nommu.c.orig 2004-02-17 19:59:20.000000000 -0800 +++ ./mm/nommu.c 2004-03-08 08:30:07.000000000 -0800 @@ -536,7 +536,7 @@ return ret; } -unsigned long do_brk(unsigned long addr, unsigned long len) +unsigned long do_brk(unsigned long addr, unsigned long len, unsigned vm_inhibit) { return -ENOMEM; } -- John Reiser, jreiser@BitWagon.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] binfmt_elf.c allow .bss with no access (p---) 2004-03-11 6:17 ` [PATCH] binfmt_elf.c allow .bss with no access (p---) John Reiser @ 2004-03-11 14:23 ` Mike Hearn 2004-03-11 19:18 ` John Reiser [not found] ` <20040412185317.79ac7d7d.akpm@osdl.org> 1 sibling, 1 reply; 17+ messages in thread From: Mike Hearn @ 2004-03-11 14:23 UTC (permalink / raw) To: linux-kernel On Wed, 10 Mar 2004 22:17:35 -0800, John Reiser wrote: > This ALPHA quality patch against 2.6.3 adds another argument to do_brk() > which enables having a user ELF .bss with no-access (or read-only). Hi, Does this fix the Wine case where we have a new RO section that isn't the BSS? thanks -mike ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] binfmt_elf.c allow .bss with no access (p---) 2004-03-11 14:23 ` Mike Hearn @ 2004-03-11 19:18 ` John Reiser 2004-03-12 16:42 ` Mike Hearn 0 siblings, 1 reply; 17+ messages in thread From: John Reiser @ 2004-03-11 19:18 UTC (permalink / raw) To: mike; +Cc: linux-kernel >>This ALPHA quality patch against 2.6.3 adds another argument to do_brk() >>which enables having a user ELF .bss with no-access (or read-only). > Does this fix the Wine case where we have a new RO section that isn't the > BSS? Yes, because the patch considers each PT_LOAD with p_filesz < p_memsz to have a "local" .bss. This is more general than plain 2.6.3 which creates only one "global" BSS after accumulating information from all of the PT_LOAD. -- ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] binfmt_elf.c allow .bss with no access (p---) 2004-03-11 19:18 ` John Reiser @ 2004-03-12 16:42 ` Mike Hearn 0 siblings, 0 replies; 17+ messages in thread From: Mike Hearn @ 2004-03-12 16:42 UTC (permalink / raw) To: linux-kernel On Thu, 11 Mar 2004 11:18:57 -0800, John Reiser wrote: > Yes, because the patch considers each PT_LOAD with p_filesz < p_memsz > to have a "local" .bss. This is more general than plain 2.6.3 which > creates only one "global" BSS after accumulating information from all > of the PT_LOAD. Fantastic. I'm afraid I'm unfamiliar with the kernel development process, how do I track the patch to find out when it's checked in and which releases it's available in (we'd really like 2.4 as well, though it may be too late to make it worthwhile for now....) thanks -mike ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <20040412185317.79ac7d7d.akpm@osdl.org>]
* Re: [PATCH] binfmt_elf.c allow .bss with no access (p---) [not found] ` <20040412185317.79ac7d7d.akpm@osdl.org> @ 2004-04-13 17:33 ` John Reiser 0 siblings, 0 replies; 17+ messages in thread From: John Reiser @ 2004-04-13 17:33 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel >>>>>>LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000 >> >>>It should really be p_flags 0 and binfmt_elf.c should be fixed if it doesn't >>>handle that properly. >> >>This ALPHA quality patch against 2.6.3 adds another argument to do_brk() >>which enables having a user ELF .bss with no-access (or read-only). Here are refreshed patches (now BETA quality) against recent kernels: http://www.bitwagon.com/elfdiet/elfdiet-2.6.5-mm5-1.patch.gz http://www.bitwagon.com/elfdiet/elfdiet-2.6.5.patch.gz (Patch mechanics: take 2.6.5, apply -mm5-1 [if desired], then apply the corresponding elfdiet patch.) A short introduction with links to past and future patches is: http://www.bitwagon.com/elfdiet/elfdiet.html -- John Reiser, jreiser@BitWagon.com ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2004-04-13 17:32 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-03-05 17:38 Potential bug in fs/binfmt_elf.c? Mike Hearn
2004-03-05 18:28 ` John Reiser
2004-03-06 18:46 ` Ulrich Drepper
2004-03-06 21:10 ` Mike Hearn
2004-03-07 6:11 ` Ulrich Drepper
2004-03-07 9:58 ` Mike Hearn
2004-03-07 10:46 ` Ulrich Drepper
2004-03-07 11:53 ` Mike Hearn
2004-03-07 21:32 ` Ulrich Drepper
2004-03-07 23:55 ` Eric W. Biederman
2004-03-08 5:57 ` John Reiser
2004-03-08 8:06 ` Jakub Jelinek
2004-03-11 6:17 ` [PATCH] binfmt_elf.c allow .bss with no access (p---) John Reiser
2004-03-11 14:23 ` Mike Hearn
2004-03-11 19:18 ` John Reiser
2004-03-12 16:42 ` Mike Hearn
[not found] ` <20040412185317.79ac7d7d.akpm@osdl.org>
2004-04-13 17:33 ` John Reiser
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox