public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Inflation of vmlinux by linker on x86_64
@ 2008-09-26 12:42 Joris van Rantwijk
  2008-09-26 18:52 ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Joris van Rantwijk @ 2008-09-26 12:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: H. Peter Anvin


I noticed that a lot of space in the uncompressed vmlinux image is
taken up by alignment issues, at least on x86_64. For example, the
linker decides to align the first section on a 2 MB file offset,
thereby inserting almost 2 MB of zeroes at the beginning of vmlinux.

Idx Name   Size      VMA               LMA               File off  Algn
  0 .text  001d9203  ffffffff80200000  0000000000200000  00200000  2**12
           CONTENTS, ALLOC, LOAD, READONLY, CODE

It seems this has gotten worse with recent versions of binutils;
ld 2.18 makes my vmlinux 2 MB larger than ld 2.17.

This inflation of vmlinux causes problems with certain boot loader
configurations, where decompression of the huge vmlinux may overwrite
part of an initramfs image.
See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=479101

Although alignment of memory addresses is required, alignment of
file offsets in vmlinux seems completely unnecessary. After all,
vmlinux is not a demand paged binary. I tried to avoid the alignment
by adding "-n" to the linker command line. This resulted in a 4 MB
reduction of vmlinux, and an immediate crash on boot.

The crash is caused by the ELF loader in x86/boot/compressed/misc.c:
parts of the kernel are memcpy-ed over themselves. This was previously
not an issue because the inflation of vmlinux provided so much slack
that all memcpy-s would turn out in the right direction.

After fixing the ELF loader, it now seems to work quite well.
vmlinux is 4 MB smaller, bzImage is 10 kB smaller and my problems
with LILO are solved.

Would it be appropriate to put this in the kernel?
Note that I am not at all sure that "ld -n" works on all architectures;
I only tested this on i386 and x86_64.

Greetings, Joris.

diff -urN linux-2.6.27-rc7/Makefile linux-2.6.27-rc7j/Makefile
--- linux-2.6.27-rc7/Makefile	2008-09-22 00:29:55.000000000 +0200
+++ linux-2.6.27-rc7j/Makefile	2008-09-26 12:21:34.000000000 +0200
@@ -577,7 +577,7 @@
 LDFLAGS_BUILD_ID = $(patsubst -Wl$(comma)%,%,\
 			      $(call ld-option, -Wl$(comma)--build-id,))
 LDFLAGS_MODULE += $(LDFLAGS_BUILD_ID)
-LDFLAGS_vmlinux += $(LDFLAGS_BUILD_ID)
+LDFLAGS_vmlinux += $(LDFLAGS_BUILD_ID) -n
 
 # Default kernel image to build when no specific target is given.
 # KBUILD_IMAGE may be overruled on the command line or
diff -urN linux-2.6.27-rc7/arch/x86/boot/compressed/misc.c linux-2.6.27-rc7j/arch/x86/boot/compressed/misc.c
--- linux-2.6.27-rc7/arch/x86/boot/compressed/misc.c	2008-09-22 00:29:55.000000000 +0200
+++ linux-2.6.27-rc7j/arch/x86/boot/compressed/misc.c	2008-09-26 13:20:35.000000000 +0200
@@ -281,13 +281,18 @@
 	return s;
 }
 
+/* This memcpy can handle overlapping dest and src, like memmove. */
 static void *memcpy(void *dest, const void *src, unsigned n)
 {
 	int i;
 	const char *s = src;
 	char *d = dest;
 
-	for (i = 0; i < n; i++) d[i] = s[i];
+	if (dest <= src) {
+		for (i = 0; i < n; i++) d[i] = s[i];
+	} else {
+		for (i = n - 1; i >= 0; i--) d[i] = s[i];
+	}
 	return dest;
 }
 
@@ -343,8 +348,8 @@
 	Elf32_Ehdr ehdr;
 	Elf32_Phdr *phdrs, *phdr;
 #endif
-	void *dest;
-	int i;
+	void *dest, *src;
+	int i, direction;
 
 	memcpy(&ehdr, output, sizeof(ehdr));
 	if (ehdr.e_ident[EI_MAG0] != ELFMAG0 ||
@@ -364,7 +369,17 @@
 
 	memcpy(phdrs, output + ehdr.e_phoff, sizeof(*phdrs) * ehdr.e_phnum);
 
-	for (i = 0; i < ehdr.e_phnum; i++) {
+	/* Go two times through the list of program headers;
+	 * first forward, then backward.
+	 */
+	for (i = 0, direction = 1; i >= 0; i += direction) {
+
+		if (i >= ehdr.e_phnum) {
+			/* reached the end; turn back */
+			direction = -1;
+			continue;
+		}
+
 		phdr = &phdrs[i];
 
 		switch (phdr->p_type) {
@@ -375,9 +390,17 @@
 #else
 			dest = (void *)(phdr->p_paddr);
 #endif
-			memcpy(dest,
-			       output + phdr->p_offset,
-			       phdr->p_filesz);
+			src = output + phdr->p_offset;
+
+			/* order the copying to avoid overwriting */
+			if ((direction == 1  && dest <= src) ||
+			    (direction == -1 && dest > src)) {
+				/* note: unlike standard memcpy, this one
+				 * is safe to use when dest and src overlap.
+				 */
+				memcpy(dest, src, phdr->p_filesz);
+			}
+
 			break;
 		default: /* Ignore other PT_* */ break;
 		}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Inflation of vmlinux by linker on x86_64
  2008-09-26 12:42 Inflation of vmlinux by linker on x86_64 Joris van Rantwijk
@ 2008-09-26 18:52 ` H. Peter Anvin
  2008-09-26 19:50   ` Joris van Rantwijk
  0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2008-09-26 18:52 UTC (permalink / raw)
  To: Joris van Rantwijk; +Cc: linux-kernel

Joris van Rantwijk wrote:
>  #endif
> -			memcpy(dest,
> -			       output + phdr->p_offset,
> -			       phdr->p_filesz);
> +			src = output + phdr->p_offset;
> +
> +			/* order the copying to avoid overwriting */
> +			if ((direction == 1  && dest <= src) ||
> +			    (direction == -1 && dest > src)) {
> +				/* note: unlike standard memcpy, this one
> +				 * is safe to use when dest and src overlap.
> +				 */
> +				memcpy(dest, src, phdr->p_filesz);
> +			}
> +
>  			break;

Instead of adding a comment like this, we should simply rename it memmove().

Furthermore, we probably spend enough time copying that using a real 
memmove() implementation, using string instructions, would be good.

	-hpa

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Inflation of vmlinux by linker on x86_64
  2008-09-26 18:52 ` H. Peter Anvin
@ 2008-09-26 19:50   ` Joris van Rantwijk
  2008-09-26 20:07     ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Joris van Rantwijk @ 2008-09-26 19:50 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel


On 26 sep 2008, at 20:52, H. Peter Anvin wrote:
> Instead of adding a comment like this, we should simply rename it 
> memmove().

Yes. I tried, but it clashed with an existing memmove declaration in 
asm-x86/string_32.h.

What is the accepted solution for this?
Redefining memmove should be allowed, but then it could no longer be a 
static function.
Using the memmove implementation from the main kernel would be painful 
and ugly.
We could also define "__memmove()" plus "#define memmove __memmove", 
which would also be ugly.

> Furthermore, we probably spend enough time copying that using a real 
> memmove() implementation, using string instructions, would be good.

Are string instructions that much faster?
We can also get some speedup by copying ints instead of chars.

Joris.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Inflation of vmlinux by linker on x86_64
  2008-09-26 19:50   ` Joris van Rantwijk
@ 2008-09-26 20:07     ` H. Peter Anvin
  2008-09-26 21:55       ` Joris van Rantwijk
  0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2008-09-26 20:07 UTC (permalink / raw)
  To: Joris van Rantwijk; +Cc: linux-kernel

Joris van Rantwijk wrote:
> 
> On 26 sep 2008, at 20:52, H. Peter Anvin wrote:
>> Instead of adding a comment like this, we should simply rename it 
>> memmove().
> 
> Yes. I tried, but it clashed with an existing memmove declaration in 
> asm-x86/string_32.h.
> 
> What is the accepted solution for this?
> Redefining memmove should be allowed, but then it could no longer be a 
> static function.
> Using the memmove implementation from the main kernel would be painful 
> and ugly.
> We could also define "__memmove()" plus "#define memmove __memmove", 
> which would also be ugly.
> 
>> Furthermore, we probably spend enough time copying that using a real 
>> memmove() implementation, using string instructions, would be good.
> 
> Are string instructions that much faster?
> We can also get some speedup by copying ints instead of chars.
> 

String instructions are indeed very much faster, especially on recent 
hardware where they are optimized in microcode.

In this case, I think the easiest thing to do is to provide an optimized 
memmove and not making it a static function.  I have a reasonably 
optimized memmove in 32-bit assembly at:

http://git.kernel.org/?p=boot/syslinux/syslinux.git;a=blob;f=com32/lib/memmove.S;hb=HEAD

A 64-bit implementation can be done on similar principles.

	-hpa


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Inflation of vmlinux by linker on x86_64
  2008-09-26 20:07     ` H. Peter Anvin
@ 2008-09-26 21:55       ` Joris van Rantwijk
  2008-09-26 21:57         ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Joris van Rantwijk @ 2008-09-26 21:55 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

On Fri, Sep 26, 2008 at 01:07:11PM -0700, H. Peter Anvin wrote:
> In this case, I think the easiest thing to do is to provide an optimized 
> memmove and not making it a static function.  I have a reasonably 
> optimized memmove in 32-bit assembly at:
> http://git.kernel.org/?p=boot/syslinux/syslinux.git;a=blob;f=com32/lib/memmove.S;hb=HEAD

Ok.  Right now, I'd like to focus on the semantics, preferably without
bringing in new assembler code.  I renamed memcpy to memmove and
dropped "static".  We can always replace it with optimized code
as a separate patch.

There is still the question whether linking vmlinux with -n (turn off
file offset alignment) works on all architectures.  I suspect not.

Joris.

diff -urN linux-2.6.27-rc7/Makefile linux-2.6.27-rc7j/Makefile
--- linux-2.6.27-rc7/Makefile	2008-09-26 23:01:41.000000000 +0200
+++ linux-2.6.27-rc7j/Makefile	2008-09-26 12:21:34.000000000 +0200
@@ -577,7 +577,7 @@
 LDFLAGS_BUILD_ID = $(patsubst -Wl$(comma)%,%,\
 			      $(call ld-option, -Wl$(comma)--build-id,))
 LDFLAGS_MODULE += $(LDFLAGS_BUILD_ID)
-LDFLAGS_vmlinux += $(LDFLAGS_BUILD_ID)
+LDFLAGS_vmlinux += $(LDFLAGS_BUILD_ID) -n
 
 # Default kernel image to build when no specific target is given.
 # KBUILD_IMAGE may be overruled on the command line or
diff -urN linux-2.6.27-rc7/arch/x86/boot/compressed/misc.c linux-2.6.27-rc7j/arch/x86/boot/compressed/misc.c
--- linux-2.6.27-rc7/arch/x86/boot/compressed/misc.c	2008-09-26 23:01:41.000000000 +0200
+++ linux-2.6.27-rc7j/arch/x86/boot/compressed/misc.c	2008-09-26 23:02:14.000000000 +0200
@@ -27,6 +27,7 @@
 #include <linux/linkage.h>
 #include <linux/screen_info.h>
 #include <linux/elf.h>
+#include <linux/types.h>
 #include <asm/io.h>
 #include <asm/page.h>
 #include <asm/boot.h>
@@ -122,6 +123,7 @@
 
 #undef memset
 #undef memcpy
+#define memcpy		memmove
 #define memzero(s, n)	memset((s), 0, (n))
 
 typedef unsigned char	uch;
@@ -195,7 +197,7 @@
 static long bytes_out;
 
 static void *memset(void *s, int c, unsigned n);
-static void *memcpy(void *dest, const void *src, unsigned n);
+void *memmove(void *dest, const void *src, size_t n);
 
 static void __putstr(int, const char *);
 #define putstr(__x)  __putstr(0, __x)
@@ -281,13 +283,18 @@
 	return s;
 }
 
-static void *memcpy(void *dest, const void *src, unsigned n)
+/* memmove can not be static due to a declaration in asm-x86/string_32.h */
+void *memmove(void *dest, const void *src, size_t n)
 {
-	int i;
+	size_t i;
 	const char *s = src;
 	char *d = dest;
 
-	for (i = 0; i < n; i++) d[i] = s[i];
+	if (dest <= src) {
+		for (i = 0; i < n; i++) d[i] = s[i];
+	} else {
+		for (i = n; i > 0; i--) d[i-1] = s[i-1];
+	}
 	return dest;
 }
 
@@ -343,8 +350,8 @@
 	Elf32_Ehdr ehdr;
 	Elf32_Phdr *phdrs, *phdr;
 #endif
-	void *dest;
-	int i;
+	void *dest, *src;
+	int i, direction;
 
 	memcpy(&ehdr, output, sizeof(ehdr));
 	if (ehdr.e_ident[EI_MAG0] != ELFMAG0 ||
@@ -364,7 +371,17 @@
 
 	memcpy(phdrs, output + ehdr.e_phoff, sizeof(*phdrs) * ehdr.e_phnum);
 
-	for (i = 0; i < ehdr.e_phnum; i++) {
+	/* Go two times through the list of program headers;
+	 * first forward, then backward.
+	 */
+	for (i = 0, direction = 1; i >= 0; i += direction) {
+
+		if (i >= ehdr.e_phnum) {
+			/* reached the end; turn back */
+			direction = -1;
+			continue;
+		}
+
 		phdr = &phdrs[i];
 
 		switch (phdr->p_type) {
@@ -375,9 +392,14 @@
 #else
 			dest = (void *)(phdr->p_paddr);
 #endif
-			memcpy(dest,
-			       output + phdr->p_offset,
-			       phdr->p_filesz);
+			src = output + phdr->p_offset;
+
+			/* order the copying to avoid overwriting */
+			if ((direction == 1  && dest <= src) ||
+			    (direction == -1 && dest > src)) {
+				memmove(dest, src, phdr->p_filesz);
+			}
+
 			break;
 		default: /* Ignore other PT_* */ break;
 		}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Inflation of vmlinux by linker on x86_64
  2008-09-26 21:55       ` Joris van Rantwijk
@ 2008-09-26 21:57         ` H. Peter Anvin
  0 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2008-09-26 21:57 UTC (permalink / raw)
  To: Joris van Rantwijk; +Cc: linux-kernel

Joris van Rantwijk wrote:
> 
> There is still the question whether linking vmlinux with -n (turn off
> file offset alignment) works on all architectures.  I suspect not.
> 

Almost certainly not, since it creates a noncompliant ELF file.  For x86 
it is probably safe, though, but it needs to be an arch option.

	-hpa

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-09-26 21:57 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-26 12:42 Inflation of vmlinux by linker on x86_64 Joris van Rantwijk
2008-09-26 18:52 ` H. Peter Anvin
2008-09-26 19:50   ` Joris van Rantwijk
2008-09-26 20:07     ` H. Peter Anvin
2008-09-26 21:55       ` Joris van Rantwijk
2008-09-26 21:57         ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox