From: Martin Dalecki <dalecki@evision-ventures.com>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Linus Torvalds <torvalds@transmeta.com>,
	linux-kernel@vger.kernel.org
Subject: PATCH 2.4.14 mregparm=3 compilation fixes
Date: Mon, 12 Nov 2001 12:28:33 +0100
Message-ID: <3BEFB261.700729A4@evision-ventures.com>
In-Reply-To: <E161SuY-000498-00@the-village.bc.nu> <3BEE7A34.BF9526FB@evision-ventures.com>

[-- Attachment #1: Type: text/plain, Size: 1027 bytes --]

Hello out there!

The attached patch fixes compiling and running the kernel
with -mregparm=3 on IA32. The fixes, excluding the change in
arch/i386/Makefile of course, apply to the stock kernel as well,
so Linus, please include them in 2.4.15 - it just won't hurt...

Well, the benchmarks I intended to do (i.e. the BYTE UNIX benchmark)
were not quite conclusive, so I include the results here just
for reference. They were done on a PIII Celeron notebook running
at 700 MHz with 192 MB of RAM.

- regparm3.report was gathered with the patch applied.

- report was gathered without the patch applied.

Maybe someone with more time and the proper infrastructure at
hand can provide some more fine-grained tests?

The patch itself turned out to be much smaller and simpler than
I expected. However, the space savings are quite significant,
especially for such a small change in the kernel...

BTW, the -pipe compiler option no longer gives any speed advantage
on systems where /tmp is on tmpfs!

Have fun!

[-- Attachment #2: mregparm.patch --]
[-- Type: text/plain, Size: 4136 bytes --]

diff -ur linux-2.4.14-2/arch/i386/Makefile linux-mdcki/arch/i386/Makefile
--- linux-2.4.14-2/arch/i386/Makefile	Thu Apr 12 21:20:31 2001
+++ linux-mdcki/arch/i386/Makefile	Sat Nov 10 00:07:17 2001
@@ -21,7 +21,7 @@
 LDFLAGS=-e stext
 LINKFLAGS =-T $(TOPDIR)/arch/i386/vmlinux.lds $(LDFLAGS)
 
-CFLAGS += -pipe
+CFLAGS += -freg-struct-return -mregparm=3
 
 # prevent gcc from keeping the stack 16 byte aligned
 CFLAGS += $(shell if $(CC) -mpreferred-stack-boundary=2 -S -o /dev/null -xc /dev/null >/dev/null 2>&1; then echo "-mpreferred-stack-boundary=2"; fi)
diff -ur linux-2.4.14-2/arch/i386/boot/compressed/misc.c linux-mdcki/arch/i386/boot/compressed/misc.c
--- linux-2.4.14-2/arch/i386/boot/compressed/misc.c	Fri Oct  5 03:42:54 2001
+++ linux-mdcki/arch/i386/boot/compressed/misc.c	Sat Nov 10 00:02:08 2001
@@ -9,6 +9,7 @@
  * High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 1996
  */
 
+#include <linux/linkage.h>
 #include <linux/vmalloc.h>
 #include <linux/tty.h>
 #include <asm/io.h>
@@ -304,7 +305,7 @@
 	short b;
 	} stack_start = { & user_stack [STACK_SIZE] , __KERNEL_DS };
 
-void setup_normal_output_buffer(void)
+static void setup_normal_output_buffer(void)
 {
 #ifdef STANDARD_MEMORY_BIOS_CALL
 	if (EXT_MEM_K < 1024) error("Less than 2MB of memory.\n");
@@ -320,7 +321,7 @@
 	uch *high_buffer_start; int hcount;
 };
 
-void setup_output_buffer_if_we_run_high(struct moveparams *mv)
+static void setup_output_buffer_if_we_run_high(struct moveparams *mv)
 {
 	high_buffer_start = (uch *)(((ulg)&end) + HEAP_SIZE);
 #ifdef STANDARD_MEMORY_BIOS_CALL
@@ -342,7 +343,7 @@
 	mv->high_buffer_start = high_buffer_start;
 }
 
-void close_output_buffer_if_we_run_high(struct moveparams *mv)
+static void close_output_buffer_if_we_run_high(struct moveparams *mv)
 {
 	if (bytes_out > low_buffer_size) {
 		mv->lcount = low_buffer_size;
@@ -355,7 +356,7 @@
 }
 
 
-int decompress_kernel(struct moveparams *mv, void *rmode)
+asmlinkage int decompress_kernel(struct moveparams *mv, void *rmode)
 {
 	real_mode = rmode;
 
diff -ur linux-2.4.14-2/arch/i386/kernel/bluesmoke.c linux-mdcki/arch/i386/kernel/bluesmoke.c
--- linux-2.4.14-2/arch/i386/kernel/bluesmoke.c	Thu Oct 11 18:04:57 2001
+++ linux-mdcki/arch/i386/kernel/bluesmoke.c	Sat Nov 10 02:24:25 2001
@@ -100,11 +100,11 @@
 
 /*
  *	Call the installed machine check handler for this CPU setup.
- */ 
- 
+ */
+
 static void (*machine_check_vector)(struct pt_regs *, long error_code) = unexpected_machine_check;
 
-void do_machine_check(struct pt_regs * regs, long error_code)
+asmlinkage void do_machine_check(struct pt_regs * regs, long error_code)
 {
 	machine_check_vector(regs, error_code);
 }
diff -ur linux-2.4.14-2/arch/i386/math-emu/fpu_proto.h linux-mdcki/arch/i386/math-emu/fpu_proto.h
--- linux-2.4.14-2/arch/i386/math-emu/fpu_proto.h	Wed Dec 10 02:57:09 1997
+++ linux-mdcki/arch/i386/math-emu/fpu_proto.h	Sat Nov 10 02:31:22 2001
@@ -53,7 +53,7 @@
 extern void fst_i_(void);
 extern void fstp_i(void);
 /* fpu_entry.c */
-extern void math_emulate(long arg);
+asmlinkage extern void math_emulate(long arg);
 extern void math_abort(struct info *info, unsigned int signal);
 /* fpu_etc.c */
 extern void FPU_etc(void);
diff -ur linux-2.4.14-2/include/linux/kernel.h linux-mdcki/include/linux/kernel.h
--- linux-2.4.14-2/include/linux/kernel.h	Fri Nov  9 20:11:22 2001
+++ linux-mdcki/include/linux/kernel.h	Sun Nov 11 12:35:46 2001
@@ -51,7 +51,7 @@
 extern struct notifier_block *panic_notifier_list;
 NORET_TYPE void panic(const char * fmt, ...)
 	__attribute__ ((NORET_AND format (printf, 1, 2)));
-NORET_TYPE void do_exit(long error_code)
+asmlinkage NORET_TYPE void do_exit(long error_code)
 	ATTRIB_NORET;
 NORET_TYPE void complete_and_exit(struct completion *, long)
 	ATTRIB_NORET;
diff -ur linux-2.4.14-2/kernel/sched.c linux-mdcki/kernel/sched.c
--- linux-2.4.14-2/kernel/sched.c	Fri Nov  9 19:56:42 2001
+++ linux-mdcki/kernel/sched.c	Sat Nov 10 02:07:01 2001
@@ -515,7 +515,7 @@
 #endif /* CONFIG_SMP */
 }
 
-void schedule_tail(struct task_struct *prev)
+asmlinkage void schedule_tail(struct task_struct *prev)
 {
 	__schedule_tail(prev);
 }

[-- Attachment #3: regparm3.report --]
[-- Type: text/plain, Size: 3083 bytes --]


  BYTE UNIX Benchmarks (Version 3.11)
  System -- Linux kozaczek 2.4.14-mdcki #15 nie lis 11 12:35:45 CET 2001 i686 unknown
  Start Benchmark Run: nie lis 11 14:40:32 CET 2001
   1 interactive users.
Dhrystone 2 without register variables   1263066.6 lps   (10 secs, 6 samples)
Dhrystone 2 using register variables     1264480.5 lps   (10 secs, 6 samples)
Arithmetic Test (type = arithoh)         3179144.1 lps   (10 secs, 6 samples)
Arithmetic Test (type = register)        188804.1 lps   (10 secs, 6 samples)
Arithmetic Test (type = short)           190760.8 lps   (10 secs, 6 samples)
Arithmetic Test (type = int)             188823.6 lps   (10 secs, 6 samples)
Arithmetic Test (type = long)            189990.7 lps   (10 secs, 6 samples)
Arithmetic Test (type = float)           182915.1 lps   (10 secs, 6 samples)
Arithmetic Test (type = double)          183937.8 lps   (10 secs, 6 samples)
System Call Overhead Test                363784.1 lps   (10 secs, 6 samples)
Pipe Throughput Test                     415828.7 lps   (10 secs, 6 samples)
Pipe-based Context Switching Test        196984.2 lps   (10 secs, 6 samples)
Process Creation Test                      3378.5 lps   (10 secs, 6 samples)
Execl Throughput Test                       619.3 lps   (9 secs, 6 samples)
File Read  (10 seconds)                  1327798.0 KBps  (10 secs, 6 samples)
File Write (10 seconds)                  138593.0 KBps  (10 secs, 6 samples)
File Copy  (10 seconds)                   19076.0 KBps  (10 secs, 6 samples)
File Read  (30 seconds)                  1337240.0 KBps  (30 secs, 6 samples)
File Write (30 seconds)                  147663.0 KBps  (30 secs, 6 samples)
File Copy  (30 seconds)                   14968.0 KBps  (30 secs, 6 samples)
C Compiler Test                             388.7 lpm   (60 secs, 3 samples)
Shell scripts (1 concurrent)               1065.8 lpm   (60 secs, 3 samples)
Shell scripts (2 concurrent)                562.8 lpm   (60 secs, 3 samples)
Shell scripts (4 concurrent)                287.0 lpm   (60 secs, 3 samples)
Shell scripts (8 concurrent)                146.0 lpm   (60 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places          28902.2 lpm   (60 secs, 6 samples)
Recursion Test--Tower of Hanoi            16393.7 lps   (10 secs, 6 samples)


                     INDEX VALUES            
TEST                                        BASELINE     RESULT      INDEX

Arithmetic Test (type = double)               2541.7   183937.8       72.4
Dhrystone 2 without register variables       22366.3  1263066.6       56.5
Execl Throughput Test                           16.5      619.3       37.5
File Copy  (30 seconds)                        179.0    14968.0       83.6
Pipe-based Context Switching Test             1318.5   196984.2      149.4
Shell scripts (8 concurrent)                     4.0      146.0       36.5
                                                                 =========
     SUM of  6 items                                                 435.9
     AVERAGE                                                          72.6

[-- Attachment #4: report --]
[-- Type: text/plain, Size: 3077 bytes --]


  BYTE UNIX Benchmarks (Version 3.11)
  System -- Linux kozaczek 2.4.14-2 #1 pią lis 9 22:22:10 CET 2001 i686 unknown
  Start Benchmark Run: nie lis 11 16:10:53 CET 2001
   1 interactive users.
Dhrystone 2 without register variables   1263134.8 lps   (10 secs, 6 samples)
Dhrystone 2 using register variables     1263583.6 lps   (10 secs, 6 samples)
Arithmetic Test (type = arithoh)         3177830.7 lps   (10 secs, 6 samples)
Arithmetic Test (type = register)        189076.1 lps   (10 secs, 6 samples)
Arithmetic Test (type = short)           190665.1 lps   (10 secs, 6 samples)
Arithmetic Test (type = int)             188753.5 lps   (10 secs, 6 samples)
Arithmetic Test (type = long)            190094.2 lps   (10 secs, 6 samples)
Arithmetic Test (type = float)           182872.2 lps   (10 secs, 6 samples)
Arithmetic Test (type = double)          183902.9 lps   (10 secs, 6 samples)
System Call Overhead Test                360235.7 lps   (10 secs, 6 samples)
Pipe Throughput Test                     421456.7 lps   (10 secs, 6 samples)
Pipe-based Context Switching Test        194915.8 lps   (10 secs, 6 samples)
Process Creation Test                      3605.4 lps   (10 secs, 6 samples)
Execl Throughput Test                       608.6 lps   (9 secs, 6 samples)
File Read  (10 seconds)                  1294487.0 KBps  (10 secs, 6 samples)
File Write (10 seconds)                  138403.0 KBps  (10 secs, 6 samples)
File Copy  (10 seconds)                   19158.0 KBps  (10 secs, 6 samples)
File Read  (30 seconds)                  1278293.0 KBps  (30 secs, 6 samples)
File Write (30 seconds)                  147556.0 KBps  (30 secs, 6 samples)
File Copy  (30 seconds)                   15129.0 KBps  (30 secs, 6 samples)
C Compiler Test                             388.8 lpm   (60 secs, 3 samples)
Shell scripts (1 concurrent)               1063.2 lpm   (60 secs, 3 samples)
Shell scripts (2 concurrent)                563.1 lpm   (60 secs, 3 samples)
Shell scripts (4 concurrent)                287.4 lpm   (60 secs, 3 samples)
Shell scripts (8 concurrent)                145.7 lpm   (60 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places          28576.1 lpm   (60 secs, 6 samples)
Recursion Test--Tower of Hanoi            16445.3 lps   (10 secs, 6 samples)


                     INDEX VALUES            
TEST                                        BASELINE     RESULT      INDEX

Arithmetic Test (type = double)               2541.7   183902.9       72.4
Dhrystone 2 without register variables       22366.3  1263134.8       56.5
Execl Throughput Test                           16.5      608.6       36.9
File Copy  (30 seconds)                        179.0    15129.0       84.5
Pipe-based Context Switching Test             1318.5   194915.8      147.8
Shell scripts (8 concurrent)                     4.0      145.7       36.4
                                                                 =========
     SUM of  6 items                                                 434.5
     AVERAGE                                                          72.4
